* Re: Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs

From: Jonathan Nieder @ 2010-02-12  0:27 UTC
To: git, 569505-forwarded; Cc: Zygo Blaxell

Hi gitsters,

Zygo Blaxell reported through http://bugs.debian.org/569505 that
‘git update-index’ has some issues when the files it is adding change
under its feet.

My thoughts:

- Low-hanging fruit: it should be possible for update-index to check
  the stat information to see if the file has changed between when it
  first opens it and when it finishes.

- To my surprise, Zygo reported that ‘git gc’ didn’t notice the
  problem.  Should ‘git gc’ imply a ‘git fsck --no-full’?

- Recovering from this kind of mistake in early history is indeed
  hard.  Any tricks for doing this?  Maybe fast-export | fast-import
  can do something with this, or maybe replace + filter-branch once it
  learns to be a little smarter.

- How do checkout-index and cat-file blob react to a blob whose
  contents do not reflect its object name?  Are they behaving
  appropriately?  I would want cat-file blob to be able to retrieve
  such a broken blob’s contents, checkout-index not so much.

I imagine there are other things to learn, too.  The report and
reproduction recipe follow.

Thoughts?
Jonathan

Package: git-core
Version: 1:1.6.6.1-1
Severity: important

'git add' will happily corrupt a git repo if it is run while files in
the working directory are being modified.  A blob is added to the
index with contents that do not match its SHA1 hash.
If the index is then committed, the corrupt blob cannot be checked out
(or is checked out with incorrect contents, depending on which tool you
use to try to get the file out of git) in the future.  Surprisingly,
it's possible to clone, fetch, push, pull, and sometimes even gc the
corrupted repo several times before anyone notices the corruption.

If the affected commit is included in a merge with history from other
git users, the only way to fix it is to rebase (or come up with a blob
whose contents match the affected SHA1 hash somehow).  It is usually
possible to retrieve data committed before the corruption by simply
checking out an earlier tree in the affected branch's history.

The following shell code demonstrates this problem.  It runs a thread
which continuously modifies a file, and another thread that does
'git commit -a' and 'git fsck' in a continuous loop until corruption is
detected.  This might take up to 20 seconds on a slow machine.

	#!/bin/sh
	set -e

	# Create an empty git repo in /tmp/git-test
	rm -fr /tmp/git-test
	mkdir /tmp/git-test
	cd /tmp/git-test
	git init

	# Create a file named foo and add it to the repo
	touch foo
	git add foo

	# Thread 1: continuously modify foo
	while echo -n .; do
		dd if=/dev/urandom of=foo count=1024 bs=1k conv=notrunc >/dev/null 2>&1
	done &

	# Thread 2: loop until the repo is corrupted
	while git fsck; do
		# Note the implied 'git add' in 'commit -a'.
		# It will do the same with an explicit 'git add'.
		git commit -a -m'Test'
	done

	# Kill thread 1, we don't need it any more
	kill $!

	# Success!  Well, sort of.
	echo Repository is corrupted.  Have a nice day.

I discovered this bug accidentally when I was using inotifywait (from
the inotify-tools package) to automatically commit snapshots of a
working directory, triggered by write events.

I tested this with a number of kernel versions from 2.6.27 to 2.6.31.
All of them reproduced this problem.  I checked this because strace
shows 'git add' doing a mmap(..., MAP_PRIVATE, ...) of its input file,
so I was wondering if there might have been a recent change in mmap()
behavior in either git or the kernel.

git 1.5.6.5 has this problem too, but some of the error messages are
different, and the problem sometimes manifests itself as silent
corruption of other objects (e.g. if someone checks out a corrupt tree
and then does 'git add -u' or 'git commit -a', they will include the
corrupt data in their commit).

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs

From: Zygo Blaxell @ 2010-02-12  1:23 UTC
To: Jonathan Nieder; Cc: git, 569505-forwarded

On Thu, Feb 11, 2010 at 06:27:41PM -0600, Jonathan Nieder wrote:
> Zygo Blaxell reported through http://bugs.debian.org/569505 that ‘git
> update-index’ has some issues when the files it is adding change under
> its feet:
>
> My thoughts:
>
> - Low-hanging fruit: it should be possible for update-index to check
>   the stat information to see if the file has changed between when it
>   first opens it and when it finishes.

I don't think this is a good idea.  stat() is very coarse-grained, and
provides accuracy of only a second on a lot of file systems where git
working directories might be found.  If you run the test script on an
ext3 filesystem on a modern machine, the stat() data won't change at
all even though the file contents change completely many times.

What would be a good idea is to make sure that the code that copies a
file into the index and calculates its hash does both in a single pass
over the same input data.  That might require replacing a simple mmap()
of the input file with a read-hash-copy loop.
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Jonathan Nieder @ 2010-02-13 12:12 UTC
To: Zygo Blaxell; Cc: git

Zygo Blaxell wrote:
> On Thu, Feb 11, 2010 at 06:27:41PM -0600, Jonathan Nieder wrote:
>> - Low-hanging fruit: it should be possible for update-index to check
>>   the stat information to see if the file has changed between when it
>>   first opens it and when it finishes.
>
> I don't think this is a good idea.  stat() is very coarse-grained

You’re probably right.  For many file types, st_size is likely to
change (in this way your script is testing something unusual), but that
is no excuse to behave poorly when it doesn’t.

> What would be a good idea is to make sure that the code that copies a
> file into the index and calculates its hash does both in a single pass
> over the same input data.  That might require replacing a simple mmap()
> of the input file with a read-hash-copy loop.

This leaves me nervous about speed.  Consider the following simple
case: sometimes the file to be added is already in the object
repository somewhere (maybe the user has tried this code before, or a
file was renamed with 'mv', or a patch applied with 'patch', or an
unmount and remount dirtied the stat information).

With the current code, write_sha1_file() will hash the file, notice
that the object is already in .git/objects, and return.  With a
read-hash-copy loop, git would have to store a (compressed or
uncompressed) copy of the file somewhere in the meantime.

But I’d be happy to see code appear that proves me wrong. ;-)  One
simple benchmark to try is running the git test suite.

Cheers,
Jonathan
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Ilari Liusvaara @ 2010-02-13 13:39 UTC
To: Jonathan Nieder; Cc: Zygo Blaxell, git

On Sat, Feb 13, 2010 at 06:12:38AM -0600, Jonathan Nieder wrote:
>
> This leaves me nervous about speed.  Consider the following simple
> case: sometimes the file to be added is already in the object
> repository somewhere (maybe the user has tried this code before, or
> a file was renamed with 'mv', or a patch applied with 'patch', or an
> unmount and remount dirtied the stat information).
>
> With the current code, write_sha1_file() will hash the file, notice
> that the object is already in .git/objects, and return.  With a
> read-hash-copy loop, git would have to store a (compressed or
> uncompressed) copy of the file somewhere in the meantime.

It could be done by first reading the file and computing its hash; if
the hash matches an existing object, return that hash.  Otherwise read
the file again for the object write, hashing it a second time, and use
that value for the object ID.

This would require two hash computations in the non-existing case, but
SHA-1 is pretty fast.  If the first computation produces a match, then
it doesn't matter if the file is modified, as adding and modifying in
parallel results in undefined contents anyway (just that it should not
corrupt the repository).

-Ilari
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Thomas Rast @ 2010-02-13 14:39 UTC
To: Ilari Liusvaara; Cc: Jonathan Nieder, Zygo Blaxell, git

On Saturday 13 February 2010 14:39:52 Ilari Liusvaara wrote:
> On Sat, Feb 13, 2010 at 06:12:38AM -0600, Jonathan Nieder wrote:
> >
> > With the current code, write_sha1_file() will hash the file, notice
> > that the object is already in .git/objects, and return.  With a
> > read-hash-copy loop, git would have to store a (compressed or
> > uncompressed) copy of the file somewhere in the meantime.
>
> It could be done by first reading the file and computing hash,
> if the hash matches existing object, return that hash.  Otherwise
> read the file for object write, hashing it again and use that value
> for object ID.

That is still racy.  The real problem is that the file is mmap()ed, and
git then first computes the SHA1 of that buffer, and next it compresses
it.[*]  Due to the last sentence in the following snippet from mmap(2):

    MAP_PRIVATE
           Create a private copy-on-write mapping.  Updates to the map-
           ping are not visible to other processes mapping the same
           file, and are not carried through to the underlying file.
           It is unspecified whether changes made to the file after the
           mmap() call are visible in the mapped region.

this is racy despite the use of MAP_PRIVATE: the mapped contents can
change at any time.  AFAICS there are only two possible solutions:

* Copy the file (possibly block-by-block) as we go, to make sure that
  the data we SHA1 is the same we compress.

* Unpack and re-hash the compressed data to verify that the SHA1 is
  correct.  In case of failure either retry (but you could have to do
  this infinitely often if the user just hates you!) or abort.

(Of course, in neither case does the user have any sort of guarantee
about what data ended up in the repository, but he never had that; we
only try to ensure repo consistency.)

[*] The "do we have this" check actually happens before the
compression, and that arm is thus race-free.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Ilari Liusvaara @ 2010-02-13 16:29 UTC
To: Thomas Rast; Cc: Jonathan Nieder, Zygo Blaxell, git

On Sat, Feb 13, 2010 at 03:39:53PM +0100, Thomas Rast wrote:
> On Saturday 13 February 2010 14:39:52 Ilari Liusvaara wrote:
> > It could be done by first reading the file and computing hash,
> > if the hash matches existing object, return that hash.  Otherwise
> > read the file for object write, hashing it again and use that value
> > for object ID.
>
> That is still racy.  The real problem is that the file is mmap()ed,
> and git then first computes the SHA1 of that buffer, next it
> compresses it.[*]

Hmm...  One needs to copy the data a block at a time into a temporary
buffer and use that for feeding both zlib and SHA-1.  That ensures that
whatever SHA-1 hashes and zlib compresses are consistent.

-Ilari
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Dmitry Potapov @ 2010-02-13 22:09 UTC
To: Ilari Liusvaara; Cc: Thomas Rast, Jonathan Nieder, Zygo Blaxell, git

On Sat, Feb 13, 2010 at 7:29 PM, Ilari Liusvaara
<ilari.liusvaara@elisanet.fi> wrote:
>
> Hmm...  One needs to copy the data a block at a time into a temporary
> buffer and use that for feeding both zlib and SHA-1.  That ensures
> that whatever SHA-1 hashes and zlib compresses are consistent.

If you want this, then just compile Git with

    NO_MMAP = YesPlease

It should solve the described problem.

Dmitry
* Re: 'git add' corrupts repository if the working directory is modified as it runs

From: Zygo Blaxell @ 2010-02-13 22:37 UTC
To: Dmitry Potapov; Cc: Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 14, 2010 at 01:09:23AM +0300, Dmitry Potapov wrote:
> On Sat, Feb 13, 2010 at 7:29 PM, Ilari Liusvaara
> <ilari.liusvaara@elisanet.fi> wrote:
> > Hmm...  One needs to copy the data a block at a time into a temporary
> > buffer and use that for feeding both zlib and SHA-1.  That ensures
> > that whatever SHA-1 hashes and zlib compresses are consistent.
>
> If you want this, then just compile Git with NO_MMAP = YesPlease
> It should solve the described problem.

Doesn't that also turn off mmap in other places where it's harmless or
even beneficial?  Otherwise, why use mmap in Git at all?  It also
doesn't solve the problem in cases when mmap support is compiled in.

There is a performance-robustness trade-off here.  If we do an extra
copy of the file data, we always get consistent repo contents but lose
speed (not very much speed, since sha1 and zlib are much slower than a
memory copy).  If we don't, we still get consistent repo contents
if--and only if--the files never happen to be modified during a git
add.  I can live with that, as long as the limitation is documented and
there's a config switch somewhere that I can turn on for cases when I
can't make such assurances.

I imagine similar reasoning led to the existence of the
receive.fsckObjects option.  Personally, checking all objects fetched
from remote repos is not a feature I'd ever want to turn off; however,
it's easy to think of cases where integrity matters less than speed
(e.g. build systems doing clone-build-destroy cycles from a trusted,
reliable server over a similarly trusted network).
* [PATCH] don't use mmap() to hash files

From: Dmitry Potapov @ 2010-02-14  1:18 UTC
To: Zygo Blaxell; Cc: Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

If a mmapped file is changed by another program during git-add, it
causes repository corruption.  Disabling mmap() in index_fd() does not
have any negative impact on the overall speed of Git.  In fact, it
makes git hash-object work slightly faster.  Here is the best result
before and after the patch, based on 5 runs on the Linux kernel
repository:

Before:
$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.15user 0.36system 0:02.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103248minor)pagefaults 0swaps

After:
$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.09user 0.33system 0:02.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
I think more people should test this change to see its impact on
performance.  For me, it was positive.  Here are my results:

Using mmap (git version 1.7.0):

$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.18user 0.33system 0:02.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103248minor)pagefaults 0swaps
$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.23user 0.28system 0:02.53elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103249minor)pagefaults 0swaps
$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.20user 0.31system 0:02.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103248minor)pagefaults 0swaps
$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.21user 0.30system 0:02.51elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103248minor)pagefaults 0swaps
$ git ls-files | time git hash-object --stdin-paths > /dev/null
2.15user 0.36system 0:02.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+103248minor)pagefaults 0swaps

Using read() instead of mmap() (1.7.0 with the above patch):

$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.19user 0.24system 0:02.42elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps
$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.15user 0.26system 0:02.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps
$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.18user 0.24system 0:02.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps
$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.18user 0.25system 0:02.42elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps
$ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null
2.09user 0.33system 0:02.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1073minor)pagefaults 0swaps

 sha1_file.c | 22 +++++++--------------
 1 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 657825e..83f82a2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2438,22 +2438,14 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	enum object_type type, const char *path)
 {
 	int ret;
-	size_t size = xsize_t(st->st_size);
+	struct strbuf sbuf = STRBUF_INIT;
 
-	if (!S_ISREG(st->st_mode)) {
-		struct strbuf sbuf = STRBUF_INIT;
-		if (strbuf_read(&sbuf, fd, 4096) >= 0)
-			ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object,
-					type, path);
-		else
-			ret = -1;
-		strbuf_release(&sbuf);
-	} else if (size) {
-		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		ret = index_mem(sha1, buf, size, write_object, type, path);
-		munmap(buf, size);
-	} else
-		ret = index_mem(sha1, NULL, size, write_object, type, path);
+	if (strbuf_read(&sbuf, fd, 4096) >= 0)
+		ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object,
+				type, path);
+	else
+		ret = -1;
+	strbuf_release(&sbuf);
 	close(fd);
 	return ret;
 }
-- 
1.7.0
* Re: [PATCH] don't use mmap() to hash files

From: Junio C Hamano @ 2010-02-14  1:37 UTC
To: Dmitry Potapov; Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Dmitry Potapov <dpotapov@gmail.com> writes:

> If a mmapped file is changed by another program during git-add, it
> causes repository corruption.  Disabling mmap() in index_fd() does
> not have any negative impact on the overall speed of Git.  In fact,
> it makes git hash-object work slightly faster....
> ...
> I think more people should test this change to see its impact on
> performance.  For me, it was positive.  Here are my results:

I wasn't particularly impressed by the original problem description,
but this is a very interesting result.
* Re: [PATCH] don't use mmap() to hash files

From: Dmitry Potapov @ 2010-02-14  2:18 UTC
To: Junio C Hamano; Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sat, Feb 13, 2010 at 05:37:21PM -0800, Junio C Hamano wrote:
> Dmitry Potapov <dpotapov@gmail.com> writes:
>
> > If a mmapped file is changed by another program during git-add, it
> > causes repository corruption.  Disabling mmap() in index_fd() does
> > not have any negative impact on the overall speed of Git.  In fact,
> > it makes git hash-object work slightly faster....
> > ...
> > I think more people should test this change to see its impact on
> > performance.  For me, it was positive.  Here are my results:
>
> I wasn't particularly impressed by the original problem description,
> but this is a very interesting result.

My initial reaction was to write that the whole problem is due to
abusing Git for purposes for which it was not intended.  But then I
decided to do some testing to see what impact the change would have.
Because I did not see any negative impact (in fact, a slight speed
improvement), I decided to ask other people who are interested in this
patch to do more testing.

Dmitry
* Re: [PATCH] don't use mmap() to hash files

From: Junio C Hamano @ 2010-02-14  3:14 UTC
To: Dmitry Potapov; Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Dmitry Potapov <dpotapov@gmail.com> writes:

> My initial reaction was to write that the whole problem is due to
> abusing Git for purposes for which it was not intended.  But then I
> decided to do some testing to see what impact the change would have.
> Because I did not see any negative impact (in fact, a slight speed
> improvement), I decided to ask other people who are interested in
> this patch to do more testing.

Yes, I know, and I greatly appreciate it.

Later, we might want to split the codepath again to:

 (0) see if it is huge or small if we are not reading from a pipe;

 (1) if we do not know the size, or if it is moderately tiny, keep
     doing what your code does;

 (2) if we know we are reading something huge with known size, then
     have a loop to read-a-bit-compress-and-write-it-out-while-hashing,
     and finally rename the resulting loose object to its final name.
     Or we may even want to do that into a new pack on its own.

but obviously all of that can come after this initial round ;-)
* Re: [PATCH] don't use mmap() to hash files

From: Thomas Rast @ 2010-02-14 11:14 UTC
To: Junio C Hamano; Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git

On Sunday 14 February 2010 04:14:01 Junio C Hamano wrote:
> Later, we might want to split the codepath again to:
>
> (0) see if it is huge or small if we are not reading from a pipe;
>
> (1) if we do not know the size, or if it is moderately tiny, keep
>     doing what your code does;
>
> (2) if we know we are reading something huge with known size, then
>     have a loop to read-a-bit-compress-and-write-it-out-while-hashing,
>     and finally rename the resulting loose object to its final name.
>     Or we may even want to do that into a new pack on its own.

There's a slight problem with that code (I tried to finish last night's
attempt but got stuck on this): the create_tmpfile() and
move_temp_to_file() duo goes to some lengths to ensure that the file is
created in the same directory that we want it to end up in.  However,
in the block-based scheme, you cannot know which directory this will be
before you have already written the entire output.

So again I guess there are a few possible solutions:

* Try the cross-directory rename anyway, but if it doesn't work,
  copy&unlink.  This of course means that you may write the same
  object over network twice.

* Declare that keeping the memory usage near what it is today (the
  full output buffer plus a constant) is okay.

* Give up and stick with Dmitry's patch :-)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
* Re: [PATCH] don't use mmap() to hash files

From: Junio C Hamano @ 2010-02-14 11:46 UTC
To: Thomas Rast; Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git

Thomas Rast <trast@student.ethz.ch> writes:

> * Give up and stick with Dmitry's patch :-)

You obviously didn't read the last line of my message before
responding.

In any case, I have a suspicion that streaming to a single loose
object file would not buy us much (that is the only case where the
"cross directory rename" could matter), exactly because we wouldn't
want to leave an object in loose form if it is so big that we do not
want to slurp it in full into memory anyway.  If we stream such a huge
object directly to a new pack, on the other hand, there won't be any
cross-directory rename issues.
* Re: [PATCH] don't use mmap() to hash files

From: Johannes Schindelin @ 2010-02-14  1:53 UTC
To: Dmitry Potapov; Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Hi,

On Sun, 14 Feb 2010, Dmitry Potapov wrote:

> +	if (strbuf_read(&sbuf, fd, 4096) >= 0)

How certain are you at this point that all of fd's contents fit into
your memory?

And even if you could be certain, a hint is missing that strbuf_read(),
its name notwithstanding, does not read NUL-terminated strings.  Oh,
and the size is just a hint for the initial size, and it reads until
EOF.  That has to be said in the commit message.

Ciao,
Dscho
* Re: [PATCH] don't use mmap() to hash files

From: Junio C Hamano @ 2010-02-14  2:00 UTC
To: Johannes Schindelin; Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> And even if you could be certain, a hint is missing that strbuf_read(),
> its name notwithstanding, does not read NUL-terminated strings.  Oh,
> and the size is just a hint for the initial size, and it reads until
> EOF.  That has to be said in the commit message.

It is healthy to ask for explanation, especially when the patch cannot
justify itself by reading only the diff and the reviewer needs to check
the surrounding code.  I however think you are asking a little too much
in this particular case.

That strbuf_read() code is exactly the same as the one used for the
"reading from a pipe and we don't know how big it is" case in the
original code.  You can see it in the lines deleted by the patch.
* Re: [PATCH] don't use mmap() to hash files

From: Dmitry Potapov @ 2010-02-14  2:42 UTC
To: Johannes Schindelin; Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 14, 2010 at 02:53:58AM +0100, Johannes Schindelin wrote:
> On Sun, 14 Feb 2010, Dmitry Potapov wrote:
>
> > +	if (strbuf_read(&sbuf, fd, 4096) >= 0)
>
> How certain are you at this point that all of fd's contents fit into
> your memory?

You can't be sure...  In fact, we know mmap() may also fail for huge
files, and so can strbuf_read().  Perhaps mmap() behaves better when
you want to hash a huge file that does not fit in free physical memory,
but I do not think that is an important use case for any VCS, which
mostly stores small text files and a few not-so-big binary files.  Git
is not designed to store your video collection.  (Probably Git can be
improved to handle big files better, but I leave that exercise to those
who want to store their media files in Git.)

> And even if you could be certain, a hint is missing that strbuf_read(),
> its name notwithstanding, does not read NUL-terminated strings.  Oh,
> and the size is just a hint for the initial size, and it reads until
> EOF.  That has to be said in the commit message.

I did not add _any_ new code, including the above line.  It was there
before my patch.  I only removed a few lines for the case where mmap()
was used, and left the code that used strbuf_read() (mmap() was used
for regular files, and strbuf_read() for other types of descriptors).
The fact that I re-aligned those lines could not introduce any bug, so
if you think this code is incorrect, then it was incorrect before my
patch; but I do not see why it would be.

And I see no reason to comment on code that does not change at all
(including that the third parameter of strbuf_read() is just a hint).
I may agree with you that strbuf_read with 'hint' is a bit confusing,
but it has nothing to do with my patch...

Dmitry
* Re: [PATCH] don't use mmap() to hash files

From: Jakub Narebski @ 2010-02-14 11:07 UTC
To: Dmitry Potapov; Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Dmitry Potapov <dpotapov@gmail.com> writes:

> You can't be sure...  In fact, we know mmap() may also fail for huge
> files, and so can strbuf_read().  Perhaps mmap() behaves better when
> you want to hash a huge file that does not fit in free physical
> memory, but I do not think that is an important use case for any VCS,
> which mostly stores small text files and a few not-so-big binary
> files.  Git is not designed to store your video collection.
> (Probably Git can be improved to handle big files better, but I leave
> that exercise to those who want to store their media files in Git.)

Something like git-bigfiles: http://caca.zoy.org/wiki/git-bigfiles ?

(Found via
http://blog.bitquabit.com/2010/02/10/fightings-been-fun-and-all-its-time-shut-and-get-along/
found via http://tomayko.com/, entry for 10 Feb 2010.)

-- 
Jakub Narebski
Poland
ShadeHawk on #git
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 2:42 ` Dmitry Potapov 2010-02-14 11:07 ` Jakub Narebski @ 2010-02-14 11:55 ` Paolo Bonzini 2010-02-14 18:10 ` Johannes Schindelin 2 siblings, 0 replies; 84+ messages in thread From: Paolo Bonzini @ 2010-02-14 11:55 UTC (permalink / raw) To: Dmitry Potapov Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git, Junio C Hamano On 02/14/2010 03:42 AM, Dmitry Potapov wrote: > In fact, we know mmap() also may fail for huge files, so can > strbuf_read(). On a 64-bit machine mmap should fail pretty much never, as it is limited by address space and not available memory. That said I can reproduce the result here too, and it's actually quite understandable from your "time" results: mmap has a minor page fault for every 4k you read (or at least that's the order of magnitude), read has basically none. Furthermore, it's true that with read you touch the whole memory twice from beginning to end, while with mmap you touch a page twice and move on; however, a file's contents will almost always fit in the L2 cache, so it's not too expensive. I tried madvise(buf, size, MADV_SEQUENTIAL) and MADV_WILLNEED but it has no effect. I suspect mmap will be faster only if the data is not in cache (so the cost of a page fault is negligible compared to going to disk) and the average file size is a few megabytes. Paolo ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 2:42 ` Dmitry Potapov 2010-02-14 11:07 ` Jakub Narebski 2010-02-14 11:55 ` Paolo Bonzini @ 2010-02-14 18:10 ` Johannes Schindelin 2010-02-14 19:06 ` Dmitry Potapov 2 siblings, 1 reply; 84+ messages in thread From: Johannes Schindelin @ 2010-02-14 18:10 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Hi, On Sun, 14 Feb 2010, Dmitry Potapov wrote: > On Sun, Feb 14, 2010 at 02:53:58AM +0100, Johannes Schindelin wrote: > > On Sun, 14 Feb 2010, Dmitry Potapov wrote: > > > > > + if (strbuf_read(&sbuf, fd, 4096) >= 0) > > > > How certain are you at this point that all of fd's contents fit into > > your memory? > > You can't be sure... In fact, we know mmap() also may fail for huge > files, so can strbuf_read(). That's comparing oranges to apples. In one case, the address space runs out, in the other the available memory. The latter is much more likely. > > And even if you could be certain, a hint is missing that > > strbuf_read(), its name notwithstanding, does not read NUL-terminated > > strings. Oh, and the size is just a hint for the initial size, and it > > reads until EOF. That has to be said in the commit message. > > I did not add _any_ new code, including the above line. It was there > before my patch. But that explanation does not answer my question, does it? And my question was not unreasonable to ask, was it? Ciao, Dscho ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files
  2010-02-14 18:10                 ` Johannes Schindelin
@ 2010-02-14 19:06                 ` Dmitry Potapov
  2010-02-14 19:22                   ` Johannes Schindelin
  2010-02-14 23:13                   ` Avery Pennarun
  0 siblings, 2 replies; 84+ messages in thread
From: Dmitry Potapov @ 2010-02-14 19:06 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 14, 2010 at 9:10 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> On Sun, 14 Feb 2010, Dmitry Potapov wrote:
>
>> On Sun, Feb 14, 2010 at 02:53:58AM +0100, Johannes Schindelin wrote:
>> > On Sun, 14 Feb 2010, Dmitry Potapov wrote:
>> >
>> > > +	if (strbuf_read(&sbuf, fd, 4096) >= 0)
>> >
>> > How certain are you at this point that all of fd's contents fit into
>> > your memory?
>>
>> You can't be sure... In fact, we know mmap() also may fail for huge
>> files, so can strbuf_read().
>
> That's comparing oranges to apples. In one case, the address space runs
> out, in the other the available memory. The latter is much more likely.

"Much more likely" is not a very quantitative characteristic... I would
prefer to see numbers. My gut feeling is that it is not a problem in
real use cases where Git is used as a VCS and not as storage for huge
media files. In fact, we do not use mmap() on Windows or MacOS at all,
and I have not heard that users of those platforms suffer much more from
an inability to store huge files than those who use Linux. I do not want
to say that there is no difference, but it may not be as large as you
try to portray. In any case, my patch was meant to let people test it,
see what impact it has, and come up with some numbers.

BTW, it is probably not difficult to stream a large file in chunks (and
it may even be much faster, because we work in the CPU cache), but I
suspect it will not resolve all the issues with huge files, because
eventually we need to store them in a pack file. So we need to develop
some strategy for how to deal with them.

One way to deal with them is to stream directly into a separate pack.
Still, that does not resolve all problems, because each pack file has to
be mapped into memory, and this may be a problem for a 32-bit system (or
even for 64-bit systems where a sysadmin sets a limit on the amount of
virtual memory available to a single program).

The other way to handle huge files is to split them into chunks.
http://article.gmane.org/gmane.comp.version-control.git/120112

Maybe there are other approaches. I heard some people have tried to do
something about it, but I have never been interested enough in big files
to look at this issue closely.

>
>> > And even if you could be certain, a hint is missing that
>> > strbuf_read(), its name notwithstanding, does not read NUL-terminated
>> > strings. Oh, and the size is just a hint for the initial size, and it
>> > reads until EOF. That has to be said in the commit message.
>>
>> I did not add _any_ new code, including the above line. It was there
>> before my patch.
>
> But that explanation does not answer my question, does it?

I believe it did, or else I did not understand your question.

Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 19:06 ` Dmitry Potapov @ 2010-02-14 19:22 ` Johannes Schindelin 2010-02-14 19:28 ` Johannes Schindelin 2010-02-14 19:55 ` Dmitry Potapov 2010-02-14 23:13 ` Avery Pennarun 1 sibling, 2 replies; 84+ messages in thread From: Johannes Schindelin @ 2010-02-14 19:22 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Hi, On Sun, 14 Feb 2010, Dmitry Potapov wrote: > On Sun, Feb 14, 2010 at 9:10 PM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > > On Sun, 14 Feb 2010, Dmitry Potapov wrote: > > > >> On Sun, Feb 14, 2010 at 02:53:58AM +0100, Johannes Schindelin wrote: > >> > On Sun, 14 Feb 2010, Dmitry Potapov wrote: > >> > > >> > > + if (strbuf_read(&sbuf, fd, 4096) >= 0) > >> > > >> > How certain are you at this point that all of fd's contents fit > >> > into your memory? > >> > >> You can't be sure... In fact, we know mmap() also may fail for huge > >> files, so can strbuf_read(). > > > > That's comparing oranges to apples. In one case, the address space > > runs out, in the other the available memory. The latter is much more > > likely. > > "much more likely" is not a very qualitative characteristic... Git was touted as a "content tracker". So I use it as such. Concrete example: in one of my repositories, the average file size is well over 2 gigabytes. Go figure, Dscho ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files
  2010-02-14 19:22                   ` Johannes Schindelin
@ 2010-02-14 19:28                     ` Johannes Schindelin
  2010-02-14 19:56                       ` Dmitry Potapov
  1 sibling, 1 reply; 84+ messages in thread
From: Johannes Schindelin @ 2010-02-14 19:28 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Hi,

On Sun, 14 Feb 2010, Johannes Schindelin wrote:

> On Sun, 14 Feb 2010, Dmitry Potapov wrote:
>
> > On Sun, Feb 14, 2010 at 9:10 PM, Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> > >
> > > On Sun, 14 Feb 2010, Dmitry Potapov wrote:
> > >
> > >> On Sun, Feb 14, 2010 at 02:53:58AM +0100, Johannes Schindelin
> > >> wrote:
> > >> > On Sun, 14 Feb 2010, Dmitry Potapov wrote:
> > >> >
> > >> > > +	if (strbuf_read(&sbuf, fd, 4096) >= 0)
> > >> >
> > >> > How certain are you at this point that all of fd's contents fit
> > >> > into your memory?
> > >>
> > >> You can't be sure... In fact, we know mmap() also may fail for huge
> > >> files, so can strbuf_read().
> > >
> > > That's comparing oranges to apples. In one case, the address space
> > > runs out, in the other the available memory. The latter is much more
> > > likely.
> >
> > "much more likely" is not a very qualitative characteristic...
>
> Git was touted as a "content tracker". So I use it as such.
>
> Concrete example: in one of my repositories, the average file size is
> well over 2 gigabytes.

Just to make extremely sure that you understand the issue: adding these
files on a computer with 512 megabytes of RAM works at the moment. Can
you guarantee that there is no regression in that respect _with_ your
patch?

Git is in wide use. If you provide a patch, it is not good enough
anymore if it Works For You(tm) when it Does Not Work Somewhere Else
Anymore(tm).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files
  2010-02-14 19:28                     ` Johannes Schindelin
@ 2010-02-14 19:56                       ` Dmitry Potapov
  2010-02-14 23:52                         ` Zygo Blaxell
  ` (2 more replies)
  0 siblings, 3 replies; 84+ messages in thread
From: Dmitry Potapov @ 2010-02-14 19:56 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 14, 2010 at 10:28 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>>
>> Concrete example: in one of my repositories, the average file size is
>> well over 2 gigabytes.
>
> Just to make extremely sure that you understand the issue: adding these
> files on a computer with 512 megabytes of RAM works at the moment. Can you
> guarantee that there is no regression in that respect _with_ your patch?

It may not work without enough swap space, and it will not be pretty
anyway due to swapping. So, I see the following options:

1. Introduce a configuration parameter that defines whether to use
mmap() to hash files or not. It is a trivial change, but the real
question is what the default value for this option should be (should we
use some heuristic based on file size vs. available memory?).

2. Stream files in chunks. This is better because it is faster,
especially on large files, as you calculate SHA-1 and zip the data while
they are in the CPU cache. However, it may be more difficult to
implement, because we have filters that should be applied to files that
are put into the repository.

3. Improve Git to support huge files on computers with low memory.

I think #3 is a noble goal, but I do not have time for that. I can try
to take on #2, but it may take more time than I have now. As to #1, I am
ready to send the patch if we agree that it is the right way to go... I
am open to any of your suggestions. Maybe there are other options too...

Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 19:56 ` Dmitry Potapov @ 2010-02-14 23:52 ` Zygo Blaxell 2010-02-15 5:05 ` Nicolas Pitre 2010-02-15 7:48 ` Paolo Bonzini 2 siblings, 0 replies; 84+ messages in thread From: Zygo Blaxell @ 2010-02-14 23:52 UTC (permalink / raw) To: Dmitry Potapov Cc: Johannes Schindelin, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Sun, Feb 14, 2010 at 10:56:46PM +0300, Dmitry Potapov wrote: > It may not work without enough swap space, and it will not pretty anyway > due to swapping. So, I see the following options: > > 1. to introduce a configuration parameter that will define whether to use > mmap() to hash files or not. It is a trivial change, but the real question > is what default value for this option (should we do some heuristic based > on filesize vs available memory?) I'm a fan of having the option. It lets users who know what they're doing (and who might be doing something unusual, like files that are large relative to memory size, or files that are being modified during git add) decide how to make the performance-robustness trade-off. "Large" is relative. How common is it to add files to git that are large relative to the RAM size of the machine? I have a machine with 6GB of RAM and a repo on that machine with half a dozen 0.5-1.5GB files in it. Git add is painfully slow on that repo even with mmap(). I wouldn't dare trying to manipulate that repo on a machine with less than 1GB of RAM. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files
  2010-02-14 19:56                       ` Dmitry Potapov
  2010-02-14 23:52                         ` Zygo Blaxell
@ 2010-02-15  5:05                         ` Nicolas Pitre
  2010-02-15 12:23                           ` Dmitry Potapov
  2010-02-15  7:48                         ` Paolo Bonzini
  2 siblings, 1 reply; 84+ messages in thread
From: Nicolas Pitre @ 2010-02-15  5:05 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

On Sun, 14 Feb 2010, Dmitry Potapov wrote:

> 1. to introduce a configuration parameter that will define whether to use
> mmap() to hash files or not. It is a trivial change, but the real question
> is what default value for this option (should we do some heuristic based
> on filesize vs available memory?)

I don't like that kind of heuristic. Heuristics are almost always
wrong, and any issue is damn hard to reproduce. I tend to believe that
mmap() works better by letting the OS page memory in and out as needed,
while reading data into allocated memory is only going to force the
system into swap.

> 2. to stream files in chunks. It is better because it is faster, especially on
> large files, as you calculate SHA-1 and zip data while they are in CPU
> cache. However, it may be more difficult to implement, because we have
> filters that should be apply to files that are put to the repository.

So? "More difficult" when it is the right thing to do is no excuse not
to do it and satisfy ourselves with a half solution. Merely replacing
mmap() with read() has drawbacks while the advantages aren't that many.
Gaining a few percent of speed while becoming less robust when memory
is tight isn't such a great compromise to me.

As to huge files: we have that core.bigFileThreshold variable now, and
anything that crosses it should be considered "stream in / stream out"
without further consideration. That means no diff, no rename similarity
estimates, no delta, no filter, no blame, no fancies. If you have source
code files that big then you have a bigger problem already anyway.
Typical huge files are rarely manipulated, and when they are, it is
pretty unlikely to be by comparing versions with diff; it also means
that you have the storage capacity and network bandwidth to deal with
them. Hence repository tightness is not your top concern in that case,
but repack/checkout speed most likely is. So big files should be
streamed to a pack of their own at "git add" time. Then repack will
simply "reuse pack data" without delta compression attempts, meaning
that they will be streamed into a single huge pack with no issue (this
particular case is already supported in the code).
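For concreteness, the knob Nicolas refers to is an ordinary git configuration variable; at the time of this thread it was honoured by fast-import, with wider use planned. The value below is only an example, not a recommendation:

```ini
; .git/config -- illustrative values
[core]
	; Files larger than this become candidates for "stream in /
	; stream out" handling: no delta compression, no diff/rename
	; machinery applied to them.
	bigFileThreshold = 512m
```

The related `.keep` mechanism he mentions is likewise just a marker file: placing `pack-<name>.keep` next to `pack-<name>.pack` tells repack to leave that pack alone.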
* Re: [PATCH] don't use mmap() to hash files
  2010-02-15  5:05                         ` Nicolas Pitre
@ 2010-02-15 12:23                           ` Dmitry Potapov
  0 siblings, 0 replies; 84+ messages in thread
From: Dmitry Potapov @ 2010-02-15 12:23 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

On Mon, Feb 15, 2010 at 8:05 AM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Sun, 14 Feb 2010, Dmitry Potapov wrote:
>
>> 1. to introduce a configuration parameter that will define whether to use
>> mmap() to hash files or not. It is a trivial change, but the real question
>> is what default value for this option (should we do some heuristic based
>> on filesize vs available memory?)
>
> I don't like such kind of heuristic. They're almost always wrong, and
> any issue is damn hard to reproduce. I tend to believe that mmap() works
> better by letting the OS paging in and out memory as needed while
> reading data into allocated memory is only going to force the system
> into swap.

Probably you are right; a heuristic is a bad idea. Still, we may want
to add an option to disable mmap() during hash calculation if we
preserve mmap() here. Though I don't like keeping mmap() there if we go
for #2... See below...

>> 2. to stream files in chunks. It is better because it is faster, especially on
>> large files, as you calculate SHA-1 and zip data while they are in CPU
>> cache. However, it may be more difficult to implement, because we have
>> filters that should be apply to files that are put to the repository.
>
> So? "More difficult" when it is the right thing to do is no excuse not
> to do it and satisfy ourselves with an half solution. Barely replacing
> mmap() with read() has drawbacks while the advantages aren't that many.
> Gaining a few speed percentage while making it less robust when memory
> is tight isn't such a great compromize to me. BUT if you were to
> replace mmap() with read() and make the process chunked then you do
> improve both speed _and_ memory usage.

I have not had time to look closely at this, but there is one problem
that I noticed -- the header of any git object contains the blob length.
We know this length in advance (without reading all the data) only for
regular files, and only if they do not have any filter to be applied. In
all other cases, it seems we cannot do much better than we do now,
assuming that we do not want to change the storage format...

If so, the question remains what to do about regular files with some
filter. Currently, we use mmap() for the original data but store the
processed data in memory anyway. The question is whether we want to keep
this use of mmap() here. Considering that it is a potential source of
repository corruption, and that these filters should not be used for big
files anyway because they take a lot of memory, I think we should get
rid of mmap() in file hashing completely once we can process regular
files without filters in chunks.

Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 19:56 ` Dmitry Potapov 2010-02-14 23:52 ` Zygo Blaxell 2010-02-15 5:05 ` Nicolas Pitre @ 2010-02-15 7:48 ` Paolo Bonzini 2010-02-15 12:25 ` Dmitry Potapov 2 siblings, 1 reply; 84+ messages in thread From: Paolo Bonzini @ 2010-02-15 7:48 UTC (permalink / raw) To: Dmitry Potapov Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On 02/14/2010 08:56 PM, Dmitry Potapov wrote: > It may not work without enough swap space No, the kernel will use the file itself as "swap" if it has to page out the memory (since the mmap is PROT_READ). Paolo ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-15 7:48 ` Paolo Bonzini @ 2010-02-15 12:25 ` Dmitry Potapov 0 siblings, 0 replies; 84+ messages in thread From: Dmitry Potapov @ 2010-02-15 12:25 UTC (permalink / raw) To: Paolo Bonzini Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, Feb 15, 2010 at 10:48 AM, Paolo Bonzini <bonzini@gnu.org> wrote: > On 02/14/2010 08:56 PM, Dmitry Potapov wrote: >> >> It may not work without enough swap space > > No, the kernel will use the file itself as "swap" if it has to page out the > memory (since the mmap is PROT_READ). I was speaking about read(), when the whole file is read in memory. In this case, you are going to use a lot of "swap" and it is not pretty... That's why I said maybe we should have an option here... Dmitry ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files
  2010-02-14 19:22                   ` Johannes Schindelin
  2010-02-14 19:28                     ` Johannes Schindelin
@ 2010-02-14 19:55                     ` Dmitry Potapov
  1 sibling, 0 replies; 84+ messages in thread
From: Dmitry Potapov @ 2010-02-14 19:55 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 14, 2010 at 10:22 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Git was touted as a "content tracker". So I use it as such.
>

Well... those who got their Git repositories corrupted also used Git as
a "content tracker". But I believe Git is a content tracker primarily
for source code, so using it as a backup tool or to store your
multimedia collection is an abuse of Git... Having said that, I have no
objection to Git being used in those areas too... So, I would like to
hear any specific suggestions for how to achieve those goals better...

Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 19:06 ` Dmitry Potapov 2010-02-14 19:22 ` Johannes Schindelin @ 2010-02-14 23:13 ` Avery Pennarun 2010-02-15 4:16 ` Nicolas Pitre 1 sibling, 1 reply; 84+ messages in thread From: Avery Pennarun @ 2010-02-14 23:13 UTC (permalink / raw) To: Dmitry Potapov Cc: Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Sun, Feb 14, 2010 at 2:06 PM, Dmitry Potapov <dpotapov@gmail.com> wrote: > On Sun, Feb 14, 2010 at 9:10 PM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: >> That's comparing oranges to apples. In one case, the address space runs >> out, in the other the available memory. The latter is much more likely. > > "much more likely" is not a very qualitative characteristic... I would > prefer to see numbers. Well, the numbers are rather easy to calculate of course. On a 32-bit machine, your (ideal) maximum address space size is 4GB. On a 64-bit machine, it's a heck of a lot bigger. And in either case, a single process consuming it all doesn't matter since it won't hurt other processes. But the available RAM is frequently less than 4GB and that has to be shared between *all* your processes. > BTW, probably, it is not difficult to stream a large file in chunks (and > it may be even much faster, because we work on CPU cache), but I suspect > it will not resolve all issues with huge files, because eventually we > need to store them in a pack file. So we need to develop some strategy > how to deal with them. It definitely doesn't resolve all the issues. There are different ways of looking at this; one is to not bother make git-add work smoothly with large files, because calculating the deltas will later cause a disastrous meltdown anyway. In fact, arguably you should prevent git-add from adding large files at all, because at least then you don't get the repository into a hard-to-recover-from state with huge files. 
(This happened at work a few months ago; most people have no idea what to do in such a situation.) The other way to look at it is that if we want git to *eventually* work with huge files, we have to fix each bug one at a time, and we can't go making things worse. For my own situation, I think I'm more likely to (and I know people who are more likely to) try storing huge files in git than I am likely to modify a file *while* I'm trying to store it in git. > One way to deal with them is to stream directly into a separate pack. > Still, it does not resolve all problems, because each pack file should > be mapped into a memory, and this may be a problem for 32-bit system > (or even 64-bit systems where a sysadmin set limit on amount virtual > memory available a single program). > > The other way to handle huge files is to split them into chunks. > http://article.gmane.org/gmane.comp.version-control.git/120112 I have a bit of experience splitting files into chunks: http://groups.google.com/group/bup-list/browse_thread/thread/812031efd4c5f7e4 It works. Also note that the speed gain from mmap'ing packs appears to be much less than the gain from mmap'ing indexes. You could probably sacrifice most or all of the former and never really notice. Caching expanded deltas can be pretty valuable, though. (bup presently avoids that whole question by not using deltas.) I can also confirm that streaming objects directly into packs is a massive performance increase when dealing with big files. However, you then start to run into git's heuristics that often assume (for example) that if an object is in a pack, it should never (or rarely) be pruned. This is normally a fine assumption, because if it was likely to get pruned, it probably never would have been put into a pack in the first place. Have fun, Avery ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-14 23:13 ` Avery Pennarun @ 2010-02-15 4:16 ` Nicolas Pitre 2010-02-15 5:01 ` Avery Pennarun 0 siblings, 1 reply; 84+ messages in thread From: Nicolas Pitre @ 2010-02-15 4:16 UTC (permalink / raw) To: Avery Pennarun Cc: Dmitry Potapov, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Sun, 14 Feb 2010, Avery Pennarun wrote: > On Sun, Feb 14, 2010 at 2:06 PM, Dmitry Potapov <dpotapov@gmail.com> wrote: > > BTW, probably, it is not difficult to stream a large file in chunks (and > > it may be even much faster, because we work on CPU cache), but I suspect > > it will not resolve all issues with huge files, because eventually we > > need to store them in a pack file. So we need to develop some strategy > > how to deal with them. > > It definitely doesn't resolve all the issues. There are different > ways of looking at this; one is to not bother make git-add work > smoothly with large files, because calculating the deltas will later > cause a disastrous meltdown anyway. We have that core.bigFileThreshold configuration variable now. It is currently only used by fast-import. But the idea is to apply it across the board when that makes sense. And one of those places is to skip over those files when it comes to delta compression. > In fact, arguably you should prevent git-add from adding large files > at all, because at least then you don't get the repository into a > hard-to-recover-from state with huge files. (This happened at work a > few months ago; most people have no idea what to do in such a > situation.) Git needs to be fixed in that case, not be crippled. > The other way to look at it is that if we want git to *eventually* > work with huge files, we have to fix each bug one at a time, and we > can't go making things worse. Obviously. 
> For my own situation, I think I'm more likely to (and I know people > who are more likely to) try storing huge files in git than I am likely > to modify a file *while* I'm trying to store it in git. And fancy operations on huge files are pretty unlikely. Blame, diff, etc, are suited for text file which are by nature relatively small. And if your source code is all pasted in one single huge file that Git can't handle right now, then the compiler is unlikely to cope either. > > One way to deal with them is to stream directly into a separate pack. > > Still, it does not resolve all problems, because each pack file should > > be mapped into a memory, and this may be a problem for 32-bit system > > (or even 64-bit systems where a sysadmin set limit on amount virtual > > memory available a single program). This is not a problem at all. Git already deals with packs bigger than the available memory by not mapping them whole -- see documentation for core.packedGitWindowSize. > > The other way to handle huge files is to split them into chunks. > > http://article.gmane.org/gmane.comp.version-control.git/120112 No. The chunk idea doesn't fit the Git model well enough without many corner cases all over the place which is a major drawback. I think that was discussed in that thread already. > I have a bit of experience splitting files into chunks: > http://groups.google.com/group/bup-list/browse_thread/thread/812031efd4c5f7e4 > > It works. Also note that the speed gain from mmap'ing packs appears > to be much less than the gain from mmap'ing indexes. You could > probably sacrifice most or all of the former and never really notice. > Caching expanded deltas can be pretty valuable, though. (bup > presently avoids that whole question by not using deltas.) We do have a cache of expanded deltas already. > I can also confirm that streaming objects directly into packs is a > massive performance increase when dealing with big files. 
> However, you then start to run into git's heuristics that often assume
> (for example) that if an object is in a pack, it should never (or
> rarely) be pruned. This is normally a fine assumption, because if it
> was likely to get pruned, it probably never would have been put into a
> pack in the first place.

Would you please, for my own sanity, tell me where we do such a thing. I
thought I had a firm grip on the pack model, but you're casting a shadow
of doubt on some code I might have written myself.

Nicolas

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-15 4:16 ` Nicolas Pitre @ 2010-02-15 5:01 ` Avery Pennarun 2010-02-15 5:48 ` Nicolas Pitre 0 siblings, 1 reply; 84+ messages in thread From: Avery Pennarun @ 2010-02-15 5:01 UTC (permalink / raw) To: Nicolas Pitre Cc: Dmitry Potapov, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Sun, Feb 14, 2010 at 11:16 PM, Nicolas Pitre <nico@fluxnic.net> wrote: > On Sun, 14 Feb 2010, Avery Pennarun wrote: >> In fact, arguably you should prevent git-add from adding large files >> at all, because at least then you don't get the repository into a >> hard-to-recover-from state with huge files. (This happened at work a >> few months ago; most people have no idea what to do in such a >> situation.) > > Git needs to be fixed in that case, not be crippled. That would be ideal, but is more work than disabling imports for large files by default (for example), which would be easy. In any case, my solution at work was to say "if it hurts, don't do that" and it seems to have worked out okay for now. >> For my own situation, I think I'm more likely to (and I know people >> who are more likely to) try storing huge files in git than I am likely >> to modify a file *while* I'm trying to store it in git. > > And fancy operations on huge files are pretty unlikely. Blame, diff, > etc, are suited for text file which are by nature relatively small. > And if your source code is all pasted in one single huge file that Git > can't handle right now, then the compiler is unlikely to cope either. Well, I'm thinking of things like textual database dumps, such as those produced by mysqldump. It would be nice to be able to diff those efficiently, even if they're several gigs in size. bup's hierarchical chunking allows this. >> > The other way to handle huge files is to split them into chunks. >> > http://article.gmane.org/gmane.comp.version-control.git/120112 > > No. 
The chunk idea doesn't fit the Git model well enough without many > corner cases all over the place which is a major drawback. I think that > was discussed in that thread already. > >> I have a bit of experience splitting files into chunks: >> http://groups.google.com/group/bup-list/browse_thread/thread/812031efd4c5f7e4 Note that bup's rolling-checksum-based hierarchical chunking is not the same as the chunking that was discussed in that thread, and it resolves most of the problems. Unless I'm missing something. Also note that bup just uses normal tree objects (for better or worse) instead of introducing a new object type. >> It works. Also note that the speed gain from mmap'ing packs appears >> to be much less than the gain from mmap'ing indexes. You could >> probably sacrifice most or all of the former and never really notice. >> Caching expanded deltas can be pretty valuable, though. (bup >> presently avoids that whole question by not using deltas.) > > We do have a cache of expanded deltas already. Yes, sorry to have implied otherwise. I was just comparing the performance advantage of the delta expansion cache (which should be a lot) with that of mmaping packfiles (which probably isn't much since the packfile data is typically needed in expanded form anyway). >> I can also confirm that streaming objects directly into packs is a >> massive performance increase when dealing with big files. However, >> you then start to run into git's heuristics that often assume (for >> example) that if an object is in a pack, it should never (or rarely) >> be pruned. This is normally a fine assumption, because if it was >> likely to get pruned, it probably never would have been put into a >> pack in the first place. > > Would you please for my own sanity tell me where we do such thing. I > thought I had a firm grip on the pack model but you're casting a shadow > of doubts on some code I might have written myself. 
Sorry, I didn't hunt down the code, but I ran into it while experimenting before. The rules are something like: - git-prune only prunes unpacked objects - git-repack claims to be willing to explode unreachable objects back into loose objects with -A, but I'm not quite sure if its definition of "unreachable" is the same as mine. And I'm not sure rewriting a pack with -A makes the old pack reliably unreachable according to -d. It's possible I was just being dense. - there seems to be no documented situation in which you can ever delete unused objects from a pack without using repack -a or -A, which can be amazingly slow if your packs are huge. (Ideally you'd only repack the particular packs that you want to shrink.) For example, my bup repo is currently 200 GB. Anyway, I didn't have much luck when playing with it earlier, but didn't investigate since I assumed it's just a workflow that nobody much cares about. Which I think is a reasonable position for git developers to take anyway. Have fun, Avery ^ permalink raw reply [flat|nested] 84+ messages in thread
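Avery's observations above (git-prune only touching loose objects, repack -A exploding unreachable objects) can be poked at empirically in a scratch repository. The following sketch is illustrative only; identities and file names are made up, and it should be run nowhere near a repository you care about.

```shell
#!/bin/sh
# Throwaway demo of the repack -A / prune interaction discussed above.
# All names here are illustrative; run it only in a scratch directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email avery@example.com   # placeholder identity
git config user.name demo

echo keep > keep; git add keep; git commit -qm one
echo drop > drop; git add drop; git commit -qm two
dsha=$(git rev-parse HEAD:drop)           # blob we will orphan

git reset -q --hard HEAD~1                # second commit is now unreachable
git reflog expire --expire=now --all      # ...even through the reflog

git repack -A -d -q       # reachable objects go into a pack; unreachable
                          # ones are kept (or exploded back) loose
git cat-file -e "$dsha"   # still retrievable at this point, just loose
git prune --expire=now    # git-prune only ever deletes loose objects
if git cat-file -e "$dsha" 2>/dev/null; then
    echo "blob survived"
else
    echo "orphaned blob pruned"
fi
```

The point of the sketch is the division of labour: repack -A -d never destroys unreachable objects, it only makes sure they end up loose, where git-prune can then reach them.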
* Re: [PATCH] don't use mmap() to hash files 2010-02-15 5:01 ` Avery Pennarun @ 2010-02-15 5:48 ` Nicolas Pitre 2010-02-15 19:19 ` Avery Pennarun 0 siblings, 1 reply; 84+ messages in thread From: Nicolas Pitre @ 2010-02-15 5:48 UTC (permalink / raw) To: Avery Pennarun Cc: Dmitry Potapov, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, 15 Feb 2010, Avery Pennarun wrote: > - git-prune only prunes unpacked objects > > - git-repack claims to be willing to explode unreachable objects back > into loose objects with -A, but I'm not quite sure if its definition > of "unreachable" is the same as mine. Unreachable means not referenced by the specified rev-list specification. So if you give it --all --reflog, it means any object that is not reachable through your branches, tags, or reflog entries. > And I'm not sure rewriting a > pack with -A makes the old pack reliably unreachable according to -d. Reachability doesn't apply to packs; it applies to objects. Unreachable objects may be copied to loose objects with -A, or simply forgotten about with -a. Then -d will literally delete the old pack file. > - there seems to be no documented situation in which you can ever > delete unused objects from a pack without using repack -a or -A, which > can be amazingly slow if your packs are huge. (Ideally you'd only > repack the particular packs that you want to shrink.) For example, my > bup repo is currently 200 GB. Ideally you don't keep volatile objects in huge packs. That's why we have .keep to flag packs that are huge and pure, so that they are not touched anymore. Incremental repacking is there to gather only the _reachable_ loose objects into a new pack. The objects that you're likely to make unreachable are probably going to come from a temporary branch that you deleted, which is likely to affect objects only from that latest, small pack. 
And repacking can be done unattended and in parallel to normal Git operations with no issues. So even if it is slow to repack huge packs, it is something that you might do during the night and only once in a while. But if you really want to shrink only one pack without touching the other packs, and you do know which objects have to be removed from that pack, then it is trivial to write a small script using git-show-index: sort the output by offset, filter out the unwanted objects, keep only the SHA1 column, and feed the result into git-pack-objects. Oh, and delete the original pack when done, of course. It is also trivial to generate the list of all packed objects, compare it to the list of all reachable objects, and prune from the packs those objects which are not found in the reachable object list. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
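The small script Nicolas describes can be sketched roughly as follows. This is one reading of his recipe, not a tested tool: the pack prefix, the function name, and the identical-pack guard are assumptions, and the unwanted-object list must contain only genuinely unreachable objects or the repository ends up missing reachable data.

```shell
# Sketch of the pack-shrinking recipe described above: list a pack's
# objects in offset order, drop the unwanted ones, and write the rest
# into a fresh pack.  Run from the repository root.
shrink_pack() {
    idx=$1        # e.g. .git/objects/pack/pack-XXXX.idx
    unwanted=$2   # file listing one unwanted SHA-1 per line (no blank lines)
    pack=${idx%.idx}.pack

    newpack=$(git show-index <"$idx" |
        sort -n |                    # offset order keeps deltas near their bases
        awk '{ print $2 }' |         # keep only the SHA-1 column
        grep -v -x -F -f "$unwanted" |
        git pack-objects .git/objects/pack/pack)

    # Delete the original only after the new pack exists, and never if
    # pack-objects regenerated the very same pack (identical object set).
    case $pack in
    *"$newpack"*) ;;
    *) rm -f "$pack" "$idx" ;;
    esac
}
```

git show-index prints one "offset SHA-1 (crc)" line per object, and git pack-objects reads object names from stdin and prints the new pack's name, which is what makes this a three-stage pipe.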
* Re: [PATCH] don't use mmap() to hash files 2010-02-15 5:48 ` Nicolas Pitre @ 2010-02-15 19:19 ` Avery Pennarun 2010-02-15 19:29 ` Nicolas Pitre 0 siblings, 1 reply; 84+ messages in thread From: Avery Pennarun @ 2010-02-15 19:19 UTC (permalink / raw) To: Nicolas Pitre Cc: Dmitry Potapov, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, Feb 15, 2010 at 12:48 AM, Nicolas Pitre <nico@fluxnic.net> wrote: > Ideally you don't keep volatile objects into huge packs. That's why we > have .keep to flag those packs that are huge and pure so not to touch > them anymore. Of course the problem here is that as soon as you import a single (possibly volatile) 2GB file, your pack becomes "huge." So these heuristics stop working very well and start to need some revision. Thanks for the clarification on packing and repacking. I might be able to use this method to come up with a better repacking/pruning algorithm in bup, and from there to perhaps forward it on to git. Have fun, Avery ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] don't use mmap() to hash files 2010-02-15 19:19 ` Avery Pennarun @ 2010-02-15 19:29 ` Nicolas Pitre 0 siblings, 0 replies; 84+ messages in thread From: Nicolas Pitre @ 2010-02-15 19:29 UTC (permalink / raw) To: Avery Pennarun Cc: Dmitry Potapov, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git [-- Attachment #1: Type: TEXT/PLAIN, Size: 827 bytes --] On Mon, 15 Feb 2010, Avery Pennarun wrote: > On Mon, Feb 15, 2010 at 12:48 AM, Nicolas Pitre <nico@fluxnic.net> wrote: > > Ideally you don't keep volatile objects into huge packs. That's why we > > have .keep to flag those packs that are huge and pure so not to touch > > them anymore. > > Of course the problem here is that as soon as you import a single > (possibly volatile) 2GB file, your pack becomes "huge." So these > heuristics stop working very well and start to need some revision. You don't have to repack that often though. In which case the single-object 2GB pack might be discarded before the next repack. And loose objects are packed into a pack of their own more often than multiple packs being repacked into a single pack. So I think the current heuristics should still work pretty well. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* [PATCH v2] don't use mmap() to hash files 2010-02-14 1:18 ` [PATCH] don't use mmap() to hash files Dmitry Potapov 2010-02-14 1:37 ` Junio C Hamano 2010-02-14 1:53 ` Johannes Schindelin @ 2010-02-14 3:05 ` Dmitry Potapov 2010-02-18 1:16 ` [PATCH] Teach "git add" and friends to be paranoid Junio C Hamano 3 siblings, 0 replies; 84+ messages in thread From: Dmitry Potapov @ 2010-02-14 3:05 UTC (permalink / raw) To: git Cc: Junio C Hamano, Johannes Schindelin, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder If an mmapped file is changed by another program while git-add runs, it corrupts the repository. Disabling mmap() in index_fd() does not have any negative impact on the overall speed of Git. In fact, it makes git hash-object work slightly faster. Here is the best result before and after the patch, based on 5 runs on the Linux kernel repository: Before: $ git ls-files | time git hash-object --stdin-paths > /dev/null 2.15user 0.36system 0:02.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+103248minor)pagefaults 0swaps After: $ git ls-files | time ../git/git hash-object --stdin-paths > /dev/null 2.09user 0.33system 0:02.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+1073minor)pagefaults 0swaps Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- In this version, I have improved the hint value for regular files to avoid useless re-allocation and copying. 
sha1_file.c | 27 +++++++++++---------------- 1 files changed, 11 insertions(+), 16 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 657825e..26c6231 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2438,22 +2438,17 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path) { int ret; - size_t size = xsize_t(st->st_size); - - if (!S_ISREG(st->st_mode)) { - struct strbuf sbuf = STRBUF_INIT; - if (strbuf_read(&sbuf, fd, 4096) >= 0) - ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object, - type, path); - else - ret = -1; - strbuf_release(&sbuf); - } else if (size) { - void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); - ret = index_mem(sha1, buf, size, write_object, type, path); - munmap(buf, size); - } else - ret = index_mem(sha1, NULL, size, write_object, type, path); + struct strbuf sbuf = STRBUF_INIT; + /* for regular files, we supply the real file size, otherwise + `size' is just a hint */ + size_t size = S_ISREG(st->st_mode) ? xsize_t(st->st_size) : 4096; + + if (strbuf_read(&sbuf, fd, size) >= 0) + ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object, + type, path); + else + ret = -1; + strbuf_release(&sbuf); close(fd); return ret; } -- 1.7.0 ^ permalink raw reply related [flat|nested] 84+ messages in thread
* [PATCH] Teach "git add" and friends to be paranoid 2010-02-14 1:18 ` [PATCH] don't use mmap() to hash files Dmitry Potapov ` (2 preceding siblings ...) 2010-02-14 3:05 ` [PATCH v2] " Dmitry Potapov @ 2010-02-18 1:16 ` Junio C Hamano 2010-02-18 1:20 ` Junio C Hamano ` (3 more replies) 3 siblings, 4 replies; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 1:16 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git When creating a loose object, we normally mmap(2) the entire file, and hash and then compress to write it out in two separate steps for efficiency. This is perfectly good for the intended use of git---nobody is supposed to be insane enough to expect that it won't break anything to muck with the contents of a file after telling git to index it and before getting the control back from git. But the nature of breakage caused by such an abuse is rather bad. We will end up with loose object files whose names do not match what is stored and recovered when uncompressed. This teaches the index_mem() codepath to be paranoid and hash and compress the data after reading it in core. The contents hashed may not match the contents of the file in an insane use case, but at least this way the result will be internally consistent. 
Signed-off-by: Junio C Hamano <gitster@pobox.com> --- sha1_file.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++----------- 1 files changed, 66 insertions(+), 15 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 657825e..d8a7722 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2278,7 +2278,8 @@ static int create_tmpfile(char *buffer, size_t bufsiz, const char *filename) } static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, - void *buf, unsigned long len, time_t mtime) + void *buf, unsigned long len, time_t mtime, + int paranoid) { int fd, ret; size_t size; @@ -2286,6 +2287,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, z_stream stream; char *filename; static char tmpfile[PATH_MAX]; + git_SHA_CTX ctx; filename = sha1_file_name(sha1); fd = create_tmpfile(tmpfile, sizeof(tmpfile), filename); @@ -2312,12 +2314,41 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, stream.next_in = (unsigned char *)hdr; stream.avail_in = hdrlen; while (deflate(&stream, 0) == Z_OK) - /* nothing */; + ; /* nothing */ /* Then the data itself.. 
*/ - stream.next_in = buf; - stream.avail_in = len; - ret = deflate(&stream, Z_FINISH); + if (paranoid) { + unsigned char stablebuf[262144]; + char *bufptr = buf; + unsigned long remainder = len; + + git_SHA1_Init(&ctx); + git_SHA1_Update(&ctx, hdr, hdrlen); + + ret = Z_OK; + while (remainder) { + unsigned long chunklen = remainder; + + if (sizeof(stablebuf) <= chunklen) + chunklen = sizeof(stablebuf); + memcpy(stablebuf, bufptr, chunklen); + git_SHA1_Update(&ctx, stablebuf, chunklen); + stream.next_in = stablebuf; + stream.avail_in = chunklen; + do { + ret = deflate(&stream, Z_NO_FLUSH); + } while (ret == Z_OK); + bufptr += chunklen; + remainder -= chunklen; + } + if (ret != Z_STREAM_END) + ret = deflate(&stream, Z_FINISH); + } else { + stream.next_in = buf; + stream.avail_in = len; + ret = deflate(&stream, Z_FINISH); + } + if (ret != Z_STREAM_END) die("unable to deflate new object %s (%d)", sha1_to_hex(sha1), ret); @@ -2327,6 +2358,12 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, size = stream.total_out; + if (paranoid) { + unsigned char paranoid_sha1[20]; + git_SHA1_Final(paranoid_sha1, &ctx); + if (hashcmp(paranoid_sha1, sha1)) + die("hashed file is volatile"); + } if (write_buffer(fd, compressed, size) < 0) die("unable to write sha1 file"); close_sha1_file(fd); @@ -2344,7 +2381,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, return move_temp_to_file(tmpfile, filename); } -int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *returnsha1) +static int write_sha1_file_paranoid(void *buf, unsigned long len, const char *type, unsigned char *returnsha1, int paranoid) { unsigned char sha1[20]; char hdr[32]; @@ -2358,7 +2395,12 @@ int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned cha hashcpy(returnsha1, sha1); if (has_sha1_file(sha1)) return 0; - return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); + return write_loose_object(sha1, 
hdr, hdrlen, buf, len, 0, paranoid); +} + +int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *returnsha1) +{ + return write_sha1_file_paranoid(buf, len, type, returnsha1, 0); } int force_object_loose(const unsigned char *sha1, time_t mtime) @@ -2376,7 +2418,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime) if (!buf) return error("cannot read sha1_file for %s", sha1_to_hex(sha1)); hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1; - ret = write_loose_object(sha1, hdr, hdrlen, buf, len, mtime); + ret = write_loose_object(sha1, hdr, hdrlen, buf, len, mtime, 0); free(buf); return ret; @@ -2405,10 +2447,15 @@ int has_sha1_file(const unsigned char *sha1) return has_loose_object(sha1); } +#define INDEX_MEM_WRITE_OBJECT 01 +#define INDEX_MEM_PARANOID 02 + static int index_mem(unsigned char *sha1, void *buf, size_t size, - int write_object, enum object_type type, const char *path) + enum object_type type, const char *path, int flag) { int ret, re_allocated = 0; + int write_object = flag & INDEX_MEM_WRITE_OBJECT; + int paranoid = flag & INDEX_MEM_PARANOID; if (!type) type = OBJ_BLOB; @@ -2426,9 +2473,11 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size, } if (write_object) - ret = write_sha1_file(buf, size, typename(type), sha1); + ret = write_sha1_file_paranoid(buf, size, typename(type), + sha1, paranoid); else ret = hash_sha1_file(buf, size, typename(type), sha1); + if (re_allocated) free(buf); return ret; @@ -2437,23 +2486,25 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size, int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path) { - int ret; + int ret, flag; size_t size = xsize_t(st->st_size); + flag = write_object ? 
INDEX_MEM_WRITE_OBJECT : 0; if (!S_ISREG(st->st_mode)) { struct strbuf sbuf = STRBUF_INIT; if (strbuf_read(&sbuf, fd, 4096) >= 0) - ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object, - type, path); + ret = index_mem(sha1, sbuf.buf, sbuf.len, + type, path, flag); else ret = -1; strbuf_release(&sbuf); } else if (size) { void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); - ret = index_mem(sha1, buf, size, write_object, type, path); + flag |= INDEX_MEM_PARANOID; + ret = index_mem(sha1, buf, size, type, path, flag); munmap(buf, size); } else - ret = index_mem(sha1, NULL, size, write_object, type, path); + ret = index_mem(sha1, NULL, size, type, path, flag); close(fd); return ret; } -- 1.7.0.81.g58679 ^ permalink raw reply related [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:16 ` [PATCH] Teach "git add" and friends to be paranoid Junio C Hamano @ 2010-02-18 1:20 ` Junio C Hamano 2010-02-18 15:32 ` Zygo Blaxell 2010-02-18 1:38 ` Jeff King ` (2 subsequent siblings) 3 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 1:20 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Junio C Hamano <gitster@pobox.com> writes: > But the nature of breakage caused by such an abuse is rather bad. We will > end up with loose object files, whose names do not match what are stored > and recovered when uncompressed. > > This teaches the index_mem() codepath to be paranoid and hash and compress > the data after reading it in core. The contents hashed may not match the > contents of the file in an insane use case, but at least this way the > result will be internally consistent. With a small fix to the test program earlier in the thread, this seems to protect the repository; I didn't bother to assess the performance impact of the patch, though. Here is the corrected test. -- >8 -- #!/bin/sh set -e # Create an empty git repo in /tmp/git-test rm -fr /tmp/git-test mkdir /tmp/git-test cd /tmp/git-test git init # Create a file named foo and add it to the repo touch foo git add foo # Thread 1: continuously modify foo: while echo -n .; do dd if=/dev/urandom of=foo count=1024 bs=1k conv=notrunc >/dev/null 2>&1 done & # Thread 2: loop until the repo is corrupted while git fsck; do # Note the implied 'git add' in 'commit -a' # It will do the same with explicit 'git add' git commit -a -m'Test' || break done # Kill thread 1, we don't need it any more kill $! # Success! Well, sort of. if git fsck then echo Repository is corrupted. Have a nice day. else echo Repository is still healthy. You are stupid. fi ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:20 ` Junio C Hamano @ 2010-02-18 15:32 ` Zygo Blaxell 2010-02-19 17:51 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Zygo Blaxell @ 2010-02-18 15:32 UTC (permalink / raw) To: Junio C Hamano Cc: Dmitry Potapov, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Wed, Feb 17, 2010 at 05:20:00PM -0800, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > With a small fix to the test program earlier in the thread, this seems to > protect the repository; I didn't bother to assess the performance impact > of the patch, though. > > Here is the corrected test. Depends on what you mean by "corrected" I suppose. > # Thread 2: loop until the repo is corrupted > while git fsck; do > # Note the implied 'git add' in 'commit -a' > # It will do the same with explicit 'git add' > git commit -a -m'Test' || break This is at least partly wrong--it will terminate prematurely if Thread 1 gets stalled and fails to modify 'foo' during the loop (git commit normally refuses to commit a tree with no changes). This can cause the test for the corruption bug to return false success results. If you add '--allow-empty' to the git commit command you will fix that case, but there might be others. If git commit runs out of disk space, for example, the commit should fail, but the repository should still not be corrupt. Future commits (for example after freeing some disk space) should eventually succeed. Really, the original loop was correct, and this new one isn't. > else > echo Repository is still healthy. You are stupid. If git is working, you should never reach this line, because git fsck should not fail after executing any sequence of git porcelain operations--and this particular sequence is nothing but 'git commit' in a single thread. ^ permalink raw reply [flat|nested] 84+ messages in thread
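Zygo's --allow-empty point is easy to check with a bounded rendition of the commit loop. The bound, identities, and iteration count below are arbitrary additions so that the sketch terminates; the real test loops until fsck fails.

```shell
#!/bin/sh
# Bounded rendition of the test loop with Zygo's fix: --allow-empty
# keeps a stalled writer from aborting the loop through a failed
# empty commit.  Three iterations are arbitrary; the original test
# loops until fsck fails.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email zygo@example.com   # placeholder identity
git config user.name demo
touch foo
git add foo
i=0
while [ "$i" -lt 3 ] && git fsck >/dev/null 2>&1; do
    # no writer thread here, so later commits would be empty and
    # fail without the flag, ending the loop prematurely
    git commit -q --allow-empty -a -m "Test $i"
    i=$((i + 1))
done
git fsck >/dev/null 2>&1 && echo "repository still healthy after $i commits"
```

Without --allow-empty, the second iteration's commit fails (nothing changed) and the loop reports a result before any corruption could have been provoked, which is exactly the false-success mode described above.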
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 15:32 ` Zygo Blaxell @ 2010-02-19 17:51 ` Junio C Hamano 0 siblings, 0 replies; 84+ messages in thread From: Junio C Hamano @ 2010-02-19 17:51 UTC (permalink / raw) To: Zygo Blaxell Cc: Dmitry Potapov, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Zygo Blaxell <zblaxell@gibbs.hungrycats.org> writes: > If git commit runs out of disk space, for example, the commit should > fail, but the repository should still not be corrupt. Future commits > (for example after freeing some disk space) should eventually succeed. That is true. It is OK to create a corrupt object as long as an equivalent of fsck is run immediately after object creation to catch the breakage and prevent it from propagating further. That is essentially what the "paranoid" switch does, but it adds overhead that is unnecessary for the use case we primarily target in git. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:16 ` [PATCH] Teach "git add" and friends to be paranoid Junio C Hamano 2010-02-18 1:20 ` Junio C Hamano @ 2010-02-18 1:38 ` Jeff King 2010-02-18 4:55 ` Nicolas Pitre 2010-02-18 10:14 ` Thomas Rast 2010-02-19 8:28 ` Dmitry Potapov 3 siblings, 1 reply; 84+ messages in thread From: Jeff King @ 2010-02-18 1:38 UTC (permalink / raw) To: Junio C Hamano Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Wed, Feb 17, 2010 at 05:16:23PM -0800, Junio C Hamano wrote: > + if (paranoid) { > + unsigned char stablebuf[262144]; Is 256K a bit big for allocating on the stack? Modern OS's seem to give us at least a couple of megabytes (my Linux boxen all have 8M, and even Solaris 8 seems to have that much). But PTHREAD_STACK_MIN is only 16K (I don't think it is possible to hit this code path in a thread right now, but I'm not sure). And I have no idea what the situation is on Windows. I dunno if it is worth worrying about, but maybe somebody more clueful than me can comment. -Peff ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:38 ` Jeff King @ 2010-02-18 4:55 ` Nicolas Pitre 2010-02-18 5:36 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Nicolas Pitre @ 2010-02-18 4:55 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Wed, 17 Feb 2010, Jeff King wrote: > On Wed, Feb 17, 2010 at 05:16:23PM -0800, Junio C Hamano wrote: > > > + if (paranoid) { > > + unsigned char stablebuf[262144]; > > Is 256K a bit big for allocating on the stack? Modern OS's seem to give > us at least a couple of megabytes (my Linux boxen all have 8M, and > even Solaris 8 seems to have that much). But PTHREAD_STACK_MIN is only > 16K (I don't think it is possible to hit this code path in a thread > right now, but I'm not sure). And I have no idea what the situation is > on Windows. > > I dunno if it is worth worrying about, but maybe somebody more clueful > than me can comment. It is likely to have better performance if the buffer is small enough to fit in the CPU L1 cache. There are two sequential passes over the buffer: one for the SHA1 computation, and another for the compression, and currently they're sure to thrash the L1 cache on each pass. Of course that requires big enough objects to matter. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 4:55 ` Nicolas Pitre @ 2010-02-18 5:36 ` Junio C Hamano 2010-02-18 7:27 ` Wincent Colaiuta 2010-02-22 12:59 ` Paolo Bonzini 0 siblings, 2 replies; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 5:36 UTC (permalink / raw) To: Nicolas Pitre Cc: Jeff King, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Nicolas Pitre <nico@fluxnic.net> writes: > It is likely to have better performance if the buffer is small enough to > fit in the CPU L1 cache. There are two sequencial passes over the > buffer: one for the SHA1 computation, and another for the compression, > and currently they're sure to trash the L1 cache on each pass. I did a very unscientific test to hash about 14k paths (arch/ and fs/ from the kernel source) using "git-hash-object -w --stdin-paths" into an empty repository with varying sizes of paranoia buffer (quarter, 1, 4, 8 and 256kB) and saw 8-30% overhead. 256kB did hurt and around 4kB seemed to be optimal for my this small sample load. In any case, with any size of paranoia, this hurts the sane use case, so I'd introduce an expert switch to disable it, like this. -- >8 -- When creating a loose object, we normally mmap(2) the entire file, and hash and then compress to write it out in two separate steps for efficiency. This is perfectly good for the intended use of git---nobody is supposed to be insane enough to expect that it won't break anything to muck with the contents of a file after telling git to index it and before getting the control back from git. But the nature of breakage caused by such an abuse is rather bad. We will end up with loose object files, whose names do not match what are stored and recovered when uncompressed. This teaches the index_mem() codepath to be paranoid and hash and compress the data after reading it in core. 
The contents hashed may not match the contents of the file in an insane use case, but at least this way the result will be internally consistent. People with saner use of git can regain performance by setting a new configuration variable 'core.volatilefiles' to false to disable this check. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/config.txt | 7 ++++ cache.h | 1 + config.c | 6 +++ environment.c | 1 + sha1_file.c | 83 +++++++++++++++++++++++++++++++++++++-------- 5 files changed, 83 insertions(+), 15 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 52786c7..0295aee 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -117,6 +117,14 @@ core.fileMode:: the working copy are ignored; useful on broken filesystems like FAT. See linkgit:git-update-index[1]. True by default. +core.volatilefiles:: + If you modify a file after telling git to record it (e.g. with + "git add") but before git finishes the request and gives the + control back to you, you may create a broken object (and of course + you can keep both halves ;-). Setting this option to true will + tell git to be extra careful to detect the situation and abort. + Defaults to true. + core.ignoreCygwinFSTricks:: This option is only used by Cygwin implementation of Git. If false, the Cygwin stat() and lstat() functions are used. 
This may be useful diff --git a/cache.h b/cache.h index 231c06d..e5a87cf 100644 --- a/cache.h +++ b/cache.h @@ -497,6 +497,7 @@ extern int trust_ctime; extern int quote_path_fully; extern int has_symlinks; extern int ignore_case; +extern int worktree_files_are_volatile; extern int assume_unchanged; extern int prefer_symlink_refs; extern int log_all_ref_updates; diff --git a/config.c b/config.c index 790405a..9898041 100644 --- a/config.c +++ b/config.c @@ -360,6 +360,12 @@ static int git_default_core_config(const char *var, const char *value) trust_executable_bit = git_config_bool(var, value); return 0; } + + if (!strcmp(var, "core.volatilefiles")) { + worktree_files_are_volatile = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "core.trustctime")) { trust_ctime = git_config_bool(var, value); return 0; diff --git a/environment.c b/environment.c index e278bce..5d0faf3 100644 --- a/environment.c +++ b/environment.c @@ -22,6 +22,7 @@ int is_bare_repository_cfg = -1; /* unspecified */ int log_all_ref_updates = -1; /* unspecified */ int warn_ambiguous_refs = 1; int repository_format_version; +int worktree_files_are_volatile = 1; /* yuck */ const char *git_commit_encoding; const char *git_log_output_encoding; int shared_repository = PERM_UMASK; diff --git a/sha1_file.c b/sha1_file.c index 52d1ead..e126179 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2335,14 +2335,18 @@ static int create_tmpfile(char *buffer, size_t bufsiz, const char *filename) } static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, - void *buf, unsigned long len, time_t mtime) + void *buf, unsigned long len, time_t mtime, + int paranoid) { int fd, size, ret; unsigned char *compressed; z_stream stream; char *filename; static char tmpfile[PATH_MAX]; + git_SHA_CTX ctx; + if (!worktree_files_are_volatile) + paranoid = 0; filename = sha1_file_name(sha1); fd = create_tmpfile(tmpfile, sizeof(tmpfile), filename); if (fd < 0) { @@ -2366,12 +2370,41 @@ static int 
write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	stream.next_in = (unsigned char *)hdr;
 	stream.avail_in = hdrlen;
 	while (deflate(&stream, 0) == Z_OK)
-		/* nothing */;
+		; /* nothing */
 
 	/* Then the data itself.. */
-	stream.next_in = buf;
-	stream.avail_in = len;
-	ret = deflate(&stream, Z_FINISH);
+	if (paranoid) {
+		unsigned char stablebuf[4096];
+		char *bufptr = buf;
+		unsigned long remainder = len;
+
+		git_SHA1_Init(&ctx);
+		git_SHA1_Update(&ctx, hdr, hdrlen);
+
+		ret = Z_OK;
+		while (remainder) {
+			unsigned long chunklen = remainder;
+
+			if (sizeof(stablebuf) <= chunklen)
+				chunklen = sizeof(stablebuf);
+			memcpy(stablebuf, bufptr, chunklen);
+			git_SHA1_Update(&ctx, stablebuf, chunklen);
+			stream.next_in = stablebuf;
+			stream.avail_in = chunklen;
+			do {
+				ret = deflate(&stream, Z_NO_FLUSH);
+			} while (ret == Z_OK);
+			bufptr += chunklen;
+			remainder -= chunklen;
+		}
+		if (ret != Z_STREAM_END)
+			ret = deflate(&stream, Z_FINISH);
+	} else {
+		stream.next_in = buf;
+		stream.avail_in = len;
+		ret = deflate(&stream, Z_FINISH);
+	}
+
 	if (ret != Z_STREAM_END)
 		die("unable to deflate new object %s (%d)", sha1_to_hex(sha1), ret);
@@ -2381,6 +2414,12 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	size = stream.total_out;
 
+	if (paranoid) {
+		unsigned char paranoid_sha1[20];
+		git_SHA1_Final(paranoid_sha1, &ctx);
+		if (hashcmp(paranoid_sha1, sha1))
+			die("hashed file is volatile");
+	}
 	if (write_buffer(fd, compressed, size) < 0)
 		die("unable to write sha1 file");
 	close_sha1_file(fd);
@@ -2398,7 +2437,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	return move_temp_to_file(tmpfile, filename);
 }
 
-int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
+static int write_sha1_file_paranoid(void *buf, unsigned long len, const char *type, unsigned char *returnsha1, int paranoid)
 {
 	unsigned char sha1[20];
 	char hdr[32];
@@ -2412,7 +2451,12 @@ int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned cha
 		hashcpy(returnsha1, sha1);
 	if (has_sha1_file(sha1))
 		return 0;
-	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
+	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0, paranoid);
+}
+
+int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
+{
+	return write_sha1_file_paranoid(buf, len, type, returnsha1, 0);
 }
 
 int force_object_loose(const unsigned char *sha1, time_t mtime)
@@ -2430,7 +2474,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 	if (!buf)
 		return error("cannot read sha1_file for %s", sha1_to_hex(sha1));
 	hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1;
-	ret = write_loose_object(sha1, hdr, hdrlen, buf, len, mtime);
+	ret = write_loose_object(sha1, hdr, hdrlen, buf, len, mtime, 0);
 	free(buf);
 	return ret;
@@ -2467,10 +2511,15 @@ int has_sha1_file(const unsigned char *sha1)
 	return has_loose_object(sha1);
 }
 
+#define INDEX_MEM_WRITE_OBJECT 01
+#define INDEX_MEM_PARANOID 02
+
 static int index_mem(unsigned char *sha1, void *buf, size_t size,
-		     int write_object, enum object_type type, const char *path)
+		     enum object_type type, const char *path, int flag)
 {
 	int ret, re_allocated = 0;
+	int write_object = flag & INDEX_MEM_WRITE_OBJECT;
+	int paranoid = flag & INDEX_MEM_PARANOID;
 
 	if (!type)
 		type = OBJ_BLOB;
@@ -2488,9 +2537,11 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 	}
 
 	if (write_object)
-		ret = write_sha1_file(buf, size, typename(type), sha1);
+		ret = write_sha1_file_paranoid(buf, size, typename(type),
+					       sha1, paranoid);
 	else
 		ret = hash_sha1_file(buf, size, typename(type), sha1);
+
 	if (re_allocated)
 		free(buf);
 	return ret;
@@ -2499,23 +2550,25 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	     enum object_type type, const char *path)
 {
-	int ret;
+	int ret, flag;
 	size_t size = xsize_t(st->st_size);
 
+	flag = write_object ? INDEX_MEM_WRITE_OBJECT : 0;
 	if (!S_ISREG(st->st_mode)) {
 		struct strbuf sbuf = STRBUF_INIT;
 		if (strbuf_read(&sbuf, fd, 4096) >= 0)
-			ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object,
-					type, path);
+			ret = index_mem(sha1, sbuf.buf, sbuf.len,
+					type, path, flag);
 		else
 			ret = -1;
 		strbuf_release(&sbuf);
	} else if (size) {
 		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
-		ret = index_mem(sha1, buf, size, write_object, type, path);
+		flag |= INDEX_MEM_PARANOID;
+		ret = index_mem(sha1, buf, size, type, path, flag);
 		munmap(buf, size);
 	} else
-		ret = index_mem(sha1, NULL, size, write_object, type, path);
+		ret = index_mem(sha1, NULL, size, type, path, flag);
 	close(fd);
 	return ret;
 }
-- 
1.7.0.81.g58679

^ permalink raw reply related [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 5:36 ` Junio C Hamano @ 2010-02-18 7:27 ` Wincent Colaiuta 2010-02-18 16:18 ` Zygo Blaxell 2010-02-22 12:59 ` Paolo Bonzini 1 sibling, 1 reply; 84+ messages in thread From: Wincent Colaiuta @ 2010-02-18 7:27 UTC (permalink / raw) To: Junio C Hamano Cc: Nicolas Pitre, Jeff King, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On 18/02/2010, at 06:36, Junio C Hamano wrote: > Nicolas Pitre <nico@fluxnic.net> writes: > >> It is likely to have better performance if the buffer is small enough to >> fit in the CPU L1 cache. There are two sequential passes over the >> buffer: one for the SHA1 computation, and another for the compression, >> and currently they're sure to thrash the L1 cache on each pass. > > I did a very unscientific test to hash about 14k paths (arch/ and fs/ from > the kernel source) using "git-hash-object -w --stdin-paths" into an empty > repository with varying sizes of paranoia buffer (quarter, 1, 4, 8 and > 256kB) and saw 8-30% overhead. 256kB did hurt and around 4kB seemed to be > optimal for this small sample load. > > In any case, with any size of paranoia, this hurts the sane use case, so > I'd introduce an expert switch to disable it, like this. Shouldn't a switch that hurts performance and is only needed for insane use cases default to off rather than on? Cheers, Wincent ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 7:27 ` Wincent Colaiuta @ 2010-02-18 16:18 ` Zygo Blaxell 2010-02-18 18:12 ` Jonathan Nieder 0 siblings, 1 reply; 84+ messages in thread From: Zygo Blaxell @ 2010-02-18 16:18 UTC (permalink / raw) To: Wincent Colaiuta Cc: Junio C Hamano, Nicolas Pitre, Jeff King, Dmitry Potapov, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Thu, Feb 18, 2010 at 08:27:28AM +0100, Wincent Colaiuta wrote: > Shouldn't a switch that hurts performance and is only needed for insane use cases default to off rather than on? While I don't disagree that default off might(*) be a good idea, I do object to the categorization of this use case as 'insane'. Neither the documentation for 'git add' nor its various aliases (e.g. 'git commit' with paths or -a, etc) mentions that any use of 'git add' might cause repository corruption under any circumstances. Contrast with examples of repository-corrupting pitfalls in the man pages of tools such as 'git clone' and 'git gc'. In fact, the language in the git add man page seems to suggest the opposite, using words like "snapshot" and pointing out several times that the index is intentionally immune to changes interleaved between 'git add' and 'git commit' commands. Common sense (for Unix users) is that the index is not immune to changes *during* git add, but nowhere in my wildest nightmares would I conceive that changes in file contents during git add would *corrupt the repository* and git would *fail to notice or give useful diagnostics* until *days or weeks later* after the corruption has already *spread to multiple repositories* through *git push with default options*. Now, if you want to put that text in the man pages of 'git add' and friends, and point out the paranoia switch there, I have nothing to object to. I also see nothing prohibiting concurrent file modification in some reasonable revision-control use cases. 
What happens if I do a 'git commit -a' on, say, proprietary EDA tool data files or Microsoft Office documents, and those tools choose an unfortunate moment to automatically update files under revision control? Granted, I can't really expect the repo to contain usable data, but what I do expect is another commit, or a rebased/amended commit, that fixes the mangled file's contents--not to be required to rebase on the commit's parent everything that comes after it, then purge my reflogs so 'git gc' will work again. Working directories on network filesystems might do all kinds of strange things, most of which aren't intentional. It's one thing to commit a useless tree, and quite another to unintentionally commit an irretrievable one. (*) "might" be a good idea because there's been some evidence to suggest that a paranoid implementation of git add might perform better than the mmap-based one in all cases, if more work was done than anyone seems willing to do. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 16:18 ` Zygo Blaxell @ 2010-02-18 18:12 ` Jonathan Nieder 2010-02-18 18:35 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Jonathan Nieder @ 2010-02-18 18:12 UTC (permalink / raw) To: Zygo Blaxell Cc: Wincent Colaiuta, Junio C Hamano, Nicolas Pitre, Jeff King, Dmitry Potapov, Ilari Liusvaara, Thomas Rast, git Zygo Blaxell wrote: > On Thu, Feb 18, 2010 at 08:27:28AM +0100, Wincent Colaiuta wrote: >> Shouldn't a switch that hurts performance and is only needed for >> insane use cases default to off rather than on? > > While I don't disagree that default off might(*) be a good idea, > I do object to the categorization of this use case as 'insane'. FWIW I think default off would not be a good idea. This talk of insane uses started from the idea that git is not so great for taking automatic snapshots, but as you pointed out, other situations can trigger this and the failure mode is pretty bad. > (*) "might" be a good idea because there's been some evidence to suggest > that a paranoid implementation of git add might perform better than the > mmap-based one in all cases, if more work was done than anyone seems > willing to do. What you are saying here seems a bit handwavy. If you have some concrete ideas about what this paranoid implementation should look like, I encourage you to write a rough patch. The two patches so far have indicated the relevant parts of sha1_file.c (index_fd at the beginning and write_sha1_file at the end of the pipeline, respectively). Special cases include: - The blob being added to the repository is a special file (e.g., pipe) specified on the 'git hash-object' command line: I think it’s fine if this is slow, but it should keep working. - The blob was generated in memory (e.g. 'git apply --cached'). - autocrlf conversion is on. 
This means scanning through the file to collect statistics on the dominant line ending, then scanning through again to convert the file. - some other filter is on. This means sending the file as input to a command, then slurping it up somewhere until its length has been determined for the beginning of the blob header - The blob being added to the repository is already in the repository, so it would be a waste of time to compress and write it again. Some of these already don’t have great performance for large files (autocrlf and filters), and I suspect there is room for improvement for many of them. Jonathan ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 18:12 ` Jonathan Nieder @ 2010-02-18 18:35 ` Junio C Hamano 0 siblings, 0 replies; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 18:35 UTC (permalink / raw) To: Jonathan Nieder Cc: Zygo Blaxell, Wincent Colaiuta, Junio C Hamano, Nicolas Pitre, Jeff King, Dmitry Potapov, Ilari Liusvaara, Thomas Rast, git Jonathan Nieder <jrnieder@gmail.com> writes: > Zygo Blaxell wrote: >> On Thu, Feb 18, 2010 at 08:27:28AM +0100, Wincent Colaiuta wrote: >>> Shouldn't a switch that hurts performance and is only needed for >>> insane use cases default to off rather than on? >> >> While I don't disagree that default off might(*) be a good idea, >> I do object to the categorization of this use case as 'insane'. > > FWIW I think default off would not be a good idea. This talk of > insane uses started from the idea that git is not so great for taking > automatic snapshots,... But git is not so great for taking automatic snapshots, and that is a fact. You shouldn't be expecting such a thing, but more importantly, we shouldn't be dishonest about it either. git fanboys who spread "you can use it to snapshot automatically" without thinking are actively doing a disservice to the users by making them even more confused. If we make this "safety" an opt-in feature, it would give people an excuse to claim that git _by default_ stores a corrupt object, and when they make such a claim, they may not reveal that it happens only when they abuse git in a way it was not designed to be used to begin with. And it may not be because they are malicious, but merely because they are uninformed. The approach to use paranoia by default is to regard "safety" as not about protecting the users from such an abuse of their own, but primarily as a way to protect us from potential FUD. What Wincent suggested would work very well if there were only honest and informed people around in the world. 
People who use git as intended would not have to do anything special. People who abuse git for their special use case would be very aware of the fact that they are abusing git, and more importantly, would also be honest about it. They would not complain that "git will store corrupt objects by default", and just flip the option that is designed to support their use case, and they will get their desired result. Everybody is happy. But such a happy ending would happen only in an ideal world, in which sadly we do not live. It is not 2005 anymore, and the risk of FUD arising from uninformed abuses is very real. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 5:36 ` Junio C Hamano 2010-02-18 7:27 ` Wincent Colaiuta @ 2010-02-22 12:59 ` Paolo Bonzini 2010-02-22 13:33 ` Dmitry Potapov 1 sibling, 1 reply; 84+ messages in thread From: Paolo Bonzini @ 2010-02-22 12:59 UTC (permalink / raw) To: Junio C Hamano Cc: Nicolas Pitre, Jeff King, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On 02/18/2010 06:36 AM, Junio C Hamano wrote: > Nicolas Pitre <nico@fluxnic.net> writes: > >> It is likely to have better performance if the buffer is small enough to >> fit in the CPU L1 cache. There are two sequential passes over the >> buffer: one for the SHA1 computation, and another for the compression, >> and currently they're sure to thrash the L1 cache on each pass. > > I did a very unscientific test to hash about 14k paths (arch/ and fs/ from > the kernel source) using "git-hash-object -w --stdin-paths" into an empty > repository with varying sizes of paranoia buffer (quarter, 1, 4, 8 and > 256kB) and saw 8-30% overhead. 256kB did hurt and around 4kB seemed to be > optimal for this small sample load. > > In any case, with any size of paranoia, this hurts the sane use case Because by mmaping + memcpying you are getting the worst of both cases: you get a page fault per page like with mmap, and touch memory twice like with read. Paolo ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 12:59 ` Paolo Bonzini @ 2010-02-22 13:33 ` Dmitry Potapov 0 siblings, 0 replies; 84+ messages in thread From: Dmitry Potapov @ 2010-02-22 13:33 UTC (permalink / raw) To: Paolo Bonzini Cc: Junio C Hamano, Nicolas Pitre, Jeff King, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, Feb 22, 2010 at 01:59:50PM +0100, Paolo Bonzini wrote: > On 02/18/2010 06:36 AM, Junio C Hamano wrote: > > > >In any case, with any size of paranoia, this hurts the sane use case > > Because by mmaping + memcpying you are getting the worst of both > cases: you get a page fault per page like with mmap, and touch > memory twice like with read. and there is also an extra round of SHA-1 calculation, which I believe is more expensive than memcpy(). Dmitry ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:16 ` [PATCH] Teach "git add" and friends to be paranoid Junio C Hamano 2010-02-18 1:20 ` Junio C Hamano 2010-02-18 1:38 ` Jeff King @ 2010-02-18 10:14 ` Thomas Rast 2010-02-18 18:16 ` Junio C Hamano 2010-02-19 8:28 ` Dmitry Potapov 3 siblings, 1 reply; 84+ messages in thread From: Thomas Rast @ 2010-02-18 10:14 UTC (permalink / raw) To: Junio C Hamano Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git On Thursday 18 February 2010 02:16:23 Junio C Hamano wrote: > When creating a loose object, we normally mmap(2) the entire file, and > hash and then compress to write it out in two separate steps for > efficiency. > > This is perfectly good for the intended use of git---nobody is supposed to > be insane enough to expect that it won't break anything to muck with the > contents of a file after telling git to index it and before getting the > control back from git. This makes it sound as if the user is to blame, but IMHO we're just not checking the input well enough. The user should never be able to corrupt the repository (without git noticing!) just by running a git command and manipulating the worktree in parallel. The file data at any given time is just user input, and you also cannot (I hope; otherwise let's fix it!) corrupt the repository merely by typoing some command arguments. (Mucking around in .git is an entirely different matter, but that is off limits.) > This teaches the index_mem() codepath to be paranoid and hash and compress > the data after reading it in core. The contents hashed may not match the > contents of the file in an insane use case, but at least this way the > result will be internally consistent. Doesn't that trigger on windows, where xmmap() already makes a copy? -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 10:14 ` Thomas Rast @ 2010-02-18 18:16 ` Junio C Hamano 2010-02-18 19:58 ` Nicolas Pitre 0 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 18:16 UTC (permalink / raw) To: Thomas Rast Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git Thomas Rast <trast@student.ethz.ch> writes: > This makes it sound as if the user is to blame, but IMHO we're just > not checking the input well enough. Honesty is very good. An alternative implementation that does not hurt performance as much as the "paranoia" would, and checks "the input well enough" would be very welcome. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 18:16 ` Junio C Hamano @ 2010-02-18 19:58 ` Nicolas Pitre 2010-02-18 20:11 ` 16 gig, 350,000 file repository Bill Lear ` (2 more replies) 0 siblings, 3 replies; 84+ messages in thread From: Nicolas Pitre @ 2010-02-18 19:58 UTC (permalink / raw) To: Junio C Hamano Cc: Thomas Rast, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git On Thu, 18 Feb 2010, Junio C Hamano wrote: > Thomas Rast <trast@student.ethz.ch> writes: > > > This makes it sound as if the user is to blame, but IMHO we're just > > not checking the input well enough. > > Honesty is very good. An alternative implementation that does not hurt > performance as much as the "paranoia" would, and checks "the input well > enough" would be very welcome. Can't we rely on the mtime of the source file? Sample it before starting hashing it, then make sure it didn't change when done. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* 16 gig, 350,000 file repository 2010-02-18 19:58 ` Nicolas Pitre @ 2010-02-18 20:11 ` Bill Lear 2010-02-18 20:58 ` Nicolas Pitre 2010-02-18 20:14 ` [PATCH] Teach "git add" and friends to be paranoid Peter Harris 2010-02-18 20:17 ` Junio C Hamano 2 siblings, 1 reply; 84+ messages in thread From: Bill Lear @ 2010-02-18 20:11 UTC (permalink / raw) To: git I'm starting a new, large project and would like a quick bit of advice. Bringing in a set of test cases and other files from a ClearCase repository resulted in a 350,000 file git repo of about 16 gigabytes. The time to clone over a fast network was about 250 minutes. I could not verify if the repo had been packed properly, etc. However, we are thinking of using submodules or subtrees, to allow a person to selectively clone only a part of the repo they need for their work. Is there a way to do this without submodules/subtrees? We also need to be able to branch the entire repo, which I think would make submodules kind of a pain, but don't know... What is the current thinking on these issues in the git community? Bill ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: 16 gig, 350,000 file repository 2010-02-18 20:11 ` 16 gig, 350,000 file repository Bill Lear @ 2010-02-18 20:58 ` Nicolas Pitre 2010-02-19 9:27 ` Erik Faye-Lund 2010-02-22 22:20 ` Bill Lear 0 siblings, 2 replies; 84+ messages in thread From: Nicolas Pitre @ 2010-02-18 20:58 UTC (permalink / raw) To: Bill Lear; +Cc: git On Thu, 18 Feb 2010, Bill Lear wrote: > I'm starting a new, large project and would like a quick bit of advice. > > Bringing in a set of test cases and other files from a ClearCase > repository resulted in a 350,000 file git repo of about 16 gigabytes. > > The time to clone over a fast network was about 250 minutes. I could > not verify if the repo had been packed properly, etc. I'd start from there. If you didn't do a 'git gc --aggressive' after the import then it is quite likely that your repo isn't well packed. Of course you'll need a big machine to repack this. But that should be needed only once. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: 16 gig, 350,000 file repository 2010-02-18 20:58 ` Nicolas Pitre @ 2010-02-19 9:27 ` Erik Faye-Lund 2010-02-22 22:20 ` Bill Lear 1 sibling, 0 replies; 84+ messages in thread From: Erik Faye-Lund @ 2010-02-19 9:27 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Bill Lear, git On Thu, Feb 18, 2010 at 9:58 PM, Nicolas Pitre <nico@fluxnic.net> wrote: > On Thu, 18 Feb 2010, Bill Lear wrote: > >> I'm starting a new, large project and would like a quick bit of advice. >> >> Bringing in a set of test cases and other files from a ClearCase >> repository resulted in a 350,000 file git repo of about 16 gigabytes. >> >> The time to clone over a fast network was about 250 minutes. I could >> not verify if the repo had been packed properly, etc. > > I'd start from there. If you didn't do a 'git gc --aggressive' after > the import then it is quite likely that your repo isn't well packed. > > Of course you'll need a big machine to repack this. Something like this? http://www.gadgetopia.com/images/big_machine.jpg -- Erik "kusma" Faye-Lund ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: 16 gig, 350,000 file repository 2010-02-18 20:58 ` Nicolas Pitre 2010-02-19 9:27 ` Erik Faye-Lund @ 2010-02-22 22:20 ` Bill Lear 2010-02-22 22:31 ` Nicolas Pitre 1 sibling, 1 reply; 84+ messages in thread From: Bill Lear @ 2010-02-22 22:20 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git On Thursday, February 18, 2010 at 15:58:42 (-0500) Nicolas Pitre writes: >On Thu, 18 Feb 2010, Bill Lear wrote: > >> I'm starting a new, large project and would like a quick bit of advice. >> >> Bringing in a set of test cases and other files from a ClearCase >> repository resulted in a 350,000 file git repo of about 16 gigabytes. >> >> The time to clone over a fast network was about 250 minutes. I could >> not verify if the repo had been packed properly, etc. > >I'd start from there. If you didn't do a 'git gc --aggressive' after >the import then it is quite likely that your repo isn't well packed. > >Of course you'll need a big machine to repack this. But that should be >needed only once. Ok, well they have a "big machine", but not big enough. It's running out of memory on the gc. I believe they have a fair amount of memory:

% free
             total       used       free     shared    buffers     cached
Mem:      16629680   16051444     578236          0      28332   14385948
-/+ buffers/cache:    1637164   14992516
Swap:      8289500       1704    8287796

and they are using git 1.6.6. Assuming we can figure out how to gc this puppy (is there any way on a machine without 64 gigabytes?), there is still a question that remains: how to organize a project that has a very large number of test cases (and test data) that we might not want to pull across the wire each time. Instead of a shallow clone, some sort of slicing clone operation? We thought of using submodules. That is, code (say) goes in a separate repo 'src' and functional tests go in another, called 'ftests'. Then, we add 'ftests' as a submodule to 'src'. Great. However, we need to be able to branch 'src' and 'ftests' together. 
Example: I am working on a new feature in a branch "GLX-473_incremental_compression". I would like to be able to create the branch in both the 'src' repo and the 'ftests' repo at the same time, make changes, commit, and push to that branch for both. When developers check out the repo, they move to that branch, but do NOT want the cloned ftests. However, the QA team wants both the source and the tests that I have checked in and pushed. Is there an easy way to support this? Bill ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: 16 gig, 350,000 file repository 2010-02-22 22:20 ` Bill Lear @ 2010-02-22 22:31 ` Nicolas Pitre 0 siblings, 0 replies; 84+ messages in thread From: Nicolas Pitre @ 2010-02-22 22:31 UTC (permalink / raw) To: Bill Lear; +Cc: git On Mon, 22 Feb 2010, Bill Lear wrote: > On Thursday, February 18, 2010 at 15:58:42 (-0500) Nicolas Pitre writes: > >On Thu, 18 Feb 2010, Bill Lear wrote: > > > >> I'm starting a new, large project and would like a quick bit of advice. > >> > >> Bringing in a set of test cases and other files from a ClearCase > >> repository resulted in a 350,000 file git repo of about 16 gigabytes. > >> > >> The time to clone over a fast network was about 250 minutes. I could > >> not verify if the repo had been packed properly, etc. > > > >I'd start from there. If you didn't do a 'git gc --aggressive' after > >the import then it is quite likely that your repo isn't well packed. > > > >Of course you'll need a big machine to repack this. But that should be > >needed only once. > > Ok, well they have a "big machine", but not big enough. It's running > out of memory on the gc. I believe they have a fair amount of memory: > > % free > total used free shared buffers cached > Mem: 16629680 16051444 578236 0 28332 14385948 > -/+ buffers/cache: 1637164 14992516 > Swap: 8289500 1704 8287796 > > and they are using git 1.6.6. Hmmm. OK. You might try: git repack -a -f -d --depth=200 --window=100 --window-memory=1g Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 19:58 ` Nicolas Pitre 2010-02-18 20:11 ` 16 gig, 350,000 file repository Bill Lear @ 2010-02-18 20:14 ` Peter Harris 2010-02-18 20:17 ` Junio C Hamano 2 siblings, 0 replies; 84+ messages in thread From: Peter Harris @ 2010-02-18 20:14 UTC (permalink / raw) To: Nicolas Pitre Cc: Junio C Hamano, Thomas Rast, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git On Thu, Feb 18, 2010 at 2:58 PM, Nicolas Pitre wrote: > > Can't we rely on the mtime of the source file? Sample it before > starting hashing it, then make sure it didn't change when done. Some filesystems (eg FAT) have an abysmal mtime granularity (2 seconds). The same race exists on other filesystems, albeit with a much narrower window of opportunity. Peter Harris ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 19:58 ` Nicolas Pitre 2010-02-18 20:11 ` 16 gig, 350,000 file repository Bill Lear 2010-02-18 20:14 ` [PATCH] Teach "git add" and friends to be paranoid Peter Harris @ 2010-02-18 20:17 ` Junio C Hamano 2010-02-18 21:30 ` Nicolas Pitre 2 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-18 20:17 UTC (permalink / raw) To: Nicolas Pitre Cc: Thomas Rast, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git Nicolas Pitre <nico@fluxnic.net> writes: >> Honesty is very good. An alternative implementation that does not hurt >> performance as much as the "paranoia" would, and checks "the input well >> enough" would be very welcome. > > Can't we rely on the mtime of the source file? Sample it before > starting hashing it, then make sure it didn't change when done. I suspect that opening to mmap(2), hashing once to compute the object name, and deflating it to write it out, will all happen within the same second, unless you are talking about a really huge file, or you started at very near a second boundary. I am perfectly Ok that it will have false negatives that way, but if the probability of it catching problems is too low because git is too fast relative to the file timestamp granularity, then it doesn't sound very useful in practice, unless of course you are on better filesystems. It won't have false positives and that is a very good thing, though. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 20:17 ` Junio C Hamano @ 2010-02-18 21:30 ` Nicolas Pitre 2010-02-19 1:04 ` Jonathan Nieder 0 siblings, 1 reply; 84+ messages in thread From: Nicolas Pitre @ 2010-02-18 21:30 UTC (permalink / raw) To: Junio C Hamano Cc: Thomas Rast, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Jonathan Nieder, git On Thu, 18 Feb 2010, Junio C Hamano wrote: > Nicolas Pitre <nico@fluxnic.net> writes: > > >> Honesty is very good. An alternative implementation that does not hurt > >> performance as much as the "paranoia" would, and checks "the input well > >> enough" would be very welcome. > > > > Can't we rely on the mtime of the source file? Sample it before > > starting hashing it, then make sure it didn't change when done. > > I suspect that opening to mmap(2), hashing once to compute the object > name, and deflating it to write it out, will all happen within the same > second, unless you are talking about a really huge file, or you started at > very near a second boundary. How is the index dealing with this? Surely if a file is added to the index and modified within the same second then 'git status' will fail to notice the changes. I'm not familiar enough with that part of Git. Alternatively, you could use the initial mtime sample to determine the filesystem's time granularity by noticing how many LSBs are zero. Let's say FAT should have a granularity of one second. Then if the mtime of the file is less than one second away before starting to hash then just wait for one second. If one second later the mtime has changed and still less than a second away then abort. If after the hash the mtime has changed then abort. On a recent filesystem, it is likely that the mtime granularity is a nanosecond. Nevertheless the above algorithm should just work all the same, although it is unlikely that the mtime will be within the current nanosecond, hence the probability for having to do an initial wait is almost zero. 
On kernels without hires timers the granularity will be like 10 ms. Of course you might be unlucky and the initial mtime sample happens to be right on a whole second even on a high resolution mtime filesystem, in which case the delay test will consider one second instead of 10 ms or whatever. But the probability is rather small that you'll end up with all the sub-second bits being zero, causing a longer delay than actually necessary, and this would matter only for files that would have been modified within that second. I don't think there is a reliable way to query a filesystem+OS timestamping granularity. Nicolas ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 21:30 ` Nicolas Pitre @ 2010-02-19 1:04 ` Jonathan Nieder 2010-02-19 15:26 ` Zygo Blaxell 0 siblings, 1 reply; 84+ messages in thread From: Jonathan Nieder @ 2010-02-19 1:04 UTC (permalink / raw) To: Nicolas Pitre Cc: Junio C Hamano, Thomas Rast, Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, git Nicolas Pitre wrote: > On Thu, 18 Feb 2010, Junio C Hamano wrote: >> I suspect that opening to mmap(2), hashing once to compute the object >> name, and deflating it to write it out, will all happen within the same >> second, unless you are talking about a really huge file, or you started at >> very near a second boundary. > > How is the index dealing with this? Surely if a file is added to the > index and modified within the same second then 'git status' will fail to > notice the changes. I'm not familiar enough with that part of Git. See Documentation/technical/racy-git.txt and t/t0010-racy-git.sh. Short version: in the awful case, the timestamp of the index is the same as (or before) the timestamp of the file. Git will notice this and re-hash the tracked file. > Alternatively, you could use the initial mtime sample to determine the > filesystem's time granularity by noticing how many LSBs are zero. Yuck. If such detection is going to happen, I would prefer to see it used once to determine the initial value of a per-repository configuration variable asking to speed up ‘git add’ and friends. Note that we are currently not using the nsec timestamps to make any important decisions, probably because in some filesystems they are unreliable when inode cache entries are evicted (not sure about the current status; does this work in NFS, for example?). Within the short runtime of ‘git add’, I guess this would not be as much of a problem. Jonathan ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-19 1:04 ` Jonathan Nieder @ 2010-02-19 15:26 ` Zygo Blaxell 2010-02-19 17:52 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Zygo Blaxell @ 2010-02-19 15:26 UTC (permalink / raw) To: Jonathan Nieder Cc: Nicolas Pitre, Junio C Hamano, Thomas Rast, Dmitry Potapov, Ilari Liusvaara, git On Thu, Feb 18, 2010 at 07:04:56PM -0600, Jonathan Nieder wrote: > Nicolas Pitre wrote: > > On Thu, 18 Feb 2010, Junio C Hamano wrote: > >> I suspect that opening to mmap(2), hashing once to compute the object > >> name, and deflating it to write it out, will all happen within the same > >> second, unless you are talking about a really huge file, or you started at > >> very near a second boundary. > > > > How is the index dealing with this? Surely if a file is added to the > > index and modified within the same second then 'git status' will fail to > > notice the changes. I'm not familiar enough with that part of Git. > > See Documentation/technical/racy-git.txt and t/t0010-racy-git.sh. > > Short version: in the awful case, the timestamp of the index is the > same as (or before) the timestamp of the file. Git will notice this > and re-hash the tracked file. As far as I can tell, the index doesn't handle this case at all. Suppose the file is modified during git add near the beginning of the file, after git add has read that part of the file, but the modifications finish before git add does. Now the mtime of the file is earlier than the index timestamp, but the file contents don't match the index. This holds even if the objects git adds to the index aren't corrupted. Actually right now you can have all four combinations: index up to date or not, and object matching its sha1 hash or not, depending on where and when you modify data during an index update. racy-git.txt doesn't discuss concurrent modification of files with the index. 
It only discusses low-resolution file timestamps and modifications at times that are close to, but not concurrent with, index modifications. Git probably also doesn't handle things like NTP time corrections (especially those where time moves backward by sub-second intervals) and mismatched server/client clocks on remote filesystems either (mind you, I know of no SCM that currently handles that case, and CVS in particular is unusually bad at it). Personally, I find the combination of nanosecond-precision timestamps and network file systems amusing. At nanosecond precision, relativistic effects start to matter across a volume of space the size of my laptop. I'm not sure how timestamps at any resolution could be a reliable metric for detecting changes to file contents in the general case. A valuable hint in many cases, but not authoritative (unless they all come from a single monotonic high-resolution clock guaranteed to increment faster than git--but they don't). rsync solves this sort of problem with a 'modification window' parameter, which is a time interval that is "close enough" to consider two timestamps to be equal. Some of rsync's use cases set that window to six months. Git would use a modification window for the opposite reason rsync does--rsync uses the window to avoid unnecessarily examining files that have different timestamps, while git would use it to re-examine files even when it appears to be unnecessary. Git probably wants the modification window to be the maximum clock offset between a network filesystem client and server plus the minimum representable interval in the filesystem's timestamp data type--which is a value git couldn't possibly know for some cases, so it needs input from the user. ^ permalink raw reply [flat|nested] 84+ messages in thread
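The "modification window" idea amounts to a tolerance on timestamp comparison; the window value would have to come from user configuration, as argued above. A minimal sketch, with a hypothetical helper that is not part of git or rsync:

```c
#include <assert.h>

/*
 * Sketch of an rsync-style "modification window": two timestamps are
 * considered equal when they differ by no more than a configurable
 * tolerance.  rsync uses this to *skip* work; git, as proposed above,
 * would use it to *re-examine* files whose mtimes are merely close to
 * the index timestamp.  The window would be the maximum clock skew
 * plus the filesystem's timestamp granularity, supplied by the user.
 *
 * mtime_close_enough() is a hypothetical helper for illustration.
 */
static int mtime_close_enough(long long a, long long b, long long window)
{
	long long diff = a > b ? a - b : b - a;
	return diff <= window;
}
```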
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-19 15:26 ` Zygo Blaxell @ 2010-02-19 17:52 ` Junio C Hamano 2010-02-19 19:08 ` Zygo Blaxell 0 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-19 17:52 UTC (permalink / raw) To: Zygo Blaxell Cc: Jonathan Nieder, Nicolas Pitre, Thomas Rast, Dmitry Potapov, Ilari Liusvaara, git

Zygo Blaxell <zblaxell@gibbs.hungrycats.org> writes:

> As far as I can tell, the index doesn't handle this case at all.
> ...
> racy-git.txt doesn't discuss concurrent modification of files with the
> index. It only discusses low-resolution file timestamps and modifications
> at times that are close to, but not concurrent with, index modifications.

Correct. As I said a few times in this thread, a use case with concurrent modifications is outside the original design scope of git.

As you may have realized, the racy-git solution actually _relies_ on the absence of concurrent modifications. The document does not even _talk_ about this assumption, exactly because back then it was common knowledge shared by everybody that users are not supposed to muck with files in the work tree until they get control back from git; if they do so and end up with a broken loose object, they get to keep both halves ;-).

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-19 17:52 ` Junio C Hamano @ 2010-02-19 19:08 ` Zygo Blaxell 0 siblings, 0 replies; 84+ messages in thread From: Zygo Blaxell @ 2010-02-19 19:08 UTC (permalink / raw) To: Junio C Hamano Cc: Jonathan Nieder, Nicolas Pitre, Thomas Rast, Dmitry Potapov, Ilari Liusvaara, git

On Fri, Feb 19, 2010 at 09:52:10AM -0800, Junio C Hamano wrote:
> As you may have realized, racy-git solution actually _relies_ on lack of
> concurrent modifications.

It relies on 1) no concurrent modifications, 2) strictly increasing timestamps, and 3) consistent timestamps between the working directory and index. Believe it or not, out of those three I actually think the first assumption is the most reasonable, because it's something a user could prevent without using administrative privileges or changing filesystems.

For performance reasons I frequently work with GIT_DIR on ext3 and the working directory on tmpfs (though a mix of ext3 and ext4 has the same issues). One has second resolution, the other nanosecond. If two event timestamps with different precisions are compared as values at maximum precision, the later event might appear to occur before the earlier one.

I try to avoid using anything that relies on mtime on network filesystems, because a lot more than just Git breaks in those cases.

NTP breakage is rarer, mostly because NTP step events are rare, and negative ones even rarer. I've seen negative steps on laptops when they lose time accuracy during suspend or when Internet access is unavailable (or asymmetrically laggy), then get it back later. On the other hand, I don't actually test for the effects of this event anywhere, so I can't say it hasn't caused breakage I don't know about, although I can say it hasn't caused breakage I do know about either.

^ permalink raw reply [flat|nested] 84+ messages in thread
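The mixed-granularity hazard described above (second-resolution ext3 timestamps compared against nanosecond-resolution tmpfs ones) can be sketched by truncating both values to the coarser granularity before comparing. Both helpers below are hypothetical illustrations, not git code:

```c
#include <assert.h>

/*
 * Sketch of the mixed-granularity problem: comparing a 1-second-
 * granularity timestamp (ext3) against a nanosecond one (tmpfs) at
 * full precision can make the later event appear older, or an event
 * appear strictly newer when the coarser clock cannot actually
 * distinguish the two.  Truncating both values to the coarser
 * granularity before comparing avoids the spurious ordering.
 */
static long long trunc_ts(long long t_ns, long long gran_ns)
{
	return t_ns - t_ns % gran_ns;
}

static int strictly_older(long long a_ns, long long b_ns, long long gran_ns)
{
	return trunc_ts(a_ns, gran_ns) < trunc_ts(b_ns, gran_ns);
}
```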
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-18 1:16 ` [PATCH] Teach "git add" and friends to be paranoid Junio C Hamano ` (2 preceding siblings ...) 2010-02-18 10:14 ` Thomas Rast @ 2010-02-19 8:28 ` Dmitry Potapov 2010-02-19 17:52 ` Junio C Hamano 3 siblings, 1 reply; 84+ messages in thread From: Dmitry Potapov @ 2010-02-19 8:28 UTC (permalink / raw) To: Junio C Hamano Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Hi Junio,

I am sorry I have not had time to reply earlier. I think it is possible to avoid the overhead of being on the safe side in a few common cases. Here is a patch. I have not had time to test it, but the changes appear to be trivial.

-- >8 --
From 3e53610a41c4aad458dff13135a73bb4944f456b Mon Sep 17 00:00:00 2001
From: Dmitry Potapov <dpotapov@gmail.com>
Date: Fri, 19 Feb 2010 11:00:51 +0300
Subject: [PATCH] speed up "git add" by avoiding the paranoid mode

While the paranoid mode protects the git repository from corruption when
the added file is changed while "git add" is running, it has some
overhead. However, in a few common cases, it is possible to avoid this
mode and still be on the safe side:

1. If mmap() is implemented as reading the whole file into memory.

2. If the whole file was read into memory as a result of applying some
   filter.

3. If the added file is small, it is faster to use read() than mmap().

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
 sha1_file.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index d8a7722..4efeb21 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2469,6 +2469,7 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 				   write_object ? safe_crlf : 0)) {
 			buf = strbuf_detach(&nbuf, &size);
 			re_allocated = 1;
+			paranoid = 0;
 		}
 	}
 
@@ -2490,7 +2491,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	size_t size = xsize_t(st->st_size);
 
 	flag = write_object ? INDEX_MEM_WRITE_OBJECT : 0;
-	if (!S_ISREG(st->st_mode)) {
+	if (!S_ISREG(st->st_mode) || size < 262144) {
 		struct strbuf sbuf = STRBUF_INIT;
 		if (strbuf_read(&sbuf, fd, 4096) >= 0)
 			ret = index_mem(sha1, sbuf.buf, sbuf.len,
@@ -2500,7 +2501,9 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 		strbuf_release(&sbuf);
 	} else if (size) {
 		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+#ifndef NO_MMAP
 		flag |= INDEX_MEM_PARANOID;
+#endif
 		ret = index_mem(sha1, buf, size, type, path, flag);
 		munmap(buf, size);
 	} else
-- 
1.7.0
-- >8 --

^ permalink raw reply related [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-19 8:28 ` Dmitry Potapov @ 2010-02-19 17:52 ` Junio C Hamano 2010-02-20 19:23 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-19 17:52 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Dmitry Potapov <dpotapov@gmail.com> writes:

> ... I think it is possible
> to avoid the overhead of being on the safe side in a few common cases.
> Here is a patch. I have not had time to test it, but the changes appear
> to be trivial.

Yeah, these are obvious "paranoia not needed" cases.

How much "speed-up" are we talking about, though? Can we quantify it? Personally, I think it is not even worth quantifying; I would simply say "avoid unnecessary computation" without claiming a "speed up", though ;-)

Thanks.

> -- >8 --
> From 3e53610a41c4aad458dff13135a73bb4944f456b Mon Sep 17 00:00:00 2001
> From: Dmitry Potapov <dpotapov@gmail.com>
> Date: Fri, 19 Feb 2010 11:00:51 +0300
> Subject: [PATCH] speed up "git add" by avoiding the paranoid mode
>
> While the paranoid mode protects the git repository from corruption when
> the added file is changed while "git add" is running, it has some
> overhead. However, in a few common cases, it is possible to avoid this
> mode and still be on the safe side:
>
> 1. If mmap() is implemented as reading the whole file into memory.
>
> 2. If the whole file was read into memory as a result of applying some
>    filter.
>
> 3. If the added file is small, it is faster to use read() than mmap().
>
> Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
> ---
>  sha1_file.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index d8a7722..4efeb21 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2469,6 +2469,7 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
>  				   write_object ? safe_crlf : 0)) {
>  			buf = strbuf_detach(&nbuf, &size);
>  			re_allocated = 1;
> +			paranoid = 0;
>  		}
>  	}
>
> @@ -2490,7 +2491,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
>  	size_t size = xsize_t(st->st_size);
>
>  	flag = write_object ? INDEX_MEM_WRITE_OBJECT : 0;
> -	if (!S_ISREG(st->st_mode)) {
> +	if (!S_ISREG(st->st_mode) || size < 262144) {
>  		struct strbuf sbuf = STRBUF_INIT;
>  		if (strbuf_read(&sbuf, fd, 4096) >= 0)
>  			ret = index_mem(sha1, sbuf.buf, sbuf.len,
> @@ -2500,7 +2501,9 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
>  		strbuf_release(&sbuf);
>  	} else if (size) {
>  		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
> +#ifndef NO_MMAP
>  		flag |= INDEX_MEM_PARANOID;
> +#endif
>  		ret = index_mem(sha1, buf, size, type, path, flag);
>  		munmap(buf, size);
>  	} else
> --
> 1.7.0
>
> -- >8 --

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-19 17:52 ` Junio C Hamano @ 2010-02-20 19:23 ` Junio C Hamano 2010-02-21 7:21 ` Dmitry Potapov 0 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-20 19:23 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Junio C Hamano <gitster@pobox.com> writes:

> Dmitry Potapov <dpotapov@gmail.com> writes:
>
>> ... I think it is possible
>> to avoid the overhead of being on the safe side in a few common cases.
>> Here is a patch. I have not had time to test it, but the changes appear
>> to be trivial.
>
> Yeah, these are obvious "paranoia not needed" cases.

Actually, the "if it is smaller than 256k" part is not quite obvious.

>> @@ -2490,7 +2491,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
>>  	size_t size = xsize_t(st->st_size);
>>
>>  	flag = write_object ? INDEX_MEM_WRITE_OBJECT : 0;
>> -	if (!S_ISREG(st->st_mode)) {
>> +	if (!S_ISREG(st->st_mode) || size < 262144) {
>>  		struct strbuf sbuf = STRBUF_INIT;
>>  		if (strbuf_read(&sbuf, fd, 4096) >= 0)
>>  			ret = index_mem(sha1, sbuf.buf, sbuf.len,

INDEX_MEM_PARANOID is never given to index_mem() in this codepath, so the trade-offs look like this:

 * In non-paranoia mode, your conjecture is that between

   - malloc, read, SHA-1, deflate, and then free; and
   - mmap, SHA-1, deflate, and then munmap

   the former is faster for small files that can fit in core without
   thrashing.

 * In paranoia mode, your conjecture is that between

   - malloc, read, SHA-1, deflate, and then free; and
   - mmap, SHA-1, SHA-1 and deflate in chunks, and then munmap

   the former is faster for small files that can fit in core without
   thrashing.

The "mmap" strategy has a larger cost in paranoia mode than in non-paranoia mode. The "read" strategy, on the other hand, has the same cost in both modes. 
If this "read small files" approach is good for non-paranoia mode, it is obviously also good (better) for paranoia mode. That means this hunk addresses an unrelated issue: "paranoid avoidance" falls out naturally as a side effect of doing this, but it is not the primary effect of this change. Some benchmarking is needed to justify it, I think.

So I'd split this hunk out when queuing.

Thanks.

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-20 19:23 ` Junio C Hamano @ 2010-02-21 7:21 ` Dmitry Potapov 2010-02-21 19:32 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Dmitry Potapov @ 2010-02-21 7:21 UTC (permalink / raw) To: Junio C Hamano Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sat, Feb 20, 2010 at 11:23:11AM -0800, Junio C Hamano wrote:
>
> Some benchmarking is needed to justify it, I think.
>
> So I'd split this hunk out when queuing.

I completely agree with your reasoning that it is a separate issue and that it needs some benchmarking to prove its usefulness. So I have done that today.

For the benchmark, I created a repository of up to 512Mb containing up to 100,000 files (for small files the number of files was the limiting factor; for big files, the total size limit was used). I intentionally limited the total size to 512Mb to make sure the files stay in the FS cache and there are no disk-related effects. The contents of the files were generated randomly from /dev/urandom and then reused in all tests for a given file size.

I made 5 runs using mmap() (with git 1.7.0) and 5 runs using read() (after applying the patch below). Below is the best result of the 5 runs for each size. "Before" marks the original version, which uses mmap(). "After" marks the modified version using read(). The command used to measure time was:

cat list | time git hash-object --stdin-paths >/dev/null

So it is just calculating SHA-1, without deflating and writing the result to disk (which depends significantly on fsync() speed). 
Tested on: Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz

Here are the results:

file size = 1Kb; Hashing 100000 files
Before:
0.63user 0.86system 0:01.49elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+100428minor)pagefaults 0swaps
After:
0.54user 0.53system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+421minor)pagefaults 0swaps

file size = 2Kb; Hashing 100000 files
Before:
1.04user 0.79system 0:01.82elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+100428minor)pagefaults 0swaps
After:
0.95user 0.48system 0:01.43elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+422minor)pagefaults 0swaps

file size = 4Kb; Hashing 100000 files
Before:
1.73user 0.74system 0:02.47elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+100428minor)pagefaults 0swaps
After:
1.54user 0.57system 0:02.11elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+424minor)pagefaults 0swaps

file size = 8Kb; Hashing 64000 files
Before:
1.86user 0.63system 0:02.48elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.75user 0.50system 0:02.23elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+424minor)pagefaults 0swaps

file size = 16Kb; Hashing 32000 files
Before:
1.73user 0.41system 0:02.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.68user 0.32system 0:02.00elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+425minor)pagefaults 0swaps

file size = 32Kb; Hashing 16000 files
Before:
1.65user 0.32system 0:01.96elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.64user 0.24system 0:01.87elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+431minor)pagefaults 0swaps

file size = 64Kb; Hashing 8000 files
Before:
1.71user 0.17system 0:01.87elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.63user 0.18system 0:01.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+438minor)pagefaults 0swaps

file size = 128Kb; Hashing 4000 files
Before:
1.60user 0.20system 0:01.79elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.55user 0.20system 0:01.75elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+454minor)pagefaults 0swaps

file size = 256Kb; Hashing 2000 files
Before:
1.62user 0.15system 0:01.77elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128429minor)pagefaults 0swaps
After:
1.56user 0.16system 0:01.71elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+551minor)pagefaults 0swaps

file size = 512Kb; Hashing 1000 files
Before:
1.59user 0.17system 0:01.76elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128428minor)pagefaults 0swaps
After:
1.56user 0.15system 0:01.71elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+679minor)pagefaults 0swaps

file size = 1024Kb; Hashing 500 files
Before:
1.64user 0.15system 0:01.78elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+128429minor)pagefaults 0swaps
After:
1.61user 0.15system 0:01.76elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+934minor)pagefaults 0swaps

As you can see, in all tests the read() version performed better than mmap(), though the difference decreases as the file size grows. While for 1Kb files the speed-up is 39% (based on the elapsed time), it is a mere 1% for 1Mb files.

Note: I do not use strbuf_read(), because it is suboptimal for this case: we know the size ahead of time. 
(In fact, strbuf_read() is not so good even for an unknown size, as it does a redundant strbuf_grow() in almost every use case, which probably needs to be fixed.)

-- >8 --
From 6b3f8335dece7c9b9f810b1ab08f1bcb090e4d5e Mon Sep 17 00:00:00 2001
From: Dmitry Potapov <dpotapov@gmail.com>
Date: Sun, 21 Feb 2010 09:32:19 +0300
Subject: [PATCH] hash-object: don't use mmap() for small files

Using read() instead of mmap() gives a 39% speed-up for 1Kb files and
a 1% speed-up for 1Mb files. For larger files, it is better to use
mmap(), because the difference between the two is not significant, and
when there is not enough memory, mmap() performs much better, because
it avoids swapping.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
 sha1_file.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 657825e..8a83e56 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2434,6 +2434,8 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 	return ret;
 }
 
+#define SMALL_FILE_SIZE (1024*1024)
+
 int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	     enum object_type type, const char *path)
 {
@@ -2448,6 +2450,14 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 		else
 			ret = -1;
 		strbuf_release(&sbuf);
+	} else if (size <= SMALL_FILE_SIZE) {
+		char *buf = xmalloc(size);
+		if (size == read_in_full(fd, buf, size))
+			ret = index_mem(sha1, buf, size, write_object, type,
+					path);
+		else
+			ret = error("short read %s", strerror(errno));
+		free(buf);
 	} else if (size) {
 		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
 		ret = index_mem(sha1, buf, size, write_object, type, path);
-- 
1.7.0
-- >8 --

Thanks,
Dmitry

^ permalink raw reply related [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-21 7:21 ` Dmitry Potapov @ 2010-02-21 19:32 ` Junio C Hamano 2010-02-22 3:35 ` Dmitry Potapov 0 siblings, 1 reply; 84+ messages in thread From: Junio C Hamano @ 2010-02-21 19:32 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

Dmitry Potapov <dpotapov@gmail.com> writes:

> file size = 1Kb; Hashing 100000 files
> Before:
> 0.63user 0.86system 0:01.49elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+100428minor)pagefaults 0swaps
> After:
> 0.54user 0.53system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+421minor)pagefaults 0swaps
>
> As you can see, in all tests the read() version performed better than
> mmap(), though the difference decreases as the file size grows. While
> for 1Kb files the speed-up is 39% (based on the elapsed time), it is a
> mere 1% for 1Mb files.

Sounds good. Summarizing your numbers:

   1Kb   39.25%
   2Kb   27.27%
   4Kb   17.06%
   8Kb   11.21%
  16Kb    7.00%
  32Kb    4.81%
  64Kb    3.31%
 128Kb    2.29%
 256Kb    3.51%
 512Kb    2.92%
1024Kb    1.14%

32*1024 sounds like a better cut-off to me. After that, doubling the size does not get a comparable gain, and the numbers get unstable (notice the glitch around 256kB).

^ permalink raw reply [flat|nested] 84+ messages in thread
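The percentages in the summary appear to be derived from the elapsed times as before/after - 1, i.e. the gain is expressed relative to the faster read() time. A small sketch (a hypothetical helper, not anything in git) reproduces the first two rows from Dmitry's elapsed times:

```c
#include <assert.h>

/*
 * Speed-up of the "After" (read) run relative to the "Before" (mmap)
 * run, expressed as a percentage of the After time.  E.g. the 1Kb row:
 * 1.49s before vs 1.07s after gives (1.49/1.07 - 1) * 100 = 39.25%.
 */
static double speedup_pct(double before_s, double after_s)
{
	return (before_s / after_s - 1.0) * 100.0;
}
```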
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-21 19:32 ` Junio C Hamano @ 2010-02-22 3:35 ` Dmitry Potapov 2010-02-22 6:59 ` Junio C Hamano 0 siblings, 1 reply; 84+ messages in thread From: Dmitry Potapov @ 2010-02-22 3:35 UTC (permalink / raw) To: Junio C Hamano Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 21, 2010 at 11:32:10AM -0800, Junio C Hamano wrote:
>
> 32*1024 sounds like a better cut-off to me. After that, doubling the size
> does not get a comparable gain, and the numbers get unstable (notice the
> glitch around 256kB).

The reduction in speed-up after 32Kb is most likely due to the L1 cache size, which is 32Kb of data per core on Core 2; the L2 cache is shared among cores and is considerably slower.

I have run my test a few more times, and here are the results:

1 - 39.25%
2 - 30.00%
4 - 17.79%
8 - 11.76%
16 - 7.58%
32 - 5.38%
64 - 3.89%
128 - 2.87%
256 - 2.31%
512 - 2.92%
1024 - 1.14%

and here is one more re-run starting at 32Kb:

32 - 5.38%
64 - 3.89%
128 - 2.29%
256 - 2.91%
512 - 2.92%
1024 - 1.14%

Looking at the speed-up numbers, you might think they are unstable, but in fact the best time in 5 runs does not differ by more than 0.01s between those trials. Because the difference for >=128Kb is 0.05s or less, though, those percentages can be off by as much as 25%. But overall the outcome is clear -- read() is always a winner.

It would be interesting to see what difference Nehalem makes, with its smaller but much faster L2 cache than Core 2's. It may perform better at larger sizes, up to 256Kb.

Anyway, based on the above data, I believe that the proper cut-off should be at least 64Kb, because the additional 32Kb (going from 32Kb to 64Kb) is only about 2.5% of the total memory git consumes anyway, and it gives you a speed-up of around 3.5%...

Dmitry

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 3:35 ` Dmitry Potapov @ 2010-02-22 6:59 ` Junio C Hamano 2010-02-22 12:25 ` Dmitry Potapov 2010-02-22 15:40 ` Nicolas Pitre 0 siblings, 2 replies; 84+ messages in thread From: Junio C Hamano @ 2010-02-22 6:59 UTC (permalink / raw) To: Dmitry Potapov Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git Dmitry Potapov <dpotapov@gmail.com> writes: > If you look at speed-up numbers, you can think that the numbers are > unstable, but in fact, the best time in 5 runs does not differ more > than 0.01s between those trials. But because difference for >=128Kb > is 0.05s or less, the accuracy of the above numbers is less than 25%. Then wouldn't it make the following statement... > But overall the outcome is clear -- read() is always a winner. "... a winner, below 128kB; above that the difference is within noise and measurement error"? > It would be interesting to see what difference Nehalem, which has a > smaller but much faster L2 cache than Core 2. It may perform better > at larger sizes up to 256Kb. Interesting. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 6:59 ` Junio C Hamano @ 2010-02-22 12:25 ` Dmitry Potapov 2010-02-22 15:40 ` Nicolas Pitre 1 sibling, 0 replies; 84+ messages in thread From: Dmitry Potapov @ 2010-02-22 12:25 UTC (permalink / raw) To: Junio C Hamano Cc: Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, Feb 21, 2010 at 10:59:51PM -0800, Junio C Hamano wrote:
>
> "... a winner, below 128kB; above that the difference is within noise and
> measurement error"?

What I was trying to say is that if you see a consistent four- or five-point win (based on many trials), it is clearly a win; but there is a 25% error bar on the gain number.

Dmitry

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 6:59 ` Junio C Hamano 2010-02-22 12:25 ` Dmitry Potapov @ 2010-02-22 15:40 ` Nicolas Pitre 2010-02-22 16:01 ` Dmitry Potapov 2010-02-22 17:31 ` Zygo Blaxell 1 sibling, 2 replies; 84+ messages in thread From: Nicolas Pitre @ 2010-02-22 15:40 UTC (permalink / raw) To: Junio C Hamano Cc: Dmitry Potapov, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Sun, 21 Feb 2010, Junio C Hamano wrote:
> Dmitry Potapov <dpotapov@gmail.com> writes:
>
> > If you look at speed-up numbers, you can think that the numbers are
> > unstable, but in fact, the best time in 5 runs does not differ more
> > than 0.01s between those trials. But because difference for >=128Kb
> > is 0.05s or less, the accuracy of the above numbers is less than 25%.
>
> Then wouldn't it make the following statement...
>
> > But overall the outcome is clear -- read() is always a winner.
>
> "... a winner, below 128kB; above that the difference is within noise and
> measurement error"?

read() is not always a winner. A read() call will always have the data duplicated in memory. Especially with large files, it is more efficient for the system as a whole to mmap() a 50 MB file rather than to allocate an extra 50 MB of anonymous memory that cannot be paged out (except to the swap file, which would be yet another data duplication). With mmap(), when there is memory pressure, the read-only mapped memory is simply dropped with no extra IO.

So when read() is not _significantly_ faster than mmap(), it should not be used.

Nicolas

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 15:40 ` Nicolas Pitre @ 2010-02-22 16:01 ` Dmitry Potapov 2010-02-22 17:31 ` Zygo Blaxell 1 sibling, 0 replies; 84+ messages in thread From: Dmitry Potapov @ 2010-02-22 16:01 UTC (permalink / raw) To: Nicolas Pitre Cc: Junio C Hamano, Zygo Blaxell, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, Feb 22, 2010 at 6:40 PM, Nicolas Pitre <nico@fluxnic.net> wrote: > > read() is not always a winner. A read() call will always have the data > duplicated in memory. Especially with large files, it is more efficient > on the system as a whole to mmap() a 50 MB file rather than allocating Agreed. I meant it was faster on data that I measured, which was <=1Mb. IMHO, read() should be used up to 64Kb. Dmitry ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 15:40 ` Nicolas Pitre @ 2010-02-22 17:31 ` Zygo Blaxell 2010-02-22 18:01 ` Nicolas Pitre 2010-02-22 18:05 ` Dmitry Potapov 1 sibling, 2 replies; 84+ messages in thread From: Zygo Blaxell @ 2010-02-22 17:31 UTC (permalink / raw) To: Nicolas Pitre Cc: Junio C Hamano, Dmitry Potapov, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git

On Mon, Feb 22, 2010 at 10:40:59AM -0500, Nicolas Pitre wrote:
> On Sun, 21 Feb 2010, Junio C Hamano wrote:
> > Dmitry Potapov <dpotapov@gmail.com> writes:
> > > But overall the outcome is clear -- read() is always a winner.
> >
> > "... a winner, below 128kB; above that the difference is within noise and
> > measurement error"?
>
> read() is not always a winner. A read() call will always have the data
> duplicated in memory. Especially with large files, it is more efficient
> on the system as a whole to mmap() a 50 MB file rather than allocating
> an extra 50 MB of anonymous memory that cannot be paged out (except to
> the swap file which would be yet another data duplication). With mmap()
> when there is memory pressure the read-only mapped memory is simply
> dropped with no extra IO.

That holds if you're comparing read() and mmap() of the entire file as a single chunk, instead of in fixed-size chunks at the sweet spot between syscall overhead and CPU cache size.

If you're read()ing a chunk at a time into a fixed-size buffer, and doing sha1 and deflate in chunks, the data should be copied once into CPU cache, processed with both algorithms, and replaced with new data from the next chunk. The data will be copied from the page cache instead of directly mapped, which is a small overhead, but setting up the page map in mmap() is also a small overhead, so you have to use benchmarks to know which of the overheads is smaller. It might be that there's no one answer that applies to all CPU configurations. 
If you're doing mmap() and sha1 and deflate of a 50MB file in two separate passes that are the same size as the file, you load 50MB of data into CPU cache at least twice, you get two sets of associated things like TLB misses, and if the file is very large, you page it from disk twice. So it might make sense to process in chunks regardless of read() vs mmap() fetching the data. If you're malloc()ing 50MB, you're wasting memory and CPU bandwidth making up pages full of zeros before you've even processed the first byte. I don't see how that could ever be faster for large file cases. ^ permalink raw reply [flat|nested] 84+ messages in thread
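The chunked scheme sketched in this message would look roughly like the loop below. A trivial additive checksum stands in for both SHA-1 and deflate, since the point is the loop shape (each chunk visits the CPU cache once and feeds every consumer), not the algorithms; this is a hypothetical illustration, not git code:

```c
#include <assert.h>
#include <stddef.h>

enum { CHUNK_SIZE = 8192 };  /* pick the syscall/cache sweet spot */

/*
 * Process a buffer one chunk at a time, feeding each chunk to every
 * consumer while it is still hot in cache.  In git the consumers
 * would be SHA1_Update() and deflate(); a toy checksum stands in
 * for both here.
 */
static unsigned long process_chunked(const unsigned char *data, size_t len)
{
	unsigned long sum = 0;
	size_t off, i, n;

	for (off = 0; off < len; off += n) {
		n = len - off < CHUNK_SIZE ? len - off : CHUNK_SIZE;
		/* consumer 1: hash the chunk (stand-in for SHA-1) */
		for (i = 0; i < n; i++)
			sum = sum * 31 + data[off + i];
		/* consumer 2 (deflate) would take the same chunk here */
	}
	return sum;
}
```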
* Re: [PATCH] Teach "git add" and friends to be paranoid 2010-02-22 17:31 ` Zygo Blaxell @ 2010-02-22 18:01 ` Nicolas Pitre 2010-02-22 19:56 ` Junio C Hamano 2010-02-22 18:05 ` Dmitry Potapov 1 sibling, 1 reply; 84+ messages in thread From: Nicolas Pitre @ 2010-02-22 18:01 UTC (permalink / raw) To: Zygo Blaxell Cc: Junio C Hamano, Dmitry Potapov, Ilari Liusvaara, Thomas Rast, Jonathan Nieder, git On Mon, 22 Feb 2010, Zygo Blaxell wrote: > On Mon, Feb 22, 2010 at 10:40:59AM -0500, Nicolas Pitre wrote: > > On Sun, 21 Feb 2010, Junio C Hamano wrote: > > > Dmitry Potapov <dpotapov@gmail.com> writes: > > > > But overall the outcome is clear -- read() is always a winner. > > > > > > "... a winner, below 128kB; above that the difference is within noise and > > > measurement error"? > > > > read() is not always a winner. A read() call will always have the data > > duplicated in memory. Especially with large files, it is more efficient > > on the system as a whole to mmap() a 50 MB file rather than allocating > > an extra 50 MB of anonymous memory that cannot be paged out (except to > > the swap file which would be yet another data duplication). With mmap() > > when there is memory pressure the read-only mapped memory is simply > > dropped with no extra IO. > > That holds if you're comparing read() and mmap() of the entire file as a > single chunk, instead of in fixed-size chunks at the sweet spot between > syscall overhead and CPU cache size. Obviously. But we currently don't have the infrastructure to do chunked read of the input data. I think we should do that eventually, by applying the pack windowing code to input files as well. That would make memory usage constant even for huge files, but this is much more complicated to support especially for data fed through stdin. 
> If you're read()ing a chunk at a time into a fixed size buffer, and
> doing sha1 and deflate in chunks, the data should be copied once into CPU
> cache, processed with both algorithms, and replaced with new data from
> the next chunk.  The data will be copied from the page cache instead
> of directly mapped, which is a small overhead, but setting up the page
> map in mmap() is also a small overhead, so you have to use benchmarks to
> know which of the overheads is smaller.  It might be that there's no
> one answer that applies to all CPU configurations.

Normally mmap() has more overhead than read().  However mmap() provides
much nicer properties than read() by simplifying the code a lot, and by
letting the OS manage memory pressure much more gracefully.

> If you're doing mmap() and sha1 and deflate of a 50MB file in two
> separate passes that are the same size as the file, you load 50MB of
> data into CPU cache at least twice, you get two sets of associated
> things like TLB misses, and if the file is very large, you page it from
> disk twice.  So it might make sense to process in chunks regardless
> of read() vs mmap() fetching the data.

We do have to make two separate passes anyway.  The first pass is to
hash the data only, and if that hash already exists in the object store
then we call it done and skip over the deflate process, which is still
the dominant cost.  And that happens quite often.

However, with a really large file, it becomes advantageous to simply do
the hash and deflate in parallel one chunk at a time, and simply discard
the newly created objects if they happen to already exist.  That's the
whole idea behind the newly introduced core.bigFileThreshold config
variable (but the code to honor it in sha1_file.c doesn't exist yet).

> If you're malloc()ing 50MB, you're wasting memory and CPU bandwidth
> making up pages full of zeros before you've even processed the first byte.
> I don't see how that could ever be faster for large file cases.

It can't.
This is why read() is not much better than mmap() in those cases.

Nicolas

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid
  2010-02-22 18:01               ` Nicolas Pitre
@ 2010-02-22 19:56                 ` Junio C Hamano
  2010-02-22 20:52                   ` Nicolas Pitre
  0 siblings, 1 reply; 84+ messages in thread
From: Junio C Hamano @ 2010-02-22 19:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Zygo Blaxell, Dmitry Potapov, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

Nicolas Pitre <nico@fluxnic.net> writes:

> We do have to make two separate passes anyway.  The first pass is to
> hash the data only, and if that hash already exists in the object store
> then we call it done and skip over the deflate process, which is still
> the dominant cost.  And that happens quite often.
>
> However, with a really large file, it becomes advantageous to simply do
> the hash and deflate in parallel one chunk at a time, and simply
> discard the newly created objects if they happen to already exist.
> That's the whole idea behind the newly introduced core.bigFileThreshold
> config variable (but the code to honor it in sha1_file.c doesn't exist
> yet).

core.bigFileThreshold could be used in sha1_file.c to decide to write
straight into a new packfile; that would avoid both the later repacking
cost and the cross-directory rename issue for loose object files.

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid
  2010-02-22 19:56                 ` Junio C Hamano
@ 2010-02-22 20:52                   ` Nicolas Pitre
  0 siblings, 0 replies; 84+ messages in thread
From: Nicolas Pitre @ 2010-02-22 20:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zygo Blaxell, Dmitry Potapov, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

On Mon, 22 Feb 2010, Junio C Hamano wrote:

> Nicolas Pitre <nico@fluxnic.net> writes:
>
> > We do have to make two separate passes anyway.  The first pass is to
> > hash the data only, and if that hash already exists in the object
> > store then we call it done and skip over the deflate process, which
> > is still the dominant cost.  And that happens quite often.
> >
> > However, with a really large file, it becomes advantageous to simply
> > do the hash and deflate in parallel one chunk at a time, and simply
> > discard the newly created objects if they happen to already exist.
> > That's the whole idea behind the newly introduced
> > core.bigFileThreshold config variable (but the code to honor it in
> > sha1_file.c doesn't exist yet).
>
> core.bigFileThreshold could be used in sha1_file.c to decide to write
> straight into a new packfile; that would avoid both the later repacking
> cost and the cross-directory rename issue for loose object files.

Yep, that's the plan.  Let's find the time now.

Nicolas

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid
  2010-02-22 17:31             ` Zygo Blaxell
  2010-02-22 18:01               ` Nicolas Pitre
@ 2010-02-22 18:05               ` Dmitry Potapov
  2010-02-22 18:14                 ` Nicolas Pitre
  1 sibling, 1 reply; 84+ messages in thread
From: Dmitry Potapov @ 2010-02-22 18:05 UTC (permalink / raw)
  To: Zygo Blaxell
  Cc: Nicolas Pitre, Junio C Hamano, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

On Mon, Feb 22, 2010 at 8:31 PM, Zygo Blaxell
<zblaxell@gibbs.hungrycats.org> wrote:
>
> If you're read()ing a chunk at a time into a fixed size buffer, and
> doing sha1 and deflate in chunks, the data should be copied once into CPU
> cache, processed with both algorithms, and replaced with new data from
> the next chunk.

Currently, we calculate the SHA-1, then look up whether an object with
this SHA-1 exists, and if it does not, we deflate the data and write it
to the object storage.  So, we avoid the deflate and write cost if the
object already exists.  Moreover, when we deflate data, we create the
temporary file in the same directory where the target object will be
stored, thus avoiding a cross-directory rename (which is important for
some reason, but I don't remember why).  So, creating the temporary file
requires knowing the first two hex digits of the SHA-1, which you cannot
know without calculating the SHA-1.

So, the idea of processing the file in chunks is very attractive, but it
has two drawbacks:

1. extra cost (deflating+writing) when the object is already stored
2. some issues with cross-directory renaming

Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: [PATCH] Teach "git add" and friends to be paranoid
  2010-02-22 18:05               ` Dmitry Potapov
@ 2010-02-22 18:14                 ` Nicolas Pitre
  0 siblings, 0 replies; 84+ messages in thread
From: Nicolas Pitre @ 2010-02-22 18:14 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Zygo Blaxell, Junio C Hamano, Ilari Liusvaara, Thomas Rast,
	Jonathan Nieder, git

On Mon, 22 Feb 2010, Dmitry Potapov wrote:

> Currently, we calculate the SHA-1, then look up whether an object with
> this SHA-1 exists, and if it does not, we deflate the data and write it
> to the object storage.  So, we avoid the deflate and write cost if the
> object already exists.  Moreover, when we deflate data, we create the
> temporary file in the same directory where the target object will be
> stored, thus avoiding a cross-directory rename (which is important for
> some reason, but I don't remember why).  So, creating the temporary
> file requires knowing the first two hex digits of the SHA-1, which you
> cannot know without calculating the SHA-1.

Even that initial SHA-1 calculation can be done in chunks.  Worth doing
for large enough files only, though.

Nicolas

^ permalink raw reply	[flat|nested] 84+ messages in thread
* mmap with MAP_PRIVATE is useless (was Re: Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs)
  2010-02-12  0:27 ` Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs Jonathan Nieder
  2010-02-12  1:23   ` Zygo Blaxell
@ 2010-02-14  1:36   ` Paolo Bonzini
  2010-02-14  1:53     ` mmap with MAP_PRIVATE is useless Junio C Hamano
  1 sibling, 1 reply; 84+ messages in thread
From: Paolo Bonzini @ 2010-02-14 1:36 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, 569505-forwarded, Zygo Blaxell

> I tested this with a number of kernel versions from 2.6.27 to 2.6.31.
> All of them reproduced this problem.  I checked this because strace
> shows 'git add' doing a mmap(..., MAP_PRIVATE, ...) of its input file,
> so I was wondering if there might have been a recent change in mmap()
> behavior in either git or the kernel.

From mmap(2): "it is unspecified whether changes made to the file after
the mmap() call are visible in the mapped region".

You may think that doing a dummy "*p = *p" every 4096 bytes `fixes' it
(because it causes copy-on-write for every page), but even that does not
work, because you can get a SIGBUS if the file is truncated while you
have it mapped privately (e.g. by fopen ("file", "w") or open with
O_TRUNC).  Testcase:

#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main ()
{
  system ("echo foo > file.test");
  int f = open ("file.test", O_RDONLY);
  char *p = mmap (NULL, 4096, PROT_READ, MAP_PRIVATE, f, 0);
  close (f);
  printf ("%s", p);       // prints "foo\n"

  f = open ("file.test", O_RDWR | O_CREAT | O_TRUNC, 0666);
  write (f, "bar\n", 4);  // comment this out and the next printf SIGBUSes
  printf ("%s", p);       // prints "bar\n"
  return 0;
}

This means that MAP_PRIVATE is utterly useless.

Paolo

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: mmap with MAP_PRIVATE is useless
  2010-02-14  1:36 ` mmap with MAP_PRIVATE is useless (was Re: Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs) Paolo Bonzini
@ 2010-02-14  1:53   ` Junio C Hamano
  2010-02-14  2:11     ` Paolo Bonzini
  0 siblings, 1 reply; 84+ messages in thread
From: Junio C Hamano @ 2010-02-14 1:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jonathan Nieder, git, 569505-forwarded, Zygo Blaxell

Paolo Bonzini <bonzini@gnu.org> writes:

> This means that MAP_PRIVATE is utterly useless.

I do not think we ever used MAP_PRIVATE in order to protect ourselves
from uncontrolled changes made by the outside world in the first place.

Back when most of these mmap calls were written by Linus and myself, we
weren't interested in using MAP_PRIVATE, or any other trick for that
matter, to deal with the case where the user tells git to go index a
file, and then mucks with the file before git finishes and gives back
control.

We do use mmap in read-write mode when reading from the index file, and
we use MAP_PRIVATE to protect the outside world from our writing into
the mapped memory.  As far as I know that is the only mmap for which
MAP_PRIVATE matters in the core git codebase.

Our calls to mmap() almost all have MAP_PRIVATE, even for read-only
mmap, but that is more or less from inertia, aka "an existing call to
mmap is with these options, I'll add another call imitating that".

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: mmap with MAP_PRIVATE is useless
  2010-02-14  1:53 ` mmap with MAP_PRIVATE is useless Junio C Hamano
@ 2010-02-14  2:11   ` Paolo Bonzini
  0 siblings, 0 replies; 84+ messages in thread
From: Paolo Bonzini @ 2010-02-14 2:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git, Zygo Blaxell

On 02/14/2010 02:53 AM, Junio C Hamano wrote:
> Back when most of these mmap calls were written by Linus and myself,
> we weren't interested in using MAP_PRIVATE, or any other trick for
> that matter, to deal with the case where the user tells git to go
> index a file, and then mucks with the file before git finishes and
> gives back control.

Eh, this one in particular (in index_fd) is quite ancient...

commit e83c5163316f89bfbde7d9ab23ca2e25604af290
Author: Linus Torvalds <torvalds@ppc970.osdl.org>
Date:   Thu Apr 7 15:13:13 2005 -0700

    Initial revision of "git", the information manager from hell

There were three mmap calls -- in read_sha1_file, read_cache and
index_fd -- and all three were of the same

    mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0)

shape.

Paolo

^ permalink raw reply	[flat|nested] 84+ messages in thread