* [BUG] index corruption with git commit -p @ 2018-09-01 21:41 Luc Van Oostenryck 2018-09-01 22:17 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 19+ messages in thread From: Luc Van Oostenryck @ 2018-09-01 21:41 UTC (permalink / raw) To: git Hi, I've just had a scary error: error: index uses $?+? extension, which we do not understand fatal: index file corrupt Things were quickly recovered by deleting the index but it clearly looks to a but to me. Here are the steps to reproduce it: $ git clone git://github.com/lucvoo/sparse-dev.git <somedir> $ cd <somedir> $ git co index-corruption $ git rm -r validation/ Documentation/ $ git commit -m <some message> -p $ git status error: index uses $?+? extension, which we do not understand fatal: index file corrupt The 'extension' pattern '$?+?', can vary a bit, sometimes it's just '????', but always seems 4 chars. If the commit command doesn't use the '-p' flag, there is no problem. The repository itself is not corrupted, it's only the index. It happends with git 2.18.0 and 2.17.0 -- Luc Van Oostenryck ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-01 21:41 [BUG] index corruption with git commit -p Luc Van Oostenryck @ 2018-09-01 22:17 ` Ævar Arnfjörð Bjarmason 2018-09-02 5:08 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-09-01 22:17 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: git, Kevin Willford On Sat, Sep 01 2018, Luc Van Oostenryck wrote: > Hi, > > I've just had a scary error: > error: index uses $?+? extension, which we do not understand > fatal: index file corrupt > > Things were quickly recovered by deleting the index but it clearly > looks to a but to me. > > Here are the steps to reproduce it: > $ git clone git://github.com/lucvoo/sparse-dev.git <somedir> > $ cd <somedir> > $ git co index-corruption > $ git rm -r validation/ Documentation/ > $ git commit -m <some message> -p > $ git status > error: index uses $?+? extension, which we do not understand > fatal: index file corrupt > > > The 'extension' pattern '$?+?', can vary a bit, sometimes > it's just '????', but always seems 4 chars. > If the commit command doesn't use the '-p' flag, there is no > problem. The repository itself is not corrupted, it's only > the index. It happends with git 2.18.0 and 2.17.0 Yeah this is a bug, I didn't dig much but testing with this script down to 2.8.0: #!/bin/sh cd ~/g/git make -j $(parallel --number-of-cores) USE_LIBPCRE2=YesPlease CFLAGS="-O0 -g -ggdb3" DEVELOPER=1 DEVOPTS=no-error NO_OPENSSL=Y all ( rm -rf /tmp/x; ~/g/git/git --exec-path=/home/avar/g/git clone git://github.com/lucvoo/sparse-dev.git /tmp/x && cd /tmp/x && ~/g/git/git --exec-path=/home/avar/g/git checkout index-corruption && ~/g/git/git --exec-path=/home/avar/g/git rm -r validation/ Documentation/ && ~/g/git/git --exec-path=/home/avar/g/git commit -p ) ~/g/git/git --exec-path=/home/avar/g/git -C /tmp/x status if ~/g/git/git --exec-path=/home/avar/g/git -C /tmp/x status then exit 0 else exit 1 fi I found that the first bad commit was: 680ee550d7 ("commit: skip discarding the index if there is no pre-commit hook", 2017-08-14) Now, note the two invocations of "status" in my script. Before we'd already been complaining about a bad index, but after that commit is the first time we started getting a persistent error, and indeed even reverting it now on top of master makes the error non-persistent. So not a fix, but a strong signal to see where we should start looking. I.e. the index file handling around discard_cache() & "interactive" in commit.c is likely what's broken. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-01 22:17 ` Ævar Arnfjörð Bjarmason @ 2018-09-02 5:08 ` Jeff King 2018-09-02 7:12 ` Duy Nguyen 0 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2018-09-02 5:08 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Luc Van Oostenryck, git, Kevin Willford On Sun, Sep 02, 2018 at 12:17:53AM +0200, Ævar Arnfjörð Bjarmason wrote: > > Here are the steps to reproduce it: > > $ git clone git://github.com/lucvoo/sparse-dev.git <somedir> > > $ cd <somedir> > > $ git co index-corruption > > $ git rm -r validation/ Documentation/ > > $ git commit -m <some message> -p > > $ git status > > error: index uses $?+? extension, which we do not understand > > fatal: index file corrupt > > > > The 'extension' pattern '$?+?', can vary a bit, sometimes > > it's just '????', but always seems 4 chars. > > If the commit command doesn't use the '-p' flag, there is no > > problem. The repository itself is not corrupted, it's only > > the index. It happends with git 2.18.0 and 2.17.0 > > Yeah this is a bug, I didn't dig much but testing with this script down > to 2.8.0: > [...] > I found that the first bad commit was: 680ee550d7 ("commit: skip > discarding the index if there is no pre-commit hook", 2017-08-14) I think it's much older than that. I set up my test repo like this: git clone git://github.com/lucvoo/sparse-dev.git cd sparse-dev git checkout --detach and then bisected with this script: cd /path/to/sparse-dev rm .git/index git reset --hard index-corruption && git rm -q -r validation/ Documentation/ && git commit -qm foo -p && git status Since a33fc72fe9 (read-cache: force_verify_index_checksum, 2017-04-14), that produces the corrupt extension error. But before that, I consistently get: error: bad index file sha1 signature fatal: index file corrupt from git-commit. And that bisects back to 9c4d6c0297 (cache-tree: Write updated cache-tree after commit, 2014-07-13). If I revert that commit (which takes some untangling, see below), then the problem seems to go away. Here's the patch I tried on top of the current master, though I think it is actually the first hunk that is making the difference. --- diff --git a/builtin/commit.c b/builtin/commit.c index 0d9828e29e..779c5e2cb5 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -359,13 +359,6 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix discard_cache(); read_cache_from(get_lock_file_path(&index_lock)); - if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { - if (reopen_lock_file(&index_lock) < 0) - die(_("unable to write index file")); - if (write_locked_index(&the_index, &index_lock, 0)) - die(_("unable to update temporary index")); - } else - warning(_("Failed to update main cache tree")); commit_style = COMMIT_NORMAL; ret = get_lock_file_path(&index_lock); @@ -408,9 +401,6 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix if (!only && !pathspec.nr) { hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR); refresh_cache_or_die(refresh_flags); - if (active_cache_changed - || !cache_tree_fully_valid(active_cache_tree)) - update_main_cache_tree(WRITE_TREE_SILENT); if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK | SKIP_IF_UNCHANGED)) die(_("unable to write new_index file")); @@ -457,7 +447,6 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR); add_remove_files(&partial); refresh_cache(REFRESH_QUIET); - update_main_cache_tree(WRITE_TREE_SILENT); if (write_locked_index(&the_index, &index_lock, 0)) die(_("unable to write new_index file")); -Peff ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-02 5:08 ` Jeff King @ 2018-09-02 7:12 ` Duy Nguyen 2018-09-02 7:24 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Duy Nguyen @ 2018-09-02 7:12 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, Luc Van Oostenryck, git, Kevin Willford On Sun, Sep 02, 2018 at 01:08:03AM -0400, Jeff King wrote: > On Sun, Sep 02, 2018 at 12:17:53AM +0200, Ævar Arnfjörð Bjarmason wrote: > > > > Here are the steps to reproduce it: > > > $ git clone git://github.com/lucvoo/sparse-dev.git <somedir> > > > $ cd <somedir> > > > $ git co index-corruption > > > $ git rm -r validation/ Documentation/ > > > $ git commit -m <some message> -p > > > $ git status > > > error: index uses $?+? extension, which we do not understand > > > fatal: index file corrupt > > > > > > The 'extension' pattern '$?+?', can vary a bit, sometimes > > > it's just '????', but always seems 4 chars. > > > If the commit command doesn't use the '-p' flag, there is no > > > problem. The repository itself is not corrupted, it's only > > > the index. It happends with git 2.18.0 and 2.17.0 > > > > Yeah this is a bug, I didn't dig much but testing with this script down > > to 2.8.0: > > [...] > > I found that the first bad commit was: 680ee550d7 ("commit: skip > > discarding the index if there is no pre-commit hook", 2017-08-14) > > I think it's much older than that. I set up my test repo like this: > > git clone git://github.com/lucvoo/sparse-dev.git > cd sparse-dev > git checkout --detach > > and then bisected with this script: > > cd /path/to/sparse-dev > rm .git/index > git reset --hard index-corruption && > git rm -q -r validation/ Documentation/ && > git commit -qm foo -p && > git status > > Since a33fc72fe9 (read-cache: force_verify_index_checksum, 2017-04-14), > that produces the corrupt extension error. But before that, I > consistently get: > > error: bad index file sha1 signature > fatal: index file corrupt > > from git-commit. And that bisects back to 9c4d6c0297 (cache-tree: Write > updated cache-tree after commit, 2014-07-13). > > If I revert that commit (which takes some untangling, see below), then > the problem seems to go away. Here's the patch I tried on top of the > current master, though I think it is actually the first hunk that is > making the difference. > > --- > diff --git a/builtin/commit.c b/builtin/commit.c > index 0d9828e29e..779c5e2cb5 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -359,13 +359,6 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix > > discard_cache(); > read_cache_from(get_lock_file_path(&index_lock)); > - if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { > - if (reopen_lock_file(&index_lock) < 0) > - die(_("unable to write index file")); > - if (write_locked_index(&the_index, &index_lock, 0)) > - die(_("unable to update temporary index")); > - } else > - warning(_("Failed to update main cache tree")); > Narrowing down to this does help. This patch seems to fix it to me. I guess we have some leftover from the interactive add that should not be there after we have written the new index. diff --git a/builtin/commit.c b/builtin/commit.c index 2be7bdb331..60f30b3780 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -432,6 +432,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { if (reopen_lock_file(&index_lock) < 0) die(_("unable to write index file")); + ftruncate(index_lock.tempfile->fd, 0); if (write_locked_index(&the_index, &index_lock, 0)) die(_("unable to update temporary index")); } else -- Duy ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-02 7:12 ` Duy Nguyen @ 2018-09-02 7:24 ` Jeff King 2018-09-02 7:53 ` Luc Van Oostenryck 2018-09-04 15:57 ` Junio C Hamano 0 siblings, 2 replies; 19+ messages in thread From: Jeff King @ 2018-09-02 7:24 UTC (permalink / raw) To: Duy Nguyen Cc: Ævar Arnfjörð Bjarmason, Luc Van Oostenryck, git, Kevin Willford On Sun, Sep 02, 2018 at 09:12:04AM +0200, Duy Nguyen wrote: > > diff --git a/builtin/commit.c b/builtin/commit.c > > index 0d9828e29e..779c5e2cb5 100644 > > --- a/builtin/commit.c > > +++ b/builtin/commit.c > > @@ -359,13 +359,6 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix > > > > discard_cache(); > > read_cache_from(get_lock_file_path(&index_lock)); > > - if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { > > - if (reopen_lock_file(&index_lock) < 0) > > - die(_("unable to write index file")); > > - if (write_locked_index(&the_index, &index_lock, 0)) > > - die(_("unable to update temporary index")); > > - } else > > - warning(_("Failed to update main cache tree")); > > > > Narrowing down to this does help. This patch seems to fix it to me. I > guess we have some leftover from the interactive add that should not > be there after we have written the new index. > > diff --git a/builtin/commit.c b/builtin/commit.c > index 2be7bdb331..60f30b3780 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -432,6 +432,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix > if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { > if (reopen_lock_file(&index_lock) < 0) > die(_("unable to write index file")); > + ftruncate(index_lock.tempfile->fd, 0); > if (write_locked_index(&the_index, &index_lock, 0)) > die(_("unable to update temporary index")); > } else Doh, of course. I even thought about this issue and dug all the way into reopen_lock_file(), but for some reason temporarily forgot that O_WRONLY does not imply O_TRUNC. Arguably this should be the default for reopen_lockfile(), as getting a write pointer into an existing file is not ever going to be useful for the way Git uses lockfiles. Opening with O_APPEND could conceivably be useful, but it's pretty unlikely (and certainly not helpful here, and this is the only caller). Alternatively, the function should just take open(2) flags. At any rate, I think this perfectly explains the behavior we're seeing. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-02 7:24 ` Jeff King @ 2018-09-02 7:53 ` Luc Van Oostenryck 2018-09-02 8:02 ` Jeff King 2018-09-04 15:57 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Luc Van Oostenryck @ 2018-09-02 7:53 UTC (permalink / raw) To: Jeff King Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, git, Kevin Willford On Sun, Sep 02, 2018 at 03:24:09AM -0400, Jeff King wrote: > On Sun, Sep 02, 2018 at 09:12:04AM +0200, Duy Nguyen wrote: > > diff --git a/builtin/commit.c b/builtin/commit.c > > index 2be7bdb331..60f30b3780 100644 > > --- a/builtin/commit.c > > +++ b/builtin/commit.c > > @@ -432,6 +432,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix > > if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { > > if (reopen_lock_file(&index_lock) < 0) > > die(_("unable to write index file")); > > + ftruncate(index_lock.tempfile->fd, 0); > > if (write_locked_index(&the_index, &index_lock, 0)) > > die(_("unable to update temporary index")); > > } else > > Doh, of course. I even thought about this issue and dug all the way into > reopen_lock_file(), but for some reason temporarily forgot that O_WRONLY > does not imply O_TRUNC. > > Arguably this should be the default for reopen_lockfile(), as getting a > write pointer into an existing file is not ever going to be useful for > the way Git uses lockfiles. Opening with O_APPEND could conceivably be > useful, but it's pretty unlikely (and certainly not helpful here, and > this is the only caller). Alternatively, the function should just take > open(2) flags. > > At any rate, I think this perfectly explains the behavior we're seeing. Yes, this certainly make sense. For fun (and testing) I tried to take the problem in the other direction: * why hasn't this be detected earlier (I'm reasonably be sure I did the same operation a few times already without seeing the corruption)? * how easy it is to reproduce the problem? * Is it enough to do an interactive commit with nothing needing interactive? So I tried the following and some variants: > for i in $(seq -w 0 100); do echo $i > f$i; done > git add f* && git commit -m add f* && git rm -q f* && git commit -m rm -p but I didn't succeed to recreate the problem. So I'm still wondering what is special in my repository & tree that trigger the corruption. Anyway, thanks to al of you for looking at this so quickly. -- Luc ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-02 7:53 ` Luc Van Oostenryck @ 2018-09-02 8:02 ` Jeff King 0 siblings, 0 replies; 19+ messages in thread From: Jeff King @ 2018-09-02 8:02 UTC (permalink / raw) To: Luc Van Oostenryck Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, git, Kevin Willford On Sun, Sep 02, 2018 at 09:53:53AM +0200, Luc Van Oostenryck wrote: > > At any rate, I think this perfectly explains the behavior we're seeing. > > Yes, this certainly make sense. > > For fun (and testing) I tried to take the problem in the other direction: > * why hasn't this be detected earlier (I'm reasonably be sure I did the > same operation a few times already without seeing the corruption)? > * how easy it is to reproduce the problem? > * Is it enough to do an interactive commit with nothing needing interactive? > > So I tried the following and some variants: > > for i in $(seq -w 0 100); do echo $i > f$i; done > > git add f* && git commit -m add f* && git rm -q f* && git commit -m rm -p > > but I didn't succeed to recreate the problem. So I'm still wondering > what is special in my repository & tree that trigger the corruption. I think the partial deletion is important, because it means that the resulting index is going to be smaller. And the truncation problem only matters when we go from a larger file to a smaller one (since otherwise overwrite the old content completely). And it doesn't seem to trigger without the interactive "-p". I think that's not directly related, but it somehow triggers the case where we actually need to update the cache tree in this particular order. That's pretty hand-wavy, but I think it gives a sense of why most people didn't run into this. I do wish I understood better what it would take to create a minimal example for the test suite. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-02 7:24 ` Jeff King 2018-09-02 7:53 ` Luc Van Oostenryck @ 2018-09-04 15:57 ` Junio C Hamano 2018-09-04 16:13 ` Duy Nguyen 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2018-09-04 15:57 UTC (permalink / raw) To: Jeff King Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Luc Van Oostenryck, git, Kevin Willford Jeff King <peff@peff.net> writes: >> diff --git a/builtin/commit.c b/builtin/commit.c >> index 2be7bdb331..60f30b3780 100644 >> --- a/builtin/commit.c >> +++ b/builtin/commit.c >> @@ -432,6 +432,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix >> if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { >> if (reopen_lock_file(&index_lock) < 0) >> die(_("unable to write index file")); >> + ftruncate(index_lock.tempfile->fd, 0); >> if (write_locked_index(&the_index, &index_lock, 0)) >> die(_("unable to update temporary index")); >> } else > > Doh, of course. I even thought about this issue and dug all the way into > reopen_lock_file(), but for some reason temporarily forgot that O_WRONLY > does not imply O_TRUNC. > > Arguably this should be the default for reopen_lockfile(), as getting a > write pointer into an existing file is not ever going to be useful for > the way Git uses lockfiles. Opening with O_APPEND could conceivably be > useful, but it's pretty unlikely (and certainly not helpful here, and > this is the only caller). Alternatively, the function should just take > open(2) flags. > > At any rate, I think this perfectly explains the behavior we're seeing. Thanks all for digging this down (I am a bit jealous to see that I seem to have missed all this fun over the weekend X-<). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-04 15:57 ` Junio C Hamano @ 2018-09-04 16:13 ` Duy Nguyen 2018-09-04 16:38 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Duy Nguyen @ 2018-09-04 16:13 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Tue, Sep 4, 2018 at 5:57 PM Junio C Hamano <gitster@pobox.com> wrote: > > Jeff King <peff@peff.net> writes: > > >> diff --git a/builtin/commit.c b/builtin/commit.c > >> index 2be7bdb331..60f30b3780 100644 > >> --- a/builtin/commit.c > >> +++ b/builtin/commit.c > >> @@ -432,6 +432,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix > >> if (update_main_cache_tree(WRITE_TREE_SILENT) == 0) { > >> if (reopen_lock_file(&index_lock) < 0) > >> die(_("unable to write index file")); > >> + ftruncate(index_lock.tempfile->fd, 0); > >> if (write_locked_index(&the_index, &index_lock, 0)) > >> die(_("unable to update temporary index")); > >> } else > > > > Doh, of course. I even thought about this issue and dug all the way into > > reopen_lock_file(), but for some reason temporarily forgot that O_WRONLY > > does not imply O_TRUNC. > > > > Arguably this should be the default for reopen_lockfile(), as getting a > > write pointer into an existing file is not ever going to be useful for > > the way Git uses lockfiles. Opening with O_APPEND could conceivably be > > useful, but it's pretty unlikely (and certainly not helpful here, and > > this is the only caller). Alternatively, the function should just take > > open(2) flags. > > > > At any rate, I think this perfectly explains the behavior we're seeing. > > Thanks all for digging this down (I am a bit jealous to see that I > seem to have missed all this fun over the weekend X-<). And just to be clear I'm looking forward to a patch from Jeff to fix this since he clearly put more thoughts on this than me. With commit.c being the only user of reopen_lock_file() I guess it's even ok to just stick O_TRUNC in there and worry about O_APPEND when a new caller needs that. -- Duy ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] index corruption with git commit -p 2018-09-04 16:13 ` Duy Nguyen @ 2018-09-04 16:38 ` Jeff King 2018-09-04 23:36 ` [PATCH] reopen_tempfile(): truncate opened file Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2018-09-04 16:38 UTC (permalink / raw) To: Duy Nguyen Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Tue, Sep 04, 2018 at 06:13:36PM +0200, Duy Nguyen wrote: > > > Doh, of course. I even thought about this issue and dug all the way into > > > reopen_lock_file(), but for some reason temporarily forgot that O_WRONLY > > > does not imply O_TRUNC. > > > > > > Arguably this should be the default for reopen_lockfile(), as getting a > > > write pointer into an existing file is not ever going to be useful for > > > the way Git uses lockfiles. Opening with O_APPEND could conceivably be > > > useful, but it's pretty unlikely (and certainly not helpful here, and > > > this is the only caller). Alternatively, the function should just take > > > open(2) flags. > > > > > > At any rate, I think this perfectly explains the behavior we're seeing. > > > > Thanks all for digging this down (I am a bit jealous to see that I > > seem to have missed all this fun over the weekend X-<). > > And just to be clear I'm looking forward to a patch from Jeff to fix > this since he clearly put more thoughts on this than me. With commit.c > being the only user of reopen_lock_file() I guess it's even ok to just > stick O_TRUNC in there and worry about O_APPEND when a new caller > needs that. That's the way I'm leaning to. The fix is obviously a one-liner, but I was hoping to construct a minimal test case. I just haven't gotten around to it yet. The bug is ancient, so I don't think it's important for v2.19. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] reopen_tempfile(): truncate opened file 2018-09-04 16:38 ` Jeff King @ 2018-09-04 23:36 ` Jeff King 2018-09-05 15:27 ` Duy Nguyen 2018-09-05 18:48 ` Luc Van Oostenryck 0 siblings, 2 replies; 19+ messages in thread From: Jeff King @ 2018-09-04 23:36 UTC (permalink / raw) To: Duy Nguyen Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Tue, Sep 04, 2018 at 12:38:07PM -0400, Jeff King wrote: > > And just to be clear I'm looking forward to a patch from Jeff to fix > > this since he clearly put more thoughts on this than me. With commit.c > > being the only user of reopen_lock_file() I guess it's even ok to just > > stick O_TRUNC in there and worry about O_APPEND when a new caller > > needs that. > > That's the way I'm leaning to. The fix is obviously a one-liner, but I > was hoping to construct a minimal test case. I just haven't gotten > around to it yet. It turned out not to be too bad to write a test. It feels a little like black magic, since I empirically determined a way in which the cache-tree happens to shrink with the current code. But that assumption is tested with a sanity check, so we'll at least know if it becomes a noop. > The bug is ancient, so I don't think it's important for v2.19. The patch below should work on master or maint. We could do a fix directly on top of the bug, but merging-up is weird (because the buggy code became part of a reusable module). -- >8 -- Subject: [PATCH] reopen_tempfile(): truncate opened file We provide a reopen_tempfile() function, which is in turn used by reopen_lockfile(). The idea is that a caller may want to rewrite the tempfile without letting go of the lock. And that's what our one caller does: after running add--interactive, "commit -p" will update the cache-tree extension of the index and write out the result, all while holding the lock. However, because we open the file with only the O_WRONLY flag, the existing index content is left in place, and we overwrite it starting at position 0. If the new index after updating the cache-tree is smaller than the original, those final bytes are not overwritten and remain in the file. This results in a corrupt index, since those cruft bytes are interpreted as part of the trailing hash (or even as an extension, if there are enough bytes). This bug actually pre-dates reopen_tempfile(); the original code from 9c4d6c0297 (cache-tree: Write updated cache-tree after commit, 2014-07-13) has the same bug, and those lines were eventually refactored into the tempfile module. Nobody noticed until now for two reasons: - the bug can only be triggered in interactive mode ("commit -p" or "commit -i") - the size of the index must shrink after updating the cache-tree, which implies a non-trivial deletion. Notice that the included test actually has to create a 2-deep hierarchy. A single level is not enough to actually cause shrinkage. The fix is to truncate the file before writing out the second index. We can do that at the caller by using ftruncate(). But we shouldn't have to do that. There is no other place in Git where we want to open a file and overwrite bytes, making reopen_tempfile() a confusing and error-prone interface. Let's pass O_TRUNC there, which gives callers the same state they had after initially opening the file or lock. It's possible that we could later add a caller that wants something else (e.g., to open with O_APPEND). But this is the only caller we've had in the history of the codebase. Let's punt on doing anything more clever until another one comes along. Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Jeff King <peff@peff.net> --- lockfile.h | 4 ++-- t/t0090-cache-tree.sh | 18 ++++++++++++++++++ tempfile.c | 2 +- tempfile.h | 4 ++-- 4 files changed, 23 insertions(+), 5 deletions(-) diff --git a/lockfile.h b/lockfile.h index f401c979f0..35403ccc0d 100644 --- a/lockfile.h +++ b/lockfile.h @@ -263,8 +263,8 @@ static inline int close_lock_file_gently(struct lock_file *lk) * nobody else) to inspect the contents you wrote, while still * holding the lock yourself. * - * * `reopen_lock_file()` to reopen the lockfile. Make further updates - * to the contents. + * * `reopen_lock_file()` to reopen the lockfile, truncating the existing + * contents. Write out the new contents. * * * `commit_lock_file()` to make the final version permanent. */ diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 7de40141ca..94fcb4a78e 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -161,6 +161,24 @@ test_expect_success PERL 'commit --interactive gives cache-tree on partial commi test_cache_tree ' +test_expect_success PERL 'commit -p with shrinking cache-tree' ' + mkdir -p deep/subdir && + echo content >deep/subdir/file && + git add deep && + git commit -m add && + git rm -r deep && + + before=$(wc -c <.git/index) && + git commit -m delete -p && + after=$(wc -c <.git/index) && + + # double check that the index shrank + test $before -gt $after && + + # and that our index was not corrupted + git fsck +' + test_expect_success 'commit in child dir has cache-tree' ' mkdir dir && >dir/child.t && diff --git a/tempfile.c b/tempfile.c index 139ecd97f8..d43ad8c191 100644 --- a/tempfile.c +++ b/tempfile.c @@ -279,7 +279,7 @@ int reopen_tempfile(struct tempfile *tempfile) BUG("reopen_tempfile called for an inactive object"); if (0 <= tempfile->fd) BUG("reopen_tempfile called for an open object"); - tempfile->fd = open(tempfile->filename.buf, O_WRONLY); + tempfile->fd = open(tempfile->filename.buf, O_WRONLY|O_TRUNC); return tempfile->fd; } diff --git a/tempfile.h b/tempfile.h index 36434eb6fa..61d8dc4d1b 100644 --- a/tempfile.h +++ b/tempfile.h @@ -236,8 +236,8 @@ extern int close_tempfile_gently(struct tempfile *tempfile); * it (and nobody else) to inspect or even modify the file's * contents. * - * * `reopen_tempfile()` to reopen the temporary file. Make further - * updates to the contents. + * * `reopen_tempfile()` to reopen the temporary file, truncating the existing + * contents. Write out the new contents. * * * `rename_tempfile()` to move the file to its permanent location. */ -- 2.19.0.rc1.605.g83416793fa ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-04 23:36 ` [PATCH] reopen_tempfile(): truncate opened file Jeff King @ 2018-09-05 15:27 ` Duy Nguyen 2018-09-05 15:35 ` Jeff King 2018-09-05 18:48 ` Luc Van Oostenryck 1 sibling, 1 reply; 19+ messages in thread From: Duy Nguyen @ 2018-09-05 15:27 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Wed, Sep 5, 2018 at 1:36 AM Jeff King <peff@peff.net> wrote: > It turned out not to be too bad to write a test. It feels a little like > black magic, since I empirically determined a way in which the > cache-tree happens to shrink with the current code. Aha! I attempted to reproduce with a verylongpathname and failed, but then I had the main index portion in mind, not cache-tree. > diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh > index 7de40141ca..94fcb4a78e 100755 > --- a/t/t0090-cache-tree.sh > +++ b/t/t0090-cache-tree.sh > @@ -161,6 +161,24 @@ test_expect_success PERL 'commit --interactive gives cache-tree on partial commi > test_cache_tree > ' > > +test_expect_success PERL 'commit -p with shrinking cache-tree' ' > + mkdir -p deep/subdir && > + echo content >deep/subdir/file && > + git add deep && > + git commit -m add && > + git rm -r deep && OK so I guess at this step, we invalidate some cache-tree blocks, but we write the same blocks down (with "invalid" flag), so pretty much the same size as before. > + > + before=$(wc -c <.git/index) && > + git commit -m delete -p && Then inside this command we recompute cache-tree and throw away the invalidated cache-tree blocks for deep and deep/subdir, which shrinks the index. Makes sense. > + after=$(wc -c <.git/index) && > + > + # double check that the index shrank > + test $before -gt $after && > + > + # and that our index was not corrupted > + git fsck If the index is not shrunk, we parse remaining rubbish as extensions. If by chance the rubbish extension name is in uppercase, then we ignore (and not flag it as error). But then the chances of the next 4 bytes being the "right" extension size is so small that we would end up flagging it as bad extension anyway. So it's good. But if you want to be even stricter (not necessary in my opinion), make sure that stderr is empty. The rest of the patch is obviously correct. -- Duy ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 15:27 ` Duy Nguyen @ 2018-09-05 15:35 ` Jeff King 2018-09-05 15:39 ` Duy Nguyen 0 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2018-09-05 15:35 UTC (permalink / raw) To: Duy Nguyen Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Wed, Sep 05, 2018 at 05:27:11PM +0200, Duy Nguyen wrote: > > +test_expect_success PERL 'commit -p with shrinking cache-tree' ' > > + mkdir -p deep/subdir && > > + echo content >deep/subdir/file && > > + git add deep && > > + git commit -m add && > > + git rm -r deep && > > OK so I guess at this step, we invalidate some cache-tree blocks, but > we write the same blocks down (with "invalid" flag), so pretty much > the same size as before. I didn't verify exactly what was in the index, but that was my understanding, too (well, it's a little smaller because we drop the actual index entries, but keep the invalidated cache-tree). I worry a little that "rm" might eventually learn to drop those invalidated bits. But hopefully finding this commit would lead that person to figure out another way to accomplish the same thing, or to decide that carrying the test forward isn't worth it. > > + after=$(wc -c <.git/index) && > > + > > + # double check that the index shrank > > + test $before -gt $after && > > + > > + # and that our index was not corrupted > > + git fsck > > If the index is not shrunk, we parse remaining rubbish as extensions. > If by chance the rubbish extension name is in uppercase, then we > ignore (and not flag it as error). But then the chances of the next 4 > bytes being the "right" extension size is so small that we would end > up flagging it as bad extension anyway. So it's good. But if you want > to be even stricter (not necessary in my opinion), make sure that > stderr is empty. In this case, the size difference is only a few bytes, so the rubbish actually ends up in the trailing sha1. The reason I use git-fsck here is that it actually verifies the whole sha1 (since normal index reads no longer do). In fact, a normal index read won't show any problem for this case (since it is _only_ the trailing sha1 which is junk, and we no longer verify it on every read). In the original sparse-dev case, the size of the rubbish is much larger (because we deleted a lot more entries), and we do interpret it as a bogus extension. But it also triggers here, because the trailing sha1 is _also_ wrong. So AFAIK this fsck catches everything and yields a non-zero exit in the error case. And it should work for even a single byte of rubbish. -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 15:35 ` Jeff King @ 2018-09-05 15:39 ` Duy Nguyen 2018-09-05 15:48 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Duy Nguyen @ 2018-09-05 15:39 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Wed, Sep 5, 2018 at 5:35 PM Jeff King <peff@peff.net> wrote: > > > + after=$(wc -c <.git/index) && > > > + > > > + # double check that the index shrank > > > + test $before -gt $after && > > > + > > > + # and that our index was not corrupted > > > + git fsck > > > > If the index is not shrunk, we parse remaining rubbish as extensions. > > If by chance the rubbish extension name is in uppercase, then we > > ignore (and not flag it as error). But then the chances of the next 4 > > bytes being the "right" extension size is so small that we would end > > up flagging it as bad extension anyway. So it's good. But if you want > > to be even stricter (not necessary in my opinion), make sure that > > stderr is empty. > > In this case, the size difference is only a few bytes, so the rubbish > actually ends up in the trailing sha1. The reason I use git-fsck here is > that it actually verifies the whole sha1 (since normal index reads no > longer do). In fact, a normal index read won't show any problem for this > case (since it is _only_ the trailing sha1 which is junk, and we no > longer verify it on every read). > > In the original sparse-dev case, the size of the rubbish is much larger > (because we deleted a lot more entries), and we do interpret it as a > bogus extension. But it also triggers here, because the trailing sha1 is > _also_ wrong. > > So AFAIK this fsck catches everything and yields a non-zero exit in the > error case. And it should work for even a single byte of rubbish. Yes you're right. I forgot about the trailing hash. -- Duy ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 15:39 ` Duy Nguyen @ 2018-09-05 15:48 ` Jeff King 2018-09-05 16:54 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2018-09-05 15:48 UTC (permalink / raw) To: Duy Nguyen Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Wed, Sep 05, 2018 at 05:39:19PM +0200, Duy Nguyen wrote: > On Wed, Sep 5, 2018 at 5:35 PM Jeff King <peff@peff.net> wrote: > > > > + after=$(wc -c <.git/index) && > > > > + > > > > + # double check that the index shrank > > > > + test $before -gt $after && > > > > + > > > > + # and that our index was not corrupted > > > > + git fsck > > > > > > If the index is not shrunk, we parse remaining rubbish as extensions. > > > If by chance the rubbish extension name is in uppercase, then we > > > ignore (and not flag it as error). But then the chances of the next 4 > > > bytes being the "right" extension size is so small that we would end > > > up flagging it as bad extension anyway. So it's good. But if you want > > > to be even stricter (not necessary in my opinion), make sure that > > > stderr is empty. > > > > In this case, the size difference is only a few bytes, so the rubbish > > actually ends up in the trailing sha1. The reason I use git-fsck here is > > that it actually verifies the whole sha1 (since normal index reads no > > longer do). In fact, a normal index read won't show any problem for this > > case (since it is _only_ the trailing sha1 which is junk, and we no > > longer verify it on every read). > > > > In the original sparse-dev case, the size of the rubbish is much larger > > (because we deleted a lot more entries), and we do interpret it as a > > bogus extension. But it also triggers here, because the trailing sha1 is > > _also_ wrong. > > > > So AFAIK this fsck catches everything and yields a non-zero exit in the > > error case. And it should work for even a single byte of rubbish. > > Yes you're right. I forgot about the trailing hash. Thanks, I was worried that I was missing something. ;) Maybe it is worth making that final comment: # and that the trailing hash in the index was not corrupted, # which should catch even a single byte of cruft git fsck -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 15:48 ` Jeff King @ 2018-09-05 16:54 ` Junio C Hamano 2018-09-05 16:56 ` Jeff King 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2018-09-05 16:54 UTC (permalink / raw) To: Jeff King Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford Jeff King <peff@peff.net> writes: >> > So AFAIK this fsck catches everything and yields a non-zero exit in the >> > error case. And it should work for even a single byte of rubbish. >> >> Yes you're right. I forgot about the trailing hash. > > Thanks, I was worried that I was missing something. ;) > > Maybe it is worth making that final comment: > > # and that the trailing hash in the index was not corrupted, > # which should catch even a single byte of cruft > git fsck Perhaps. I do not mind seeing an additional comment to explain why this requires PERL (it wasn't immediately obvious as I never use 'commit -p' myself), either. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 16:54 ` Junio C Hamano @ 2018-09-05 16:56 ` Jeff King 2018-09-05 17:01 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Jeff King @ 2018-09-05 16:56 UTC (permalink / raw) To: Junio C Hamano Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford On Wed, Sep 05, 2018 at 09:54:42AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > >> > So AFAIK this fsck catches everything and yields a non-zero exit in the > >> > error case. And it should work for even a single byte of rubbish. > >> > >> Yes you're right. I forgot about the trailing hash. > > > > Thanks, I was worried that I was missing something. ;) > > > > Maybe it is worth making that final comment: > > > > # and that the trailing hash in the index was not corrupted, > > # which should catch even a single byte of cruft > > git fsck > > Perhaps. I do not mind seeing an additional comment to explain why > this requires PERL (it wasn't immediately obvious as I never use > 'commit -p' myself), either. I thought the PERL prereq in the existing "-p" test of the commit header would explain it. ;) Do you prefer an in-code comment, or one in the commit message? -Peff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-05 16:56 ` Jeff King @ 2018-09-05 17:01 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2018-09-05 17:01 UTC (permalink / raw) To: Jeff King Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Van Oostenryck Luc, Git Mailing List, Kevin Willford Jeff King <peff@peff.net> writes: > On Wed, Sep 05, 2018 at 09:54:42AM -0700, Junio C Hamano wrote: > >> Jeff King <peff@peff.net> writes: >> >> >> > So AFAIK this fsck catches everything and yields a non-zero exit in the >> >> > error case. And it should work for even a single byte of rubbish. >> >> >> >> Yes you're right. I forgot about the trailing hash. >> > >> > Thanks, I was worried that I was missing something. ;) >> > >> > Maybe it is worth making that final comment: >> > >> > # and that the trailing hash in the index was not corrupted, >> > # which should catch even a single byte of cruft >> > git fsck >> >> Perhaps. I do not mind seeing an additional comment to explain why >> this requires PERL (it wasn't immediately obvious as I never use >> 'commit -p' myself), either. > > I thought the PERL prereq in the existing "-p" test of the commit header > would explain it. ;) > > Do you prefer an in-code comment, or one in the commit message? Neither ;-) Just like I think 'our index was not corrupted' as an explanation for 'git fsck' is sufficient, PERL sprinkled all over this script and all of them tend to be near "commit -i/-p" should be a good enough clue, I'd think. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] reopen_tempfile(): truncate opened file 2018-09-04 23:36 ` [PATCH] reopen_tempfile(): truncate opened file Jeff King 2018-09-05 15:27 ` Duy Nguyen @ 2018-09-05 18:48 ` Luc Van Oostenryck 1 sibling, 0 replies; 19+ messages in thread From: Luc Van Oostenryck @ 2018-09-05 18:48 UTC (permalink / raw) To: Jeff King Cc: Duy Nguyen, Junio C Hamano, Ævar Arnfjörð Bjarmason, Git Mailing List, Kevin Willford On Tue, Sep 04, 2018 at 07:36:43PM -0400, Jeff King wrote: > On Tue, Sep 04, 2018 at 12:38:07PM -0400, Jeff King wrote: > > > > And just to be clear I'm looking forward to a patch from Jeff to fix > > > this since he clearly put more thoughts on this than me. With commit.c > > > being the only user of reopen_lock_file() I guess it's even ok to just > > > stick O_TRUNC in there and worry about O_APPEND when a new caller > > > needs that. > > > > That's the way I'm leaning to. The fix is obviously a one-liner, but I > > was hoping to construct a minimal test case. I just haven't gotten > > around to it yet. > > It turned out not to be too bad to write a test. It feels a little like > black magic, since I empirically determined a way in which the > cache-tree happens to shrink with the current code. But that assumption > is tested with a sanity check, so we'll at least know if it becomes a > noop. > > > The bug is ancient, so I don't think it's important for v2.19. > > The patch below should work on master or maint. We could do a fix > directly on top of the bug, but merging-up is weird (because the buggy > code became part of a reusable module). It's great that you were able to create a reproducer relatively easily. Thank you, guys. -- Luc ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2018-09-05 18:48 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-09-01 21:41 [BUG] index corruption with git commit -p Luc Van Oostenryck 2018-09-01 22:17 ` Ævar Arnfjörð Bjarmason 2018-09-02 5:08 ` Jeff King 2018-09-02 7:12 ` Duy Nguyen 2018-09-02 7:24 ` Jeff King 2018-09-02 7:53 ` Luc Van Oostenryck 2018-09-02 8:02 ` Jeff King 2018-09-04 15:57 ` Junio C Hamano 2018-09-04 16:13 ` Duy Nguyen 2018-09-04 16:38 ` Jeff King 2018-09-04 23:36 ` [PATCH] reopen_tempfile(): truncate opened file Jeff King 2018-09-05 15:27 ` Duy Nguyen 2018-09-05 15:35 ` Jeff King 2018-09-05 15:39 ` Duy Nguyen 2018-09-05 15:48 ` Jeff King 2018-09-05 16:54 ` Junio C Hamano 2018-09-05 16:56 ` Jeff King 2018-09-05 17:01 ` Junio C Hamano 2018-09-05 18:48 ` Luc Van Oostenryck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).