* [PATCH 0/2] pack-write,repack: prevent opening packs too early @ 2021-09-01 2:05 Taylor Blau 2021-09-01 2:06 ` [PATCH 1/2] pack-write.c: rename `.idx` file into place last Taylor Blau ` (7 more replies) 0 siblings, 8 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-01 2:05 UTC (permalink / raw) To: git; +Cc: peff This pair of patches fixes a race where the .idx is moved into place before the .rev file, allowing the pack to be in a state where it appears a .rev file wasn't generated. This can cause Git to inadvertently take the slow "generate the reverse index on-the-fly", which does not impact correctness, but is unnecessarily slow when compared to reading the .rev file. The race is fixed by moving the .idx into place only after all other pack-related files have already been written. The first patch fixes the direct `pack-objects` case, and the second patch fixes `repack` (which also renames pack files around). Thanks in advance for your review. Taylor Blau (2): pack-write.c: rename `.idx` file into place last builtin/repack.c: move `.idx` files into place last builtin/repack.c | 2 +- pack-write.c | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-) -- 2.33.0.96.g73915697e6 ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 1/2] pack-write.c: rename `.idx` file into place last 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau @ 2021-09-01 2:06 ` Taylor Blau 2021-09-01 2:06 ` [PATCH 2/2] builtin/repack.c: move `.idx` files " Taylor Blau ` (6 subsequent siblings) 7 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-01 2:06 UTC (permalink / raw) To: git; +Cc: peff We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. But the `.rev` file is moved into place after the `.idx`, so it's possible for a process to open a pack in between the two (i.e., while the `.idx` file is in place but the `.rev` file is not) and mistakenly fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. While we're here, only call strbuf_setlen() if we actually modified the buffer by bringing it inside of the same if-statement that calls rename(). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- The diff is a little inscrutable here, since it shows the .idx hunk moving below the .rev hunk (instead of the .rev hunk moving above the .idx one), but the end-result is the same. pack-write.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/pack-write.c b/pack-write.c index f1fc3ecafa..277c60165e 100644 --- a/pack-write.c +++ b/pack-write.c @@ -490,18 +490,18 @@ void finish_tmp_packfile(struct strbuf *name_buffer, strbuf_setlen(name_buffer, basename_len); - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); - if (rev_tmp_name) { strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); if (rename(rev_tmp_name, name_buffer->buf)) die_errno("unable to rename temporary reverse-index file"); + + strbuf_setlen(name_buffer, basename_len); } + strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); + if (rename(idx_tmp_name, name_buffer->buf)) + die_errno("unable to rename temporary index file"); + strbuf_setlen(name_buffer, basename_len); free((void *)idx_tmp_name); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH 2/2] builtin/repack.c: move `.idx` files into place last 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau 2021-09-01 2:06 ` [PATCH 1/2] pack-write.c: rename `.idx` file into place last Taylor Blau @ 2021-09-01 2:06 ` Taylor Blau 2021-09-01 3:53 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Jeff King ` (5 subsequent siblings) 7 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-01 2:06 UTC (permalink / raw) To: git; +Cc: peff In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/repack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..c3e4771609 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,10 +208,10 @@ static struct { unsigned optional:1; } exts[] = { {".pack"}, - {".idx"}, {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, + {".idx"}, }; static unsigned populate_pack_exts(char *name) -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau 2021-09-01 2:06 ` [PATCH 1/2] pack-write.c: rename `.idx` file into place last Taylor Blau 2021-09-01 2:06 ` [PATCH 2/2] builtin/repack.c: move `.idx` files " Taylor Blau @ 2021-09-01 3:53 ` Jeff King 2021-09-01 4:29 ` Taylor Blau 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 7 siblings, 1 reply; 81+ messages in thread From: Jeff King @ 2021-09-01 3:53 UTC (permalink / raw) To: Taylor Blau; +Cc: git On Tue, Aug 31, 2021 at 10:05:46PM -0400, Taylor Blau wrote: > This pair of patches fixes a race where the .idx is moved into place > before the .rev file, allowing the pack to be in a state where it > appears a .rev file wasn't generated. > > This can cause Git to inadvertently take the slow "generate the > reverse index on-the-fly", which does not impact correctness, but is > unnecessarily slow when compared to reading the .rev file. > > The race is fixed by moving the .idx into place only after all other > pack-related files have already been written. The first patch fixes > the direct `pack-objects` case, and the second patch fixes `repack` > (which also renames pack files around). These both look good to me, but I think we're missing one more spot. The first patch covers git-pack-objects directly, like: git pack-objects .git/objects/pack/pack In practice, though, we never do that. On-disk repacks happen via git-repack, which always writes to temporary files. So I thought the case in git-pack-objects wouldn't matter, but it does look like prepare_packed_git() will actually read .tmp-$$-pack-1234abcd files, and so might load our temporary pack. Unexpected, but we are covered by your first patch. And then the second patch covers us as we move those temporary files into place. So far so good. But the other obvious way to get a pack idx is via index-pack (especially "--stdin"). It looks like we'd want the same re-ordering to happen in builtin/index-pack.c:final(), which is where we rename the temporary files into place. We _might_ also want to re-order the calls to write_idx_file() and write_rev_file() in its caller, given that simultaneous readers are happy to read our tmp_pack_* files. I guess the same might apply to the actual file write order pack-objects, too? I'm not sure if that's even possible, though; do we rely on side effects of generating the .idx when generating the other meta files? I think it might be more sensible if the reading side was taught to ignore ".tmp-*" and "tmp_*" (and possibly even ".*", though it's possible somebody is relying on that). -Peff ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 3:53 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Jeff King @ 2021-09-01 4:29 ` Taylor Blau 2021-09-01 4:59 ` Jeff King 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-01 4:29 UTC (permalink / raw) To: Jeff King; +Cc: Taylor Blau, git On Tue, Aug 31, 2021 at 11:53:35PM -0400, Jeff King wrote: > So far so good. But the other obvious way to get a pack idx is via > index-pack (especially "--stdin"). > > It looks like we'd want the same re-ordering to happen in > builtin/index-pack.c:final(), which is where we rename the temporary > files into place. Sure, that's easy enough. > We _might_ also want to re-order the calls to write_idx_file() and > write_rev_file() in its caller, given that simultaneous readers are > happy to read our tmp_pack_* files. I guess the same might apply to the > actual file write order pack-objects, too? I'm not sure if that's even > possible, though; do we rely on side effects of generating the .idx when > generating the other meta files? These are a little trickier. write_idx_file() is also responsible for rearranging the objects array into index (name) order on the way out, which write_rev_file() depends on in order to build up its mapping. So you could sort the array at the call-site before calling either function, but it gets awkward since there are a handful of other callers of write_idx_file() besides the two we're discussing. It could be done, but it feels like the wrong approach, because... > I think it might be more sensible if the reading side was taught to > ignore ".tmp-*" and "tmp_*" (and possibly even ".*", though it's > possible somebody is relying on that). ...this seems like the much-better way to go. Git shouldn't have to worry about what order it writes the temporary files in, only what order those temporary files are made non-temporary. But I need to do some more investigation to make sure there aren't any other implications. So I'm happy to wait on that, or send a new version of this series with a patch to fix the race in builtin/index-pack.c:final(), too. (Unrelated to your suggestions above) another consideration for "stuff to do later" would be to flip the default of pack.writeReverseIndex. I had intentions to do that in the 2.32 cycle, but I forgot about it. > -Peff Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 4:29 ` Taylor Blau @ 2021-09-01 4:59 ` Jeff King 2021-09-01 5:12 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Jeff King @ 2021-09-01 4:59 UTC (permalink / raw) To: Taylor Blau; +Cc: git On Wed, Sep 01, 2021 at 12:29:54AM -0400, Taylor Blau wrote: > > We _might_ also want to re-order the calls to write_idx_file() and > > write_rev_file() in its caller, given that simultaneous readers are > > happy to read our tmp_pack_* files. I guess the same might apply to the > > actual file write order pack-objects, too? I'm not sure if that's even > > possible, though; do we rely on side effects of generating the .idx when > > generating the other meta files? > > These are a little trickier. write_idx_file() is also responsible for > rearranging the objects array into index (name) order on the way out, > which write_rev_file() depends on in order to build up its mapping. > > So you could sort the array at the call-site before calling either > function, but it gets awkward since there are a handful of other callers > of write_idx_file() besides the two we're discussing. Yeah, I had in the back of my mind that there was some dependency there. I definitely prefer the "readers should not pick up tmp-packs" approach if that is workable. > > I think it might be more sensible if the reading side was taught to > > ignore ".tmp-*" and "tmp_*" (and possibly even ".*", though it's > > possible somebody is relying on that). > > ...this seems like the much-better way to go. Git shouldn't have to > worry about what order it writes the temporary files in, only what order > those temporary files are made non-temporary. > > But I need to do some more investigation to make sure there aren't any > other implications. So I'm happy to wait on that, or send a new version > of this series with a patch to fix the race in > builtin/index-pack.c:final(), too. I think if we kept it restricted to ".tmp-*" and "tmp_*", it should be pretty safe. The absolute worst case is that somebody trying to recover a corrupted repository might have to rename the files manually, I would think. Blocking ".*" is a harder sell. If we were starting from scratch, I'd probably do that, but now we don't know what weird things people might be doing. So unless there's a huge gain, it's hard to justify. (If we were starting from scratch, I'd actually probably insist they be named pack-$checksum.pack, etc, but it's much too late for that now). So anyway. I think we definitely want the index-pack.c change. We _could_ stop there and change the read side as a separate series, but I think that until we do, the ordering changes on the write side aren't really doing that much. They're shrinking the race a little, I guess, but it's still very much there. > (Unrelated to your suggestions above) another consideration for "stuff > to do later" would be to flip the default of pack.writeReverseIndex. I > had intentions to do that in the 2.32 cycle, but I forgot about it. Oh, yeah. We should definitely do that (in its own series). The .rev files have been a huge performance win, and I don't think there's any reason we wouldn't want to always use them. -Peff ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 4:59 ` Jeff King @ 2021-09-01 5:12 ` Taylor Blau 2021-09-01 6:08 ` Jeff King 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-01 5:12 UTC (permalink / raw) To: Jeff King; +Cc: Taylor Blau, git On Wed, Sep 01, 2021 at 12:59:23AM -0400, Jeff King wrote: > So anyway. I think we definitely want the index-pack.c change. We > _could_ stop there and change the read side as a separate series, but I > think that until we do, the ordering changes on the write side aren't > really doing that much. They're shrinking the race a little, I guess, > but it's still very much there. Yeah, now that I've had a chance to look at it it seems pretty straightforward. Probably limited to checks in prepare_pack(), and add_pack_to_midx(), which are the lone two callbacks of for_each_file_in_pack_dir(). I'll make sure that I'm not missing anything and then send a reroll tomorrow. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 5:12 ` Taylor Blau @ 2021-09-01 6:08 ` Jeff King 2021-09-01 21:40 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Jeff King @ 2021-09-01 6:08 UTC (permalink / raw) To: Taylor Blau; +Cc: git On Wed, Sep 01, 2021 at 01:12:41AM -0400, Taylor Blau wrote: > On Wed, Sep 01, 2021 at 12:59:23AM -0400, Jeff King wrote: > > So anyway. I think we definitely want the index-pack.c change. We > > _could_ stop there and change the read side as a separate series, but I > > think that until we do, the ordering changes on the write side aren't > > really doing that much. They're shrinking the race a little, I guess, > > but it's still very much there. > > Yeah, now that I've had a chance to look at it it seems pretty > straightforward. Probably limited to checks in prepare_pack(), and > add_pack_to_midx(), which are the lone two callbacks of > for_each_file_in_pack_dir(). Yes, that matches what I'd expect. > I'll make sure that I'm not missing anything and then send a reroll > tomorrow. Great, thanks! -Peff ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 6:08 ` Jeff King @ 2021-09-01 21:40 ` Taylor Blau 2021-09-07 16:07 ` Jeff King 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-01 21:40 UTC (permalink / raw) To: Jeff King; +Cc: Taylor Blau, git On Wed, Sep 01, 2021 at 02:08:48AM -0400, Jeff King wrote: > On Wed, Sep 01, 2021 at 01:12:41AM -0400, Taylor Blau wrote: > > > On Wed, Sep 01, 2021 at 12:59:23AM -0400, Jeff King wrote: > > > So anyway. I think we definitely want the index-pack.c change. We > > > _could_ stop there and change the read side as a separate series, but I > > > think that until we do, the ordering changes on the write side aren't > > > really doing that much. They're shrinking the race a little, I guess, > > > but it's still very much there. > > > > Yeah, now that I've had a chance to look at it it seems pretty > > straightforward. Probably limited to checks in prepare_pack(), and > > add_pack_to_midx(), which are the lone two callbacks of > > for_each_file_in_pack_dir(). > > Yes, that matches what I'd expect. Hmm. As I was wondering about about, this is more complicated than meets the eye. Consider t5616.36, which tests that repacking does not loosen promisor objects. In builtin/repack.c:repack_promisor_object(), the repack builtin tells pack-objects about the pack that it just wrote with `--keep-pack` (and we rely on that working in order to not loosen all of the objects that we just wrote). Except when we iterate through `get_all_packs()`, we don't see the keep pack yet, because it is still prefixed with .tmp. So, this does get kind of tricky. There are some internal callers that do want to know about .tmp packs and a whole host of other callers that don't or shouldn't. Maybe that should point us towards "we should be more careful about the order we write packs in, even temporary ones". But alternatively, perhaps it should point us to "this must not actually be happening very often, otherwise we would have heard more complaints." So, ultimately I'm not sure of the best way forward. I'll have to think on it some more... Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 21:40 ` Taylor Blau @ 2021-09-07 16:07 ` Jeff King 0 siblings, 0 replies; 81+ messages in thread From: Jeff King @ 2021-09-07 16:07 UTC (permalink / raw) To: Taylor Blau; +Cc: git On Wed, Sep 01, 2021 at 05:40:22PM -0400, Taylor Blau wrote: > Hmm. As I was wondering about about, this is more complicated than meets > the eye. Consider t5616.36, which tests that repacking does not loosen > promisor objects. > > In builtin/repack.c:repack_promisor_object(), the repack builtin tells > pack-objects about the pack that it just wrote with `--keep-pack` (and > we rely on that working in order to not loosen all of the objects that > we just wrote). > > Except when we iterate through `get_all_packs()`, we don't see the keep > pack yet, because it is still prefixed with .tmp. > > So, this does get kind of tricky. There are some internal callers that > do want to know about .tmp packs and a whole host of other callers that > don't or shouldn't. Maybe that should point us towards "we should be > more careful about the order we write packs in, even temporary ones". So I happened to be looking at some packing stuff again today, and I realized the situation is much less dire than I made it out to be. On the reading side, we _won't_ look at tmp_pack_*, nor tmp_idx_*, because they don't end in '.pack' or '.idx'. So really, the only confusing case is the ".tmp-$$-etc.idx" that is generated by pack-objects (because it thinks of that as the final name to use itself), and then later rearranged by git-repack. So I think we _are_ OK as long as the correct order is observed during the rename-into-place steps. I.e., your original patch (plus the extra post in index-pack) would mean that we're fully covered. The .tmp-*.idx ones would appear in the correct order due to pack-objects, and the final pack-*.idx ones due to git-repack. -Peff ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau ` (2 preceding siblings ...) 2021-09-01 3:53 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Jeff King @ 2021-09-07 19:42 ` Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction Ævar Arnfjörð Bjarmason ` (3 more replies) 2021-09-07 19:48 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 7 siblings, 4 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason I came up with this on top of Taylor's series which fixes the order in which we write files associated with pack files[1]. His series fixes a race where we write *.idx before *.rev, but left the issue of writing *.bitmap after *.idx, this series fixes that. Now we'll really write the *.idx last. There's still a race in these combined serieses, but now it's to do with our general lack of fsync(), i.e. there's no guarantee that just because we write files A, B and C in that order that without doing the proper fsync() dancing on the files and the fd for the directory that other processes will see it that way. In practice though there is, as few/no OS's are so pedantic as to expose inconsistent ordering of written files in a directory, even though POSIX et al leave that possibily open. 1. https://lore.kernel.org/git/cover.1630461918.git.me@ttaylorr.com/ Ævar Arnfjörð Bjarmason (3): pack-write: use more idiomatic strbuf usage for packname construction pack-write: split up finish_tmp_packfile() function pack-write: rename *.idx file into place last (really!) builtin/pack-objects.c | 33 +++++++++++++++++------ builtin/repack.c | 8 ++++++ pack-write.c | 60 ++++++++++++++++++++++++++++-------------- pack.h | 16 ++++++++++- 4 files changed, 88 insertions(+), 29 deletions(-) -- 2.33.0.818.gd2ef2916285 ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 ` Ævar Arnfjörð Bjarmason 2021-09-07 22:21 ` Taylor Blau 2021-09-07 19:42 ` [PATCH 2/3] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 3 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Change code added in 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03) to do strbuf_reset() instead of noting the length of the base template, and doing a strbuf_setlen() to reset it, also change the spacing in the finish_tmp_packfile() so that each setup of the template, rename, and strbuf_reset() is grouped together. Since the prototype of the previous "name_buffer" now has a "const" use this chance to wrap the overly long definition of the finish_tmp_packfile() function. This doesn't really matter for now, but as we'll see makes the subsequent change much easier, as we won't need to juggle the basename template v.s. its current contents anymore when writing bitmaps. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 17 +++++++++++------ pack-write.c | 27 ++++++++++++--------------- pack.h | 7 ++++++- 3 files changed, 29 insertions(+), 22 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b96..717003563db 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1216,7 +1216,7 @@ static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; - struct strbuf tmpname = STRBUF_INIT; + struct strbuf tmp_basename = STRBUF_INIT; /* * Packs are runtime accessed in their mtime @@ -1237,7 +1237,7 @@ static void write_pack_file(void) warning_errno(_("failed utime() on %s"), pack_tmp_name); } - strbuf_addf(&tmpname, "%s-", base_name); + strbuf_addf(&tmp_basename, "%s-", base_name); if (write_bitmap_index) { bitmap_writer_set_checksum(hash); @@ -1245,12 +1245,16 @@ static void write_pack_file(void) &to_pack, written_list, nr_written); } - finish_tmp_packfile(&tmpname, pack_tmp_name, + finish_tmp_packfile(&tmp_basename, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); + struct strbuf sb = STRBUF_INIT; + + strbuf_addf(&sb, "%s%s.bitmap", + tmp_basename.buf, + hash_to_hex(hash)); stop_progress(&progress_state); @@ -1258,11 +1262,12 @@ static void write_pack_file(void) bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); bitmap_writer_build(&to_pack); bitmap_writer_finish(written_list, nr_written, - tmpname.buf, write_bitmap_options); + sb.buf, write_bitmap_options); write_bitmap_index = 0; + strbuf_release(&sb); } - strbuf_release(&tmpname); + strbuf_release(&tmp_basename); free(pack_tmp_name); puts(hash_to_hex(hash)); } diff --git a/pack-write.c b/pack-write.c index 277c60165e8..57b9fc11423 100644 --- a/pack-write.c +++ b/pack-write.c @@ -462,15 +462,15 @@ struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } -void finish_tmp_packfile(struct strbuf *name_buffer, +void finish_tmp_packfile(const struct strbuf *tmp_basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char hash[]) { + struct strbuf sb = STRBUF_INIT; const char *idx_tmp_name, *rev_tmp_name = NULL; - int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -483,26 +483,23 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); - strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); - - if (rename(pack_tmp_name, name_buffer->buf)) + strbuf_addf(&sb, "%s%s.pack", tmp_basename->buf, hash_to_hex(hash)); + if (rename(pack_tmp_name, sb.buf)) die_errno("unable to rename temporary pack file"); - - strbuf_setlen(name_buffer, basename_len); + strbuf_reset(&sb); if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) + strbuf_addf(&sb, "%s%s.rev", tmp_basename->buf, + hash_to_hex(hash)); + if (rename(rev_tmp_name, sb.buf)) die_errno("unable to rename temporary reverse-index file"); - - strbuf_setlen(name_buffer, basename_len); + strbuf_reset(&sb); } - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) + strbuf_addf(&sb, "%s%s.idx", tmp_basename->buf, hash_to_hex(hash)); + if (rename(idx_tmp_name, sb.buf)) die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); + strbuf_reset(&sb); free((void *)idx_tmp_name); } diff --git a/pack.h b/pack.h index fa139545262..ae0c9e04cd9 100644 --- a/pack.h +++ b/pack.h @@ -110,6 +110,11 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +void finish_tmp_packfile(const struct strbuf *name_buffer, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char sha1[]); #endif -- 2.33.0.818.gd2ef2916285 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction 2021-09-07 19:42 ` [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction Ævar Arnfjörð Bjarmason @ 2021-09-07 22:21 ` Taylor Blau 2021-09-07 23:22 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-07 22:21 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King On Tue, Sep 07, 2021 at 09:42:36PM +0200, Ævar Arnfjörð Bjarmason wrote: > Change code added in 5889271114a (finish_tmp_packfile():use strbuf for s/Change code/Code/ ? (I wondered also if the missing space in 5889271114a's subject line was intentional, but it does appear in the original commit.) Reading this patch, I'm not sure I agree that this makes the later changes any easier. To be honest, replacing things like > if (rev_tmp_name) { > - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); > - if (rename(rev_tmp_name, name_buffer->buf)) > + strbuf_addf(&sb, "%s%s.rev", tmp_basename->buf, > + hash_to_hex(hash)); > + if (rename(rev_tmp_name, sb.buf)) > die_errno("unable to rename temporary reverse-index file"); > - > - strbuf_setlen(name_buffer, basename_len); > + strbuf_reset(&sb); Does not much help or hurt the readability, at least in my opinion. One advantage of the pre-image is that we're doing less copying, but that's probably splitting hairs at this point. So, I would probably be just as happy without this patch. You mentioned that it makes the later changes easier, but I couldn't come up with why. I may be missing something, in which case it may be helpful to know what that is and how it makes this change necessary. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction 2021-09-07 22:21 ` Taylor Blau @ 2021-09-07 23:22 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 23:22 UTC (permalink / raw) To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King On Tue, Sep 07 2021, Taylor Blau wrote: > On Tue, Sep 07, 2021 at 09:42:36PM +0200, Ævar Arnfjörð Bjarmason wrote: >> Change code added in 5889271114a (finish_tmp_packfile():use strbuf for > > s/Change code/Code/ ? That would make it: Change code added in X (...) to do strbuf_reset() instead... Instead of: Code added in X (...) to do strbuf_reset() instead... > (I wondered also if the missing space in 5889271114a's subject line was > intentional, but it does appear in the original commit.) *Nod*, it was automatically generated. > Reading this patch, I'm not sure I agree that this makes the later > changes any easier. To be honest, replacing things like > >> if (rev_tmp_name) { >> - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); >> - if (rename(rev_tmp_name, name_buffer->buf)) >> + strbuf_addf(&sb, "%s%s.rev", tmp_basename->buf, >> + hash_to_hex(hash)); >> + if (rename(rev_tmp_name, sb.buf)) >> die_errno("unable to rename temporary reverse-index file"); >> - >> - strbuf_setlen(name_buffer, basename_len); >> + strbuf_reset(&sb); > > Does not much help or hurt the readability, at least in my opinion. One > advantage of the pre-image is that we're doing less copying, but that's > probably splitting hairs at this point. > So, I would probably be just as happy without this patch. You mentioned > that it makes the later changes easier, but I couldn't come up with why. > I may be missing something, in which case it may be helpful to know what > that is and how it makes this change necessary. It's not that continually appending/trimming the strbuf is per-se less readable than having a "prefix" and copying/appending to it, and as you point out this moves us towards more allocations, but in this case that's not the bottleneck. It's that if I retain the current pattern while splitting up these functions I'd need to pass a "basename_len" owned by the caller between the two, and we'd end up juggling the "tmpname" as the "if (write_bitmap_index)" codepath is moved around. So just having each site get the prefix and add its own suffix seemed much easier to deal with,. ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 2/3] pack-write: split up finish_tmp_packfile() function 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 ` Ævar Arnfjörð Bjarmason 2021-09-07 22:28 ` Taylor Blau 2021-09-07 19:42 ` [PATCH 3/3] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 2021-09-07 22:36 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Taylor Blau 3 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c. This change should not change any functionality, but sets up code flow for a bug fix where we'll be able to move the *.idx in-place after the *.bitmap is written. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 9 ++++++--- pack-write.c | 39 +++++++++++++++++++++++++++++++-------- pack.h | 9 +++++++++ 3 files changed, 46 insertions(+), 11 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 717003563db..9fa4c6a9be8 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1217,6 +1217,7 @@ static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; struct strbuf tmp_basename = STRBUF_INIT; + char *idx_tmp_name = NULL; /* * Packs are runtime accessed in their mtime @@ -1245,9 +1246,10 @@ static void write_pack_file(void) &to_pack, written_list, nr_written); } - finish_tmp_packfile(&tmp_basename, pack_tmp_name, - written_list, nr_written, - &pack_idx_opts, hash); + stage_tmp_packfile(&tmp_basename, pack_tmp_name, + written_list, nr_written, + &pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ -1267,6 +1269,7 @@ static void write_pack_file(void) strbuf_release(&sb); } + free(idx_tmp_name); strbuf_release(&tmp_basename); free(pack_tmp_name); puts(hash_to_hex(hash)); diff --git a/pack-write.c b/pack-write.c index 57b9fc11423..ba5c4767dc8 100644 --- a/pack-write.c +++ b/pack-write.c @@ -468,16 +468,33 @@ void finish_tmp_packfile(const struct strbuf *tmp_basename, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char hash[]) +{ + char *idx_tmp_name = NULL; + + stage_tmp_packfile(tmp_basename, pack_tmp_name, written_list, + nr_written, pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(tmp_basename, hash, &idx_tmp_name); + + free(idx_tmp_name); +} + +void stage_tmp_packfile(const struct strbuf *tmp_basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char hash[], + char **idx_tmp_name) { struct strbuf sb = STRBUF_INIT; - const char *idx_tmp_name, *rev_tmp_name = NULL; + const char *rev_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); - idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); - if (adjust_shared_perm(idx_tmp_name)) + *idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written, + pack_idx_opts, hash); + if (adjust_shared_perm(*idx_tmp_name)) die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, @@ -496,12 +513,18 @@ void finish_tmp_packfile(const struct strbuf *tmp_basename, strbuf_reset(&sb); } + strbuf_release(&sb); +} + +void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, + unsigned char hash[], char **idx_tmp_name) +{ + struct strbuf sb = STRBUF_INIT; + strbuf_addf(&sb, "%s%s.idx", tmp_basename->buf, hash_to_hex(hash)); - if (rename(idx_tmp_name, sb.buf)) + if (rename(*idx_tmp_name, sb.buf)) die_errno("unable to rename temporary index file"); - strbuf_reset(&sb); - - free((void *)idx_tmp_name); + strbuf_release(&sb); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index ae0c9e04cd9..6c1508c3de7 100644 --- a/pack.h +++ b/pack.h @@ -110,6 +110,15 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); +void stage_tmp_packfile(const struct strbuf *tmp_basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char hash[], + char **idx_tmp_name); +void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, + unsigned char hash[], char **idx_tmp_name); void finish_tmp_packfile(const struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, -- 2.33.0.818.gd2ef2916285 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 2/3] pack-write: split up finish_tmp_packfile() function 2021-09-07 19:42 ` [PATCH 2/3] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason @ 2021-09-07 22:28 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-07 22:28 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King On Tue, Sep 07, 2021 at 09:42:37PM +0200, Ævar Arnfjörð Bjarmason wrote: > Split up the finish_tmp_packfile() function and use the split-up > version in pack-objects.c. This change should not change any > functionality, but sets up code flow for a bug fix where we'll be able > to move the *.idx in-place after the *.bitmap is written. Seems like a logical step to make, and all makes sense to me. One thought for future clean-up... > diff --git a/pack-write.c b/pack-write.c > index 57b9fc11423..ba5c4767dc8 100644 > --- a/pack-write.c > +++ b/pack-write.c > @@ -468,16 +468,33 @@ void finish_tmp_packfile(const struct strbuf *tmp_basename, > uint32_t nr_written, > struct pack_idx_option *pack_idx_opts, > unsigned char hash[]) > +{ > + char *idx_tmp_name = NULL; > + > + stage_tmp_packfile(tmp_basename, pack_tmp_name, written_list, > + nr_written, pack_idx_opts, hash, &idx_tmp_name); > + rename_tmp_packfile_idx(tmp_basename, hash, &idx_tmp_name); > + > + free(idx_tmp_name); > +} > + I was surprised that you did not replace the caller in write_pack_file with this (and instead open-coded the implementation). But that makes sense, since the very next commit is going to move the bitmap into place between staging and renaming the .idx. The only other caller of finish_tmp_packfile is in the bulk-checkin code. So I might suggest that we get rid of this function altogether and instead call stage_tmp_packfile() and rename_tmp_packfile_idx() directly in bulk-checkin.c:finish_bulk_checkin(). That would allow us to slim down the header, but more importantly would make it clear that something like finish_tmp_packfile() shouldn't be used blindly in case there is other work to be down between staging and remaing into place. > - idx_tmp_name = write_idx_file(NULL, written_list, nr_written, > - pack_idx_opts, hash); > - if (adjust_shared_perm(idx_tmp_name)) > + *idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written, > + pack_idx_opts, hash); Yuck (and unavoidable!) Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 3/3] pack-write: rename *.idx file into place last (really!) 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 2/3] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 ` Ævar Arnfjörð Bjarmason 2021-09-07 22:31 ` Taylor Blau 2021-09-07 22:36 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Taylor Blau 3 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Follow-up a preceding commit (pack-write.c: rename `.idx` file into place last, 2021-08-16)[1] and rename the *.idx file in-place after we write the *.bitmap. The preceding commit fixed the issue of *.idx being written before *.rev files, but did not do so for *.idx files. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. While we're at it let's add cross-commentary to both builtin/repack.c and builtin/pack-objects.c to point out the two places where we write out these sets of files in sequence. 1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 11 ++++++++++- builtin/repack.c | 8 ++++++++ 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 9fa4c6a9be8..fad3fe5651d 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1249,7 +1249,6 @@ static void write_pack_file(void) stage_tmp_packfile(&tmp_basename, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ -1269,6 +1268,16 @@ static void write_pack_file(void) strbuf_release(&sb); } + /* + * We must write the *.idx last, so that anything that expects + * an accompanying *.rev, *.bitmap etc. can count on it being + * present. + * + * See also corresponding logic in the "exts" + * struct in builtin/repack.c + */ + rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmp_basename); free(pack_tmp_name); diff --git a/builtin/repack.c b/builtin/repack.c index c3e47716093..dd438ebb3ee 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -211,6 +211,14 @@ static struct { {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, + /* + * We must write the *.idx last, so that anything that expects + * an accompanying *.rev, *.bitmap etc. can count on it being + * present. + * + * See also corresponding logic in write_pack_file() in + * builtin/pack-objects.c + */ {".idx"}, }; -- 2.33.0.818.gd2ef2916285 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 3/3] pack-write: rename *.idx file into place last (really!) 2021-09-07 19:42 ` [PATCH 3/3] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason @ 2021-09-07 22:31 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-07 22:31 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King On Tue, Sep 07, 2021 at 09:42:38PM +0200, Ævar Arnfjörð Bjarmason wrote: > + /* > + * We must write the *.idx last, so that anything that expects > + * an accompanying *.rev, *.bitmap etc. can count on it being > + * present. > + * > + * See also corresponding logic in the "exts" > + * struct in builtin/repack.c > + */ TBH, I'm not sure the cycle between comments helps much, and it would probably suffice to say: /* * write the .idx last, so everything else can count on its existence */ In both places. But I don't think that it makes much of a difference, really, so either outcome is fine with me. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-09-07 19:42 ` [PATCH 3/3] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason @ 2021-09-07 22:36 ` Taylor Blau 3 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-07 22:36 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Jeff King On Tue, Sep 07, 2021 at 09:42:35PM +0200, Ævar Arnfjörð Bjarmason wrote: > I came up with this on top of Taylor's series which fixes the order in > which we write files associated with pack files[1]. His series fixes a > race where we write *.idx before *.rev, but left the issue of writing > *.bitmap after *.idx, this series fixes that. Now we'll really write > the *.idx last. Nice catch. And fixing this race (by moving the .bitmap into place before the .idx) doesn't create another race, because open_pack_bitmap_1() returns early when the call to open_pack_index() fails. (And furthermore, open_pack_bitmap_1() is only called on the list of packs generated by calling prepare_pack() over each file in objects/pack, which only adds a pack to the list if its .idx file exists, too). I reviewed this series and left a few notes which I would like to see addressed/responded to before queuing, but this seems to be generally of the right approach to me. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau ` (3 preceding siblings ...) 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason @ 2021-09-07 19:48 ` Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 7 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-07 19:48 UTC (permalink / raw) To: Taylor Blau; +Cc: git, peff On Tue, Aug 31 2021, Taylor Blau wrote: > This pair of patches fixes a race where the .idx is moved into place > before the .rev file, allowing the pack to be in a state where it > appears a .rev file wasn't generated. > > This can cause Git to inadvertently take the slow "generate the > reverse index on-the-fly", which does not impact correctness, but is > unnecessarily slow when compared to reading the .rev file. > > The race is fixed by moving the .idx into place only after all other > pack-related files have already been written. The first patch fixes > the direct `pack-objects` case, and the second patch fixes `repack` > (which also renames pack files around). > > Thanks in advance for your review. > > Taylor Blau (2): > pack-write.c: rename `.idx` file into place last > builtin/repack.c: move `.idx` files into place last > > builtin/repack.c | 2 +- > pack-write.c | 12 ++++++------ > 2 files changed, 7 insertions(+), 7 deletions(-) I've reviewed this to the point of coming up with my own series on top of it, but it can be considered separately: https://lore.kernel.org/git/cover-0.3-00000000000-20210907T193600Z-avarab@gmail.com/ The only change (if any) would be to perhaps update the commit message(s) to note that the *.bitmap case is left, and that the promise of "closing the race" is still subject to the vagaries of our non-use of fsync() here, so pedantically speaking the race is still there. But even without those suggested changes: Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau ` (4 preceding siblings ...) 2021-09-07 19:48 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 ` Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 1/4] pack.h: line-wrap the definition of finish_tmp_packfile() Ævar Arnfjörð Bjarmason ` (3 more replies) 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau 7 siblings, 4 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason I came up with this on top of Taylor's series which fixes the order in which we write files associated with pack files[1]. His series fixes a race where we write *.idx before *.rev, but left the issue of writing *.bitmap after *.idx, this series fixes that. Now we'll really write the *.idx last. This v2 attempts to fix the comments Taylor had on v1, in particular I think the borderline readability improvements in v1 are now pretty unambiguously more readable in 2/4 of this series. I also converted the bulk-checkin.c caller, split up the wrapping change in pack.h into its own commit etc. 1. https://lore.kernel.org/git/cover.1630461918.git.me@ttaylorr.com/ [1] Ævar Arnfjörð Bjarmason (4): pack.h: line-wrap the definition of finish_tmp_packfile() pack-write: refactor renaming in finish_tmp_packfile() pack-write: split up finish_tmp_packfile() function pack-write: rename *.idx file into place last (really!) builtin/pack-objects.c | 16 +++++++++--- bulk-checkin.c | 16 ++++++++++++ pack-write.c | 59 +++++++++++++++++++++--------------------- pack.h | 10 ++++++- 4 files changed, 67 insertions(+), 34 deletions(-) Range-diff against v1: -: ---------- > 1: 29f5787651 pack.h: line-wrap the definition of finish_tmp_packfile() 1: 0e6ef07ce0 ! 2: 7b39f4599b pack-write: use more idiomatic strbuf usage for packname construction @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - pack-write: use more idiomatic strbuf usage for packname construction + pack-write: refactor renaming in finish_tmp_packfile() - Change code added in 5889271114a (finish_tmp_packfile():use strbuf for - pathname construction, 2014-03-03) to do strbuf_reset() instead of - noting the length of the base template, and doing a strbuf_setlen() to - reset it, also change the spacing in the finish_tmp_packfile() so that - each setup of the template, rename, and strbuf_reset() is grouped - together. + Refactor the renaming in finish_tmp_packfile() so that it takes a + "const struct strbuf *" instead of a non-const, and refactor the + repetitive renaming pattern in finish_tmp_packfile() to use a new + static rename_tmp_packfile() helper function. - Since the prototype of the previous "name_buffer" now has a "const" - use this chance to wrap the overly long definition of the - finish_tmp_packfile() function. + The previous strbuf_reset() idiom originated with + 5889271114a (finish_tmp_packfile():use strbuf for pathname + construction, 2014-03-03), which in turn was a minimal adjustment of + pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper + function, 2011-10-28). - This doesn't really matter for now, but as we'll see makes the - subsequent change much easier, as we won't need to juggle the basename - template v.s. its current contents anymore when writing bitmaps. + Since the memory allocation is not a bottleneck here we can afford a + bit more readability at the cost of re-allocating this new "struct + strbuf sb". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## builtin/pack-objects.c ## @@ builtin/pack-objects.c: static void write_pack_file(void) - - if (!pack_to_stdout) { - struct stat st; -- struct strbuf tmpname = STRBUF_INIT; -+ struct strbuf tmp_basename = STRBUF_INIT; - - /* - * Packs are runtime accessed in their mtime -@@ builtin/pack-objects.c: static void write_pack_file(void) - warning_errno(_("failed utime() on %s"), pack_tmp_name); - } - -- strbuf_addf(&tmpname, "%s-", base_name); -+ strbuf_addf(&tmp_basename, "%s-", base_name); - - if (write_bitmap_index) { - bitmap_writer_set_checksum(hash); -@@ builtin/pack-objects.c: static void write_pack_file(void) - &to_pack, written_list, nr_written); - } - -- finish_tmp_packfile(&tmpname, pack_tmp_name, -+ finish_tmp_packfile(&tmp_basename, pack_tmp_name, - written_list, nr_written, &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); + struct strbuf sb = STRBUF_INIT; + -+ strbuf_addf(&sb, "%s%s.bitmap", -+ tmp_basename.buf, ++ strbuf_addf(&sb, "%s%s.bitmap", tmpname.buf, + hash_to_hex(hash)); stop_progress(&progress_state); @@ builtin/pack-objects.c: static void write_pack_file(void) + strbuf_release(&sb); } -- strbuf_release(&tmpname); -+ strbuf_release(&tmp_basename); - free(pack_tmp_name); - puts(hash_to_hex(hash)); - } + strbuf_release(&tmpname); ## pack-write.c ## @@ pack-write.c: struct hashfile *create_tmp_packfile(char **pack_tmp_name) @@ pack-write.c: struct hashfile *create_tmp_packfile(char **pack_tmp_name) } -void finish_tmp_packfile(struct strbuf *name_buffer, -+void finish_tmp_packfile(const struct strbuf *tmp_basename, ++static void rename_tmp_packfile(const char *source, ++ const struct strbuf *basename, ++ const unsigned char hash[], ++ const char *ext) ++{ ++ struct strbuf sb = STRBUF_INIT; ++ ++ strbuf_addf(&sb, "%s%s.%s", basename->buf, hash_to_hex(hash), ext); ++ if (rename(source, sb.buf)) ++ die_errno("unable to rename temporary '*.%s' file to '%s'", ++ ext, sb.buf); ++ strbuf_release(&sb); ++} ++ ++void finish_tmp_packfile(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, - struct pack_idx_option *pack_idx_opts, +@@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, unsigned char hash[]) { -+ struct strbuf sb = STRBUF_INIT; const char *idx_tmp_name, *rev_tmp_name = NULL; - int basename_len = name_buffer->len; @@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, - strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); - - if (rename(pack_tmp_name, name_buffer->buf)) -+ strbuf_addf(&sb, "%s%s.pack", tmp_basename->buf, hash_to_hex(hash)); -+ if (rename(pack_tmp_name, sb.buf)) - die_errno("unable to rename temporary pack file"); +- die_errno("unable to rename temporary pack file"); - - strbuf_setlen(name_buffer, basename_len); -+ strbuf_reset(&sb); - - if (rev_tmp_name) { +- +- if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) -+ strbuf_addf(&sb, "%s%s.rev", tmp_basename->buf, -+ hash_to_hex(hash)); -+ if (rename(rev_tmp_name, sb.buf)) - die_errno("unable to rename temporary reverse-index file"); +- die_errno("unable to rename temporary reverse-index file"); - - strbuf_setlen(name_buffer, basename_len); -+ strbuf_reset(&sb); - } - +- } +- - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) -+ strbuf_addf(&sb, "%s%s.idx", tmp_basename->buf, hash_to_hex(hash)); -+ if (rename(idx_tmp_name, sb.buf)) - die_errno("unable to rename temporary index file"); +- die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); -+ strbuf_reset(&sb); ++ rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); ++ if (rev_tmp_name) ++ rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); ++ rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); free((void *)idx_tmp_name); } @@ pack.h: int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); --void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); -+void finish_tmp_packfile(const struct strbuf *name_buffer, -+ const char *pack_tmp_name, -+ struct pack_idx_entry **written_list, -+ uint32_t nr_written, -+ struct pack_idx_option *pack_idx_opts, -+ unsigned char sha1[]); - - #endif +-void finish_tmp_packfile(struct strbuf *name_buffer, ++void finish_tmp_packfile(const struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, 2: 42f83774fe ! 3: 1205f9d0c2 pack-write: split up finish_tmp_packfile() function @@ Commit message pack-write: split up finish_tmp_packfile() function Split up the finish_tmp_packfile() function and use the split-up - version in pack-objects.c. This change should not change any - functionality, but sets up code flow for a bug fix where we'll be able - to move the *.idx in-place after the *.bitmap is written. + version in pack-objects.c in preparation for moving the step of + renaming the *.idx file later as part of a function change. + + Since the only other caller of finish_tmp_packfile() was in + bulk-checkin.c, and it won't be needing a change to its *.idx + renaming, provide a thin wrapper for the old function as a static + function in that file. If other callers end up needing the simpler + version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ builtin/pack-objects.c @@ builtin/pack-objects.c: static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; - struct strbuf tmp_basename = STRBUF_INIT; + struct strbuf tmpname = STRBUF_INIT; + char *idx_tmp_name = NULL; /* @@ builtin/pack-objects.c: static void write_pack_file(void) &to_pack, written_list, nr_written); } -- finish_tmp_packfile(&tmp_basename, pack_tmp_name, -- written_list, nr_written, +- finish_tmp_packfile(&tmpname, pack_tmp_name, ++ stage_tmp_packfiles(&tmpname, pack_tmp_name, + written_list, nr_written, - &pack_idx_opts, hash); -+ stage_tmp_packfile(&tmp_basename, pack_tmp_name, -+ written_list, nr_written, -+ &pack_idx_opts, hash, &idx_tmp_name); -+ rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); ++ &pack_idx_opts, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ builtin/pack-objects.c: static void write_pack_file(void) } + free(idx_tmp_name); - strbuf_release(&tmp_basename); + strbuf_release(&tmpname); free(pack_tmp_name); puts(hash_to_hex(hash)); - ## pack-write.c ## -@@ pack-write.c: void finish_tmp_packfile(const struct strbuf *tmp_basename, - uint32_t nr_written, - struct pack_idx_option *pack_idx_opts, - unsigned char hash[]) + ## bulk-checkin.c ## +@@ bulk-checkin.c: static struct bulk_checkin_state { + uint32_t nr_written; + } state; + ++static void finish_tmp_packfile(const struct strbuf *basename, ++ const char *pack_tmp_name, ++ struct pack_idx_entry **written_list, ++ uint32_t nr_written, ++ struct pack_idx_option *pack_idx_opts, ++ unsigned char hash[]) +{ + char *idx_tmp_name = NULL; + -+ stage_tmp_packfile(tmp_basename, pack_tmp_name, written_list, -+ nr_written, pack_idx_opts, hash, &idx_tmp_name); -+ rename_tmp_packfile_idx(tmp_basename, hash, &idx_tmp_name); ++ stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written, ++ pack_idx_opts, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(basename, hash, &idx_tmp_name); + + free(idx_tmp_name); +} + -+void stage_tmp_packfile(const struct strbuf *tmp_basename, -+ const char *pack_tmp_name, -+ struct pack_idx_entry **written_list, -+ uint32_t nr_written, -+ struct pack_idx_option *pack_idx_opts, -+ unsigned char hash[], -+ char **idx_tmp_name) + static void finish_bulk_checkin(struct bulk_checkin_state *state) + { + struct object_id oid; + + ## pack-write.c ## +@@ pack-write.c: static void rename_tmp_packfile(const char *source, + strbuf_release(&sb); + } + +-void finish_tmp_packfile(const struct strbuf *basename, ++void rename_tmp_packfile_idx(const struct strbuf *basename, ++ unsigned char hash[], char **idx_tmp_name) ++{ ++ rename_tmp_packfile(*idx_tmp_name, basename, hash, "idx"); ++} ++ ++void stage_tmp_packfiles(const struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, +- unsigned char hash[]) ++ unsigned char hash[], ++ char **idx_tmp_name) { - struct strbuf sb = STRBUF_INIT; - const char *idx_tmp_name, *rev_tmp_name = NULL; + const char *rev_tmp_name = NULL; @@ pack-write.c: void finish_tmp_packfile(const struct strbuf *tmp_basename, die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, -@@ pack-write.c: void finish_tmp_packfile(const struct strbuf *tmp_basename, - strbuf_reset(&sb); - } - -+ strbuf_release(&sb); -+} -+ -+void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, -+ unsigned char hash[], char **idx_tmp_name) -+{ -+ struct strbuf sb = STRBUF_INIT; -+ - strbuf_addf(&sb, "%s%s.idx", tmp_basename->buf, hash_to_hex(hash)); -- if (rename(idx_tmp_name, sb.buf)) -+ if (rename(*idx_tmp_name, sb.buf)) - die_errno("unable to rename temporary index file"); -- strbuf_reset(&sb); +@@ pack-write.c: void finish_tmp_packfile(const struct strbuf *basename, + rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); + if (rev_tmp_name) + rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); +- rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); - - free((void *)idx_tmp_name); -+ strbuf_release(&sb); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) @@ pack.h: int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -+void stage_tmp_packfile(const struct strbuf *tmp_basename, -+ const char *pack_tmp_name, -+ struct pack_idx_entry **written_list, -+ uint32_t nr_written, -+ struct pack_idx_option *pack_idx_opts, -+ unsigned char hash[], -+ char **idx_tmp_name); -+void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, -+ unsigned char hash[], char **idx_tmp_name); - void finish_tmp_packfile(const struct strbuf *name_buffer, +-void finish_tmp_packfile(const struct strbuf *basename, ++void stage_tmp_packfiles(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, +- unsigned char sha1[]); ++ unsigned char hash[], ++ char **idx_tmp_name); ++void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, ++ unsigned char hash[], char **idx_tmp_name); + + #endif 3: 78976fcb7b ! 4: 70f4a9767d pack-write: rename *.idx file into place last (really!) @@ Commit message those files are written out, nothing in that commit contradicts what's being done here. - While we're at it let's add cross-commentary to both builtin/repack.c - and builtin/pack-objects.c to point out the two places where we write - out these sets of files in sequence. + Note that the referenced earlier commit[1] is overly optimistic about + "clos[ing the] race", i.e. yes we'll now write the files in the right + order, but we might still race due to our sloppy use of fsync(). See + the thread at [2] for a rabbit hole of various discussions about + filesystem races in the face of doing and not doing fsync() (and if + doing fsync(), not doing it properly). 1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/ + 2. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## builtin/pack-objects.c ## @@ builtin/pack-objects.c: static void write_pack_file(void) - stage_tmp_packfile(&tmp_basename, pack_tmp_name, - written_list, nr_written, - &pack_idx_opts, hash, &idx_tmp_name); -- rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); + stage_tmp_packfiles(&tmpname, pack_tmp_name, + written_list, nr_written, + &pack_idx_opts, hash, &idx_tmp_name); +- rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ builtin/pack-objects.c: static void write_pack_file(void) strbuf_release(&sb); } -+ /* -+ * We must write the *.idx last, so that anything that expects -+ * an accompanying *.rev, *.bitmap etc. can count on it being -+ * present. -+ * -+ * See also corresponding logic in the "exts" -+ * struct in builtin/repack.c -+ */ -+ rename_tmp_packfile_idx(&tmp_basename, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); + free(idx_tmp_name); - strbuf_release(&tmp_basename); + strbuf_release(&tmpname); free(pack_tmp_name); - - ## builtin/repack.c ## -@@ builtin/repack.c: static struct { - {".rev", 1}, - {".bitmap", 1}, - {".promisor", 1}, -+ /* -+ * We must write the *.idx last, so that anything that expects -+ * an accompanying *.rev, *.bitmap etc. can count on it being -+ * present. -+ * -+ * See also corresponding logic in write_pack_file() in -+ * builtin/pack-objects.c -+ */ - {".idx"}, - }; - -- 2.33.0.819.gea1b153a43c ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 1/4] pack.h: line-wrap the definition of finish_tmp_packfile() 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 ` Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 3 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Line-wrap the definition of finish_tmp_packfile() to make subsequent changes easier to read. See 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28) for the commit that introduced this overly long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- pack.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/pack.h b/pack.h index fa13954526..1c17254c0a 100644 --- a/pack.h +++ b/pack.h @@ -110,6 +110,11 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +void finish_tmp_packfile(struct strbuf *name_buffer, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char sha1[]); #endif -- 2.33.0.819.gea1b153a43c ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 1/4] pack.h: line-wrap the definition of finish_tmp_packfile() Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 ` Ævar Arnfjörð Bjarmason 2021-09-08 4:22 ` Taylor Blau 2021-09-08 0:38 ` [PATCH v2 3/4] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 3 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Refactor the renaming in finish_tmp_packfile() so that it takes a "const struct strbuf *" instead of a non-const, and refactor the repetitive renaming pattern in finish_tmp_packfile() to use a new static rename_tmp_packfile() helper function. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03), which in turn was a minimal adjustment of pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). Since the memory allocation is not a bottleneck here we can afford a bit more readability at the cost of re-allocating this new "struct strbuf sb". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 8 ++++++-- pack-write.c | 41 +++++++++++++++++++---------------------- pack.h | 2 +- 3 files changed, 26 insertions(+), 25 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..f2569b5ca2 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1250,7 +1250,10 @@ static void write_pack_file(void) &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); + struct strbuf sb = STRBUF_INIT; + + strbuf_addf(&sb, "%s%s.bitmap", tmpname.buf, + hash_to_hex(hash)); stop_progress(&progress_state); @@ -1258,8 +1261,9 @@ static void write_pack_file(void) bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); bitmap_writer_build(&to_pack); bitmap_writer_finish(written_list, nr_written, - tmpname.buf, write_bitmap_options); + sb.buf, write_bitmap_options); write_bitmap_index = 0; + strbuf_release(&sb); } strbuf_release(&tmpname); diff --git a/pack-write.c b/pack-write.c index 277c60165e..363b395d67 100644 --- a/pack-write.c +++ b/pack-write.c @@ -462,7 +462,21 @@ struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } -void finish_tmp_packfile(struct strbuf *name_buffer, +static void rename_tmp_packfile(const char *source, + const struct strbuf *basename, + const unsigned char hash[], + const char *ext) +{ + struct strbuf sb = STRBUF_INIT; + + strbuf_addf(&sb, "%s%s.%s", basename->buf, hash_to_hex(hash), ext); + if (rename(source, sb.buf)) + die_errno("unable to rename temporary '*.%s' file to '%s'", + ext, sb.buf); + strbuf_release(&sb); +} + +void finish_tmp_packfile(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, @@ -470,7 +484,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, unsigned char hash[]) { const char *idx_tmp_name, *rev_tmp_name = NULL; - int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -483,26 +496,10 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); - strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); - - if (rename(pack_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary pack file"); - - strbuf_setlen(name_buffer, basename_len); - - if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary reverse-index file"); - - strbuf_setlen(name_buffer, basename_len); - } - - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); + rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); + if (rev_tmp_name) + rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); + rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); free((void *)idx_tmp_name); } diff --git a/pack.h b/pack.h index 1c17254c0a..594d5176aa 100644 --- a/pack.h +++ b/pack.h @@ -110,7 +110,7 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, +void finish_tmp_packfile(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, -- 2.33.0.819.gea1b153a43c ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-08 0:38 ` [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() Ævar Arnfjörð Bjarmason @ 2021-09-08 4:22 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 4:22 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Jeff King, Taylor Blau On Wed, Sep 08, 2021 at 02:38:26AM +0200, Ævar Arnfjörð Bjarmason wrote: > -void finish_tmp_packfile(struct strbuf *name_buffer, > +static void rename_tmp_packfile(const char *source, > + const struct strbuf *basename, > + const unsigned char hash[], > + const char *ext) > +{ > + struct strbuf sb = STRBUF_INIT; > + > + strbuf_addf(&sb, "%s%s.%s", basename->buf, hash_to_hex(hash), ext); > + if (rename(source, sb.buf)) > + die_errno("unable to rename temporary '*.%s' file to '%s'", > + ext, sb.buf); > + strbuf_release(&sb); > +} At the end of the day, I really do not mind the extra copying of basename. But if you were interested in a potential alternative, I'd suggest making basename non-const and doing instead: static void rename_tmp_packfile(struct strbuf *basename, ...) { size_t basename_len = basename->len; strbuf_addf(basename, ".%s", ext); if (rename(source, basename.buf)) die_errno(...); strbuf_setlen(basename, basename_len); } where the contract of this function is "I will modify the buffer you gave me (so do not read or write to it concurrently) but will return it to you in the same state as it was provided". That would be an improvement in readability (because of your idea to extract rename_tmp_packfile()) but would result in no new copying, which would be nice to avoid if we can. But I do not feel that strongly, it just seems like an unnecessary introduction of copying where we don't otherwise need it. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 3/4] pack-write: split up finish_tmp_packfile() function 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 1/4] pack.h: line-wrap the definition of finish_tmp_packfile() Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 ` Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 3 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c in preparation for moving the step of renaming the *.idx file later as part of a function change. Since the only other caller of finish_tmp_packfile() was in bulk-checkin.c, and it won't be needing a change to its *.idx renaming, provide a thin wrapper for the old function as a static function in that file. If other callers end up needing the simpler version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 7 +++++-- bulk-checkin.c | 16 ++++++++++++++++ pack-write.c | 22 +++++++++++++--------- pack.h | 7 +++++-- 4 files changed, 39 insertions(+), 13 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f2569b5ca2..33567aaa74 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1217,6 +1217,7 @@ static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; struct strbuf tmpname = STRBUF_INIT; + char *idx_tmp_name = NULL; /* * Packs are runtime accessed in their mtime @@ -1245,9 +1246,10 @@ static void write_pack_file(void) &to_pack, written_list, nr_written); } - finish_tmp_packfile(&tmpname, pack_tmp_name, + stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, - &pack_idx_opts, hash); + &pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ -1266,6 +1268,7 @@ static void write_pack_file(void) strbuf_release(&sb); } + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); puts(hash_to_hex(hash)); diff --git a/bulk-checkin.c b/bulk-checkin.c index b023d9959a..ca7697e9f7 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -23,6 +23,22 @@ static struct bulk_checkin_state { uint32_t nr_written; } state; +static void finish_tmp_packfile(const struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char hash[]) +{ + char *idx_tmp_name = NULL; + + stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written, + pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(basename, hash, &idx_tmp_name); + + free(idx_tmp_name); +} + static void finish_bulk_checkin(struct bulk_checkin_state *state) { struct object_id oid; diff --git a/pack-write.c b/pack-write.c index 363b395d67..09af8fccae 100644 --- a/pack-write.c +++ b/pack-write.c @@ -476,21 +476,28 @@ static void rename_tmp_packfile(const char *source, strbuf_release(&sb); } -void finish_tmp_packfile(const struct strbuf *basename, +void rename_tmp_packfile_idx(const struct strbuf *basename, + unsigned char hash[], char **idx_tmp_name) +{ + rename_tmp_packfile(*idx_tmp_name, basename, hash, "idx"); +} + +void stage_tmp_packfiles(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char hash[]) + unsigned char hash[], + char **idx_tmp_name) { - const char *idx_tmp_name, *rev_tmp_name = NULL; + const char *rev_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); - idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); - if (adjust_shared_perm(idx_tmp_name)) + *idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written, + pack_idx_opts, hash); + if (adjust_shared_perm(*idx_tmp_name)) die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, @@ -499,9 +506,6 @@ void finish_tmp_packfile(const struct strbuf *basename, rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); if (rev_tmp_name) rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); - rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); - - free((void *)idx_tmp_name); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index 594d5176aa..d5814f3158 100644 --- a/pack.h +++ b/pack.h @@ -110,11 +110,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(const struct strbuf *basename, +void stage_tmp_packfiles(const struct strbuf *basename, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char sha1[]); + unsigned char hash[], + char **idx_tmp_name); +void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, + unsigned char hash[], char **idx_tmp_name); #endif -- 2.33.0.819.gea1b153a43c ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-09-08 0:38 ` [PATCH v2 3/4] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 ` Ævar Arnfjörð Bjarmason 2021-09-08 1:14 ` Ævar Arnfjörð Bjarmason 2021-09-08 4:24 ` Taylor Blau 3 siblings, 2 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 0:38 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason Follow-up a preceding commit (pack-write.c: rename `.idx` file into place last, 2021-08-16)[1] and rename the *.idx file in-place after we write the *.bitmap. The preceding commit fixed the issue of *.idx being written before *.rev files, but did not do so for *.idx files. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that the referenced earlier commit[1] is overly optimistic about "clos[ing the] race", i.e. yes we'll now write the files in the right order, but we might still race due to our sloppy use of fsync(). See the thread at [2] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). 1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/ 2. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 33567aaa74..396240c25a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1249,7 +1249,6 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); if (write_bitmap_index) { struct strbuf sb = STRBUF_INIT; @@ -1268,6 +1267,8 @@ static void write_pack_file(void) strbuf_release(&sb); } + rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); -- 2.33.0.819.gea1b153a43c ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) 2021-09-08 0:38 ` [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason @ 2021-09-08 1:14 ` Ævar Arnfjörð Bjarmason 2021-09-08 9:18 ` Ævar Arnfjörð Bjarmason 2021-09-08 4:24 ` Taylor Blau 1 sibling, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 1:14 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > Follow-up a preceding commit (pack-write.c: rename `.idx` file into > place last, 2021-08-16)[1] and rename the *.idx file in-place after we > write the *.bitmap. The preceding commit fixed the issue of *.idx > being written before *.rev files, but did not do so for *.idx files. > > See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) > for commentary at the time when *.bitmap was implemented about how > those files are written out, nothing in that commit contradicts what's > being done here. > > Note that the referenced earlier commit[1] is overly optimistic about > "clos[ing the] race", i.e. yes we'll now write the files in the right > order, but we might still race due to our sloppy use of fsync(). See > the thread at [2] for a rabbit hole of various discussions about > filesystem races in the face of doing and not doing fsync() (and if > doing fsync(), not doing it properly). Actually I think it's a bit worse than that, we will unconditionally fsync() the *.pack we write out, but in stage_tmp_packfiles() (the behavior pre-dates both this series and its parent, I just think my stage_tmp_packfiles() is easier to follow) we'll not write the *.idx file with fsync() since we won't pass WRITE_IDX_VERIFY. The same goes for *.rev (which oddly makes its fsync() conditional on WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just like *.pack does. And then of course we'll do all these in-place renames but nothing fsyncs the fd of the directory, so the metadata and new names being committed to disk & visible to other processes is anyone's guess. But not only is that metadata commit not made, but due to the inconsistent fsync() we might end up with an *.idx that's partial and renamed in-place. In any case, any such issues pre-date this series and the series by Taylor it depends on, just adding some #leftoverbits for future fsync() fixes since I spent time looking into it. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) 2021-09-08 1:14 ` Ævar Arnfjörð Bjarmason @ 2021-09-08 9:18 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 9:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Taylor Blau, Ævar Arnfjörð Bjarmason On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 08 2021, Ævar Arnfjörð Bjarmason wrote: > >> Follow-up a preceding commit (pack-write.c: rename `.idx` file into >> place last, 2021-08-16)[1] and rename the *.idx file in-place after we >> write the *.bitmap. The preceding commit fixed the issue of *.idx >> being written before *.rev files, but did not do so for *.idx files. >> >> See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) >> for commentary at the time when *.bitmap was implemented about how >> those files are written out, nothing in that commit contradicts what's >> being done here. >> >> Note that the referenced earlier commit[1] is overly optimistic about >> "clos[ing the] race", i.e. yes we'll now write the files in the right >> order, but we might still race due to our sloppy use of fsync(). See >> the thread at [2] for a rabbit hole of various discussions about >> filesystem races in the face of doing and not doing fsync() (and if >> doing fsync(), not doing it properly). > > Actually I think it's a bit worse than that, we will unconditionally > fsync() the *.pack we write out, but in stage_tmp_packfiles() (the > behavior pre-dates both this series and its parent, I just think my > stage_tmp_packfiles() is easier to follow) we'll not write the *.idx > file with fsync() since we won't pass WRITE_IDX_VERIFY. > > The same goes for *.rev (which oddly makes its fsync() conditional on > WRITE_IDX_VERIFY), but not *.bitmap, which fsyncs unconditionally just > like *.pack does. I was confused here, we use fsync() with *.bitmap because there's no --verify mode for it, i.e. we'll always fsync() the *.rev files if we're actually writing them "for real", the non-fsync() path is when we're doing the "noop-write" to /dev/null with "index-pack --verify", i.e. writing to an FD without an associated "real" filename. That we use WRITE_IDX_VERIFY to decide on that fsync() for the *.rev files, as opposed to WRITE_REV_VERIFY is a bug in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev' files, 2021-01-25). In practice if WRITE_REV_VERIFY is true then WRITE_IDX_VERIFY is true as well. See e37d0b8730 (builtin/index-pack.c: write reverse indexes, 2021-01-25) and its interaction with 68be2fea50 (receive-pack, fetch-pack: reject bogus pack that records objects twice, 2011-11-16). I.e. both flags are set if --verify is there. So it's not a bug as far as the behavior of the program goes, it just makes for some very confusing code. Why should *.rev fsyncing be contingent on whether or not we want to fsync *.idx files on writes? It shouldn't, but it turns out those two flags always go hand in hand if we're writing *.rev at all. But re the thread started at <patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com> about how exactly we inspect these flags I think we should just take that change as-is for now, and refactor & fix this properly later. Once write_rev_file{_order}() don't pretend to take the trie-state bitflags of WRITE_REV|WRITE_REV_VERIFY|WRITE_IDX_VERIFY but just take a boolean "int write_file" the whole callchain & confusion around "do we have more than 1 flag?" goes away. I.e. something like the end-state shown by this patch, which I'm not submitting now because it conflicts with the other in-flight topic (and it's on top of some trivial but not-included whitespace changes in pack.h): diff --git a/builtin/fast-import.c b/builtin/fast-import.c index 20406f6775..69301fc044 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -778,7 +778,7 @@ static const char *create_index(void) die("internal consistency error creating the index"); tmpfile = write_idx_file(NULL, idx, object_count, &pack_idx_opts, - pack_data->hash); + pack_data->hash, 0); free(idx); return tmpfile; } diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8336466865..b9c2a21216 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1590,12 +1590,9 @@ static int git_index_pack_config(const char *k, const char *v, void *cb) } return 0; } - if (!strcmp(k, "pack.writereverseindex")) { - if (git_config_bool(k, v)) - opts->flags |= WRITE_REV; - else - opts->flags &= ~WRITE_REV; - } + if (!strcmp(k, "pack.writereverseindex")) + opts->write_rev = git_config_bool(k, v); + return git_default_config(k, v, cb); } @@ -1713,7 +1710,7 @@ static void show_pack_info(int stat_only) int cmd_index_pack(int argc, const char **argv, const char *prefix) { - int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; + int i, fix_thin_pack = 0, verify = 0, stat_only = 0; const char *curr_index; const char *curr_rev_index = NULL; const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL; @@ -1747,10 +1744,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - rev_index = 1; - else - rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX, + opts.write_rev); for (i = 1; i < argc; i++) { const char *arg = argv[i]; @@ -1831,9 +1826,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) die(_("unknown hash algorithm '%s'"), arg); repo_set_hash_algo(the_repository, hash_algo); } else if (!strcmp(arg, "--rev-index")) { - rev_index = 1; + opts.write_rev = 1; } else if (!strcmp(arg, "--no-rev-index")) { - rev_index = 0; + opts.write_rev = 0; } else usage(index_pack_usage); continue; @@ -1855,9 +1850,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name && pack_name) index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf); - opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY); - if (rev_index) { - opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV; + if (opts.write_rev) { if (index_name) rev_index_name = derive_filename(index_name, "idx", "rev", @@ -1868,10 +1861,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name) die(_("--verify with no packfile name given")); read_idx_option(&opts, index_name); - opts.flags |= WRITE_IDX_VERIFY | WRITE_IDX_STRICT; } if (strict) - opts.flags |= WRITE_IDX_STRICT; + opts.write_idx_strict = 1; if (HAVE_THREADS && !nr_threads) { nr_threads = online_cpus(); @@ -1915,11 +1907,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) ALLOC_ARRAY(idx_objects, nr_objects); for (i = 0; i < nr_objects; i++) idx_objects[i] = &objects[i].idx; - curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash); - if (rev_index) + curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, + pack_hash, verify); + if (opts.write_rev) curr_rev_index = write_rev_file(rev_index_name, idx_objects, nr_objects, pack_hash, - opts.flags); + verify); free(idx_objects); if (!verify) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..1dae01800f 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3139,10 +3139,7 @@ static int git_pack_config(const char *k, const char *v, void *cb) return 0; } if (!strcmp(k, "pack.writereverseindex")) { - if (git_config_bool(k, v)) - pack_idx_opts.flags |= WRITE_REV; - else - pack_idx_opts.flags &= ~WRITE_REV; + pack_idx_opts.write_rev = git_config_bool(k, v); return 0; } if (!strcmp(k, "uploadpack.blobpackfileuri")) { @@ -4030,8 +4027,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); git_config(git_pack_config, NULL); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - pack_idx_opts.flags |= WRITE_REV; + pack_idx_opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX, + pack_idx_opts.write_rev); progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/midx.c b/midx.c index 321c6fdd2f..63da740cd4 100644 --- a/midx.c +++ b/midx.c @@ -874,7 +874,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, - midx_hash, WRITE_REV); + midx_hash, 0); if (finalize_object_file(tmp_file, buf.buf)) die(_("cannot store reverse index file")); diff --git a/pack-write.c b/pack-write.c index 1883848e7c..4f54191499 100644 --- a/pack-write.c +++ b/pack-write.c @@ -44,7 +44,7 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts) */ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *opts, - const unsigned char *sha1) + const unsigned char *sha1, int verify) { struct hashfile *f; struct pack_idx_entry **sorted_by_sha, **list, **last; @@ -65,7 +65,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec else sorted_by_sha = list = last = NULL; - if (opts->flags & WRITE_IDX_VERIFY) { + if (verify) { assert(index_name); f = hashfd_check(index_name); } else { @@ -119,7 +119,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec if (index_version < 2) hashwrite_be32(f, obj->offset); hashwrite(f, obj->oid.hash, the_hash_algo->rawsz); - if ((opts->flags & WRITE_IDX_STRICT) && + if (opts->write_idx_strict && (i && oideq(&list[-2]->oid, &obj->oid))) die("The same object %s appears twice in the pack", oid_to_hex(&obj->oid)); @@ -162,8 +162,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec hashwrite(f, sha1, the_hash_algo->rawsz); finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | - ((opts->flags & WRITE_IDX_VERIFY) - ? 0 : CSUM_FSYNC)); + (verify ? 0 : CSUM_FSYNC)); return index_name; } @@ -218,22 +217,19 @@ const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, - unsigned flags) + int verify) { uint32_t *pack_order; uint32_t i; const char *ret; - if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY)) - return NULL; - ALLOC_ARRAY(pack_order, nr_objects); for (i = 0; i < nr_objects; i++) pack_order[i] = i; QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash, - flags); + verify); free(pack_order); @@ -244,15 +240,12 @@ const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, - unsigned flags) + int verify) { struct hashfile *f; int fd; - if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY)) - die(_("cannot both write and verify reverse index")); - - if (flags & WRITE_REV) { + if (!verify) { if (!rev_name) { struct strbuf tmp_file = STRBUF_INIT; fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX"); @@ -264,7 +257,7 @@ const char *write_rev_file_order(const char *rev_name, die_errno("unable to create '%s'", rev_name); } f = hashfd(fd, rev_name); - } else if (flags & WRITE_REV_VERIFY) { + } else { struct stat statbuf; if (stat(rev_name, &statbuf)) { if (errno == ENOENT) { @@ -274,8 +267,7 @@ const char *write_rev_file_order(const char *rev_name, die_errno(_("could not stat: %s"), rev_name); } f = hashfd_check(rev_name); - } else - return NULL; + } write_rev_header(f); @@ -286,7 +278,7 @@ const char *write_rev_file_order(const char *rev_name, die(_("failed to make %s readable"), rev_name); finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | - ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC)); + (verify ? 0 : CSUM_FSYNC)); return rev_name; } @@ -479,12 +471,13 @@ void finish_tmp_packfile(struct strbuf *name_buffer, die_errno("unable to make temporary pack file readable"); idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); + pack_idx_opts, hash, 0); if (adjust_shared_perm(idx_tmp_name)) die_errno("unable to make temporary index file readable"); - rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, - pack_idx_opts->flags); + if (pack_idx_opts->write_rev) + rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, + 0); strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); diff --git a/pack.h b/pack.h index aa23be74df..e096dea4c9 100644 --- a/pack.h +++ b/pack.h @@ -38,12 +38,8 @@ struct pack_header { #define PACK_IDX_SIGNATURE 0xff744f63 /* "\377tOc" */ struct pack_idx_option { - unsigned flags; - /* flag bits */ -#define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */ -#define WRITE_IDX_STRICT 02 -#define WRITE_REV 04 -#define WRITE_REV_VERIFY 010 + unsigned int write_idx_strict:1, + write_rev:1; uint32_t version; uint32_t off32_limit; @@ -84,7 +80,8 @@ typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned lo const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *opts, - const unsigned char *sha1); + const unsigned char *sha1, + int verify); int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr); int verify_pack_index(struct packed_git *); int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t); @@ -99,11 +96,11 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_ const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, - const unsigned char *hash, unsigned flags); + const unsigned char *hash, int verify); const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, - unsigned flags); + int verify); /* * The "hdr" output buffer should be at least this big, which will handle sizes ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) 2021-09-08 0:38 ` [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 2021-09-08 1:14 ` Ævar Arnfjörð Bjarmason @ 2021-09-08 4:24 ` Taylor Blau 1 sibling, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 4:24 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Jeff King, Taylor Blau On Wed, Sep 08, 2021 at 02:38:28AM +0200, Ævar Arnfjörð Bjarmason wrote: > Follow-up a preceding commit (pack-write.c: rename `.idx` file into > place last, 2021-08-16)[1] and rename the *.idx file in-place after we > write the *.bitmap. The preceding commit fixed the issue of *.idx > being written before *.rev files, but did not do so for *.idx files. s/idx/bitmap/ in the last line, I believe. Otherwise this patch looks good to me. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 0/3] prevent opening packs too early 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau ` (5 preceding siblings ...) 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason @ 2021-09-08 22:17 ` Taylor Blau 2021-09-08 22:17 ` [PATCH v2 1/3] pack-write.c: rename `.idx` files into place last Taylor Blau ` (3 more replies) 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau 7 siblings, 4 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 22:17 UTC (permalink / raw) To: git; +Cc: peff, avarab, gitster Here is a very small reroll of a couple of patches to make sure that packs are not read before all of their auxiliary files exist and are moved into place, by renaming the .idx file into place last. New since the original version is a patch to apply similar treatment to index-pack, as noticed by Peff in [1]. This should be queued before Ævar's series on top. [1]: https://lore.kernel.org/git/YS75P7r33NIBmFh2@coredump.intra.peff.net/ Taylor Blau (3): pack-write.c: rename `.idx` files into place last builtin/repack.c: move `.idx` files into place last builtin/index-pack.c: move `.idx` files into place last builtin/index-pack.c | 16 ++++++++-------- builtin/repack.c | 2 +- pack-write.c | 12 ++++++------ 3 files changed, 15 insertions(+), 15 deletions(-) Range-diff against v1: 1: ea3b1a0d8e ! 1: cb3a843994 pack-write.c: rename `.idx` file into place last @@ Metadata Author: Taylor Blau <me@ttaylorr.com> ## Commit message ## - pack-write.c: rename `.idx` file into place last + pack-write.c: rename `.idx` files into place last We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is 2: a6a4d2154e = 2: 925f9ada2a builtin/repack.c: move `.idx` files into place last -: ---------- > 3: 460b7b9151 builtin/index-pack.c: move `.idx` files into place last -- 2.33.0.96.g73915697e6 ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 1/3] pack-write.c: rename `.idx` files into place last 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau @ 2021-09-08 22:17 ` Taylor Blau 2021-09-08 22:17 ` [PATCH v2 2/3] builtin/repack.c: move " Taylor Blau ` (2 subsequent siblings) 3 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 22:17 UTC (permalink / raw) To: git; +Cc: peff, avarab, gitster We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. But the `.rev` file is moved into place after the `.idx`, so it's possible for a process to open a pack in between the two (i.e., while the `.idx` file is in place but the `.rev` file is not) and mistakenly fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. While we're here, only call strbuf_setlen() if we actually modified the buffer by bringing it inside of the same if-statement that calls rename(). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack-write.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/pack-write.c b/pack-write.c index f1fc3ecafa..277c60165e 100644 --- a/pack-write.c +++ b/pack-write.c @@ -490,18 +490,18 @@ void finish_tmp_packfile(struct strbuf *name_buffer, strbuf_setlen(name_buffer, basename_len); - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); - if (rev_tmp_name) { strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); if (rename(rev_tmp_name, name_buffer->buf)) die_errno("unable to rename temporary reverse-index file"); + + strbuf_setlen(name_buffer, basename_len); } + strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); + if (rename(idx_tmp_name, name_buffer->buf)) + die_errno("unable to rename temporary index file"); + strbuf_setlen(name_buffer, basename_len); free((void *)idx_tmp_name); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 2/3] builtin/repack.c: move `.idx` files into place last 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau 2021-09-08 22:17 ` [PATCH v2 1/3] pack-write.c: rename `.idx` files into place last Taylor Blau @ 2021-09-08 22:17 ` Taylor Blau 2021-09-08 22:17 ` [PATCH v2 3/3] builtin/index-pack.c: " Taylor Blau 2021-09-08 23:52 ` [PATCH v2 0/3] prevent opening packs too early Ævar Arnfjörð Bjarmason 3 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 22:17 UTC (permalink / raw) To: git; +Cc: peff, avarab, gitster In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/repack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..c3e4771609 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,10 +208,10 @@ static struct { unsigned optional:1; } exts[] = { {".pack"}, - {".idx"}, {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, + {".idx"}, }; static unsigned populate_pack_exts(char *name) -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 3/3] builtin/index-pack.c: move `.idx` files into place last 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau 2021-09-08 22:17 ` [PATCH v2 1/3] pack-write.c: rename `.idx` files into place last Taylor Blau 2021-09-08 22:17 ` [PATCH v2 2/3] builtin/repack.c: move " Taylor Blau @ 2021-09-08 22:17 ` Taylor Blau 2021-09-08 23:52 ` [PATCH v2 0/3] prevent opening packs too early Ævar Arnfjörð Bjarmason 3 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-08 22:17 UTC (permalink / raw) To: git; +Cc: peff, avarab, gitster In a similar spirit as the previous patch, fix the identical problem in `git index-pack`. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/index-pack.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8336466865..6ca47c5c03 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1517,14 +1517,6 @@ static void final(const char *final_pack_name, const char *curr_pack_name, } else if (from_stdin) chmod(final_pack_name, 0444); - if (final_index_name != curr_index_name) { - if (!final_index_name) - final_index_name = odb_pack_name(&index_name, hash, "idx"); - if (finalize_object_file(curr_index_name, final_index_name)) - die(_("cannot store index file")); - } else - chmod(final_index_name, 0444); - if (curr_rev_index_name) { if (final_rev_index_name != curr_rev_index_name) { if (!final_rev_index_name) @@ -1535,6 +1527,14 @@ static void final(const char *final_pack_name, const char *curr_pack_name, chmod(final_rev_index_name, 0444); } + if (final_index_name != curr_index_name) { + if (!final_index_name) + final_index_name = odb_pack_name(&index_name, hash, "idx"); + if (finalize_object_file(curr_index_name, final_index_name)) + die(_("cannot store index file")); + } else + chmod(final_index_name, 0444); + if (do_fsck_object) { struct packed_git *p; p = add_packed_git(final_index_name, strlen(final_index_name), 0); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau ` (2 preceding siblings ...) 2021-09-08 22:17 ` [PATCH v2 3/3] builtin/index-pack.c: " Taylor Blau @ 2021-09-08 23:52 ` Ævar Arnfjörð Bjarmason 2021-09-09 0:50 ` Ævar Arnfjörð Bjarmason 3 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-08 23:52 UTC (permalink / raw) To: Taylor Blau; +Cc: git, peff, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > Here is a very small reroll of a couple of patches to make sure that packs are > not read before all of their auxiliary files exist and are moved into place, by > renaming the .idx file into place last. > > New since the original version is a patch to apply similar treatment to > index-pack, as noticed by Peff in [1]. This should be queued before Ævar's > series on top. I read Junio's earlier <xmqq8s063m7m.fsf@gitster.g>[1] as a suggestion that we combine the two into a singe series. I do think that yields a better end-result, in particular your 1/3 is much more readable if the refactoring in my 2/4. I'm mot of the way done with such a rewrite, which also incorporates your suggestion for how to manage memory in rename_tmp_packfile(), but I'll hold of on what you & Junio have to say about next steps before adding to the re-roll competition Junio mentioned... 1. https://lore.kernel.org/git/xmqq8s063m7m.fsf@gitster.g ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-08 23:52 ` [PATCH v2 0/3] prevent opening packs too early Ævar Arnfjörð Bjarmason @ 2021-09-09 0:50 ` Ævar Arnfjörð Bjarmason 2021-09-09 1:13 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 0:50 UTC (permalink / raw) To: Taylor Blau; +Cc: git, peff, gitster On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 08 2021, Taylor Blau wrote: > >> Here is a very small reroll of a couple of patches to make sure that packs are >> not read before all of their auxiliary files exist and are moved into place, by >> renaming the .idx file into place last. >> >> New since the original version is a patch to apply similar treatment to >> index-pack, as noticed by Peff in [1]. This should be queued before Ævar's >> series on top. > > I read Junio's earlier <xmqq8s063m7m.fsf@gitster.g>[1] as a suggestion > that we combine the two into a singe series. I do think that yields a > better end-result, in particular your 1/3 is much more readable if the > refactoring in my 2/4. > > I'm mot of the way done with such a rewrite, which also incorporates > your suggestion for how to manage memory in rename_tmp_packfile(), but > I'll hold of on what you & Junio have to say about next steps before > adding to the re-roll competition Junio mentioned... > > 1. https://lore.kernel.org/git/xmqq8s063m7m.fsf@gitster.g I've got that at https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-3 Range-diff to my own v2 + your v1 follows (i.e. not my v2 + your v2 here). I can submit this as-is to the list, or something else. Junio nominated you "team leader", so I'll go with your plan :) I think the functional changes on top of the function refactoring make it much easier to see what's going on, e.g.: https://github.com/git/git/commit/832d9fa083d20abab35695037152dfeb1f9fe50a I took the liberty of similarly creating a helper function in index-pack.c, which turns your commit there into this: https://github.com/git/git/commit/8c2dcbf6011c466a806491672c0d7308e8105a1d The relevant commit messages were also adjusted to note that in the earlier *.idx v.s. *.rev we're leaving the similar *.bitmap problem for a later change. 3: 29f57876515 = 1: 4869f97408b pack.h: line-wrap the definition of finish_tmp_packfile() 4: 7b39f4599b1 ! 2: c0b1ec90d43 pack-write: refactor renaming in finish_tmp_packfile() @@ Metadata ## Commit message ## pack-write: refactor renaming in finish_tmp_packfile() - Refactor the renaming in finish_tmp_packfile() so that it takes a - "const struct strbuf *" instead of a non-const, and refactor the - repetitive renaming pattern in finish_tmp_packfile() to use a new - static rename_tmp_packfile() helper function. + Refactor the renaming in finish_tmp_packfile() into a helper + function. The callers are now expected to pass a "name_buffer" ending + in "pack-OID." instead of the previous "pack-", we then append "pack", + "idx" or "rev" to it. + + By doing the strbuf_setlen() in rename_tmp_packfile() we re-use the + buffer and avoid the repeated allocations we'd get if that function + had its own temporary "struct strbuf". + + This approach of re-using the buffer does make the last user in + pack-object.c's write_pack_file() slightly awkward, we needlessly do a + strbuf_setlen() there just before the strbuf_release() for + consistency. In subsequent changes we'll move that bitmap writing code + around, so let's not skip the strbuf_setlen() now. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname @@ Commit message pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). - Since the memory allocation is not a bottleneck here we can afford a - bit more readability at the cost of re-allocating this new "struct - strbuf sb". - Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## builtin/pack-objects.c ## +@@ builtin/pack-objects.c: static void write_pack_file(void) + warning_errno(_("failed utime() on %s"), pack_tmp_name); + } + +- strbuf_addf(&tmpname, "%s-", base_name); ++ strbuf_addf(&tmpname, "%s-%s.", base_name, ++ hash_to_hex(hash)); + + if (write_bitmap_index) { + bitmap_writer_set_checksum(hash); @@ builtin/pack-objects.c: static void write_pack_file(void) &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); -+ struct strbuf sb = STRBUF_INIT; -+ -+ strbuf_addf(&sb, "%s%s.bitmap", tmpname.buf, -+ hash_to_hex(hash)); ++ size_t tmpname_len = tmpname.len; ++ strbuf_addstr(&tmpname, "bitmap"); stop_progress(&progress_state); + bitmap_writer_show_progress(progress); @@ builtin/pack-objects.c: static void write_pack_file(void) - bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); - bitmap_writer_build(&to_pack); bitmap_writer_finish(written_list, nr_written, -- tmpname.buf, write_bitmap_options); -+ sb.buf, write_bitmap_options); + tmpname.buf, write_bitmap_options); write_bitmap_index = 0; -+ strbuf_release(&sb); ++ strbuf_setlen(&tmpname, tmpname_len); } strbuf_release(&tmpname); @@ pack-write.c: struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } --void finish_tmp_packfile(struct strbuf *name_buffer, -+static void rename_tmp_packfile(const char *source, -+ const struct strbuf *basename, -+ const unsigned char hash[], -+ const char *ext) ++static void rename_tmp_packfile(struct strbuf *nb, const char *source, ++ const char *ext) +{ -+ struct strbuf sb = STRBUF_INIT; ++ size_t nb_len = nb->len; + -+ strbuf_addf(&sb, "%s%s.%s", basename->buf, hash_to_hex(hash), ext); -+ if (rename(source, sb.buf)) ++ strbuf_addstr(nb, ext); ++ if (rename(source, nb->buf)) + die_errno("unable to rename temporary '*.%s' file to '%s'", -+ ext, sb.buf); -+ strbuf_release(&sb); ++ ext, nb->buf); ++ strbuf_setlen(nb, nb_len); +} + -+void finish_tmp_packfile(const struct strbuf *basename, + void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, - uint32_t nr_written, @@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, unsigned char hash[]) { @@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, - - strbuf_setlen(name_buffer, basename_len); - +- strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); +- if (rename(idx_tmp_name, name_buffer->buf)) +- die_errno("unable to rename temporary index file"); +- +- strbuf_setlen(name_buffer, basename_len); +- - if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary reverse-index file"); -- -- strbuf_setlen(name_buffer, basename_len); - } - -- strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); -- if (rename(idx_tmp_name, name_buffer->buf)) -- die_errno("unable to rename temporary index file"); -- - strbuf_setlen(name_buffer, basename_len); -+ rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); ++ rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); ++ rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); + if (rev_tmp_name) -+ rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); -+ rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); ++ rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); free((void *)idx_tmp_name); } - - ## pack.h ## -@@ pack.h: int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, - int read_pack_header(int fd, struct pack_header *); - - struct hashfile *create_tmp_packfile(char **pack_tmp_name); --void finish_tmp_packfile(struct strbuf *name_buffer, -+void finish_tmp_packfile(const struct strbuf *basename, - const char *pack_tmp_name, - struct pack_idx_entry **written_list, - uint32_t nr_written, -: ----------- > 3: 832d9fa083d pack-write.c: rename `.idx` files after `*.rev' 2: a6a4d2154e8 ! 4: 9f7e005ba03 builtin/repack.c: move `.idx` files into place last @@ Commit message itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## builtin/repack.c ## @@ builtin/repack.c: static struct { 1: ea3b1a0d8ed ! 5: 457d4b9787c pack-write.c: rename `.idx` file into place last @@ ## Metadata ## -Author: Taylor Blau <me@ttaylorr.com> +Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - pack-write.c: rename `.idx` file into place last + index-pack: refactor renaming in final() - We treat the presence of an `.idx` file as the indicator of whether or - not it's safe to use a packfile. But `finish_tmp_packfile()` (which is - responsible for writing the pack and moving the temporary versions of - all of its auxiliary files into place) is inconsistent about the write - order. + Refactor the renaming in final() into a helper function, this is + similar in spirit to a preceding refactoring of finish_tmp_packfile() + in pack-write.c. - But the `.rev` file is moved into place after the `.idx`, so it's - possible for a process to open a pack in between the two (i.e., while - the `.idx` file is in place but the `.rev` file is not) and mistakenly - fall back to generating the pack's reverse index in memory. Though racy, - this amounts to an unnecessary slow-down at worst, and doesn't affect - the correctness of the resulting reverse index. + Before e37d0b8730b (builtin/index-pack.c: write reverse indexes, + 2021-01-25) it probably wasn't worth it to have this sort of helper, + due to the differing "else if" case for "pack" files v.s. "idx" files. - Close this race by moving the .rev file into place before moving the - .idx file into place. + But since we've got "rev" as well now, let's do the renaming via a + helper, this is both a net decrease in lines, and improves the + readability, since we can easily see at a glance that the logic for + writing these three types of files is exactly the same, aside from the + obviously differing cases of "*final_xyz_name" being NULL, and + "else_chmod_if" being different. - While we're here, only call strbuf_setlen() if we actually modified the - buffer by bringing it inside of the same if-statement that calls - rename(). + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> - Signed-off-by: Taylor Blau <me@ttaylorr.com> - - ## pack-write.c ## -@@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, + ## builtin/index-pack.c ## +@@ builtin/index-pack.c: static void write_special_file(const char *suffix, const char *msg, + strbuf_release(&name_buf); + } - strbuf_setlen(name_buffer, basename_len); ++static void rename_tmp_packfile(const char **final_xyz_name, ++ const char *curr_xyz_name, ++ struct strbuf *xyz_name, unsigned char *hash, ++ const char *ext, int else_chmod_if) ++{ ++ if (*final_xyz_name != curr_xyz_name) { ++ if (!*final_xyz_name) ++ *final_xyz_name = odb_pack_name(xyz_name, hash, ext); ++ if (finalize_object_file(curr_xyz_name, *final_xyz_name)) ++ die(_("unable to rename temporary '*.%s' file to '%s"), ++ ext, *final_xyz_name); ++ } else if (else_chmod_if) { ++ chmod(*final_xyz_name, 0444); ++ } ++} ++ + static void final(const char *final_pack_name, const char *curr_pack_name, + const char *final_index_name, const char *curr_index_name, + const char *final_rev_index_name, const char *curr_rev_index_name, +@@ builtin/index-pack.c: static void final(const char *final_pack_name, const char *curr_pack_name, + write_special_file("promisor", promisor_msg, final_pack_name, + hash, NULL); -- strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); -- if (rename(idx_tmp_name, name_buffer->buf)) -- die_errno("unable to rename temporary index file"); +- if (final_pack_name != curr_pack_name) { +- if (!final_pack_name) +- final_pack_name = odb_pack_name(&pack_name, hash, "pack"); +- if (finalize_object_file(curr_pack_name, final_pack_name)) +- die(_("cannot store pack file")); +- } else if (from_stdin) +- chmod(final_pack_name, 0444); - -- strbuf_setlen(name_buffer, basename_len); +- if (final_index_name != curr_index_name) { +- if (!final_index_name) +- final_index_name = odb_pack_name(&index_name, hash, "idx"); +- if (finalize_object_file(curr_index_name, final_index_name)) +- die(_("cannot store index file")); +- } else +- chmod(final_index_name, 0444); - - if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary reverse-index file"); -+ -+ strbuf_setlen(name_buffer, basename_len); - } - -+ strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); -+ if (rename(idx_tmp_name, name_buffer->buf)) -+ die_errno("unable to rename temporary index file"); -+ - strbuf_setlen(name_buffer, basename_len); +- if (curr_rev_index_name) { +- if (final_rev_index_name != curr_rev_index_name) { +- if (!final_rev_index_name) +- final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev"); +- if (finalize_object_file(curr_rev_index_name, final_rev_index_name)) +- die(_("cannot store reverse index file")); +- } else +- chmod(final_rev_index_name, 0444); +- } ++ rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, ++ hash, "pack", from_stdin); ++ rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, ++ hash, "idx", 1); ++ if (curr_rev_index_name) ++ rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, ++ &rev_index_name, hash, "rev", 1); - free((void *)idx_tmp_name); + if (do_fsck_object) { + struct packed_git *p; -: ----------- > 6: 8c2dcbf6011 builtin/index-pack.c: move `.idx` files into place last 5: 1205f9d0c25 ! 7: 3a0ee7e4a99 pack-write: split up finish_tmp_packfile() function @@ builtin/pack-objects.c: static void write_pack_file(void) written_list, nr_written, - &pack_idx_opts, hash); + &pack_idx_opts, hash, &idx_tmp_name); -+ rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { - struct strbuf sb = STRBUF_INIT; + size_t tmpname_len = tmpname.len; @@ builtin/pack-objects.c: static void write_pack_file(void) - strbuf_release(&sb); + strbuf_setlen(&tmpname, tmpname_len); } + free(idx_tmp_name); @@ bulk-checkin.c: static struct bulk_checkin_state { uint32_t nr_written; } state; -+static void finish_tmp_packfile(const struct strbuf *basename, ++static void finish_tmp_packfile(struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, @@ bulk-checkin.c: static struct bulk_checkin_state { + + stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written, + pack_idx_opts, hash, &idx_tmp_name); -+ rename_tmp_packfile_idx(basename, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(basename, &idx_tmp_name); + + free(idx_tmp_name); +} @@ bulk-checkin.c: static struct bulk_checkin_state { static void finish_bulk_checkin(struct bulk_checkin_state *state) { struct object_id oid; +@@ bulk-checkin.c: static void finish_bulk_checkin(struct bulk_checkin_state *state) + close(fd); + } + +- strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); ++ strbuf_addf(&packname, "%s/pack/pack-%s.", get_object_directory(), ++ oid_to_hex(&oid)); + finish_tmp_packfile(&packname, state->pack_tmp_name, + state->written, state->nr_written, + &state->pack_idx_opts, oid.hash); ## pack-write.c ## -@@ pack-write.c: static void rename_tmp_packfile(const char *source, - strbuf_release(&sb); +@@ pack-write.c: static void rename_tmp_packfile(struct strbuf *nb, const char *source, + strbuf_setlen(nb, nb_len); } --void finish_tmp_packfile(const struct strbuf *basename, -+void rename_tmp_packfile_idx(const struct strbuf *basename, -+ unsigned char hash[], char **idx_tmp_name) +-void finish_tmp_packfile(struct strbuf *name_buffer, ++void rename_tmp_packfile_idx(struct strbuf *name_buffer, ++ char **idx_tmp_name) +{ -+ rename_tmp_packfile(*idx_tmp_name, basename, hash, "idx"); ++ rename_tmp_packfile(name_buffer, *idx_tmp_name, "idx"); +} + -+void stage_tmp_packfiles(const struct strbuf *basename, ++void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, @@ pack-write.c: static void rename_tmp_packfile(const char *source, die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, -@@ pack-write.c: void finish_tmp_packfile(const struct strbuf *basename, - rename_tmp_packfile(pack_tmp_name, basename, hash, "pack"); +@@ pack-write.c: void finish_tmp_packfile(struct strbuf *name_buffer, + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) - rename_tmp_packfile(rev_tmp_name, basename, hash, "rev"); -- rename_tmp_packfile(idx_tmp_name, basename, hash, "idx"); + rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); +- rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); - - free((void *)idx_tmp_name); } @@ pack.h: int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); --void finish_tmp_packfile(const struct strbuf *basename, -+void stage_tmp_packfiles(const struct strbuf *basename, +-void finish_tmp_packfile(struct strbuf *name_buffer, ++void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, @@ pack.h: int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, - unsigned char sha1[]); + unsigned char hash[], + char **idx_tmp_name); -+void rename_tmp_packfile_idx(const struct strbuf *tmp_basename, -+ unsigned char hash[], char **idx_tmp_name); ++void rename_tmp_packfile_idx(struct strbuf *basename, ++ char **idx_tmp_name); #endif 6: 70f4a9767d3 ! 8: b1b232b80de pack-write: rename *.idx file into place last (really!) @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - pack-write: rename *.idx file into place last (really!) + pack-objects: rename .idx files into place after .bitmap files - Follow-up a preceding commit (pack-write.c: rename `.idx` file into - place last, 2021-08-16)[1] and rename the *.idx file in-place after we - write the *.bitmap. The preceding commit fixed the issue of *.idx - being written before *.rev files, but did not do so for *.idx files. + In preceding commits the race of renaming .idx files in place before + .rev files and other auxiliary files was fixed in pack-write.c's + finish_tmp_packfile(), builtin/repack.c's "struct exts", and + builtin/index-pack.c's final(). As noted in the change to pack-write.c + we left in place the issue of writing *.bitmap files after the *.idx, + let's fix that issue. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. - Note that the referenced earlier commit[1] is overly optimistic about - "clos[ing the] race", i.e. yes we'll now write the files in the right - order, but we might still race due to our sloppy use of fsync(). See - the thread at [2] for a rabbit hole of various discussions about - filesystem races in the face of doing and not doing fsync() (and if - doing fsync(), not doing it properly). + Note that this commit and preceding ones only close any race condition + with *.idx files being written before their auxiliary files if we're + optimistic about our lack of fsync()-ing in this are not tripping us + over. See the thread at [1] for a rabbit hole of various discussions + about filesystem races in the face of doing and not doing fsync() (and + if doing fsync(), not doing it properly). - 1. https://lore.kernel.org/git/a6a4d2154e83d41c10986c5f455279ab249a910c.1630461918.git.me@ttaylorr.com/ - 2. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ + In particular, in this case of writing to ".git/objects/pack" we only + write and fsync() the individual files, but if we wanted to guarantee + that the metadata update was seen in that way by concurrent processes + we'd need to fsync() on the "fd" of the containing directory. That + concern is probably more theoretical than not, modern OS's tend to be + more on the forgiving side than the overly pedantic side of + implementing POSIX FS semantics. + + 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ builtin/pack-objects.c: static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); -- rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); +- rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { - struct strbuf sb = STRBUF_INIT; + size_t tmpname_len = tmpname.len; @@ builtin/pack-objects.c: static void write_pack_file(void) - strbuf_release(&sb); + strbuf_setlen(&tmpname, tmpname_len); } -+ rename_tmp_packfile_idx(&tmpname, hash, &idx_tmp_name); ++ rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-09 0:50 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 1:13 ` Taylor Blau 2021-09-09 1:33 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 1:13 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git, peff, gitster On Thu, Sep 09, 2021 at 02:50:59AM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: > > > On Wed, Sep 08 2021, Taylor Blau wrote: > > > >> Here is a very small reroll of a couple of patches to make sure that packs are > >> not read before all of their auxiliary files exist and are moved into place, by > >> renaming the .idx file into place last. > >> > >> New since the original version is a patch to apply similar treatment to > >> index-pack, as noticed by Peff in [1]. This should be queued before Ævar's > >> series on top. > > > > I read Junio's earlier <xmqq8s063m7m.fsf@gitster.g>[1] as a suggestion > > that we combine the two into a singe series. I do think that yields a > > better end-result, in particular your 1/3 is much more readable if the > > refactoring in my 2/4. > > > > I'm mot of the way done with such a rewrite, which also incorporates > > your suggestion for how to manage memory in rename_tmp_packfile(), but > > I'll hold of on what you & Junio have to say about next steps before > > adding to the re-roll competition Junio mentioned... > > > > 1. https://lore.kernel.org/git/xmqq8s063m7m.fsf@gitster.g > > I've got that at > https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-3 Beautiful :-). I mentioned in my response to [1] that I missed that message when sending v2 of my series to fix a couple of these races. And I was even happy to unify our two series, but you did all of the work for me while I was eating dinner. Thank you! I fetched your branch, reviewed, and am happy with the result. So I would be content to apply my s-o-b to your patches and resubmit them as a unified series. But I did wonder if you wanted to include [2] in this series as well. It's kind of different, but also related enough that it probably makes sense to just pull it in. So I'm inclined to just do that, unless you have any objections. Thanks, Taylor [2]: https://lore.kernel.org/git/patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com/ ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-09 1:13 ` Taylor Blau @ 2021-09-09 1:33 ` Ævar Arnfjörð Bjarmason 2021-09-09 2:36 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 1:33 UTC (permalink / raw) To: Taylor Blau; +Cc: git, peff, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > On Thu, Sep 09, 2021 at 02:50:59AM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: >> >> > On Wed, Sep 08 2021, Taylor Blau wrote: >> > >> >> Here is a very small reroll of a couple of patches to make sure that packs are >> >> not read before all of their auxiliary files exist and are moved into place, by >> >> renaming the .idx file into place last. >> >> >> >> New since the original version is a patch to apply similar treatment to >> >> index-pack, as noticed by Peff in [1]. This should be queued before Ævar's >> >> series on top. >> > >> > I read Junio's earlier <xmqq8s063m7m.fsf@gitster.g>[1] as a suggestion >> > that we combine the two into a singe series. I do think that yields a >> > better end-result, in particular your 1/3 is much more readable if the >> > refactoring in my 2/4. >> > >> > I'm mot of the way done with such a rewrite, which also incorporates >> > your suggestion for how to manage memory in rename_tmp_packfile(), but >> > I'll hold of on what you & Junio have to say about next steps before >> > adding to the re-roll competition Junio mentioned... >> > >> > 1. https://lore.kernel.org/git/xmqq8s063m7m.fsf@gitster.g >> >> I've got that at >> https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-3 > > Beautiful :-). > > I mentioned in my response to [1] that I missed that message when > sending v2 of my series to fix a couple of these races. And I was even > happy to unify our two series, but you did all of the work for me while > I was eating dinner. Thank you! > > I fetched your branch, reviewed, and am happy with the result. So I > would be content to apply my s-o-b to your patches and resubmit them as > a unified series. Sounds good to me, also in particular any typo/grammar etc. fixes to my changes are most welcome, I tend to sneak those in, and any code changes you see fit of course. > But I did wonder if you wanted to include [2] in this series as well. > It's kind of different, but also related enough that it probably makes > sense to just pull it in. So I'm inclined to just do that, unless you > have any objections. In this latest push-out of "next" Junio's merged that one down already, see 1972c5931bb (Merge branch 'ab/reverse-midx-optim' into next, 2021-09-08). So I think at this point it could be built on top of that, but given that the two don't conflict in any way (textually or semantically) it's probably better to base this larger topic on "master" and proceed with them independently. > [2]: https://lore.kernel.org/git/patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com/ ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-09 1:33 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 2:36 ` Ævar Arnfjörð Bjarmason 2021-09-09 2:49 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 2:36 UTC (permalink / raw) To: Taylor Blau; +Cc: git, peff, gitster On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 08 2021, Taylor Blau wrote: > >> On Thu, Sep 09, 2021 at 02:50:59AM +0200, Ævar Arnfjörð Bjarmason wrote: >>> >>> On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: >>> >>> > On Wed, Sep 08 2021, Taylor Blau wrote: >>> > >>> >> Here is a very small reroll of a couple of patches to make sure that packs are >>> >> not read before all of their auxiliary files exist and are moved into place, by >>> >> renaming the .idx file into place last. >>> >> >>> >> New since the original version is a patch to apply similar treatment to >>> >> index-pack, as noticed by Peff in [1]. This should be queued before Ævar's >>> >> series on top. >>> > >>> > I read Junio's earlier <xmqq8s063m7m.fsf@gitster.g>[1] as a suggestion >>> > that we combine the two into a singe series. I do think that yields a >>> > better end-result, in particular your 1/3 is much more readable if the >>> > refactoring in my 2/4. >>> > >>> > I'm mot of the way done with such a rewrite, which also incorporates >>> > your suggestion for how to manage memory in rename_tmp_packfile(), but >>> > I'll hold of on what you & Junio have to say about next steps before >>> > adding to the re-roll competition Junio mentioned... >>> > >>> > 1. https://lore.kernel.org/git/xmqq8s063m7m.fsf@gitster.g >>> >>> I've got that at >>> https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-3 >> >> Beautiful :-). >> >> I mentioned in my response to [1] that I missed that message when >> sending v2 of my series to fix a couple of these races. And I was even >> happy to unify our two series, but you did all of the work for me while >> I was eating dinner. Thank you! >> >> I fetched your branch, reviewed, and am happy with the result. So I >> would be content to apply my s-o-b to your patches and resubmit them as >> a unified series. > > Sounds good to me, also in particular any typo/grammar etc. fixes to my > changes are most welcome, I tend to sneak those in, and any code changes > you see fit of course. > >> But I did wonder if you wanted to include [2] in this series as well. >> It's kind of different, but also related enough that it probably makes >> sense to just pull it in. So I'm inclined to just do that, unless you >> have any objections. > > In this latest push-out of "next" Junio's merged that one down already, > see 1972c5931bb (Merge branch 'ab/reverse-midx-optim' into next, > 2021-09-08). > > So I think at this point it could be built on top of that, but given > that the two don't conflict in any way (textually or semantically) it's > probably better to base this larger topic on "master" and proceed with > them independently. > >> [2]: https://lore.kernel.org/git/patch-1.1-366ba928bd-20210908T010743Z-avarab@gmail.com/ Also: That avar-tb/idx-rename-race-3 has a bug where I didn't update the bulk-checkin.c code accordingly, missed it because I was running a glob of tests that included "*pack*,*rev*", but missed "*large*". I've got a avar-tb/idx-rename-race-4 which fixes it. Sorry about that. https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-4 ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/3] prevent opening packs too early 2021-09-09 2:36 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 2:49 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 2:49 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git, peff, gitster On Thu, Sep 09, 2021 at 04:36:17AM +0200, Ævar Arnfjörð Bjarmason wrote: > Also: That avar-tb/idx-rename-race-3 has a bug where I didn't update the > bulk-checkin.c code accordingly, missed it because I was running a glob > of tests that included "*pack*,*rev*", but missed "*large*". I've got a > avar-tb/idx-rename-race-4 which fixes it. Sorry about that. > > https://github.com/git/git/compare/master...avar:avar-tb/idx-rename-race-4 Yep, I noticed the same thing while running make test on the series before sending it out. Your fix matches mine (which is to move the change from 7/8 to 2/8). While I was fixing the second patch, I inserted a new one just before it to store the hash in a `unsigned char hash[GIT_MAX_RAWSZ]` instead of a `struct object_id`. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 0/9] packfile: avoid .idx rename races 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau ` (6 preceding siblings ...) 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau @ 2021-09-09 3:24 ` Taylor Blau 2021-09-09 3:24 ` [PATCH 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau ` (10 more replies) 7 siblings, 11 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster This series is a unification of [1] and [2] which together prevent a handful of races when code that writes packfiles moves the `.idx` into place before all other auxiliary files are properly renamed. This can lead to races like not reading the `.rev` file even if one was generated and is about to be moved into place. Credit goes to Ævar for preparing what is more-or-less sent here. I polished a few of the commit messages, and added the second patch on top of his result. It isn't necessary, but felt like good hygiene when I was reading the surrounding code. Thanks in advance for reviewing. [1]: https://lore.kernel.org/git/cover.1631139433.git.me@ttaylorr.com/ [2]: https://lore.kernel.org/git/cover-0.3-00000000000-20210907T193600Z-avarab@gmail.com/ Taylor Blau (4): bulk-checkin.c: store checksum directly pack-write.c: rename `.idx` files after `*.rev` builtin/repack.c: move `.idx` files into place last builtin/index-pack.c: move `.idx` files into place last Ævar Arnfjörð Bjarmason (5): pack.h: line-wrap the definition of finish_tmp_packfile() pack-write: refactor renaming in finish_tmp_packfile() index-pack: refactor renaming in final() pack-write: split up finish_tmp_packfile() function pack-objects: rename .idx files into place after .bitmap files builtin/index-pack.c | 48 +++++++++++++++++------------------ builtin/pack-objects.c | 15 ++++++++--- builtin/repack.c | 2 +- bulk-checkin.c | 31 +++++++++++++++++------ pack-write.c | 57 +++++++++++++++++++++--------------------- pack.h | 10 +++++++- 6 files changed, 96 insertions(+), 67 deletions(-) -- 2.33.0.96.g73915697e6 ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau @ 2021-09-09 3:24 ` Taylor Blau 2021-09-09 3:24 ` [PATCH 2/9] bulk-checkin.c: store checksum directly Taylor Blau ` (9 subsequent siblings) 10 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Line-wrap the definition of finish_tmp_packfile() to make subsequent changes easier to read. See 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28) for the commit that introduced this overly long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/pack.h b/pack.h index fa13954526..1c17254c0a 100644 --- a/pack.h +++ b/pack.h @@ -110,6 +110,11 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +void finish_tmp_packfile(struct strbuf *name_buffer, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char sha1[]); #endif -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH 2/9] bulk-checkin.c: store checksum directly 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau 2021-09-09 3:24 ` [PATCH 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau @ 2021-09-09 3:24 ` Taylor Blau 2021-09-09 7:38 ` Ævar Arnfjörð Bjarmason 2021-09-09 3:24 ` [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau ` (8 subsequent siblings) 10 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster finish_bulk_checkin() stores the checksum from finalize_hashfile() by writing to the `hash` member of `struct object_id`, but that hash has nothing to do with an object id (it's just a convenient location that happens to be sized correctly). Store the hash directly in an unsigned char array. This behaves the same as writing to the `hash` member, but makes the intent clearer (and avoids allocating an extra four bytes for the `algo` member of `struct object_id`). Signed-off-by: Taylor Blau <me@ttaylorr.com> --- bulk-checkin.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/bulk-checkin.c b/bulk-checkin.c index b023d9959a..6283bc8bd9 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -25,7 +25,7 @@ static struct bulk_checkin_state { static void finish_bulk_checkin(struct bulk_checkin_state *state) { - struct object_id oid; + unsigned char hash[GIT_MAX_RAWSZ]; struct strbuf packname = STRBUF_INIT; int i; @@ -37,11 +37,11 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) unlink(state->pack_tmp_name); goto clear_exit; } else if (state->nr_written == 1) { - finalize_hashfile(state->f, oid.hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); + finalize_hashfile(state->f, hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); } else { - int fd = finalize_hashfile(state->f, oid.hash, 0); - fixup_pack_header_footer(fd, oid.hash, state->pack_tmp_name, - state->nr_written, oid.hash, + int fd = finalize_hashfile(state->f, hash, 0); + fixup_pack_header_footer(fd, hash, state->pack_tmp_name, + state->nr_written, hash, state->offset); close(fd); } @@ -49,7 +49,7 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); finish_tmp_packfile(&packname, state->pack_tmp_name, state->written, state->nr_written, - &state->pack_idx_opts, oid.hash); + &state->pack_idx_opts, hash); for (i = 0; i < state->nr_written; i++) free(state->written[i]); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 2/9] bulk-checkin.c: store checksum directly 2021-09-09 3:24 ` [PATCH 2/9] bulk-checkin.c: store checksum directly Taylor Blau @ 2021-09-09 7:38 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 7:38 UTC (permalink / raw) To: Taylor Blau; +Cc: git, gitster, brian m. carlson On Wed, Sep 08 2021, Taylor Blau wrote: > finish_bulk_checkin() stores the checksum from finalize_hashfile() by > writing to the `hash` member of `struct object_id`, but that hash has > nothing to do with an object id (it's just a convenient location that > happens to be sized correctly). > > Store the hash directly in an unsigned char array. This behaves the same > as writing to the `hash` member, but makes the intent clearer (and > avoids allocating an extra four bytes for the `algo` member of `struct > object_id`). > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > bulk-checkin.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/bulk-checkin.c b/bulk-checkin.c > index b023d9959a..6283bc8bd9 100644 > --- a/bulk-checkin.c > +++ b/bulk-checkin.c > @@ -25,7 +25,7 @@ static struct bulk_checkin_state { > > static void finish_bulk_checkin(struct bulk_checkin_state *state) > { > - struct object_id oid; > + unsigned char hash[GIT_MAX_RAWSZ]; > struct strbuf packname = STRBUF_INIT; > int i; > > @@ -37,11 +37,11 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) > unlink(state->pack_tmp_name); > goto clear_exit; > } else if (state->nr_written == 1) { > - finalize_hashfile(state->f, oid.hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); > + finalize_hashfile(state->f, hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); > } else { > - int fd = finalize_hashfile(state->f, oid.hash, 0); > - fixup_pack_header_footer(fd, oid.hash, state->pack_tmp_name, > - state->nr_written, oid.hash, > + int fd = finalize_hashfile(state->f, hash, 0); > + fixup_pack_header_footer(fd, hash, state->pack_tmp_name, > + state->nr_written, hash, > state->offset); > close(fd); > } > @@ -49,7 +49,7 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) > strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); > finish_tmp_packfile(&packname, state->pack_tmp_name, > state->written, state->nr_written, > - &state->pack_idx_opts, oid.hash); > + &state->pack_idx_opts, hash); > for (i = 0; i < state->nr_written; i++) > free(state->written[i]); [This commit looks good, nothing needs to be fixed here, just a digression]: This is a good change, FWIW I came up with in my version, and then ended up ejecting it to resist the urge to go on some general cleanup spree, I guess we can just fix *this* one :) Anyway, I agree that this should be in & having it is good. The code pattern being fixed here is more insidious than your commit message describes though, since fa33c3aae23 (bulk-checkin.c: convert to use struct object_id, 2015-03-13) when this was converted from "unsigned char sha1[20]" to "struct object_id" we've had a landmine waiting to be stepped on here. The "oid" automatic variable will get initialized to some garbage, we then write to just the "hash" member, but leave the "algo" member in some undefined state. Thus when you e.g. call oid_to_hex(&oid) here you might get a segfault, e.g. on my box because in hex.c we'll try to access &hash_algos[oid->algo], where oid->algo happens to be set to the garbage -9872. If somebody's interested in some follow-up cleanup renaming the "hash" member to say "hash2" in hash.h, and compiling with "make -k" will find all the sites that are using it directly, I looked in detail at a few, and many can either be converted to just use "struct object_id" without playing around with "oid.hash", and some share the bug/landmine being fixed here. ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau 2021-09-09 3:24 ` [PATCH 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau 2021-09-09 3:24 ` [PATCH 2/9] bulk-checkin.c: store checksum directly Taylor Blau @ 2021-09-09 3:24 ` Taylor Blau 2021-09-09 19:29 ` Junio C Hamano 2021-09-09 3:24 ` [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau ` (7 subsequent siblings) 10 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Refactor the renaming in finish_tmp_packfile() into a helper function. The callers are now expected to pass a "name_buffer" ending in "pack-OID." instead of the previous "pack-", we then append "pack", "idx" or "rev" to it. By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the buffer and avoid the repeated allocations we'd get if that function had its own temporary "struct strbuf". This approach of reusing the buffer does make the last user in pack-object.c's write_pack_file() slightly awkward, since we needlessly do a strbuf_setlen() before calling strbuf_release() for consistency. In subsequent changes we'll move that bitmap writing code around, so let's not skip the strbuf_setlen() now. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03), which in turn was a minimal adjustment of pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 7 +++++-- bulk-checkin.c | 3 ++- pack-write.c | 37 ++++++++++++++++--------------------- 3 files changed, 23 insertions(+), 24 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..2a105c8d6e 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1237,7 +1237,8 @@ static void write_pack_file(void) warning_errno(_("failed utime() on %s"), pack_tmp_name); } - strbuf_addf(&tmpname, "%s-", base_name); + strbuf_addf(&tmpname, "%s-%s.", base_name, + hash_to_hex(hash)); if (write_bitmap_index) { bitmap_writer_set_checksum(hash); @@ -1250,8 +1251,9 @@ static void write_pack_file(void) &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); + size_t tmpname_len = tmpname.len; + strbuf_addstr(&tmpname, "bitmap"); stop_progress(&progress_state); bitmap_writer_show_progress(progress); @@ -1260,6 +1262,7 @@ static void write_pack_file(void) bitmap_writer_finish(written_list, nr_written, tmpname.buf, write_bitmap_options); write_bitmap_index = 0; + strbuf_setlen(&tmpname, tmpname_len); } strbuf_release(&tmpname); diff --git a/bulk-checkin.c b/bulk-checkin.c index 6283bc8bd9..c19d471f0b 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -46,7 +46,8 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) close(fd); } - strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); + strbuf_addf(&packname, "%s/pack/pack-%s.", get_object_directory(), + hash_to_hex(hash)); finish_tmp_packfile(&packname, state->pack_tmp_name, state->written, state->nr_written, &state->pack_idx_opts, hash); diff --git a/pack-write.c b/pack-write.c index 2767b78619..95b063be94 100644 --- a/pack-write.c +++ b/pack-write.c @@ -458,6 +458,18 @@ struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } +static void rename_tmp_packfile(struct strbuf *nb, const char *source, + const char *ext) +{ + size_t nb_len = nb->len; + + strbuf_addstr(nb, ext); + if (rename(source, nb->buf)) + die_errno("unable to rename temporary '*.%s' file to '%s'", + ext, nb->buf); + strbuf_setlen(nb, nb_len); +} + void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, @@ -466,7 +478,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, unsigned char hash[]) { const char *idx_tmp_name, *rev_tmp_name = NULL; - int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -479,26 +490,10 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); - strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); - - if (rename(pack_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary pack file"); - - strbuf_setlen(name_buffer, basename_len); - - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); - - if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary reverse-index file"); - } - - strbuf_setlen(name_buffer, basename_len); + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); + if (rev_tmp_name) + rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); free((void *)idx_tmp_name); } -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 3:24 ` [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau @ 2021-09-09 19:29 ` Junio C Hamano 2021-09-09 21:07 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:29 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > @@ -1250,8 +1251,9 @@ static void write_pack_file(void) > &pack_idx_opts, hash); > > if (write_bitmap_index) { > - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); > + size_t tmpname_len = tmpname.len; > > + strbuf_addstr(&tmpname, "bitmap"); > stop_progress(&progress_state); > > bitmap_writer_show_progress(progress); > @@ -1260,6 +1262,7 @@ static void write_pack_file(void) > bitmap_writer_finish(written_list, nr_written, > tmpname.buf, write_bitmap_options); > write_bitmap_index = 0; > + strbuf_setlen(&tmpname, tmpname_len); > } > > strbuf_release(&tmpname); This runs setlen on tmpname strbuf and immediately releases (the close brace before release closes the "if (write_bitmap_index)", not any loop. If we plan to insert more code to use tmpname where the blank line we see above is in the future, it may make sense, but even in such a case, adding setlen() only when it starts to matter may make it easier to follow. In any case, the above correctly adjusts tmpname to have a <hash> plus '.' at the end of tmpname to call finish_tmp_packfile(). > diff --git a/bulk-checkin.c b/bulk-checkin.c > index 6283bc8bd9..c19d471f0b 100644 > --- a/bulk-checkin.c > +++ b/bulk-checkin.c > @@ -46,7 +46,8 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) > close(fd); > } > > - strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); > + strbuf_addf(&packname, "%s/pack/pack-%s.", get_object_directory(), > + hash_to_hex(hash)); > finish_tmp_packfile(&packname, state->pack_tmp_name, > state->written, state->nr_written, > &state->pack_idx_opts, hash); OK. > diff --git a/pack-write.c b/pack-write.c > index 2767b78619..95b063be94 100644 > --- a/pack-write.c > +++ b/pack-write.c > @@ -458,6 +458,18 @@ struct hashfile *create_tmp_packfile(char **pack_tmp_name) > return hashfd(fd, *pack_tmp_name); > } > > +static void rename_tmp_packfile(struct strbuf *nb, const char *source, > + const char *ext) > +{ > + size_t nb_len = nb->len; > + > + strbuf_addstr(nb, ext); > + if (rename(source, nb->buf)) > + die_errno("unable to rename temporary '*.%s' file to '%s'", > + ext, nb->buf); I do not like '*.%s' here. Without '*' it is clear enough, and because nb->buf has already the same ext information, unable to rename temporary file to '%s'. would be even simpler without losing any information than this rewrite does. The original explicitly used the more human-facing terms like "pack file", so we are losing information by this refactoring, but the loss is not too bad. In one case, namely ".idx" files, it is even better, as the original says "temporary index file" to refer to the new .idx file, which makes it ambiguous with _the_ index file. > + strbuf_setlen(nb, nb_len); > +} In the longer term if we were to add more auxiliary files next to each of the .pack file, it makes perfect sense that the common prefix is fed to rename_tmp_packfile() and the function reverts contents of the prefix buffer back when it is done with it. But it would be more friendly to those adding more calls to this function in the future to document the convention in a comment before the function. Also, the name "nb" would need rethinking. As far as the callers are concerned, that is not a full name, but they are supplying the common prefix to the function. Perhaps "struct strbuf *name_prefix" or soemthing might be better? I dunno. > void finish_tmp_packfile(struct strbuf *name_buffer, > const char *pack_tmp_name, > struct pack_idx_entry **written_list, > @@ -466,7 +478,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, > unsigned char hash[]) > { > const char *idx_tmp_name, *rev_tmp_name = NULL; > - int basename_len = name_buffer->len; > > if (adjust_shared_perm(pack_tmp_name)) > die_errno("unable to make temporary pack file readable"); > @@ -479,26 +490,10 @@ void finish_tmp_packfile(struct strbuf *name_buffer, > rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, > pack_idx_opts->flags); It is unfortunate that write_idx_file() and write_rev_file() take hash and come up with the temporary filename on their own (which means their resulting filenames may not share the same prefix as .pack and .idx files), but it is just the naming of temporaries and inconsistencies among them is, eh, temporary, so it is OK. > + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); > + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); > + if (rev_tmp_name) > + rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); > > free((void *)idx_tmp_name); > } ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 19:29 ` Junio C Hamano @ 2021-09-09 21:07 ` Taylor Blau 2021-09-09 23:30 ` Junio C Hamano 2021-09-10 1:29 ` Junio C Hamano 0 siblings, 2 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 21:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, avarab On Thu, Sep 09, 2021 at 12:29:01PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > @@ -1250,8 +1251,9 @@ static void write_pack_file(void) > > &pack_idx_opts, hash); > > > > if (write_bitmap_index) { > > - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); > > + size_t tmpname_len = tmpname.len; > > > > + strbuf_addstr(&tmpname, "bitmap"); > > stop_progress(&progress_state); > > > > bitmap_writer_show_progress(progress); > > @@ -1260,6 +1262,7 @@ static void write_pack_file(void) > > bitmap_writer_finish(written_list, nr_written, > > tmpname.buf, write_bitmap_options); > > write_bitmap_index = 0; > > + strbuf_setlen(&tmpname, tmpname_len); > > } > > > > strbuf_release(&tmpname); > > This runs setlen on tmpname strbuf and immediately releases (the > close brace before release closes the "if (write_bitmap_index)", not > any loop. If we plan to insert more code to use tmpname where the > blank line we see above is in the future, it may make sense, but > even in such a case, adding setlen() only when it starts to matter > may make it easier to follow. > > In any case, the above correctly adjusts tmpname to have a <hash> > plus '.' at the end of tmpname to call finish_tmp_packfile(). Ævar makes a slight mention of this in their commit message: This approach of reusing the buffer does make the last user in pack-object.c's write_pack_file() slightly awkward, since we needlessly do a strbuf_setlen() before calling strbuf_release() for consistency. In subsequent changes we'll move that bitmap writing code around, so let's not skip the strbuf_setlen() now. And this is to set us up for the final patch which moves the call to rename_tmp_packfile_idx(&tmpname, &idx_tmp_name) after the closing brace in the quoted portion of your message and the strbuf_release(). There, we'll want the strbuf reset to the appropriate contents and length, and not released. But in the meantime it is awkward. > I do not like '*.%s' here. Without '*' it is clear enough, and > because nb->buf has already the same ext information, > > [...] But it would be more friendly to those adding more calls to this > function in the future to document the convention in a comment before > the function. > > Also, the name "nb" would need rethinking. As far as the callers > are concerned, that is not a full name, but they are supplying the > common prefix to the function. Perhaps "struct strbuf *name_prefix" > or soemthing might be better? I dunno. In each of these three, I wasn't able to decide if you wanted these addressed in a newer version of this series, or if you were happy enough with the result to pick it up. I'm happy to send you a new version, but don't want to clog your inbox if you were already planning on picking this up. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 21:07 ` Taylor Blau @ 2021-09-09 23:30 ` Junio C Hamano 2021-09-09 23:31 ` Taylor Blau 2021-09-10 1:29 ` Junio C Hamano 1 sibling, 1 reply; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 23:30 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > In each of these three, I wasn't able to decide if you wanted these > addressed in a newer version of this series, or if you were happy enough > with the result to pick it up. I'm happy to send you a new version, but > don't want to clog your inbox if you were already planning on picking > this up. Well, it is often a no-cost operation to replace a topic that has been in 'seen' with a newer round, so you do not have to worry about my inbox. As I often say, if it turns out to be a bad idea, I can just drop it from 'seen' as if I didn't see it ;-) Anyway. If a newer version will come, I'd love to see the review comments at least considered (be it from me or from anybody else) --- after all, if the original were good enough, reviewer(s) wouldn't have raised them as potential issues. Of course, "considered" is the key word. No need to blindly follow; as long as there is a solid reason to justify why changing would be worsening the well-written original. Thanks. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 23:30 ` Junio C Hamano @ 2021-09-09 23:31 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: Taylor Blau, git, avarab On Thu, Sep 09, 2021 at 04:30:38PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > In each of these three, I wasn't able to decide if you wanted these > > addressed in a newer version of this series, or if you were happy enough > > with the result to pick it up. I'm happy to send you a new version, but > > don't want to clog your inbox if you were already planning on picking > > this up. > > Well, it is often a no-cost operation to replace a topic that has > been in 'seen' with a newer round, so you do not have to worry about > my inbox. As I often say, if it turns out to be a bad idea, I can > just drop it from 'seen' as if I didn't see it ;-) > > Anyway. > > If a newer version will come, I'd love to see the review comments at > least considered (be it from me or from anybody else) --- after all, > if the original were good enough, reviewer(s) wouldn't have raised > them as potential issues. > > Of course, "considered" is the key word. No need to blindly follow; > as long as there is a solid reason to justify why changing would be > worsening the well-written original. Agreed strongly :-). After reading through your review again, I felt like the individual components were greater than the sum of their parts, and that the v2 of this combined series is on the whole better than v1. Thanks for taking the time to review it. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 21:07 ` Taylor Blau 2021-09-09 23:30 ` Junio C Hamano @ 2021-09-10 1:29 ` Junio C Hamano 1 sibling, 0 replies; 81+ messages in thread From: Junio C Hamano @ 2021-09-10 1:29 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > Ævar makes a slight mention of this in their commit message: > > This approach of reusing the buffer does make the last user in > pack-object.c's write_pack_file() slightly awkward, since we > needlessly do a strbuf_setlen() before calling strbuf_release() for > consistency. In subsequent changes we'll move that bitmap writing > code around, so let's not skip the strbuf_setlen() now. > > And this is to set us up for the final patch which moves the call to > rename_tmp_packfile_idx(&tmpname, &idx_tmp_name) after the closing brace > in the quoted portion of your message and the strbuf_release(). There, > we'll want the strbuf reset to the appropriate contents and length, and > not released. > > But in the meantime it is awkward. That is why I said "adding setlen only when t starts to matter may make it easier to follow". It won't make it awkward at this step, and it will make it stand out why it is needed when it is added, presumably at the very end, when rename_tmp_packfile_idx() call is added after this if() block. ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (2 preceding siblings ...) 2021-09-09 3:24 ` [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau @ 2021-09-09 3:24 ` Taylor Blau 2021-09-09 7:46 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:32 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau ` (6 subsequent siblings) 10 siblings, 2 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. Specifically, it moves the `.rev` file into place after the `.idx`, leaving open the possibility to open a pack which looks "ready" (because the `.idx` file exists and is readable) but appears momentarily to not have a `.rev` file. This causes Git to fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. This still leaves the issue of `.idx` files being renamed into place before the auxiliary `.bitmap` file is renamed when in pack-object.c's write_pack_file() "write_bitmap_index" is true. That race will be addressed in subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack-write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pack-write.c b/pack-write.c index 95b063be94..077710090e 100644 --- a/pack-write.c +++ b/pack-write.c @@ -491,9 +491,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, pack_idx_opts->flags); rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); - rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); free((void *)idx_tmp_name); } -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` 2021-09-09 3:24 ` [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau @ 2021-09-09 7:46 ` Ævar Arnfjörð Bjarmason 2021-09-09 14:37 ` Taylor Blau 2021-09-09 19:32 ` Junio C Hamano 1 sibling, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 7:46 UTC (permalink / raw) To: Taylor Blau; +Cc: git, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > [...] > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> git commit --amend --signed-off-by-harder ? :) I.e. the duplicate header entry can go, maybe something Junio will fix up while queuing... ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` 2021-09-09 7:46 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 14:37 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 14:37 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git, gitster On Thu, Sep 09, 2021 at 09:46:57AM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Wed, Sep 08 2021, Taylor Blau wrote: > > > [...] > > > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > > git commit --amend --signed-off-by-harder ? :) I don't think so, though I had to check myself before sending. Documentation/SubmittingPatches says: Notice that you can place your own `Signed-off-by` trailer when forwarding somebody else's patch with the above rules for D-C-O. Indeed you are encouraged to do so. Do not forget to place an in-body "From: " line at the beginning to properly attribute the change to its true author (see (2) above). Here it's awkward because you modified my original patch and then I sent your modified result out. But I think the s-o-b chain still makes sense: I wrote the original patch, and signed it off. Then you modified it, signing that off. Finally, I sent it to the list, signing off on your modification. > I.e. the duplicate header entry can go, maybe something Junio will fix > up while queuing... If I misread SubmittingPatches then I'm happy to fix it myself, but this was the intended outcome. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` 2021-09-09 3:24 ` [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau 2021-09-09 7:46 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 19:32 ` Junio C Hamano 1 sibling, 0 replies; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:32 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > This still leaves the issue of `.idx` files being renamed into place > before the auxiliary `.bitmap` file is renamed when in pack-object.c's > write_pack_file() "write_bitmap_index" is true. That race will be > addressed in subsequent commits. OK. I was about to suggest s/after .*rev/last/ on the title, but the above makes it clear that we are not ready for such a title. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > pack-write.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/pack-write.c b/pack-write.c > index 95b063be94..077710090e 100644 > --- a/pack-write.c > +++ b/pack-write.c > @@ -491,9 +491,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, > pack_idx_opts->flags); > > rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); > - rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); > if (rev_tmp_name) > rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); > + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); > > free((void *)idx_tmp_name); > } ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 5/9] builtin/repack.c: move `.idx` files into place last 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (3 preceding siblings ...) 2021-09-09 3:24 ` [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau @ 2021-09-09 3:25 ` Taylor Blau 2021-09-09 19:38 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 6/9] index-pack: refactor renaming in final() Taylor Blau ` (5 subsequent siblings) 10 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/repack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..c3e4771609 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,10 +208,10 @@ static struct { unsigned optional:1; } exts[] = { {".pack"}, - {".idx"}, {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, + {".idx"}, }; static unsigned populate_pack_exts(char *name) -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 5/9] builtin/repack.c: move `.idx` files into place last 2021-09-09 3:25 ` [PATCH 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 19:38 ` Junio C Hamano 2021-09-09 21:08 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:38 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > In a similar spirit as the previous patch, fix the identical problem > from `git repack` (which invokes `pack-objects` with a temporary > location for output, and then moves the files into their final locations > itself). OK. This array is used in cmd_repack() to call rename() in the order in which its elements appear, so moving the entry to the last means it will be renamed the last. > Signed-off-by: Taylor Blau <me@ttaylorr.com> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > builtin/repack.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/builtin/repack.c b/builtin/repack.c > index 5f9bc74adc..c3e4771609 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -208,10 +208,10 @@ static struct { > unsigned optional:1; > } exts[] = { > {".pack"}, > - {".idx"}, > {".rev", 1}, > {".bitmap", 1}, > {".promisor", 1}, > + {".idx"}, > }; And the .idx entry MUST stay the last. I wonder if dropping the trailing comma and replace it with a "/* must be at the end */" comment would be an effective way to ensure that. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 5/9] builtin/repack.c: move `.idx` files into place last 2021-09-09 19:38 ` Junio C Hamano @ 2021-09-09 21:08 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 21:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, avarab On Thu, Sep 09, 2021 at 12:38:52PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > diff --git a/builtin/repack.c b/builtin/repack.c > > index 5f9bc74adc..c3e4771609 100644 > > --- a/builtin/repack.c > > +++ b/builtin/repack.c > > @@ -208,10 +208,10 @@ static struct { > > unsigned optional:1; > > } exts[] = { > > {".pack"}, > > - {".idx"}, > > {".rev", 1}, > > {".bitmap", 1}, > > {".promisor", 1}, > > + {".idx"}, > > }; > > And the .idx entry MUST stay the last. I wonder if dropping the > trailing comma and replace it with a "/* must be at the end */" > comment would be an effective way to ensure that. Heh, Ævar even wrote that up, and I responded that I did not think it helped much. Again, I'm happy to add it back if anybody feels strongly. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 6/9] index-pack: refactor renaming in final() 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (4 preceding siblings ...) 2021-09-09 3:25 ` [PATCH 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 3:25 ` Taylor Blau 2021-09-09 19:45 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau ` (4 subsequent siblings) 10 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Refactor the renaming in final() into a helper function, this is similar in spirit to a preceding refactoring of finish_tmp_packfile() in pack-write.c. Before e37d0b8730b (builtin/index-pack.c: write reverse indexes, 2021-01-25) it probably wasn't worth it to have this sort of helper, due to the differing "else if" case for "pack" files v.s. "idx" files. But since we've got "rev" as well now, let's do the renaming via a helper, this is both a net decrease in lines, and improves the readability, since we can easily see at a glance that the logic for writing these three types of files is exactly the same, aside from the obviously differing cases of "*final_xyz_name" being NULL, and "else_chmod_if" being different. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/index-pack.c | 48 +++++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 25 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 6cc4890217..cd4e85f5bb 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1477,6 +1477,22 @@ static void write_special_file(const char *suffix, const char *msg, strbuf_release(&name_buf); } +static void rename_tmp_packfile(const char **final_xyz_name, + const char *curr_xyz_name, + struct strbuf *xyz_name, unsigned char *hash, + const char *ext, int else_chmod_if) +{ + if (*final_xyz_name != curr_xyz_name) { + if (!*final_xyz_name) + *final_xyz_name = odb_pack_name(xyz_name, hash, ext); + if (finalize_object_file(curr_xyz_name, *final_xyz_name)) + die(_("unable to rename temporary '*.%s' file to '%s"), + ext, *final_xyz_name); + } else if (else_chmod_if) { + chmod(*final_xyz_name, 0444); + } +} + static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, const char *final_rev_index_name, const char *curr_rev_index_name, @@ -1505,31 +1521,13 @@ static void final(const char *final_pack_name, const char *curr_pack_name, write_special_file("promisor", promisor_msg, final_pack_name, hash, NULL); - if (final_pack_name != curr_pack_name) { - if (!final_pack_name) - final_pack_name = odb_pack_name(&pack_name, hash, "pack"); - if (finalize_object_file(curr_pack_name, final_pack_name)) - die(_("cannot store pack file")); - } else if (from_stdin) - chmod(final_pack_name, 0444); - - if (final_index_name != curr_index_name) { - if (!final_index_name) - final_index_name = odb_pack_name(&index_name, hash, "idx"); - if (finalize_object_file(curr_index_name, final_index_name)) - die(_("cannot store index file")); - } else - chmod(final_index_name, 0444); - - if (curr_rev_index_name) { - if (final_rev_index_name != curr_rev_index_name) { - if (!final_rev_index_name) - final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev"); - if (finalize_object_file(curr_rev_index_name, final_rev_index_name)) - die(_("cannot store reverse index file")); - } else - chmod(final_rev_index_name, 0444); - } + rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, + hash, "pack", from_stdin); + rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, + hash, "idx", 1); + if (curr_rev_index_name) + rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, + &rev_index_name, hash, "rev", 1); if (do_fsck_object) { struct packed_git *p; -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 6/9] index-pack: refactor renaming in final() 2021-09-09 3:25 ` [PATCH 6/9] index-pack: refactor renaming in final() Taylor Blau @ 2021-09-09 19:45 ` Junio C Hamano 2021-09-09 21:11 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:45 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > But since we've got "rev" as well now, let's do the renaming via a > helper, this is both a net decrease in lines, and improves the > readability,... xyz_ may be cute, but is distracting. I do not think it loses any information if we used final_name, current_name, etc.; it may make the result even easier to read. > +static void rename_tmp_packfile(const char **final_xyz_name, > + const char *curr_xyz_name, > + struct strbuf *xyz_name, unsigned char *hash, > + const char *ext, int else_chmod_if) > +{ > + if (*final_xyz_name != curr_xyz_name) { > + if (!*final_xyz_name) > + *final_xyz_name = odb_pack_name(xyz_name, hash, ext); > + if (finalize_object_file(curr_xyz_name, *final_xyz_name)) > + die(_("unable to rename temporary '*.%s' file to '%s"), > + ext, *final_xyz_name); > + } else if (else_chmod_if) { > + chmod(*final_xyz_name, 0444); > + } > +} "else_chmod_if" is unclear. The caller must be aware of what 'else' refers to and anybody adding a new caller is forced to go look at the body of this helper. I think chmod (or "make_read_only") happens only when the current one already has the final name, so perhaps that is what the name should highlight? make-read-only-if-same or somesuch? Thanks. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 6/9] index-pack: refactor renaming in final() 2021-09-09 19:45 ` Junio C Hamano @ 2021-09-09 21:11 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 21:11 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, avarab On Thu, Sep 09, 2021 at 12:45:04PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > But since we've got "rev" as well now, let's do the renaming via a > > helper, this is both a net decrease in lines, and improves the > > readability,... > > xyz_ may be cute, but is distracting. I do not think it loses any > information if we used final_name, current_name, etc.; it may make > the result even easier to read. > > [...] > > make-read-only-if-same or somesuch? Both of these suggestions (dropping the "xyz"s and renaming the last parameter to "make_read_only_if_same") make sense. Will apply. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (5 preceding siblings ...) 2021-09-09 3:25 ` [PATCH 6/9] index-pack: refactor renaming in final() Taylor Blau @ 2021-09-09 3:25 ` Taylor Blau 2021-09-09 7:52 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:45 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau ` (3 subsequent siblings) 10 siblings, 2 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster In a similar spirit as preceding patches to `git repack` and `git pack-objects`, fix the identical problem in `git index-pack`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/index-pack.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index cd4e85f5bb..bf294d9083 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1523,11 +1523,11 @@ static void final(const char *final_pack_name, const char *curr_pack_name, rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, hash, "pack", from_stdin); - rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, - hash, "idx", 1); if (curr_rev_index_name) rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, &rev_index_name, hash, "rev", 1); + rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, + hash, "idx", 1); if (do_fsck_object) { struct packed_git *p; -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last 2021-09-09 3:25 ` [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 7:52 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:45 ` Junio C Hamano 1 sibling, 0 replies; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 7:52 UTC (permalink / raw) To: Taylor Blau; +Cc: git, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > Signed-off-by: Taylor Blau <me@ttaylorr.com> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> Ditto the comment on 4/9 about duplicate Signed-off-by headers, looks good otherwise! ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last 2021-09-09 3:25 ` [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau 2021-09-09 7:52 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 19:45 ` Junio C Hamano 1 sibling, 0 replies; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:45 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > In a similar spirit as preceding patches to `git repack` and `git > pack-objects`, fix the identical problem in `git index-pack`. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > builtin/index-pack.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > index cd4e85f5bb..bf294d9083 100644 > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -1523,11 +1523,11 @@ static void final(const char *final_pack_name, const char *curr_pack_name, > > rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, > hash, "pack", from_stdin); > - rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, > - hash, "idx", 1); > if (curr_rev_index_name) > rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, > &rev_index_name, hash, "rev", 1); > + rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, > + hash, "idx", 1); Good. ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH 8/9] pack-write: split up finish_tmp_packfile() function 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (6 preceding siblings ...) 2021-09-09 3:25 ` [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 3:25 ` Taylor Blau 2021-09-09 3:25 ` [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau ` (2 subsequent siblings) 10 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c in preparation for moving the step of renaming the *.idx file later as part of a function change. Since the only other caller of finish_tmp_packfile() was in bulk-checkin.c, and it won't be needing a change to its *.idx renaming, provide a thin wrapper for the old function as a static function in that file. If other callers end up needing the simpler version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 7 +++++-- bulk-checkin.c | 16 ++++++++++++++++ pack-write.c | 22 +++++++++++++--------- pack.h | 7 +++++-- 4 files changed, 39 insertions(+), 13 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 2a105c8d6e..944134b6f2 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1217,6 +1217,7 @@ static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; struct strbuf tmpname = STRBUF_INIT; + char *idx_tmp_name = NULL; /* * Packs are runtime accessed in their mtime @@ -1246,9 +1247,10 @@ static void write_pack_file(void) &to_pack, written_list, nr_written); } - finish_tmp_packfile(&tmpname, pack_tmp_name, + stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, - &pack_idx_opts, hash); + &pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; @@ -1265,6 +1267,7 @@ static void write_pack_file(void) strbuf_setlen(&tmpname, tmpname_len); } + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); puts(hash_to_hex(hash)); diff --git a/bulk-checkin.c b/bulk-checkin.c index c19d471f0b..8785b2ac80 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -23,6 +23,22 @@ static struct bulk_checkin_state { uint32_t nr_written; } state; +static void finish_tmp_packfile(struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char hash[]) +{ + char *idx_tmp_name = NULL; + + stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written, + pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(basename, &idx_tmp_name); + + free(idx_tmp_name); +} + static void finish_bulk_checkin(struct bulk_checkin_state *state) { unsigned char hash[GIT_MAX_RAWSZ]; diff --git a/pack-write.c b/pack-write.c index 077710090e..d40d2ddd01 100644 --- a/pack-write.c +++ b/pack-write.c @@ -470,21 +470,28 @@ static void rename_tmp_packfile(struct strbuf *nb, const char *source, strbuf_setlen(nb, nb_len); } -void finish_tmp_packfile(struct strbuf *name_buffer, +void rename_tmp_packfile_idx(struct strbuf *name_buffer, + char **idx_tmp_name) +{ + rename_tmp_packfile(name_buffer, *idx_tmp_name, "idx"); +} + +void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char hash[]) + unsigned char hash[], + char **idx_tmp_name) { - const char *idx_tmp_name, *rev_tmp_name = NULL; + const char *rev_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); - idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); - if (adjust_shared_perm(idx_tmp_name)) + *idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written, + pack_idx_opts, hash); + if (adjust_shared_perm(*idx_tmp_name)) die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, @@ -493,9 +500,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); - rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); - - free((void *)idx_tmp_name); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index 1c17254c0a..b22bfc4a18 100644 --- a/pack.h +++ b/pack.h @@ -110,11 +110,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, +void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char sha1[]); + unsigned char hash[], + char **idx_tmp_name); +void rename_tmp_packfile_idx(struct strbuf *basename, + char **idx_tmp_name); #endif -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (7 preceding siblings ...) 2021-09-09 3:25 ` [PATCH 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau @ 2021-09-09 3:25 ` Taylor Blau 2021-09-09 7:54 ` Ævar Arnfjörð Bjarmason 2021-09-09 8:06 ` [PATCH 0/9] packfile: avoid .idx rename races Ævar Arnfjörð Bjarmason 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau 10 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 3:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> In preceding commits the race of renaming .idx files in place before .rev files and other auxiliary files was fixed in pack-write.c's finish_tmp_packfile(), builtin/repack.c's "struct exts", and builtin/index-pack.c's final(). As noted in the change to pack-write.c we left in place the issue of writing *.bitmap files after the *.idx, let's fix that issue. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that this commit and preceding ones only close any race condition with *.idx files being written before their auxiliary files if we're optimistic about our lack of fsync()-ing in this are not tripping us over. See the thread at [1] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). In particular, in this case of writing to ".git/objects/pack" we only write and fsync() the individual files, but if we wanted to guarantee that the metadata update was seen in that way by concurrent processes we'd need to fsync() on the "fd" of the containing directory. That concern is probably more theoretical than not, modern OS's tend to be more on the forgiving side than the overly pedantic side of implementing POSIX FS semantics. 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 944134b6f2..a01767a384 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1250,7 +1250,6 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; @@ -1267,6 +1266,8 @@ static void write_pack_file(void) strbuf_setlen(&tmpname, tmpname_len); } + rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files 2021-09-09 3:25 ` [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau @ 2021-09-09 7:54 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:52 ` Junio C Hamano 0 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 7:54 UTC (permalink / raw) To: Taylor Blau; +Cc: git, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > In preceding commits the race of renaming .idx files in place before > .rev files and other auxiliary files was fixed in pack-write.c's > finish_tmp_packfile(), builtin/repack.c's "struct exts", and > builtin/index-pack.c's final(). As noted in the change to pack-write.c > we left in place the issue of writing *.bitmap files after the *.idx, > let's fix that issue. > > See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) > for commentary at the time when *.bitmap was implemented about how > those files are written out, nothing in that commit contradicts what's > being done here. > > Note that this commit and preceding ones only close any race condition > with *.idx files being written before their auxiliary files if we're > optimistic about our lack of fsync()-ing in this are not tripping us > over. See the thread at [1] for a rabbit hole of various discussions > about filesystem races in the face of doing and not doing fsync() (and > if doing fsync(), not doing it properly). > > In particular, in this case of writing to ".git/objects/pack" we only > write and fsync() the individual files, but if we wanted to guarantee > that the metadata update was seen in that way by concurrent processes > we'd need to fsync() on the "fd" of the containing directory. That > concern is probably more theoretical than not, modern OS's tend to be > more on the forgiving side than the overly pedantic side of > implementing POSIX FS semantics. Some weird gramma/phrasing of mine left over here, i.e. the "[...] in this are not tripping us over.". On reflection perhaps it's better to replace these last two paragraphs with say: With this and preceding commits we've covered all the cases where we wrote another auxiliary file before the *.idx file, and should thus never have another concurrent process try to use an incomplete set of pack-OID.* files. This assumes that the renames we make here and elsewhere appear on the filesystem in the order we performed them. This assumption is known to be false in the face of pedantic POSIX FS semantics. See the thread around [1] for discussions on that point. We fsync() the file descriptors we're writing out for all the auxiliary files, but we don't yet fsync() the file descriptor for the containing directory. Therefore our data might have been written out, but it's anyone's guess what the state of the directory containing the file is after we write the *.idx. In practice modern OS's are known to be forgiving on that point, so this will probably solve races in practice for most users. It will almost certainly make them better than they were before when we didn't write *.idx files last. We should more generally improve our use of fsync() to cover containing directories, but that'll hopefully be addressed by some follow-up series. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files 2021-09-09 7:54 ` Ævar Arnfjörð Bjarmason @ 2021-09-09 19:52 ` Junio C Hamano 2021-09-09 21:13 ` Taylor Blau 0 siblings, 1 reply; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:52 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > We fsync() the file descriptors we're writing out for all the > auxiliary files, but we don't yet fsync() the file descriptor for > the containing directory. Therefore our data might have been written > out, but it's anyone's guess what the state of the directory > containing the file is after we write the *.idx. > > In practice modern OS's are known to be forgiving on that point, so > this will probably solve races in practice for most users. It will > almost certainly make them better than they were before when we > didn't write *.idx files last. We should more generally improve our > use of fsync() to cover containing directories, but that'll > hopefully be addressed by some follow-up series. I'd probably drop the last paragraph, and replace it with a single sentence "we may want to fsync the containing directory once after placing *.idx file in place, but it is outside of the scope of this series", if I were doing this series. Other than that (and I made a few comments on other patches), these were pleasant read. Thanks. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files 2021-09-09 19:52 ` Junio C Hamano @ 2021-09-09 21:13 ` Taylor Blau 0 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 21:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git On Thu, Sep 09, 2021 at 12:52:03PM -0700, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > > We fsync() the file descriptors we're writing out for all the > > auxiliary files, but we don't yet fsync() the file descriptor for > > the containing directory. Therefore our data might have been written > > out, but it's anyone's guess what the state of the directory > > containing the file is after we write the *.idx. > > > > In practice modern OS's are known to be forgiving on that point, so > > this will probably solve races in practice for most users. It will > > almost certainly make them better than they were before when we > > didn't write *.idx files last. We should more generally improve our > > use of fsync() to cover containing directories, but that'll > > hopefully be addressed by some follow-up series. > > I'd probably drop the last paragraph, and replace it with a single > sentence "we may want to fsync the containing directory once after > placing *.idx file in place, but it is outside of the scope of this > series", if I were doing this series. Yep, I think that this makes things clearer, and I prefer it to Ævar's suggestion (which conveys more information than I think is necessary here). Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/9] packfile: avoid .idx rename races 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (8 preceding siblings ...) 2021-09-09 3:25 ` [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau @ 2021-09-09 8:06 ` Ævar Arnfjörð Bjarmason 2021-09-09 14:40 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau 10 siblings, 1 reply; 81+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-09-09 8:06 UTC (permalink / raw) To: Taylor Blau; +Cc: git, gitster On Wed, Sep 08 2021, Taylor Blau wrote: > This series is a unification of [1] and [2] which together prevent a handful of > races when code that writes packfiles moves the `.idx` into place before all > other auxiliary files are properly renamed. > > This can lead to races like not reading the `.rev` file even if one was > generated and is about to be moved into place. > > Credit goes to Ævar for preparing what is more-or-less sent here. I polished a > few of the commit messages, and added the second patch on top of his result. It > isn't necessary, but felt like good hygiene when I was reading the surrounding > code. > > Thanks in advance for reviewing. Thanks a lot! I think it's probably redundant to note it at this point but I've given this a thorough review and it all looks good to me. I left some comments/musings on minor points in the series, but none of those IMO require a re-roll except perhaps the duplicate Signed-off-by headers, depending on what Junio thinks & if he's going to fix those while queuing. ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/9] packfile: avoid .idx rename races 2021-09-09 8:06 ` [PATCH 0/9] packfile: avoid .idx rename races Ævar Arnfjörð Bjarmason @ 2021-09-09 14:40 ` Taylor Blau 2021-09-09 19:52 ` Junio C Hamano 0 siblings, 1 reply; 81+ messages in thread From: Taylor Blau @ 2021-09-09 14:40 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git, gitster On Thu, Sep 09, 2021 at 10:06:24AM +0200, Ævar Arnfjörð Bjarmason wrote: > I think it's probably redundant to note it at this point but I've given > this a thorough review and it all looks good to me. I left some > comments/musings on minor points in the series, but none of those IMO > require a re-roll except perhaps the duplicate Signed-off-by headers, > depending on what Junio thinks & if he's going to fix those while > queuing. Thanks, and likewise your parts of this series look good to me (obviously, or I wouldn't have sent it otherwise ;)). The duplicate s-o-b's are intentional, but see my response in 2/9 (I'd link to it, but vger seems to be a little sluggish for me this morning) for why. If I made an error there, I'm happy to fix it up for Junio. Thanks, Taylor ^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [PATCH 0/9] packfile: avoid .idx rename races 2021-09-09 14:40 ` Taylor Blau @ 2021-09-09 19:52 ` Junio C Hamano 0 siblings, 0 replies; 81+ messages in thread From: Junio C Hamano @ 2021-09-09 19:52 UTC (permalink / raw) To: Taylor Blau; +Cc: Ævar Arnfjörð Bjarmason, git Taylor Blau <me@ttaylorr.com> writes: > The duplicate s-o-b's are intentional, but see my response in 2/9 (I'd > link to it, but vger seems to be a little sluggish for me this morning) Looks good to me. Thanks. ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 0/9] packfile: avoid .idx rename races 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau ` (9 preceding siblings ...) 2021-09-09 8:06 ` [PATCH 0/9] packfile: avoid .idx rename races Ævar Arnfjörð Bjarmason @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau ` (9 more replies) 10 siblings, 10 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster Here is a relatively small reroll of mine and Ævar's series to prevent packfile-related races when moving the `.idx` into place before other auxiliary files are renamed. All changes are cosmetic, but all of the feedback on the previous round was a strict improvement in the overall quality, so it seemed pertinent to send an improved version. Range-diff is below, and thanks in advance for your review. Taylor Blau (4): bulk-checkin.c: store checksum directly pack-write.c: rename `.idx` files after `*.rev` builtin/repack.c: move `.idx` files into place last builtin/index-pack.c: move `.idx` files into place last Ævar Arnfjörð Bjarmason (5): pack.h: line-wrap the definition of finish_tmp_packfile() pack-write: refactor renaming in finish_tmp_packfile() index-pack: refactor renaming in final() pack-write: split up finish_tmp_packfile() function pack-objects: rename .idx files into place after .bitmap files builtin/index-pack.c | 48 +++++++++++++++++------------------ builtin/pack-objects.c | 15 ++++++++--- builtin/repack.c | 2 +- bulk-checkin.c | 31 +++++++++++++++++------ pack-write.c | 57 +++++++++++++++++++++--------------------- pack.h | 10 +++++++- 6 files changed, 96 insertions(+), 67 deletions(-) Range-diff against v1: -: ---------- > 1: 0b07aa4947 pack.h: line-wrap the definition of finish_tmp_packfile() 1: 20b35ce050 ! 2: c46d3c29b4 bulk-checkin.c: store checksum directly @@ Commit message Store the hash directly in an unsigned char array. This behaves the same as writing to the `hash` member, but makes the intent clearer (and avoids allocating an extra four bytes for the `algo` member of `struct - object_id`). + object_id`). It likewise prevents the possibility of a segfault when + reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized. Signed-off-by: Taylor Blau <me@ttaylorr.com> 2: 35052ef494 ! 3: 2e907e309d pack-write: refactor renaming in finish_tmp_packfile() @@ pack-write.c: struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } -+static void rename_tmp_packfile(struct strbuf *nb, const char *source, ++static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, + const char *ext) +{ -+ size_t nb_len = nb->len; ++ size_t name_prefix_len = name_prefix->len; + -+ strbuf_addstr(nb, ext); -+ if (rename(source, nb->buf)) -+ die_errno("unable to rename temporary '*.%s' file to '%s'", -+ ext, nb->buf); -+ strbuf_setlen(nb, nb_len); ++ strbuf_addstr(name_prefix, ext); ++ if (rename(source, name_prefix->buf)) ++ die_errno("unable to rename temporary file to '%s'", ++ name_prefix->buf); ++ strbuf_setlen(name_prefix, name_prefix_len); +} + void finish_tmp_packfile(struct strbuf *name_buffer, 3: 0fb2c25f5a = 4: 41d34b1f18 pack-write.c: rename `.idx` files after `*.rev` 4: 3b10a97ec0 = 5: 6b340b7eba builtin/repack.c: move `.idx` files into place last 5: 3c9b515907 ! 6: 1e4d0ea0f3 index-pack: refactor renaming in final() @@ Commit message helper, this is both a net decrease in lines, and improves the readability, since we can easily see at a glance that the logic for writing these three types of files is exactly the same, aside from the - obviously differing cases of "*final_xyz_name" being NULL, and - "else_chmod_if" being different. + obviously differing cases of "*final_name" being NULL, and + "make_read_only_if_same" being different. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> @@ builtin/index-pack.c: static void write_special_file(const char *suffix, const c strbuf_release(&name_buf); } -+static void rename_tmp_packfile(const char **final_xyz_name, -+ const char *curr_xyz_name, -+ struct strbuf *xyz_name, unsigned char *hash, -+ const char *ext, int else_chmod_if) ++static void rename_tmp_packfile(const char **final_name, ++ const char *curr_name, ++ struct strbuf *name, unsigned char *hash, ++ const char *ext, int make_read_only_if_same) +{ -+ if (*final_xyz_name != curr_xyz_name) { -+ if (!*final_xyz_name) -+ *final_xyz_name = odb_pack_name(xyz_name, hash, ext); -+ if (finalize_object_file(curr_xyz_name, *final_xyz_name)) ++ if (*final_name != curr_name) { ++ if (!*final_name) ++ *final_name = odb_pack_name(name, hash, ext); ++ if (finalize_object_file(curr_name, *final_name)) + die(_("unable to rename temporary '*.%s' file to '%s"), -+ ext, *final_xyz_name); -+ } else if (else_chmod_if) { -+ chmod(*final_xyz_name, 0444); ++ ext, *final_name); ++ } else if (make_read_only_if_same) { ++ chmod(*final_name, 0444); + } +} + 6: 8d67a71501 = 7: 906e75d707 builtin/index-pack.c: move `.idx` files into place last 7: 5c553229b0 ! 8: 90bebe4e51 pack-write: split up finish_tmp_packfile() function @@ bulk-checkin.c: static struct bulk_checkin_state { unsigned char hash[GIT_MAX_RAWSZ]; ## pack-write.c ## -@@ pack-write.c: static void rename_tmp_packfile(struct strbuf *nb, const char *source, - strbuf_setlen(nb, nb_len); +@@ pack-write.c: static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, + strbuf_setlen(name_prefix, name_prefix_len); } -void finish_tmp_packfile(struct strbuf *name_buffer, 8: d8286cf107 ! 9: 1409725509 pack-objects: rename .idx files into place after .bitmap files @@ Commit message about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). - In particular, in this case of writing to ".git/objects/pack" we only - write and fsync() the individual files, but if we wanted to guarantee - that the metadata update was seen in that way by concurrent processes - we'd need to fsync() on the "fd" of the containing directory. That - concern is probably more theoretical than not, modern OS's tend to be - more on the forgiving side than the overly pedantic side of - implementing POSIX FS semantics. + We may want to fsync the containing directory once after renaming the + *.idx file into place, but that is outside the scope of this series. 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ -- 2.33.0.96.g73915697e6 ^ permalink raw reply [flat|nested] 81+ messages in thread
* [PATCH v2 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 2/9] bulk-checkin.c: store checksum directly Taylor Blau ` (8 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Line-wrap the definition of finish_tmp_packfile() to make subsequent changes easier to read. See 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28) for the commit that introduced this overly long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/pack.h b/pack.h index fa13954526..1c17254c0a 100644 --- a/pack.h +++ b/pack.h @@ -110,6 +110,11 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +void finish_tmp_packfile(struct strbuf *name_buffer, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char sha1[]); #endif -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 2/9] bulk-checkin.c: store checksum directly 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau 2021-09-09 23:24 ` [PATCH v2 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau ` (7 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster finish_bulk_checkin() stores the checksum from finalize_hashfile() by writing to the `hash` member of `struct object_id`, but that hash has nothing to do with an object id (it's just a convenient location that happens to be sized correctly). Store the hash directly in an unsigned char array. This behaves the same as writing to the `hash` member, but makes the intent clearer (and avoids allocating an extra four bytes for the `algo` member of `struct object_id`). It likewise prevents the possibility of a segfault when reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- bulk-checkin.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/bulk-checkin.c b/bulk-checkin.c index b023d9959a..6283bc8bd9 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -25,7 +25,7 @@ static struct bulk_checkin_state { static void finish_bulk_checkin(struct bulk_checkin_state *state) { - struct object_id oid; + unsigned char hash[GIT_MAX_RAWSZ]; struct strbuf packname = STRBUF_INIT; int i; @@ -37,11 +37,11 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) unlink(state->pack_tmp_name); goto clear_exit; } else if (state->nr_written == 1) { - finalize_hashfile(state->f, oid.hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); + finalize_hashfile(state->f, hash, CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE); } else { - int fd = finalize_hashfile(state->f, oid.hash, 0); - fixup_pack_header_footer(fd, oid.hash, state->pack_tmp_name, - state->nr_written, oid.hash, + int fd = finalize_hashfile(state->f, hash, 0); + fixup_pack_header_footer(fd, hash, state->pack_tmp_name, + state->nr_written, hash, state->offset); close(fd); } @@ -49,7 +49,7 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); finish_tmp_packfile(&packname, state->pack_tmp_name, state->written, state->nr_written, - &state->pack_idx_opts, oid.hash); + &state->pack_idx_opts, hash); for (i = 0; i < state->nr_written; i++) free(state->written[i]); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 3/9] pack-write: refactor renaming in finish_tmp_packfile() 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau 2021-09-09 23:24 ` [PATCH v2 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau 2021-09-09 23:24 ` [PATCH v2 2/9] bulk-checkin.c: store checksum directly Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau ` (6 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Refactor the renaming in finish_tmp_packfile() into a helper function. The callers are now expected to pass a "name_buffer" ending in "pack-OID." instead of the previous "pack-", we then append "pack", "idx" or "rev" to it. By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the buffer and avoid the repeated allocations we'd get if that function had its own temporary "struct strbuf". This approach of reusing the buffer does make the last user in pack-object.c's write_pack_file() slightly awkward, since we needlessly do a strbuf_setlen() before calling strbuf_release() for consistency. In subsequent changes we'll move that bitmap writing code around, so let's not skip the strbuf_setlen() now. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03), which in turn was a minimal adjustment of pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 7 +++++-- bulk-checkin.c | 3 ++- pack-write.c | 37 ++++++++++++++++--------------------- 3 files changed, 23 insertions(+), 24 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..2a105c8d6e 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1237,7 +1237,8 @@ static void write_pack_file(void) warning_errno(_("failed utime() on %s"), pack_tmp_name); } - strbuf_addf(&tmpname, "%s-", base_name); + strbuf_addf(&tmpname, "%s-%s.", base_name, + hash_to_hex(hash)); if (write_bitmap_index) { bitmap_writer_set_checksum(hash); @@ -1250,8 +1251,9 @@ static void write_pack_file(void) &pack_idx_opts, hash); if (write_bitmap_index) { - strbuf_addf(&tmpname, "%s.bitmap", hash_to_hex(hash)); + size_t tmpname_len = tmpname.len; + strbuf_addstr(&tmpname, "bitmap"); stop_progress(&progress_state); bitmap_writer_show_progress(progress); @@ -1260,6 +1262,7 @@ static void write_pack_file(void) bitmap_writer_finish(written_list, nr_written, tmpname.buf, write_bitmap_options); write_bitmap_index = 0; + strbuf_setlen(&tmpname, tmpname_len); } strbuf_release(&tmpname); diff --git a/bulk-checkin.c b/bulk-checkin.c index 6283bc8bd9..c19d471f0b 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -46,7 +46,8 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state) close(fd); } - strbuf_addf(&packname, "%s/pack/pack-", get_object_directory()); + strbuf_addf(&packname, "%s/pack/pack-%s.", get_object_directory(), + hash_to_hex(hash)); finish_tmp_packfile(&packname, state->pack_tmp_name, state->written, state->nr_written, &state->pack_idx_opts, hash); diff --git a/pack-write.c b/pack-write.c index 2767b78619..58bc5fbcdf 100644 --- a/pack-write.c +++ b/pack-write.c @@ -458,6 +458,18 @@ struct hashfile *create_tmp_packfile(char **pack_tmp_name) return hashfd(fd, *pack_tmp_name); } +static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, + const char *ext) +{ + size_t name_prefix_len = name_prefix->len; + + strbuf_addstr(name_prefix, ext); + if (rename(source, name_prefix->buf)) + die_errno("unable to rename temporary file to '%s'", + name_prefix->buf); + strbuf_setlen(name_prefix, name_prefix_len); +} + void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, @@ -466,7 +478,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, unsigned char hash[]) { const char *idx_tmp_name, *rev_tmp_name = NULL; - int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -479,26 +490,10 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); - strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); - - if (rename(pack_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary pack file"); - - strbuf_setlen(name_buffer, basename_len); - - strbuf_addf(name_buffer, "%s.idx", hash_to_hex(hash)); - if (rename(idx_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary index file"); - - strbuf_setlen(name_buffer, basename_len); - - if (rev_tmp_name) { - strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); - if (rename(rev_tmp_name, name_buffer->buf)) - die_errno("unable to rename temporary reverse-index file"); - } - - strbuf_setlen(name_buffer, basename_len); + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); + if (rev_tmp_name) + rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); free((void *)idx_tmp_name); } -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 4/9] pack-write.c: rename `.idx` files after `*.rev` 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (2 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau ` (5 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. Specifically, it moves the `.rev` file into place after the `.idx`, leaving open the possibility to open a pack which looks "ready" (because the `.idx` file exists and is readable) but appears momentarily to not have a `.rev` file. This causes Git to fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. This still leaves the issue of `.idx` files being renamed into place before the auxiliary `.bitmap` file is renamed when in pack-object.c's write_pack_file() "write_bitmap_index" is true. That race will be addressed in subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack-write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pack-write.c b/pack-write.c index 58bc5fbcdf..b9f9cd5c14 100644 --- a/pack-write.c +++ b/pack-write.c @@ -491,9 +491,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, pack_idx_opts->flags); rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); - rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); + rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); free((void *)idx_tmp_name); } -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 5/9] builtin/repack.c: move `.idx` files into place last 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (3 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 6/9] index-pack: refactor renaming in final() Taylor Blau ` (4 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/repack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..c3e4771609 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,10 +208,10 @@ static struct { unsigned optional:1; } exts[] = { {".pack"}, - {".idx"}, {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, + {".idx"}, }; static unsigned populate_pack_exts(char *name) -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 6/9] index-pack: refactor renaming in final() 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (4 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau ` (3 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Refactor the renaming in final() into a helper function, this is similar in spirit to a preceding refactoring of finish_tmp_packfile() in pack-write.c. Before e37d0b8730b (builtin/index-pack.c: write reverse indexes, 2021-01-25) it probably wasn't worth it to have this sort of helper, due to the differing "else if" case for "pack" files v.s. "idx" files. But since we've got "rev" as well now, let's do the renaming via a helper, this is both a net decrease in lines, and improves the readability, since we can easily see at a glance that the logic for writing these three types of files is exactly the same, aside from the obviously differing cases of "*final_name" being NULL, and "make_read_only_if_same" being different. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/index-pack.c | 48 +++++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 25 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 6cc4890217..3e3736cd95 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1477,6 +1477,22 @@ static void write_special_file(const char *suffix, const char *msg, strbuf_release(&name_buf); } +static void rename_tmp_packfile(const char **final_name, + const char *curr_name, + struct strbuf *name, unsigned char *hash, + const char *ext, int make_read_only_if_same) +{ + if (*final_name != curr_name) { + if (!*final_name) + *final_name = odb_pack_name(name, hash, ext); + if (finalize_object_file(curr_name, *final_name)) + die(_("unable to rename temporary '*.%s' file to '%s"), + ext, *final_name); + } else if (make_read_only_if_same) { + chmod(*final_name, 0444); + } +} + static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, const char *final_rev_index_name, const char *curr_rev_index_name, @@ -1505,31 +1521,13 @@ static void final(const char *final_pack_name, const char *curr_pack_name, write_special_file("promisor", promisor_msg, final_pack_name, hash, NULL); - if (final_pack_name != curr_pack_name) { - if (!final_pack_name) - final_pack_name = odb_pack_name(&pack_name, hash, "pack"); - if (finalize_object_file(curr_pack_name, final_pack_name)) - die(_("cannot store pack file")); - } else if (from_stdin) - chmod(final_pack_name, 0444); - - if (final_index_name != curr_index_name) { - if (!final_index_name) - final_index_name = odb_pack_name(&index_name, hash, "idx"); - if (finalize_object_file(curr_index_name, final_index_name)) - die(_("cannot store index file")); - } else - chmod(final_index_name, 0444); - - if (curr_rev_index_name) { - if (final_rev_index_name != curr_rev_index_name) { - if (!final_rev_index_name) - final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev"); - if (finalize_object_file(curr_rev_index_name, final_rev_index_name)) - die(_("cannot store reverse index file")); - } else - chmod(final_rev_index_name, 0444); - } + rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, + hash, "pack", from_stdin); + rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, + hash, "idx", 1); + if (curr_rev_index_name) + rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, + &rev_index_name, hash, "rev", 1); if (do_fsck_object) { struct packed_git *p; -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 7/9] builtin/index-pack.c: move `.idx` files into place last 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (5 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 6/9] index-pack: refactor renaming in final() Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau ` (2 subsequent siblings) 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster In a similar spirit as preceding patches to `git repack` and `git pack-objects`, fix the identical problem in `git index-pack`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/index-pack.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 3e3736cd95..f267dce49e 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1523,11 +1523,11 @@ static void final(const char *final_pack_name, const char *curr_pack_name, rename_tmp_packfile(&final_pack_name, curr_pack_name, &pack_name, hash, "pack", from_stdin); - rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, - hash, "idx", 1); if (curr_rev_index_name) rename_tmp_packfile(&final_rev_index_name, curr_rev_index_name, &rev_index_name, hash, "rev", 1); + rename_tmp_packfile(&final_index_name, curr_index_name, &index_name, + hash, "idx", 1); if (do_fsck_object) { struct packed_git *p; -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 8/9] pack-write: split up finish_tmp_packfile() function 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (6 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau @ 2021-09-09 23:24 ` Taylor Blau 2021-09-09 23:25 ` [PATCH v2 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau 2021-09-10 1:35 ` [PATCH v2 0/9] packfile: avoid .idx rename races Junio C Hamano 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:24 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c in preparation for moving the step of renaming the *.idx file later as part of a function change. Since the only other caller of finish_tmp_packfile() was in bulk-checkin.c, and it won't be needing a change to its *.idx renaming, provide a thin wrapper for the old function as a static function in that file. If other callers end up needing the simpler version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 7 +++++-- bulk-checkin.c | 16 ++++++++++++++++ pack-write.c | 22 +++++++++++++--------- pack.h | 7 +++++-- 4 files changed, 39 insertions(+), 13 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 2a105c8d6e..944134b6f2 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1217,6 +1217,7 @@ static void write_pack_file(void) if (!pack_to_stdout) { struct stat st; struct strbuf tmpname = STRBUF_INIT; + char *idx_tmp_name = NULL; /* * Packs are runtime accessed in their mtime @@ -1246,9 +1247,10 @@ static void write_pack_file(void) &to_pack, written_list, nr_written); } - finish_tmp_packfile(&tmpname, pack_tmp_name, + stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, - &pack_idx_opts, hash); + &pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; @@ -1265,6 +1267,7 @@ static void write_pack_file(void) strbuf_setlen(&tmpname, tmpname_len); } + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); puts(hash_to_hex(hash)); diff --git a/bulk-checkin.c b/bulk-checkin.c index c19d471f0b..8785b2ac80 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -23,6 +23,22 @@ static struct bulk_checkin_state { uint32_t nr_written; } state; +static void finish_tmp_packfile(struct strbuf *basename, + const char *pack_tmp_name, + struct pack_idx_entry **written_list, + uint32_t nr_written, + struct pack_idx_option *pack_idx_opts, + unsigned char hash[]) +{ + char *idx_tmp_name = NULL; + + stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written, + pack_idx_opts, hash, &idx_tmp_name); + rename_tmp_packfile_idx(basename, &idx_tmp_name); + + free(idx_tmp_name); +} + static void finish_bulk_checkin(struct bulk_checkin_state *state) { unsigned char hash[GIT_MAX_RAWSZ]; diff --git a/pack-write.c b/pack-write.c index b9f9cd5c14..c21aed64b3 100644 --- a/pack-write.c +++ b/pack-write.c @@ -470,21 +470,28 @@ static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, strbuf_setlen(name_prefix, name_prefix_len); } -void finish_tmp_packfile(struct strbuf *name_buffer, +void rename_tmp_packfile_idx(struct strbuf *name_buffer, + char **idx_tmp_name) +{ + rename_tmp_packfile(name_buffer, *idx_tmp_name, "idx"); +} + +void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char hash[]) + unsigned char hash[], + char **idx_tmp_name) { - const char *idx_tmp_name, *rev_tmp_name = NULL; + const char *rev_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); - idx_tmp_name = write_idx_file(NULL, written_list, nr_written, - pack_idx_opts, hash); - if (adjust_shared_perm(idx_tmp_name)) + *idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written, + pack_idx_opts, hash); + if (adjust_shared_perm(*idx_tmp_name)) die_errno("unable to make temporary index file readable"); rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, @@ -493,9 +500,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); - rename_tmp_packfile(name_buffer, idx_tmp_name, "idx"); - - free((void *)idx_tmp_name); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index 1c17254c0a..b22bfc4a18 100644 --- a/pack.h +++ b/pack.h @@ -110,11 +110,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len, int read_pack_header(int fd, struct pack_header *); struct hashfile *create_tmp_packfile(char **pack_tmp_name); -void finish_tmp_packfile(struct strbuf *name_buffer, +void stage_tmp_packfiles(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, - unsigned char sha1[]); + unsigned char hash[], + char **idx_tmp_name); +void rename_tmp_packfile_idx(struct strbuf *basename, + char **idx_tmp_name); #endif -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* [PATCH v2 9/9] pack-objects: rename .idx files into place after .bitmap files 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (7 preceding siblings ...) 2021-09-09 23:24 ` [PATCH v2 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau @ 2021-09-09 23:25 ` Taylor Blau 2021-09-10 1:35 ` [PATCH v2 0/9] packfile: avoid .idx rename races Junio C Hamano 9 siblings, 0 replies; 81+ messages in thread From: Taylor Blau @ 2021-09-09 23:25 UTC (permalink / raw) To: git; +Cc: avarab, gitster From: Ævar Arnfjörð Bjarmason <avarab@gmail.com> In preceding commits the race of renaming .idx files in place before .rev files and other auxiliary files was fixed in pack-write.c's finish_tmp_packfile(), builtin/repack.c's "struct exts", and builtin/index-pack.c's final(). As noted in the change to pack-write.c we left in place the issue of writing *.bitmap files after the *.idx, let's fix that issue. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when *.bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that this commit and preceding ones only close any race condition with *.idx files being written before their auxiliary files if we're optimistic about our lack of fsync()-ing in this are not tripping us over. See the thread at [1] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). We may want to fsync the containing directory once after renaming the *.idx file into place, but that is outside the scope of this series. 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 944134b6f2..a01767a384 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1250,7 +1250,6 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, &pack_idx_opts, hash, &idx_tmp_name); - rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; @@ -1267,6 +1266,8 @@ static void write_pack_file(void) strbuf_setlen(&tmpname, tmpname_len); } + rename_tmp_packfile_idx(&tmpname, &idx_tmp_name); + free(idx_tmp_name); strbuf_release(&tmpname); free(pack_tmp_name); -- 2.33.0.96.g73915697e6 ^ permalink raw reply related [flat|nested] 81+ messages in thread
* Re: [PATCH v2 0/9] packfile: avoid .idx rename races 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau ` (8 preceding siblings ...) 2021-09-09 23:25 ` [PATCH v2 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau @ 2021-09-10 1:35 ` Junio C Hamano 9 siblings, 0 replies; 81+ messages in thread From: Junio C Hamano @ 2021-09-10 1:35 UTC (permalink / raw) To: Taylor Blau; +Cc: git, avarab Taylor Blau <me@ttaylorr.com> writes: > Here is a relatively small reroll of mine and Ævar's series to prevent > packfile-related races when moving the `.idx` into place before other auxiliary > files are renamed. Will replace. I didn't see anything questionable in range-diff (of course, lack of anything is hard to spot but at least there is no new iffy stuff and there wasn't anything iffy in the original, so ... ;-) ^ permalink raw reply [flat|nested] 81+ messages in thread
end of thread, other threads:[~2021-09-10 1:35 UTC | newest] Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-09-01 2:05 [PATCH 0/2] pack-write,repack: prevent opening packs too early Taylor Blau 2021-09-01 2:06 ` [PATCH 1/2] pack-write.c: rename `.idx` file into place last Taylor Blau 2021-09-01 2:06 ` [PATCH 2/2] builtin/repack.c: move `.idx` files " Taylor Blau 2021-09-01 3:53 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Jeff King 2021-09-01 4:29 ` Taylor Blau 2021-09-01 4:59 ` Jeff King 2021-09-01 5:12 ` Taylor Blau 2021-09-01 6:08 ` Jeff King 2021-09-01 21:40 ` Taylor Blau 2021-09-07 16:07 ` Jeff King 2021-09-07 19:42 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 1/3] pack-write: use more idiomatic strbuf usage for packname construction Ævar Arnfjörð Bjarmason 2021-09-07 22:21 ` Taylor Blau 2021-09-07 23:22 ` Ævar Arnfjörð Bjarmason 2021-09-07 19:42 ` [PATCH 2/3] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason 2021-09-07 22:28 ` Taylor Blau 2021-09-07 19:42 ` [PATCH 3/3] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 2021-09-07 22:31 ` Taylor Blau 2021-09-07 22:36 ` [PATCH 0/3] rename *.idx file into place last (also after *.bitmap) Taylor Blau 2021-09-07 19:48 ` [PATCH 0/2] pack-write,repack: prevent opening packs too early Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 0/4] rename *.idx file into place last (also after *.bitmap) Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 1/4] pack.h: line-wrap the definition of finish_tmp_packfile() Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 2/4] pack-write: refactor renaming in finish_tmp_packfile() Ævar Arnfjörð Bjarmason 2021-09-08 4:22 ` Taylor Blau 2021-09-08 0:38 ` [PATCH v2 3/4] pack-write: split up finish_tmp_packfile() function Ævar Arnfjörð Bjarmason 2021-09-08 0:38 ` [PATCH v2 4/4] pack-write: rename *.idx file into place last (really!) Ævar Arnfjörð Bjarmason 2021-09-08 1:14 ` Ævar Arnfjörð Bjarmason 2021-09-08 9:18 ` Ævar Arnfjörð Bjarmason 2021-09-08 4:24 ` Taylor Blau 2021-09-08 22:17 ` [PATCH v2 0/3] prevent opening packs too early Taylor Blau 2021-09-08 22:17 ` [PATCH v2 1/3] pack-write.c: rename `.idx` files into place last Taylor Blau 2021-09-08 22:17 ` [PATCH v2 2/3] builtin/repack.c: move " Taylor Blau 2021-09-08 22:17 ` [PATCH v2 3/3] builtin/index-pack.c: " Taylor Blau 2021-09-08 23:52 ` [PATCH v2 0/3] prevent opening packs too early Ævar Arnfjörð Bjarmason 2021-09-09 0:50 ` Ævar Arnfjörð Bjarmason 2021-09-09 1:13 ` Taylor Blau 2021-09-09 1:33 ` Ævar Arnfjörð Bjarmason 2021-09-09 2:36 ` Ævar Arnfjörð Bjarmason 2021-09-09 2:49 ` Taylor Blau 2021-09-09 3:24 ` [PATCH 0/9] packfile: avoid .idx rename races Taylor Blau 2021-09-09 3:24 ` [PATCH 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau 2021-09-09 3:24 ` [PATCH 2/9] bulk-checkin.c: store checksum directly Taylor Blau 2021-09-09 7:38 ` Ævar Arnfjörð Bjarmason 2021-09-09 3:24 ` [PATCH 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau 2021-09-09 19:29 ` Junio C Hamano 2021-09-09 21:07 ` Taylor Blau 2021-09-09 23:30 ` Junio C Hamano 2021-09-09 23:31 ` Taylor Blau 2021-09-10 1:29 ` Junio C Hamano 2021-09-09 3:24 ` [PATCH 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau 2021-09-09 7:46 ` Ævar Arnfjörð Bjarmason 2021-09-09 14:37 ` Taylor Blau 2021-09-09 19:32 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau 2021-09-09 19:38 ` Junio C Hamano 2021-09-09 21:08 ` Taylor Blau 2021-09-09 3:25 ` [PATCH 6/9] index-pack: refactor renaming in final() Taylor Blau 2021-09-09 19:45 ` Junio C Hamano 2021-09-09 21:11 ` Taylor Blau 2021-09-09 3:25 ` [PATCH 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau 2021-09-09 7:52 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:45 ` Junio C Hamano 2021-09-09 3:25 ` [PATCH 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau 2021-09-09 3:25 ` [PATCH 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau 2021-09-09 7:54 ` Ævar Arnfjörð Bjarmason 2021-09-09 19:52 ` Junio C Hamano 2021-09-09 21:13 ` Taylor Blau 2021-09-09 8:06 ` [PATCH 0/9] packfile: avoid .idx rename races Ævar Arnfjörð Bjarmason 2021-09-09 14:40 ` Taylor Blau 2021-09-09 19:52 ` Junio C Hamano 2021-09-09 23:24 ` [PATCH v2 " Taylor Blau 2021-09-09 23:24 ` [PATCH v2 1/9] pack.h: line-wrap the definition of finish_tmp_packfile() Taylor Blau 2021-09-09 23:24 ` [PATCH v2 2/9] bulk-checkin.c: store checksum directly Taylor Blau 2021-09-09 23:24 ` [PATCH v2 3/9] pack-write: refactor renaming in finish_tmp_packfile() Taylor Blau 2021-09-09 23:24 ` [PATCH v2 4/9] pack-write.c: rename `.idx` files after `*.rev` Taylor Blau 2021-09-09 23:24 ` [PATCH v2 5/9] builtin/repack.c: move `.idx` files into place last Taylor Blau 2021-09-09 23:24 ` [PATCH v2 6/9] index-pack: refactor renaming in final() Taylor Blau 2021-09-09 23:24 ` [PATCH v2 7/9] builtin/index-pack.c: move `.idx` files into place last Taylor Blau 2021-09-09 23:24 ` [PATCH v2 8/9] pack-write: split up finish_tmp_packfile() function Taylor Blau 2021-09-09 23:25 ` [PATCH v2 9/9] pack-objects: rename .idx files into place after .bitmap files Taylor Blau 2021-09-10 1:35 ` [PATCH v2 0/9] packfile: avoid .idx rename races Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.