git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups
@ 2024-05-14 19:56 Taylor Blau
  2024-05-14 19:56 ` [PATCH 1/6] object.h: add flags allocated by pack-bitmap.h Taylor Blau
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

This topic was born out of tb/pseudo-merge-bitmaps, which has a few
quality-of-life improvements in the pack-bitmap machinery.

The main changes are to remove the static 'struct bitmap_writer', drop
one unused fields, and start using another unused one (see "move
commit_positions into commit_pos fields").

There are other smaller changes, too, like cleaning up the flag
allocation table in object.h, as well as introducing a new
bitmap_writer_free() function.

The next round of tb/pseudo-merge-bitmaps will be based on this branch.

Thanks in advance for your review!

Taylor Blau (6):
  object.h: add flags allocated by pack-bitmap.h
  pack-bitmap-write.c: move commit_positions into commit_pos fields
  pack-bitmap: avoid use of static `bitmap_writer`
  pack-bitmap: drop unused `max_bitmaps` parameter
  pack-bitmap-write.c: avoid uninitialized 'write_as' field
  pack-bitmap: introduce `bitmap_writer_free()`

 builtin/pack-objects.c |  19 +++-
 midx-write.c           |  17 ++-
 object.h               |   1 +
 pack-bitmap-write.c    | 248 +++++++++++++++++++++--------------------
 pack-bitmap.h          |  38 +++++--
 5 files changed, 185 insertions(+), 138 deletions(-)


base-commit: 83f1add914c6b4682de1e944ec0d1ac043d53d78
-- 
2.45.1.151.g7cc3499008c

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] object.h: add flags allocated by pack-bitmap.h
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
@ 2024-05-14 19:56 ` Taylor Blau
  2024-05-14 19:56 ` [PATCH 2/6] pack-bitmap-write.c: move commit_positions into commit_pos fields Taylor Blau
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

In commit 7cc8f971085 (pack-objects: implement bitmap writing,
2013-12-21) the NEEDS_BITMAP flag was introduced into pack-bitmap.h, but
no object flags allocation table existed at the time.

In 208acbfb82f (object.h: centralize object flag allocation, 2014-03-25)
when that table was first introduced, we never added the flags from
7cc8f971085, which has remained the case since.

Rectify this by including the flag bit used by pack-bitmap.h into the
centralized table in object.h.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 object.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/object.h b/object.h
index 9293e703cc..99b9c8f114 100644
--- a/object.h
+++ b/object.h
@@ -81,6 +81,7 @@ void object_array_init(struct object_array *array);
  * reflog.c:                           10--12
  * builtin/show-branch.c:    0-------------------------------------------26
  * builtin/unpack-objects.c:                                 2021
+ * pack-bitmap.h:                                                22
  */
 #define FLAG_BITS  28
 
-- 
2.45.1.151.g7cc3499008c


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] pack-bitmap-write.c: move commit_positions into commit_pos fields
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
  2024-05-14 19:56 ` [PATCH 1/6] object.h: add flags allocated by pack-bitmap.h Taylor Blau
@ 2024-05-14 19:56 ` Taylor Blau
  2024-05-14 19:56 ` [PATCH 3/6] pack-bitmap: avoid use of static `bitmap_writer` Taylor Blau
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c | 41 ++++++++++++++++-------------------------
 1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c6c8f94cc5..2ae82b8696 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -670,9 +670,7 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 	return &index[pos]->oid;
 }
 
-static void write_selected_commits_v1(struct hashfile *f,
-				      uint32_t *commit_positions,
-				      off_t *offsets)
+static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
 {
 	int i;
 
@@ -682,7 +680,7 @@ static void write_selected_commits_v1(struct hashfile *f,
 		if (offsets)
 			offsets[i] = hashfile_total(f);
 
-		hashwrite_be32(f, commit_positions[i]);
+		hashwrite_be32(f, stored->commit_pos);
 		hashwrite_u8(f, stored->xor_offset);
 		hashwrite_u8(f, stored->flags);
 
@@ -690,23 +688,20 @@ static void write_selected_commits_v1(struct hashfile *f,
 	}
 }
 
-static int table_cmp(const void *_va, const void *_vb, void *_data)
+static int table_cmp(const void *_va, const void *_vb)
 {
-	uint32_t *commit_positions = _data;
-	uint32_t a = commit_positions[*(uint32_t *)_va];
-	uint32_t b = commit_positions[*(uint32_t *)_vb];
+	struct bitmapped_commit *a = &writer.selected[*(uint32_t *)_va];
+	struct bitmapped_commit *b = &writer.selected[*(uint32_t *)_vb];
 
-	if (a > b)
+	if (a->commit_pos < b->commit_pos)
+		return -1;
+	else if (a->commit_pos > b->commit_pos)
 		return 1;
-	else if (a < b)
-		return -1;
 
 	return 0;
 }
 
-static void write_lookup_table(struct hashfile *f,
-			       uint32_t *commit_positions,
-			       off_t *offsets)
+static void write_lookup_table(struct hashfile *f, off_t *offsets)
 {
 	uint32_t i;
 	uint32_t *table, *table_inv;
@@ -722,7 +717,7 @@ static void write_lookup_table(struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
+	QSORT(table, writer.selected_nr, table_cmp);
 
 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
@@ -753,7 +748,7 @@ static void write_lookup_table(struct hashfile *f,
 			xor_row = 0xffffffff;
 		}
 
-		hashwrite_be32(f, commit_positions[table[i]]);
+		hashwrite_be32(f, writer.selected[table[i]].commit_pos);
 		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
 		hashwrite_be32(f, xor_row);
 	}
@@ -789,7 +784,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	static uint16_t flags = BITMAP_OPT_FULL_DAG;
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
-	uint32_t *commit_positions = NULL;
 	off_t *offsets = NULL;
 	uint32_t i;
 
@@ -814,22 +808,20 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);
 
-	ALLOC_ARRAY(commit_positions, writer.selected_nr);
-
 	for (i = 0; i < writer.selected_nr; i++) {
 		struct bitmapped_commit *stored = &writer.selected[i];
-		int commit_pos = oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
+		int commit_pos = oid_pos(&stored->commit->object.oid, index,
+					 index_nr, oid_access);
 
 		if (commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-
-		commit_positions[i] = commit_pos;
+		stored->commit_pos = commit_pos;
 	}
 
-	write_selected_commits_v1(f, commit_positions, offsets);
+	write_selected_commits_v1(f, offsets);
 
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
-		write_lookup_table(f, commit_positions, offsets);
+		write_lookup_table(f, offsets);
 
 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
@@ -844,6 +836,5 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		die_errno("unable to rename temporary bitmap file to '%s'", filename);
 
 	strbuf_release(&tmp_file);
-	free(commit_positions);
 	free(offsets);
 }
-- 
2.45.1.151.g7cc3499008c


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] pack-bitmap: avoid use of static `bitmap_writer`
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
  2024-05-14 19:56 ` [PATCH 1/6] object.h: add flags allocated by pack-bitmap.h Taylor Blau
  2024-05-14 19:56 ` [PATCH 2/6] pack-bitmap-write.c: move commit_positions into commit_pos fields Taylor Blau
@ 2024-05-14 19:56 ` Taylor Blau
  2024-05-14 19:57 ` [PATCH 4/6] pack-bitmap: drop unused `max_bitmaps` parameter Taylor Blau
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

The pack-bitmap machinery uses a structure called 'bitmap_writer' to
collect the data necessary to write out .bitmap files. Since its
introduction in 7cc8f971085 (pack-objects: implement bitmap writing,
2013-12-21), there has been a single static bitmap_writer structure,
which is responsible for all bitmap writing-related operations.

In practice, this is OK, since we are only ever writing a single .bitmap
file in a single process (e.g., `git multi-pack-index write --bitmap`,
`git pack-objects --write-bitmap-index`, `git repack -b`, etc.).

However, having a single static variable makes issues like data
ownership unclear, when to free variables, what has/hasn't been
initialized unclear.

Refactor this code to be written in terms of a given bitmap_writer
structure instead of relying on a static global.

Note that this exposes the structure definition of the bitmap_writer at
the pack-bitmap.h level. We could work around this by, e.g., forcing
callers to declare their writers as:

    struct bitmap_writer *writer;
    bitmap_writer_init(&bitmap_writer);

and then declaring `bitmap_writer_init()` as taking in a double-pointer
like so:

    void bitmap_writer_init(struct bitmap_writer **writer);

which would avoid us having to expose the definition of the structure
itself. This patch takes a different approach, since future patches
(like for the ongoing pseudo-merge bitmaps work) will want to modify the
innards of this structure (in the previous example, via pseudo-merge.c).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/pack-objects.c |  19 ++--
 midx-write.c           |  16 ++--
 pack-bitmap-write.c    | 209 +++++++++++++++++++++--------------------
 pack-bitmap.h          |  38 ++++++--
 4 files changed, 159 insertions(+), 123 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index baf0090fc8..ba4c93d241 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1245,6 +1245,7 @@ static void write_pack_file(void)
 	uint32_t nr_remaining = nr_result;
 	time_t last_mtime = 0;
 	struct object_entry **write_order;
+	struct bitmap_writer bitmap_writer;
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress(_("Writing objects"), nr_result);
@@ -1339,8 +1340,9 @@ static void write_pack_file(void)
 				    hash_to_hex(hash));
 
 			if (write_bitmap_index) {
-				bitmap_writer_set_checksum(hash);
-				bitmap_writer_build_type_index(
+				bitmap_writer_init(&bitmap_writer);
+				bitmap_writer_set_checksum(&bitmap_writer, hash);
+				bitmap_writer_build_type_index(&bitmap_writer,
 					&to_pack, written_list, nr_written);
 			}
 
@@ -1358,11 +1360,16 @@ static void write_pack_file(void)
 				strbuf_addstr(&tmpname, "bitmap");
 				stop_progress(&progress_state);
 
-				bitmap_writer_show_progress(progress);
-				bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1);
-				if (bitmap_writer_build(&to_pack) < 0)
+				bitmap_writer_show_progress(&bitmap_writer,
+							    progress);
+				bitmap_writer_select_commits(&bitmap_writer,
+							     indexed_commits,
+							     indexed_commits_nr,
+							     -1);
+				if (bitmap_writer_build(&bitmap_writer, &to_pack) < 0)
 					die(_("failed to write bitmap index"));
-				bitmap_writer_finish(written_list, nr_written,
+				bitmap_writer_finish(&bitmap_writer,
+						     written_list, nr_written,
 						     tmpname.buf, write_bitmap_options);
 				write_bitmap_index = 0;
 				strbuf_setlen(&tmpname, tmpname_len);
diff --git a/midx-write.c b/midx-write.c
index 65e69d2de7..5fdc8f2ff5 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -798,6 +798,7 @@ static int write_midx_bitmap(const char *midx_name,
 {
 	int ret, i;
 	uint16_t options = 0;
+	struct bitmap_writer writer;
 	struct pack_idx_entry **index;
 	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name,
 					hash_to_hex(midx_hash));
@@ -819,8 +820,10 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[i] = &pdata->objects[i].idx;
 
-	bitmap_writer_show_progress(flags & MIDX_PROGRESS);
-	bitmap_writer_build_type_index(pdata, index, pdata->nr_objects);
+	bitmap_writer_init(&writer);
+	bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
+	bitmap_writer_build_type_index(&writer, pdata, index,
+				       pdata->nr_objects);
 
 	/*
 	 * bitmap_writer_finish expects objects in lex order, but pack_order
@@ -838,13 +841,14 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[pack_order[i]] = &pdata->objects[i].idx;
 
-	bitmap_writer_select_commits(commits, commits_nr, -1);
-	ret = bitmap_writer_build(pdata);
+	bitmap_writer_select_commits(&writer, commits, commits_nr, -1);
+	ret = bitmap_writer_build(&writer, pdata);
 	if (ret < 0)
 		goto cleanup;
 
-	bitmap_writer_set_checksum(midx_hash);
-	bitmap_writer_finish(index, pdata->nr_objects, bitmap_name, options);
+	bitmap_writer_set_checksum(&writer, midx_hash);
+	bitmap_writer_finish(&writer, index, pdata->nr_objects, bitmap_name,
+			     options);
 
 cleanup:
 	free(index);
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 2ae82b8696..e22fa70745 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -27,43 +27,30 @@ struct bitmapped_commit {
 	uint32_t commit_pos;
 };
 
-struct bitmap_writer {
-	struct ewah_bitmap *commits;
-	struct ewah_bitmap *trees;
-	struct ewah_bitmap *blobs;
-	struct ewah_bitmap *tags;
-
-	kh_oid_map_t *bitmaps;
-	struct packing_data *to_pack;
-
-	struct bitmapped_commit *selected;
-	unsigned int selected_nr, selected_alloc;
-
-	struct progress *progress;
-	int show_progress;
-	unsigned char pack_checksum[GIT_MAX_RAWSZ];
-};
-
-static struct bitmap_writer writer;
+void bitmap_writer_init(struct bitmap_writer *writer)
+{
+	memset(writer, 0, sizeof(struct bitmap_writer));
+}
 
-void bitmap_writer_show_progress(int show)
+void bitmap_writer_show_progress(struct bitmap_writer *writer, int show)
 {
-	writer.show_progress = show;
+	writer->show_progress = show;
 }
 
 /**
  * Build the initial type index for the packfile or multi-pack-index
  */
-void bitmap_writer_build_type_index(struct packing_data *to_pack,
+void bitmap_writer_build_type_index(struct bitmap_writer *writer,
+				    struct packing_data *to_pack,
 				    struct pack_idx_entry **index,
 				    uint32_t index_nr)
 {
 	uint32_t i;
 
-	writer.commits = ewah_new();
-	writer.trees = ewah_new();
-	writer.blobs = ewah_new();
-	writer.tags = ewah_new();
+	writer->commits = ewah_new();
+	writer->trees = ewah_new();
+	writer->blobs = ewah_new();
+	writer->tags = ewah_new();
 	ALLOC_ARRAY(to_pack->in_pack_pos, to_pack->nr_objects);
 
 	for (i = 0; i < index_nr; ++i) {
@@ -88,19 +75,19 @@ void bitmap_writer_build_type_index(struct packing_data *to_pack,
 
 		switch (real_type) {
 		case OBJ_COMMIT:
-			ewah_set(writer.commits, i);
+			ewah_set(writer->commits, i);
 			break;
 
 		case OBJ_TREE:
-			ewah_set(writer.trees, i);
+			ewah_set(writer->trees, i);
 			break;
 
 		case OBJ_BLOB:
-			ewah_set(writer.blobs, i);
+			ewah_set(writer->blobs, i);
 			break;
 
 		case OBJ_TAG:
-			ewah_set(writer.tags, i);
+			ewah_set(writer->tags, i);
 			break;
 
 		default:
@@ -115,23 +102,25 @@ void bitmap_writer_build_type_index(struct packing_data *to_pack,
  * Compute the actual bitmaps
  */
 
-static inline void push_bitmapped_commit(struct commit *commit)
+static inline void push_bitmapped_commit(struct bitmap_writer *writer,
+					 struct commit *commit)
 {
-	if (writer.selected_nr >= writer.selected_alloc) {
-		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
-		REALLOC_ARRAY(writer.selected, writer.selected_alloc);
+	if (writer->selected_nr >= writer->selected_alloc) {
+		writer->selected_alloc = (writer->selected_alloc + 32) * 2;
+		REALLOC_ARRAY(writer->selected, writer->selected_alloc);
 	}
 
-	writer.selected[writer.selected_nr].commit = commit;
-	writer.selected[writer.selected_nr].bitmap = NULL;
-	writer.selected[writer.selected_nr].flags = 0;
+	writer->selected[writer->selected_nr].commit = commit;
+	writer->selected[writer->selected_nr].bitmap = NULL;
+	writer->selected[writer->selected_nr].flags = 0;
 
-	writer.selected_nr++;
+	writer->selected_nr++;
 }
 
-static uint32_t find_object_pos(const struct object_id *oid, int *found)
+static uint32_t find_object_pos(struct bitmap_writer *writer,
+				const struct object_id *oid, int *found)
 {
-	struct object_entry *entry = packlist_find(writer.to_pack, oid);
+	struct object_entry *entry = packlist_find(writer->to_pack, oid);
 
 	if (!entry) {
 		if (found)
@@ -143,17 +132,17 @@ static uint32_t find_object_pos(const struct object_id *oid, int *found)
 
 	if (found)
 		*found = 1;
-	return oe_in_pack_pos(writer.to_pack, entry);
+	return oe_in_pack_pos(writer->to_pack, entry);
 }
 
-static void compute_xor_offsets(void)
+static void compute_xor_offsets(struct bitmap_writer *writer)
 {
 	static const int MAX_XOR_OFFSET_SEARCH = 10;
 
 	int i, next = 0;
 
-	while (next < writer.selected_nr) {
-		struct bitmapped_commit *stored = &writer.selected[next];
+	while (next < writer->selected_nr) {
+		struct bitmapped_commit *stored = &writer->selected[next];
 
 		int best_offset = 0;
 		struct ewah_bitmap *best_bitmap = stored->bitmap;
@@ -166,7 +155,7 @@ static void compute_xor_offsets(void)
 				break;
 
 			test_xor = ewah_pool_new();
-			ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor);
+			ewah_xor(writer->selected[curr].bitmap, stored->bitmap, test_xor);
 
 			if (test_xor->buffer_size < best_bitmap->buffer_size) {
 				if (best_bitmap != stored->bitmap)
@@ -348,7 +337,8 @@ static void bitmap_builder_clear(struct bitmap_builder *bb)
 	bb->commits_nr = bb->commits_alloc = 0;
 }
 
-static int fill_bitmap_tree(struct bitmap *bitmap,
+static int fill_bitmap_tree(struct bitmap_writer *writer,
+			    struct bitmap *bitmap,
 			    struct tree *tree)
 {
 	int found;
@@ -360,7 +350,7 @@ static int fill_bitmap_tree(struct bitmap *bitmap,
 	 * If our bit is already set, then there is nothing to do. Both this
 	 * tree and all of its children will be set.
 	 */
-	pos = find_object_pos(&tree->object.oid, &found);
+	pos = find_object_pos(writer, &tree->object.oid, &found);
 	if (!found)
 		return -1;
 	if (bitmap_get(bitmap, pos))
@@ -375,12 +365,12 @@ static int fill_bitmap_tree(struct bitmap *bitmap,
 	while (tree_entry(&desc, &entry)) {
 		switch (object_type(entry.mode)) {
 		case OBJ_TREE:
-			if (fill_bitmap_tree(bitmap,
+			if (fill_bitmap_tree(writer, bitmap,
 					     lookup_tree(the_repository, &entry.oid)) < 0)
 				return -1;
 			break;
 		case OBJ_BLOB:
-			pos = find_object_pos(&entry.oid, &found);
+			pos = find_object_pos(writer, &entry.oid, &found);
 			if (!found)
 				return -1;
 			bitmap_set(bitmap, pos);
@@ -397,7 +387,8 @@ static int fill_bitmap_tree(struct bitmap *bitmap,
 
 static int reused_bitmaps_nr;
 
-static int fill_bitmap_commit(struct bb_commit *ent,
+static int fill_bitmap_commit(struct bitmap_writer *writer,
+			      struct bb_commit *ent,
 			      struct commit *commit,
 			      struct prio_queue *queue,
 			      struct prio_queue *tree_queue,
@@ -436,7 +427,7 @@ static int fill_bitmap_commit(struct bb_commit *ent,
 		 * Mark ourselves and queue our tree. The commit
 		 * walk ensures we cover all parents.
 		 */
-		pos = find_object_pos(&c->object.oid, &found);
+		pos = find_object_pos(writer, &c->object.oid, &found);
 		if (!found)
 			return -1;
 		bitmap_set(ent->bitmap, pos);
@@ -444,7 +435,8 @@ static int fill_bitmap_commit(struct bb_commit *ent,
 			       repo_get_commit_tree(the_repository, c));
 
 		for (p = c->parents; p; p = p->next) {
-			pos = find_object_pos(&p->item->object.oid, &found);
+			pos = find_object_pos(writer, &p->item->object.oid,
+					      &found);
 			if (!found)
 				return -1;
 			if (!bitmap_get(ent->bitmap, pos)) {
@@ -455,29 +447,31 @@ static int fill_bitmap_commit(struct bb_commit *ent,
 	}
 
 	while (tree_queue->nr) {
-		if (fill_bitmap_tree(ent->bitmap,
+		if (fill_bitmap_tree(writer, ent->bitmap,
 				     prio_queue_get(tree_queue)) < 0)
 			return -1;
 	}
 	return 0;
 }
 
-static void store_selected(struct bb_commit *ent, struct commit *commit)
+static void store_selected(struct bitmap_writer *writer,
+			   struct bb_commit *ent, struct commit *commit)
 {
-	struct bitmapped_commit *stored = &writer.selected[ent->idx];
+	struct bitmapped_commit *stored = &writer->selected[ent->idx];
 	khiter_t hash_pos;
 	int hash_ret;
 
 	stored->bitmap = bitmap_to_ewah(ent->bitmap);
 
-	hash_pos = kh_put_oid_map(writer.bitmaps, commit->object.oid, &hash_ret);
+	hash_pos = kh_put_oid_map(writer->bitmaps, commit->object.oid, &hash_ret);
 	if (hash_ret == 0)
 		die("Duplicate entry when writing index: %s",
 		    oid_to_hex(&commit->object.oid));
-	kh_value(writer.bitmaps, hash_pos) = stored;
+	kh_value(writer->bitmaps, hash_pos) = stored;
 }
 
-int bitmap_writer_build(struct packing_data *to_pack)
+int bitmap_writer_build(struct bitmap_writer *writer,
+			struct packing_data *to_pack)
 {
 	struct bitmap_builder bb;
 	size_t i;
@@ -488,11 +482,12 @@ int bitmap_writer_build(struct packing_data *to_pack)
 	uint32_t *mapping;
 	int closed = 1; /* until proven otherwise */
 
-	writer.bitmaps = kh_init_oid_map();
-	writer.to_pack = to_pack;
+	writer->bitmaps = kh_init_oid_map();
+	writer->to_pack = to_pack;
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Building bitmaps", writer.selected_nr);
+	if (writer->show_progress)
+		writer->progress = start_progress("Building bitmaps",
+						  writer->selected_nr);
 	trace2_region_enter("pack-bitmap-write", "building_bitmaps_total",
 			    the_repository);
 
@@ -502,23 +497,23 @@ int bitmap_writer_build(struct packing_data *to_pack)
 	else
 		mapping = NULL;
 
-	bitmap_builder_init(&bb, &writer, old_bitmap);
+	bitmap_builder_init(&bb, writer, old_bitmap);
 	for (i = bb.commits_nr; i > 0; i--) {
 		struct commit *commit = bb.commits[i-1];
 		struct bb_commit *ent = bb_data_at(&bb.data, commit);
 		struct commit *child;
 		int reused = 0;
 
-		if (fill_bitmap_commit(ent, commit, &queue, &tree_queue,
+		if (fill_bitmap_commit(writer, ent, commit, &queue, &tree_queue,
 				       old_bitmap, mapping) < 0) {
 			closed = 0;
 			break;
 		}
 
 		if (ent->selected) {
-			store_selected(ent, commit);
+			store_selected(writer, ent, commit);
 			nr_stored++;
-			display_progress(writer.progress, nr_stored);
+			display_progress(writer->progress, nr_stored);
 		}
 
 		while ((child = pop_commit(&ent->reverse_edges))) {
@@ -549,10 +544,10 @@ int bitmap_writer_build(struct packing_data *to_pack)
 	trace2_data_intmax("pack-bitmap-write", the_repository,
 			   "building_bitmaps_reused", reused_bitmaps_nr);
 
-	stop_progress(&writer.progress);
+	stop_progress(&writer->progress);
 
 	if (closed)
-		compute_xor_offsets();
+		compute_xor_offsets(writer);
 	return closed ? 0 : -1;
 }
 
@@ -590,7 +585,8 @@ static int date_compare(const void *_a, const void *_b)
 	return (long)b->date - (long)a->date;
 }
 
-void bitmap_writer_select_commits(struct commit **indexed_commits,
+void bitmap_writer_select_commits(struct bitmap_writer *writer,
+				  struct commit **indexed_commits,
 				  unsigned int indexed_commits_nr,
 				  int max_bitmaps)
 {
@@ -600,12 +596,12 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
-			push_bitmapped_commit(indexed_commits[i]);
+			push_bitmapped_commit(writer, indexed_commits[i]);
 		return;
 	}
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Selecting bitmap commits", 0);
+	if (writer->show_progress)
+		writer->progress = start_progress("Selecting bitmap commits", 0);
 
 	for (;;) {
 		struct commit *chosen = NULL;
@@ -615,8 +611,8 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 		if (i + next >= indexed_commits_nr)
 			break;
 
-		if (max_bitmaps > 0 && writer.selected_nr >= max_bitmaps) {
-			writer.selected_nr = max_bitmaps;
+		if (max_bitmaps > 0 && writer->selected_nr >= max_bitmaps) {
+			writer->selected_nr = max_bitmaps;
 			break;
 		}
 
@@ -638,13 +634,13 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 			}
 		}
 
-		push_bitmapped_commit(chosen);
+		push_bitmapped_commit(writer, chosen);
 
 		i += next + 1;
-		display_progress(writer.progress, i);
+		display_progress(writer->progress, i);
 	}
 
-	stop_progress(&writer.progress);
+	stop_progress(&writer->progress);
 }
 
 
@@ -670,12 +666,13 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 	return &index[pos]->oid;
 }
 
-static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
+static void write_selected_commits_v1(struct bitmap_writer *writer,
+				      struct hashfile *f, off_t *offsets)
 {
 	int i;
 
-	for (i = 0; i < writer.selected_nr; ++i) {
-		struct bitmapped_commit *stored = &writer.selected[i];
+	for (i = 0; i < writer->selected_nr; ++i) {
+		struct bitmapped_commit *stored = &writer->selected[i];
 
 		if (offsets)
 			offsets[i] = hashfile_total(f);
@@ -688,10 +685,11 @@ static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
 	}
 }
 
-static int table_cmp(const void *_va, const void *_vb)
+static int table_cmp(const void *_va, const void *_vb, void *_data)
 {
-	struct bitmapped_commit *a = &writer.selected[*(uint32_t *)_va];
-	struct bitmapped_commit *b = &writer.selected[*(uint32_t *)_vb];
+	struct bitmap_writer *writer = _data;
+	struct bitmapped_commit *a = &writer->selected[*(uint32_t *)_va];
+	struct bitmapped_commit *b = &writer->selected[*(uint32_t *)_vb];
 
 	if (a->commit_pos < b->commit_pos)
 		return -1;
@@ -701,15 +699,16 @@ static int table_cmp(const void *_va, const void *_vb)
 	return 0;
 }
 
-static void write_lookup_table(struct hashfile *f, off_t *offsets)
+static void write_lookup_table(struct bitmap_writer *writer, struct hashfile *f,
+			       off_t *offsets)
 {
 	uint32_t i;
 	uint32_t *table, *table_inv;
 
-	ALLOC_ARRAY(table, writer.selected_nr);
-	ALLOC_ARRAY(table_inv, writer.selected_nr);
+	ALLOC_ARRAY(table, writer->selected_nr);
+	ALLOC_ARRAY(table_inv, writer->selected_nr);
 
-	for (i = 0; i < writer.selected_nr; i++)
+	for (i = 0; i < writer->selected_nr; i++)
 		table[i] = i;
 
 	/*
@@ -717,17 +716,17 @@ static void write_lookup_table(struct hashfile *f, off_t *offsets)
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT(table, writer.selected_nr, table_cmp);
+	QSORT_S(table, writer->selected_nr, table_cmp, writer);
 
 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
 	 */
-	for (i = 0; i < writer.selected_nr; i++)
+	for (i = 0; i < writer->selected_nr; i++)
 		table_inv[table[i]] = i;
 
 	trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository);
-	for (i = 0; i < writer.selected_nr; i++) {
-		struct bitmapped_commit *selected = &writer.selected[table[i]];
+	for (i = 0; i < writer->selected_nr; i++) {
+		struct bitmapped_commit *selected = &writer->selected[table[i]];
 		uint32_t xor_offset = selected->xor_offset;
 		uint32_t xor_row;
 
@@ -748,7 +747,7 @@ static void write_lookup_table(struct hashfile *f, off_t *offsets)
 			xor_row = 0xffffffff;
 		}
 
-		hashwrite_be32(f, writer.selected[table[i]].commit_pos);
+		hashwrite_be32(f, writer->selected[table[i]].commit_pos);
 		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
 		hashwrite_be32(f, xor_row);
 	}
@@ -770,12 +769,14 @@ static void write_hash_cache(struct hashfile *f,
 	}
 }
 
-void bitmap_writer_set_checksum(const unsigned char *sha1)
+void bitmap_writer_set_checksum(struct bitmap_writer *writer,
+				const unsigned char *sha1)
 {
-	hashcpy(writer.pack_checksum, sha1);
+	hashcpy(writer->pack_checksum, sha1);
 }
 
-void bitmap_writer_finish(struct pack_idx_entry **index,
+void bitmap_writer_finish(struct bitmap_writer *writer,
+			  struct pack_idx_entry **index,
 			  uint32_t index_nr,
 			  const char *filename,
 			  uint16_t options)
@@ -796,20 +797,20 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE));
 	header.version = htons(default_version);
 	header.options = htons(flags | options);
-	header.entry_count = htonl(writer.selected_nr);
-	hashcpy(header.checksum, writer.pack_checksum);
+	header.entry_count = htonl(writer->selected_nr);
+	hashcpy(header.checksum, writer->pack_checksum);
 
 	hashwrite(f, &header, sizeof(header) - GIT_MAX_RAWSZ + the_hash_algo->rawsz);
-	dump_bitmap(f, writer.commits);
-	dump_bitmap(f, writer.trees);
-	dump_bitmap(f, writer.blobs);
-	dump_bitmap(f, writer.tags);
+	dump_bitmap(f, writer->commits);
+	dump_bitmap(f, writer->trees);
+	dump_bitmap(f, writer->blobs);
+	dump_bitmap(f, writer->tags);
 
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);
 
-	for (i = 0; i < writer.selected_nr; i++) {
-		struct bitmapped_commit *stored = &writer.selected[i];
+	for (i = 0; i < writer->selected_nr; i++) {
+		struct bitmapped_commit *stored = &writer->selected[i];
 		int commit_pos = oid_pos(&stored->commit->object.oid, index,
 					 index_nr, oid_access);
 
@@ -818,10 +819,10 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		stored->commit_pos = commit_pos;
 	}
 
-	write_selected_commits_v1(f, offsets);
+	write_selected_commits_v1(writer, f, offsets);
 
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
-		write_lookup_table(f, offsets);
+		write_lookup_table(writer, f, offsets);
 
 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index c7dea13217..5a1890a2c5 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -97,9 +97,29 @@ int bitmap_has_oid_in_uninteresting(struct bitmap_index *, const struct object_i
 
 off_t get_disk_usage_from_bitmap(struct bitmap_index *, struct rev_info *);
 
-void bitmap_writer_show_progress(int show);
-void bitmap_writer_set_checksum(const unsigned char *sha1);
-void bitmap_writer_build_type_index(struct packing_data *to_pack,
+struct bitmap_writer {
+	struct ewah_bitmap *commits;
+	struct ewah_bitmap *trees;
+	struct ewah_bitmap *blobs;
+	struct ewah_bitmap *tags;
+
+	kh_oid_map_t *bitmaps;
+	struct packing_data *to_pack;
+
+	struct bitmapped_commit *selected;
+	unsigned int selected_nr, selected_alloc;
+
+	struct progress *progress;
+	int show_progress;
+	unsigned char pack_checksum[GIT_MAX_RAWSZ];
+};
+
+void bitmap_writer_init(struct bitmap_writer *writer);
+void bitmap_writer_show_progress(struct bitmap_writer *writer, int show);
+void bitmap_writer_set_checksum(struct bitmap_writer *writer,
+				const unsigned char *sha1);
+void bitmap_writer_build_type_index(struct bitmap_writer *writer,
+				    struct packing_data *to_pack,
 				    struct pack_idx_entry **index,
 				    uint32_t index_nr);
 uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git,
@@ -109,10 +129,14 @@ int rebuild_bitmap(const uint32_t *reposition,
 		   struct bitmap *dest);
 struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit);
-void bitmap_writer_select_commits(struct commit **indexed_commits,
-		unsigned int indexed_commits_nr, int max_bitmaps);
-int bitmap_writer_build(struct packing_data *to_pack);
-void bitmap_writer_finish(struct pack_idx_entry **index,
+void bitmap_writer_select_commits(struct bitmap_writer *writer,
+				  struct commit **indexed_commits,
+				  unsigned int indexed_commits_nr,
+				  int max_bitmaps);
+int bitmap_writer_build(struct bitmap_writer *writer,
+			struct packing_data *to_pack);
+void bitmap_writer_finish(struct bitmap_writer *writer,
+			  struct pack_idx_entry **index,
 			  uint32_t index_nr,
 			  const char *filename,
 			  uint16_t options);
-- 
2.45.1.151.g7cc3499008c


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] pack-bitmap: drop unused `max_bitmaps` parameter
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
                   ` (2 preceding siblings ...)
  2024-05-14 19:56 ` [PATCH 3/6] pack-bitmap: avoid use of static `bitmap_writer` Taylor Blau
@ 2024-05-14 19:57 ` Taylor Blau
  2024-05-14 19:57 ` [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field Taylor Blau
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was
introduced back in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), making it original to the bitmap implementation in Git
itself.

When that patch was merged via 0f9e62e084 (Merge branch
'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c
passed a value of "-1" for `max_bitmaps`, indicating no limit.

Since then, the only other caller (in midx.c, added via c528e17966
(pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value
of "-1" for `max_bitmaps`.

Since no callers have needed a finite limit for the `max_bitmaps`
parameter in the nearly decade that has passed since 0f9e62e084, let's
remove the parameter and any dead pieces of code connected to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/pack-objects.c | 3 +--
 midx-write.c           | 2 +-
 pack-bitmap-write.c    | 8 +-------
 pack-bitmap.h          | 3 +--
 4 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ba4c93d241..10e69fdc8e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1364,8 +1364,7 @@ static void write_pack_file(void)
 							    progress);
 				bitmap_writer_select_commits(&bitmap_writer,
 							     indexed_commits,
-							     indexed_commits_nr,
-							     -1);
+							     indexed_commits_nr);
 				if (bitmap_writer_build(&bitmap_writer, &to_pack) < 0)
 					die(_("failed to write bitmap index"));
 				bitmap_writer_finish(&bitmap_writer,
diff --git a/midx-write.c b/midx-write.c
index 5fdc8f2ff5..78fb0a2c8c 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -841,7 +841,7 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[pack_order[i]] = &pdata->objects[i].idx;
 
-	bitmap_writer_select_commits(&writer, commits, commits_nr, -1);
+	bitmap_writer_select_commits(&writer, commits, commits_nr);
 	ret = bitmap_writer_build(&writer, pdata);
 	if (ret < 0)
 		goto cleanup;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index e22fa70745..c0087dab12 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -587,8 +587,7 @@ static int date_compare(const void *_a, const void *_b)
 
 void bitmap_writer_select_commits(struct bitmap_writer *writer,
 				  struct commit **indexed_commits,
-				  unsigned int indexed_commits_nr,
-				  int max_bitmaps)
+				  unsigned int indexed_commits_nr)
 {
 	unsigned int i = 0, j, next;
 
@@ -611,11 +610,6 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
 		if (i + next >= indexed_commits_nr)
 			break;
 
-		if (max_bitmaps > 0 && writer->selected_nr >= max_bitmaps) {
-			writer->selected_nr = max_bitmaps;
-			break;
-		}
-
 		if (next == 0) {
 			chosen = indexed_commits[i];
 		} else {
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 5a1890a2c5..b2d13d40eb 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -131,8 +131,7 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit);
 void bitmap_writer_select_commits(struct bitmap_writer *writer,
 				  struct commit **indexed_commits,
-				  unsigned int indexed_commits_nr,
-				  int max_bitmaps);
+				  unsigned int indexed_commits_nr);
 int bitmap_writer_build(struct bitmap_writer *writer,
 			struct packing_data *to_pack);
 void bitmap_writer_finish(struct bitmap_writer *writer,
-- 
2.45.1.151.g7cc3499008c


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
                   ` (3 preceding siblings ...)
  2024-05-14 19:57 ` [PATCH 4/6] pack-bitmap: drop unused `max_bitmaps` parameter Taylor Blau
@ 2024-05-14 19:57 ` Taylor Blau
  2024-05-15  9:05   ` Patrick Steinhardt
  2024-05-14 19:57 ` [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()` Taylor Blau
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

Prepare to free() memory associated with bitmapped_commit structs by
zero'ing the 'write_as' field.

In ideal cases, it is fine to do something like:

    for (i = 0; i < writer->selected_nr; i++) {
        struct bitmapped_commit *bc = &writer->selected[i];
        if (bc->write_as != bc->bitmap)
            ewah_free(bc->write_as);
        ewah_free(bc->bitmap);
    }

but if not all of the 'write_as' fields were populated (e.g., because
the packing_data given does not form a reachability closure), then we
may attempt to free uninitialized memory.

Guard against this by preemptively zero'ing this field just in case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c0087dab12..420f17c2e0 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -112,6 +112,7 @@ static inline void push_bitmapped_commit(struct bitmap_writer *writer,
 
 	writer->selected[writer->selected_nr].commit = commit;
 	writer->selected[writer->selected_nr].bitmap = NULL;
+	writer->selected[writer->selected_nr].write_as = NULL;
 	writer->selected[writer->selected_nr].flags = 0;
 
 	writer->selected_nr++;
-- 
2.45.1.151.g7cc3499008c


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()`
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
                   ` (4 preceding siblings ...)
  2024-05-14 19:57 ` [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field Taylor Blau
@ 2024-05-14 19:57 ` Taylor Blau
  2024-05-15  9:05   ` Patrick Steinhardt
  2024-05-15  6:18 ` [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Jeff King
  2024-05-15  9:05 ` Patrick Steinhardt
  7 siblings, 1 reply; 14+ messages in thread
From: Taylor Blau @ 2024-05-14 19:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Patrick Steinhardt, Junio C Hamano

Now that there is clearer memory ownership around the bitmap_writer
structure, introduce a bitmap_writer_free() function that callers may
use to free any memory associated with their instance of the
bitmap_writer structure.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/pack-objects.c |  3 ++-
 midx-write.c           |  1 +
 pack-bitmap-write.c    | 23 +++++++++++++++++++++++
 pack-bitmap.h          |  1 +
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 10e69fdc8e..26a6d0d791 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1245,7 +1245,6 @@ static void write_pack_file(void)
 	uint32_t nr_remaining = nr_result;
 	time_t last_mtime = 0;
 	struct object_entry **write_order;
-	struct bitmap_writer bitmap_writer;
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress(_("Writing objects"), nr_result);
@@ -1315,6 +1314,7 @@ static void write_pack_file(void)
 		if (!pack_to_stdout) {
 			struct stat st;
 			struct strbuf tmpname = STRBUF_INIT;
+			struct bitmap_writer bitmap_writer;
 			char *idx_tmp_name = NULL;
 
 			/*
@@ -1370,6 +1370,7 @@ static void write_pack_file(void)
 				bitmap_writer_finish(&bitmap_writer,
 						     written_list, nr_written,
 						     tmpname.buf, write_bitmap_options);
+				bitmap_writer_free(&bitmap_writer);
 				write_bitmap_index = 0;
 				strbuf_setlen(&tmpname, tmpname_len);
 			}
diff --git a/midx-write.c b/midx-write.c
index 78fb0a2c8c..7c0c08c64b 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -853,6 +853,7 @@ static int write_midx_bitmap(const char *midx_name,
 cleanup:
 	free(index);
 	free(bitmap_name);
+	bitmap_writer_free(&writer);
 
 	trace2_region_leave("midx", "write_midx_bitmap", the_repository);
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 420f17c2e0..6cae670412 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -32,6 +32,29 @@ void bitmap_writer_init(struct bitmap_writer *writer)
 	memset(writer, 0, sizeof(struct bitmap_writer));
 }
 
+void bitmap_writer_free(struct bitmap_writer *writer)
+{
+	uint32_t i;
+
+	if (!writer)
+		return;
+
+	ewah_free(writer->commits);
+	ewah_free(writer->trees);
+	ewah_free(writer->blobs);
+	ewah_free(writer->tags);
+
+	kh_destroy_oid_map(writer->bitmaps);
+
+	for (i = 0; i < writer->selected_nr; i++) {
+		struct bitmapped_commit *bc = &writer->selected[i];
+		if (bc->write_as != bc->bitmap)
+			ewah_free(bc->write_as);
+		ewah_free(bc->bitmap);
+	}
+	free(writer->selected);
+}
+
 void bitmap_writer_show_progress(struct bitmap_writer *writer, int show)
 {
 	writer->show_progress = show;
diff --git a/pack-bitmap.h b/pack-bitmap.h
index b2d13d40eb..3091095f33 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -139,6 +139,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 			  uint32_t index_nr,
 			  const char *filename,
 			  uint16_t options);
+void bitmap_writer_free(struct bitmap_writer *writer);
 char *midx_bitmap_filename(struct multi_pack_index *midx);
 char *pack_bitmap_filename(struct packed_git *p);
 
-- 
2.45.1.151.g7cc3499008c

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
                   ` (5 preceding siblings ...)
  2024-05-14 19:57 ` [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()` Taylor Blau
@ 2024-05-15  6:18 ` Jeff King
  2024-05-15  9:05 ` Patrick Steinhardt
  7 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2024-05-15  6:18 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Patrick Steinhardt, Junio C Hamano

On Tue, May 14, 2024 at 03:56:46PM -0400, Taylor Blau wrote:

> This topic was born out of tb/pseudo-merge-bitmaps, which has a few
> quality-of-life improvements in the pack-bitmap machinery.
> 
> The main changes are to remove the static 'struct bitmap_writer', drop
> one unused fields, and start using another unused one (see "move
> commit_positions into commit_pos fields").
> 
> There are other smaller changes, too, like cleaning up the flag
> allocation table in object.h, as well as introducing a new
> bitmap_writer_free() function.
> 
> The next round of tb/pseudo-merge-bitmaps will be based on this branch.

These all look good to me. And I think it is nice to pull them out of
the larger series, to keep the independent steps to a more digestible
size.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field
  2024-05-14 19:57 ` [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field Taylor Blau
@ 2024-05-15  9:05   ` Patrick Steinhardt
  2024-05-15 13:29     ` Taylor Blau
  0 siblings, 1 reply; 14+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  9:05 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1596 bytes --]

On Tue, May 14, 2024 at 03:57:03PM -0400, Taylor Blau wrote:
> Prepare to free() memory associated with bitmapped_commit structs by
> zero'ing the 'write_as' field.
> 
> In ideal cases, it is fine to do something like:
> 
>     for (i = 0; i < writer->selected_nr; i++) {
>         struct bitmapped_commit *bc = &writer->selected[i];
>         if (bc->write_as != bc->bitmap)
>             ewah_free(bc->write_as);
>         ewah_free(bc->bitmap);
>     }
> 
> but if not all of the 'write_as' fields were populated (e.g., because
> the packing_data given does not form a reachability closure), then we
> may attempt to free uninitialized memory.
> 
> Guard against this by preemptively zero'ing this field just in case.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  pack-bitmap-write.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index c0087dab12..420f17c2e0 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -112,6 +112,7 @@ static inline void push_bitmapped_commit(struct bitmap_writer *writer,
>  
>  	writer->selected[writer->selected_nr].commit = commit;
>  	writer->selected[writer->selected_nr].bitmap = NULL;
> +	writer->selected[writer->selected_nr].write_as = NULL;
>  	writer->selected[writer->selected_nr].flags = 0;

Instead of having to ensure that all fields are initialized we could
also set the whole structure to zero via `memset()`, which might be a
bit more sustainable in the future. That alone doesn't really warrant a
reroll though.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()`
  2024-05-14 19:57 ` [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()` Taylor Blau
@ 2024-05-15  9:05   ` Patrick Steinhardt
  2024-05-15 15:58     ` Taylor Blau
  0 siblings, 1 reply; 14+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  9:05 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2616 bytes --]

On Tue, May 14, 2024 at 03:57:06PM -0400, Taylor Blau wrote:
> Now that there is clearer memory ownership around the bitmap_writer
> structure, introduce a bitmap_writer_free() function that callers may
> use to free any memory associated with their instance of the
> bitmap_writer structure.

Great. I wanted to ask about this in preceding commits already, good to
see that you already thought of if.

> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  builtin/pack-objects.c |  3 ++-
>  midx-write.c           |  1 +
>  pack-bitmap-write.c    | 23 +++++++++++++++++++++++
>  pack-bitmap.h          |  1 +
>  4 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 10e69fdc8e..26a6d0d791 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1245,7 +1245,6 @@ static void write_pack_file(void)
>  	uint32_t nr_remaining = nr_result;
>  	time_t last_mtime = 0;
>  	struct object_entry **write_order;
> -	struct bitmap_writer bitmap_writer;
>  
>  	if (progress > pack_to_stdout)
>  		progress_state = start_progress(_("Writing objects"), nr_result);
> @@ -1315,6 +1314,7 @@ static void write_pack_file(void)
>  		if (!pack_to_stdout) {
>  			struct stat st;
>  			struct strbuf tmpname = STRBUF_INIT;
> +			struct bitmap_writer bitmap_writer;
>  			char *idx_tmp_name = NULL;
>  
>  			/*

Nit: we could have avoided moving the struct if it was introduced in
this spot in the preceding patch.

[snip]
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 420f17c2e0..6cae670412 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -32,6 +32,29 @@ void bitmap_writer_init(struct bitmap_writer *writer)
>  	memset(writer, 0, sizeof(struct bitmap_writer));
>  }
>  
> +void bitmap_writer_free(struct bitmap_writer *writer)
> +{
> +	uint32_t i;
> +
> +	if (!writer)
> +		return;
> +
> +	ewah_free(writer->commits);
> +	ewah_free(writer->trees);
> +	ewah_free(writer->blobs);
> +	ewah_free(writer->tags);
> +
> +	kh_destroy_oid_map(writer->bitmaps);
> +
> +	for (i = 0; i < writer->selected_nr; i++) {
> +		struct bitmapped_commit *bc = &writer->selected[i];
> +		if (bc->write_as != bc->bitmap)
> +			ewah_free(bc->write_as);
> +		ewah_free(bc->bitmap);
> +	}
> +	free(writer->selected);

I was wondering whether we also want to set all those fields to `NULL`
at the end, or just `memset()` the whole structure. But there probably
isn't much of a reason to do it currently, so I don't mind if your
answer is "no".

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups
  2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
                   ` (6 preceding siblings ...)
  2024-05-15  6:18 ` [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Jeff King
@ 2024-05-15  9:05 ` Patrick Steinhardt
  2024-05-15 15:58   ` Taylor Blau
  7 siblings, 1 reply; 14+ messages in thread
From: Patrick Steinhardt @ 2024-05-15  9:05 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

On Tue, May 14, 2024 at 03:56:46PM -0400, Taylor Blau wrote:
> This topic was born out of tb/pseudo-merge-bitmaps, which has a few
> quality-of-life improvements in the pack-bitmap machinery.
> 
> The main changes are to remove the static 'struct bitmap_writer', drop
> one unused fields, and start using another unused one (see "move
> commit_positions into commit_pos fields").
> 
> There are other smaller changes, too, like cleaning up the flag
> allocation table in object.h, as well as introducing a new
> bitmap_writer_free() function.
> 
> The next round of tb/pseudo-merge-bitmaps will be based on this branch.
> 
> Thanks in advance for your review!

I've got some tiny nits, but overall this series looks good to me
already. I don't mind if it's merged in the current state if you don't
think any of the nits are worthy to be addressed.

Thanks for this cleanup!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field
  2024-05-15  9:05   ` Patrick Steinhardt
@ 2024-05-15 13:29     ` Taylor Blau
  0 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-15 13:29 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano

On Wed, May 15, 2024 at 11:05:10AM +0200, Patrick Steinhardt wrote:
> > diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> > index c0087dab12..420f17c2e0 100644
> > --- a/pack-bitmap-write.c
> > +++ b/pack-bitmap-write.c
> > @@ -112,6 +112,7 @@ static inline void push_bitmapped_commit(struct bitmap_writer *writer,
> >
> >  	writer->selected[writer->selected_nr].commit = commit;
> >  	writer->selected[writer->selected_nr].bitmap = NULL;
> > +	writer->selected[writer->selected_nr].write_as = NULL;
> >  	writer->selected[writer->selected_nr].flags = 0;
>
> Instead of having to ensure that all fields are initialized we could
> also set the whole structure to zero via `memset()`, which might be a
> bit more sustainable in the future. That alone doesn't really warrant a
> reroll though.

I had considered it, but decided against it as it seemed wasteful to

    memset(&writer->selected[writer->selected_nr], 0,
           sizeof(struct bitmapped_commit))

and then immediately un-zero some of its memory by assigning the commit
field.

Obviously not actually all that wasteful, as we're only talking about a
couple of hundred CPU cycles ;-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()`
  2024-05-15  9:05   ` Patrick Steinhardt
@ 2024-05-15 15:58     ` Taylor Blau
  0 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-15 15:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano

On Wed, May 15, 2024 at 11:05:18AM +0200, Patrick Steinhardt wrote:
> > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> > index 10e69fdc8e..26a6d0d791 100644
> > --- a/builtin/pack-objects.c
> > +++ b/builtin/pack-objects.c
> > @@ -1245,7 +1245,6 @@ static void write_pack_file(void)
> >  	uint32_t nr_remaining = nr_result;
> >  	time_t last_mtime = 0;
> >  	struct object_entry **write_order;
> > -	struct bitmap_writer bitmap_writer;
> >
> >  	if (progress > pack_to_stdout)
> >  		progress_state = start_progress(_("Writing objects"), nr_result);
> > @@ -1315,6 +1314,7 @@ static void write_pack_file(void)
> >  		if (!pack_to_stdout) {
> >  			struct stat st;
> >  			struct strbuf tmpname = STRBUF_INIT;
> > +			struct bitmap_writer bitmap_writer;
> >  			char *idx_tmp_name = NULL;
> >
> >  			/*
>
> Nit: we could have avoided moving the struct if it was introduced in
> this spot in the preceding patch.

Ugh, I meant to move the declaration here in the previous patch, but
apparently didn't. I don't think it's worth sending another round just
to fix that minor issue, but happy to do so if you feel strongly.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups
  2024-05-15  9:05 ` Patrick Steinhardt
@ 2024-05-15 15:58   ` Taylor Blau
  0 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2024-05-15 15:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano

On Wed, May 15, 2024 at 11:05:25AM +0200, Patrick Steinhardt wrote:
> On Tue, May 14, 2024 at 03:56:46PM -0400, Taylor Blau wrote:
> > This topic was born out of tb/pseudo-merge-bitmaps, which has a few
> > quality-of-life improvements in the pack-bitmap machinery.
> >
> > The main changes are to remove the static 'struct bitmap_writer', drop
> > one unused fields, and start using another unused one (see "move
> > commit_positions into commit_pos fields").
> >
> > There are other smaller changes, too, like cleaning up the flag
> > allocation table in object.h, as well as introducing a new
> > bitmap_writer_free() function.
> >
> > The next round of tb/pseudo-merge-bitmaps will be based on this branch.
> >
> > Thanks in advance for your review!
>
> I've got some tiny nits, but overall this series looks good to me
> already. I don't mind if it's merged in the current state if you don't
> think any of the nits are worthy to be addressed.

Thanks, both, for your review!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-05-15 15:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-14 19:56 [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Taylor Blau
2024-05-14 19:56 ` [PATCH 1/6] object.h: add flags allocated by pack-bitmap.h Taylor Blau
2024-05-14 19:56 ` [PATCH 2/6] pack-bitmap-write.c: move commit_positions into commit_pos fields Taylor Blau
2024-05-14 19:56 ` [PATCH 3/6] pack-bitmap: avoid use of static `bitmap_writer` Taylor Blau
2024-05-14 19:57 ` [PATCH 4/6] pack-bitmap: drop unused `max_bitmaps` parameter Taylor Blau
2024-05-14 19:57 ` [PATCH 5/6] pack-bitmap-write.c: avoid uninitialized 'write_as' field Taylor Blau
2024-05-15  9:05   ` Patrick Steinhardt
2024-05-15 13:29     ` Taylor Blau
2024-05-14 19:57 ` [PATCH 6/6] pack-bitmap: introduce `bitmap_writer_free()` Taylor Blau
2024-05-15  9:05   ` Patrick Steinhardt
2024-05-15 15:58     ` Taylor Blau
2024-05-15  6:18 ` [PATCH 0/6] pack-bitmap: various pack-bitmap-write cleanups Jeff King
2024-05-15  9:05 ` Patrick Steinhardt
2024-05-15 15:58   ` Taylor Blau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).