* [PATCH 00/11] reftable: expose write options as config
@ 2024-05-02  6:51 Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
                   ` (15 more replies)
  0 siblings, 16 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC
  To: git

Hi,

the reftable format has some flexibility with regard to how exactly it
writes the respective tables:

  - The block size allows you to control how large each block is
    supposed to be. The bigger the block, the more records you can fit
    into it.

  - Restart intervals control how often a restart point is written that
    breaks prefix compression. The lower the interval, the smaller the
    disk space savings you get.

  - Object indices can be enabled or disabled. These are optional and
    Git doesn't use them right now, so disabling them may be a sensible
    thing to do if you want to save some disk space.

  - The geometric factor controls when we compact tables during auto
    compaction.
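
As a rough illustration of the restart interval trade-off, here is a
minimal sketch in C. It is a simplified model and not the reftable
encoder: it only counts the key bytes that would be stored, ignoring
record headers and the restart offset table. The helper names
`common_prefix_len()` and `stored_key_bytes()` are made up for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the common prefix shared by two NUL-terminated keys. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Hypothetical helper: count the key bytes stored for `n` sorted keys
 * when every `restart_interval`-th record is a restart point that holds
 * its full key, and all other records only hold the suffix that differs
 * from the previous key.
 */
static size_t stored_key_bytes(const char **keys, size_t n,
			       size_t restart_interval)
{
	size_t total = 0;

	assert(restart_interval > 0);
	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(keys[i]);
		if (i % restart_interval == 0)
			total += len; /* restart point: no prefix compression */
		else
			total += len - common_prefix_len(keys[i - 1], keys[i]);
	}
	return total;
}
```

For the four keys "refs/heads/a" through "refs/heads/d", an interval of
1 stores every key in full, while larger intervals store mostly
one-byte suffixes. The flip side, which this sketch does not model, is
that fewer restart points leave fewer places to start decoding from.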

This patch series exposes all of these via a new set of configs so that
they can be tweaked by the user as needed. These are not expected to
matter much for the "normal" user -- the defaults should be good enough.
But for edge cases (huge repos with millions of refs) and for hosting
providers these knobs can be helpful.

This patch series applies on top of d4cc1ec35f (Start the 2.46 cycle,
2024-04-30).

Patrick

Patrick Steinhardt (11):
  reftable: consistently refer to `reftable_write_options` as `opts`
  reftable: consistently pass write opts as value
  reftable/writer: drop static variable used to initialize strbuf
  reftable/writer: improve error when passed an invalid block size
  reftable/dump: support dumping a table's block structure
  refs/reftable: allow configuring block size
  reftable: use `uint16_t` to track restart interval
  refs/reftable: allow configuring restart interval
  refs/reftable: allow disabling writing the object index
  reftable: make the compaction factor configurable
  refs/reftable: allow configuring geometric factor

 Documentation/config.txt          |   2 +
 Documentation/config/reftable.txt |  49 +++++
 refs/reftable-backend.c           |  46 ++++-
 reftable/block.h                  |   2 +-
 reftable/dump.c                   |  12 +-
 reftable/merged_test.c            |   6 +-
 reftable/reader.c                 |  63 +++++++
 reftable/readwrite_test.c         |  26 +--
 reftable/refname_test.c           |   2 +-
 reftable/reftable-reader.h        |   2 +
 reftable/reftable-stack.h         |   2 +-
 reftable/reftable-writer.h        |  10 +-
 reftable/stack.c                  |  56 +++---
 reftable/stack.h                  |   5 +-
 reftable/stack_test.c             | 118 ++++++------
 reftable/writer.c                 |  20 +--
 t/t0613-reftable-write-options.sh | 286 ++++++++++++++++++++++++++++++
 17 files changed, 577 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/config/reftable.txt
 create mode 100755 t/t0613-reftable-write-options.sh


base-commit: d4cc1ec35f3bcce816b69986ca41943f6ce21377
-- 
2.45.0



* [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-10  9:00   ` Karthik Nayak
  2024-05-02  6:51 ` [PATCH 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC
  To: git

Throughout the reftable library the `reftable_write_options` are
sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
consistently use `opts` to avoid confusion.

While at it, touch up the coding style a bit by removing unneeded braces
around one-line statements and newlines between variable declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c           |   4 +-
 reftable/reftable-stack.h |   2 +-
 reftable/stack.c          |  43 +++++++-------
 reftable/stack.h          |   2 +-
 reftable/stack_test.c     | 114 +++++++++++++++++---------------------
 5 files changed, 75 insertions(+), 90 deletions(-)

diff --git a/reftable/dump.c b/reftable/dump.c
index 26e0393c7d..9c770a10cc 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -27,9 +27,9 @@ license that can be found in the LICENSE file or at
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
index 1b602dda58..9c8e4eef49 100644
--- a/reftable/reftable-stack.h
+++ b/reftable/reftable-stack.h
@@ -29,7 +29,7 @@ struct reftable_stack;
  *  stored in 'dir'. Typically, this should be .git/reftables.
  */
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config);
+		       struct reftable_write_options opts);
 
 /* returns the update_index at which a next table should be written. */
 uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
diff --git a/reftable/stack.c b/reftable/stack.c
index 80266bcbab..3979657793 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -56,15 +56,14 @@ static int reftable_fd_flush(void *arg)
 }
 
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config)
+		       struct reftable_write_options opts)
 {
 	struct reftable_stack *p = reftable_calloc(1, sizeof(*p));
 	struct strbuf list_file_name = STRBUF_INIT;
 	int err = 0;
 
-	if (config.hash_id == 0) {
-		config.hash_id = GIT_SHA1_FORMAT_ID;
-	}
+	if (opts.hash_id == 0)
+		opts.hash_id = GIT_SHA1_FORMAT_ID;
 
 	*dest = NULL;
 
@@ -75,7 +74,7 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir,
 	p->list_file = strbuf_detach(&list_file_name, NULL);
 	p->list_fd = -1;
 	p->reftable_dir = xstrdup(dir);
-	p->config = config;
+	p->opts = opts;
 
 	err = reftable_stack_reload_maybe_reuse(p, 1);
 	if (err < 0) {
@@ -257,7 +256,7 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
 
 	/* success! */
 	err = reftable_new_merged_table(&new_merged, new_tables,
-					new_readers_len, st->config.hash_id);
+					new_readers_len, st->opts.hash_id);
 	if (err < 0)
 		goto done;
 
@@ -580,8 +579,8 @@ static int reftable_stack_init_addition(struct reftable_addition *add,
 		}
 		goto done;
 	}
-	if (st->config.default_permissions) {
-		if (chmod(add->lock_file->filename.buf, st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions) {
+		if (chmod(add->lock_file->filename.buf, st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -680,7 +679,7 @@ int reftable_addition_commit(struct reftable_addition *add)
 	if (err)
 		goto done;
 
-	if (!add->stack->config.disable_auto_compact) {
+	if (!add->stack->opts.disable_auto_compact) {
 		/*
 		 * Auto-compact the stack to keep the number of tables in
 		 * control. It is possible that a concurrent writer is already
@@ -758,9 +757,9 @@ int reftable_addition_add(struct reftable_addition *add,
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
-	if (add->stack->config.default_permissions) {
+	if (add->stack->opts.default_permissions) {
 		if (chmod(get_tempfile_path(tab_file),
-			  add->stack->config.default_permissions)) {
+			  add->stack->opts.default_permissions)) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -768,7 +767,7 @@ int reftable_addition_add(struct reftable_addition *add,
 	tab_fd = get_tempfile_fd(tab_file);
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
-				 &add->stack->config);
+				 &add->stack->opts);
 	err = write_table(wr, arg);
 	if (err < 0)
 		goto done;
@@ -855,14 +854,14 @@ static int stack_compact_locked(struct reftable_stack *st,
 	}
 	tab_fd = get_tempfile_fd(tab_file);
 
-	if (st->config.default_permissions &&
-	    chmod(get_tempfile_path(tab_file), st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions &&
+	    chmod(get_tempfile_path(tab_file), st->opts.default_permissions) < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
-				 &tab_fd, &st->config);
+				 &tab_fd, &st->opts);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
@@ -910,7 +909,7 @@ static int stack_write_compact(struct reftable_stack *st,
 				   st->readers[last]->max_update_index);
 
 	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
-					st->config.hash_id);
+					st->opts.hash_id);
 	if (err < 0) {
 		reftable_free(subtabs);
 		goto done;
@@ -1100,9 +1099,9 @@ static int stack_compact_range(struct reftable_stack *st,
 		goto done;
 	}
 
-	if (st->config.default_permissions) {
+	if (st->opts.default_permissions) {
 		if (chmod(get_lock_file_path(&tables_list_lock),
-			  st->config.default_permissions) < 0) {
+			  st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -1292,7 +1291,7 @@ static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
 {
 	uint64_t *sizes =
 		reftable_calloc(st->merged->stack_len, sizeof(*sizes));
-	int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
+	int version = (st->opts.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
 	int overhead = header_size(version) - 1;
 	int i = 0;
 	for (i = 0; i < st->merged->stack_len; i++) {
@@ -1368,7 +1367,7 @@ static int stack_check_addition(struct reftable_stack *st,
 	int len = 0;
 	int i = 0;
 
-	if (st->config.skip_name_check)
+	if (st->opts.skip_name_check)
 		return 0;
 
 	err = reftable_block_source_from_file(&src, new_tab_name);
@@ -1500,11 +1499,11 @@ int reftable_stack_clean(struct reftable_stack *st)
 int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { .hash_id = hash_id };
+	struct reftable_write_options opts = { .hash_id = hash_id };
 	struct reftable_merged_table *merged = NULL;
 	struct reftable_table table = { NULL };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/stack.h b/reftable/stack.h
index d43efa4760..97d7ebc043 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -20,7 +20,7 @@ struct reftable_stack {
 
 	char *reftable_dir;
 
-	struct reftable_write_options config;
+	struct reftable_write_options opts;
 
 	struct reftable_reader **readers;
 	size_t readers_len;
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 1df3ffce52..3316d55f19 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -150,7 +150,7 @@ static void test_reftable_stack_add_one(void)
 	char *dir = get_tmp_dir(__LINE__);
 	struct strbuf scratch = STRBUF_INIT;
 	int mask = umask(002);
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.default_permissions = 0660,
 	};
 	struct reftable_stack *st = NULL;
@@ -163,7 +163,7 @@ static void test_reftable_stack_add_one(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 	struct stat stat_result = { 0 };
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
@@ -186,7 +186,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, "/tables.list");
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&scratch);
 	strbuf_addstr(&scratch, dir);
@@ -195,7 +195,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, st->readers[0]->name);
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -209,7 +209,7 @@ static void test_reftable_stack_add_one(void)
 
 static void test_reftable_stack_uptodate(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL;
 	struct reftable_stack *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
@@ -232,10 +232,10 @@ static void test_reftable_stack_uptodate(void)
 	/* simulate multi-process access to the same stack
 	   by creating two stacks for the same directory.
 	 */
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st1, &write_test_ref, &ref1);
@@ -257,8 +257,7 @@ static void test_reftable_stack_uptodate(void)
 static void test_reftable_stack_transaction_api(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_addition *add = NULL;
@@ -271,8 +270,7 @@ static void test_reftable_stack_transaction_api(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	reftable_addition_destroy(add);
@@ -301,12 +299,12 @@ static void test_reftable_stack_transaction_api(void)
 static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_addition *add = NULL;
 	struct reftable_stack *st = NULL;
 	int i, n = 20, err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -325,7 +323,7 @@ static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		err = reftable_stack_new_addition(&add, st);
 		EXPECT_ERR(err);
@@ -361,13 +359,13 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)
 		.value_type = REFTABLE_REF_VAL1,
 		.value.val1 = {0x01},
 	};
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_stack *st;
 	struct strbuf table_path = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, write_test_ref, &ref);
@@ -442,8 +440,7 @@ static int write_error(struct reftable_writer *wr, void *arg)
 static void test_reftable_stack_update_index_check(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record ref1 = {
@@ -459,7 +456,7 @@ static void test_reftable_stack_update_index_check(void)
 		.value.symref = "master",
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref1);
@@ -474,12 +471,11 @@ static void test_reftable_stack_update_index_check(void)
 static void test_reftable_stack_lock_failure(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err, i;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
 		err = reftable_stack_add(st, &write_error, &i);
@@ -494,7 +490,7 @@ static void test_reftable_stack_add(void)
 {
 	int i = 0;
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.exact_log_message = 1,
 		.default_permissions = 0660,
 		.disable_auto_compact = 1,
@@ -507,7 +503,7 @@ static void test_reftable_stack_add(void)
 	struct stat stat_result;
 	int N = ARRAY_SIZE(refs);
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -566,7 +562,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, "/tables.list");
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&path);
 	strbuf_addstr(&path, dir);
@@ -575,7 +571,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, st->readers[0]->name);
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -593,7 +589,7 @@ static void test_reftable_stack_add(void)
 static void test_reftable_stack_log_normalize(void)
 {
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		0,
 	};
 	struct reftable_stack *st = NULL;
@@ -617,7 +613,7 @@ static void test_reftable_stack_log_normalize(void)
 		.update_index = 1,
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	input.value.update.message = "one\ntwo";
@@ -650,8 +646,7 @@ static void test_reftable_stack_tombstone(void)
 {
 	int i = 0;
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record refs[2] = { { NULL } };
@@ -660,8 +655,7 @@ static void test_reftable_stack_tombstone(void)
 	struct reftable_ref_record dest = { NULL };
 	struct reftable_log_record log_dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	/* even entries add the refs, odd entries delete them. */
@@ -729,8 +723,7 @@ static void test_reftable_stack_tombstone(void)
 static void test_reftable_stack_hash_id(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 
@@ -740,24 +733,24 @@ static void test_reftable_stack_hash_id(void)
 		.value.symref = "target",
 		.update_index = 1,
 	};
-	struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_FORMAT_ID };
+	struct reftable_write_options opts32 = { .hash_id = GIT_SHA256_FORMAT_ID };
 	struct reftable_stack *st32 = NULL;
-	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_write_options opts_default = { 0 };
 	struct reftable_stack *st_default = NULL;
 	struct reftable_ref_record dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
 	EXPECT_ERR(err);
 
 	/* can't read it with the wrong hash ID. */
-	err = reftable_new_stack(&st32, dir, cfg32);
+	err = reftable_new_stack(&st32, dir, opts32);
 	EXPECT(err == REFTABLE_FORMAT_ERROR);
 
-	/* check that we can read it back with default config too. */
-	err = reftable_new_stack(&st_default, dir, cfg_default);
+	/* check that we can read it back with default opts too. */
+	err = reftable_new_stack(&st_default, dir, opts_default);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_read_ref(st_default, "master", &dest);
@@ -790,8 +783,7 @@ static void test_suggest_compaction_segment_nothing(void)
 static void test_reflog_expire(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct reftable_log_record logs[20] = { { NULL } };
 	int N = ARRAY_SIZE(logs) - 1;
@@ -802,8 +794,7 @@ static void test_reflog_expire(void)
 	};
 	struct reftable_log_record log = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 1; i <= N; i++) {
@@ -866,21 +857,19 @@ static int write_nothing(struct reftable_writer *wr, void *arg)
 
 static void test_empty_add(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	char *dir = get_tmp_dir(__LINE__);
-
 	struct reftable_stack *st2 = NULL;
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_nothing, NULL);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 	clear_dir(dir);
 	reftable_stack_destroy(st);
@@ -899,16 +888,15 @@ static int fastlog2(uint64_t sz)
 
 static void test_reftable_stack_auto_compaction(void)
 {
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.disable_auto_compact = 1,
 	};
 	struct reftable_stack *st = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 100;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -938,13 +926,13 @@ static void test_reftable_stack_auto_compaction(void)
 
 static void test_reftable_stack_add_performs_auto_compaction(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct strbuf refname = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err, i, n = 20;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -959,7 +947,7 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		strbuf_reset(&refname);
 		strbuf_addf(&refname, "branch-%04d", i);
@@ -986,14 +974,13 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 
 static void test_reftable_stack_compaction_concurrent(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1010,7 +997,7 @@ static void test_reftable_stack_compaction_concurrent(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1036,14 +1023,13 @@ static void unclean_stack_close(struct reftable_stack *st)
 
 static void test_reftable_stack_compaction_concurrent_clean(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1060,7 +1046,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1069,7 +1055,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 	unclean_stack_close(st1);
 	unclean_stack_close(st2);
 
-	err = reftable_new_stack(&st3, dir, cfg);
+	err = reftable_new_stack(&st3, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_clean(st3);
-- 
2.45.0



* [PATCH 02/11] reftable: consistently pass write opts as value
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC
  To: git

We sometimes pass the reftable write options by value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Refactor the code to consistently pass the options by value. This keeps
them local to the subsystem they are being passed into and avoids
weirdness like this.
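
To illustrate the difference, here is a minimal, self-contained sketch
in C. The struct, the functions and the default value are hypothetical
stand-ins made up for this example; they are not the reftable API. The
sketch only shows that filling in defaults through a pointer changes
the caller's copy, whereas doing so on a by-value parameter does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the write options; not the real struct. */
struct demo_write_options {
	uint32_t block_size; /* 0 means "use the default" */
};

/* Default assumed for this demo only. */
#define DEMO_DEFAULT_BLOCK_SIZE 4096

/* Pointer variant: filling in defaults modifies the caller's struct. */
static uint32_t demo_writer_init_ptr(struct demo_write_options *opts)
{
	assert(opts != NULL);
	if (opts->block_size == 0)
		opts->block_size = DEMO_DEFAULT_BLOCK_SIZE;
	return opts->block_size;
}

/* Value variant: defaults are applied to a private copy only. */
static uint32_t demo_writer_init_val(struct demo_write_options opts)
{
	if (opts.block_size == 0)
		opts.block_size = DEMO_DEFAULT_BLOCK_SIZE;
	return opts.block_size;
}
```

With a zero-initialized struct, both variants report the default block
size, but only the pointer variant leaves the caller's `block_size` set
to that default afterwards.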

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/merged_test.c     |  6 +++---
 reftable/readwrite_test.c  | 26 +++++++++++++-------------
 reftable/refname_test.c    |  2 +-
 reftable/reftable-writer.h |  2 +-
 reftable/stack.c           |  4 ++--
 reftable/writer.c          | 12 +++++++-----
 6 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/reftable/merged_test.c b/reftable/merged_test.c
index 530fc82d1c..4ac81de8d4 100644
--- a/reftable/merged_test.c
+++ b/reftable/merged_test.c
@@ -42,7 +42,7 @@ static void write_test_table(struct strbuf *buf,
 		}
 	}
 
-	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	reftable_writer_set_limits(w, min, max);
 
 	for (i = 0; i < n; i++) {
@@ -70,7 +70,7 @@ static void write_test_log_table(struct strbuf *buf,
 		.exact_log_message = 1,
 	};
 	struct reftable_writer *w = NULL;
-	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	reftable_writer_set_limits(w, update_index, update_index);
 
 	for (i = 0; i < n; i++) {
@@ -403,7 +403,7 @@ static void test_default_write_opts(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	struct reftable_ref_record rec = {
 		.refname = "master",
diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index a6dbd214c5..27631a041b 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -51,7 +51,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 		.hash_id = hash_id,
 	};
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	struct reftable_ref_record ref = { NULL };
 	int i = 0, n;
 	struct reftable_log_record log = { NULL };
@@ -129,7 +129,7 @@ static void test_log_buffer_size(void)
 					   .message = "commit: 9\n",
 				   } } };
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	/* This tests buffer extension for log compression. Must use a random
 	   hash, to ensure that the compressed part is larger than the original.
@@ -172,7 +172,7 @@ static void test_log_overflow(void)
 		},
 	};
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	memset(msg, 'x', sizeof(msg) - 1);
 	reftable_writer_set_limits(w, update_index, update_index);
@@ -199,7 +199,7 @@ static void test_log_write_read(void)
 	struct reftable_block_source source = { NULL };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	const struct reftable_stats *stats = NULL;
 	reftable_writer_set_limits(w, 0, N);
 	for (i = 0; i < N; i++) {
@@ -288,7 +288,7 @@ static void test_log_zlib_corruption(void)
 	struct reftable_block_source source = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	const struct reftable_stats *stats = NULL;
 	char message[100] = { 0 };
 	int err, i, n;
@@ -526,7 +526,7 @@ static void test_table_refs_for(int indexed)
 
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	struct reftable_iterator it = { NULL };
 	int j;
@@ -619,7 +619,7 @@ static void test_write_empty_table(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_block_source source = { NULL };
 	struct reftable_reader *rd = NULL;
 	struct reftable_ref_record rec = { NULL };
@@ -657,7 +657,7 @@ static void test_write_object_id_min_length(void)
 	};
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
@@ -692,7 +692,7 @@ static void test_write_object_id_length(void)
 	};
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
@@ -726,7 +726,7 @@ static void test_write_empty_key(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.refname = "",
 		.update_index = 1,
@@ -749,7 +749,7 @@ static void test_write_key_order(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record refs[2] = {
 		{
 			.refname = "b",
@@ -792,7 +792,7 @@ static void test_write_multiple_indices(void)
 	struct reftable_reader *reader;
 	int err, i;
 
-	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
 	reftable_writer_set_limits(writer, 1, 1);
 	for (i = 0; i < 100; i++) {
 		struct reftable_ref_record ref = {
@@ -869,7 +869,7 @@ static void test_write_multi_level_index(void)
 	struct reftable_reader *reader;
 	int err;
 
-	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
 	reftable_writer_set_limits(writer, 1, 1);
 	for (size_t i = 0; i < 200; i++) {
 		struct reftable_ref_record ref = {
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
index b9cc62554e..3468253be7 100644
--- a/reftable/refname_test.c
+++ b/reftable/refname_test.c
@@ -30,7 +30,7 @@ static void test_conflict(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record rec = {
 		.refname = "a/b",
 		.value_type = REFTABLE_REF_SYMREF,
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 155bf0bbe2..44cb986465 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -92,7 +92,7 @@ struct reftable_stats {
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts);
+		    void *writer_arg, struct reftable_write_options opts);
 
 /* Set the range of update indices for the records we will add. When writing a
    table into a stack, the min should be at least
diff --git a/reftable/stack.c b/reftable/stack.c
index 3979657793..7b4fff7c9e 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -767,7 +767,7 @@ int reftable_addition_add(struct reftable_addition *add,
 	tab_fd = get_tempfile_fd(tab_file);
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
-				 &add->stack->opts);
+				 add->stack->opts);
 	err = write_table(wr, arg);
 	if (err < 0)
 		goto done;
@@ -861,7 +861,7 @@ static int stack_compact_locked(struct reftable_stack *st,
 	}
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
-				 &tab_fd, &st->opts);
+				 &tab_fd, st->opts);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
diff --git a/reftable/writer.c b/reftable/writer.c
index 1d9ff0fbfa..ad2f2e6c65 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -122,20 +122,22 @@ static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts)
+		    void *writer_arg, struct reftable_write_options opts)
 {
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
 	strbuf_init(&wp->block_writer_data.last_key, 0);
-	options_set_defaults(opts);
-	if (opts->block_size >= (1 << 24)) {
+
+	options_set_defaults(&opts);
+	if (opts.block_size >= (1 << 24)) {
 		/* TODO - error return? */
 		abort();
 	}
+
 	wp->last_key = reftable_empty_strbuf;
-	REFTABLE_CALLOC_ARRAY(wp->block, opts->block_size);
+	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-	wp->opts = *opts;
+	wp->opts = opts;
 	wp->flush = flush_func;
 	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
 
-- 
2.45.0



* [PATCH 03/11] reftable/writer: drop static variable used to initialize strbuf
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC (permalink / raw)
  To: git


We have a static variable in the reftable writer code that is merely
used to initialize the `last_key` of the writer. Convert the code to
instead use `strbuf_init()` and drop the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index ad2f2e6c65..7df6e53699 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -117,15 +117,12 @@ static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
 	w->block_writer->restart_interval = w->opts.restart_interval;
 }
 
-static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
-
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
 		    void *writer_arg, struct reftable_write_options opts)
 {
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
-	strbuf_init(&wp->block_writer_data.last_key, 0);
 
 	options_set_defaults(&opts);
 	if (opts.block_size >= (1 << 24)) {
@@ -133,7 +130,8 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		abort();
 	}
 
-	wp->last_key = reftable_empty_strbuf;
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	strbuf_init(&wp->last_key, 0);
 	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-- 
2.45.0



* [PATCH 04/11] reftable/writer: improve error when passed an invalid block size
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-05-02  6:51 ` [PATCH 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC (permalink / raw)
  To: git


The reftable format only supports block sizes up to 16MB. When the
writer is being passed a value bigger than that it simply calls
abort(3P), which isn't all that helpful due to the lack of a proper
error message.

Improve this by calling `BUG()` instead.
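
For readers wondering where the limit comes from: as far as I understand
the format, a block's length is stored in a 24-bit field, so anything at
or above 1 << 24 cannot be represented. The following is only a sketch of
that reasoning, not the reftable code itself. All `toy_*` names are made
up for illustration, and the fprintf()/abort() pair merely stands in for
what `BUG()` provides.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Largest value representable in a 24-bit length field. */
#define TOY_MAX_BLOCK_SIZE ((1u << 24) - 1)

/* Store the low 24 bits of `v` in big-endian order (hypothetical helper). */
static void toy_put_be24(uint8_t out[3], uint32_t v)
{
	out[0] = (uint8_t)((v >> 16) & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)(v & 0xff);
}

/* Read a 24-bit big-endian value back (hypothetical helper). */
static uint32_t toy_get_be24(const uint8_t in[3])
{
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | (uint32_t)in[2];
}

/* Fail with a message that names the cause instead of a bare abort(). */
static void toy_check_block_size(uint32_t block_size)
{
	if (block_size >= (1u << 24)) {
		fprintf(stderr, "BUG: configured block size exceeds 16MB\n");
		abort();
	}
}
```

Writing 1 << 24 through toy_put_be24() and reading it back yields 0,
which is why such a value has to be rejected before any block is written.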

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index 7df6e53699..374b7d15ed 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -125,10 +125,8 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
 
 	options_set_defaults(&opts);
-	if (opts.block_size >= (1 << 24)) {
-		/* TODO - error return? */
-		abort();
-	}
+	if (opts.block_size >= (1 << 24))
+		BUG("configured block size exceeds 16MB");
 
 	strbuf_init(&wp->block_writer_data.last_key, 0);
 	strbuf_init(&wp->last_key, 0);
-- 
2.45.0



* [PATCH 05/11] reftable/dump: support dumping a table's block structure
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-05-02  6:51 ` [PATCH 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-02  6:51 ` [PATCH 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC (permalink / raw)
  To: git


We're about to introduce new configs that will allow users to have more
control over how exactly reftables are written. To verify that these
configs are effective we will need to take a peek into the actual blocks
written by the reftable backend.

Introduce a new mode to the dumping logic that prints out the block
structure. This logic can be invoked via `test-tool dump-reftable -b`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c                   |   8 ++-
 reftable/reader.c                 |  63 ++++++++++++++++++
 reftable/reftable-reader.h        |   2 +
 t/t0613-reftable-write-options.sh | 102 ++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100755 t/t0613-reftable-write-options.sh

diff --git a/reftable/dump.c b/reftable/dump.c
index 9c770a10cc..24476cc2a9 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -48,6 +48,7 @@ static void print_help(void)
 	printf("usage: dump [-cst] arg\n\n"
 	       "options: \n"
 	       "  -c compact\n"
+	       "  -b dump blocks\n"
 	       "  -t dump table\n"
 	       "  -s dump stack\n"
 	       "  -6 sha256 hash format\n"
@@ -58,6 +59,7 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
+	int opt_dump_blocks = 0;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
@@ -67,6 +69,8 @@ int reftable_dump_main(int argc, char *const *argv)
 	for (; argc > 1; argv++, argc--)
 		if (*argv[1] != '-')
 			break;
+		else if (!strcmp("-b", argv[1]))
+			opt_dump_blocks = 1;
 		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
 		else if (!strcmp("-6", argv[1]))
@@ -88,7 +92,9 @@ int reftable_dump_main(int argc, char *const *argv)
 
 	arg = argv[1];
 
-	if (opt_dump_table) {
+	if (opt_dump_blocks) {
+		err = reftable_reader_print_blocks(arg);
+	} else if (opt_dump_table) {
 		err = reftable_reader_print_file(arg);
 	} else if (opt_dump_stack) {
 		err = reftable_stack_print_directory(arg, opt_hash_id);
diff --git a/reftable/reader.c b/reftable/reader.c
index 481dff10d4..f23c8523db 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -856,3 +856,66 @@ int reftable_reader_print_file(const char *tablename)
 	reftable_reader_free(r);
 	return err;
 }
+
+int reftable_reader_print_blocks(const char *tablename)
+{
+	struct {
+		const char *name;
+		int type;
+	} sections[] = {
+		{
+			.name = "ref",
+			.type = BLOCK_TYPE_REF,
+		},
+		{
+			.name = "obj",
+			.type = BLOCK_TYPE_OBJ,
+		},
+		{
+			.name = "log",
+			.type = BLOCK_TYPE_LOG,
+		},
+	};
+	struct reftable_block_source src = { 0 };
+	struct table_iter ti = TABLE_ITER_INIT;
+	struct reftable_reader *r = NULL;
+	size_t i;
+	int err;
+
+	err = reftable_block_source_from_file(&src, tablename);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		goto done;
+
+	printf("header:\n");
+	printf("  block_size: %d\n", r->block_size);
+
+	for (i = 0; i < ARRAY_SIZE(sections); i++) {
+		err = reader_start(r, &ti, sections[i].type, 0);
+		if (err < 0)
+			goto done;
+		if (err > 0)
+			continue;
+
+		printf("%s:\n", sections[i].name);
+
+		while (1) {
+			printf("  - length: %u\n", ti.br.block_len);
+			printf("    restarts: %u\n", ti.br.restart_count);
+
+			err = table_iter_next_block(&ti);
+			if (err < 0)
+				goto done;
+			if (err > 0)
+				break;
+		}
+	}
+
+done:
+	reftable_reader_free(r);
+	table_iter_close(&ti);
+	return err;
+}
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
index 4a4bc2fdf8..4a04857773 100644
--- a/reftable/reftable-reader.h
+++ b/reftable/reftable-reader.h
@@ -97,5 +97,7 @@ void reftable_table_from_reader(struct reftable_table *tab,
 
 /* print table onto stdout for debugging. */
 int reftable_reader_print_file(const char *tablename);
+/* print blocks onto stdout for debugging. */
+int reftable_reader_print_blocks(const char *tablename);
 
 #endif
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
new file mode 100755
index 0000000000..462980c37c
--- /dev/null
+++ b/t/t0613-reftable-write-options.sh
@@ -0,0 +1,102 @@
+#!/bin/sh
+
+test_description='reftable write options'
+
+GIT_TEST_DEFAULT_REF_FORMAT=reftable
+export GIT_TEST_DEFAULT_REF_FORMAT
+# Disable auto-compaction for all tests as we explicitly control repacking of
+# refs.
+GIT_TEST_REFTABLE_AUTOCOMPACTION=false
+export GIT_TEST_REFTABLE_AUTOCOMPACTION
+# Block sizes depend on the hash function, so we force SHA1 here.
+GIT_TEST_DEFAULT_HASH=sha1
+export GIT_TEST_DEFAULT_HASH
+# Block sizes also depend on the actual refs we write, so we force "master" to
+# be the default initial branch name.
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'default write options' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		log:
+		  - length: 262
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'disabled reflog writes no log blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'many refs results in multiple blocks' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 200)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 4049
+		    restarts: 11
+		  - length: 1136
+		    restarts: 3
+		log:
+		  - length: 4041
+		    restarts: 4
+		  - length: 4015
+		    restarts: 3
+		  - length: 4014
+		    restarts: 3
+		  - length: 4012
+		    restarts: 3
+		  - length: 3289
+		    restarts: 3
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_done
-- 
2.45.0



* [PATCH 06/11] refs/reftable: allow configuring block size
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2024-05-02  6:51 ` [PATCH 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
@ 2024-05-02  6:51 ` Patrick Steinhardt
  2024-05-10  9:29   ` Karthik Nayak
  2024-05-02  6:52 ` [PATCH 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:51 UTC (permalink / raw)
  To: git


Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.
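
The intended semantics are: a value of 0 selects the default of 4096
bytes, and anything above 16777215 is rejected. A minimal sketch of that
rule follows. It is not the code from this patch: `toy_parse_block_size`
and the `TOY_*` constants are hypothetical, and unlike Git's config parser
this toy does not understand unit suffixes such as "k" or "m".

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_DEFAULT_BLOCK_SIZE 4096u
#define TOY_MAX_BLOCK_SIZE 16777215u

/*
 * Parse `value` as a block size. Returns 0 and stores the effective size in
 * `*out` on success, -1 if the value is malformed or exceeds the maximum.
 */
static int toy_parse_block_size(const char *value, uint32_t *out)
{
	char *end;
	unsigned long parsed;

	if (!value || !*value || *value == '-')
		return -1;
	errno = 0;
	parsed = strtoul(value, &end, 10);
	if (errno || *end != '\0')
		return -1;
	if (parsed > TOY_MAX_BLOCK_SIZE)
		return -1;

	/* Zero means "not configured", so fall back to the default. */
	*out = parsed ? (uint32_t)parsed : TOY_DEFAULT_BLOCK_SIZE;
	return 0;
}
```

With this, "100" yields 100, "0" yields 4096, and "16777216" is refused.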

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config.txt          |  2 +
 Documentation/config/reftable.txt | 14 ++++++
 refs/reftable-backend.c           | 32 +++++++++++++-
 t/t0613-reftable-write-options.sh | 72 +++++++++++++++++++++++++++++++
 4 files changed, 119 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/reftable.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 70b448b132..fa1469e5e7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -497,6 +497,8 @@ include::config/rebase.txt[]
 
 include::config/receive.txt[]
 
+include::config/reftable.txt[]
+
 include::config/remote.txt[]
 
 include::config/remotes.txt[]
diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
new file mode 100644
index 0000000000..fa7c4be014
--- /dev/null
+++ b/Documentation/config/reftable.txt
@@ -0,0 +1,14 @@
+reftable.blockSize::
+	The size in bytes used by the reftable backend when writing blocks.
+	The block size is determined by the writer, and does not have to be a
+	power of 2. The block size must be larger than the longest reference
+	name or log entry used in the repository, as references cannot span
+	blocks.
++
+Powers of two that are friendly to the virtual memory system or
+filesystem (such as 4kB or 8kB) are recommended. Larger sizes (64kB) can
+yield better compression, with a possible increased cost incurred by
+readers during access.
++
+The largest block size is `16777215` bytes (15.99 MiB). The default value is
+`4096` bytes (4kB). A value of `0` will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1cda48c504..c2c47a3bc1 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
 #include "../git-compat-util.h"
 #include "../abspath.h"
 #include "../chdir-notify.h"
+#include "../config.h"
 #include "../environment.h"
 #include "../gettext.h"
 #include "../hash.h"
@@ -230,6 +231,23 @@ static int read_ref_without_reload(struct reftable_stack *stack,
 	return ret;
 }
 
+static int reftable_be_config(const char *var, const char *value,
+			      const struct config_context *ctx,
+			      void *_opts)
+{
+	struct reftable_write_options *opts = _opts;
+
+	if (!strcmp(var, "reftable.blocksize")) {
+		unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
+		if (block_size > 16777215)
+			die("reftable block size cannot exceed 16MB");
+		opts->block_size = block_size;
+		return 0;
+	}
+
+	return 0;
+}
+
 static struct ref_store *reftable_be_init(struct repository *repo,
 					  const char *gitdir,
 					  unsigned int store_flags)
@@ -245,12 +263,24 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
 	strmap_init(&refs->worktree_stacks);
 	refs->store_flags = store_flags;
-	refs->write_options.block_size = 4096;
+
 	refs->write_options.hash_id = repo->hash_algo->format_id;
 	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
 	refs->write_options.disable_auto_compact =
 		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
 
+	git_config(reftable_be_config, &refs->write_options);
+
+	/*
+	 * It is somewhat unfortunate that we have to mirror the default block
+	 * size of the reftable library here. But given that the write options
+	 * wouldn't be updated by the library here, and given that we require
+	 * the proper block size to trim reflog messages so that they fit, we
+	 * must set up a proper value here.
+	 */
+	if (!refs->write_options.block_size)
+		refs->write_options.block_size = 4096;
+
 	/*
 	 * Set up the main reftable stack that is hosted in GIT_COMMON_DIR.
 	 * This stack contains both the shared and the main worktree refs.
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 462980c37c..8bdbc6ec70 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -99,4 +99,76 @@ test_expect_success 'many refs results in multiple blocks' '
 	)
 '
 
+test_expect_success 'tiny block size leads to error' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		error: unable to compact stack: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=50 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'small block size leads to multiple ref blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 74
+		    restarts: 1
+		  - length: 38
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'small block size fails with large reflog message' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		perl -e "print \"a\" x 500" >logmsg &&
+		cat >expect <<-EOF &&
+		fatal: update_ref failed for ref ${SQ}refs/heads/logme${SQ}: reftable: transaction failure: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=100 \
+			update-ref -m "$(cat logmsg)" refs/heads/logme HEAD 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'block size exceeding maximum supported size' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 16MB
+		EOF
+		test_must_fail git -c reftable.blockSize=16777216 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.0



* [PATCH 07/11] reftable: use `uint16_t` to track restart interval
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2024-05-02  6:51 ` [PATCH 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
@ 2024-05-02  6:52 ` Patrick Steinhardt
  2024-05-02  6:52 ` [PATCH 08/11] refs/reftable: allow configuring " Patrick Steinhardt
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:52 UTC (permalink / raw)
  To: git


The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use an `uint16_t` instead.
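
One consequence of the narrower type is that callers holding a wider
integer need to check the range before assigning, because an unchecked
conversion wraps silently (65536 becomes 0). A small sketch of such a
check, with the hypothetical helper `toy_to_restart_interval`:

```c
#include <stdint.h>

/*
 * Narrow `value` to uint16_t. Returns 0 on success, or -1 and leaves `*out`
 * untouched if the value does not fit.
 */
static int toy_to_restart_interval(unsigned long value, uint16_t *out)
{
	if (value > UINT16_MAX)
		return -1;
	*out = (uint16_t)value;
	return 0;
}
```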

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.h           | 2 +-
 reftable/reftable-writer.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/reftable/block.h b/reftable/block.h
index ea4384a7e2..cd5577105d 100644
--- a/reftable/block.h
+++ b/reftable/block.h
@@ -25,7 +25,7 @@ struct block_writer {
 	uint32_t header_off;
 
 	/* How often to restart keys. */
-	int restart_interval;
+	uint16_t restart_interval;
 	int hash_size;
 
 	/* Offset of next uint8_t to write. */
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 44cb986465..4cd8ebe6c7 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -28,7 +28,7 @@ struct reftable_write_options {
 	unsigned skip_index_objects : 1;
 
 	/* how often to write complete keys in each block. */
-	int restart_interval;
+	uint16_t restart_interval;
 
 	/* 4-byte identifier ("sha1", "s256") of the hash.
 	 * Defaults to SHA1 if unset
-- 
2.45.0



* [PATCH 08/11] refs/reftable: allow configuring restart interval
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2024-05-02  6:52 ` [PATCH 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
@ 2024-05-02  6:52 ` Patrick Steinhardt
  2024-05-02  6:52 ` [PATCH 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:52 UTC (permalink / raw)
  To: git


Add a new option `reftable.restartInterval` that allows the user to
control the restart interval used by the reftable library when writing
records.
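
To illustrate the trade-off, here is a rough sketch of how the restart
interval interacts with prefix compression. It is not the reftable
encoder. The cost model is an assumption made up for this example (2 bytes
of length fields per record, 3 bytes per restart table entry), all `toy_*`
names are hypothetical, and `interval` must be non-zero.

```c
#include <stddef.h>
#include <string.h>

/* Length of the longest common prefix of `a` and `b`. */
static size_t toy_common_prefix(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Estimate the bytes needed to store the sorted `keys` when a full key is
 * written every `interval` records and all other records only store the
 * suffix that differs from the previous key.
 */
static size_t toy_encoded_size(const char **keys, size_t nr, size_t interval,
			       size_t *restarts)
{
	size_t total = 0, i;

	*restarts = 0;
	for (i = 0; i < nr; i++) {
		size_t prefix = 0;

		if (i % interval == 0)
			(*restarts)++;	/* restart point: store the full key */
		else
			prefix = toy_common_prefix(keys[i - 1], keys[i]);
		total += 2 + (strlen(keys[i]) - prefix);
	}
	return total + 3 * *restarts;
}
```

For the four keys refs/heads/branch-1 to refs/heads/branch-4, this model
gives 96 bytes with an interval of 1 and 33 bytes with an interval of 16.
The smaller interval costs space but gives readers more entry points for
the binary search.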

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 19 ++++++++++++++
 refs/reftable-backend.c           |  6 +++++
 t/t0613-reftable-write-options.sh | 43 +++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index fa7c4be014..16b915c75e 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -12,3 +12,22 @@ readers during access.
 +
 The largest block size is `16777215` bytes (15.99 MiB). The default value is
 `4096` bytes (4kB). A value of `0` will use the default value.
+
+reftable.restartInterval::
+	The interval at which to create restart points. The reftable backend
+	determines the restart points at file creation. The process is
+	arbitrary, but every 16 or 64 records is recommended. Every 16 may be
+	more suitable for smaller block sizes (4k or 8k), every 64 for larger
+	block sizes (64k).
++
+More frequent restart points reduce prefix compression and increase
+space consumed by the restart table, both of which increase file size.
++
+Less frequent restart points make prefix compression more effective,
+decreasing overall file size, with increased penalties for readers
+walking through more records after the binary search step.
++
+A maximum of `65535` restart points per block is supported.
++
+The default value is to create restart points every 16 records. A value of `0`
+will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index c2c47a3bc1..a786143de2 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -243,6 +243,12 @@ static int reftable_be_config(const char *var, const char *value,
 			die("reftable block size cannot exceed 16MB");
 		opts->block_size = block_size;
 		return 0;
+	} else if (!strcmp(var, "reftable.restartinterval")) {
+		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
+		if (restart_interval > UINT16_MAX)
+			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
+		opts->restart_interval = restart_interval;
+		return 0;
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 8bdbc6ec70..e0a5b26f58 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -171,4 +171,47 @@ test_expect_success 'block size exceeding maximum supported size' '
 	)
 '
 
+test_expect_success 'restart interval at every single record' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 10)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.restartInterval=1 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 566
+		    restarts: 13
+		log:
+		  - length: 1393
+		    restarts: 12
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'restart interval exceeding maximum supported interval' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 65535
+		EOF
+		test_must_fail git -c reftable.restartInterval=65536 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.0



* [PATCH 09/11] refs/reftable: allow disabling writing the object index
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2024-05-02  6:52 ` [PATCH 08/11] refs/reftable: allow configuring " Patrick Steinhardt
@ 2024-05-02  6:52 ` Patrick Steinhardt
  2024-05-02  6:52 ` [PATCH 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:52 UTC (permalink / raw)
  To: git


Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.

This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to save some precious disk space.

Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.
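
To make the idea of a reverse mapping concrete, here is a toy sketch. It
is only an illustration of the concept: the real "obj" blocks have a
different on-disk layout, and `struct toy_ref` and the `toy_*` functions
are hypothetical. The index is simply the list of refs sorted by object
ID, which allows a binary search for all refs pointing at a given object.

```c
#include <stdlib.h>
#include <string.h>

struct toy_ref {
	const char *refname;
	const char *oid;	/* hex object ID */
};

/* Order refs by object ID so that equal IDs end up adjacent. */
static int toy_cmp_by_oid(const void *a, const void *b)
{
	const struct toy_ref *x = a, *y = b;
	return strcmp(x->oid, y->oid);
}

/* Build the reverse index in place by sorting on the object ID. */
static void toy_build_obj_index(struct toy_ref *refs, size_t nr)
{
	qsort(refs, nr, sizeof(*refs), toy_cmp_by_oid);
}

/*
 * Find all refs pointing at `oid` in an index built by toy_build_obj_index().
 * Returns the number of matches and stores the first match in `*first`, or
 * NULL if there is none.
 */
static size_t toy_refs_for(const struct toy_ref *refs, size_t nr,
			   const char *oid, const struct toy_ref **first)
{
	size_t lo = 0, hi = nr, count = 0;

	/* Binary search for the first entry whose oid is >= `oid`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(refs[mid].oid, oid) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}

	*first = NULL;
	while (lo + count < nr && !strcmp(refs[lo + count].oid, oid)) {
		if (!count)
			*first = &refs[lo];
		count++;
	}
	return count;
}
```

Without such an index, answering "which refs point at this object?"
requires scanning every ref record, which is the cost the "obj" section
is meant to avoid.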

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt |  6 +++
 refs/reftable-backend.c           |  3 ++
 t/t0613-reftable-write-options.sh | 69 +++++++++++++++++++++++++++++++
 3 files changed, 78 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 16b915c75e..6e4466f3c5 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -31,3 +31,9 @@ A maximum of `65535` restart points per block is supported.
 +
 The default value is to create restart points every 16 records. A value of `0`
 will use the default value.
+
+reftable.indexObjects::
+	Whether the reftable backend shall write object blocks. Object blocks
+	are a reverse mapping of object ID to the references pointing to them.
++
+The default value is `true`.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index a786143de2..5298fcef6e 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -249,6 +249,9 @@ static int reftable_be_config(const char *var, const char *value,
 			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
 		opts->restart_interval = restart_interval;
 		return 0;
+	} else if (!strcmp(var, "reftable.indexobjects")) {
+		opts->skip_index_objects = !git_config_bool(var, value);
+		return 0;
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index e0a5b26f58..e2708e11d5 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -214,4 +214,73 @@ test_expect_success 'restart interval exceeding maximum supported interval' '
 	)
 '
 
+test_expect_success 'object index gets written by default with ref index' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		obj:
+		  - length: 11
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'object index can be disabled' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 -c reftable.indexObjects=false pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 10/11] reftable: make the compaction factor configurable
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2024-05-02  6:52 ` [PATCH 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
@ 2024-05-02  6:52 ` Patrick Steinhardt
  2024-05-10  9:55   ` Karthik Nayak
  2024-05-02  6:52 ` [PATCH 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:52 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 4047 bytes --]

When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/reftable-writer.h |  6 ++++++
 reftable/stack.c           | 13 +++++++++----
 reftable/stack.h           |  3 ++-
 reftable/stack_test.c      |  4 ++--
 4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 4cd8ebe6c7..155457b042 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -49,6 +49,12 @@ struct reftable_write_options {
 
 	/* boolean: Prevent auto-compaction of tables. */
 	unsigned disable_auto_compact : 1;
+
+	/*
+	 * Geometric sequence factor used by auto-compaction to decide which
+	 * tables to compact. Defaults to 2 if unset.
+	 */
+	uint8_t auto_compaction_factor;
 };
 
 /* reftable_block_stats holds statistics for a single block type */
diff --git a/reftable/stack.c b/reftable/stack.c
index 7b4fff7c9e..6b0f8e13e7 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -1215,12 +1215,16 @@ static int segment_size(struct segment *s)
 	return s->end - s->start;
 }
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor)
 {
 	struct segment seg = { 0 };
 	uint64_t bytes;
 	size_t i;
 
+	if (!factor)
+		factor = 2;
+
 	/*
 	 * If there are no tables or only a single one then we don't have to
 	 * compact anything. The sequence is geometric by definition already.
@@ -1252,7 +1256,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 	 * 	64, 32, 16, 8, 4, 3, 1
 	 */
 	for (i = n - 1; i > 0; i--) {
-		if (sizes[i - 1] < sizes[i] * 2) {
+		if (sizes[i - 1] < sizes[i] * factor) {
 			seg.end = i + 1;
 			bytes = sizes[i];
 			break;
@@ -1278,7 +1282,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 		uint64_t curr = bytes;
 		bytes += sizes[i - 1];
 
-		if (sizes[i - 1] < curr * 2) {
+		if (sizes[i - 1] < curr * factor) {
 			seg.start = i - 1;
 			seg.bytes = bytes;
 		}
@@ -1304,7 +1308,8 @@ int reftable_stack_auto_compact(struct reftable_stack *st)
 {
 	uint64_t *sizes = stack_table_sizes_for_compaction(st);
 	struct segment seg =
-		suggest_compaction_segment(sizes, st->merged->stack_len);
+		suggest_compaction_segment(sizes, st->merged->stack_len,
+					   st->opts.auto_compaction_factor);
 	reftable_free(sizes);
 	if (segment_size(&seg) > 0)
 		return stack_compact_range_stats(st, seg.start, seg.end - 1,
diff --git a/reftable/stack.h b/reftable/stack.h
index 97d7ebc043..5b45cff4f7 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -35,6 +35,7 @@ struct segment {
 	uint64_t bytes;
 };
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n);
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor);
 
 #endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 3316d55f19..f6c11ef18d 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -767,7 +767,7 @@ static void test_suggest_compaction_segment(void)
 {
 	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
 	struct segment min =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(min.start == 1);
 	EXPECT(min.end == 10);
 }
@@ -776,7 +776,7 @@ static void test_suggest_compaction_segment_nothing(void)
 {
 	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
 	struct segment result =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(result.start == result.end);
 }
 
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread
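
The hunks in the patch above only show fragments of the segment selection. As a rough, self-contained sketch of how a configurable factor could drive it: the names `segment_sketch` and `suggest_segment_sketch` are hypothetical, and the early return and the header of the second loop are assumptions filled in around the lines visible in the diff, not a copy of reftable/stack.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct segment`; covers tables [start, end). */
struct segment_sketch {
	size_t start, end;
	uint64_t bytes;
};

static struct segment_sketch suggest_segment_sketch(const uint64_t *sizes,
						    size_t n, uint8_t factor)
{
	struct segment_sketch seg = { 0 };
	uint64_t bytes = 0;
	size_t i;

	/* A factor of 0 means "unset" and falls back to the default of 2. */
	if (!factor)
		factor = 2;

	/* Zero or one table is trivially a geometric sequence. */
	if (n <= 1)
		return seg;

	/*
	 * Walk from the newest table towards the oldest and find the first
	 * pair that violates sizes[i - 1] >= sizes[i] * factor. That table
	 * becomes the end of the segment.
	 */
	for (i = n - 1; i > 0; i--) {
		if (sizes[i - 1] < sizes[i] * factor) {
			seg.end = i + 1;
			bytes = sizes[i];
			break;
		}
	}

	/*
	 * Extend the segment towards older tables for as long as the
	 * accumulated size would still violate the sequence.
	 */
	for (; i > 0; i--) {
		uint64_t curr = bytes;
		bytes += sizes[i - 1];

		if (sizes[i - 1] < curr * factor) {
			seg.start = i - 1;
			seg.bytes = bytes;
		}
	}

	return seg;
}
```

With the sizes used in test_suggest_compaction_segment() and a factor of 2, this sketch selects tables 1 through 9, which agrees with the expectations in the patch. With sizes { 64, 32, 16, 8, 4, 2 } nothing is selected at factor 2, whereas a factor of 3 would select all six tables.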

* [PATCH 11/11] refs/reftable: allow configuring geometric factor
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2024-05-02  6:52 ` [PATCH 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
@ 2024-05-02  6:52 ` Patrick Steinhardt
  2024-05-10  9:58   ` Karthik Nayak
  2024-05-02  7:29 ` [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  6:52 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1861 bytes --]

Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 10 ++++++++++
 refs/reftable-backend.c           |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 6e4466f3c5..1c381dda04 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -37,3 +37,13 @@ reftable.indexObjects::
 	are a reverse mapping of object ID to the references pointing to them.
 +
 The default value is `true`.
+
+reftable.geometricFactor::
+	Whenever the reftable backend appends a new table to the table it
+	performs auto compaction to ensure that there is only a handful of
+	tables. The backend does this by ensuring that tables form a geometric
+	sequence regarding the respective sizes of each table.
++
+By default, the geometric sequence uses a factor of 2, meaning that for any
+table, the next-biggest table must at least be twice as big. A maximum factor
+of 255 is supported.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 5298fcef6e..657d227c12 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -252,6 +252,11 @@ static int reftable_be_config(const char *var, const char *value,
 	} else if (!strcmp(var, "reftable.indexobjects")) {
 		opts->skip_index_objects = !git_config_bool(var, value);
 		return 0;
+	} else if (!strcmp(var, "reftable.geometricfactor")) {
+		unsigned long factor = git_config_ulong(var, value, ctx->kvi);
+		if (factor > UINT8_MAX)
+			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
+		opts->auto_compaction_factor = factor;
 	}
 
 	return 0;
-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread
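
The range check in the hunk above narrows an `unsigned long` into the `uint8_t` write option. A minimal standalone sketch of that pattern follows. `parse_geometric_factor` is a hypothetical helper, `strtoul` is only a stand-in for git_config_ulong(), and the sketch returns an error code where the patch calls die().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal factor and store it in *out.
 * Returns 0 on success and -1 if the value is malformed or does not fit
 * into a uint8_t.
 */
static int parse_geometric_factor(const char *value, uint8_t *out)
{
	char *end;
	unsigned long factor;

	if (!value || !*value)
		return -1;

	errno = 0;
	factor = strtoul(value, &end, 10);
	if (errno || *end != '\0')
		return -1;

	/* Mirrors the `factor > UINT8_MAX` check from the patch. */
	if (factor > UINT8_MAX)
		return -1;

	*out = (uint8_t)factor;
	return 0;
}
```

Under these assumptions the largest accepted value is UINT8_MAX, that is 255, and a value of 0 is passed through unchanged so that the library can substitute its default.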

* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (10 preceding siblings ...)
  2024-05-02  6:52 ` [PATCH 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
@ 2024-05-02  7:29 ` Patrick Steinhardt
  2024-05-03 20:38   ` Junio C Hamano
  2024-05-06 21:29 ` Justin Tobler
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-02  7:29 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1684 bytes --]

On Thu, May 02, 2024 at 08:51:27AM +0200, Patrick Steinhardt wrote:
> Hi,
> 
> the reftable format has some flexibility with regards to how exactly it
> writes the respective tables:
> 
>   - The block size allows you to control how large each block is
>     supposed to be. The bigger the block, the more records you can fit
>     into it.
> 
>   - Restart intervals control how often a restart point is written that
>     breaks prefix compression. The lower the interval, the less disk
>     space savings you get.
> 
>   - Object indices can be enabled or disabled. These are optional and
>     Git doesn't use them right now, so disabling them may be a sensible
>     thing to do if you want to save some disk space.
> 
>   - The geometric factor controls when we compact tables during auto
>     compaction.
> 
> This patch series exposes all of these via a new set of configs so that
> they can be tweaked by the user as-needed. It's not expected that those
> are really of much importance for the "normal" user -- the defaults
> should be good enough. But for edge cases (huge repos with millions of
> refs) and for hosting providers these knobs can be helpful.
> 
> This patch series applies on top of d4cc1ec35f (Start the 2.46 cycle,
> 2024-04-30).

Ugh. I actually intended to pull in ps/reftable-write-optim as a
dependency because I know it causes conflicts. But I screwed this up as
I thought that the topic was merged into "master" already, even though
it has only hit "next".

I'll refrain from sending a new version immediately though and will wait
for reviews first. Once those are in I will pull in the above topic.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-02  7:29 ` [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
@ 2024-05-03 20:38   ` Junio C Hamano
  2024-05-06  6:51     ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-03 20:38 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> Ugh. I actually intended to pull in ps/reftable-write-optim as a
> dependency because I know it causes conflicts. But I screwed this up as
> I thought that the topic was merged into "master" already, even though
> it has only hit "next".
>
> I'll refrain from sending a new version immediately though and will wait
> for reviews first. Once those are in I will pull in the above topic.

I saw [01/11] has changes to stack_check_addition() that disappeared
by the other topic, and the resolution is to remove the function (as
nobody calls it anyway).  Also [02/11] has changes to refname_test.c
that can be resolved by just removing the file.

If other semantic conflict resolutions are needed, such a rebase is
appreciated, but otherwise there is no strong need to rebase.

The mention of the name of the other topic that has interactions was
very helpful.

Thanks.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-03 20:38   ` Junio C Hamano
@ 2024-05-06  6:51     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-06  6:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1147 bytes --]

On Fri, May 03, 2024 at 01:38:51PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Ugh. I actually intended to pull in ps/reftable-write-optim as a
> > dependency because I know it causes conflicts. But I screwed this up as
> > I thought that the topic was merged into "master" already, even though
> > it has only hit "next".
> >
> > I'll refrain from sending a new version immediately though and will wait
> > for reviews first. Once those are in I will pull in the above topic.
> 
> I saw [01/11] has changes to stack_check_addition() that disappeared
> by the other topic, and the resolution is to remove the function (as
> nobody calls it anyway).  Also [02/11] has changes to refname_test.c
> that can be resolved by just removing the file.

Yup, sounds right to me.

> If other semantic conflict resolutions are needed, such a rebase is
> appreciated, but otherwise there is no strong need to rebase.

No other semantic conflicts I'm aware of, no. Thanks!

Patrick

> The mention of the name of the other topic that has interactions was
> very helpful.
> 
> Thanks.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (11 preceding siblings ...)
  2024-05-02  7:29 ` [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
@ 2024-05-06 21:29 ` Justin Tobler
  2024-05-10 10:00 ` Karthik Nayak
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 78+ messages in thread
From: Justin Tobler @ 2024-05-06 21:29 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/05/02 08:51AM, Patrick Steinhardt wrote:
> Hi,
> 
> the reftable format has some flexibility with regards to how exactly it
> writes the respective tables:
> 
>   - The block size allows you to control how large each block is
>     supposed to be. The bigger the block, the more records you can fit
>     into it.
> 
>   - Restart intervals control how often a restart point is written that
>     breaks prefix compression. The lower the interval, the less disk
>     space savings you get.
> 
>   - Object indices can be enabled or disabled. These are optional and
>     Git doesn't use them right now, so disabling them may be a sensible
>     thing to do if you want to save some disk space.
> 
>   - The geometric factor controls when we compact tables during auto
>     compaction.
> 
> This patch series exposes all of these via a new set of configs so that
> they can be tweaked by the user as-needed. It's not expected that those
> are really of much importance for the "normal" user -- the defaults
> should be good enough. But for edge cases (huge repos with millions of
> refs) and for hosting providers these knobs can be helpful.
> 
> This patch series applies on top of d4cc1ec35f (Start the 2.46 cycle,
> 2024-04-30).

I have reviewed these patches and I have nothing to add. Thanks Patrick!

-Justin

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-02  6:51 ` [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
@ 2024-05-10  9:00   ` Karthik Nayak
  2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10  9:00 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1061 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Throughout the reftable library the `reftable_write_options` are
> sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
> consistently use `opts` to avoid confusion.
>

I think one location was missed:

diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 3316d55f19..40eb793b3c 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -396,7 +396,7 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)

 static void test_reftable_stack_validate_refname(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	char *dir = get_tmp_dir(__LINE__);
@@ -410,7 +410,7 @@ static void test_reftable_stack_validate_refname(void)
 	};
 	char *additions[] = { "a", "a/b/c" };

-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);

 	err = reftable_stack_add(st, &write_test_ref, &ref);

Rest of the patch looks good. Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH 06/11] refs/reftable: allow configuring block size
  2024-05-02  6:51 ` [PATCH 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
@ 2024-05-10  9:29   ` Karthik Nayak
  2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10  9:29 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1946 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> @@ -230,6 +231,23 @@ static int read_ref_without_reload(struct reftable_stack *stack,
>  	return ret;
>  }
>
> +static int reftable_be_config(const char *var, const char *value,
> +			      const struct config_context *ctx,
> +			      void *_opts)
> +{
> +	struct reftable_write_options *opts = _opts;
> +
> +	if (!strcmp(var, "reftable.blocksize")) {
> +		unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
> +		if (block_size > 16777215)
> +			die("reftable block size cannot exceed 16MB");
> +		opts->block_size = block_size;
> +		return 0;

nit: unnecessary return

> +	}
> +
> +	return 0;
> +}
> +
>  static struct ref_store *reftable_be_init(struct repository *repo,
>  					  const char *gitdir,
>  					  unsigned int store_flags)
> @@ -245,12 +263,24 @@ static struct ref_store *reftable_be_init(struct repository *repo,
>  	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
>  	strmap_init(&refs->worktree_stacks);
>  	refs->store_flags = store_flags;
> -	refs->write_options.block_size = 4096;
> +

Nit: do we need this newline?

>  	refs->write_options.hash_id = repo->hash_algo->format_id;
>  	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
>  	refs->write_options.disable_auto_compact =
>  		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
>
> +	git_config(reftable_be_config, &refs->write_options);
> +
> +	/*
> +	 * It is somewhat unfortunate that we have to mirror the default block
> +	 * size of the reftable library here. But given that the write options
> +	 * wouldn't be updated by the library here, and given that we require
> +	 * the proper block size to trim reflog message so that they fit, we
> +	 * must set up a proper value here.
> +	 */
> +	if (!refs->write_options.block_size)
> +		refs->write_options.block_size = 4096;
> +

Wouldn't it be better to import and use `reftable/constants.h` here?

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 10/11] reftable: make the compaction factor configurable
  2024-05-02  6:52 ` [PATCH 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
@ 2024-05-10  9:55   ` Karthik Nayak
  2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10  9:55 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1787 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> When auto-compacting, the reftable library packs references such that
> the sizes of the tables form a geometric sequence. The factor for this
> geometric sequence is hardcoded to 2 right now. We're about to expose
> this as a config option though, so let's expose the factor via write
> options.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  reftable/reftable-writer.h |  6 ++++++
>  reftable/stack.c           | 13 +++++++++----
>  reftable/stack.h           |  3 ++-
>  reftable/stack_test.c      |  4 ++--
>  4 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
> index 4cd8ebe6c7..155457b042 100644
> --- a/reftable/reftable-writer.h
> +++ b/reftable/reftable-writer.h
> @@ -49,6 +49,12 @@ struct reftable_write_options {
>
>  	/* boolean: Prevent auto-compaction of tables. */
>  	unsigned disable_auto_compact : 1;
> +
> +	/*
> +	 * Geometric sequence factor used by auto-compaction to decide which
> +	 * tables to compact. Defaults to 2 if unset.
> +	 */
> +	uint8_t auto_compaction_factor;
>  };
>
>  /* reftable_block_stats holds statistics for a single block type */
> diff --git a/reftable/stack.c b/reftable/stack.c
> index 7b4fff7c9e..6b0f8e13e7 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -1215,12 +1215,16 @@ static int segment_size(struct segment *s)
>  	return s->end - s->start;
>  }
>
> -struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
> +struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
> +					  uint8_t factor)
>  {
>  	struct segment seg = { 0 };
>  	uint64_t bytes;
>  	size_t i;
>
> +	if (!factor)
> +		factor = 2;
> +

This should probably go in reftable/constants.h

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 11/11] refs/reftable: allow configuring geometric factor
  2024-05-02  6:52 ` [PATCH 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
@ 2024-05-10  9:58   ` Karthik Nayak
  2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10  9:58 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 917 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Allow configuring the geometric factor used by the auto-compaction
> algorithm whenever a new table is appended to the stack of tables.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/config/reftable.txt | 10 ++++++++++
>  refs/reftable-backend.c           |  5 +++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
> index 6e4466f3c5..1c381dda04 100644
> --- a/Documentation/config/reftable.txt
> +++ b/Documentation/config/reftable.txt
> @@ -37,3 +37,13 @@ reftable.indexObjects::
>  	are a reverse mapping of object ID to the references pointing to them.
>  +
>  The default value is `true`.
> +
> +reftable.geometricFactor::
> +	Whenever the reftable backend appends a new table to the table it

This doesn't read right, did you mean 's/to the table/,' perhaps?

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (12 preceding siblings ...)
  2024-05-06 21:29 ` Justin Tobler
@ 2024-05-10 10:00 ` Karthik Nayak
  2024-05-10 10:14   ` Patrick Steinhardt
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
  15 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10 10:00 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]

Hello Patrick,

Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> the reftable format has some flexibility with regards to how exactly it
> writes the respective tables:
>
>   - The block size allows you to control how large each block is
>     supposed to be. The bigger the block, the more records you can fit
>     into it.
>
>   - Restart intervals control how often a restart point is written that
>     breaks prefix compression. The lower the interval, the less disk
>     space savings you get.
>
>   - Object indices can be enabled or disabled. These are optional and
>     Git doesn't use them right now, so disabling them may be a sensible
>     thing to do if you want to save some disk space.
>
>   - The geometric factor controls when we compact tables during auto
>     compaction.
>
> This patch series exposes all of these via a new set of configs so that
> they can be tweaked by the user as-needed. It's not expected that those
> are really of much importance for the "normal" user -- the defaults
> should be good enough. But for edge cases (huge repos with millions of
> refs) and for hosting providers these knobs can be helpful.
>
> This patch series applies on top of d4cc1ec35f (Start the 2.46 cycle,
> 2024-04-30).
>
> Patrick
>

I've gone through the commits and apart from a few small comments, I
think it looks great already.

Thanks,
Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-10  9:00   ` Karthik Nayak
@ 2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:13 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1730 bytes --]

On Fri, May 10, 2024 at 02:00:31AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Throughout the reftable library the `reftable_write_options` are
> > sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
> > consistently use `opts` to avoid confusion.
> >
> 
> I think one location was missed:
> 
> diff --git a/reftable/stack_test.c b/reftable/stack_test.c
> index 3316d55f19..40eb793b3c 100644
> --- a/reftable/stack_test.c
> +++ b/reftable/stack_test.c
> @@ -396,7 +396,7 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)
> 
>  static void test_reftable_stack_validate_refname(void)
>  {
> -	struct reftable_write_options cfg = { 0 };
> +	struct reftable_write_options opts = { 0 };
>  	struct reftable_stack *st = NULL;
>  	int err;
>  	char *dir = get_tmp_dir(__LINE__);
> @@ -410,7 +410,7 @@ static void test_reftable_stack_validate_refname(void)
>  	};
>  	char *additions[] = { "a", "a/b/c" };
> 
> -	err = reftable_new_stack(&st, dir, cfg);
> +	err = reftable_new_stack(&st, dir, opts);
>  	EXPECT_ERR(err);
> 
>  	err = reftable_stack_add(st, &write_test_ref, &ref);
> 
> Rest of the patch looks good. Thanks

This section has been removed in 485c63cf5c (reftable: remove name
checks, 2024-04-08), so it doesn't exist in `master` anymore. Which is
cheating a bit because the topic does not build on top of anything where
that commit would be reachable. But both rebasing the topic and
willfully creating a conflict would probably make Junio's life harder
now, so I'll just leave it at that and lie by omission.

But thanks anyway for reading this carefully and double checking the
results!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 06/11] refs/reftable: allow configuring block size
  2024-05-10  9:29   ` Karthik Nayak
@ 2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:13 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2571 bytes --]

On Fri, May 10, 2024 at 02:29:19AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> [snip]
> 
> > @@ -230,6 +231,23 @@ static int read_ref_without_reload(struct reftable_stack *stack,
> >  	return ret;
> >  }
> >
> > +static int reftable_be_config(const char *var, const char *value,
> > +			      const struct config_context *ctx,
> > +			      void *_opts)
> > +{
> > +	struct reftable_write_options *opts = _opts;
> > +
> > +	if (!strcmp(var, "reftable.blocksize")) {
> > +		unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
> > +		if (block_size > 16777215)
> > +			die("reftable block size cannot exceed 16MB");
> > +		opts->block_size = block_size;
> > +		return 0;
> 
> nit: unnecessary return

It's unnecessary indeed. I first wanted to defend this, but then I
noticed that I'm also being inconsistent here where the last branch
won't have `return 0;` at the end of this series.

Will remove.

> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static struct ref_store *reftable_be_init(struct repository *repo,
> >  					  const char *gitdir,
> >  					  unsigned int store_flags)
> > @@ -245,12 +263,24 @@ static struct ref_store *reftable_be_init(struct repository *repo,
> >  	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
> >  	strmap_init(&refs->worktree_stacks);
> >  	refs->store_flags = store_flags;
> > -	refs->write_options.block_size = 4096;
> > +
> 
> Nit: do we need this newline?

I think it's easier to read that way.

> >  	refs->write_options.hash_id = repo->hash_algo->format_id;
> >  	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
> >  	refs->write_options.disable_auto_compact =
> >  		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
> >
> > +	git_config(reftable_be_config, &refs->write_options);
> > +
> > +	/*
> > +	 * It is somewhat unfortunate that we have to mirror the default block
> > +	 * size of the reftable library here. But given that the write options
> > +	 * wouldn't be updated by the library here, and given that we require
> > +	 * the proper block size to trim reflog message so that they fit, we
> > +	 * must set up a proper value here.
> > +	 */
> > +	if (!refs->write_options.block_size)
> > +		refs->write_options.block_size = 4096;
> > +
> 
> Wouldn't it be better to import and use `reftable/constants.h` here?

Headers in the "reftable/" directory which do not have a "reftable-"
prefix are considered to be private. So those shouldn't be used.
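
The pattern in the quoted hunk (range-check the configured value, then
fall back to a mirrored default when nothing was configured) can be
sketched in isolation. This is only a rough illustration: every
`sketch_`/`SKETCH_` name is hypothetical and not part of Git or the
reftable library, and unlike `git_config_ulong()` the parser below is
assumed to accept plain decimal numbers only, without unit suffixes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the reftable library's definitions. */
#define SKETCH_DEFAULT_BLOCK_SIZE 4096u
#define SKETCH_MAX_BLOCK_SIZE 16777215ul /* the limit checked in the patch */

struct sketch_write_options {
	unsigned int block_size; /* 0 means "not configured" */
};

/*
 * Parse a configured block size. Returns 0 on success and -1 if the value
 * cannot be parsed as a decimal number or exceeds the limit. On failure the
 * options are left untouched.
 */
static int sketch_parse_block_size(const char *value,
				   struct sketch_write_options *opts)
{
	char *end;
	unsigned long parsed;

	assert(value && opts);

	errno = 0;
	parsed = strtoul(value, &end, 10);
	if (errno || end == value || *end != '\0')
		return -1;
	if (parsed > SKETCH_MAX_BLOCK_SIZE)
		return -1;

	opts->block_size = (unsigned int)parsed;
	return 0;
}

/* Fill in the default only when no configuration set a value. */
static void sketch_apply_defaults(struct sketch_write_options *opts)
{
	assert(opts);
	if (!opts->block_size)
		opts->block_size = SKETCH_DEFAULT_BLOCK_SIZE;
}
```

Applying the default after parsing keeps "unset" and "explicitly set"
distinguishable until the very end, which is why zero serves as the
sentinel here.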

Patrick


* Re: [PATCH 10/11] reftable: make the compaction factor configurable
  2024-05-10  9:55   ` Karthik Nayak
@ 2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:13 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 786 bytes --]

On Fri, May 10, 2024 at 04:55:10AM -0500, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> > diff --git a/reftable/stack.c b/reftable/stack.c
> > index 7b4fff7c9e..6b0f8e13e7 100644
> > --- a/reftable/stack.c
> > +++ b/reftable/stack.c
> > @@ -1215,12 +1215,16 @@ static int segment_size(struct segment *s)
> >  	return s->end - s->start;
> >  }
> >
> > -struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
> > +struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
> > +					  uint8_t factor)
> >  {
> >  	struct segment seg = { 0 };
> >  	uint64_t bytes;
> >  	size_t i;
> >
> > +	if (!factor)
> > +		factor = 2;
> > +
> 
> This should probably go in reftable/constants.h

Good idea, will do.
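
For illustration, here is a minimal sketch of what a zero factor
falling back to a default could look like when checking whether table
sizes form a geometric sequence. The names `sketch_is_geometric` and
`SKETCH_DEFAULT_FACTOR` are hypothetical, and the exact condition (each
table at least `factor` times as large as the next one) is an
assumption made for this example, not a copy of
`suggest_compaction_segment()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; mirrors the value 2 used as the fallback in the patch. */
#define SKETCH_DEFAULT_FACTOR 2

/*
 * Return 1 if every table is at least `factor` times as large as the table
 * that follows it, 0 otherwise. A factor of 0 selects the default.
 */
static int sketch_is_geometric(const uint64_t *sizes, size_t n, uint8_t factor)
{
	size_t i;

	assert(sizes || !n);

	if (!factor)
		factor = SKETCH_DEFAULT_FACTOR;

	for (i = 1; i < n; i++) {
		/* Dividing instead of multiplying avoids overflowing uint64_t. */
		if (sizes[i] > sizes[i - 1] / factor)
			return 0;
	}
	return 1;
}
```

With sizes 64, 32, 16, 8 the check passes for a factor of 0 (default)
or 2, and fails for a factor of 3.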

Patrick


* Re: [PATCH 11/11] refs/reftable: allow configuring geometric factor
  2024-05-10  9:58   ` Karthik Nayak
@ 2024-05-10 10:13     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:13 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]

On Fri, May 10, 2024 at 02:58:58AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Allow configuring the geometric factor used by the auto-compaction
> > algorithm whenever a new table is appended to the stack of tables.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  Documentation/config/reftable.txt | 10 ++++++++++
> >  refs/reftable-backend.c           |  5 +++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
> > index 6e4466f3c5..1c381dda04 100644
> > --- a/Documentation/config/reftable.txt
> > +++ b/Documentation/config/reftable.txt
> > @@ -37,3 +37,13 @@ reftable.indexObjects::
> >  	are a reverse mapping of object ID to the references pointing to them.
> >  +
> >  The default value is `true`.
> > +
> > +reftable.geometricFactor::
> > +	Whenever the reftable backend appends a new table to the table it
> 
> This doesn't read right, did you mean 's/to the table/,' perhaps?

This should rather be "to the stack,". Good catch.

Patrick


* Re: [PATCH 00/11] reftable: expose write options as config
  2024-05-10 10:00 ` Karthik Nayak
@ 2024-05-10 10:14   ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:14 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 232 bytes --]

On Fri, May 10, 2024 at 03:00:41AM -0700, Karthik Nayak wrote:
> I've gone through the commits and apart from a few small comments, I
> think it looks great already.
> 
> Thanks,
> Karthik

Thanks for your review!

Patrick


* [PATCH v2 00/11] reftable: expose write options as config
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (13 preceding siblings ...)
  2024-05-10 10:00 ` Karthik Nayak
@ 2024-05-10 10:29 ` Patrick Steinhardt
  2024-05-10 10:29   ` [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
                     ` (11 more replies)
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
  15 siblings, 12 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 7119 bytes --]

Hi,

this is the second version of my patch series that exposes various
options of the reftable writer via Git configuration.

Changes compared to v1:

  - Drop unneeded return statements.

  - Move default geometric factor into "constants.h".

  - Fix a typo in a commit message.

Thanks!

Patrick

Patrick Steinhardt (11):
  reftable: consistently refer to `reftable_write_options` as `opts`
  reftable: consistently pass write opts as value
  reftable/writer: drop static variable used to initialize strbuf
  reftable/writer: improve error when passed an invalid block size
  reftable/dump: support dumping a table's block structure
  refs/reftable: allow configuring block size
  reftable: use `uint16_t` to track restart interval
  refs/reftable: allow configuring restart interval
  refs/reftable: allow disabling writing the object index
  reftable: make the compaction factor configurable
  refs/reftable: allow configuring geometric factor

 Documentation/config.txt          |   2 +
 Documentation/config/reftable.txt |  49 +++++
 refs/reftable-backend.c           |  43 ++++-
 reftable/block.h                  |   2 +-
 reftable/constants.h              |   1 +
 reftable/dump.c                   |  12 +-
 reftable/merged_test.c            |   6 +-
 reftable/reader.c                 |  63 +++++++
 reftable/readwrite_test.c         |  26 +--
 reftable/refname_test.c           |   2 +-
 reftable/reftable-reader.h        |   2 +
 reftable/reftable-stack.h         |   2 +-
 reftable/reftable-writer.h        |  10 +-
 reftable/stack.c                  |  57 +++---
 reftable/stack.h                  |   5 +-
 reftable/stack_test.c             | 118 ++++++------
 reftable/writer.c                 |  20 +--
 t/t0613-reftable-write-options.sh | 286 ++++++++++++++++++++++++++++++
 18 files changed, 576 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/config/reftable.txt
 create mode 100755 t/t0613-reftable-write-options.sh

Range-diff against v1:
 1:  47cee6e25e =  1:  7efa566306 reftable: consistently refer to `reftable_write_options` as `opts`
 2:  d8a0764e87 =  2:  e6f8fc09c2 reftable: consistently pass write opts as value
 3:  c040f81fba =  3:  aa2903e3e5 reftable/writer: drop static variable used to initialize strbuf
 4:  ef79bb1b7b =  4:  5e7cbb7b19 reftable/writer: improve error when passed an invalid block size
 5:  4d4407d4a4 =  5:  ed1c150d90 reftable/dump: support dumping a table's block structure
 6:  b4e4db5735 !  6:  be5bdc6dc1 refs/reftable: allow configuring block size
    @@ refs/reftable-backend.c: static int read_ref_without_reload(struct reftable_stac
     +		if (block_size > 16777215)
     +			die("reftable block size cannot exceed 16MB");
     +		opts->block_size = block_size;
    -+		return 0;
     +	}
     +
     +	return 0;
 7:  79d9e07ca9 =  7:  05e8d1df2d reftable: use `uint16_t` to track restart interval
 8:  653ec4dfa5 !  8:  bc0bf65553 refs/reftable: allow configuring restart interval
    @@ Documentation/config/reftable.txt: readers during access.
     
      ## refs/reftable-backend.c ##
     @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
    + 		if (block_size > 16777215)
      			die("reftable block size cannot exceed 16MB");
      		opts->block_size = block_size;
    - 		return 0;
     +	} else if (!strcmp(var, "reftable.restartinterval")) {
     +		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
     +		if (restart_interval > UINT16_MAX)
     +			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
     +		opts->restart_interval = restart_interval;
    -+		return 0;
      	}
      
      	return 0;
 9:  6f2c481acc !  9:  6bc240fd0c refs/reftable: allow disabling writing the object index
    @@ Documentation/config/reftable.txt: A maximum of `65535` restart points per block
     
      ## refs/reftable-backend.c ##
     @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
    + 		if (restart_interval > UINT16_MAX)
      			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
      		opts->restart_interval = restart_interval;
    - 		return 0;
     +	} else if (!strcmp(var, "reftable.indexobjects")) {
     +		opts->skip_index_objects = !git_config_bool(var, value);
    -+		return 0;
      	}
      
      	return 0;
10:  30e2e33479 ! 10:  9d4c1f0340 reftable: make the compaction factor configurable
    @@ Commit message
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    + ## reftable/constants.h ##
    +@@ reftable/constants.h: license that can be found in the LICENSE file or at
    + 
    + #define MAX_RESTARTS ((1 << 16) - 1)
    + #define DEFAULT_BLOCK_SIZE 4096
    ++#define DEFAULT_GEOMETRIC_FACTOR 2
    + 
    + #endif
    +
      ## reftable/reftable-writer.h ##
     @@ reftable/reftable-writer.h: struct reftable_write_options {
      
    @@ reftable/reftable-writer.h: struct reftable_write_options {
      /* reftable_block_stats holds statistics for a single block type */
     
      ## reftable/stack.c ##
    +@@ reftable/stack.c: license that can be found in the LICENSE file or at
    + 
    + #include "../write-or-die.h"
    + #include "system.h"
    ++#include "constants.h"
    + #include "merged.h"
    + #include "reader.h"
    + #include "refname.h"
     @@ reftable/stack.c: static int segment_size(struct segment *s)
      	return s->end - s->start;
      }
    @@ reftable/stack.c: static int segment_size(struct segment *s)
      	size_t i;
      
     +	if (!factor)
    -+		factor = 2;
    ++		factor = DEFAULT_GEOMETRIC_FACTOR;
     +
      	/*
      	 * If there are no tables or only a single one then we don't have to
11:  861f2e72d9 ! 11:  e1282e53fb refs/reftable: allow configuring geometric factor
    @@ Documentation/config/reftable.txt: reftable.indexObjects::
      The default value is `true`.
     +
     +reftable.geometricFactor::
    -+	Whenever the reftable backend appends a new table to the table it
    ++	Whenever the reftable backend appends a new table to the stack, it
     +	performs auto compaction to ensure that there is only a handful of
     +	tables. The backend does this by ensuring that tables form a geometric
     +	sequence regarding the respective sizes of each table.
    @@ Documentation/config/reftable.txt: reftable.indexObjects::
     
      ## refs/reftable-backend.c ##
     @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
    + 		opts->restart_interval = restart_interval;
      	} else if (!strcmp(var, "reftable.indexobjects")) {
      		opts->skip_index_objects = !git_config_bool(var, value);
    - 		return 0;
     +	} else if (!strcmp(var, "reftable.geometricfactor")) {
     +		unsigned long factor = git_config_ulong(var, value, ctx->kvi);
     +		if (factor > UINT8_MAX)

base-commit: d4cc1ec35f3bcce816b69986ca41943f6ce21377
-- 
2.45.0



* [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 21:03     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
                     ` (10 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 21809 bytes --]

Throughout the reftable library the `reftable_write_options` are
sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
consistently use `opts` to avoid confusion.

While at it, touch up the coding style a bit by removing unneeded braces
around one-line statements and newlines between variable declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c           |   4 +-
 reftable/reftable-stack.h |   2 +-
 reftable/stack.c          |  43 +++++++-------
 reftable/stack.h          |   2 +-
 reftable/stack_test.c     | 114 +++++++++++++++++---------------------
 5 files changed, 75 insertions(+), 90 deletions(-)

diff --git a/reftable/dump.c b/reftable/dump.c
index 26e0393c7d..9c770a10cc 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -27,9 +27,9 @@ license that can be found in the LICENSE file or at
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
index 1b602dda58..9c8e4eef49 100644
--- a/reftable/reftable-stack.h
+++ b/reftable/reftable-stack.h
@@ -29,7 +29,7 @@ struct reftable_stack;
  *  stored in 'dir'. Typically, this should be .git/reftables.
  */
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config);
+		       struct reftable_write_options opts);
 
 /* returns the update_index at which a next table should be written. */
 uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
diff --git a/reftable/stack.c b/reftable/stack.c
index 80266bcbab..3979657793 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -56,15 +56,14 @@ static int reftable_fd_flush(void *arg)
 }
 
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config)
+		       struct reftable_write_options opts)
 {
 	struct reftable_stack *p = reftable_calloc(1, sizeof(*p));
 	struct strbuf list_file_name = STRBUF_INIT;
 	int err = 0;
 
-	if (config.hash_id == 0) {
-		config.hash_id = GIT_SHA1_FORMAT_ID;
-	}
+	if (opts.hash_id == 0)
+		opts.hash_id = GIT_SHA1_FORMAT_ID;
 
 	*dest = NULL;
 
@@ -75,7 +74,7 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir,
 	p->list_file = strbuf_detach(&list_file_name, NULL);
 	p->list_fd = -1;
 	p->reftable_dir = xstrdup(dir);
-	p->config = config;
+	p->opts = opts;
 
 	err = reftable_stack_reload_maybe_reuse(p, 1);
 	if (err < 0) {
@@ -257,7 +256,7 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
 
 	/* success! */
 	err = reftable_new_merged_table(&new_merged, new_tables,
-					new_readers_len, st->config.hash_id);
+					new_readers_len, st->opts.hash_id);
 	if (err < 0)
 		goto done;
 
@@ -580,8 +579,8 @@ static int reftable_stack_init_addition(struct reftable_addition *add,
 		}
 		goto done;
 	}
-	if (st->config.default_permissions) {
-		if (chmod(add->lock_file->filename.buf, st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions) {
+		if (chmod(add->lock_file->filename.buf, st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -680,7 +679,7 @@ int reftable_addition_commit(struct reftable_addition *add)
 	if (err)
 		goto done;
 
-	if (!add->stack->config.disable_auto_compact) {
+	if (!add->stack->opts.disable_auto_compact) {
 		/*
 		 * Auto-compact the stack to keep the number of tables in
 		 * control. It is possible that a concurrent writer is already
@@ -758,9 +757,9 @@ int reftable_addition_add(struct reftable_addition *add,
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
-	if (add->stack->config.default_permissions) {
+	if (add->stack->opts.default_permissions) {
 		if (chmod(get_tempfile_path(tab_file),
-			  add->stack->config.default_permissions)) {
+			  add->stack->opts.default_permissions)) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -768,7 +767,7 @@ int reftable_addition_add(struct reftable_addition *add,
 	tab_fd = get_tempfile_fd(tab_file);
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
-				 &add->stack->config);
+				 &add->stack->opts);
 	err = write_table(wr, arg);
 	if (err < 0)
 		goto done;
@@ -855,14 +854,14 @@ static int stack_compact_locked(struct reftable_stack *st,
 	}
 	tab_fd = get_tempfile_fd(tab_file);
 
-	if (st->config.default_permissions &&
-	    chmod(get_tempfile_path(tab_file), st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions &&
+	    chmod(get_tempfile_path(tab_file), st->opts.default_permissions) < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
-				 &tab_fd, &st->config);
+				 &tab_fd, &st->opts);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
@@ -910,7 +909,7 @@ static int stack_write_compact(struct reftable_stack *st,
 				   st->readers[last]->max_update_index);
 
 	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
-					st->config.hash_id);
+					st->opts.hash_id);
 	if (err < 0) {
 		reftable_free(subtabs);
 		goto done;
@@ -1100,9 +1099,9 @@ static int stack_compact_range(struct reftable_stack *st,
 		goto done;
 	}
 
-	if (st->config.default_permissions) {
+	if (st->opts.default_permissions) {
 		if (chmod(get_lock_file_path(&tables_list_lock),
-			  st->config.default_permissions) < 0) {
+			  st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -1292,7 +1291,7 @@ static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
 {
 	uint64_t *sizes =
 		reftable_calloc(st->merged->stack_len, sizeof(*sizes));
-	int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
+	int version = (st->opts.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
 	int overhead = header_size(version) - 1;
 	int i = 0;
 	for (i = 0; i < st->merged->stack_len; i++) {
@@ -1368,7 +1367,7 @@ static int stack_check_addition(struct reftable_stack *st,
 	int len = 0;
 	int i = 0;
 
-	if (st->config.skip_name_check)
+	if (st->opts.skip_name_check)
 		return 0;
 
 	err = reftable_block_source_from_file(&src, new_tab_name);
@@ -1500,11 +1499,11 @@ int reftable_stack_clean(struct reftable_stack *st)
 int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { .hash_id = hash_id };
+	struct reftable_write_options opts = { .hash_id = hash_id };
 	struct reftable_merged_table *merged = NULL;
 	struct reftable_table table = { NULL };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/stack.h b/reftable/stack.h
index d43efa4760..97d7ebc043 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -20,7 +20,7 @@ struct reftable_stack {
 
 	char *reftable_dir;
 
-	struct reftable_write_options config;
+	struct reftable_write_options opts;
 
 	struct reftable_reader **readers;
 	size_t readers_len;
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 1df3ffce52..3316d55f19 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -150,7 +150,7 @@ static void test_reftable_stack_add_one(void)
 	char *dir = get_tmp_dir(__LINE__);
 	struct strbuf scratch = STRBUF_INIT;
 	int mask = umask(002);
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.default_permissions = 0660,
 	};
 	struct reftable_stack *st = NULL;
@@ -163,7 +163,7 @@ static void test_reftable_stack_add_one(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 	struct stat stat_result = { 0 };
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
@@ -186,7 +186,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, "/tables.list");
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&scratch);
 	strbuf_addstr(&scratch, dir);
@@ -195,7 +195,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, st->readers[0]->name);
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -209,7 +209,7 @@ static void test_reftable_stack_add_one(void)
 
 static void test_reftable_stack_uptodate(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL;
 	struct reftable_stack *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
@@ -232,10 +232,10 @@ static void test_reftable_stack_uptodate(void)
 	/* simulate multi-process access to the same stack
 	   by creating two stacks for the same directory.
 	 */
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st1, &write_test_ref, &ref1);
@@ -257,8 +257,7 @@ static void test_reftable_stack_uptodate(void)
 static void test_reftable_stack_transaction_api(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_addition *add = NULL;
@@ -271,8 +270,7 @@ static void test_reftable_stack_transaction_api(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	reftable_addition_destroy(add);
@@ -301,12 +299,12 @@ static void test_reftable_stack_transaction_api(void)
 static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_addition *add = NULL;
 	struct reftable_stack *st = NULL;
 	int i, n = 20, err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -325,7 +323,7 @@ static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		err = reftable_stack_new_addition(&add, st);
 		EXPECT_ERR(err);
@@ -361,13 +359,13 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)
 		.value_type = REFTABLE_REF_VAL1,
 		.value.val1 = {0x01},
 	};
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_stack *st;
 	struct strbuf table_path = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, write_test_ref, &ref);
@@ -442,8 +440,7 @@ static int write_error(struct reftable_writer *wr, void *arg)
 static void test_reftable_stack_update_index_check(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record ref1 = {
@@ -459,7 +456,7 @@ static void test_reftable_stack_update_index_check(void)
 		.value.symref = "master",
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref1);
@@ -474,12 +471,11 @@ static void test_reftable_stack_update_index_check(void)
 static void test_reftable_stack_lock_failure(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err, i;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
 		err = reftable_stack_add(st, &write_error, &i);
@@ -494,7 +490,7 @@ static void test_reftable_stack_add(void)
 {
 	int i = 0;
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.exact_log_message = 1,
 		.default_permissions = 0660,
 		.disable_auto_compact = 1,
@@ -507,7 +503,7 @@ static void test_reftable_stack_add(void)
 	struct stat stat_result;
 	int N = ARRAY_SIZE(refs);
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -566,7 +562,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, "/tables.list");
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&path);
 	strbuf_addstr(&path, dir);
@@ -575,7 +571,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, st->readers[0]->name);
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -593,7 +589,7 @@ static void test_reftable_stack_add(void)
 static void test_reftable_stack_log_normalize(void)
 {
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		0,
 	};
 	struct reftable_stack *st = NULL;
@@ -617,7 +613,7 @@ static void test_reftable_stack_log_normalize(void)
 		.update_index = 1,
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	input.value.update.message = "one\ntwo";
@@ -650,8 +646,7 @@ static void test_reftable_stack_tombstone(void)
 {
 	int i = 0;
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record refs[2] = { { NULL } };
@@ -660,8 +655,7 @@ static void test_reftable_stack_tombstone(void)
 	struct reftable_ref_record dest = { NULL };
 	struct reftable_log_record log_dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	/* even entries add the refs, odd entries delete them. */
@@ -729,8 +723,7 @@ static void test_reftable_stack_tombstone(void)
 static void test_reftable_stack_hash_id(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 
@@ -740,24 +733,24 @@ static void test_reftable_stack_hash_id(void)
 		.value.symref = "target",
 		.update_index = 1,
 	};
-	struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_FORMAT_ID };
+	struct reftable_write_options opts32 = { .hash_id = GIT_SHA256_FORMAT_ID };
 	struct reftable_stack *st32 = NULL;
-	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_write_options opts_default = { 0 };
 	struct reftable_stack *st_default = NULL;
 	struct reftable_ref_record dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
 	EXPECT_ERR(err);
 
 	/* can't read it with the wrong hash ID. */
-	err = reftable_new_stack(&st32, dir, cfg32);
+	err = reftable_new_stack(&st32, dir, opts32);
 	EXPECT(err == REFTABLE_FORMAT_ERROR);
 
-	/* check that we can read it back with default config too. */
-	err = reftable_new_stack(&st_default, dir, cfg_default);
+	/* check that we can read it back with default opts too. */
+	err = reftable_new_stack(&st_default, dir, opts_default);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_read_ref(st_default, "master", &dest);
@@ -790,8 +783,7 @@ static void test_suggest_compaction_segment_nothing(void)
 static void test_reflog_expire(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct reftable_log_record logs[20] = { { NULL } };
 	int N = ARRAY_SIZE(logs) - 1;
@@ -802,8 +794,7 @@ static void test_reflog_expire(void)
 	};
 	struct reftable_log_record log = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 1; i <= N; i++) {
@@ -866,21 +857,19 @@ static int write_nothing(struct reftable_writer *wr, void *arg)
 
 static void test_empty_add(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	char *dir = get_tmp_dir(__LINE__);
-
 	struct reftable_stack *st2 = NULL;
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_nothing, NULL);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 	clear_dir(dir);
 	reftable_stack_destroy(st);
@@ -899,16 +888,15 @@ static int fastlog2(uint64_t sz)
 
 static void test_reftable_stack_auto_compaction(void)
 {
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.disable_auto_compact = 1,
 	};
 	struct reftable_stack *st = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 100;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -938,13 +926,13 @@ static void test_reftable_stack_auto_compaction(void)
 
 static void test_reftable_stack_add_performs_auto_compaction(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct strbuf refname = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err, i, n = 20;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -959,7 +947,7 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		strbuf_reset(&refname);
 		strbuf_addf(&refname, "branch-%04d", i);
@@ -986,14 +974,13 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 
 static void test_reftable_stack_compaction_concurrent(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1010,7 +997,7 @@ static void test_reftable_stack_compaction_concurrent(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1036,14 +1023,13 @@ static void unclean_stack_close(struct reftable_stack *st)
 
 static void test_reftable_stack_compaction_concurrent_clean(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1060,7 +1046,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1069,7 +1055,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 	unclean_stack_close(st1);
 	unclean_stack_close(st2);
 
-	err = reftable_new_stack(&st3, dir, cfg);
+	err = reftable_new_stack(&st3, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_clean(st3);
-- 
2.45.0



* [PATCH v2 02/11] reftable: consistently pass write opts as value
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
  2024-05-10 10:29   ` [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 21:11     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 10489 bytes --]

We sometimes pass the reftable write options as value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Refactor the code to consistently pass the options as a value. This
keeps them local to the subsystem they are being passed into and avoids
weirdness like this.
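
To illustrate the difference, here is a minimal standalone sketch. It is
not code from this series: `struct write_options`, `set_defaults()` and
both `writer_block_size_*()` helpers are hypothetical stand-ins, and the
4096 default is simply the value used elsewhere in this thread. With the
pointer variant the defaults leak back into the caller's struct, with
the value variant they only touch a private copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct reftable_write_options`. */
struct write_options {
	uint32_t block_size; /* 0 means "use the default" */
};

/* Hypothetical helper: fill in unset fields with their defaults. */
static void set_defaults(struct write_options *opts)
{
	if (!opts->block_size)
		opts->block_size = 4096;
}

/* Pointer variant: the defaults are written into the caller's struct. */
static uint32_t writer_block_size_by_pointer(struct write_options *opts)
{
	set_defaults(opts);
	return opts->block_size;
}

/* Value variant: the defaults are applied to a private copy only. */
static uint32_t writer_block_size_by_value(struct write_options opts)
{
	set_defaults(&opts);
	return opts.block_size;
}
```

A caller that passes a zero-initialized struct to the value variant
still sees `block_size == 0` afterwards, while the pointer variant
leaves it at 4096.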

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/merged_test.c     |  6 +++---
 reftable/readwrite_test.c  | 26 +++++++++++++-------------
 reftable/refname_test.c    |  2 +-
 reftable/reftable-writer.h |  2 +-
 reftable/stack.c           |  4 ++--
 reftable/writer.c          | 12 +++++++-----
 6 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/reftable/merged_test.c b/reftable/merged_test.c
index 530fc82d1c..4ac81de8d4 100644
--- a/reftable/merged_test.c
+++ b/reftable/merged_test.c
@@ -42,7 +42,7 @@ static void write_test_table(struct strbuf *buf,
 		}
 	}
 
-	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	reftable_writer_set_limits(w, min, max);
 
 	for (i = 0; i < n; i++) {
@@ -70,7 +70,7 @@ static void write_test_log_table(struct strbuf *buf,
 		.exact_log_message = 1,
 	};
 	struct reftable_writer *w = NULL;
-	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	reftable_writer_set_limits(w, update_index, update_index);
 
 	for (i = 0; i < n; i++) {
@@ -403,7 +403,7 @@ static void test_default_write_opts(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	struct reftable_ref_record rec = {
 		.refname = "master",
diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index a6dbd214c5..27631a041b 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -51,7 +51,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 		.hash_id = hash_id,
 	};
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
 	struct reftable_ref_record ref = { NULL };
 	int i = 0, n;
 	struct reftable_log_record log = { NULL };
@@ -129,7 +129,7 @@ static void test_log_buffer_size(void)
 					   .message = "commit: 9\n",
 				   } } };
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	/* This tests buffer extension for log compression. Must use a random
 	   hash, to ensure that the compressed part is larger than the original.
@@ -172,7 +172,7 @@ static void test_log_overflow(void)
 		},
 	};
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	memset(msg, 'x', sizeof(msg) - 1);
 	reftable_writer_set_limits(w, update_index, update_index);
@@ -199,7 +199,7 @@ static void test_log_write_read(void)
 	struct reftable_block_source source = { NULL };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	const struct reftable_stats *stats = NULL;
 	reftable_writer_set_limits(w, 0, N);
 	for (i = 0; i < N; i++) {
@@ -288,7 +288,7 @@ static void test_log_zlib_corruption(void)
 	struct reftable_block_source source = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	const struct reftable_stats *stats = NULL;
 	char message[100] = { 0 };
 	int err, i, n;
@@ -526,7 +526,7 @@ static void test_table_refs_for(int indexed)
 
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 
 	struct reftable_iterator it = { NULL };
 	int j;
@@ -619,7 +619,7 @@ static void test_write_empty_table(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_block_source source = { NULL };
 	struct reftable_reader *rd = NULL;
 	struct reftable_ref_record rec = { NULL };
@@ -657,7 +657,7 @@ static void test_write_object_id_min_length(void)
 	};
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
@@ -692,7 +692,7 @@ static void test_write_object_id_length(void)
 	};
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.update_index = 1,
 		.value_type = REFTABLE_REF_VAL1,
@@ -726,7 +726,7 @@ static void test_write_empty_key(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record ref = {
 		.refname = "",
 		.update_index = 1,
@@ -749,7 +749,7 @@ static void test_write_key_order(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record refs[2] = {
 		{
 			.refname = "b",
@@ -792,7 +792,7 @@ static void test_write_multiple_indices(void)
 	struct reftable_reader *reader;
 	int err, i;
 
-	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
 	reftable_writer_set_limits(writer, 1, 1);
 	for (i = 0; i < 100; i++) {
 		struct reftable_ref_record ref = {
@@ -869,7 +869,7 @@ static void test_write_multi_level_index(void)
 	struct reftable_reader *reader;
 	int err;
 
-	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
 	reftable_writer_set_limits(writer, 1, 1);
 	for (size_t i = 0; i < 200; i++) {
 		struct reftable_ref_record ref = {
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
index b9cc62554e..3468253be7 100644
--- a/reftable/refname_test.c
+++ b/reftable/refname_test.c
@@ -30,7 +30,7 @@ static void test_conflict(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
 	struct reftable_ref_record rec = {
 		.refname = "a/b",
 		.value_type = REFTABLE_REF_SYMREF,
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 155bf0bbe2..44cb986465 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -92,7 +92,7 @@ struct reftable_stats {
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts);
+		    void *writer_arg, struct reftable_write_options opts);
 
 /* Set the range of update indices for the records we will add. When writing a
    table into a stack, the min should be at least
diff --git a/reftable/stack.c b/reftable/stack.c
index 3979657793..7b4fff7c9e 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -767,7 +767,7 @@ int reftable_addition_add(struct reftable_addition *add,
 	tab_fd = get_tempfile_fd(tab_file);
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
-				 &add->stack->opts);
+				 add->stack->opts);
 	err = write_table(wr, arg);
 	if (err < 0)
 		goto done;
@@ -861,7 +861,7 @@ static int stack_compact_locked(struct reftable_stack *st,
 	}
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
-				 &tab_fd, &st->opts);
+				 &tab_fd, st->opts);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
diff --git a/reftable/writer.c b/reftable/writer.c
index 1d9ff0fbfa..ad2f2e6c65 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -122,20 +122,22 @@ static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts)
+		    void *writer_arg, struct reftable_write_options opts)
 {
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
 	strbuf_init(&wp->block_writer_data.last_key, 0);
-	options_set_defaults(opts);
-	if (opts->block_size >= (1 << 24)) {
+
+	options_set_defaults(&opts);
+	if (opts.block_size >= (1 << 24)) {
 		/* TODO - error return? */
 		abort();
 	}
+
 	wp->last_key = reftable_empty_strbuf;
-	REFTABLE_CALLOC_ARRAY(wp->block, opts->block_size);
+	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-	wp->opts = *opts;
+	wp->opts = opts;
 	wp->flush = flush_func;
 	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
 
-- 
2.45.0



* [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
  2024-05-10 10:29   ` [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
  2024-05-10 10:29   ` [PATCH v2 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 21:19     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
                     ` (8 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1471 bytes --]

We have a static variable in the reftable writer code that is merely
used to initialize the `last_key` of the writer. Convert the code to
instead use `strbuf_init()` and drop the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index ad2f2e6c65..7df6e53699 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -117,15 +117,12 @@ static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
 	w->block_writer->restart_interval = w->opts.restart_interval;
 }
 
-static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
-
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
 		    void *writer_arg, struct reftable_write_options opts)
 {
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
-	strbuf_init(&wp->block_writer_data.last_key, 0);
 
 	options_set_defaults(&opts);
 	if (opts.block_size >= (1 << 24)) {
@@ -133,7 +130,8 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		abort();
 	}
 
-	wp->last_key = reftable_empty_strbuf;
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	strbuf_init(&wp->last_key, 0);
 	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-- 
2.45.0



* [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 21:25     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1010 bytes --]

The reftable format only supports block sizes up to 16MB. When the
writer is being passed a value bigger than that it simply calls
abort(3P), which isn't all that helpful due to the lack of a proper
error message.

Improve this by calling `BUG()` instead.
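
For context, a rough sketch of where the limit comes from, assuming (as
I read the format documentation) that block sizes and lengths are stored
in 24-bit fields. The names `MAX_BLOCK_SIZE`, `block_size_is_valid()`,
`put_be24()` and `get_be24()` are made up for this example and are not
the library's API.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value that fits into three bytes: 16777215. */
#define MAX_BLOCK_SIZE ((1u << 24) - 1)

/* Mirrors the `block_size >= (1 << 24)` check, but as a predicate. */
static int block_size_is_valid(uint32_t block_size)
{
	return block_size <= MAX_BLOCK_SIZE;
}

/* Store a value as three big-endian bytes. */
static void put_be24(uint8_t out[3], uint32_t v)
{
	assert(v <= MAX_BLOCK_SIZE);
	out[0] = (v >> 16) & 0xff;
	out[1] = (v >> 8) & 0xff;
	out[2] = v & 0xff;
}

/* Read three big-endian bytes back into an integer. */
static uint32_t get_be24(const uint8_t in[3])
{
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

Anything at or above `1 << 24` cannot be represented in such a field,
which is why the writer has to reject it up front.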

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index 7df6e53699..374b7d15ed 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -125,10 +125,8 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
 
 	options_set_defaults(&opts);
-	if (opts.block_size >= (1 << 24)) {
-		/* TODO - error return? */
-		abort();
-	}
+	if (opts.block_size >= (1 << 24))
+		BUG("configured block size exceeds 16MB");
 
 	strbuf_init(&wp->block_writer_data.last_key, 0);
 	strbuf_init(&wp->last_key, 0);
-- 
2.45.0



* [PATCH v2 05/11] reftable/dump: support dumping a table's block structure
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-13 22:42     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 6732 bytes --]

We're about to introduce new configs that will allow users to have more
control over how exactly reftables are written. To verify that these
configs are effective we will need to take a peek into the actual blocks
written by the reftable backend.

Introduce a new mode to the dumping logic that prints out the block
structure. This logic can be invoked via `test-tool dump-reftable -b`.
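
The output layout of the block dump can be sketched in isolation as
follows. This is only an illustration of the format the tests compare
against; `struct block_info` and `format_section()` are hypothetical and
write into a caller-provided buffer instead of reading a real table.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-block summary: total length and restart count. */
struct block_info {
	unsigned length;
	unsigned restarts;
};

/*
 * Format one section ("ref", "obj" or "log") into `buf`. Returns the
 * number of bytes written, or -1 if the buffer is too small.
 */
static int format_section(char *buf, size_t size, const char *name,
			  const struct block_info *blocks, size_t n)
{
	size_t off, i;
	int len;

	len = snprintf(buf, size, "%s:\n", name);
	if (len < 0 || (size_t)len >= size)
		return -1;
	off = (size_t)len;

	for (i = 0; i < n; i++) {
		len = snprintf(buf + off, size - off,
			       "  - length: %u\n    restarts: %u\n",
			       blocks[i].length, blocks[i].restarts);
		if (len < 0 || (size_t)len >= size - off)
			return -1;
		off += (size_t)len;
	}

	return (int)off;
}
```

Each block becomes one list entry, so a section with two blocks yields
two `- length:` / `restarts:` pairs under the section name.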

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c                   |   8 ++-
 reftable/reader.c                 |  63 ++++++++++++++++++
 reftable/reftable-reader.h        |   2 +
 t/t0613-reftable-write-options.sh | 102 ++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100755 t/t0613-reftable-write-options.sh

diff --git a/reftable/dump.c b/reftable/dump.c
index 9c770a10cc..24476cc2a9 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -48,6 +48,7 @@ static void print_help(void)
 	printf("usage: dump [-cst] arg\n\n"
 	       "options: \n"
 	       "  -c compact\n"
+	       "  -b dump blocks\n"
 	       "  -t dump table\n"
 	       "  -s dump stack\n"
 	       "  -6 sha256 hash format\n"
@@ -58,6 +59,7 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
+	int opt_dump_blocks = 0;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
@@ -67,6 +69,8 @@ int reftable_dump_main(int argc, char *const *argv)
 	for (; argc > 1; argv++, argc--)
 		if (*argv[1] != '-')
 			break;
+		else if (!strcmp("-b", argv[1]))
+			opt_dump_blocks = 1;
 		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
 		else if (!strcmp("-6", argv[1]))
@@ -88,7 +92,9 @@ int reftable_dump_main(int argc, char *const *argv)
 
 	arg = argv[1];
 
-	if (opt_dump_table) {
+	if (opt_dump_blocks) {
+		err = reftable_reader_print_blocks(arg);
+	} else if (opt_dump_table) {
 		err = reftable_reader_print_file(arg);
 	} else if (opt_dump_stack) {
 		err = reftable_stack_print_directory(arg, opt_hash_id);
diff --git a/reftable/reader.c b/reftable/reader.c
index 481dff10d4..f23c8523db 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -856,3 +856,66 @@ int reftable_reader_print_file(const char *tablename)
 	reftable_reader_free(r);
 	return err;
 }
+
+int reftable_reader_print_blocks(const char *tablename)
+{
+	struct {
+		const char *name;
+		int type;
+	} sections[] = {
+		{
+			.name = "ref",
+			.type = BLOCK_TYPE_REF,
+		},
+		{
+			.name = "obj",
+			.type = BLOCK_TYPE_OBJ,
+		},
+		{
+			.name = "log",
+			.type = BLOCK_TYPE_LOG,
+		},
+	};
+	struct reftable_block_source src = { 0 };
+	struct table_iter ti = TABLE_ITER_INIT;
+	struct reftable_reader *r = NULL;
+	size_t i;
+	int err;
+
+	err = reftable_block_source_from_file(&src, tablename);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		goto done;
+
+	printf("header:\n");
+	printf("  block_size: %d\n", r->block_size);
+
+	for (i = 0; i < ARRAY_SIZE(sections); i++) {
+		err = reader_start(r, &ti, sections[i].type, 0);
+		if (err < 0)
+			goto done;
+		if (err > 0)
+			continue;
+
+		printf("%s:\n", sections[i].name);
+
+		while (1) {
+			printf("  - length: %u\n", ti.br.block_len);
+			printf("    restarts: %u\n", ti.br.restart_count);
+
+			err = table_iter_next_block(&ti);
+			if (err < 0)
+				goto done;
+			if (err > 0)
+				break;
+		}
+	}
+
+done:
+	reftable_reader_free(r);
+	table_iter_close(&ti);
+	return err;
+}
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
index 4a4bc2fdf8..4a04857773 100644
--- a/reftable/reftable-reader.h
+++ b/reftable/reftable-reader.h
@@ -97,5 +97,7 @@ void reftable_table_from_reader(struct reftable_table *tab,
 
 /* print table onto stdout for debugging. */
 int reftable_reader_print_file(const char *tablename);
+/* print blocks onto stdout for debugging. */
+int reftable_reader_print_blocks(const char *tablename);
 
 #endif
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
new file mode 100755
index 0000000000..462980c37c
--- /dev/null
+++ b/t/t0613-reftable-write-options.sh
@@ -0,0 +1,102 @@
+#!/bin/sh
+
+test_description='reftable write options'
+
+GIT_TEST_DEFAULT_REF_FORMAT=reftable
+export GIT_TEST_DEFAULT_REF_FORMAT
+# Disable auto-compaction for all tests as we explicitly control repacking of
+# refs.
+GIT_TEST_REFTABLE_AUTOCOMPACTION=false
+export GIT_TEST_REFTABLE_AUTOCOMPACTION
+# Block sizes depend on the hash function, so we force SHA1 here.
+GIT_TEST_DEFAULT_HASH=sha1
+export GIT_TEST_DEFAULT_HASH
+# Block sizes also depend on the actual refs we write, so we force "master" to
+# be the default initial branch name.
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'default write options' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		log:
+		  - length: 262
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'disabled reflog writes no log blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'many refs results in multiple blocks' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 200)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 4049
+		    restarts: 11
+		  - length: 1136
+		    restarts: 3
+		log:
+		  - length: 4041
+		    restarts: 4
+		  - length: 4015
+		    restarts: 3
+		  - length: 4014
+		    restarts: 3
+		  - length: 4012
+		    restarts: 3
+		  - length: 3289
+		    restarts: 3
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_done
-- 
2.45.0



* [PATCH v2 06/11] refs/reftable: allow configuring block size
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 10:29   ` [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 6231 bytes --]

Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.
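
The intended semantics can be sketched roughly like this: a value of 0
selects the default of 4096 bytes, and anything above 16777215 bytes is
rejected. The helper `resolve_block_size()` is hypothetical and uses
plain strtoul(3), so unlike Git's config parser it does not understand
unit suffixes such as "k" or "m".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_BLOCK_SIZE 4096u
#define MAX_BLOCK_SIZE 16777215u

/*
 * Parse a decimal block size. Returns 0 and stores the effective size
 * in `*out` on success, or -1 if the value is malformed or too large.
 */
static int resolve_block_size(const char *value, uint32_t *out)
{
	char *end;
	unsigned long parsed;

	if (!value || !*value)
		return -1;
	/* strtoul(3) would silently negate the value, so reject it. */
	if (*value == '-')
		return -1;

	errno = 0;
	parsed = strtoul(value, &end, 10);
	if (errno || end == value || *end != '\0')
		return -1;
	if (parsed > MAX_BLOCK_SIZE)
		return -1;

	/* A value of 0 means "use the default". */
	*out = parsed ? (uint32_t)parsed : DEFAULT_BLOCK_SIZE;
	return 0;
}
```

With this sketch, "0" resolves to 4096, "100" is taken as-is, and
"16777216" is refused.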

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config.txt          |  2 +
 Documentation/config/reftable.txt | 14 ++++++
 refs/reftable-backend.c           | 31 ++++++++++++-
 t/t0613-reftable-write-options.sh | 72 +++++++++++++++++++++++++++++++
 4 files changed, 118 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/reftable.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 70b448b132..fa1469e5e7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -497,6 +497,8 @@ include::config/rebase.txt[]
 
 include::config/receive.txt[]
 
+include::config/reftable.txt[]
+
 include::config/remote.txt[]
 
 include::config/remotes.txt[]
diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
new file mode 100644
index 0000000000..fa7c4be014
--- /dev/null
+++ b/Documentation/config/reftable.txt
@@ -0,0 +1,14 @@
+reftable.blockSize::
+	The size in bytes used by the reftable backend when writing blocks.
+	The block size is determined by the writer, and does not have to be a
+	power of 2. The block size must be larger than the longest reference
+	name or log entry used in the repository, as references cannot span
+	blocks.
++
+Powers of two that are friendly to the virtual memory system or
+filesystem (such as 4kB or 8kB) are recommended. Larger sizes (64kB) can
+yield better compression, with a possible increased cost incurred by
+readers during access.
++
+The largest block size is `16777215` bytes (15.99 MiB). The default value is
+`4096` bytes (4kB). A value of `0` will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1cda48c504..bd9999cefc 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
 #include "../git-compat-util.h"
 #include "../abspath.h"
 #include "../chdir-notify.h"
+#include "../config.h"
 #include "../environment.h"
 #include "../gettext.h"
 #include "../hash.h"
@@ -230,6 +231,22 @@ static int read_ref_without_reload(struct reftable_stack *stack,
 	return ret;
 }
 
+static int reftable_be_config(const char *var, const char *value,
+			      const struct config_context *ctx,
+			      void *_opts)
+{
+	struct reftable_write_options *opts = _opts;
+
+	if (!strcmp(var, "reftable.blocksize")) {
+		unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
+		if (block_size > 16777215)
+			die("reftable block size cannot exceed 16MB");
+		opts->block_size = block_size;
+	}
+
+	return 0;
+}
+
 static struct ref_store *reftable_be_init(struct repository *repo,
 					  const char *gitdir,
 					  unsigned int store_flags)
@@ -245,12 +262,24 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
 	strmap_init(&refs->worktree_stacks);
 	refs->store_flags = store_flags;
-	refs->write_options.block_size = 4096;
+
 	refs->write_options.hash_id = repo->hash_algo->format_id;
 	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
 	refs->write_options.disable_auto_compact =
 		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
 
+	git_config(reftable_be_config, &refs->write_options);
+
+	/*
+	 * It is somewhat unfortunate that we have to mirror the default block
+	 * size of the reftable library here. But given that the write options
+	 * wouldn't be updated by the library here, and given that we require
+	 * the proper block size to trim reflog message so that they fit, we
+	 * must set up a proper value here.
+	 */
+	if (!refs->write_options.block_size)
+		refs->write_options.block_size = 4096;
+
 	/*
 	 * Set up the main reftable stack that is hosted in GIT_COMMON_DIR.
 	 * This stack contains both the shared and the main worktree refs.
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 462980c37c..8bdbc6ec70 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -99,4 +99,76 @@ test_expect_success 'many refs results in multiple blocks' '
 	)
 '
 
+test_expect_success 'tiny block size leads to error' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		error: unable to compact stack: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=50 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'small block size leads to multiple ref blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 74
+		    restarts: 1
+		  - length: 38
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'small block size fails with large reflog message' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		perl -e "print \"a\" x 500" >logmsg &&
+		cat >expect <<-EOF &&
+		fatal: update_ref failed for ref ${SQ}refs/heads/logme${SQ}: reftable: transaction failure: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=100 \
+			update-ref -m "$(cat logmsg)" refs/heads/logme HEAD 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'block size exceeding maximum supported size' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 16MB
+		EOF
+		test_must_fail git -c reftable.blockSize=16777216 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.0



* [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-13 22:42     ` Junio C Hamano
  2024-05-10 10:29   ` [PATCH v2 08/11] refs/reftable: allow configuring " Patrick Steinhardt
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1294 bytes --]

The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use a `uint16_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.h           | 2 +-
 reftable/reftable-writer.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/reftable/block.h b/reftable/block.h
index ea4384a7e2..cd5577105d 100644
--- a/reftable/block.h
+++ b/reftable/block.h
@@ -25,7 +25,7 @@ struct block_writer {
 	uint32_t header_off;
 
 	/* How often to restart keys. */
-	int restart_interval;
+	uint16_t restart_interval;
 	int hash_size;
 
 	/* Offset of next uint8_t to write. */
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 44cb986465..4cd8ebe6c7 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -28,7 +28,7 @@ struct reftable_write_options {
 	unsigned skip_index_objects : 1;
 
 	/* how often to write complete keys in each block. */
-	int restart_interval;
+	uint16_t restart_interval;
 
 	/* 4-byte identifier ("sha1", "s256") of the hash.
 	 * Defaults to SHA1 if unset
-- 
2.45.0



* [PATCH v2 08/11] refs/reftable: allow configuring restart interval
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
@ 2024-05-10 10:29   ` Patrick Steinhardt
  2024-05-10 21:57     ` Junio C Hamano
  2024-05-10 10:30   ` [PATCH v2 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
                     ` (3 subsequent siblings)
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:29 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 3752 bytes --]

Add a new option `reftable.restartInterval` that allows the user to
control the restart interval when writing reftable records used by the
reftable library.
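
The trade-off behind this knob can be sketched in a few lines of C. This
is not the reftable encoder: it only counts key bytes, under the
assumption that a record at a restart point stores its full key while
every other record stores just the suffix that differs from the previous
key. The helper names `common_prefix()` and `stored_key_bytes()` are
made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the prefix shared by two NUL-terminated keys. */
static size_t common_prefix(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Hypothetical helper: count the key bytes stored for the sorted `keys`
 * when a restart point (full key) is written every `interval` records
 * and all other records store only the suffix that differs from the
 * previous key.
 */
static size_t stored_key_bytes(const char **keys, size_t n, size_t interval)
{
	size_t total = 0;

	assert(interval > 0);
	for (size_t i = 0; i < n; i++) {
		if (i % interval == 0)
			total += strlen(keys[i]);
		else
			total += strlen(keys[i]) -
				 common_prefix(keys[i - 1], keys[i]);
	}
	return total;
}
```

For the four keys "refs/heads/branch-1" through "refs/heads/branch-4",
an interval of 1 stores 76 key bytes in this model, while an interval
of 4 stores 22. The real format also spends bytes on the restart table
itself, which this sketch ignores.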

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 19 ++++++++++++++
 refs/reftable-backend.c           |  5 ++++
 t/t0613-reftable-write-options.sh | 43 +++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index fa7c4be014..16b915c75e 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -12,3 +12,22 @@ readers during access.
 +
 The largest block size is `16777215` bytes (15.99 MiB). The default value is
 `4096` bytes (4kB). A value of `0` will use the default value.
+
+reftable.restartInterval::
+	The interval at which to create restart points. The reftable backend
+	determines the restart points at file creation. The process is
+	arbitrary, but every 16 or 64 records is recommended. Every 16 may be
+	more suitable for smaller block sizes (4k or 8k), every 64 for larger
+	block sizes (64k).
++
+More frequent restart points reduce prefix compression and increase
+space consumed by the restart table, both of which increase file size.
++
+Less frequent restart points make prefix compression more effective,
+decreasing overall file size, with increased penalties for readers
+walking through more records after the binary search step.
++
+A maximum of `65535` restart points per block is supported.
++
+The default value is to create restart points every 16 records. A value of `0`
+will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index bd9999cefc..9972dfc1a3 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -242,6 +242,11 @@ static int reftable_be_config(const char *var, const char *value,
 		if (block_size > 16777215)
 			die("reftable block size cannot exceed 16MB");
 		opts->block_size = block_size;
+	} else if (!strcmp(var, "reftable.restartinterval")) {
+		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
+		if (restart_interval > UINT16_MAX)
+			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
+		opts->restart_interval = restart_interval;
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 8bdbc6ec70..e0a5b26f58 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -171,4 +171,47 @@ test_expect_success 'block size exceeding maximum supported size' '
 	)
 '
 
+test_expect_success 'restart interval at every single record' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 10)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.restartInterval=1 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 566
+		    restarts: 13
+		log:
+		  - length: 1393
+		    restarts: 12
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'restart interval exceeding maximum supported interval' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 65535
+		EOF
+		test_must_fail git -c reftable.restartInterval=65536 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.0



* [PATCH v2 09/11] refs/reftable: allow disabling writing the object index
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-05-10 10:29   ` [PATCH v2 08/11] refs/reftable: allow configuring " Patrick Steinhardt
@ 2024-05-10 10:30   ` Patrick Steinhardt
  2024-05-10 10:30   ` [PATCH v2 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:30 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 4199 bytes --]

Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.

This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to save some precious disk space.

Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.
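
To illustrate what a reverse mapping buys, here is a minimal sketch in
C. It does not reflect the on-disk layout of "obj" blocks; it only shows
the idea of answering "which refs point at this object?" with a binary
search over an index sorted by object ID instead of a scan over all
refs. The struct and function names are hypothetical, and the object
IDs are shortened strings for readability.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref record: a name and the object ID it points to. */
struct ref_entry {
	const char *refname;
	const char *oid;
};

static int cmp_by_oid(const void *a, const void *b)
{
	const struct ref_entry *x = a, *y = b;
	return strcmp(x->oid, y->oid);
}

/* Build the reverse index: a copy of the refs sorted by object ID. */
static struct ref_entry *build_obj_index(const struct ref_entry *refs, size_t n)
{
	struct ref_entry *idx;

	assert(refs && n > 0);
	idx = malloc(n * sizeof(*idx));
	if (!idx)
		return NULL;
	memcpy(idx, refs, n * sizeof(*idx));
	qsort(idx, n, sizeof(*idx), cmp_by_oid);
	return idx;
}

/* Count the refs pointing at `oid` via binary search on the index. */
static size_t count_refs_for_oid(const struct ref_entry *idx, size_t n,
				 const char *oid)
{
	size_t lo = 0, hi = n, count = 0;

	/* Find the first entry whose object ID is not smaller than `oid`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(idx[mid].oid, oid) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	/* All matching entries are adjacent in the sorted index. */
	while (lo + count < n && !strcmp(idx[lo + count].oid, oid))
		count++;
	return count;
}
```

The index costs roughly one extra entry per ref, which is the kind of
space overhead the new option lets a user avoid.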

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt |  6 +++
 refs/reftable-backend.c           |  2 +
 t/t0613-reftable-write-options.sh | 69 +++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 16b915c75e..6e4466f3c5 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -31,3 +31,9 @@ A maximum of `65535` restart points per block is supported.
 +
 The default value is to create restart points every 16 records. A value of `0`
 will use the default value.
+
+reftable.indexObjects::
+	Whether the reftable backend shall write object blocks. Object blocks
+	are a reverse mapping of object ID to the references pointing to them.
++
+The default value is `true`.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 9972dfc1a3..63b75f770d 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -247,6 +247,8 @@ static int reftable_be_config(const char *var, const char *value,
 		if (restart_interval > UINT16_MAX)
 			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
 		opts->restart_interval = restart_interval;
+	} else if (!strcmp(var, "reftable.indexobjects")) {
+		opts->skip_index_objects = !git_config_bool(var, value);
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index e0a5b26f58..e2708e11d5 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -214,4 +214,73 @@ test_expect_success 'restart interval exceeding maximum supported interval' '
 	)
 '
 
+test_expect_success 'object index gets written by default with ref index' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		obj:
+		  - length: 11
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'object index can be disabled' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 -c reftable.indexObjects=false pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.45.0



* [PATCH v2 10/11] reftable: make the compaction factor configurable
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-05-10 10:30   ` [PATCH v2 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
@ 2024-05-10 10:30   ` Patrick Steinhardt
  2024-05-10 22:12     ` Junio C Hamano
  2024-05-10 10:30   ` [PATCH v2 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
  2024-05-10 11:43   ` [PATCH v2 00/11] reftable: expose write options as config Karthik Nayak
  11 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:30 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 4673 bytes --]

When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.
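
For readers who want to experiment with different factors, the
following is a standalone sketch of the segment selection. It is
reconstructed from the hunks in this patch; the lines that the diff
does not show (the early return and the surrounding loop structure) are
assumptions, the function is renamed to `suggest_segment_sketch()` to
make clear it is not the library function, and overflow of
`size * factor` is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_GEOMETRIC_FACTOR 2

struct segment {
	size_t start, end;	/* tables in [start, end) would be merged */
	uint64_t bytes;
};

static struct segment suggest_segment_sketch(const uint64_t *sizes, size_t n,
					      uint8_t factor)
{
	struct segment seg = { 0 };
	uint64_t bytes = 0;
	size_t i;

	assert(sizes || !n);
	if (!factor)
		factor = DEFAULT_GEOMETRIC_FACTOR;

	/* Zero or one table is a geometric sequence by definition. */
	if (n <= 1)
		return seg;

	/*
	 * Walk from the newest table towards older ones and stop at the
	 * first table whose predecessor is less than `factor` times as big.
	 */
	for (i = n - 1; i > 0; i--) {
		if (sizes[i - 1] < sizes[i] * factor) {
			seg.end = i + 1;
			bytes = sizes[i];
			break;
		}
	}

	/*
	 * Keep walking towards older tables while accumulating sizes. The
	 * segment start moves to every table that is smaller than `factor`
	 * times the size accumulated so far.
	 */
	for (; i > 0; i--) {
		uint64_t curr = bytes;
		bytes += sizes[i - 1];

		if (sizes[i - 1] < curr * factor) {
			seg.start = i - 1;
			seg.bytes = bytes;
		}
	}

	return seg;
}
```

With the sizes from the unit tests touched by this patch, a factor of 2
yields start 1 and end 10 for { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 }
and an empty segment for { 64, 32, 16, 8, 4, 2 }. Raising the factor to
3 makes the second sequence non-geometric, so the sketch then suggests
compacting all six tables.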

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/constants.h       |  1 +
 reftable/reftable-writer.h |  6 ++++++
 reftable/stack.c           | 14 ++++++++++----
 reftable/stack.h           |  3 ++-
 reftable/stack_test.c      |  4 ++--
 5 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/reftable/constants.h b/reftable/constants.h
index 5eee72c4c1..f6beb843eb 100644
--- a/reftable/constants.h
+++ b/reftable/constants.h
@@ -17,5 +17,6 @@ license that can be found in the LICENSE file or at
 
 #define MAX_RESTARTS ((1 << 16) - 1)
 #define DEFAULT_BLOCK_SIZE 4096
+#define DEFAULT_GEOMETRIC_FACTOR 2
 
 #endif
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 4cd8ebe6c7..155457b042 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -49,6 +49,12 @@ struct reftable_write_options {
 
 	/* boolean: Prevent auto-compaction of tables. */
 	unsigned disable_auto_compact : 1;
+
+	/*
+	 * Geometric sequence factor used by auto-compaction to decide which
+	 * tables to compact. Defaults to 2 if unset.
+	 */
+	uint8_t auto_compaction_factor;
 };
 
 /* reftable_block_stats holds statistics for a single block type */
diff --git a/reftable/stack.c b/reftable/stack.c
index 7b4fff7c9e..762954b181 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -10,6 +10,7 @@ license that can be found in the LICENSE file or at
 
 #include "../write-or-die.h"
 #include "system.h"
+#include "constants.h"
 #include "merged.h"
 #include "reader.h"
 #include "refname.h"
@@ -1215,12 +1216,16 @@ static int segment_size(struct segment *s)
 	return s->end - s->start;
 }
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor)
 {
 	struct segment seg = { 0 };
 	uint64_t bytes;
 	size_t i;
 
+	if (!factor)
+		factor = DEFAULT_GEOMETRIC_FACTOR;
+
 	/*
 	 * If there are no tables or only a single one then we don't have to
 	 * compact anything. The sequence is geometric by definition already.
@@ -1252,7 +1257,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 	 * 	64, 32, 16, 8, 4, 3, 1
 	 */
 	for (i = n - 1; i > 0; i--) {
-		if (sizes[i - 1] < sizes[i] * 2) {
+		if (sizes[i - 1] < sizes[i] * factor) {
 			seg.end = i + 1;
 			bytes = sizes[i];
 			break;
@@ -1278,7 +1283,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 		uint64_t curr = bytes;
 		bytes += sizes[i - 1];
 
-		if (sizes[i - 1] < curr * 2) {
+		if (sizes[i - 1] < curr * factor) {
 			seg.start = i - 1;
 			seg.bytes = bytes;
 		}
@@ -1304,7 +1309,8 @@ int reftable_stack_auto_compact(struct reftable_stack *st)
 {
 	uint64_t *sizes = stack_table_sizes_for_compaction(st);
 	struct segment seg =
-		suggest_compaction_segment(sizes, st->merged->stack_len);
+		suggest_compaction_segment(sizes, st->merged->stack_len,
+					   st->opts.auto_compaction_factor);
 	reftable_free(sizes);
 	if (segment_size(&seg) > 0)
 		return stack_compact_range_stats(st, seg.start, seg.end - 1,
diff --git a/reftable/stack.h b/reftable/stack.h
index 97d7ebc043..5b45cff4f7 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -35,6 +35,7 @@ struct segment {
 	uint64_t bytes;
 };
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n);
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor);
 
 #endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 3316d55f19..f6c11ef18d 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -767,7 +767,7 @@ static void test_suggest_compaction_segment(void)
 {
 	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
 	struct segment min =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(min.start == 1);
 	EXPECT(min.end == 10);
 }
@@ -776,7 +776,7 @@ static void test_suggest_compaction_segment_nothing(void)
 {
 	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
 	struct segment result =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(result.start == result.end);
 }
 
-- 
2.45.0



* [PATCH v2 11/11] refs/reftable: allow configuring geometric factor
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-05-10 10:30   ` [PATCH v2 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
@ 2024-05-10 10:30   ` Patrick Steinhardt
  2024-05-10 11:43   ` [PATCH v2 00/11] reftable: expose write options as config Karthik Nayak
  11 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-10 10:30 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]

Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.
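
The range check matters because the value is narrowed into a `uint8_t`.
A minimal standalone sketch of that validation follows. It uses plain
`strtoul()` rather than Git's config parser, so unit suffixes are not
understood, and it applies the default at parse time, whereas the patch
stores `0` and lets the compaction code substitute the default. The
helper name `parse_geometric_factor()` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_GEOMETRIC_FACTOR 2

/*
 * Hypothetical helper: parse `value` and store the resulting factor in
 * `out`. Returns 0 on success and -1 if the value is not a plain
 * decimal number or does not fit into a uint8_t.
 */
static int parse_geometric_factor(const char *value, uint8_t *out)
{
	char *end;
	unsigned long factor;

	assert(value && out);
	errno = 0;
	factor = strtoul(value, &end, 10);
	if (errno || end == value || *end != '\0')
		return -1;
	/* Without this check the narrowing below would silently truncate. */
	if (factor > UINT8_MAX)
		return -1;
	*out = factor ? (uint8_t)factor : DEFAULT_GEOMETRIC_FACTOR;
	return 0;
}
```

In this sketch "255" is the largest accepted value, "256" is rejected,
and "0" selects the default factor of 2.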

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 10 ++++++++++
 refs/reftable-backend.c           |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 6e4466f3c5..f928d1029b 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -37,3 +37,13 @@ reftable.indexObjects::
 	are a reverse mapping of object ID to the references pointing to them.
 +
 The default value is `true`.
+
+reftable.geometricFactor::
+	Whenever the reftable backend appends a new table to the stack, it
+	performs auto compaction to ensure that there is only a handful of
+	tables. The backend does this by ensuring that tables form a geometric
+	sequence regarding the respective sizes of each table.
++
+By default, the geometric sequence uses a factor of 2, meaning that for any
+table, the next-biggest table must at least be twice as big. A maximum factor
+of 255 is supported.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 63b75f770d..c4bd0fe1f9 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -249,6 +249,11 @@ static int reftable_be_config(const char *var, const char *value,
 		opts->restart_interval = restart_interval;
 	} else if (!strcmp(var, "reftable.indexobjects")) {
 		opts->skip_index_objects = !git_config_bool(var, value);
+	} else if (!strcmp(var, "reftable.geometricfactor")) {
+		unsigned long factor = git_config_ulong(var, value, ctx->kvi);
+		if (factor > UINT8_MAX)
+			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
+		opts->auto_compaction_factor = factor;
 	}
 
 	return 0;
-- 
2.45.0



* Re: [PATCH v2 00/11] reftable: expose write options as config
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2024-05-10 10:30   ` [PATCH v2 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
@ 2024-05-10 11:43   ` Karthik Nayak
  11 siblings, 0 replies; 78+ messages in thread
From: Karthik Nayak @ 2024-05-10 11:43 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 7313 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this is the second version of my patch series that exposes various
> options of the reftable writer via Git configuration.
>
> Changes compared to v1:
>
>   - Drop unneeded return statements.
>
>   - Move default geometric factor into "constants.h".
>
>   - Fix a typo in a commit message.
>
> Thanks!
>
> Patrick
>
> Patrick Steinhardt (11):
>   reftable: consistently refer to `reftable_write_options` as `opts`
>   reftable: consistently pass write opts as value
>   reftable/writer: drop static variable used to initialize strbuf
>   reftable/writer: improve error when passed an invalid block size
>   reftable/dump: support dumping a table's block structure
>   refs/reftable: allow configuring block size
>   reftable: use `uint16_t` to track restart interval
>   refs/reftable: allow configuring restart interval
>   refs/reftable: allow disabling writing the object index
>   reftable: make the compaction factor configurable
>   refs/reftable: allow configuring geometric factor
>
>  Documentation/config.txt          |   2 +
>  Documentation/config/reftable.txt |  49 +++++
>  refs/reftable-backend.c           |  43 ++++-
>  reftable/block.h                  |   2 +-
>  reftable/constants.h              |   1 +
>  reftable/dump.c                   |  12 +-
>  reftable/merged_test.c            |   6 +-
>  reftable/reader.c                 |  63 +++++++
>  reftable/readwrite_test.c         |  26 +--
>  reftable/refname_test.c           |   2 +-
>  reftable/reftable-reader.h        |   2 +
>  reftable/reftable-stack.h         |   2 +-
>  reftable/reftable-writer.h        |  10 +-
>  reftable/stack.c                  |  57 +++---
>  reftable/stack.h                  |   5 +-
>  reftable/stack_test.c             | 118 ++++++------
>  reftable/writer.c                 |  20 +--
>  t/t0613-reftable-write-options.sh | 286 ++++++++++++++++++++++++++++++
>  18 files changed, 576 insertions(+), 130 deletions(-)
>  create mode 100644 Documentation/config/reftable.txt
>  create mode 100755 t/t0613-reftable-write-options.sh
>
> Range-diff against v1:
>  1:  47cee6e25e =  1:  7efa566306 reftable: consistently refer to `reftable_write_options` as `opts`
>  2:  d8a0764e87 =  2:  e6f8fc09c2 reftable: consistently pass write opts as value
>  3:  c040f81fba =  3:  aa2903e3e5 reftable/writer: drop static variable used to initialize strbuf
>  4:  ef79bb1b7b =  4:  5e7cbb7b19 reftable/writer: improve error when passed an invalid block size
>  5:  4d4407d4a4 =  5:  ed1c150d90 reftable/dump: support dumping a table's block structure
>  6:  b4e4db5735 !  6:  be5bdc6dc1 refs/reftable: allow configuring block size
>     @@ refs/reftable-backend.c: static int read_ref_without_reload(struct reftable_stac
>      +		if (block_size > 16777215)
>      +			die("reftable block size cannot exceed 16MB");
>      +		opts->block_size = block_size;
>     -+		return 0;
>      +	}
>      +
>      +	return 0;
>  7:  79d9e07ca9 =  7:  05e8d1df2d reftable: use `uint16_t` to track restart interval
>  8:  653ec4dfa5 !  8:  bc0bf65553 refs/reftable: allow configuring restart interval
>     @@ Documentation/config/reftable.txt: readers during access.
>
>       ## refs/reftable-backend.c ##
>      @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
>     + 		if (block_size > 16777215)
>       			die("reftable block size cannot exceed 16MB");
>       		opts->block_size = block_size;
>     - 		return 0;
>      +	} else if (!strcmp(var, "reftable.restartinterval")) {
>      +		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
>      +		if (restart_interval > UINT16_MAX)
>      +			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
>      +		opts->restart_interval = restart_interval;
>     -+		return 0;
>       	}
>
>       	return 0;
>  9:  6f2c481acc !  9:  6bc240fd0c refs/reftable: allow disabling writing the object index
>     @@ Documentation/config/reftable.txt: A maximum of `65535` restart points per block
>
>       ## refs/reftable-backend.c ##
>      @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
>     + 		if (restart_interval > UINT16_MAX)
>       			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
>       		opts->restart_interval = restart_interval;
>     - 		return 0;
>      +	} else if (!strcmp(var, "reftable.indexobjects")) {
>      +		opts->skip_index_objects = !git_config_bool(var, value);
>     -+		return 0;
>       	}
>
>       	return 0;
> 10:  30e2e33479 ! 10:  9d4c1f0340 reftable: make the compaction factor configurable
>     @@ Commit message
>
>          Signed-off-by: Patrick Steinhardt <ps@pks.im>
>
>     + ## reftable/constants.h ##
>     +@@ reftable/constants.h: license that can be found in the LICENSE file or at
>     +
>     + #define MAX_RESTARTS ((1 << 16) - 1)
>     + #define DEFAULT_BLOCK_SIZE 4096
>     ++#define DEFAULT_GEOMETRIC_FACTOR 2
>     +
>     + #endif
>     +
>       ## reftable/reftable-writer.h ##
>      @@ reftable/reftable-writer.h: struct reftable_write_options {
>
>     @@ reftable/reftable-writer.h: struct reftable_write_options {
>       /* reftable_block_stats holds statistics for a single block type */
>
>       ## reftable/stack.c ##
>     +@@ reftable/stack.c: license that can be found in the LICENSE file or at
>     +
>     + #include "../write-or-die.h"
>     + #include "system.h"
>     ++#include "constants.h"
>     + #include "merged.h"
>     + #include "reader.h"
>     + #include "refname.h"
>      @@ reftable/stack.c: static int segment_size(struct segment *s)
>       	return s->end - s->start;
>       }
>     @@ reftable/stack.c: static int segment_size(struct segment *s)
>       	size_t i;
>
>      +	if (!factor)
>     -+		factor = 2;
>     ++		factor = DEFAULT_GEOMETRIC_FACTOR;
>      +
>       	/*
>       	 * If there are no tables or only a single one then we don't have to
> 11:  861f2e72d9 ! 11:  e1282e53fb refs/reftable: allow configuring geometric factor
>     @@ Documentation/config/reftable.txt: reftable.indexObjects::
>       The default value is `true`.
>      +
>      +reftable.geometricFactor::
>     -+	Whenever the reftable backend appends a new table to the table it
>     ++	Whenever the reftable backend appends a new table to the stack, it
>      +	performs auto compaction to ensure that there is only a handful of
>      +	tables. The backend does this by ensuring that tables form a geometric
>      +	sequence regarding the respective sizes of each table.
>     @@ Documentation/config/reftable.txt: reftable.indexObjects::
>
>       ## refs/reftable-backend.c ##
>      @@ refs/reftable-backend.c: static int reftable_be_config(const char *var, const char *value,
>     + 		opts->restart_interval = restart_interval;
>       	} else if (!strcmp(var, "reftable.indexobjects")) {
>       		opts->skip_index_objects = !git_config_bool(var, value);
>     - 		return 0;
>      +	} else if (!strcmp(var, "reftable.geometricfactor")) {
>      +		unsigned long factor = git_config_ulong(var, value, ctx->kvi);
>      +		if (factor > UINT8_MAX)
>
> base-commit: d4cc1ec35f3bcce816b69986ca41943f6ce21377
> --
> 2.45.0

The range diff looks good to me, thanks for the quick iteration :)


* Re: [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-10 10:29   ` [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
@ 2024-05-10 21:03     ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 21:03 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

>  int reftable_new_stack(struct reftable_stack **dest, const char *dir,
> -		       struct reftable_write_options config);
> +		       struct reftable_write_options opts);

Passing struct by value is somewhat unusual.  As long as the
structure does not contain any pointer to something else, it is not
too bad as structure assignment would also work well, though.  Not a
fault of this patch, and this series is not a place to change
anything about it.

> -	if (config.hash_id == 0) {
> -		config.hash_id = GIT_SHA1_FORMAT_ID;
> -	}
> +	if (opts.hash_id == 0)
> +		opts.hash_id = GIT_SHA1_FORMAT_ID;

Nice attention to detail.



* Re: [PATCH v2 02/11] reftable: consistently pass write opts as value
  2024-05-10 10:29   ` [PATCH v2 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
@ 2024-05-10 21:11     ` Junio C Hamano
  2024-05-13  7:53       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 21:11 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> We sometimes pass the reftable write options as value and sometimes as a
> pointer. This is quite confusing and makes the reader wonder whether the
> options get modified sometimes.
>
> In fact, `reftable_new_writer()` does cause the caller-provided options
> to get updated when some values aren't set up. This is quite unexpected,
> but didn't cause any harm until now.
>
> Refactor the code to consistently pass the options as a value so that
> they are local to the subsystem they are being passed into so that we
> can avoid weirdness like this.

Turning pass-by-reference into pass-by-value of a large structure is a
rather huge hammer to ensure that the structure is not modified
(qualifying the pointer with "const" is the usual tool for that).
Consistency is good, but I am not sure offhand if this is making
things consistent in a good way.
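
The two alternatives being weighed here can be put side by side in a
small sketch. The struct below is a made-up, trimmed-down stand-in for
the real options structure, and both function names are hypothetical.
Either variant leaves the caller's copy untouched; the difference is
whether the whole structure is copied at the call boundary or only
where a default has to be filled in.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_BLOCK_SIZE 4096

/* Hypothetical stand-in for the write options; illustration only. */
struct write_options {
	uint32_t block_size;
};

/* By value: defaults are applied to the callee's private copy. */
static uint32_t block_size_by_value(struct write_options opts)
{
	if (!opts.block_size)
		opts.block_size = DEFAULT_BLOCK_SIZE;
	return opts.block_size;
}

/*
 * By const pointer: the compiler rejects writes through `opts`, so the
 * callee has to copy explicitly before filling in defaults.
 */
static uint32_t block_size_by_const_ptr(const struct write_options *opts)
{
	struct write_options copy;

	assert(opts);
	copy = *opts;
	if (!copy.block_size)
		copy.block_size = DEFAULT_BLOCK_SIZE;
	return copy.block_size;
}
```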


* Re: [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf
  2024-05-10 10:29   ` [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
@ 2024-05-10 21:19     ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 21:19 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> We have a static variable in the reftable writer code that is merely
> used to initialize the `last_key` of the writer. Convert the code to
> instead use `strbuf_init()` and drop the variable.

Nice.  There is no guarantee that a structure assignment of a
just-initialized empty strbuf to another will remain a trouble-free
operation, and using an explicit _init() call is the right thing to
do.



* Re: [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size
  2024-05-10 10:29   ` [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
@ 2024-05-10 21:25     ` Junio C Hamano
  2024-05-13  7:53       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 21:25 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> The reftable format only supports block sizes up to 16MB. When the
> writer is being passed a value bigger than that it simply calls
> abort(3P), which isn't all that helpful due to the lack of a proper
> error message.
>
> Improve this by calling `BUG()` instead.

As a "git" person, I do not mind this at all.

But doesn't the reftable/ library codebase want to avoid things like
BUG() that are very much tied to our codebase, for the same reason
as it avoids things like xmalloc(), xcalloc(), and ALLOC_GROW()?

We may have crossed that bridge a long time ago, though.  We already
see a handful of calls to BUG() inside the reftable/ directory.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 08/11] refs/reftable: allow configuring restart interval
  2024-05-10 10:29   ` [PATCH v2 08/11] refs/reftable: allow configuring " Patrick Steinhardt
@ 2024-05-10 21:57     ` Junio C Hamano
  2024-05-13  7:54       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 21:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> +
> +reftable.restartInterval::
> +	The interval at which to create restart points. The reftable backend
> +	determines the restart points at file creation. The process is
> +	arbitrary, but every 16 or 64 records is recommended. Every 16 may be

It is unclear what exactly "The process is arbitrary, but" wants to
say, especially the use of the noun "process".  The process the user
uses to choose the interval value is?  The default value chosen by us
was arbitrary and out of thin air?

Just striking the whole sentence (or removing up to ", but" part and
starting the sentence with "Every 16 or 64") may make the resulting
paragraph easier to follow, I suspect.

> +	} else if (!strcmp(var, "reftable.restartinterval")) {
> +		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
> +		if (restart_interval > UINT16_MAX)
> +			die("reftable restart interval cannot exceed %u", (unsigned)UINT16_MAX);

OK.
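
For readers without the Git tree at hand, the range check quoted above
can be sketched in isolation. This is only an illustration under
assumptions: `parse_restart_interval()` is a hypothetical helper, and it
uses plain strtoul(3) instead of `git_config_ulong()`, so it does not
understand the k/m/g unit suffixes that Git's config parser accepts.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse `value` as a restart interval and store it
 * in `*out`. Returns 0 on success and -1 if the value is not a plain
 * decimal number or does not fit into 16 bits.
 */
static int parse_restart_interval(const char *value, uint16_t *out)
{
	char *end;
	unsigned long parsed;

	/* Require a leading digit so that signs and whitespace are rejected. */
	if (!value || *value < '0' || *value > '9')
		return -1;

	errno = 0;
	parsed = strtoul(value, &end, 10);
	if (errno || *end != '\0')
		return -1;

	/* The restart interval is tracked as a `uint16_t`. */
	if (parsed > UINT16_MAX)
		return -1;

	*out = (uint16_t)parsed;
	return 0;
}
```

A caller would report the failure in whatever way suits it; the quoted
patch calls die() at that point.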


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 10/11] reftable: make the compaction factor configurable
  2024-05-10 10:30   ` [PATCH v2 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
@ 2024-05-10 22:12     ` Junio C Hamano
  2024-05-13  7:54       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-10 22:12 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> When auto-compacting, the reftable library packs references such that
> the sizes of the tables form a geometric sequence. The factor for this
> geometric sequence is hardcoded to 2 right now. We're about to expose
> this as a config option though, so let's expose the factor via write
> options.

Hmph.  It is unclear if having this as uint8_t gives us a useful
enhancement, but perhaps in the future hosters may find a more
aggressive geometric sequence is better for their workload or
something and raise it to 3 or 4?  I was actually wondering if a
base smaller than 2 (e.g. fibonacci) may work better.

Anyway, making it configurable is a good first step.  Allowing a bit
finer grained setting than just integral values can be done later if
it proves necessary.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 02/11] reftable: consistently pass write opts as value
  2024-05-10 21:11     ` Junio C Hamano
@ 2024-05-13  7:53       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  7:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]

On Fri, May 10, 2024 at 02:11:55PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We sometimes pass the reftable write options as value and sometimes as a
> > pointer. This is quite confusing and makes the reader wonder whether the
> > options get modified sometimes.
> >
> > In fact, `reftable_new_writer()` does cause the caller-provided options
> > to get updated when some values aren't set up. This is quite unexpected,
> > but didn't cause any harm until now.
> >
> > Refactor the code to consistently pass the options as a value so that
> > they are local to the subsystem they are being passed into so that we
> > can avoid weirdness like this.
> 
> Turning pass-by-reference to pass-by-value of a large structure is a
> rather huge hammer to ensure that the structure is not modified
> (qualifying the pointer with "const" is).  Consistency is good, but
> I am not sure offhand if this is making things consistent in a good
> way.

Yeah, I've also been a bit "meh" about this. I'm happy to convert this
to all be const pointers instead.
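
The const-pointer approach can be sketched as follows. This is a minimal
illustration, not the actual reftable code: `struct write_options`,
`struct stack` and `stack_init()` are hypothetical, trimmed-down
stand-ins, and the default values are assumptions chosen for the example.
The callee copies the options into a private struct and fills in
defaults there, so the caller's struct is never written to.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct reftable_write_options`. */
struct write_options {
	uint32_t block_size;
	uint32_t hash_id;
};

/* Illustrative defaults, assumed for this sketch. */
#define DEFAULT_BLOCK_SIZE 4096u
#define DEFAULT_HASH_ID 0x73686131u /* the bytes "sha1" */

/* Hypothetical stand-in for the stack; it owns its copy of the options. */
struct stack {
	struct write_options opts;
};

/*
 * Takes the options as a const pointer. Defaults are applied to the
 * private copy only. A NULL pointer means "use all defaults".
 */
static void stack_init(struct stack *st, const struct write_options *_opts)
{
	struct write_options opts = { 0 };

	if (_opts)
		opts = *_opts;
	if (opts.hash_id == 0)
		opts.hash_id = DEFAULT_HASH_ID;
	if (opts.block_size == 0)
		opts.block_size = DEFAULT_BLOCK_SIZE;

	st->opts = opts;
}
```

Compared to pass-by-value, the signature itself documents that the
callee cannot modify the caller's options, and no large struct has to be
copied at every call site.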

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size
  2024-05-10 21:25     ` Junio C Hamano
@ 2024-05-13  7:53       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  7:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1863 bytes --]

On Fri, May 10, 2024 at 02:25:28PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The reftable format only supports block sizes up to 16MB. When the
> > writer is being passed a value bigger than that it simply calls
> > abort(3P), which isn't all that helpful due to the lack of a proper
> > error message.
> >
> > Improve this by calling `BUG()` instead.
> 
> As a "git" person, I do not mind this at all.
> 
> But doesn't the reftable/ library codebase want to avoid things like
> BUG() that are very much tied to our codebase, for the same reason
> as it avoids things like xmalloc(), xcalloc(), and ALLOC_GROW()?
> 
> We may have crossed that bridge a long time ago, though.  We already
> see a handful of calls to BUG() inside the reftable/ directory.

Exactly. No matter what, once there is a second user of the
reftable library we will have to figure out a maintainable way to ensure
that the library can be used by other projects, too. And that will
require some larger refactorings anyway.

I think initially, the intent was to have a "system.h" header that
contains a bunch of wrappers that bridge the gap between reftables and
Git. I feel like this abstraction does not make any sense though in its
current form as it is simply being included by the reftable code, which
then uses the Git functions directly.

I think eventually, we will have to adapt this such that the Git
includes do not leak into the reftable code at all. Instead, we should
have a shim "system.c" that carries the Git-specific includes and then
implements a couple of wrapper functions. "system.h" would then only be
carrying function declarations of those wrappers.

That's a larger topic though, and I think that tackling this now would
be premature without any potential users of the reftable library.
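
To make the shim idea concrete, here is a rough single-file sketch. All
names (`reftable_sys_malloc()`, `reftable_sys_bug()`,
`check_block_size()`) are hypothetical and do not exist in the reftable
library. In Git, the implementation part would include the Git headers
and forward to functions like xmalloc() and BUG(); plain libc calls stand
in for them here so that the sketch compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Part 1: what "system.h" could carry. Only declarations of the
 * wrappers, no host-specific includes.
 */
void *reftable_sys_malloc(size_t size);
void reftable_sys_bug(const char *msg);

/*
 * Part 2: what "system.c" could carry. This is the only place that
 * knows about the host project; libc is the stand-in host here.
 */
void reftable_sys_bug(const char *msg)
{
	fprintf(stderr, "BUG: %s\n", msg);
	abort();
}

void *reftable_sys_malloc(size_t size)
{
	void *p = malloc(size ? size : 1);
	if (!p)
		reftable_sys_bug("out of memory");
	return p;
}

/* Part 3: library code that only sees the declarations from part 1. */
static int check_block_size(unsigned long block_size)
{
	if (block_size >= (1ul << 24)) {
		reftable_sys_bug("configured block size exceeds 16MB");
		return -1; /* not reached with this shim, which aborts */
	}
	return 0;
}
```

A second user of the library would then only have to provide its own
implementation of part 2.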

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 08/11] refs/reftable: allow configuring restart interval
  2024-05-10 21:57     ` Junio C Hamano
@ 2024-05-13  7:54       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  7:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]

On Fri, May 10, 2024 at 02:57:46PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +
> > +reftable.restartInterval::
> > +	The interval at which to create restart points. The reftable backend
> > +	determines the restart points at file creation. The process is
> > +	arbitrary, but every 16 or 64 records is recommended. Every 16 may be
> 
> It is unclear what exactly "The process is arbitrary, but" wants to
> say, especially the use of the noun "process".  The process the user
> uses to choose the interval value is?  The default value chosen by us
> was arbitrary and out of thin air?

The latter is what I wanted to say, but I agree that it's hard to parse.
And honestly, I don't even know how arbitrary it is, so I should
probably not claim something like this in the first place.

> Just striking the whole sentence (or removing up to ", but" part and
> starting the sentence with "Every 16 or 64") may make the resulting
> paragraph easier to follow, I suspect.

Will do.

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 10/11] reftable: make the compaction factor configurable
  2024-05-10 22:12     ` Junio C Hamano
@ 2024-05-13  7:54       ` Patrick Steinhardt
  2024-05-13 16:22         ` Junio C Hamano
  0 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  7:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 1572 bytes --]

On Fri, May 10, 2024 at 03:12:03PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > When auto-compacting, the reftable library packs references such that
> > the sizes of the tables form a geometric sequence. The factor for this
> > geometric sequence is hardcoded to 2 right now. We're about to expose
> > this as a config option though, so let's expose the factor via write
> > options.
> 
> Hmph.  It is unclear if having this as uint8_t gives us a useful
> enhancement, but perhaps in the future hosters may find a more
> aggressive geometric sequence is better for their workload or
> something and raise it to 3 or 4?  I was actually wondering if a
> base smaller than 2 (e.g. fibonacci) may work better.
> 
> Anyway, making it configurable is a good first step.  Allowing a bit
> finer grained setting than just integral values can be done later if
> it proves necessary.

That's a fair point indeed. I had similar issues with git-repack(1)'s
`--geometric=` option, where you can also only pick integers. There's
also a similar discussion in the patch series by Taylor [1], where I
proposed to maybe introduce floats into git-config(1).

So this may be good enough for now, and when we gain the ability to
parse floats we may convert this to accept floats, as well. An
alternative would be to convert this to percent, where the default value
would be 200. That should give sufficient flexibility without having to
introduce floats.
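
As a rough illustration of the percent idea, the geometric check can be
done with integer arithmetic only. This is a sketch under assumptions,
not the library's compaction heuristic: `is_geometric()` is a
hypothetical helper, the sizes are ordered from the largest table to the
smallest, and they are assumed to be small enough that multiplying by
the factor does not overflow 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if `sizes` (largest table first) forms a geometric
 * sequence, that is, if every table is at least
 * `factor_percent / 100` times as large as its successor.
 * Returns 0 otherwise.
 */
static int is_geometric(const uint64_t *sizes, size_t n,
			unsigned factor_percent)
{
	for (size_t i = 0; i + 1 < n; i++)
		/* Equivalent to sizes[i] < sizes[i + 1] * factor / 100. */
		if (sizes[i] * 100 < sizes[i + 1] * factor_percent)
			return 0;
	return 1;
}
```

With this representation a value of 200 corresponds to today's factor of
2, while a value such as 150 expresses a base below 2 without requiring
float parsing in the config code.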

Patrick

[1]: https://lore.kernel.org/git/Zjk2UIV3kEwZUDW+@nand.local/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v3 00/11] reftable: expose write options as config
  2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
                   ` (14 preceding siblings ...)
  2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
@ 2024-05-13  8:17 ` Patrick Steinhardt
  2024-05-13  8:17   ` [PATCH v3 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
                     ` (12 more replies)
  15 siblings, 13 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:17 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 28422 bytes --]

Hi,

this is the third version of my patch series that exposes several write
options of the reftable library via Git configs.

Changes compared to v2:

  - Adapted patch 2 such that we now pass options as const pointers
    instead of by value.

  - Removed a confusing sentence in the documentation of the restart
    points in patch 8.

Other than that I decided to rebase this on top of the current "master"
branch at 0f3415f1f8 (The second batch, 2024-05-08). This is because the
revamped patch 2 would cause new conflicts with 485c63cf5c (reftable:
remove name checks, 2024-04-08) that didn't exist in v2 of this patch
series yet. Rebasing thus seemed like the more reasonable option.

Patrick

Patrick Steinhardt (11):
  reftable: consistently refer to `reftable_write_options` as `opts`
  reftable: pass opts as constant pointer
  reftable/writer: drop static variable used to initialize strbuf
  reftable/writer: improve error when passed an invalid block size
  reftable/dump: support dumping a table's block structure
  refs/reftable: allow configuring block size
  reftable: use `uint16_t` to track restart interval
  refs/reftable: allow configuring restart interval
  refs/reftable: allow disabling writing the object index
  reftable: make the compaction factor configurable
  refs/reftable: allow configuring geometric factor

 Documentation/config.txt          |   2 +
 Documentation/config/reftable.txt |  48 +++++
 refs/reftable-backend.c           |  49 ++++-
 reftable/block.h                  |   2 +-
 reftable/constants.h              |   1 +
 reftable/dump.c                   |  12 +-
 reftable/reader.c                 |  63 +++++++
 reftable/reftable-reader.h        |   2 +
 reftable/reftable-stack.h         |   2 +-
 reftable/reftable-writer.h        |  10 +-
 reftable/stack.c                  |  58 +++---
 reftable/stack.h                  |   5 +-
 reftable/stack_test.c             | 118 ++++++------
 reftable/writer.c                 |  23 +--
 t/t0613-reftable-write-options.sh | 286 ++++++++++++++++++++++++++++++
 15 files changed, 566 insertions(+), 115 deletions(-)
 create mode 100644 Documentation/config/reftable.txt
 create mode 100755 t/t0613-reftable-write-options.sh

Range-diff against v2:
 1:  7efa566306 !  1:  71f4e31cf7 reftable: consistently refer to `reftable_write_options` as `opts`
    @@ reftable/stack.c: static uint64_t *stack_table_sizes_for_compaction(struct refta
      	int overhead = header_size(version) - 1;
      	int i = 0;
      	for (i = 0; i < st->merged->stack_len; i++) {
    -@@ reftable/stack.c: static int stack_check_addition(struct reftable_stack *st,
    - 	int len = 0;
    - 	int i = 0;
    - 
    --	if (st->config.skip_name_check)
    -+	if (st->opts.skip_name_check)
    - 		return 0;
    - 
    - 	err = reftable_block_source_from_file(&src, new_tab_name);
     @@ reftable/stack.c: int reftable_stack_clean(struct reftable_stack *st)
      int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
      {
 2:  e6f8fc09c2 !  2:  f1c9914a77 reftable: consistently pass write opts as value
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    reftable: consistently pass write opts as value
    +    reftable: pass opts as constant pointer
     
         We sometimes pass the reftable write options as value and sometimes as a
         pointer. This is quite confusing and makes the reader wonder whether the
    @@ Commit message
         to get updated when some values aren't set up. This is quite unexpected,
         but didn't cause any harm until now.
     
    -    Refactor the code to consistently pass the options as a value so that
    -    they are local to the subsystem they are being passed into so that we
    -    can avoid weirdness like this.
    +    Adapt the code so that we do not modify the caller-provided values
    +    anymore. While at it, refactor the code to consistently pass the
    +    options as a constant pointer to clarify that the caller-provided opts
    +    will not ever get modified.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    - ## reftable/merged_test.c ##
    -@@ reftable/merged_test.c: static void write_test_table(struct strbuf *buf,
    + ## refs/reftable-backend.c ##
    +@@ refs/reftable-backend.c: static struct reftable_stack *stack_for(struct reftable_ref_store *store,
    + 				    store->base.repo->commondir, wtname_buf.buf);
    + 
    + 			store->err = reftable_new_stack(&stack, wt_dir.buf,
    +-							store->write_options);
    ++							&store->write_options);
    + 			assert(store->err != REFTABLE_API_ERROR);
    + 			strmap_put(&store->worktree_stacks, wtname_buf.buf, stack);
      		}
    +@@ refs/reftable-backend.c: static struct ref_store *reftable_be_init(struct repository *repo,
      	}
    + 	strbuf_addstr(&path, "/reftable");
    + 	refs->err = reftable_new_stack(&refs->main_stack, path.buf,
    +-				       refs->write_options);
    ++				       &refs->write_options);
    + 	if (refs->err)
    + 		goto done;
      
    --	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
    -+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
    - 	reftable_writer_set_limits(w, min, max);
    - 
    - 	for (i = 0; i < n; i++) {
    -@@ reftable/merged_test.c: static void write_test_log_table(struct strbuf *buf,
    - 		.exact_log_message = 1,
    - 	};
    - 	struct reftable_writer *w = NULL;
    --	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
    -+	w = reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
    - 	reftable_writer_set_limits(w, update_index, update_index);
    - 
    - 	for (i = 0; i < n; i++) {
    -@@ reftable/merged_test.c: static void test_default_write_opts(void)
    - 	struct reftable_write_options opts = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    +@@ refs/reftable-backend.c: static struct ref_store *reftable_be_init(struct repository *repo,
    + 		strbuf_addf(&path, "%s/reftable", gitdir);
      
    - 	struct reftable_ref_record rec = {
    - 		.refname = "master",
    + 		refs->err = reftable_new_stack(&refs->worktree_stack, path.buf,
    +-					       refs->write_options);
    ++					       &refs->write_options);
    + 		if (refs->err)
    + 			goto done;
    + 	}
     
    - ## reftable/readwrite_test.c ##
    -@@ reftable/readwrite_test.c: static void write_table(char ***names, struct strbuf *buf, int N,
    - 		.hash_id = hash_id,
    - 	};
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, buf, opts);
    - 	struct reftable_ref_record ref = { NULL };
    - 	int i = 0, n;
    - 	struct reftable_log_record log = { NULL };
    -@@ reftable/readwrite_test.c: static void test_log_buffer_size(void)
    - 					   .message = "commit: 9\n",
    - 				   } } };
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 
    - 	/* This tests buffer extension for log compression. Must use a random
    - 	   hash, to ensure that the compressed part is larger than the original.
    -@@ reftable/readwrite_test.c: static void test_log_overflow(void)
    - 		},
    - 	};
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 
    - 	memset(msg, 'x', sizeof(msg) - 1);
    - 	reftable_writer_set_limits(w, update_index, update_index);
    -@@ reftable/readwrite_test.c: static void test_log_write_read(void)
    - 	struct reftable_block_source source = { NULL };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	const struct reftable_stats *stats = NULL;
    - 	reftable_writer_set_limits(w, 0, N);
    - 	for (i = 0; i < N; i++) {
    -@@ reftable/readwrite_test.c: static void test_log_zlib_corruption(void)
    - 	struct reftable_block_source source = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	const struct reftable_stats *stats = NULL;
    - 	char message[100] = { 0 };
    - 	int err, i, n;
    -@@ reftable/readwrite_test.c: static void test_table_refs_for(int indexed)
    - 
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 
    - 	struct reftable_iterator it = { NULL };
    - 	int j;
    -@@ reftable/readwrite_test.c: static void test_write_empty_table(void)
    + ## reftable/dump.c ##
    +@@ reftable/dump.c: static int compact_stack(const char *stackdir)
    + 	struct reftable_stack *stack = NULL;
      	struct reftable_write_options opts = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_block_source source = { NULL };
    - 	struct reftable_reader *rd = NULL;
    - 	struct reftable_ref_record rec = { NULL };
    -@@ reftable/readwrite_test.c: static void test_write_object_id_min_length(void)
    - 	};
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_ref_record ref = {
    - 		.update_index = 1,
    - 		.value_type = REFTABLE_REF_VAL1,
    -@@ reftable/readwrite_test.c: static void test_write_object_id_length(void)
    - 	};
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_ref_record ref = {
    - 		.update_index = 1,
    - 		.value_type = REFTABLE_REF_VAL1,
    -@@ reftable/readwrite_test.c: static void test_write_empty_key(void)
    - 	struct reftable_write_options opts = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_ref_record ref = {
    - 		.refname = "",
    - 		.update_index = 1,
    -@@ reftable/readwrite_test.c: static void test_write_key_order(void)
    - 	struct reftable_write_options opts = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_ref_record refs[2] = {
    - 		{
    - 			.refname = "b",
    -@@ reftable/readwrite_test.c: static void test_write_multiple_indices(void)
    - 	struct reftable_reader *reader;
    - 	int err, i;
      
    --	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
    -+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
    - 	reftable_writer_set_limits(writer, 1, 1);
    - 	for (i = 0; i < 100; i++) {
    - 		struct reftable_ref_record ref = {
    -@@ reftable/readwrite_test.c: static void test_write_multi_level_index(void)
    - 	struct reftable_reader *reader;
    - 	int err;
    +-	int err = reftable_new_stack(&stack, stackdir, opts);
    ++	int err = reftable_new_stack(&stack, stackdir, &opts);
    + 	if (err < 0)
    + 		goto done;
      
    --	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts);
    -+	writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, opts);
    - 	reftable_writer_set_limits(writer, 1, 1);
    - 	for (size_t i = 0; i < 200; i++) {
    - 		struct reftable_ref_record ref = {
     
    - ## reftable/refname_test.c ##
    -@@ reftable/refname_test.c: static void test_conflict(void)
    - 	struct reftable_write_options opts = { 0 };
    - 	struct strbuf buf = STRBUF_INIT;
    - 	struct reftable_writer *w =
    --		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts);
    -+		reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, opts);
    - 	struct reftable_ref_record rec = {
    - 		.refname = "a/b",
    - 		.value_type = REFTABLE_REF_SYMREF,
    + ## reftable/reftable-stack.h ##
    +@@ reftable/reftable-stack.h: struct reftable_stack;
    +  *  stored in 'dir'. Typically, this should be .git/reftables.
    +  */
    + int reftable_new_stack(struct reftable_stack **dest, const char *dir,
    +-		       struct reftable_write_options opts);
    ++		       const struct reftable_write_options *opts);
    + 
    + /* returns the update_index at which a next table should be written. */
    + uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
     
      ## reftable/reftable-writer.h ##
     @@ reftable/reftable-writer.h: struct reftable_stats {
    @@ reftable/reftable-writer.h: struct reftable_stats {
      reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
      		    int (*flush_func)(void *),
     -		    void *writer_arg, struct reftable_write_options *opts);
    -+		    void *writer_arg, struct reftable_write_options opts);
    ++		    void *writer_arg, const struct reftable_write_options *opts);
      
      /* Set the range of update indices for the records we will add. When writing a
         table into a stack, the min should be at least
     
      ## reftable/stack.c ##
    -@@ reftable/stack.c: int reftable_addition_add(struct reftable_addition *add,
    - 	tab_fd = get_tempfile_fd(tab_file);
    +@@ reftable/stack.c: static int reftable_fd_flush(void *arg)
    + }
    + 
    + int reftable_new_stack(struct reftable_stack **dest, const char *dir,
    +-		       struct reftable_write_options opts)
    ++		       const struct reftable_write_options *_opts)
    + {
    + 	struct reftable_stack *p = reftable_calloc(1, sizeof(*p));
    + 	struct strbuf list_file_name = STRBUF_INIT;
    ++	struct reftable_write_options opts = {0};
    + 	int err = 0;
      
    - 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
    --				 &add->stack->opts);
    -+				 add->stack->opts);
    - 	err = write_table(wr, arg);
    ++	if (_opts)
    ++		opts = *_opts;
    + 	if (opts.hash_id == 0)
    + 		opts.hash_id = GIT_SHA1_FORMAT_ID;
    + 
    +@@ reftable/stack.c: int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
    + 	struct reftable_merged_table *merged = NULL;
    + 	struct reftable_table table = { NULL };
    + 
    +-	int err = reftable_new_stack(&stack, stackdir, opts);
    ++	int err = reftable_new_stack(&stack, stackdir, &opts);
      	if (err < 0)
      		goto done;
    -@@ reftable/stack.c: static int stack_compact_locked(struct reftable_stack *st,
    + 
    +
    + ## reftable/stack_test.c ##
    +@@ reftable/stack_test.c: static void test_reftable_stack_add_one(void)
    + 	};
    + 	struct reftable_ref_record dest = { NULL };
    + 	struct stat stat_result = { 0 };
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st, &write_test_ref, &ref);
    +@@ reftable/stack_test.c: static void test_reftable_stack_uptodate(void)
    + 	/* simulate multi-process access to the same stack
    + 	   by creating two stacks for the same directory.
    + 	 */
    +-	err = reftable_new_stack(&st1, dir, opts);
    ++	err = reftable_new_stack(&st1, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    +-	err = reftable_new_stack(&st2, dir, opts);
    ++	err = reftable_new_stack(&st2, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st1, &write_test_ref, &ref1);
    +@@ reftable/stack_test.c: static void test_reftable_stack_transaction_api(void)
    + 	};
    + 	struct reftable_ref_record dest = { NULL };
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	reftable_addition_destroy(add);
    +@@ reftable/stack_test.c: static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
    + 	struct reftable_stack *st = NULL;
    + 	int i, n = 20, err;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i <= n; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_auto_compaction_fails_gracefully(void)
    + 	char *dir = get_tmp_dir(__LINE__);
    + 	int err;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st, write_test_ref, &ref);
    +@@ reftable/stack_test.c: static void test_reftable_stack_update_index_check(void)
    + 		.value.symref = "master",
    + 	};
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st, &write_test_ref, &ref1);
    +@@ reftable/stack_test.c: static void test_reftable_stack_lock_failure(void)
    + 	struct reftable_stack *st = NULL;
    + 	int err, i;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
    + 		err = reftable_stack_add(st, &write_error, &i);
    +@@ reftable/stack_test.c: static void test_reftable_stack_add(void)
    + 	struct stat stat_result;
    + 	int N = ARRAY_SIZE(refs);
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i < N; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_log_normalize(void)
    + 		.update_index = 1,
    + 	};
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	input.value.update.message = "one\ntwo";
    +@@ reftable/stack_test.c: static void test_reftable_stack_tombstone(void)
    + 	struct reftable_ref_record dest = { NULL };
    + 	struct reftable_log_record log_dest = { NULL };
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	/* even entries add the refs, odd entries delete them. */
    +@@ reftable/stack_test.c: static void test_reftable_stack_hash_id(void)
    + 	struct reftable_stack *st_default = NULL;
    + 	struct reftable_ref_record dest = { NULL };
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st, &write_test_ref, &ref);
    + 	EXPECT_ERR(err);
    + 
    + 	/* can't read it with the wrong hash ID. */
    +-	err = reftable_new_stack(&st32, dir, opts32);
    ++	err = reftable_new_stack(&st32, dir, &opts32);
    + 	EXPECT(err == REFTABLE_FORMAT_ERROR);
    + 
    + 	/* check that we can read it back with default opts too. */
    +-	err = reftable_new_stack(&st_default, dir, opts_default);
    ++	err = reftable_new_stack(&st_default, dir, &opts_default);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_read_ref(st_default, "master", &dest);
    +@@ reftable/stack_test.c: static void test_reflog_expire(void)
    + 	};
    + 	struct reftable_log_record log = { NULL };
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 1; i <= N; i++) {
    +@@ reftable/stack_test.c: static void test_empty_add(void)
    + 	char *dir = get_tmp_dir(__LINE__);
    + 	struct reftable_stack *st2 = NULL;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_add(st, &write_nothing, NULL);
    + 	EXPECT_ERR(err);
    + 
    +-	err = reftable_new_stack(&st2, dir, opts);
    ++	err = reftable_new_stack(&st2, dir, &opts);
    + 	EXPECT_ERR(err);
    + 	clear_dir(dir);
    + 	reftable_stack_destroy(st);
    +@@ reftable/stack_test.c: static void test_reftable_stack_auto_compaction(void)
    + 	int err, i;
    + 	int N = 100;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i < N; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_add_performs_auto_compaction(void)
    + 	char *dir = get_tmp_dir(__LINE__);
    + 	int err, i, n = 20;
    + 
    +-	err = reftable_new_stack(&st, dir, opts);
    ++	err = reftable_new_stack(&st, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i <= n; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_compaction_concurrent(void)
    + 	int err, i;
    + 	int N = 3;
    + 
    +-	err = reftable_new_stack(&st1, dir, opts);
    ++	err = reftable_new_stack(&st1, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i < N; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_compaction_concurrent(void)
    + 		EXPECT_ERR(err);
      	}
      
    - 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
    --				 &tab_fd, &st->opts);
    -+				 &tab_fd, st->opts);
    - 	err = stack_write_compact(st, wr, first, last, config);
    - 	if (err < 0)
    - 		goto done;
    +-	err = reftable_new_stack(&st2, dir, opts);
    ++	err = reftable_new_stack(&st2, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_compact_all(st1, NULL);
    +@@ reftable/stack_test.c: static void test_reftable_stack_compaction_concurrent_clean(void)
    + 	int err, i;
    + 	int N = 3;
    + 
    +-	err = reftable_new_stack(&st1, dir, opts);
    ++	err = reftable_new_stack(&st1, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	for (i = 0; i < N; i++) {
    +@@ reftable/stack_test.c: static void test_reftable_stack_compaction_concurrent_clean(void)
    + 		EXPECT_ERR(err);
    + 	}
    + 
    +-	err = reftable_new_stack(&st2, dir, opts);
    ++	err = reftable_new_stack(&st2, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_compact_all(st1, NULL);
    +@@ reftable/stack_test.c: static void test_reftable_stack_compaction_concurrent_clean(void)
    + 	unclean_stack_close(st1);
    + 	unclean_stack_close(st2);
    + 
    +-	err = reftable_new_stack(&st3, dir, opts);
    ++	err = reftable_new_stack(&st3, dir, &opts);
    + 	EXPECT_ERR(err);
    + 
    + 	err = reftable_stack_clean(st3);
     
      ## reftable/writer.c ##
     @@ reftable/writer.c: static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
    @@ reftable/writer.c: static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
      reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
      		    int (*flush_func)(void *),
     -		    void *writer_arg, struct reftable_write_options *opts)
    -+		    void *writer_arg, struct reftable_write_options opts)
    ++		    void *writer_arg, const struct reftable_write_options *_opts)
      {
      	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
    - 	strbuf_init(&wp->block_writer_data.last_key, 0);
    +-	strbuf_init(&wp->block_writer_data.last_key, 0);
     -	options_set_defaults(opts);
     -	if (opts->block_size >= (1 << 24)) {
    ++	struct reftable_write_options opts = {0};
     +
    ++	if (_opts)
    ++		opts = *_opts;
     +	options_set_defaults(&opts);
     +	if (opts.block_size >= (1 << 24)) {
      		/* TODO - error return? */
      		abort();
      	}
     +
    ++	strbuf_init(&wp->block_writer_data.last_key, 0);
      	wp->last_key = reftable_empty_strbuf;
     -	REFTABLE_CALLOC_ARRAY(wp->block, opts->block_size);
     +	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 3:  aa2903e3e5 !  3:  ef14bf7195 reftable/writer: drop static variable used to initialize strbuf
    @@ reftable/writer.c: static void writer_reinit_block_writer(struct reftable_writer
      struct reftable_writer *
      reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
      		    int (*flush_func)(void *),
    - 		    void *writer_arg, struct reftable_write_options opts)
    - {
    - 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
    --	strbuf_init(&wp->block_writer_data.last_key, 0);
    - 
    - 	options_set_defaults(&opts);
    - 	if (opts.block_size >= (1 << 24)) {
     @@ reftable/writer.c: reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
    - 		abort();
      	}
      
    + 	strbuf_init(&wp->block_writer_data.last_key, 0);
     -	wp->last_key = reftable_empty_strbuf;
    -+	strbuf_init(&wp->block_writer_data.last_key, 0);
     +	strbuf_init(&wp->last_key, 0);
      	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
      	wp->write = writer_func;
 4:  5e7cbb7b19 !  4:  8ec26646f2 reftable/writer: improve error when passed an invalid block size
    @@ Commit message
     
      ## reftable/writer.c ##
     @@ reftable/writer.c: reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
    - 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
    - 
    + 	if (_opts)
    + 		opts = *_opts;
      	options_set_defaults(&opts);
     -	if (opts.block_size >= (1 << 24)) {
     -		/* TODO - error return? */
 5:  ed1c150d90 =  5:  c4377180ef reftable/dump: support dumping a table's block structure
 6:  be5bdc6dc1 =  6:  70720af4d3 refs/reftable: allow configuring block size
 7:  05e8d1df2d =  7:  b3fe81b7b7 reftable: use `uint16_t` to track restart interval
 8:  bc0bf65553 !  8:  2b15795707 refs/reftable: allow configuring restart interval
    @@ Documentation/config/reftable.txt: readers during access.
     +
     +reftable.restartInterval::
     +	The interval at which to create restart points. The reftable backend
    -+	determines the restart points at file creation. The process is
    -+	arbitrary, but every 16 or 64 records is recommended. Every 16 may be
    ++	determines the restart points at file creation. Every 16 may be
     +	more suitable for smaller block sizes (4k or 8k), every 64 for larger
     +	block sizes (64k).
     ++
 9:  6bc240fd0c =  9:  b128d584a5 refs/reftable: allow disabling writing the object index
10:  9d4c1f0340 ! 10:  fb1ca02e77 reftable: make the compaction factor configurable
    @@ reftable/stack.c: license that can be found in the LICENSE file or at
     +#include "constants.h"
      #include "merged.h"
      #include "reader.h"
    - #include "refname.h"
    + #include "reftable-error.h"
     @@ reftable/stack.c: static int segment_size(struct segment *s)
      	return s->end - s->start;
      }
11:  e1282e53fb = 11:  d341741eb0 refs/reftable: allow configuring geometric factor
-- 
2.45.GIT



* [PATCH v3 01/11] reftable: consistently refer to `reftable_write_options` as `opts`
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
@ 2024-05-13  8:17   ` Patrick Steinhardt
  2024-05-13  8:17   ` [PATCH v3 02/11] reftable: pass opts as constant pointer Patrick Steinhardt
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:17 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 21546 bytes --]

Throughout the reftable library the `reftable_write_options` are
sometimes referred to as `cfg` and sometimes as `opts`. Unify these to
consistently use `opts` to avoid confusion.

While at it, touch up the coding style a bit by removing unneeded braces
around one-line statements and newlines between variable declarations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c           |   4 +-
 reftable/reftable-stack.h |   2 +-
 reftable/stack.c          |  41 +++++++-------
 reftable/stack.h          |   2 +-
 reftable/stack_test.c     | 114 +++++++++++++++++---------------------
 5 files changed, 74 insertions(+), 89 deletions(-)

diff --git a/reftable/dump.c b/reftable/dump.c
index 26e0393c7d..9c770a10cc 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -27,9 +27,9 @@ license that can be found in the LICENSE file or at
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
index 1b602dda58..9c8e4eef49 100644
--- a/reftable/reftable-stack.h
+++ b/reftable/reftable-stack.h
@@ -29,7 +29,7 @@ struct reftable_stack;
  *  stored in 'dir'. Typically, this should be .git/reftables.
  */
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config);
+		       struct reftable_write_options opts);
 
 /* returns the update_index at which a next table should be written. */
 uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
diff --git a/reftable/stack.c b/reftable/stack.c
index a59ebe038d..54e7473f3a 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -54,15 +54,14 @@ static int reftable_fd_flush(void *arg)
 }
 
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options config)
+		       struct reftable_write_options opts)
 {
 	struct reftable_stack *p = reftable_calloc(1, sizeof(*p));
 	struct strbuf list_file_name = STRBUF_INIT;
 	int err = 0;
 
-	if (config.hash_id == 0) {
-		config.hash_id = GIT_SHA1_FORMAT_ID;
-	}
+	if (opts.hash_id == 0)
+		opts.hash_id = GIT_SHA1_FORMAT_ID;
 
 	*dest = NULL;
 
@@ -73,7 +72,7 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir,
 	p->list_file = strbuf_detach(&list_file_name, NULL);
 	p->list_fd = -1;
 	p->reftable_dir = xstrdup(dir);
-	p->config = config;
+	p->opts = opts;
 
 	err = reftable_stack_reload_maybe_reuse(p, 1);
 	if (err < 0) {
@@ -255,7 +254,7 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
 
 	/* success! */
 	err = reftable_new_merged_table(&new_merged, new_tables,
-					new_readers_len, st->config.hash_id);
+					new_readers_len, st->opts.hash_id);
 	if (err < 0)
 		goto done;
 
@@ -578,8 +577,8 @@ static int reftable_stack_init_addition(struct reftable_addition *add,
 		}
 		goto done;
 	}
-	if (st->config.default_permissions) {
-		if (chmod(add->lock_file->filename.buf, st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions) {
+		if (chmod(add->lock_file->filename.buf, st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -678,7 +677,7 @@ int reftable_addition_commit(struct reftable_addition *add)
 	if (err)
 		goto done;
 
-	if (!add->stack->config.disable_auto_compact) {
+	if (!add->stack->opts.disable_auto_compact) {
 		/*
 		 * Auto-compact the stack to keep the number of tables in
 		 * control. It is possible that a concurrent writer is already
@@ -756,9 +755,9 @@ int reftable_addition_add(struct reftable_addition *add,
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
-	if (add->stack->config.default_permissions) {
+	if (add->stack->opts.default_permissions) {
 		if (chmod(get_tempfile_path(tab_file),
-			  add->stack->config.default_permissions)) {
+			  add->stack->opts.default_permissions)) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -766,7 +765,7 @@ int reftable_addition_add(struct reftable_addition *add,
 	tab_fd = get_tempfile_fd(tab_file);
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
-				 &add->stack->config);
+				 &add->stack->opts);
 	err = write_table(wr, arg);
 	if (err < 0)
 		goto done;
@@ -849,14 +848,14 @@ static int stack_compact_locked(struct reftable_stack *st,
 	}
 	tab_fd = get_tempfile_fd(tab_file);
 
-	if (st->config.default_permissions &&
-	    chmod(get_tempfile_path(tab_file), st->config.default_permissions) < 0) {
+	if (st->opts.default_permissions &&
+	    chmod(get_tempfile_path(tab_file), st->opts.default_permissions) < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
-				 &tab_fd, &st->config);
+				 &tab_fd, &st->opts);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
@@ -904,7 +903,7 @@ static int stack_write_compact(struct reftable_stack *st,
 				   st->readers[last]->max_update_index);
 
 	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
-					st->config.hash_id);
+					st->opts.hash_id);
 	if (err < 0) {
 		reftable_free(subtabs);
 		goto done;
@@ -1094,9 +1093,9 @@ static int stack_compact_range(struct reftable_stack *st,
 		goto done;
 	}
 
-	if (st->config.default_permissions) {
+	if (st->opts.default_permissions) {
 		if (chmod(get_lock_file_path(&tables_list_lock),
-			  st->config.default_permissions) < 0) {
+			  st->opts.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
@@ -1286,7 +1285,7 @@ static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
 {
 	uint64_t *sizes =
 		reftable_calloc(st->merged->stack_len, sizeof(*sizes));
-	int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
+	int version = (st->opts.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2;
 	int overhead = header_size(version) - 1;
 	int i = 0;
 	for (i = 0; i < st->merged->stack_len; i++) {
@@ -1435,11 +1434,11 @@ int reftable_stack_clean(struct reftable_stack *st)
 int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = { .hash_id = hash_id };
+	struct reftable_write_options opts = { .hash_id = hash_id };
 	struct reftable_merged_table *merged = NULL;
 	struct reftable_table table = { NULL };
 
-	int err = reftable_new_stack(&stack, stackdir, cfg);
+	int err = reftable_new_stack(&stack, stackdir, opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/stack.h b/reftable/stack.h
index d43efa4760..97d7ebc043 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -20,7 +20,7 @@ struct reftable_stack {
 
 	char *reftable_dir;
 
-	struct reftable_write_options config;
+	struct reftable_write_options opts;
 
 	struct reftable_reader **readers;
 	size_t readers_len;
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 7889f818d1..e17ad4dc62 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -150,7 +150,7 @@ static void test_reftable_stack_add_one(void)
 	char *dir = get_tmp_dir(__LINE__);
 	struct strbuf scratch = STRBUF_INIT;
 	int mask = umask(002);
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.default_permissions = 0660,
 	};
 	struct reftable_stack *st = NULL;
@@ -163,7 +163,7 @@ static void test_reftable_stack_add_one(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 	struct stat stat_result = { 0 };
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
@@ -186,7 +186,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, "/tables.list");
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&scratch);
 	strbuf_addstr(&scratch, dir);
@@ -195,7 +195,7 @@ static void test_reftable_stack_add_one(void)
 	strbuf_addstr(&scratch, st->readers[0]->name);
 	err = stat(scratch.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -209,7 +209,7 @@ static void test_reftable_stack_add_one(void)
 
 static void test_reftable_stack_uptodate(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL;
 	struct reftable_stack *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
@@ -232,10 +232,10 @@ static void test_reftable_stack_uptodate(void)
 	/* simulate multi-process access to the same stack
 	   by creating two stacks for the same directory.
 	 */
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st1, &write_test_ref, &ref1);
@@ -257,8 +257,7 @@ static void test_reftable_stack_uptodate(void)
 static void test_reftable_stack_transaction_api(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_addition *add = NULL;
@@ -271,8 +270,7 @@ static void test_reftable_stack_transaction_api(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	reftable_addition_destroy(add);
@@ -301,12 +299,12 @@ static void test_reftable_stack_transaction_api(void)
 static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_addition *add = NULL;
 	struct reftable_stack *st = NULL;
 	int i, n = 20, err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -325,7 +323,7 @@ static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		err = reftable_stack_new_addition(&add, st);
 		EXPECT_ERR(err);
@@ -361,13 +359,13 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)
 		.value_type = REFTABLE_REF_VAL1,
 		.value.val1 = {0x01},
 	};
-	struct reftable_write_options cfg = {0};
+	struct reftable_write_options opts = {0};
 	struct reftable_stack *st;
 	struct strbuf table_path = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, write_test_ref, &ref);
@@ -404,8 +402,7 @@ static int write_error(struct reftable_writer *wr, void *arg)
 static void test_reftable_stack_update_index_check(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record ref1 = {
@@ -421,7 +418,7 @@ static void test_reftable_stack_update_index_check(void)
 		.value.symref = "master",
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref1);
@@ -436,12 +433,11 @@ static void test_reftable_stack_update_index_check(void)
 static void test_reftable_stack_lock_failure(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err, i;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
 		err = reftable_stack_add(st, &write_error, &i);
@@ -456,7 +452,7 @@ static void test_reftable_stack_add(void)
 {
 	int i = 0;
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.exact_log_message = 1,
 		.default_permissions = 0660,
 		.disable_auto_compact = 1,
@@ -469,7 +465,7 @@ static void test_reftable_stack_add(void)
 	struct stat stat_result;
 	int N = ARRAY_SIZE(refs);
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -528,7 +524,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, "/tables.list");
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 
 	strbuf_reset(&path);
 	strbuf_addstr(&path, dir);
@@ -537,7 +533,7 @@ static void test_reftable_stack_add(void)
 	strbuf_addstr(&path, st->readers[0]->name);
 	err = stat(path.buf, &stat_result);
 	EXPECT(!err);
-	EXPECT((stat_result.st_mode & 0777) == cfg.default_permissions);
+	EXPECT((stat_result.st_mode & 0777) == opts.default_permissions);
 #else
 	(void) stat_result;
 #endif
@@ -555,7 +551,7 @@ static void test_reftable_stack_add(void)
 static void test_reftable_stack_log_normalize(void)
 {
 	int err = 0;
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		0,
 	};
 	struct reftable_stack *st = NULL;
@@ -579,7 +575,7 @@ static void test_reftable_stack_log_normalize(void)
 		.update_index = 1,
 	};
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	input.value.update.message = "one\ntwo";
@@ -612,8 +608,7 @@ static void test_reftable_stack_tombstone(void)
 {
 	int i = 0;
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	struct reftable_ref_record refs[2] = { { NULL } };
@@ -622,8 +617,7 @@ static void test_reftable_stack_tombstone(void)
 	struct reftable_ref_record dest = { NULL };
 	struct reftable_log_record log_dest = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	/* even entries add the refs, odd entries delete them. */
@@ -691,8 +685,7 @@ static void test_reftable_stack_tombstone(void)
 static void test_reftable_stack_hash_id(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 
@@ -702,24 +695,24 @@ static void test_reftable_stack_hash_id(void)
 		.value.symref = "target",
 		.update_index = 1,
 	};
-	struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_FORMAT_ID };
+	struct reftable_write_options opts32 = { .hash_id = GIT_SHA256_FORMAT_ID };
 	struct reftable_stack *st32 = NULL;
-	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_write_options opts_default = { 0 };
 	struct reftable_stack *st_default = NULL;
 	struct reftable_ref_record dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
 	EXPECT_ERR(err);
 
 	/* can't read it with the wrong hash ID. */
-	err = reftable_new_stack(&st32, dir, cfg32);
+	err = reftable_new_stack(&st32, dir, opts32);
 	EXPECT(err == REFTABLE_FORMAT_ERROR);
 
-	/* check that we can read it back with default config too. */
-	err = reftable_new_stack(&st_default, dir, cfg_default);
+	/* check that we can read it back with default opts too. */
+	err = reftable_new_stack(&st_default, dir, opts_default);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_read_ref(st_default, "master", &dest);
@@ -752,8 +745,7 @@ static void test_suggest_compaction_segment_nothing(void)
 static void test_reflog_expire(void)
 {
 	char *dir = get_tmp_dir(__LINE__);
-
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct reftable_log_record logs[20] = { { NULL } };
 	int N = ARRAY_SIZE(logs) - 1;
@@ -764,8 +756,7 @@ static void test_reflog_expire(void)
 	};
 	struct reftable_log_record log = { NULL };
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 1; i <= N; i++) {
@@ -828,21 +819,19 @@ static int write_nothing(struct reftable_writer *wr, void *arg)
 
 static void test_empty_add(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	int err;
 	char *dir = get_tmp_dir(__LINE__);
-
 	struct reftable_stack *st2 = NULL;
 
-
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_nothing, NULL);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 	clear_dir(dir);
 	reftable_stack_destroy(st);
@@ -861,16 +850,15 @@ static int fastlog2(uint64_t sz)
 
 static void test_reftable_stack_auto_compaction(void)
 {
-	struct reftable_write_options cfg = {
+	struct reftable_write_options opts = {
 		.disable_auto_compact = 1,
 	};
 	struct reftable_stack *st = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 100;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -900,13 +888,13 @@ static void test_reftable_stack_auto_compaction(void)
 
 static void test_reftable_stack_add_performs_auto_compaction(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st = NULL;
 	struct strbuf refname = STRBUF_INIT;
 	char *dir = get_tmp_dir(__LINE__);
 	int err, i, n = 20;
 
-	err = reftable_new_stack(&st, dir, cfg);
+	err = reftable_new_stack(&st, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -921,7 +909,7 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 		 * we can ensure that we indeed honor this setting and have
 		 * better control over when exactly auto compaction runs.
 		 */
-		st->config.disable_auto_compact = i != n;
+		st->opts.disable_auto_compact = i != n;
 
 		strbuf_reset(&refname);
 		strbuf_addf(&refname, "branch-%04d", i);
@@ -948,14 +936,13 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 
 static void test_reftable_stack_compaction_concurrent(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -972,7 +959,7 @@ static void test_reftable_stack_compaction_concurrent(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -998,14 +985,13 @@ static void unclean_stack_close(struct reftable_stack *st)
 
 static void test_reftable_stack_compaction_concurrent_clean(void)
 {
-	struct reftable_write_options cfg = { 0 };
+	struct reftable_write_options opts = { 0 };
 	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
 	char *dir = get_tmp_dir(__LINE__);
-
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, cfg);
+	err = reftable_new_stack(&st1, dir, opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1022,7 +1008,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, cfg);
+	err = reftable_new_stack(&st2, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1031,7 +1017,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 	unclean_stack_close(st1);
 	unclean_stack_close(st2);
 
-	err = reftable_new_stack(&st3, dir, cfg);
+	err = reftable_new_stack(&st3, dir, opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_clean(st3);
-- 
2.45.GIT



* [PATCH v3 02/11] reftable: pass opts as constant pointer
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
  2024-05-13  8:17   ` [PATCH v3 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
@ 2024-05-13  8:17   ` Patrick Steinhardt
  2024-05-17  8:02     ` Karthik Nayak
  2024-05-21 23:22     ` Justin Tobler
  2024-05-13  8:18   ` [PATCH v3 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
                     ` (10 subsequent siblings)
  12 siblings, 2 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:17 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 13024 bytes --]

We sometimes pass the reftable write options by value and sometimes as a
pointer. This is quite confusing and makes the reader wonder whether the
options get modified sometimes.

In fact, `reftable_new_writer()` does cause the caller-provided options
to get updated when some values aren't set up. This is quite unexpected,
but didn't cause any harm until now.

Adapt the code so that we do not modify the caller-provided values
anymore. While at it, refactor the code to consistently pass the
options as a constant pointer to clarify that the caller-provided opts
will not ever get modified.
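
The pattern described above can be sketched roughly as follows. This is
only an illustration, not the code from this patch: `struct write_options`,
`struct writer`, `writer_init()` and the default of 4096 are hypothetical
stand-ins. The idea is to copy the caller's options into a local struct,
apply defaults to the copy, and leave the caller's struct untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct reftable_write_options`. */
struct write_options {
	uint32_t block_size;
	uint32_t hash_id;
};

/* Hypothetical stand-in for the writer that stores its own options. */
struct writer {
	struct write_options opts;
};

/* Assumed default for this sketch only. */
#define SKETCH_DEFAULT_BLOCK_SIZE 4096

/* Fills in unset fields. Only ever called on a private copy. */
static void options_set_defaults(struct write_options *opts)
{
	if (!opts->block_size)
		opts->block_size = SKETCH_DEFAULT_BLOCK_SIZE;
}

/*
 * Takes the options as a constant pointer. A NULL pointer means
 * "use defaults for everything".
 */
static void writer_init(struct writer *w, const struct write_options *_opts)
{
	struct write_options opts = {0};

	if (_opts)
		opts = *_opts;           /* copy, never write through _opts */
	options_set_defaults(&opts);     /* defaults land in the copy only */
	w->opts = opts;
}
```

With this shape, a caller that passes a zero-initialized struct still sees
`block_size == 0` in its own struct afterwards, while the writer's copy
holds the default.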

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/reftable-backend.c    |  6 ++---
 reftable/dump.c            |  2 +-
 reftable/reftable-stack.h  |  2 +-
 reftable/reftable-writer.h |  2 +-
 reftable/stack.c           |  7 ++++--
 reftable/stack_test.c      | 48 +++++++++++++++++++-------------------
 reftable/writer.c          | 17 +++++++++-----
 7 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 010ef811b6..f8f930380d 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -129,7 +129,7 @@ static struct reftable_stack *stack_for(struct reftable_ref_store *store,
 				    store->base.repo->commondir, wtname_buf.buf);
 
 			store->err = reftable_new_stack(&stack, wt_dir.buf,
-							store->write_options);
+							&store->write_options);
 			assert(store->err != REFTABLE_API_ERROR);
 			strmap_put(&store->worktree_stacks, wtname_buf.buf, stack);
 		}
@@ -263,7 +263,7 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	}
 	strbuf_addstr(&path, "/reftable");
 	refs->err = reftable_new_stack(&refs->main_stack, path.buf,
-				       refs->write_options);
+				       &refs->write_options);
 	if (refs->err)
 		goto done;
 
@@ -280,7 +280,7 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 		strbuf_addf(&path, "%s/reftable", gitdir);
 
 		refs->err = reftable_new_stack(&refs->worktree_stack, path.buf,
-					       refs->write_options);
+					       &refs->write_options);
 		if (refs->err)
 			goto done;
 	}
diff --git a/reftable/dump.c b/reftable/dump.c
index 9c770a10cc..586f3eb288 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -29,7 +29,7 @@ static int compact_stack(const char *stackdir)
 	struct reftable_stack *stack = NULL;
 	struct reftable_write_options opts = { 0 };
 
-	int err = reftable_new_stack(&stack, stackdir, opts);
+	int err = reftable_new_stack(&stack, stackdir, &opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
index 9c8e4eef49..c15632c401 100644
--- a/reftable/reftable-stack.h
+++ b/reftable/reftable-stack.h
@@ -29,7 +29,7 @@ struct reftable_stack;
  *  stored in 'dir'. Typically, this should be .git/reftables.
  */
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options opts);
+		       const struct reftable_write_options *opts);
 
 /* returns the update_index at which a next table should be written. */
 uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index b601a69a40..03df3a4963 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -88,7 +88,7 @@ struct reftable_stats {
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts);
+		    void *writer_arg, const struct reftable_write_options *opts);
 
 /* Set the range of update indices for the records we will add. When writing a
    table into a stack, the min should be at least
diff --git a/reftable/stack.c b/reftable/stack.c
index 54e7473f3a..d2e68be7e8 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -54,12 +54,15 @@ static int reftable_fd_flush(void *arg)
 }
 
 int reftable_new_stack(struct reftable_stack **dest, const char *dir,
-		       struct reftable_write_options opts)
+		       const struct reftable_write_options *_opts)
 {
 	struct reftable_stack *p = reftable_calloc(1, sizeof(*p));
 	struct strbuf list_file_name = STRBUF_INIT;
+	struct reftable_write_options opts = {0};
 	int err = 0;
 
+	if (_opts)
+		opts = *_opts;
 	if (opts.hash_id == 0)
 		opts.hash_id = GIT_SHA1_FORMAT_ID;
 
@@ -1438,7 +1441,7 @@ int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id)
 	struct reftable_merged_table *merged = NULL;
 	struct reftable_table table = { NULL };
 
-	int err = reftable_new_stack(&stack, stackdir, opts);
+	int err = reftable_new_stack(&stack, stackdir, &opts);
 	if (err < 0)
 		goto done;
 
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index e17ad4dc62..d15f11d712 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -163,7 +163,7 @@ static void test_reftable_stack_add_one(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 	struct stat stat_result = { 0 };
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
@@ -232,10 +232,10 @@ static void test_reftable_stack_uptodate(void)
 	/* simulate multi-process access to the same stack
 	   by creating two stacks for the same directory.
 	 */
-	err = reftable_new_stack(&st1, dir, opts);
+	err = reftable_new_stack(&st1, dir, &opts);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, opts);
+	err = reftable_new_stack(&st2, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st1, &write_test_ref, &ref1);
@@ -270,7 +270,7 @@ static void test_reftable_stack_transaction_api(void)
 	};
 	struct reftable_ref_record dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	reftable_addition_destroy(add);
@@ -304,7 +304,7 @@ static void test_reftable_stack_transaction_api_performs_auto_compaction(void)
 	struct reftable_stack *st = NULL;
 	int i, n = 20, err;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -365,7 +365,7 @@ static void test_reftable_stack_auto_compaction_fails_gracefully(void)
 	char *dir = get_tmp_dir(__LINE__);
 	int err;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, write_test_ref, &ref);
@@ -418,7 +418,7 @@ static void test_reftable_stack_update_index_check(void)
 		.value.symref = "master",
 	};
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref1);
@@ -437,7 +437,7 @@ static void test_reftable_stack_lock_failure(void)
 	struct reftable_stack *st = NULL;
 	int err, i;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
 		err = reftable_stack_add(st, &write_error, &i);
@@ -465,7 +465,7 @@ static void test_reftable_stack_add(void)
 	struct stat stat_result;
 	int N = ARRAY_SIZE(refs);
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -575,7 +575,7 @@ static void test_reftable_stack_log_normalize(void)
 		.update_index = 1,
 	};
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	input.value.update.message = "one\ntwo";
@@ -617,7 +617,7 @@ static void test_reftable_stack_tombstone(void)
 	struct reftable_ref_record dest = { NULL };
 	struct reftable_log_record log_dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	/* even entries add the refs, odd entries delete them. */
@@ -701,18 +701,18 @@ static void test_reftable_stack_hash_id(void)
 	struct reftable_stack *st_default = NULL;
 	struct reftable_ref_record dest = { NULL };
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_test_ref, &ref);
 	EXPECT_ERR(err);
 
 	/* can't read it with the wrong hash ID. */
-	err = reftable_new_stack(&st32, dir, opts32);
+	err = reftable_new_stack(&st32, dir, &opts32);
 	EXPECT(err == REFTABLE_FORMAT_ERROR);
 
 	/* check that we can read it back with default opts too. */
-	err = reftable_new_stack(&st_default, dir, opts_default);
+	err = reftable_new_stack(&st_default, dir, &opts_default);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_read_ref(st_default, "master", &dest);
@@ -756,7 +756,7 @@ static void test_reflog_expire(void)
 	};
 	struct reftable_log_record log = { NULL };
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 1; i <= N; i++) {
@@ -825,13 +825,13 @@ static void test_empty_add(void)
 	char *dir = get_tmp_dir(__LINE__);
 	struct reftable_stack *st2 = NULL;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_add(st, &write_nothing, NULL);
 	EXPECT_ERR(err);
 
-	err = reftable_new_stack(&st2, dir, opts);
+	err = reftable_new_stack(&st2, dir, &opts);
 	EXPECT_ERR(err);
 	clear_dir(dir);
 	reftable_stack_destroy(st);
@@ -858,7 +858,7 @@ static void test_reftable_stack_auto_compaction(void)
 	int err, i;
 	int N = 100;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -894,7 +894,7 @@ static void test_reftable_stack_add_performs_auto_compaction(void)
 	char *dir = get_tmp_dir(__LINE__);
 	int err, i, n = 20;
 
-	err = reftable_new_stack(&st, dir, opts);
+	err = reftable_new_stack(&st, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i <= n; i++) {
@@ -942,7 +942,7 @@ static void test_reftable_stack_compaction_concurrent(void)
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, opts);
+	err = reftable_new_stack(&st1, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -959,7 +959,7 @@ static void test_reftable_stack_compaction_concurrent(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, opts);
+	err = reftable_new_stack(&st2, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -991,7 +991,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 	int err, i;
 	int N = 3;
 
-	err = reftable_new_stack(&st1, dir, opts);
+	err = reftable_new_stack(&st1, dir, &opts);
 	EXPECT_ERR(err);
 
 	for (i = 0; i < N; i++) {
@@ -1008,7 +1008,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 		EXPECT_ERR(err);
 	}
 
-	err = reftable_new_stack(&st2, dir, opts);
+	err = reftable_new_stack(&st2, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_compact_all(st1, NULL);
@@ -1017,7 +1017,7 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
 	unclean_stack_close(st1);
 	unclean_stack_close(st2);
 
-	err = reftable_new_stack(&st3, dir, opts);
+	err = reftable_new_stack(&st3, dir, &opts);
 	EXPECT_ERR(err);
 
 	err = reftable_stack_clean(st3);
diff --git a/reftable/writer.c b/reftable/writer.c
index 10eccaaa07..4cc6e2ebd8 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -122,20 +122,25 @@ static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
-		    void *writer_arg, struct reftable_write_options *opts)
+		    void *writer_arg, const struct reftable_write_options *_opts)
 {
 	struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp));
-	strbuf_init(&wp->block_writer_data.last_key, 0);
-	options_set_defaults(opts);
-	if (opts->block_size >= (1 << 24)) {
+	struct reftable_write_options opts = {0};
+
+	if (_opts)
+		opts = *_opts;
+	options_set_defaults(&opts);
+	if (opts.block_size >= (1 << 24)) {
 		/* TODO - error return? */
 		abort();
 	}
+
+	strbuf_init(&wp->block_writer_data.last_key, 0);
 	wp->last_key = reftable_empty_strbuf;
-	REFTABLE_CALLOC_ARRAY(wp->block, opts->block_size);
+	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-	wp->opts = *opts;
+	wp->opts = opts;
 	wp->flush = flush_func;
 	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
 
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 03/11] reftable/writer: drop static variable used to initialize strbuf
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
  2024-05-13  8:17   ` [PATCH v3 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
  2024-05-13  8:17   ` [PATCH v3 02/11] reftable: pass opts as constant pointer Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-13  8:18   ` [PATCH v3 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]

We have a static variable in the reftable writer code that is merely
used to initialize the `last_key` of the writer. Convert the code to
instead use `strbuf_init()` and drop the variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index 4cc6e2ebd8..a043025b01 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -117,8 +117,6 @@ static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
 	w->block_writer->restart_interval = w->opts.restart_interval;
 }
 
-static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
-
 struct reftable_writer *
 reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    int (*flush_func)(void *),
@@ -136,7 +134,7 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 	}
 
 	strbuf_init(&wp->block_writer_data.last_key, 0);
-	wp->last_key = reftable_empty_strbuf;
+	strbuf_init(&wp->last_key, 0);
 	REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
 	wp->write = writer_func;
 	wp->write_arg = writer_arg;
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 04/11] reftable/writer: improve error when passed an invalid block size
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-13  8:18   ` [PATCH v3 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

The reftable format only supports block sizes up to 16MB. When the
writer is being passed a value bigger than that it simply calls
abort(3P), which isn't all that helpful due to the lack of a proper
error message.

Improve this by calling `BUG()` instead.
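
For illustration, the limit being checked can be written as a small
predicate. This is a sketch under the assumption that the 16MB bound comes
from block lengths having to fit into 24 bits; `MAX_BLOCK_SIZE` and
`block_size_is_valid()` are hypothetical names that do not exist in the
reftable library.

```c
#include <stdint.h>

/* Largest value that fits into 24 bits: 16777215. */
#define MAX_BLOCK_SIZE ((UINT32_C(1) << 24) - 1)

/* Returns 1 if `block_size` fits into 24 bits, 0 otherwise. */
int block_size_is_valid(uint32_t block_size)
{
	return block_size <= MAX_BLOCK_SIZE;
}
```

The comparison is equivalent to the `block_size >= (1 << 24)` check in the
writer, only inverted: 16777215 is the last accepted value and 16777216 the
first rejected one.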

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/reftable/writer.c b/reftable/writer.c
index a043025b01..45b3e9ce1f 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -128,10 +128,8 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 	if (_opts)
 		opts = *_opts;
 	options_set_defaults(&opts);
-	if (opts.block_size >= (1 << 24)) {
-		/* TODO - error return? */
-		abort();
-	}
+	if (opts.block_size >= (1 << 24))
+		BUG("configured block size exceeds 16MB");
 
 	strbuf_init(&wp->block_writer_data.last_key, 0);
 	strbuf_init(&wp->last_key, 0);
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 05/11] reftable/dump: support dumping a table's block structure
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-21 23:35     ` Justin Tobler
  2024-05-13  8:18   ` [PATCH v3 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
                     ` (7 subsequent siblings)
  12 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 6734 bytes --]

We're about to introduce new configs that will allow users to have more
control over how exactly reftables are written. To verify that these
configs are effective we will need to take a peek into the actual blocks
written by the reftable backend.

Introduce a new mode to the dumping logic that prints out the block
structure. This logic can be invoked via `test-tool dump-reftable -b`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/dump.c                   |   8 ++-
 reftable/reader.c                 |  63 ++++++++++++++++++
 reftable/reftable-reader.h        |   2 +
 t/t0613-reftable-write-options.sh | 102 ++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100755 t/t0613-reftable-write-options.sh

diff --git a/reftable/dump.c b/reftable/dump.c
index 586f3eb288..41abbb8ecf 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -48,6 +48,7 @@ static void print_help(void)
 	printf("usage: dump [-cst] arg\n\n"
 	       "options: \n"
 	       "  -c compact\n"
+	       "  -b dump blocks\n"
 	       "  -t dump table\n"
 	       "  -s dump stack\n"
 	       "  -6 sha256 hash format\n"
@@ -58,6 +59,7 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
+	int opt_dump_blocks = 0;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
@@ -67,6 +69,8 @@ int reftable_dump_main(int argc, char *const *argv)
 	for (; argc > 1; argv++, argc--)
 		if (*argv[1] != '-')
 			break;
+		else if (!strcmp("-b", argv[1]))
+			opt_dump_blocks = 1;
 		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
 		else if (!strcmp("-6", argv[1]))
@@ -88,7 +92,9 @@ int reftable_dump_main(int argc, char *const *argv)
 
 	arg = argv[1];
 
-	if (opt_dump_table) {
+	if (opt_dump_blocks) {
+		err = reftable_reader_print_blocks(arg);
+	} else if (opt_dump_table) {
 		err = reftable_reader_print_file(arg);
 	} else if (opt_dump_stack) {
 		err = reftable_stack_print_directory(arg, opt_hash_id);
diff --git a/reftable/reader.c b/reftable/reader.c
index 481dff10d4..f23c8523db 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -856,3 +856,66 @@ int reftable_reader_print_file(const char *tablename)
 	reftable_reader_free(r);
 	return err;
 }
+
+int reftable_reader_print_blocks(const char *tablename)
+{
+	struct {
+		const char *name;
+		int type;
+	} sections[] = {
+		{
+			.name = "ref",
+			.type = BLOCK_TYPE_REF,
+		},
+		{
+			.name = "obj",
+			.type = BLOCK_TYPE_OBJ,
+		},
+		{
+			.name = "log",
+			.type = BLOCK_TYPE_LOG,
+		},
+	};
+	struct reftable_block_source src = { 0 };
+	struct table_iter ti = TABLE_ITER_INIT;
+	struct reftable_reader *r = NULL;
+	size_t i;
+	int err;
+
+	err = reftable_block_source_from_file(&src, tablename);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		goto done;
+
+	printf("header:\n");
+	printf("  block_size: %d\n", r->block_size);
+
+	for (i = 0; i < ARRAY_SIZE(sections); i++) {
+		err = reader_start(r, &ti, sections[i].type, 0);
+		if (err < 0)
+			goto done;
+		if (err > 0)
+			continue;
+
+		printf("%s:\n", sections[i].name);
+
+		while (1) {
+			printf("  - length: %u\n", ti.br.block_len);
+			printf("    restarts: %u\n", ti.br.restart_count);
+
+			err = table_iter_next_block(&ti);
+			if (err < 0)
+				goto done;
+			if (err > 0)
+				break;
+		}
+	}
+
+done:
+	reftable_reader_free(r);
+	table_iter_close(&ti);
+	return err;
+}
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
index 4a4bc2fdf8..4a04857773 100644
--- a/reftable/reftable-reader.h
+++ b/reftable/reftable-reader.h
@@ -97,5 +97,7 @@ void reftable_table_from_reader(struct reftable_table *tab,
 
 /* print table onto stdout for debugging. */
 int reftable_reader_print_file(const char *tablename);
+/* print blocks onto stdout for debugging. */
+int reftable_reader_print_blocks(const char *tablename);
 
 #endif
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
new file mode 100755
index 0000000000..462980c37c
--- /dev/null
+++ b/t/t0613-reftable-write-options.sh
@@ -0,0 +1,102 @@
+#!/bin/sh
+
+test_description='reftable write options'
+
+GIT_TEST_DEFAULT_REF_FORMAT=reftable
+export GIT_TEST_DEFAULT_REF_FORMAT
+# Disable auto-compaction for all tests as we explicitly control repacking of
+# refs.
+GIT_TEST_REFTABLE_AUTOCOMPACTION=false
+export GIT_TEST_REFTABLE_AUTOCOMPACTION
+# Block sizes depend on the hash function, so we force SHA1 here.
+GIT_TEST_DEFAULT_HASH=sha1
+export GIT_TEST_DEFAULT_HASH
+# Block sizes also depend on the actual refs we write, so we force "master" to
+# be the default initial branch name.
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'default write options' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		log:
+		  - length: 262
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'disabled reflog writes no log blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git pack-refs &&
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 129
+		    restarts: 2
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'many refs results in multiple blocks' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 200)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 4049
+		    restarts: 11
+		  - length: 1136
+		    restarts: 3
+		log:
+		  - length: 4041
+		    restarts: 4
+		  - length: 4015
+		    restarts: 3
+		  - length: 4014
+		    restarts: 3
+		  - length: 4012
+		    restarts: 3
+		  - length: 3289
+		    restarts: 3
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_done
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 06/11] refs/reftable: allow configuring block size
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-17  8:09     ` Karthik Nayak
  2024-05-13  8:18   ` [PATCH v3 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
                     ` (6 subsequent siblings)
  12 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 6233 bytes --]

Add a new option `reftable.blockSize` that allows the user to control
the block size used by the reftable library.
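
The validation rules for the new option can be sketched as a standalone
function. This is a simplified illustration, not the code in this patch:
`parse_block_size()` is a hypothetical helper that only accepts plain
decimal numbers via `strtoul()`, whereas the patch itself parses the value
with `git_config_ulong()`.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_BLOCK_SIZE 4096
#define MAX_BLOCK_SIZE 16777215

/*
 * Parse a block size given as a plain decimal string. On success,
 * store the effective block size in `*out` and return 0; a value of 0
 * selects the default. Return -1 for malformed or too large values.
 */
int parse_block_size(const char *value, uint32_t *out)
{
	char *end;
	unsigned long parsed;

	if (!value || !*value)
		return -1;
	errno = 0;
	parsed = strtoul(value, &end, 10);
	if (errno || *end != '\0')
		return -1;
	if (parsed > MAX_BLOCK_SIZE)
		return -1;
	*out = parsed ? (uint32_t)parsed : DEFAULT_BLOCK_SIZE;
	return 0;
}
```

With this sketch, "0" yields 4096, "100" yields 100, and "16777216" is
rejected, which mirrors the limits documented for `reftable.blockSize`.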

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config.txt          |  2 +
 Documentation/config/reftable.txt | 14 ++++++
 refs/reftable-backend.c           | 31 ++++++++++++-
 t/t0613-reftable-write-options.sh | 72 +++++++++++++++++++++++++++++++
 4 files changed, 118 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/reftable.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6f649c997c..cbf0b99c44 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -498,6 +498,8 @@ include::config/rebase.txt[]
 
 include::config/receive.txt[]
 
+include::config/reftable.txt[]
+
 include::config/remote.txt[]
 
 include::config/remotes.txt[]
diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
new file mode 100644
index 0000000000..fa7c4be014
--- /dev/null
+++ b/Documentation/config/reftable.txt
@@ -0,0 +1,14 @@
+reftable.blockSize::
+	The size in bytes used by the reftable backend when writing blocks.
+	The block size is determined by the writer, and does not have to be a
+	power of 2. The block size must be larger than the longest reference
+	name or log entry used in the repository, as references cannot span
+	blocks.
++
+Powers of two that are friendly to the virtual memory system or
+filesystem (such as 4kB or 8kB) are recommended. Larger sizes (64kB) can
+yield better compression, with a possible increased cost incurred by
+readers during access.
++
+The largest block size is `16777215` bytes (15.99 MiB). The default value is
+`4096` bytes (4kB). A value of `0` will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index f8f930380d..8d0ae9e285 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1,6 +1,7 @@
 #include "../git-compat-util.h"
 #include "../abspath.h"
 #include "../chdir-notify.h"
+#include "../config.h"
 #include "../environment.h"
 #include "../gettext.h"
 #include "../hash.h"
@@ -228,6 +229,22 @@ static int read_ref_without_reload(struct reftable_stack *stack,
 	return ret;
 }
 
+static int reftable_be_config(const char *var, const char *value,
+			      const struct config_context *ctx,
+			      void *_opts)
+{
+	struct reftable_write_options *opts = _opts;
+
+	if (!strcmp(var, "reftable.blocksize")) {
+		unsigned long block_size = git_config_ulong(var, value, ctx->kvi);
+		if (block_size > 16777215)
+			die("reftable block size cannot exceed 16MB");
+		opts->block_size = block_size;
+	}
+
+	return 0;
+}
+
 static struct ref_store *reftable_be_init(struct repository *repo,
 					  const char *gitdir,
 					  unsigned int store_flags)
@@ -243,12 +260,24 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	base_ref_store_init(&refs->base, repo, gitdir, &refs_be_reftable);
 	strmap_init(&refs->worktree_stacks);
 	refs->store_flags = store_flags;
-	refs->write_options.block_size = 4096;
+
 	refs->write_options.hash_id = repo->hash_algo->format_id;
 	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
 	refs->write_options.disable_auto_compact =
 		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
 
+	git_config(reftable_be_config, &refs->write_options);
+
+	/*
+	 * It is somewhat unfortunate that we have to mirror the default block
+	 * size of the reftable library here. But given that the write options
+	 * wouldn't be updated by the library here, and given that we require
+	 * the proper block size to trim reflog messages so that they fit, we
+	 * must set up a proper value here.
+	 */
+	if (!refs->write_options.block_size)
+		refs->write_options.block_size = 4096;
+
 	/*
 	 * Set up the main reftable stack that is hosted in GIT_COMMON_DIR.
 	 * This stack contains both the shared and the main worktree refs.
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 462980c37c..8bdbc6ec70 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -99,4 +99,76 @@ test_expect_success 'many refs results in multiple blocks' '
 	)
 '
 
+test_expect_success 'tiny block size leads to error' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		error: unable to compact stack: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=50 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'small block size leads to multiple ref blocks' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 74
+		    restarts: 1
+		  - length: 38
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'small block size fails with large reflog message' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		perl -e "print \"a\" x 500" >logmsg &&
+		cat >expect <<-EOF &&
+		fatal: update_ref failed for ref ${SQ}refs/heads/logme${SQ}: reftable: transaction failure: entry too large
+		EOF
+		test_must_fail git -c reftable.blockSize=100 \
+			update-ref -m "$(cat logmsg)" refs/heads/logme HEAD 2>err &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'block size exceeding maximum supported size' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 16MB
+		EOF
+		test_must_fail git -c reftable.blockSize=16777216 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 07/11] reftable: use `uint16_t` to track restart interval
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-13  8:18   ` [PATCH v3 08/11] refs/reftable: allow configuring " Patrick Steinhardt
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1296 bytes --]

The restart interval can at most be `UINT16_MAX` as specified in the
technical documentation of the reftable format. Furthermore, it cannot
ever be negative. Regardless of that we use an `int` to track the
restart interval.

Change the type to use an `uint16_t` instead.
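
One consequence of the narrower type is that wider values have to be
range-checked before they are stored, because converting to `uint16_t`
silently wraps modulo 65536. A minimal sketch of such a check follows;
`set_restart_interval()` is a hypothetical helper and not part of this
patch.

```c
#include <stdint.h>

/*
 * Store `value` in `*out` if it fits into a uint16_t and return 0.
 * Return -1 and leave `*out` untouched otherwise.
 */
int set_restart_interval(uint16_t *out, unsigned long value)
{
	if (value > UINT16_MAX)
		return -1;
	*out = (uint16_t)value;
	return 0;
}
```

Without the check, assigning 65536 to a `uint16_t` would store 0, which the
writer would then treat as "use the default".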

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.h           | 2 +-
 reftable/reftable-writer.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/reftable/block.h b/reftable/block.h
index e91f3d2790..1c8f25ee6e 100644
--- a/reftable/block.h
+++ b/reftable/block.h
@@ -29,7 +29,7 @@ struct block_writer {
 	uint32_t header_off;
 
 	/* How often to restart keys. */
-	int restart_interval;
+	uint16_t restart_interval;
 	int hash_size;
 
 	/* Offset of next uint8_t to write. */
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 03df3a4963..94804eaa68 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -28,7 +28,7 @@ struct reftable_write_options {
 	unsigned skip_index_objects : 1;
 
 	/* how often to write complete keys in each block. */
-	int restart_interval;
+	uint16_t restart_interval;
 
 	/* 4-byte identifier ("sha1", "s256") of the hash.
 	 * Defaults to SHA1 if unset
-- 
2.45.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 08/11] refs/reftable: allow configuring restart interval
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-21 23:50     ` Justin Tobler
  2024-05-13  8:18   ` [PATCH v3 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
                     ` (4 subsequent siblings)
  12 siblings, 1 reply; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 3681 bytes --]

Add a new option `reftable.restartInterval` that allows the user to
control the restart interval when writing reftable records used by the
reftable library.
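
To make the trade-off concrete, the following sketch estimates how many key
bytes get stored for a sorted list of keys at a given restart interval. It
is a simplified model with hypothetical names (`common_prefix_len()`,
`stored_key_bytes()`): it only counts key bytes and forced restarts, and
ignores record values, length fields and the restart table that a real
writer also has to store.

```c
#include <stddef.h>
#include <string.h>

/* Length of the longest common prefix of `a` and `b`. */
size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Sum the key bytes stored for `nkeys` sorted keys when a restart
 * point is forced every `interval` records. At a restart the full key
 * is stored, otherwise only the suffix that differs from the previous
 * key. An interval of 0 selects the documented default of 16.
 * `*restarts` receives the number of forced restart points.
 */
size_t stored_key_bytes(const char **keys, size_t nkeys, size_t interval,
			size_t *restarts)
{
	size_t total = 0, i;

	if (!interval)
		interval = 16;
	*restarts = 0;

	for (i = 0; i < nkeys; i++) {
		size_t len = strlen(keys[i]);

		if (i % interval == 0) {
			(*restarts)++;
			total += len;
		} else {
			total += len - common_prefix_len(keys[i - 1], keys[i]);
		}
	}
	return total;
}
```

For the four keys "refs/heads/a" through "refs/heads/d", this model stores
48 key bytes with an interval of 1, 26 with an interval of 2 and 15 with an
interval of 4. Fewer restart points save space, at the cost of readers
having to walk more records after seeking to a restart point.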

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 18 +++++++++++++
 refs/reftable-backend.c           |  5 ++++
 t/t0613-reftable-write-options.sh | 43 +++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index fa7c4be014..2374be71d7 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -12,3 +12,21 @@ readers during access.
 +
 The largest block size is `16777215` bytes (15.99 MiB). The default value is
 `4096` bytes (4kB). A value of `0` will use the default value.
+
+reftable.restartInterval::
+	The interval at which to create restart points. The reftable backend
+	determines the restart points at file creation. Every 16 may be
+	more suitable for smaller block sizes (4k or 8k), every 64 for larger
+	block sizes (64k).
++
+More frequent restart points reduce prefix compression and increase
+space consumed by the restart table, both of which increase file size.
++
+Less frequent restart points make prefix compression more effective,
+decreasing overall file size, with increased penalties for readers
+walking through more records after the binary search step.
++
+A maximum of `65535` restart points per block is supported.
++
+The default value is to create restart points every 16 records. A value of `0`
+will use the default value.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 8d0ae9e285..a2880aabce 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -240,6 +240,11 @@ static int reftable_be_config(const char *var, const char *value,
 		if (block_size > 16777215)
 			die("reftable block size cannot exceed 16MB");
 		opts->block_size = block_size;
+	} else if (!strcmp(var, "reftable.restartinterval")) {
+		unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi);
+		if (restart_interval > UINT16_MAX)
+			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
+		opts->restart_interval = restart_interval;
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 8bdbc6ec70..e0a5b26f58 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -171,4 +171,47 @@ test_expect_success 'block size exceeding maximum supported size' '
 	)
 '
 
+test_expect_success 'restart interval at every single record' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 10)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.restartInterval=1 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 4096
+		ref:
+		  - length: 566
+		    restarts: 13
+		log:
+		  - length: 1393
+		    restarts: 12
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'restart interval exceeding maximum supported interval' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		cat >expect <<-EOF &&
+		fatal: reftable block size cannot exceed 65535
+		EOF
+		test_must_fail git -c reftable.restartInterval=65536 pack-refs 2>err &&
+		test_cmp expect err
+	)
+'
+
 test_done
-- 
2.45.GIT
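
The range handling described in this patch can be sketched as a small
standalone C fragment. This is a minimal sketch and not the code from
refs/reftable-backend.c: the helper names are hypothetical, the default of 16
is taken from the documentation text above, and the restart count assumes a
simplified model in which every `interval`-th record of a block is written
with its full key.

```c
#include <stdint.h>

/* Assumed default, taken from the documentation text in this patch. */
#define SKETCH_DEFAULT_RESTART_INTERVAL 16

/*
 * Hypothetical helper: map a raw config value to the interval a writer
 * would use. Returns 0 on success and -1 if the value does not fit into
 * a uint16_t. A value of 0 selects the default.
 */
static int sketch_resolve_restart_interval(unsigned long raw, uint16_t *out)
{
	if (raw > UINT16_MAX)
		return -1;
	*out = raw ? (uint16_t)raw : SKETCH_DEFAULT_RESTART_INTERVAL;
	return 0;
}

/*
 * Hypothetical helper: number of restart points in a block holding
 * `records` records when records 0, interval, 2 * interval, ... are
 * restart points (simplified model, ignores the per-block maximum).
 */
static unsigned long sketch_restart_count(unsigned long records,
					  uint16_t interval)
{
	if (!records || !interval)
		return 0;
	return (records + interval - 1) / interval;
}
```

Under this model an interval of 1 turns every record into a restart point,
which is the situation the first new test case exercises.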


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 09/11] refs/reftable: allow disabling writing the object index
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 08/11] refs/reftable: allow configuring " Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-13  8:18   ` [PATCH v3 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 4201 bytes --]

Besides the expected "ref" and "log" records, the reftable library also
writes "obj" records. These are basically a reverse mapping of object
IDs to their respective ref records so that it becomes efficient to
figure out which references point to a specific object. The motivation
for this data structure is the "uploadpack.allowTipSHA1InWant" config,
which allows a client to fetch any object by its hash that has a ref
pointing to it.

This reverse index is not used by Git at all though, and the expectation
is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It
may thus be preferable for many users to disable writing these optional
object indices altogether to save some precious disk space.

Add a new config "reftable.indexObjects" that allows the user to disable
the object index altogether.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt |  6 +++
 refs/reftable-backend.c           |  2 +
 t/t0613-reftable-write-options.sh | 69 +++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 2374be71d7..68083876fa 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -30,3 +30,9 @@ A maximum of `65535` restart points per block is supported.
 +
 The default value is to create restart points every 16 records. A value of `0`
 will use the default value.
+
+reftable.indexObjects::
+	Whether the reftable backend shall write object blocks. Object blocks
+	are a reverse mapping of object ID to the references pointing to them.
++
+The default value is `true`.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index a2880aabce..5ffb36770a 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -245,6 +245,8 @@ static int reftable_be_config(const char *var, const char *value,
 		if (restart_interval > UINT16_MAX)
 			die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX);
 		opts->restart_interval = restart_interval;
+	} else if (!strcmp(var, "reftable.indexobjects")) {
+		opts->skip_index_objects = !git_config_bool(var, value);
 	}
 
 	return 0;
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index e0a5b26f58..e2708e11d5 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -214,4 +214,73 @@ test_expect_success 'restart interval exceeding maximum supported interval' '
 	)
 '
 
+test_expect_success 'object index gets written by default with ref index' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		obj:
+		  - length: 11
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'object index can be disabled' '
+	test_config_global core.logAllRefUpdates false &&
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		for i in $(test_seq 5)
+		do
+			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
+			return 1
+		done >input &&
+		git update-ref --stdin <input &&
+		git -c reftable.blockSize=100 -c reftable.indexObjects=false pack-refs &&
+
+		cat >expect <<-EOF &&
+		header:
+		  block_size: 100
+		ref:
+		  - length: 53
+		    restarts: 1
+		  - length: 95
+		    restarts: 1
+		  - length: 71
+		    restarts: 1
+		  - length: 80
+		    restarts: 1
+		EOF
+		test-tool dump-reftable -b .git/reftable/*.ref >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.45.GIT
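
To illustrate what a reverse mapping from object IDs to references provides,
here is a minimal sketch in C. It does not reflect the on-disk "obj" block
format; the struct and function names are hypothetical, and object IDs are
represented as plain hex strings purely for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory record: a ref name and the object it points to. */
struct sketch_ref {
	const char *name;
	const char *oid_hex;
};

/* Order by object ID first, then by ref name for a stable result. */
static int sketch_cmp_by_oid(const void *a, const void *b)
{
	const struct sketch_ref *ra = a, *rb = b;
	int c = strcmp(ra->oid_hex, rb->oid_hex);
	return c ? c : strcmp(ra->name, rb->name);
}

/* Sort refs so that all refs pointing to the same object are adjacent. */
static void sketch_build_obj_index(struct sketch_ref *refs, size_t n)
{
	qsort(refs, n, sizeof(*refs), sketch_cmp_by_oid);
}

/*
 * Look up all refs pointing to `oid_hex` in an array prepared by
 * sketch_build_obj_index(). Returns the number of matches and stores
 * the position of the first match (or the insertion point) in `*first`.
 */
static size_t sketch_lookup_obj(const struct sketch_ref *refs, size_t n,
				const char *oid_hex, size_t *first)
{
	size_t lo = 0, hi = n, count = 0;

	/* Binary search for the first entry that is not smaller. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(refs[mid].oid_hex, oid_hex) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}

	*first = lo;
	while (lo + count < n && !strcmp(refs[lo + count].oid_hex, oid_hex))
		count++;
	return count;
}
```

Without such an index, answering "which refs point to this object" requires
scanning every ref record, which is the cost a user accepts when disabling
the object index.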


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 10/11] reftable: make the compaction factor configurable
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-13  8:18   ` [PATCH v3 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 4682 bytes --]

When auto-compacting, the reftable library packs references such that
the sizes of the tables form a geometric sequence. The factor for this
geometric sequence is hardcoded to 2 right now. We're about to expose
this as a config option though, so let's expose the factor via write
options.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/constants.h       |  1 +
 reftable/reftable-writer.h |  6 ++++++
 reftable/stack.c           | 14 ++++++++++----
 reftable/stack.h           |  3 ++-
 reftable/stack_test.c      |  4 ++--
 5 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/reftable/constants.h b/reftable/constants.h
index 5eee72c4c1..f6beb843eb 100644
--- a/reftable/constants.h
+++ b/reftable/constants.h
@@ -17,5 +17,6 @@ license that can be found in the LICENSE file or at
 
 #define MAX_RESTARTS ((1 << 16) - 1)
 #define DEFAULT_BLOCK_SIZE 4096
+#define DEFAULT_GEOMETRIC_FACTOR 2
 
 #endif
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 94804eaa68..189b1f4144 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -45,6 +45,12 @@ struct reftable_write_options {
 
 	/* boolean: Prevent auto-compaction of tables. */
 	unsigned disable_auto_compact : 1;
+
+	/*
+	 * Geometric sequence factor used by auto-compaction to decide which
+	 * tables to compact. Defaults to 2 if unset.
+	 */
+	uint8_t auto_compaction_factor;
 };
 
 /* reftable_block_stats holds statistics for a single block type */
diff --git a/reftable/stack.c b/reftable/stack.c
index d2e68be7e8..0ebe69e81d 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -10,6 +10,7 @@ license that can be found in the LICENSE file or at
 
 #include "../write-or-die.h"
 #include "system.h"
+#include "constants.h"
 #include "merged.h"
 #include "reader.h"
 #include "reftable-error.h"
@@ -1212,12 +1213,16 @@ static int segment_size(struct segment *s)
 	return s->end - s->start;
 }
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor)
 {
 	struct segment seg = { 0 };
 	uint64_t bytes;
 	size_t i;
 
+	if (!factor)
+		factor = DEFAULT_GEOMETRIC_FACTOR;
+
 	/*
 	 * If there are no tables or only a single one then we don't have to
 	 * compact anything. The sequence is geometric by definition already.
@@ -1249,7 +1254,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 	 * 	64, 32, 16, 8, 4, 3, 1
 	 */
 	for (i = n - 1; i > 0; i--) {
-		if (sizes[i - 1] < sizes[i] * 2) {
+		if (sizes[i - 1] < sizes[i] * factor) {
 			seg.end = i + 1;
 			bytes = sizes[i];
 			break;
@@ -1275,7 +1280,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 		uint64_t curr = bytes;
 		bytes += sizes[i - 1];
 
-		if (sizes[i - 1] < curr * 2) {
+		if (sizes[i - 1] < curr * factor) {
 			seg.start = i - 1;
 			seg.bytes = bytes;
 		}
@@ -1301,7 +1306,8 @@ int reftable_stack_auto_compact(struct reftable_stack *st)
 {
 	uint64_t *sizes = stack_table_sizes_for_compaction(st);
 	struct segment seg =
-		suggest_compaction_segment(sizes, st->merged->stack_len);
+		suggest_compaction_segment(sizes, st->merged->stack_len,
+					   st->opts.auto_compaction_factor);
 	reftable_free(sizes);
 	if (segment_size(&seg) > 0)
 		return stack_compact_range_stats(st, seg.start, seg.end - 1,
diff --git a/reftable/stack.h b/reftable/stack.h
index 97d7ebc043..5b45cff4f7 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -35,6 +35,7 @@ struct segment {
 	uint64_t bytes;
 };
 
-struct segment suggest_compaction_segment(uint64_t *sizes, size_t n);
+struct segment suggest_compaction_segment(uint64_t *sizes, size_t n,
+					  uint8_t factor);
 
 #endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index d15f11d712..0f7b1453e6 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -729,7 +729,7 @@ static void test_suggest_compaction_segment(void)
 {
 	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
 	struct segment min =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(min.start == 1);
 	EXPECT(min.end == 10);
 }
@@ -738,7 +738,7 @@ static void test_suggest_compaction_segment_nothing(void)
 {
 	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
 	struct segment result =
-		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes), 2);
 	EXPECT(result.start == result.end);
 }
 
-- 
2.45.GIT
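
For readers who want to experiment with the factor outside of the reftable
library, the selection logic can be sketched as a standalone function. This
is a reconstruction based on the hunks above, not a copy of reftable/stack.c:
the parts of the function that the diff does not show are assumptions, and
the `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DEFAULT_GEOMETRIC_FACTOR 2

/* Half-open range [start, end) of tables to compact. */
struct sketch_segment {
	size_t start, end;
	uint64_t bytes;
};

static struct sketch_segment sketch_suggest_segment(const uint64_t *sizes,
						    size_t n, uint8_t factor)
{
	struct sketch_segment seg = { 0 };
	uint64_t bytes = 0;
	size_t i;

	if (!factor)
		factor = SKETCH_DEFAULT_GEOMETRIC_FACTOR;

	/* Zero or one table: nothing to compact. */
	if (n <= 1)
		return seg;

	/*
	 * Walk from the newest table towards the oldest one and find the
	 * first pair that violates the geometric sequence.
	 */
	for (i = n - 1; i > 0; i--) {
		if (sizes[i - 1] < sizes[i] * factor) {
			seg.end = i + 1;
			bytes = sizes[i];
			break;
		}
	}

	/*
	 * Extend the segment towards older tables for as long as merging
	 * the accumulated bytes would violate the sequence again.
	 */
	for (; i > 0; i--) {
		uint64_t curr = bytes;
		bytes += sizes[i - 1];

		if (sizes[i - 1] < curr * factor) {
			seg.start = i - 1;
			seg.bytes = bytes;
		}
	}

	return seg;
}
```

With the sizes from the unit test in this patch (512, 64, 17, 16, 9, 9, 9,
16, 2, 16) and a factor of 2, the sketch selects the range [1, 10). The
sequence 64, 32, 16, 8, 4, 2 is left alone with a factor of 2, whereas a
factor of 3 makes the sketch select all six tables.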


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v3 11/11] refs/reftable: allow configuring geometric factor
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
@ 2024-05-13  8:18   ` Patrick Steinhardt
  2024-05-17  8:14   ` [PATCH v3 00/11] reftable: expose write options as config Karthik Nayak
  2024-05-21 23:54   ` Justin Tobler
  12 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-13  8:18 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1897 bytes --]

Allow configuring the geometric factor used by the auto-compaction
algorithm whenever a new table is appended to the stack of tables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt | 10 ++++++++++
 refs/reftable-backend.c           |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 68083876fa..0515727977 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -36,3 +36,13 @@ reftable.indexObjects::
 	are a reverse mapping of object ID to the references pointing to them.
 +
 The default value is `true`.
+
+reftable.geometricFactor::
+	Whenever the reftable backend appends a new table to the stack, it
+	performs auto compaction to ensure that there is only a handful of
+	tables. The backend does this by ensuring that tables form a geometric
+	sequence regarding the respective sizes of each table.
++
+By default, the geometric sequence uses a factor of 2, meaning that for any
+table, the next-biggest table must be at least twice as big. A maximum factor
+of 255 is supported.
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 5ffb36770a..da620fd598 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -247,6 +247,11 @@ static int reftable_be_config(const char *var, const char *value,
 		opts->restart_interval = restart_interval;
 	} else if (!strcmp(var, "reftable.indexobjects")) {
 		opts->skip_index_objects = !git_config_bool(var, value);
+	} else if (!strcmp(var, "reftable.geometricfactor")) {
+		unsigned long factor = git_config_ulong(var, value, ctx->kvi);
+		if (factor > UINT8_MAX)
+			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
+		opts->auto_compaction_factor = factor;
 	}
 
 	return 0;
-- 
2.45.GIT


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 10/11] reftable: make the compaction factor configurable
  2024-05-13  7:54       ` Patrick Steinhardt
@ 2024-05-13 16:22         ` Junio C Hamano
  2024-05-14  4:54           ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-13 16:22 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> So this may be good enough for now, and when we gain the ability to
> parse floats we may convert this to accept floats, as well. An
> alternative would be to convert this to percent, where the default value
> would be 200. That should give sufficient flexibility without having to
> introduce floats.

There already is an established way to specify float with arbitrary
precision on the command line if we limit the value to 0..1, by the
way.

https://lore.kernel.org/git/Pine.LNX.4.58.0505191516350.2322@ppc970.osdl.org/

It is amusing to revisit an ancient discussion thread.  I can see,
on the same thread in the discussion following that message, that
back in May 2005, we were discussing "intent to add" that did not
get invented until mid 2008 already ;-).

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 05/11] reftable/dump: support dumping a table's block structure
  2024-05-10 10:29   ` [PATCH v2 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
@ 2024-05-13 22:42     ` Junio C Hamano
  0 siblings, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2024-05-13 22:42 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> We're about to introduce new configs that will allow users to have more
> control over how exactly reftables are written. To verify that these
> configs are effective we will need to take a peek into the actual blocks
> written by the reftable backend.
>
> Introduce a new mode to the dumping logic that prints out the block
> structure. This logic can be invoked via `test-tool dump-reftable -b`.

This step somehow looks familiar.  Perhaps that is because I read it
more carefully than other steps during the last round.

Looking good to me.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval
  2024-05-10 10:29   ` [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
@ 2024-05-13 22:42     ` Junio C Hamano
  2024-05-14  4:54       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2024-05-13 22:42 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Justin Tobler

Patrick Steinhardt <ps@pks.im> writes:

> The restart interval can at most be `UINT16_MAX` as specified in the
> technical documentation of the reftable format. Furthermore, it cannot
> ever be negative. Regardless of that we use an `int` to track the
> restart interval.
>
> Change the type to use an `uint16_t` instead.

Not wrong per-se, but this one is more or less a Meh, as we know we
do not work on 16-bit platforms anyway.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 10/11] reftable: make the compaction factor configurable
  2024-05-13 16:22         ` Junio C Hamano
@ 2024-05-14  4:54           ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-14  4:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 1680 bytes --]

On Mon, May 13, 2024 at 09:22:38AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > So this may be good enough for now, and when we gain the ability to
> > parse floats we may convert this to accept floats, as well. An
> > alternative would be to convert this to percent, where the default value
> > would be 200. That should give sufficient flexibility without having to
> > introduce floats.
> 
> There already is an established way to specify float with arbitrary
> precision on the command line if we limit the value to 0..1, by the
> way.
> 
> https://lore.kernel.org/git/Pine.LNX.4.58.0505191516350.2322@ppc970.osdl.org/

The problem is that we don't want to limit the value to 0..1, but to
1..n. 1 really is the lowest semi-sensible number you can pick and means
that all tables should have the same size. And in practice, there is no
upper limit, even though it's probably not all that reasonable to pick
anything beyond 10.

In any case, I'd propose to keep this as-is for now. The simple reason
is that we have preexisting commands (`git repack -g`) and in-flight
series (pseudo-merge bitmaps) that also use plain integers to represent
the geometric factor. So I'm aiming for consistency. From there on we can
then iterate and design a solution that works for all of them, e.g. by
allowing for floats outside of the 0..1 range in whatever form.
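
The percent-based alternative mentioned in the quoted text could be sketched
roughly as follows. This is only an illustration of the idea and not part of
the series: the function name and the `percent` parameter are hypothetical,
with 200 corresponding to today's factor of 2.

```c
#include <stdint.h>

/*
 * Hypothetical check: does `bigger` fail to be at least
 * `percent / 100` times as large as `smaller`?
 *
 * The comparison `bigger < smaller * percent / 100` is rearranged so
 * that no fractions are needed. Overflow for very large sizes is not
 * handled in this sketch.
 */
static int sketch_breaks_sequence(uint64_t bigger, uint64_t smaller,
				  unsigned percent)
{
	return bigger * 100 < smaller * percent;
}
```

A value of 150 would then express a factor of 1.5 without having to parse
floating-point numbers from the config.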

> It is amusing to revisit an ancient discussion thread.  I can see,
> on the same thread in the discussion following that message, that
> back in May 2005, we were discussing "intent to add" that did not
> get invented until mid 2008 already ;-).

:)

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval
  2024-05-13 22:42     ` Junio C Hamano
@ 2024-05-14  4:54       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-14  4:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Justin Tobler

[-- Attachment #1: Type: text/plain, Size: 710 bytes --]

On Mon, May 13, 2024 at 03:42:37PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The restart interval can at most be `UINT16_MAX` as specified in the
> > technical documentation of the reftable format. Furthermore, it cannot
> > ever be negative. Regardless of that we use an `int` to track the
> > restart interval.
> >
> > Change the type to use an `uint16_t` instead.
> 
> Not wrong per-se, but this one is more or less a Meh, as we know we
> do not work on 16-bit platforms anyway.

Yeah, my intent isn't really "platform portability". It's rather that I
think that an `uint16_t` documents the accepted range of values much
better than `int` does.

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v3 02/11] reftable: pass opts as constant pointer
  2024-05-13  8:17   ` [PATCH v3 02/11] reftable: pass opts as constant pointer Patrick Steinhardt
@ 2024-05-17  8:02     ` Karthik Nayak
  2024-05-21 23:22     ` Justin Tobler
  1 sibling, 0 replies; 78+ messages in thread
From: Karthik Nayak @ 2024-05-17  8:02 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 871 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> We sometimes pass the refatble write options as value and sometimes as a
> pointer. This is quite confusing and makes the reader wonder whether the
> options get modified sometimes.
>
> In fact, `reftable_new_writer()` does cause the caller-provided options
> to get updated when some values aren't set up. This is quite unexpected,
> but didn't cause any harm until now.
>
> Adapt the code so that we do not modify the caller-provided values
> anymore. While at it, refactor the code to consistently pass the
> options as a constant pointer to clarify that the caller-provided opts
> will not ever get modified.
>

So from v2, we changed this from passing the value to passing the
address. Mostly, we're trying to stay consistent and have to pick
either, so this makes more sense since we're passing around big structs.
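
The pattern discussed here, taking the options as a pointer to const and
applying defaults to a private copy, can be sketched as follows. This is a
minimal sketch under stated assumptions: the struct layouts and names are
hypothetical and much smaller than the real `reftable_write_options`, and
the default values are the ones mentioned in the documentation of this
series.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for the real write options. */
struct sketch_write_options {
	uint32_t block_size;
	uint16_t restart_interval;
};

struct sketch_writer {
	/* Private copy, so defaults never leak back to the caller. */
	struct sketch_write_options opts;
};

static void sketch_writer_init(struct sketch_writer *w,
			       const struct sketch_write_options *opts)
{
	/* Copy first, then fill in defaults on the copy only. */
	w->opts = *opts;
	if (!w->opts.block_size)
		w->opts.block_size = 4096;
	if (!w->opts.restart_interval)
		w->opts.restart_interval = 16;
}
```

Because the parameter is a pointer to const, the compiler rejects any
attempt to write defaults back into the caller's struct, while the caller
still avoids copying the whole struct at every call.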

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v3 06/11] refs/reftable: allow configuring block size
  2024-05-13  8:18   ` [PATCH v3 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
@ 2024-05-17  8:09     ` Karthik Nayak
  0 siblings, 0 replies; 78+ messages in thread
From: Karthik Nayak @ 2024-05-17  8:09 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Add a new option `reftable.blockSize` that allows the user to control
> the block size used by the reftable library.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/config.txt          |  2 +
>  Documentation/config/reftable.txt | 14 ++++++
>  refs/reftable-backend.c           | 31 ++++++++++++-
>  t/t0613-reftable-write-options.sh | 72 +++++++++++++++++++++++++++++++
>  4 files changed, 118 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/config/reftable.txt
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 6f649c997c..cbf0b99c44 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -498,6 +498,8 @@ include::config/rebase.txt[]
>
>  include::config/receive.txt[]
>
> +include::config/reftable.txt[]
> +
>  include::config/remote.txt[]
>
>  include::config/remotes.txt[]
> diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
> new file mode 100644
> index 0000000000..fa7c4be014
> --- /dev/null
> +++ b/Documentation/config/reftable.txt
> @@ -0,0 +1,14 @@
> +reftable.blockSize::
> +	The size in bytes used by the reftable backend when writing blocks.
> +	The block size is determined by the writer, and does not have to be a
> +	power of 2. The block size must be larger than the longest reference
> +	name or log entry used in the repository, as references cannot span

Nit: s/as references cannot/as neither can/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v3 00/11] reftable: expose write options as config
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2024-05-13  8:18   ` [PATCH v3 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
@ 2024-05-17  8:14   ` Karthik Nayak
  2024-05-17  8:26     ` Patrick Steinhardt
  2024-05-21 23:54   ` Justin Tobler
  12 siblings, 1 reply; 78+ messages in thread
From: Karthik Nayak @ 2024-05-17  8:14 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1111 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this is the third version of my patch series that exposes several write
> options of the reftable library via Git configs.
>
> Changes compared to v2:
>
>   - Adapted patch 2 such that we now pass options as const pointers
>     instead of by value.
>
>   - Removed a confusing sentence in the documentation of the restart
>     points in patch 8.
>
> Other than that I decided to rebase this on top of the current "master"
> branch at 0f3415f1f8 (The second batch, 2024-05-08). This is because the
> revamped patch 2 would cause new conflicts with 485c63cf5c (reftable:
> remove name checks, 2024-04-08) that didn't exist in v2 of this patch
> series yet. Rebasing thus seemed like the more reasonable option.
>

I did go through the patches and only had a small nit, but not worth a
re-roll.

I was also wondering what happens if users tweak these values when a
repository already contains reftables with different values. Seems like
it'll use the new configuration during new table creation and also
during autocompaction. Which makes sense.

Thanks
Karthik

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v3 00/11] reftable: expose write options as config
  2024-05-17  8:14   ` [PATCH v3 00/11] reftable: expose write options as config Karthik Nayak
@ 2024-05-17  8:26     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-17  8:26 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, Justin Tobler, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1385 bytes --]

On Fri, May 17, 2024 at 10:14:19AM +0200, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Hi,
> >
> > this is the third version of my patch series that exposes several write
> > options of the reftable library via Git configs.
> >
> > Changes compared to v2:
> >
> >   - Adapted patch 2 such that we now pass options as const pointers
> >     instead of by value.
> >
> >   - Removed a confusing sentence in the documentation of the restart
> >     points in patch 8.
> >
> > Other than that I decided to rebase this on top of the current "master"
> > branch at 0f3415f1f8 (The second batch, 2024-05-08). This is because the
> > revamped patch 2 would cause new conflicts with 485c63cf5c (reftable:
> > remove name checks, 2024-04-08) that didn't exist in v2 of this patch
> > series yet. Rebasing thus seemed like the more reasonable option.
> >
> 
> I did go through the patches and only had a small nit, but not worth a
> re-roll.

Thanks!

> I was also wondering what happens if users tweak these values when a
> repository already contains reftables with different values. Seems like
> it'll use the new configuration during new table creation and also
> during autocompaction. Which makes sense.

Yup. It should be fine to change the values at will and for different
tables to be written with different configs.

Patrick

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v3 02/11] reftable: pass opts as constant pointer
  2024-05-13  8:17   ` [PATCH v3 02/11] reftable: pass opts as constant pointer Patrick Steinhardt
  2024-05-17  8:02     ` Karthik Nayak
@ 2024-05-21 23:22     ` Justin Tobler
  2024-05-22  7:19       ` Patrick Steinhardt
  1 sibling, 1 reply; 78+ messages in thread
From: Justin Tobler @ 2024-05-21 23:22 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano

On 24/05/13 10:17AM, Patrick Steinhardt wrote:
> We sometimes pass the refatble write options as value and sometimes as a

s/refatble/reftable

> pointer. This is quite confusing and makes the reader wonder whether the
> options get modified sometimes.
> 
> In fact, `reftable_new_writer()` does cause the caller-provided options
> to get updated when some values aren't set up. This is quite unexpected,
> but didn't cause any harm until now.
> 
> Adapt the code so that we do not modify the caller-provided values
> anymore. While at it, refactor the code to consistently pass the
> options as a constant pointer to clarify that the caller-provided opts
> will not ever get modified.

Doesn't really matter, but would it be more accurate to say "pointer to
a constant type"?

Overall, I like this change. Improves consistency and readability :)

-Justin

* Re: [PATCH v3 05/11] reftable/dump: support dumping a table's block structure
  2024-05-13  8:18   ` [PATCH v3 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
@ 2024-05-21 23:35     ` Justin Tobler
  2024-05-22  7:19       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Justin Tobler @ 2024-05-21 23:35 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano

On 24/05/13 10:18AM, Patrick Steinhardt wrote:
> +int reftable_reader_print_blocks(const char *tablename)
> +{
> +	struct {
> +		const char *name;
> +		int type;
> +	} sections[] = {
> +		{
> +			.name = "ref",
> +			.type = BLOCK_TYPE_REF,
> +		},
> +		{
> +			.name = "obj",
> +			.type = BLOCK_TYPE_OBJ,
> +		},
> +		{
> +			.name = "log",
> +			.type = BLOCK_TYPE_LOG,
> +		},
> +	};

I noticed that we are not including all the block types. Would we ever
want to also be able to dump index blocks? Or would they not be useful
in this context?
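
For reference, a rough sketch of how a dump helper could label a block
from its type byte, index blocks included. The type bytes are the ones
given in the reftable format description ('r', 'o', 'g', 'i') and are
defined locally so the snippet stands alone. `block_type_name()` and the
"idx" label are hypothetical and not part of the patch. Note that in the
format, index blocks belong to the ref, obj and log sections rather than
forming a section of their own, so this labels single blocks and does not
extend the `sections[]` array above.

```c
#include <stddef.h>
#include <stdint.h>

/* Block type bytes, defined locally so the sketch is self-contained. */
enum {
	BLOCK_TYPE_REF = 'r',
	BLOCK_TYPE_OBJ = 'o',
	BLOCK_TYPE_LOG = 'g',
	BLOCK_TYPE_INDEX = 'i',
};

/*
 * Hypothetical helper: map a block's type byte to a short label for
 * printing. Returns NULL for a byte that is not a known block type.
 */
const char *block_type_name(uint8_t type)
{
	switch (type) {
	case BLOCK_TYPE_REF:
		return "ref";
	case BLOCK_TYPE_OBJ:
		return "obj";
	case BLOCK_TYPE_LOG:
		return "log";
	case BLOCK_TYPE_INDEX:
		return "idx";
	default:
		return NULL;
	}
}
```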

-Justin

* Re: [PATCH v3 08/11] refs/reftable: allow configuring restart interval
  2024-05-13  8:18   ` [PATCH v3 08/11] refs/reftable: allow configuring " Patrick Steinhardt
@ 2024-05-21 23:50     ` Justin Tobler
  2024-05-22  7:19       ` Patrick Steinhardt
  0 siblings, 1 reply; 78+ messages in thread
From: Justin Tobler @ 2024-05-21 23:50 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano

On 24/05/13 10:18AM, Patrick Steinhardt wrote:
> +reftable.restartInterval::
> +	The interval at which to create restart points. The reftable backend
> +	determines the restart points at file creation. Every 16 may be
> +	more suitable for smaller block sizes (4k or 8k), every 64 for larger
> +	block sizes (64k).
> ++
> +More frequent restart points reduces prefix compression and increases
> +space consumed by the restart table, both of which increase file size.
> ++
> +Less frequent restart points makes prefix compression more effective,
> +decreasing overall file size, with increased penalties for readers
> +walking through more records after the binary search step.
> ++
> +A maximum of `65535` restart points per block is supported.
> ++
> +The default value is to create restart points every 16 records. A value of `0`
> +will use the default value.

Out of curiosity, if for some reason we didn't want any prefix
compression, would the best way to do this be via setting the restart
interval to 1? I guess this means the number of references would also be
limited by the maximum number of restart points.
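
To illustrate the trade-off, here is a simplified model, not the actual
writer code. It counts the key bytes stored for a sorted list of names
when a restart point is written every `interval` records. A record at a
restart point stores its full key, every other record stores only the
suffix that differs from the previous key. Varint headers, values and the
encoding of the restart table itself are ignored, and the names
`model_block()` and `struct layout` are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of two NUL-terminated strings. */
size_t common_prefix(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

struct layout {
	size_t key_bytes; /* key bytes stored after prefix compression */
	size_t restarts;  /* number of restart points written */
};

/*
 * Model a single block holding `n` sorted names with a restart point
 * every `interval` records.
 */
struct layout model_block(const char **names, size_t n, size_t interval)
{
	struct layout l = { 0, 0 };
	size_t i;

	assert(interval > 0);
	for (i = 0; i < n; i++) {
		if (i % interval == 0) {
			/* Restart point: the full key is stored. */
			l.key_bytes += strlen(names[i]);
			l.restarts++;
		} else {
			/* Only the suffix after the shared prefix is stored. */
			l.key_bytes += strlen(names[i]) -
				       common_prefix(names[i - 1], names[i]);
		}
	}
	return l;
}
```

With an interval of 1 every record is a restart point, so no prefix is
ever shared and the restart count equals the record count. Under the
documented limit of `65535` restart points per block, that also caps the
number of records a single block can hold in this model.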

-Justin

* Re: [PATCH v3 00/11] reftable: expose write options as config
  2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2024-05-17  8:14   ` [PATCH v3 00/11] reftable: expose write options as config Karthik Nayak
@ 2024-05-21 23:54   ` Justin Tobler
  2024-05-22  7:19     ` Patrick Steinhardt
  12 siblings, 1 reply; 78+ messages in thread
From: Justin Tobler @ 2024-05-21 23:54 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Junio C Hamano

On 24/05/13 10:17AM, Patrick Steinhardt wrote:
> Hi,
> 
> this is the third version of my patch series that exposes several write
> options of the reftable library via Git configs.
> 
> Changes compared to v2:
> 
>   - Adapted patch 2 such that we now pass options as const pointers
>     instead of by value.
> 
>   - Removed a confusing sentence in the documentation of the restart
>     points in patch 8.
> 
> Other than that I decided to rebase this on top of the current "master"
> branch at 0f3415f1f8 (The second batch, 2024-05-08). This is because the
> revamped patch 2 would cause new conflicts with 485c63cf5c (reftable:
> remove name checks, 2024-04-08) that didn't exist in v2 of this patch
> series yet. Rebasing thus seemed like the more reasonable option.

Thanks Patrick! I reviewed this version and left a couple
comments/questions, but nothing that would necessitate a reroll. :)

-Justin

* Re: [PATCH v3 02/11] reftable: pass opts as constant pointer
  2024-05-21 23:22     ` Justin Tobler
@ 2024-05-22  7:19       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-22  7:19 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak, Junio C Hamano

On Tue, May 21, 2024 at 06:22:43PM -0500, Justin Tobler wrote:
> On 24/05/13 10:17AM, Patrick Steinhardt wrote:
> > We sometimes pass the refatble write options as value and sometimes as a
> 
> s/refatble/reftable
> 
> > pointer. This is quite confusing and makes the reader wonder whether the
> > options get modified sometimes.
> > 
> > In fact, `reftable_new_writer()` does cause the caller-provided options
> > to get updated when some values aren't set up. This is quite unexpected,
> > but didn't cause any harm until now.
> > 
> > Adapt the code so that we do not modify the caller-provided values
> > anymore. While at it, refactor the code to consistently pass the
> > options as a constant pointer to clarify that the caller-provided opts
> > will not ever get modified.
> 
> Doesn't really matter, but would it be more accurate to say "pointer to
> a constant type"?
> 
> Overall, I like this change. Improves consistency and readability :)

True. As you mentioned, I'll not reroll this series just for these two
small issues though.

Patrick

* Re: [PATCH v3 05/11] reftable/dump: support dumping a table's block structure
  2024-05-21 23:35     ` Justin Tobler
@ 2024-05-22  7:19       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-22  7:19 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak, Junio C Hamano

On Tue, May 21, 2024 at 06:35:13PM -0500, Justin Tobler wrote:
> On 24/05/13 10:18AM, Patrick Steinhardt wrote:
> > +int reftable_reader_print_blocks(const char *tablename)
> > +{
> > +	struct {
> > +		const char *name;
> > +		int type;
> > +	} sections[] = {
> > +		{
> > +			.name = "ref",
> > +			.type = BLOCK_TYPE_REF,
> > +		},
> > +		{
> > +			.name = "obj",
> > +			.type = BLOCK_TYPE_OBJ,
> > +		},
> > +		{
> > +			.name = "log",
> > +			.type = BLOCK_TYPE_LOG,
> > +		},
> > +	};
> 
> I noticed that we are not including all the block types. Would we ever
> want to also be able to dump index blocks? Or would they not be useful
> in this context?

Maybe. It wasn't really necessary to include index blocks in this
context as I was already able to extract all relevant information
without them. So I decided to skip them. We may iterate on this in the
future as required.

Patrick

* Re: [PATCH v3 08/11] refs/reftable: allow configuring restart interval
  2024-05-21 23:50     ` Justin Tobler
@ 2024-05-22  7:19       ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-22  7:19 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak, Junio C Hamano

On Tue, May 21, 2024 at 06:50:58PM -0500, Justin Tobler wrote:
> On 24/05/13 10:18AM, Patrick Steinhardt wrote:
> > +reftable.restartInterval::
> > +	The interval at which to create restart points. The reftable backend
> > +	determines the restart points at file creation. Every 16 may be
> > +	more suitable for smaller block sizes (4k or 8k), every 64 for larger
> > +	block sizes (64k).
> > ++
> > +More frequent restart points reduces prefix compression and increases
> > +space consumed by the restart table, both of which increase file size.
> > ++
> > +Less frequent restart points makes prefix compression more effective,
> > +decreasing overall file size, with increased penalties for readers
> > +walking through more records after the binary search step.
> > ++
> > +A maximum of `65535` restart points per block is supported.
> > ++
> > +The default value is to create restart points every 16 records. A value of `0`
> > +will use the default value.
> 
> Out of curiosity, if for some reason we didn't want any prefix
> compression, would the best way to do this be via setting the restart
> interval to 1? I guess this means the number of references would also be
> limited by the maximum number of restart points.

Yup, exactly.

Patrick

* Re: [PATCH v3 00/11] reftable: expose write options as config
  2024-05-21 23:54   ` Justin Tobler
@ 2024-05-22  7:19     ` Patrick Steinhardt
  0 siblings, 0 replies; 78+ messages in thread
From: Patrick Steinhardt @ 2024-05-22  7:19 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak, Junio C Hamano

On Tue, May 21, 2024 at 06:54:33PM -0500, Justin Tobler wrote:
> On 24/05/13 10:17AM, Patrick Steinhardt wrote:
> > Hi,
> > 
> > this is the third version of my patch series that exposes several write
> > options of the reftable library via Git configs.
> > 
> > Changes compared to v2:
> > 
> >   - Adapted patch 2 such that we now pass options as const pointers
> >     instead of by value.
> > 
> >   - Removed a confusing sentence in the documentation of the restart
> >     points in patch 8.
> > 
> > Other than that I decided to rebase this on top of the current "master"
> > branch at 0f3415f1f8 (The second batch, 2024-05-08). This is because the
> > revamped patch 2 would cause new conflicts with 485c63cf5c (reftable:
> > remove name checks, 2024-04-08) that didn't exist in v2 of this patch
> > series yet. Rebasing thus seemed like the more reasonable option.
> 
> Thanks Patrick! I reviewed this version and left a couple
> comments/questions, but nothing that would necessitate a reroll. :)

Thanks for your review! As mentioned, I agree that there is no strong
motivator to reroll this series as the only changes would be a typo fix
in the second patch.

Patrick

Thread overview: 78+ messages
2024-05-02  6:51 [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
2024-05-10  9:00   ` Karthik Nayak
2024-05-10 10:13     ` Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
2024-05-02  6:51 ` [PATCH 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
2024-05-10  9:29   ` Karthik Nayak
2024-05-10 10:13     ` Patrick Steinhardt
2024-05-02  6:52 ` [PATCH 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
2024-05-02  6:52 ` [PATCH 08/11] refs/reftable: allow configuring " Patrick Steinhardt
2024-05-02  6:52 ` [PATCH 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
2024-05-02  6:52 ` [PATCH 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
2024-05-10  9:55   ` Karthik Nayak
2024-05-10 10:13     ` Patrick Steinhardt
2024-05-02  6:52 ` [PATCH 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
2024-05-10  9:58   ` Karthik Nayak
2024-05-10 10:13     ` Patrick Steinhardt
2024-05-02  7:29 ` [PATCH 00/11] reftable: expose write options as config Patrick Steinhardt
2024-05-03 20:38   ` Junio C Hamano
2024-05-06  6:51     ` Patrick Steinhardt
2024-05-06 21:29 ` Justin Tobler
2024-05-10 10:00 ` Karthik Nayak
2024-05-10 10:14   ` Patrick Steinhardt
2024-05-10 10:29 ` [PATCH v2 " Patrick Steinhardt
2024-05-10 10:29   ` [PATCH v2 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
2024-05-10 21:03     ` Junio C Hamano
2024-05-10 10:29   ` [PATCH v2 02/11] reftable: consistently pass write opts as value Patrick Steinhardt
2024-05-10 21:11     ` Junio C Hamano
2024-05-13  7:53       ` Patrick Steinhardt
2024-05-10 10:29   ` [PATCH v2 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
2024-05-10 21:19     ` Junio C Hamano
2024-05-10 10:29   ` [PATCH v2 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
2024-05-10 21:25     ` Junio C Hamano
2024-05-13  7:53       ` Patrick Steinhardt
2024-05-10 10:29   ` [PATCH v2 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
2024-05-13 22:42     ` Junio C Hamano
2024-05-10 10:29   ` [PATCH v2 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
2024-05-10 10:29   ` [PATCH v2 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
2024-05-13 22:42     ` Junio C Hamano
2024-05-14  4:54       ` Patrick Steinhardt
2024-05-10 10:29   ` [PATCH v2 08/11] refs/reftable: allow configuring " Patrick Steinhardt
2024-05-10 21:57     ` Junio C Hamano
2024-05-13  7:54       ` Patrick Steinhardt
2024-05-10 10:30   ` [PATCH v2 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
2024-05-10 10:30   ` [PATCH v2 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
2024-05-10 22:12     ` Junio C Hamano
2024-05-13  7:54       ` Patrick Steinhardt
2024-05-13 16:22         ` Junio C Hamano
2024-05-14  4:54           ` Patrick Steinhardt
2024-05-10 10:30   ` [PATCH v2 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
2024-05-10 11:43   ` [PATCH v2 00/11] reftable: expose write options as config Karthik Nayak
2024-05-13  8:17 ` [PATCH v3 " Patrick Steinhardt
2024-05-13  8:17   ` [PATCH v3 01/11] reftable: consistently refer to `reftable_write_options` as `opts` Patrick Steinhardt
2024-05-13  8:17   ` [PATCH v3 02/11] reftable: pass opts as constant pointer Patrick Steinhardt
2024-05-17  8:02     ` Karthik Nayak
2024-05-21 23:22     ` Justin Tobler
2024-05-22  7:19       ` Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 03/11] reftable/writer: drop static variable used to initialize strbuf Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 04/11] reftable/writer: improve error when passed an invalid block size Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 05/11] reftable/dump: support dumping a table's block structure Patrick Steinhardt
2024-05-21 23:35     ` Justin Tobler
2024-05-22  7:19       ` Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 06/11] refs/reftable: allow configuring block size Patrick Steinhardt
2024-05-17  8:09     ` Karthik Nayak
2024-05-13  8:18   ` [PATCH v3 07/11] reftable: use `uint16_t` to track restart interval Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 08/11] refs/reftable: allow configuring " Patrick Steinhardt
2024-05-21 23:50     ` Justin Tobler
2024-05-22  7:19       ` Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 09/11] refs/reftable: allow disabling writing the object index Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 10/11] reftable: make the compaction factor configurable Patrick Steinhardt
2024-05-13  8:18   ` [PATCH v3 11/11] refs/reftable: allow configuring geometric factor Patrick Steinhardt
2024-05-17  8:14   ` [PATCH v3 00/11] reftable: expose write options as config Karthik Nayak
2024-05-17  8:26     ` Patrick Steinhardt
2024-05-21 23:54   ` Justin Tobler
2024-05-22  7:19     ` Patrick Steinhardt
