git.vger.kernel.org archive mirror
* [PATCH 0/4] reftable: optimize I/O patterns
@ 2024-01-08 12:18 Patrick Steinhardt
  2024-01-08 12:18 ` [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 890 bytes --]

Hi,

this patch series optimizes some I/O patterns that the reftable library
uses:

  - Establish a stat(3P)-based caching mechanism to avoid re-reading
    "tables.list" all the time.

  - Convert the block source to use mmap(3P) instead of read(3P) to read
    data.

Combined, these changes lead to a ~2.7x speedup when writing many refs.

Patrick

Patrick Steinhardt (4):
  reftable/stack: refactor stack reloading to have common exit path
  reftable/stack: refactor reloading to use file descriptor
  reftable/stack: use stat info to avoid re-reading stack list
  reftable/blocksource: use mmap to read tables

 reftable/blocksource.c |  48 +++++++++++++-----
 reftable/stack.c       | 111 ++++++++++++++++++++++++-----------------
 reftable/stack.h       |   1 +
 reftable/system.h      |   1 +
 4 files changed, 103 insertions(+), 58 deletions(-)

-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path
  2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
@ 2024-01-08 12:18 ` Patrick Steinhardt
  2024-01-10 17:30   ` Junio C Hamano
  2024-01-08 12:18 ` [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

The `reftable_stack_reload_maybe_reuse()` function is responsible for
reloading the reftable list from disk. The function is quite hard to
follow though because it has a bunch of different exit paths, many of
which have to free the same set of resources.

Refactor the function to have a common exit path. While at it, touch up
the style of this function a bit to match our usual coding style better.
---
 reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
 1 file changed, 42 insertions(+), 44 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 16bab82063..bf869a6772 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -304,69 +304,67 @@ static int tv_cmp(struct timeval *a, struct timeval *b)
 static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 					     int reuse_open)
 {
-	struct timeval deadline = { 0 };
-	int err = gettimeofday(&deadline, NULL);
+	char **names = NULL, **names_after = NULL;
+	struct timeval deadline;
 	int64_t delay = 0;
-	int tries = 0;
-	if (err < 0)
-		return err;
+	int tries = 0, err;
 
+	err = gettimeofday(&deadline, NULL);
+	if (err < 0)
+		goto out;
 	deadline.tv_sec += 3;
+
 	while (1) {
-		char **names = NULL;
-		char **names_after = NULL;
-		struct timeval now = { 0 };
-		int err = gettimeofday(&now, NULL);
-		int err2 = 0;
-		if (err < 0) {
-			return err;
-		}
+		struct timeval now;
 
-		/* Only look at deadlines after the first few times. This
-		   simplifies debugging in GDB */
+		err = gettimeofday(&now, NULL);
+		if (err < 0)
+			goto out;
+
+		/*
+		 * Only look at deadlines after the first few times. This
+		 * simplifies debugging in GDB.
+		 */
 		tries++;
-		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
-			break;
-		}
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
+			goto out;
 
 		err = read_lines(st->list_file, &names);
-		if (err < 0) {
-			free_names(names);
-			return err;
-		}
+		if (err < 0)
+			goto out;
+
 		err = reftable_stack_reload_once(st, names, reuse_open);
-		if (err == 0) {
-			free_names(names);
+		if (!err)
 			break;
-		}
-		if (err != REFTABLE_NOT_EXIST_ERROR) {
-			free_names(names);
-			return err;
-		}
-
-		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
-		   writer. Check if there was one by checking if the name list
-		   changed.
-		*/
-		err2 = read_lines(st->list_file, &names_after);
-		if (err2 < 0) {
-			free_names(names);
-			return err2;
-		}
-
+		if (err != REFTABLE_NOT_EXIST_ERROR)
+			goto out;
+
+		/*
+		 * REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		 * writer. Check if there was one by checking if the name list
+		 * changed.
+		 */
+		err = read_lines(st->list_file, &names_after);
+		if (err < 0)
+			goto out;
 		if (names_equal(names_after, names)) {
-			free_names(names);
-			free_names(names_after);
-			return err;
+			err = REFTABLE_NOT_EXIST_ERROR;
+			goto out;
 		}
+
 		free_names(names);
+		names = NULL;
 		free_names(names_after);
+		names_after = NULL;
 
 		delay = delay + (delay * rand()) / RAND_MAX + 1;
 		sleep_millisec(delay);
 	}
 
-	return 0;
+out:
+	free_names(names);
+	free_names(names_after);
+	return err;
 }
 
 /* -1 = error
-- 
2.43.GIT



* [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor
  2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
  2024-01-08 12:18 ` [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
@ 2024-01-08 12:18 ` Patrick Steinhardt
  2024-01-10 19:57   ` Junio C Hamano
  2024-01-08 12:18 ` [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

We're about to introduce a stat(3P)-based caching mechanism to reload
the list of tables only when it has changed. In order to avoid race
conditions, this requires us to have a file descriptor available that we
can use to call fstat(3P) on.

Prepare for this by converting the code to use `fd_read_lines()` so that
we have the file descriptor readily available.
---
 reftable/stack.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index bf869a6772..b1ee247601 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -308,6 +308,7 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 	struct timeval deadline;
 	int64_t delay = 0;
 	int tries = 0, err;
+	int fd = -1;
 
 	err = gettimeofday(&deadline, NULL);
 	if (err < 0)
@@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
 			goto out;
 
-		err = read_lines(st->list_file, &names);
-		if (err < 0)
-			goto out;
+		fd = open(st->list_file, O_RDONLY);
+		if (fd < 0) {
+			if (errno != ENOENT) {
+				err = REFTABLE_IO_ERROR;
+				goto out;
+			}
+
+			names = reftable_calloc(sizeof(char *));
+		} else {
+			err = fd_read_lines(fd, &names);
+			if (err < 0)
+				goto out;
+		}
 
 		err = reftable_stack_reload_once(st, names, reuse_open);
 		if (!err)
@@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		names = NULL;
 		free_names(names_after);
 		names_after = NULL;
+		close(fd);
+		fd = -1;
 
 		delay = delay + (delay * rand()) / RAND_MAX + 1;
 		sleep_millisec(delay);
 	}
 
 out:
+	if (fd >= 0)
+		close(fd);
 	free_names(names);
 	free_names(names_after);
 	return err;
-- 
2.43.GIT



* [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list
  2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
  2024-01-08 12:18 ` [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
  2024-01-08 12:18 ` [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
@ 2024-01-08 12:18 ` Patrick Steinhardt
  2024-01-10 20:07   ` Junio C Hamano
  2024-01-08 12:18 ` [PATCH 4/4] reftable/blocksource: use mmap to read tables Patrick Steinhardt
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
  4 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

Whenever we call into the refs interfaces we potentially have to reload
refs in case they have been concurrently modified, either in-process or
externally. While this happens somewhat automatically for loose refs
because we simply try to re-read the files, the "packed" backend will
reload its snapshot of the packed-refs file in case its stat info has
changed since last reading it.

In the reftable backend we have a similar mechanism that is provided by
`reftable_stack_reload()`. This function will read the list of tables
from "tables.list" and, if it has changed from the currently stored
list, reload the stack. This is highly inefficient though, as we have to
check whether the stack is up-to-date on basically every read, and thus
we keep on re-reading the file all the time even if it didn't change at
all.

We can do better and use the same stat(3P)-based mechanism that the
"packed" backend uses. Instead of reading the file, we only open the
file descriptor, fstat(3P) it, and then compare the info against the
cached value from the last time we updated the stack. This should
always work alright because "tables.list" is updated atomically via a
rename, so even if the ctime or mtime wasn't granular enough to identify
a change, at least the inode number should have changed.

This change significantly speeds up operations where many refs are read,
like when using git-update-ref(1). The following benchmark creates N
refs in an otherwise-empty repository via `git update-ref --stdin`:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.6 ms ±   0.2 ms    [User: 2.5 ms, System: 3.0 ms]
    Range (min … max):     5.2 ms …   6.0 ms    89 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      19.2 ms ±   0.3 ms    [User: 8.6 ms, System: 10.4 ms]
    Range (min … max):    18.4 ms …  19.8 ms    63 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):      1.310 s ±  0.014 s    [User: 0.566 s, System: 0.724 s]
    Range (min … max):    1.291 s …  1.342 s    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.7 ms ±   0.4 ms    [User: 2.6 ms, System: 3.1 ms]
    Range (min … max):     5.1 ms …   9.5 ms    91 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      15.2 ms ±   0.3 ms    [User: 7.0 ms, System: 8.1 ms]
    Range (min … max):    14.3 ms …  17.1 ms    71 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     916.1 ms ±  11.0 ms    [User: 420.8 ms, System: 495.0 ms]
    Range (min … max):   906.9 ms … 943.8 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD~) ran
      1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
      2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
---
 reftable/stack.c  | 10 +++++++++-
 reftable/stack.h  |  1 +
 reftable/system.h |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index b1ee247601..a51a2dbf1f 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -175,6 +175,7 @@ void reftable_stack_destroy(struct reftable_stack *st)
 		st->readers_len = 0;
 		FREE_AND_NULL(st->readers);
 	}
+	stat_validity_clear(&st->list_validity);
 	FREE_AND_NULL(st->list_file);
 	FREE_AND_NULL(st->reftable_dir);
 	reftable_free(st);
@@ -374,6 +375,8 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		sleep_millisec(delay);
 	}
 
+	stat_validity_update(&st->list_validity, fd);
+
 out:
 	if (fd >= 0)
 		close(fd);
@@ -388,8 +391,13 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 static int stack_uptodate(struct reftable_stack *st)
 {
 	char **names = NULL;
-	int err = read_lines(st->list_file, &names);
+	int err;
 	int i = 0;
+
+	if (stat_validity_check(&st->list_validity, st->list_file))
+		return 0;
+
+	err = read_lines(st->list_file, &names);
 	if (err < 0)
 		return err;
 
diff --git a/reftable/stack.h b/reftable/stack.h
index f57005846e..3f80cc598a 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -14,6 +14,7 @@ license that can be found in the LICENSE file or at
 #include "reftable-stack.h"
 
 struct reftable_stack {
+	struct stat_validity list_validity;
 	char *list_file;
 	char *reftable_dir;
 	int disable_auto_compact;
diff --git a/reftable/system.h b/reftable/system.h
index 6b74a81514..2cc7adf271 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -12,6 +12,7 @@ license that can be found in the LICENSE file or at
 /* This header glues the reftable library to the rest of Git */
 
 #include "git-compat-util.h"
+#include "statinfo.h"
 #include "strbuf.h"
 #include "hash-ll.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/
-- 
2.43.GIT



* [PATCH 4/4] reftable/blocksource: use mmap to read tables
  2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-01-08 12:18 ` [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
@ 2024-01-08 12:18 ` Patrick Steinhardt
  2024-01-10 21:24   ` Junio C Hamano
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
  4 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

The blocksource interface provides a mechanism to read blocks from a
reftable table. It is implemented using read(3P) calls on the
underlying file descriptor. While this works alright, this pattern
is very inefficient when repeatedly querying the reftable stack for one
or more refs. This inefficiency can mostly be attributed to the fact
that we often need to re-read the same blocks over and over again, and
every single time we need to call read(3P) again.

A natural fit in this context is to use mmap(3P) instead of read(3P),
which has a bunch of benefits:

  - We do not need to come up with a caching strategy for some of the
    blocks as this will be handled by the kernel already.

  - We can avoid the overhead of having to call into the read(3P)
    syscall repeatedly.

  - We do not need to allocate returned blocks repeatedly, but can
    instead hand out pointers into the mmapped region directly.

Using mmap comes with a significant drawback on Windows though, because
mmapped files cannot be deleted, and neither is it possible to rename
files onto an mmapped file. But for one, the reftable library already
gracefully handles the case where auto-compaction cannot delete a
still-open stack and ignores any such errors. Also,
`reftable_stack_clean()` will prune stale tables which are not
referenced by "tables.list" anymore so that those files can eventually
be removed. And second, we never rewrite already-rewritten stacks, so it
does not matter that we cannot rename a file over an mmapped file,
either.

Another unfortunate property of mmap is that it is not supported by all
systems. But given that the size of reftables should typically be rather
limited (megabytes at most in the vast majority of repositories), we can
provide an easy fallback by just reading the complete table into memory
on such platforms. This is the same strategy that the "packed" backend
uses in case the platform does not provide mmap.

This change doesn't significantly improve performance in the case where
we seek through stacks only once (like e.g. git-for-each-ref(1) would),
but it does speed up use cases where there is lots of random access to
refs, e.g. when writing. The following benchmark demonstrates these
savings with git-update-ref(1) creating N refs in an otherwise empty
repository:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.6 ms ±   0.2 ms    [User: 2.7 ms, System: 2.8 ms]
    Range (min … max):     5.1 ms …   6.0 ms    90 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      15.1 ms ±   0.4 ms    [User: 7.1 ms, System: 8.0 ms]
    Range (min … max):    14.2 ms …  16.5 ms    71 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):     919.4 ms ±  11.2 ms    [User: 427.5 ms, System: 491.7 ms]
    Range (min … max):   908.1 ms … 936.6 ms    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.5 ms ±   0.3 ms    [User: 2.4 ms, System: 3.1 ms]
    Range (min … max):     5.0 ms …   7.3 ms    89 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      10.4 ms ±   0.3 ms    [User: 5.1 ms, System: 5.3 ms]
    Range (min … max):     9.7 ms …  11.0 ms    78 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     483.5 ms ±   6.3 ms    [User: 227.8 ms, System: 255.6 ms]
    Range (min … max):   477.5 ms … 498.8 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD) ran
      1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
      1.89 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      2.73 ± 0.16 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
     87.55 ± 4.65 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    166.48 ± 8.80 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)

Theoretically, we could also replicate the strategy of the "packed"
backend where small tables are read into memory instead of using mmap.
Benchmarks did not confirm that this has a performance benefit though.
---
 reftable/blocksource.c | 48 ++++++++++++++++++++++++++++++------------
 1 file changed, 35 insertions(+), 13 deletions(-)

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index a1ea304429..5d3f3d264c 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -13,6 +13,12 @@ license that can be found in the LICENSE file or at
 #include "reftable-blocksource.h"
 #include "reftable-error.h"
 
+#if defined(NO_MMAP)
+static int use_mmap = 0;
+#else
+static int use_mmap = 1;
+#endif
+
 static void strbuf_return_block(void *b, struct reftable_block *dest)
 {
 	if (dest->len)
@@ -78,6 +84,7 @@ struct reftable_block_source malloc_block_source(void)
 struct file_block_source {
 	int fd;
 	uint64_t size;
+	unsigned char *data;
 };
 
 static uint64_t file_size(void *b)
@@ -87,19 +94,23 @@ static uint64_t file_size(void *b)
 
 static void file_return_block(void *b, struct reftable_block *dest)
 {
-	if (dest->len)
-		memset(dest->data, 0xff, dest->len);
-	reftable_free(dest->data);
 }
 
-static void file_close(void *b)
+static void file_close(void *v)
 {
-	int fd = ((struct file_block_source *)b)->fd;
-	if (fd > 0) {
-		close(fd);
-		((struct file_block_source *)b)->fd = 0;
+	struct file_block_source *b = v;
+
+	if (b->fd >= 0) {
+		close(b->fd);
+		b->fd = -1;
 	}
 
+	if (use_mmap)
+		munmap(b->data, b->size);
+	else
+		reftable_free(b->data);
+	b->data = NULL;
+
 	reftable_free(b);
 }
 
@@ -108,9 +119,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
 {
 	struct file_block_source *b = v;
 	assert(off + size <= b->size);
-	dest->data = reftable_malloc(size);
-	if (pread_in_full(b->fd, dest->data, size, off) != size)
-		return -1;
+	dest->data = b->data + off;
 	dest->len = size;
 	return size;
 }
@@ -127,8 +136,10 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
 {
 	struct stat st = { 0 };
 	int err = 0;
-	int fd = open(name, O_RDONLY);
+	int fd;
 	struct file_block_source *p = NULL;
+
+	fd = open(name, O_RDONLY);
 	if (fd < 0) {
 		if (errno == ENOENT) {
 			return REFTABLE_NOT_EXIST_ERROR;
@@ -144,7 +155,18 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
 
 	p = reftable_calloc(sizeof(struct file_block_source));
 	p->size = st.st_size;
-	p->fd = fd;
+	if (use_mmap) {
+		p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+		p->fd = fd;
+	} else {
+		p->data = xmalloc(st.st_size);
+		if (read_in_full(fd, p->data, st.st_size) != st.st_size) {
+			close(fd);
+			return -1;
+		}
+		close(fd);
+		p->fd = -1;
+	}
 
 	assert(!bs->ops);
 	bs->ops = &file_vtable;
-- 
2.43.GIT



* Re: [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path
  2024-01-08 12:18 ` [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
@ 2024-01-10 17:30   ` Junio C Hamano
  2024-01-11  7:33     ` Patrick Steinhardt
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2024-01-10 17:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys

Patrick Steinhardt <ps@pks.im> writes:

> The `reftable_stack_reload_maybe_reuse()` function is responsible for
> reloading the reftable list from disk. The function is quite hard to
> follow though because it has a bunch of different exit paths, many of
> which have to free the same set of resources.
>
> Refactor the function to have a common exit path. While at it, touch up
> the style of this function a bit to match our usual coding style better.
> ---
>  reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
>  1 file changed, 42 insertions(+), 44 deletions(-)

Missing sign-off.

Other than that, I did not find anything questionable in the
conversion.  By sticking to the two simple invariants:

 - we use "err" as our return value when we jump to "out:"

 - we always keep "names" and "names_after" freeable, and free them
   when we jump to "out:".

the exit status and leak prevention are both very clear (and the
behaviour is not changed---it is not like there are any existing
leaks that are plugged by this restructuring of the loop).



* Re: [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor
  2024-01-08 12:18 ` [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
@ 2024-01-10 19:57   ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2024-01-10 19:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys

Patrick Steinhardt <ps@pks.im> writes:

> We're about to introduce a stat(3P)-based caching mechanism to reload
> the list of stacks only when it has changed. In order to avoid race
> conditions this requires us to have a file descriptor available that we
> can use to call fstat(3P) on.
>
> Prepare for this by converting the code to use `fd_read_lines()` so that
> we have the file descriptor readily available.
> ---

Missing sign-off.

Other than that, the change is a refactoring that is very faithful
to the original.  Instead of doing the "open - pretend we opened an
empty file if it does not exist - read - close" dance all inside the
read_lines() call, this sort-of open codes the helper in this caller,
so that later steps of this series can look at the file descriptor
open to it.

Looking good.

>  reftable/stack.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/reftable/stack.c b/reftable/stack.c
> index bf869a6772..b1ee247601 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -308,6 +308,7 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  	struct timeval deadline;
>  	int64_t delay = 0;
>  	int tries = 0, err;
> +	int fd = -1;
>  
>  	err = gettimeofday(&deadline, NULL);
>  	if (err < 0)
> @@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
>  			goto out;
>  
> -		err = read_lines(st->list_file, &names);
> -		if (err < 0)
> -			goto out;
> +		fd = open(st->list_file, O_RDONLY);
> +		if (fd < 0) {
> +			if (errno != ENOENT) {
> +				err = REFTABLE_IO_ERROR;
> +				goto out;
> +			}
> +
> +			names = reftable_calloc(sizeof(char *));
> +		} else {
> +			err = fd_read_lines(fd, &names);
> +			if (err < 0)
> +				goto out;
> +		}
>  
>  		err = reftable_stack_reload_once(st, names, reuse_open);
>  		if (!err)
> @@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		names = NULL;
>  		free_names(names_after);
>  		names_after = NULL;
> +		close(fd);
> +		fd = -1;
>  
>  		delay = delay + (delay * rand()) / RAND_MAX + 1;
>  		sleep_millisec(delay);
>  	}
>  
>  out:
> +	if (fd >= 0)
> +		close(fd);
>  	free_names(names);
>  	free_names(names_after);
>  	return err;


* Re: [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list
  2024-01-08 12:18 ` [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
@ 2024-01-10 20:07   ` Junio C Hamano
  2024-01-11  7:41     ` Patrick Steinhardt
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2024-01-10 20:07 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys

Patrick Steinhardt <ps@pks.im> writes:

> We can do better and use the same stat(3P)-based mechanism that the
> "packed" backend uses. Instead of reading the file, we will only open
> the file descriptor, fstat(3P) it, and then compare the info against the
> cached value from the last time we have updated the stack. This should
> always work alright because "tables.list" is updated atomically via a
> rename, so even if the ctime or mtime wasn't granular enough to identify
> a change, at least the inode number should have changed.

Or the file size.  Let's keep in mind that many users get useless
inum from their filesystem X-<.

>   Summary
>     update-ref: create many refs (refcount = 1, revision = HEAD~) ran
>       1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
>       2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
>       3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
>     163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
>     233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
> ---

Nice.

> @@ -374,6 +375,8 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		sleep_millisec(delay);
>  	}
>  
> +	stat_validity_update(&st->list_validity, fd);
> +
>  out:
>  	if (fd >= 0)
>  		close(fd);

The stat_validity_update() does not happen in the error codepath.

Should we be clearing the validity of the list when somebody jumps
to "out:" due to an error?  Or by the time this function gets
called, the caller would already have cleared the validity and an
error that jumps to "out:" keeps the list invalid?

Other than the missing sign-off, the change looks very straight-forward.


* Re: [PATCH 4/4] reftable/blocksource: use mmap to read tables
  2024-01-08 12:18 ` [PATCH 4/4] reftable/blocksource: use mmap to read tables Patrick Steinhardt
@ 2024-01-10 21:24   ` Junio C Hamano
  2024-01-11  9:21     ` Patrick Steinhardt
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2024-01-10 21:24 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys

Patrick Steinhardt <ps@pks.im> writes:

> Using mmap comes with a significant drawback on Windows though, because
> mmapped files cannot be deleted and neither is it possible to rename
> files onto an mmapped file. But for one, the reftable library gracefully
> handles the case where auto-compaction cannot delete a still-open stack
> already and ignores any such errors. Also, `reftable_stack_clean()` will
> prune stale tables which are not referenced by "tables.list" anymore so
> that those files can eventually be pruned. And second, we never rewrite
> already-rewritten stacks, so it does not matter that we cannot rename a
> file over an mmapped file, either.

I somehow thought that we use "read into an allocated memory and
pretend as if we mapped" emulation on Windows, so all of that may be
moot.

> diff --git a/reftable/blocksource.c b/reftable/blocksource.c
> index a1ea304429..5d3f3d264c 100644
> --- a/reftable/blocksource.c
> +++ b/reftable/blocksource.c
> @@ -13,6 +13,12 @@ license that can be found in the LICENSE file or at
>  #include "reftable-blocksource.h"
>  #include "reftable-error.h"
>  
> +#if defined(NO_MMAP)
> +static int use_mmap = 0;
> +#else
> +static int use_mmap = 1;
> +#endif

Is there (do you need) some externally controllable knob that the
user can use to turn this off on a system that is compiled without
NO_MMAP?  Otherwise it is misleading to have this as a variable.

> -static void file_close(void *b)
> +static void file_close(void *v)
>  {
> -	int fd = ((struct file_block_source *)b)->fd;
> -	if (fd > 0) {
> -		close(fd);
> -		((struct file_block_source *)b)->fd = 0;
> +	struct file_block_source *b = v;
> +
> +	if (b->fd >= 0) {
> +		close(b->fd);
> +		b->fd = -1;
>  	}
>  
> +	if (use_mmap)
> +		munmap(b->data, b->size);
> +	else
> +		reftable_free(b->data);
> +	b->data = NULL;
> +
>  	reftable_free(b);
>  }

It would have been nicer to do this kind of "a void pointer is taken
and we immediately cast it to a concrete and useful type upon entry"
clean-up as a separate step that is purely clean-up, if there were
many instances.  It is the first such one in the series as far as I
remember, so it is not a huge deal.

> @@ -108,9 +119,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
>  {
>  	struct file_block_source *b = v;
>  	assert(off + size <= b->size);
> -	dest->data = reftable_malloc(size);
> -	if (pread_in_full(b->fd, dest->data, size, off) != size)
> -		return -1;
> +	dest->data = b->data + off;
>  	dest->len = size;
>  	return size;
>  }

So, we used to delay the allocation and reading of a block until
this function gets called.  Now, by the time the control reaches
the function, we are expected to have the data handy at b->data.
That is ensured by reftable_block_source_from_file(), I presume?

> @@ -127,8 +136,10 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
>  {
>  	struct stat st = { 0 };
>  	int err = 0;
> -	int fd = open(name, O_RDONLY);
> +	int fd;
>  	struct file_block_source *p = NULL;
> +
> +	fd = open(name, O_RDONLY);
>  	if (fd < 0) {
>  		if (errno == ENOENT) {
>  			return REFTABLE_NOT_EXIST_ERROR;

This is a no-op clean-up that would have been better in a separate
clean-up step.  Again, not a huge deal.

> @@ -144,7 +155,18 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
>  
>  	p = reftable_calloc(sizeof(struct file_block_source));
>  	p->size = st.st_size;
> -	p->fd = fd;
> +	if (use_mmap) {
> +		p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +		p->fd = fd;

This is a bit unusual for our use of mmap() where the norm is to
close the file descriptor once we mapped (we only need the memory
region itself and not the originating file descriptor to unmap it).

Is there a reason why we need to keep p->fd?  Now the other side of
this if/else preallocates the whole thing and slurps everything in
core to allow the remainder of the code to mimic what happens on the
mmapped block memory (e.g., we saw how file_read_block() assumes that
b->data[0..b->size] are valid) and does not obviously need p->fd,
can we just remove .fd member from the file_block_source structure?

That way, file_close() can also lose the conditional close() call.

For that matter, do we need any codepath in this file that is
enabled by !use_mmap?  Can't we just use xmmap() and rely on its
"instead, we allocate, read into it and return" emulation?

Thanks.

> +	} else {
> +		p->data = xmalloc(st.st_size);
> +		if (read_in_full(fd, p->data, st.st_size) != st.st_size) {
> +			close(fd);
> +			return -1;
> +		}
> +		close(fd);
> +		p->fd = -1;
> +	}
>  
>  	assert(!bs->ops);
>  	bs->ops = &file_vtable;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path
  2024-01-10 17:30   ` Junio C Hamano
@ 2024-01-11  7:33     ` Patrick Steinhardt
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11  7:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys

On Wed, Jan 10, 2024 at 09:30:51AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The `reftable_stack_reload_maybe_reuse()` function is responsible for
> > reloading the reftable list from disk. The function is quite hard to
> > follow though because it has a bunch of different exit paths, many of
> > which have to free the same set of resources.
> >
> > Refactor the function to have a common exit path. While at it, touch up
> > the style of this function a bit to match our usual coding style better.
> > ---
> >  reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
> >  1 file changed, 42 insertions(+), 44 deletions(-)
> 
> Missing sign-off.

Ah, sorry, I forgot my usual `git rebase --signoff`. Will fix in v2.

Patrick

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list
  2024-01-10 20:07   ` Junio C Hamano
@ 2024-01-11  7:41     ` Patrick Steinhardt
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11  7:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys

On Wed, Jan 10, 2024 at 12:07:52PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We can do better and use the same stat(3P)-based mechanism that the
> > "packed" backend uses. Instead of reading the file, we will only open
> > the file descriptor, fstat(3P) it, and then compare the info against the
> > cached value from the last time we have updated the stack. This should
> > always work alright because "tables.list" is updated atomically via a
> > rename, so even if the ctime or mtime wasn't granular enough to identify
> > a change, at least the inode number should have changed.
> 
> Or the file size.  Let's keep in mind that many users get useless
> inum from their filesystem X-<.
> 
> >   Summary
> >     update-ref: create many refs (refcount = 1, revision = HEAD~) ran
> >       1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
> >       2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
> >       3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
> >     163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
> >     233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
> > ---
> 
> Nice.
> 
> > @@ -374,6 +375,8 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
> >  		sleep_millisec(delay);
> >  	}
> >  
> > +	stat_validity_update(&st->list_validity, fd);
> > +
> >  out:
> >  	if (fd >= 0)
> >  		close(fd);
> 
> The stat_validity_update() does not happen in the error codepath.
> 
> Should we be clearing the validity of the list when somebody jumps
> to "out:" due to an error?  Or by the time this function gets
> called, the caller would already have cleared the validity and an
> error that jumps to "out:" keeps the list invalid?

It likely does not matter much. This function only gets called when we
have previously determined the stack to be out of date already. So
unless we update the validity cache, we know that the next validity
check would continue to treat the stack as outdated.

That being said, I think it's good hygiene to clear the validity cache
regardless of that.
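
For illustration, the stat-based check boils down to roughly the
following sketch. The struct and helper names here are invented for the
example and do not match the actual `stat_validity` API in Git:

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Rough sketch of a stat-based validity cache. The names are made up
 * for illustration; Git's real helpers are stat_validity_check(),
 * stat_validity_update() and stat_validity_clear().
 */
struct validity_cache {
	int valid;
	ino_t ino;
	off_t size;
	time_t mtime;
};

/* Returns 1 if the file behind fd still matches the cached stat data. */
static int cache_matches(const struct validity_cache *c, int fd)
{
	struct stat st;

	if (!c->valid || fstat(fd, &st) < 0)
		return 0;
	return c->ino == st.st_ino &&
	       c->size == st.st_size &&
	       c->mtime == st.st_mtime;
}

/* Remember the current stat data; clear the cache if fstat(3P) fails. */
static void cache_update(struct validity_cache *c, int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0) {
		c->valid = 0;
		return;
	}
	c->valid = 1;
	c->ino = st.st_ino;
	c->size = st.st_size;
	c->mtime = st.st_mtime;
}
```

Clearing on the error path is then just a matter of resetting the
`valid` flag, which is what the hygiene argument amounts to.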

Patrick

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/4] reftable/blocksource: use mmap to read tables
  2024-01-10 21:24   ` Junio C Hamano
@ 2024-01-11  9:21     ` Patrick Steinhardt
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11  9:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys

On Wed, Jan 10, 2024 at 01:24:24PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Using mmap comes with a significant drawback on Windows though, because
> > mmapped files cannot be deleted and neither is it possible to rename
> > files onto an mmapped file. But for one, the reftable library gracefully
> > handles the case where auto-compaction cannot delete a still-open stack
> > already and ignores any such errors. Also, `reftable_stack_clean()` will
> > prune stale tables which are not referenced by "tables.list" anymore so
> > that those files can eventually be pruned. And second, we never rewrite
> > already-rewritten stacks, so it does not matter that we cannot rename a
> > file over an mmaped file, either.
> 
> I somehow thought that we use "read into allocated memory and
> pretend as if we mapped" emulation on Windows, so all of that may be
> moot.

Ah, you're right in fact. I missed the definition of `git_mmap()` and
`git_munmap()`.

> > diff --git a/reftable/blocksource.c b/reftable/blocksource.c
> > index a1ea304429..5d3f3d264c 100644
> > --- a/reftable/blocksource.c
> > +++ b/reftable/blocksource.c
> > @@ -13,6 +13,12 @@ license that can be found in the LICENSE file or at
> >  #include "reftable-blocksource.h"
> >  #include "reftable-error.h"
> >  
> > +#if defined(NO_MMAP)
> > +static int use_mmap = 0;
> > +#else
> > +static int use_mmap = 1;
> > +#endif
> 
> Is there (do you need) some externally controllable knob that the
> user can use to turn this off on a system that is compiled without
> NO_MMAP?  Otherwise it is misleading to have this as a variable.

No. The benefit of using a variable instead of using ifdefs is that we
know that both code paths continue to compile just fine. We do the same
thing in "refs/packed-backend.c".

But this code is not needed at all anyway, as you pointed out, because
we can simply use the emulated mmap(3P) code.
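
For reference, the fallback strategy amounts to roughly the following
sketch. `map_or_read()` is an invented name; the real fallback lives in
`git_mmap()`/`xmmap()`:

```c
#include <stdlib.h>
#include <unistd.h>
#ifndef NO_MMAP
#include <sys/mman.h>
#endif

/*
 * Sketch of "mmap with a read-based fallback": try to map the file,
 * and if that is unavailable or fails, allocate a buffer and read the
 * whole file into it instead.
 */
static void *map_or_read(int fd, size_t size)
{
#ifndef NO_MMAP
	void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map != MAP_FAILED)
		return map;
#endif
	{
		char *buf = malloc(size);
		size_t off = 0;

		if (!buf)
			return NULL;
		while (off < size) {
			ssize_t n = pread(fd, buf + off, size - off, off);
			if (n <= 0) {
				free(buf);
				return NULL;
			}
			off += (size_t)n;
		}
		return buf;
	}
}
```

Note that with this shape the caller has to remember which branch was
taken in order to release the memory correctly (munmap(3P) vs.
free(3)), which is exactly what the `use_mmap` flag in the patch
tracked and what the emulated mmap avoids.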

> > +static void file_close(void *v)
> >  {
> > -	int fd = ((struct file_block_source *)b)->fd;
> > -	if (fd > 0) {
> > -		close(fd);
> > -		((struct file_block_source *)b)->fd = 0;
> > +	struct file_block_source *b = v;
> > +
> > +	if (b->fd >= 0) {
> > +		close(b->fd);
> > +		b->fd = -1;
> >  	}
> >  
> > +	if (use_mmap)
> > +		munmap(b->data, b->size);
> > +	else
> > +		reftable_free(b->data);
> > +	b->data = NULL;
> > +
> >  	reftable_free(b);
> >  }
> 
> It would have been nicer to do this kind of "a void pointer is taken
> and we immediately cast it to a concrete and useful type upon entry"
> clean-up as a separate step that is purely clean-up, if there were
> many instances.  It is the first such one in the series as far as I
> remember, so it is not a huge deal.
> 
> > @@ -108,9 +119,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
> >  {
> >  	struct file_block_source *b = v;
> >  	assert(off + size <= b->size);
> > -	dest->data = reftable_malloc(size);
> > -	if (pread_in_full(b->fd, dest->data, size, off) != size)
> > -		return -1;
> > +	dest->data = b->data + off;
> >  	dest->len = size;
> >  	return size;
> >  }
> 
> So, we used to delay the allocation and reading of a block until
> this function gets called.  Now, by the time the control reaches
> the function, we are expected to have the data handy at b->data.
> That is ensured by reftable_block_source_from_file(), I presume?
> 
> > @@ -127,8 +136,10 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
> >  {
> >  	struct stat st = { 0 };
> >  	int err = 0;
> > -	int fd = open(name, O_RDONLY);
> > +	int fd;
> >  	struct file_block_source *p = NULL;
> > +
> > +	fd = open(name, O_RDONLY);
> >  	if (fd < 0) {
> >  		if (errno == ENOENT) {
> >  			return REFTABLE_NOT_EXIST_ERROR;
> 
> This is a no-op clean-up that would have been better in a separate
> clean-up step.  Again, not a huge deal.

Yeah, fair enough. One problem I have with the reftable library is that
its coding style doesn't quite blend in with the rest of the Git code
base, so I want to refactor code that I'm touching to match our coding
style better. This means that there will be a lot of such smallish
refactorings, and I'm often not sure whether to evict them into their
own commit or whether to do the refactoring in the same step.

For small changes like this one here I think it's reasonable to do so in
the same commit. But larger refactorings, like the introduction of the
common exit path in the first patch of this series, of course go into
their own commits.

> > @@ -144,7 +155,18 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
> >  
> >  	p = reftable_calloc(sizeof(struct file_block_source));
> >  	p->size = st.st_size;
> > -	p->fd = fd;
> > +	if (use_mmap) {
> > +		p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
> > +		p->fd = fd;
> 
> This is a bit unusual for our use of mmap() where the norm is to
> close the file descriptor once we mapped (we only need the memory
> region itself and not the originating file descriptor to unmap it).
> 
> Is there a reason why we need to keep p->fd?  Now the other side of
> this if/else preallocates the whole thing and slurps everything in
> core to allow the remainder of the code to mimic what happens on the
> mmaped block memory (e.g., we saw how file_read_block() assumes that
> b->data[0..b->size] are valid) and does not obviously need p->fd,
> can we just remove .fd member from the file_block_source structure?
> 
> That way, file_close() can also lose the conditional close() call.
> 
> For that matter, do we need any codepath in this file that is
> enabled by !use_mmap?  Can't we just use xmmap() and rely on its
> "instead, we allocate, read into it and return" emulation?

Not really, I'll update the code accordingly.

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 0/5] reftable: optimize I/O patterns
  2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-01-08 12:18 ` [PATCH 4/4] reftable/blocksource: use mmap to read tables Patrick Steinhardt
@ 2024-01-11 10:06 ` Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
                     ` (4 more replies)
  4 siblings, 5 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

Hi,

this is the second version of my patch series that optimizes some I/O
patterns that the reftable library uses. Changes compared to v1:

  - Added missing signoffs.

  - The validity cache is now cleared when reloading the reftable stack
    fails.

  - Style fixes for `reftable_block_source_from_file()` are now in a
    separate commit.

  - We now use `xmmap()` exclusively, relying on its fallback code on
    NO_MMAP platforms.

  - The file descriptor is closed immediately after mmapping now.

Thanks for your review, Junio!

Patrick

Patrick Steinhardt (5):
  reftable/stack: refactor stack reloading to have common exit path
  reftable/stack: refactor reloading to use file descriptor
  reftable/stack: use stat info to avoid re-reading stack list
  reftable/blocksource: refactor code to match our coding style
  reftable/blocksource: use mmap to read tables

 reftable/blocksource.c |  39 ++++++--------
 reftable/stack.c       | 113 +++++++++++++++++++++++++----------------
 reftable/stack.h       |   1 +
 reftable/system.h      |   1 +
 4 files changed, 85 insertions(+), 69 deletions(-)

Range-diff against v1:
1:  01ece2626d ! 1:  4b7f52c415 reftable/stack: refactor stack reloading to have common exit path
    @@ Commit message
         Refactor the function to have a common exit path. While at it, touch up
         the style of this function a bit to match our usual coding style better.
     
    +    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    +
      ## reftable/stack.c ##
     @@ reftable/stack.c: static int tv_cmp(struct timeval *a, struct timeval *b)
      static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
2:  726d302d7b ! 2:  36b9f6b624 reftable/stack: refactor reloading to use file descriptor
    @@ Commit message
         Prepare for this by converting the code to use `fd_read_lines()` so that
         we have the file descriptor readily available.
     
    +    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    +
      ## reftable/stack.c ##
     @@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
      	struct timeval deadline;
3:  4fabdc3d80 ! 3:  c0f7cec95b reftable/stack: use stat info to avoid re-reading stack list
    @@ Commit message
         cached value from the last time we have updated the stack. This should
         always work alright because "tables.list" is updated atomically via a
         rename, so even if the ctime or mtime wasn't granular enough to identify
    -    a change, at least the inode number should have changed.
    +    a change, at least the inode number or file size should have changed.
     
         This change significantly speeds up operations where many refs are read,
         like when using git-update-ref(1). The following benchmark creates N
         refs in an otherwise-empty repository via `git update-ref --stdin`:
     
           Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    -        Time (mean ± σ):       5.6 ms ±   0.2 ms    [User: 2.5 ms, System: 3.0 ms]
    -        Range (min … max):     5.2 ms …   6.0 ms    89 runs
    +        Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    +        Range (min … max):     4.8 ms …   7.2 ms    109 runs
     
           Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    -        Time (mean ± σ):      19.2 ms ±   0.3 ms    [User: 8.6 ms, System: 10.4 ms]
    -        Range (min … max):    18.4 ms …  19.8 ms    63 runs
    +        Time (mean ± σ):      19.1 ms ±   0.9 ms    [User: 8.9 ms, System: 9.9 ms]
    +        Range (min … max):    18.4 ms …  26.7 ms    72 runs
     
           Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    -        Time (mean ± σ):      1.310 s ±  0.014 s    [User: 0.566 s, System: 0.724 s]
    -        Range (min … max):    1.291 s …  1.342 s    10 runs
    +        Time (mean ± σ):      1.336 s ±  0.018 s    [User: 0.590 s, System: 0.724 s]
    +        Range (min … max):    1.314 s …  1.373 s    10 runs
     
           Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    -        Time (mean ± σ):       5.7 ms ±   0.4 ms    [User: 2.6 ms, System: 3.1 ms]
    -        Range (min … max):     5.1 ms …   9.5 ms    91 runs
    +        Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    +        Range (min … max):     4.8 ms …   7.2 ms    109 runs
     
           Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    -        Time (mean ± σ):      15.2 ms ±   0.3 ms    [User: 7.0 ms, System: 8.1 ms]
    -        Range (min … max):    14.3 ms …  17.1 ms    71 runs
    +        Time (mean ± σ):      14.8 ms ±   0.2 ms    [User: 7.1 ms, System: 7.5 ms]
    +        Range (min … max):    14.2 ms …  15.2 ms    82 runs
     
           Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    -        Time (mean ± σ):     916.1 ms ±  11.0 ms    [User: 420.8 ms, System: 495.0 ms]
    -        Range (min … max):   906.9 ms … 943.8 ms    10 runs
    +        Time (mean ± σ):     927.6 ms ±   5.3 ms    [User: 437.8 ms, System: 489.5 ms]
    +        Range (min … max):   919.4 ms … 936.4 ms    10 runs
     
           Summary
    -        update-ref: create many refs (refcount = 1, revision = HEAD~) ran
    -          1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
    -          2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
    -          3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    -        163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    -        233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
    +        update-ref: create many refs (refcount = 1, revision = HEAD) ran
    +          1.00 ± 0.07 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
    +          2.89 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
    +          3.74 ± 0.25 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    +        181.26 ± 8.30 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    +        261.01 ± 12.35 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
    +
    +    Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## reftable/stack.c ##
     @@ reftable/stack.c: void reftable_stack_destroy(struct reftable_stack *st)
    @@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_s
     +	stat_validity_update(&st->list_validity, fd);
     +
      out:
    ++	if (err)
    ++		stat_validity_clear(&st->list_validity);
      	if (fd >= 0)
      		close(fd);
    + 	free_names(names);
     @@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
      static int stack_uptodate(struct reftable_stack *st)
      {
-:  ---------- > 4:  96e79dc3ba reftable/blocksource: refactor code to match our coding style
4:  a23f38a806 ! 5:  e53a20c8e1 reftable/blocksource: use mmap to read tables
    @@ Commit message
         already and ignores any such errors. Also, `reftable_stack_clean()` will
         prune stale tables which are not referenced by "tables.list" anymore so
         that those files can eventually be pruned. And second, we never rewrite
    -    already-rewritten stacks, so it does not matter that we cannot rename a
    +    already-written stacks, so it does not matter that we cannot rename a
         file over an mmaped file, either.
     
         Another unfortunate property of mmap is that it is not supported by all
         systems. But given that the size of reftables should typically be rather
         limited (megabytes at most in the vast majority of repositories), we can
    -    provide an easy fallback by just reading the complete table into memory
    -    on such platforms. This is the same strategy that the "packed" backend
    -    uses in case the platform does not provide mmap.
    +    use the fallback implementation provided by `git_mmap()` which reads the
    +    whole file into memory instead. This is the same strategy that the
    +    "packed" backend uses.
     
         While this change doesn't significantly improve performance in the case
         where we're seeking through stacks once (like e.g. git-for-each-ref(1)
    @@ Commit message
         empty repository:
     
           Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    -        Time (mean ± σ):       5.6 ms ±   0.2 ms    [User: 2.7 ms, System: 2.8 ms]
    -        Range (min … max):     5.1 ms …   6.0 ms    90 runs
    +        Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.5 ms, System: 2.5 ms]
    +        Range (min … max):     4.8 ms …   7.1 ms    111 runs
     
           Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    -        Time (mean ± σ):      15.1 ms ±   0.4 ms    [User: 7.1 ms, System: 8.0 ms]
    -        Range (min … max):    14.2 ms …  16.5 ms    71 runs
    +        Time (mean ± σ):      14.8 ms ±   0.5 ms    [User: 7.1 ms, System: 7.5 ms]
    +        Range (min … max):    14.1 ms …  18.7 ms    84 runs
     
           Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    -        Time (mean ± σ):     919.4 ms ±  11.2 ms    [User: 427.5 ms, System: 491.7 ms]
    -        Range (min … max):   908.1 ms … 936.6 ms    10 runs
    +        Time (mean ± σ):     926.4 ms ±   5.6 ms    [User: 448.5 ms, System: 477.7 ms]
    +        Range (min … max):   920.0 ms … 936.1 ms    10 runs
     
           Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    -        Time (mean ± σ):       5.5 ms ±   0.3 ms    [User: 2.4 ms, System: 3.1 ms]
    -        Range (min … max):     5.0 ms …   7.3 ms    89 runs
    +        Time (mean ± σ):       5.0 ms ±   0.2 ms    [User: 2.4 ms, System: 2.5 ms]
    +        Range (min … max):     4.7 ms …   5.4 ms    111 runs
     
           Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    -        Time (mean ± σ):      10.4 ms ±   0.3 ms    [User: 5.1 ms, System: 5.3 ms]
    -        Range (min … max):     9.7 ms …  11.0 ms    78 runs
    +        Time (mean ± σ):      10.5 ms ±   0.2 ms    [User: 5.0 ms, System: 5.3 ms]
    +        Range (min … max):    10.0 ms …  10.9 ms    93 runs
     
           Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    -        Time (mean ± σ):     483.5 ms ±   6.3 ms    [User: 227.8 ms, System: 255.6 ms]
    -        Range (min … max):   477.5 ms … 498.8 ms    10 runs
    +        Time (mean ± σ):     529.6 ms ±   9.1 ms    [User: 268.0 ms, System: 261.4 ms]
    +        Range (min … max):   522.4 ms … 547.1 ms    10 runs
     
           Summary
             update-ref: create many refs (refcount = 1, revision = HEAD) ran
               1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
    -          1.89 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
    -          2.73 ± 0.16 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    -         87.55 ± 4.65 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    -        166.48 ± 8.80 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
    +          2.08 ± 0.07 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
    +          2.95 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    +        105.33 ± 3.76 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    +        184.24 ± 5.89 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
     
         Theoretically, we could also replicate the strategy of the "packed"
         backend where small tables are read into memory instead of using mmap.
         Benchmarks did not confirm that this has a performance benefit though.
     
    +    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    +
      ## reftable/blocksource.c ##
    -@@ reftable/blocksource.c: license that can be found in the LICENSE file or at
    - #include "reftable-blocksource.h"
    - #include "reftable-error.h"
    - 
    -+#if defined(NO_MMAP)
    -+static int use_mmap = 0;
    -+#else
    -+static int use_mmap = 1;
    -+#endif
    -+
    - static void strbuf_return_block(void *b, struct reftable_block *dest)
    - {
    - 	if (dest->len)
     @@ reftable/blocksource.c: struct reftable_block_source malloc_block_source(void)
    + }
    + 
      struct file_block_source {
    - 	int fd;
    +-	int fd;
      	uint64_t size;
     +	unsigned char *data;
      };
    @@ reftable/blocksource.c: static uint64_t file_size(void *b)
     -	if (fd > 0) {
     -		close(fd);
     -		((struct file_block_source *)b)->fd = 0;
    +-	}
    +-
     +	struct file_block_source *b = v;
    -+
    -+	if (b->fd >= 0) {
    -+		close(b->fd);
    -+		b->fd = -1;
    - 	}
    - 
    -+	if (use_mmap)
    -+		munmap(b->data, b->size);
    -+	else
    -+		reftable_free(b->data);
    -+	b->data = NULL;
    -+
    ++	munmap(b->data, b->size);
      	reftable_free(b);
      }
      
    @@ reftable/blocksource.c: static int file_read_block(void *v, struct reftable_bloc
      	return size;
      }
     @@ reftable/blocksource.c: int reftable_block_source_from_file(struct reftable_block_source *bs,
    - {
    - 	struct stat st = { 0 };
    - 	int err = 0;
    --	int fd = open(name, O_RDONLY);
    -+	int fd;
    - 	struct file_block_source *p = NULL;
    -+
    -+	fd = open(name, O_RDONLY);
    - 	if (fd < 0) {
    - 		if (errno == ENOENT) {
    - 			return REFTABLE_NOT_EXIST_ERROR;
    -@@ reftable/blocksource.c: int reftable_block_source_from_file(struct reftable_block_source *bs,
      
    - 	p = reftable_calloc(sizeof(struct file_block_source));
    + 	p = reftable_calloc(sizeof(*p));
      	p->size = st.st_size;
     -	p->fd = fd;
    -+	if (use_mmap) {
    -+		p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    -+		p->fd = fd;
    -+	} else {
    -+		p->data = xmalloc(st.st_size);
    -+		if (read_in_full(fd, p->data, st.st_size) != st.st_size) {
    -+			close(fd);
    -+			return -1;
    -+		}
    -+		close(fd);
    -+		p->fd = -1;
    -+	}
    ++	p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    ++	close(fd);
      
      	assert(!bs->ops);
      	bs->ops = &file_vtable;

base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
-- 
2.43.GIT


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
@ 2024-01-11 10:06   ` Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

The `reftable_stack_reload_maybe_reuse()` function is responsible for
reloading the reftable list from disk. The function is quite hard to
follow though because it has a bunch of different exit paths, many of
which have to free the same set of resources.

Refactor the function to have a common exit path. While at it, touch up
the style of this function a bit to match our usual coding style better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
 1 file changed, 42 insertions(+), 44 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 16bab82063..bf869a6772 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -304,69 +304,67 @@ static int tv_cmp(struct timeval *a, struct timeval *b)
 static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 					     int reuse_open)
 {
-	struct timeval deadline = { 0 };
-	int err = gettimeofday(&deadline, NULL);
+	char **names = NULL, **names_after = NULL;
+	struct timeval deadline;
 	int64_t delay = 0;
-	int tries = 0;
-	if (err < 0)
-		return err;
+	int tries = 0, err;
 
+	err = gettimeofday(&deadline, NULL);
+	if (err < 0)
+		goto out;
 	deadline.tv_sec += 3;
+
 	while (1) {
-		char **names = NULL;
-		char **names_after = NULL;
-		struct timeval now = { 0 };
-		int err = gettimeofday(&now, NULL);
-		int err2 = 0;
-		if (err < 0) {
-			return err;
-		}
+		struct timeval now;
 
-		/* Only look at deadlines after the first few times. This
-		   simplifies debugging in GDB */
+		err = gettimeofday(&now, NULL);
+		if (err < 0)
+			goto out;
+
+		/*
+		 * Only look at deadlines after the first few times. This
+		 * simplifies debugging in GDB.
+		 */
 		tries++;
-		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
-			break;
-		}
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
+			goto out;
 
 		err = read_lines(st->list_file, &names);
-		if (err < 0) {
-			free_names(names);
-			return err;
-		}
+		if (err < 0)
+			goto out;
+
 		err = reftable_stack_reload_once(st, names, reuse_open);
-		if (err == 0) {
-			free_names(names);
+		if (!err)
 			break;
-		}
-		if (err != REFTABLE_NOT_EXIST_ERROR) {
-			free_names(names);
-			return err;
-		}
-
-		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
-		   writer. Check if there was one by checking if the name list
-		   changed.
-		*/
-		err2 = read_lines(st->list_file, &names_after);
-		if (err2 < 0) {
-			free_names(names);
-			return err2;
-		}
-
+		if (err != REFTABLE_NOT_EXIST_ERROR)
+			goto out;
+
+		/*
+		 * REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		 * writer. Check if there was one by checking if the name list
+		 * changed.
+		 */
+		err = read_lines(st->list_file, &names_after);
+		if (err < 0)
+			goto out;
 		if (names_equal(names_after, names)) {
-			free_names(names);
-			free_names(names_after);
-			return err;
+			err = REFTABLE_NOT_EXIST_ERROR;
+			goto out;
 		}
+
 		free_names(names);
+		names = NULL;
 		free_names(names_after);
+		names_after = NULL;
 
 		delay = delay + (delay * rand()) / RAND_MAX + 1;
 		sleep_millisec(delay);
 	}
 
-	return 0;
+out:
+	free_names(names);
+	free_names(names_after);
+	return err;
 }
 
 /* -1 = error
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
@ 2024-01-11 10:06   ` Patrick Steinhardt
  2024-01-14 10:14     ` Jeff King
  2024-01-11 10:06   ` [PATCH v2 3/5] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1900 bytes --]

We're about to introduce a stat(3P)-based caching mechanism to reload
the list of tables only when it has changed. In order to avoid race
conditions this requires us to have a file descriptor available that we
can use to call fstat(3P) on.

Prepare for this by converting the code to use `fd_read_lines()` so that
we have the file descriptor readily available.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index bf869a6772..b1ee247601 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -308,6 +308,7 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 	struct timeval deadline;
 	int64_t delay = 0;
 	int tries = 0, err;
+	int fd = -1;
 
 	err = gettimeofday(&deadline, NULL);
 	if (err < 0)
@@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
 			goto out;
 
-		err = read_lines(st->list_file, &names);
-		if (err < 0)
-			goto out;
+		fd = open(st->list_file, O_RDONLY);
+		if (fd < 0) {
+			if (errno != ENOENT) {
+				err = REFTABLE_IO_ERROR;
+				goto out;
+			}
+
+			names = reftable_calloc(sizeof(char *));
+		} else {
+			err = fd_read_lines(fd, &names);
+			if (err < 0)
+				goto out;
+		}
 
 		err = reftable_stack_reload_once(st, names, reuse_open);
 		if (!err)
@@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		names = NULL;
 		free_names(names_after);
 		names_after = NULL;
+		close(fd);
+		fd = -1;
 
 		delay = delay + (delay * rand()) / RAND_MAX + 1;
 		sleep_millisec(delay);
 	}
 
 out:
+	if (fd >= 0)
+		close(fd);
 	free_names(names);
 	free_names(names_after);
 	return err;
-- 
2.43.GIT


* [PATCH v2 3/5] reftable/stack: use stat info to avoid re-reading stack list
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
@ 2024-01-11 10:06   ` Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 4/5] reftable/blocksource: refactor code to match our coding style Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 5/5] reftable/blocksource: use mmap to read tables Patrick Steinhardt
  4 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

Whenever we call into the refs interfaces we potentially have to reload
refs in case they have been concurrently modified, either in-process or
externally. While this happens somewhat automatically for loose refs
because we simply try to re-read the files, the "packed" backend will
reload its snapshot of the packed-refs file in case its stat info has
changed since last reading it.

In the reftable backend we have a similar mechanism that is provided by
`reftable_stack_reload()`. This function will read the list of tables
from "tables.list" and, if it has changed from the currently stored
list, reload the stack. This is quite inefficient though, as we have
to check whether the stack is up-to-date on basically every read and
thus keep on re-reading the file all the time even if it didn't change
at all.

We can do better and use the same stat(3P)-based mechanism that the
"packed" backend uses. Instead of reading the file, we will only open
the file descriptor, fstat(3P) it, and then compare the info against the
cached value from the last time we have updated the stack. This should
always work alright because "tables.list" is updated atomically via a
rename, so even if the ctime or mtime wasn't granular enough to identify
a change, at least the inode number or file size should have changed.

This change significantly speeds up operations where many refs are read,
like when using git-update-ref(1). The following benchmark creates N
refs in an otherwise-empty repository via `git update-ref --stdin`:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    Range (min … max):     4.8 ms …   7.2 ms    109 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      19.1 ms ±   0.9 ms    [User: 8.9 ms, System: 9.9 ms]
    Range (min … max):    18.4 ms …  26.7 ms    72 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):      1.336 s ±  0.018 s    [User: 0.590 s, System: 0.724 s]
    Range (min … max):    1.314 s …  1.373 s    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.4 ms, System: 2.6 ms]
    Range (min … max):     4.8 ms …   7.2 ms    109 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      14.8 ms ±   0.2 ms    [User: 7.1 ms, System: 7.5 ms]
    Range (min … max):    14.2 ms …  15.2 ms    82 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     927.6 ms ±   5.3 ms    [User: 437.8 ms, System: 489.5 ms]
    Range (min … max):   919.4 ms … 936.4 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD) ran
      1.00 ± 0.07 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
      2.89 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      3.74 ± 0.25 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    181.26 ± 8.30 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    261.01 ± 12.35 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c  | 12 +++++++++++-
 reftable/stack.h  |  1 +
 reftable/system.h |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index b1ee247601..c28d82299d 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -175,6 +175,7 @@ void reftable_stack_destroy(struct reftable_stack *st)
 		st->readers_len = 0;
 		FREE_AND_NULL(st->readers);
 	}
+	stat_validity_clear(&st->list_validity);
 	FREE_AND_NULL(st->list_file);
 	FREE_AND_NULL(st->reftable_dir);
 	reftable_free(st);
@@ -374,7 +375,11 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 		sleep_millisec(delay);
 	}
 
+	stat_validity_update(&st->list_validity, fd);
+
 out:
+	if (err)
+		stat_validity_clear(&st->list_validity);
 	if (fd >= 0)
 		close(fd);
 	free_names(names);
@@ -388,8 +393,13 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
 static int stack_uptodate(struct reftable_stack *st)
 {
 	char **names = NULL;
-	int err = read_lines(st->list_file, &names);
+	int err;
 	int i = 0;
+
+	if (stat_validity_check(&st->list_validity, st->list_file))
+		return 0;
+
+	err = read_lines(st->list_file, &names);
 	if (err < 0)
 		return err;
 
diff --git a/reftable/stack.h b/reftable/stack.h
index f57005846e..3f80cc598a 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -14,6 +14,7 @@ license that can be found in the LICENSE file or at
 #include "reftable-stack.h"
 
 struct reftable_stack {
+	struct stat_validity list_validity;
 	char *list_file;
 	char *reftable_dir;
 	int disable_auto_compact;
diff --git a/reftable/system.h b/reftable/system.h
index 6b74a81514..2cc7adf271 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -12,6 +12,7 @@ license that can be found in the LICENSE file or at
 /* This header glues the reftable library to the rest of Git */
 
 #include "git-compat-util.h"
+#include "statinfo.h"
 #include "strbuf.h"
 #include "hash-ll.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/
-- 
2.43.GIT


* [PATCH v2 4/5] reftable/blocksource: refactor code to match our coding style
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-01-11 10:06   ` [PATCH v2 3/5] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
@ 2024-01-11 10:06   ` Patrick Steinhardt
  2024-01-11 10:06   ` [PATCH v2 5/5] reftable/blocksource: use mmap to read tables Patrick Steinhardt
  4 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

Refactor `reftable_block_source_from_file()` to match our coding style
better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/blocksource.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index a1ea304429..1e2fb5e9fd 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -125,24 +125,23 @@ static struct reftable_block_source_vtable file_vtable = {
 int reftable_block_source_from_file(struct reftable_block_source *bs,
 				    const char *name)
 {
-	struct stat st = { 0 };
-	int err = 0;
-	int fd = open(name, O_RDONLY);
-	struct file_block_source *p = NULL;
+	struct file_block_source *p;
+	struct stat st;
+	int fd;
+
+	fd = open(name, O_RDONLY);
 	if (fd < 0) {
-		if (errno == ENOENT) {
+		if (errno == ENOENT)
 			return REFTABLE_NOT_EXIST_ERROR;
-		}
 		return -1;
 	}
 
-	err = fstat(fd, &st);
-	if (err < 0) {
+	if (fstat(fd, &st) < 0) {
 		close(fd);
 		return REFTABLE_IO_ERROR;
 	}
 
-	p = reftable_calloc(sizeof(struct file_block_source));
+	p = reftable_calloc(sizeof(*p));
 	p->size = st.st_size;
 	p->fd = fd;
 
-- 
2.43.GIT


* [PATCH v2 5/5] reftable/blocksource: use mmap to read tables
  2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-01-11 10:06   ` [PATCH v2 4/5] reftable/blocksource: refactor code to match our coding style Patrick Steinhardt
@ 2024-01-11 10:06   ` Patrick Steinhardt
  4 siblings, 0 replies; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano

The blocksource interface provides a mechanism to read blocks from a
reftable table. This interface is implemented using read(3P) calls on
the underlying file descriptor. While this works alright, this pattern
is very inefficient when repeatedly querying the reftable stack for one
or more refs. This inefficiency can mostly be attributed to the fact
that we often need to re-read the same blocks over and over again, and
every single time we need to call read(3P) again.

A natural fit in this context is to use mmap(3P) instead of read(3P),
which has a bunch of benefits:

  - We do not need to come up with a caching strategy for some of the
    blocks as this will be handled by the kernel already.

  - We can avoid the overhead of having to call into the read(3P)
    syscall repeatedly.

  - We do not need to allocate returned blocks repeatedly, but can
    instead hand out pointers into the mmapped region directly.

Using mmap comes with a significant drawback on Windows though, because
mmapped files cannot be deleted and neither is it possible to rename
files onto an mmapped file. But for one, the reftable library already
handles the case where auto-compaction cannot delete a still-open stack
gracefully and ignores any such errors. Also, `reftable_stack_clean()`
will remove stale tables which are not referenced by "tables.list"
anymore so that those files can eventually go away. And second, we never
rewrite already-written tables, so it does not matter that we cannot
rename a file over an mmapped file, either.

Another unfortunate property of mmap is that it is not supported by all
systems. But given that the size of reftables should typically be rather
limited (megabytes at most in the vast majority of repositories), we can
use the fallback implementation provided by `git_mmap()` which reads the
whole file into memory instead. This is the same strategy that the
"packed" backend uses.

This change doesn't significantly improve performance in the case
where we're seeking through stacks only once (like e.g.
git-for-each-ref(1) would). But it does speed up use cases where there
is lots of random access to refs, e.g. when writing. The following
benchmark demonstrates
these savings with git-update-ref(1) creating N refs in an otherwise
empty repository:

  Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
    Time (mean ± σ):       5.1 ms ±   0.2 ms    [User: 2.5 ms, System: 2.5 ms]
    Range (min … max):     4.8 ms …   7.1 ms    111 runs

  Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
    Time (mean ± σ):      14.8 ms ±   0.5 ms    [User: 7.1 ms, System: 7.5 ms]
    Range (min … max):    14.1 ms …  18.7 ms    84 runs

  Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
    Time (mean ± σ):     926.4 ms ±   5.6 ms    [User: 448.5 ms, System: 477.7 ms]
    Range (min … max):   920.0 ms … 936.1 ms    10 runs

  Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
    Time (mean ± σ):       5.0 ms ±   0.2 ms    [User: 2.4 ms, System: 2.5 ms]
    Range (min … max):     4.7 ms …   5.4 ms    111 runs

  Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
    Time (mean ± σ):      10.5 ms ±   0.2 ms    [User: 5.0 ms, System: 5.3 ms]
    Range (min … max):    10.0 ms …  10.9 ms    93 runs

  Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
    Time (mean ± σ):     529.6 ms ±   9.1 ms    [User: 268.0 ms, System: 261.4 ms]
    Range (min … max):   522.4 ms … 547.1 ms    10 runs

  Summary
    update-ref: create many refs (refcount = 1, revision = HEAD) ran
      1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
      2.08 ± 0.07 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
      2.95 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
    105.33 ± 3.76 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
    184.24 ± 5.89 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)

Theoretically, we could also replicate the strategy of the "packed"
backend where small tables are read into memory instead of using mmap.
Benchmarks did not confirm that this has a performance benefit though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/blocksource.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 1e2fb5e9fd..8c41e3c70f 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -76,8 +76,8 @@ struct reftable_block_source malloc_block_source(void)
 }
 
 struct file_block_source {
-	int fd;
 	uint64_t size;
+	unsigned char *data;
 };
 
 static uint64_t file_size(void *b)
@@ -87,19 +87,12 @@ static uint64_t file_size(void *b)
 
 static void file_return_block(void *b, struct reftable_block *dest)
 {
-	if (dest->len)
-		memset(dest->data, 0xff, dest->len);
-	reftable_free(dest->data);
 }
 
-static void file_close(void *b)
+static void file_close(void *v)
 {
-	int fd = ((struct file_block_source *)b)->fd;
-	if (fd > 0) {
-		close(fd);
-		((struct file_block_source *)b)->fd = 0;
-	}
-
+	struct file_block_source *b = v;
+	munmap(b->data, b->size);
 	reftable_free(b);
 }
 
@@ -108,9 +101,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
 {
 	struct file_block_source *b = v;
 	assert(off + size <= b->size);
-	dest->data = reftable_malloc(size);
-	if (pread_in_full(b->fd, dest->data, size, off) != size)
-		return -1;
+	dest->data = b->data + off;
 	dest->len = size;
 	return size;
 }
@@ -143,7 +134,8 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
 
 	p = reftable_calloc(sizeof(*p));
 	p->size = st.st_size;
-	p->fd = fd;
+	p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
 
 	assert(!bs->ops);
 	bs->ops = &file_vtable;
-- 
2.43.GIT


* Re: [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
  2024-01-11 10:06   ` [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
@ 2024-01-14 10:14     ` Jeff King
  2024-01-15 10:03       ` Patrick Steinhardt
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2024-01-14 10:14 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys, Junio C Hamano

On Thu, Jan 11, 2024 at 11:06:43AM +0100, Patrick Steinhardt wrote:

> We're about to introduce a stat(3P)-based caching mechanism to reload
> the list of tables only when it has changed. In order to avoid race
> conditions this requires us to have a file descriptor available that we
> can use to call fstat(3P) on.
> 
> Prepare for this by converting the code to use `fd_read_lines()` so that
> we have the file descriptor readily available.

Coverity noted a case with this series where we might feed a negative
value to fstat(). I'm not sure if it's a bug or not.

The issue is that here:

> @@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
>  			goto out;
>  
> -		err = read_lines(st->list_file, &names);
> -		if (err < 0)
> -			goto out;
> +		fd = open(st->list_file, O_RDONLY);
> +		if (fd < 0) {
> +			if (errno != ENOENT) {
> +				err = REFTABLE_IO_ERROR;
> +				goto out;
> +			}
> +
> +			names = reftable_calloc(sizeof(char *));
> +		} else {
> +			err = fd_read_lines(fd, &names);
> +			if (err < 0)
> +				goto out;
> +		}

...we might end up with fd as "-1" after calling open() on the list
file. For most errors we'll jump to "out", which makes sense. But if we
get ENOENT, we keep going with an empty file-list, which makes sense.

But we then do other stuff with "fd". I think this case is OK:

> @@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		names = NULL;
>  		free_names(names_after);
>  		names_after = NULL;
> +		close(fd);
> +		fd = -1;

We only get here if reftable_stack_reload_once() returned an error,
which it won't do since we feed it a blank set of names (and anyway,
close(-1) is a harmless noop).

But if we actually get to the end of the function, it's more
questionable. As of this patch, it's OK:

>  		delay = delay + (delay * rand()) / RAND_MAX + 1;
>  		sleep_millisec(delay);
>  	}
>  
>  out:
> +	if (fd >= 0)
> +		close(fd);
>  	free_names(names);
>  	free_names(names_after);
>  	return err;

But in the next patch we have this hunk:

> @@ -374,7 +375,11 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>                 sleep_millisec(delay);
>         }
> 
> +       stat_validity_update(&st->list_validity, fd);
> +
>  out:
> +       if (err)
> +               stat_validity_clear(&st->list_validity);
>         if (fd >= 0)
>                 close(fd);
>         free_names(names);

which means we'll feed a negative value to stat_validity_update(). I
think this may be OK, because I'd imagine the only sensible thing to do
is call stat_validity_clear() instead. And using a negative fd means
fstat() will fail, which will cause stat_validity_update() to clear the
validity struct anyway. But I thought it was worth double-checking.

-Peff

* Re: [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
  2024-01-14 10:14     ` Jeff King
@ 2024-01-15 10:03       ` Patrick Steinhardt
  2024-01-16 15:14         ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Patrick Steinhardt @ 2024-01-15 10:03 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Han-Wen Nienhuys, Junio C Hamano

On Sun, Jan 14, 2024 at 05:14:24AM -0500, Jeff King wrote:
> On Thu, Jan 11, 2024 at 11:06:43AM +0100, Patrick Steinhardt wrote:
> 
> > We're about to introduce a stat(3P)-based caching mechanism to reload
> > the list of tables only when it has changed. In order to avoid race
> > conditions this requires us to have a file descriptor available that we
> > can use to call fstat(3P) on.
> > 
> > Prepare for this by converting the code to use `fd_read_lines()` so that
> > we have the file descriptor readily available.
> 
> Coverity noted a case with this series where we might feed a negative
> value to fstat(). I'm not sure if it's a bug or not.
> 
> The issue is that here:
> 
> > @@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
> >  		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
> >  			goto out;
> >  
> > -		err = read_lines(st->list_file, &names);
> > -		if (err < 0)
> > -			goto out;
> > +		fd = open(st->list_file, O_RDONLY);
> > +		if (fd < 0) {
> > +			if (errno != ENOENT) {
> > +				err = REFTABLE_IO_ERROR;
> > +				goto out;
> > +			}
> > +
> > +			names = reftable_calloc(sizeof(char *));
> > +		} else {
> > +			err = fd_read_lines(fd, &names);
> > +			if (err < 0)
> > +				goto out;
> > +		}
> 
> ...we might end up with fd as "-1" after calling open() on the list
> file. For most errors we'll jump to "out", which makes sense. But if we
> get ENOENT, we keep going with an empty file-list, which makes sense.
> 
> But we then do other stuff with "fd". I think this case is OK:
> 
> > @@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
> >  		names = NULL;
> >  		free_names(names_after);
> >  		names_after = NULL;
> > +		close(fd);
> > +		fd = -1;
> 
> We only get here if reftable_stack_reload_once() returned an error,
> which it won't do since we feed it a blank set of names (and anyway,
> close(-1) is a harmless noop).
> 
> But if we actually get to the end of the function, it's more
> questionable. As of this patch, it's OK:
> 
> >  		delay = delay + (delay * rand()) / RAND_MAX + 1;
> >  		sleep_millisec(delay);
> >  	}
> >  
> >  out:
> > +	if (fd >= 0)
> > +		close(fd);
> >  	free_names(names);
> >  	free_names(names_after);
> >  	return err;
> 
> But in the next patch we have this hunk:
> 
> > @@ -374,7 +375,11 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
> >                 sleep_millisec(delay);
> >         }
> > 
> > +       stat_validity_update(&st->list_validity, fd);
> > +
> >  out:
> > +       if (err)
> > +               stat_validity_clear(&st->list_validity);
> >         if (fd >= 0)
> >                 close(fd);
> >         free_names(names);
> 
> which means we'll feed a negative value to stat_validity_update(). I
> think this may be OK, because I'd imagine the only sensible thing to do
> is call stat_validity_clear() instead. And using a negative fd means
> fstat() will fail, which will cause stat_validity_update() to clear the
> validity struct anyway. But I thought it was worth double-checking.

Good catch, and thanks a lot for double-checking. I was briefly
wondering whether this behaviour is actually specified by POSIX. In any
case, fstat(3P) explicitly documents `EBADF` as:

  The fildes argument is not a valid file descriptor.

That makes me think that this code is indeed POSIX-compliant, as
implementations are expected to handle invalid file descriptors via this
error code.

So overall this works as intended, even though I would not consider it
to be the cleanest way to handle this. Unless you or others think that
this should be refactored I'll leave it as-is for now though.

Patrick

* Re: [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
  2024-01-15 10:03       ` Patrick Steinhardt
@ 2024-01-16 15:14         ` Jeff King
  2024-01-16 16:44           ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2024-01-16 15:14 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys, Junio C Hamano

On Mon, Jan 15, 2024 at 11:03:37AM +0100, Patrick Steinhardt wrote:

> > which means we'll feed a negative value to stat_validity_update(). I
> > think this may be OK, because I'd imagine the only sensible thing to do
> > is call stat_validity_clear() instead. And using a negative fd means
> > fstat() will fail, which will cause stat_validity_update() to clear the
> > validity struct anyway. But I thought it was worth double-checking.
> 
> Good catch, and thanks a lot for double-checking. I was briefly
> wondering whether this behaviour is actually specified by POSIX. In any
> case, fstat(3P) explicitly documents `EBADF` as:
> 
>   The fildes argument is not a valid file descriptor.
> 
> That makes me think that this code is indeed POSIX-compliant, as
> implementations are expected to handle invalid file descriptors via this
> error code.
> 
> So overall this works as intended, even though I would not consider it
> to be the cleanest way to handle this. Unless you or others think that
> this should be refactored I'll leave it as-is for now though.

Thanks for confirming. I think we can leave your patch as-is. If
anything, I would say that stat_validity_update() should check for "fd <
0" itself. Not because I think fstat() might behave differently
on some platform, but simply because it more clearly documents the
expectation.

-Peff

* Re: [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
  2024-01-16 15:14         ` Jeff King
@ 2024-01-16 16:44           ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2024-01-16 16:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, Han-Wen Nienhuys

Jeff King <peff@peff.net> writes:

>> So overall this works as intended, even though I would not consider it
>> to be the cleanest way to handle this. Unless you or others think that
>> this should be refactored I'll leave it as-is for now though.
>
> Thanks for confirming. I think we can leave your patch as-is. If
> anything, I would say that stat_validity_update() should check for "fd <
> 0" itself. Not because I think fstat() is unlikely to behave differently
> on some platform, but simply because it more clearly documents the
> expectation.

Thanks, I agree with your point that we should avoid calling system
functions that take a file descriptor with a known-invalid (like
negative) one.

end of thread, other threads:[~2024-01-16 16:44 UTC | newest]

Thread overview: 22+ messages
2024-01-08 12:18 [PATCH 0/4] reftable: optimize I/O patterns Patrick Steinhardt
2024-01-08 12:18 ` [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
2024-01-10 17:30   ` Junio C Hamano
2024-01-11  7:33     ` Patrick Steinhardt
2024-01-08 12:18 ` [PATCH 2/4] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
2024-01-10 19:57   ` Junio C Hamano
2024-01-08 12:18 ` [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
2024-01-10 20:07   ` Junio C Hamano
2024-01-11  7:41     ` Patrick Steinhardt
2024-01-08 12:18 ` [PATCH 4/4] reftable/blocksource: use mmap to read tables Patrick Steinhardt
2024-01-10 21:24   ` Junio C Hamano
2024-01-11  9:21     ` Patrick Steinhardt
2024-01-11 10:06 ` [PATCH v2 0/5] reftable: optimize I/O patterns Patrick Steinhardt
2024-01-11 10:06   ` [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path Patrick Steinhardt
2024-01-11 10:06   ` [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor Patrick Steinhardt
2024-01-14 10:14     ` Jeff King
2024-01-15 10:03       ` Patrick Steinhardt
2024-01-16 15:14         ` Jeff King
2024-01-16 16:44           ` Junio C Hamano
2024-01-11 10:06   ` [PATCH v2 3/5] reftable/stack: use stat info to avoid re-reading stack list Patrick Steinhardt
2024-01-11 10:06   ` [PATCH v2 4/5] reftable/blocksource: refactor code to match our coding style Patrick Steinhardt
2024-01-11 10:06   ` [PATCH v2 5/5] reftable/blocksource: use mmap to read tables Patrick Steinhardt
