git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/13] reftable library
@ 2020-09-16 19:10 Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                   ` (13 more replies)
  0 siblings, 14 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

This splits the giant commit from 
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

The final commit should also be split up, but I want to wait until we have
consensus that the bottom commits look good.

Han-Wen Nienhuys (12):
  reftable: add LICENSE
  reftable: define the public API
  reftable: add a barebones unittest framework
  reftable: utility functions
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: read reftable files
  reftable: file level tests
  reftable: rest of library
  reftable: "test-tool dump-reftable" command.

Johannes Schindelin (1):
  vcxproj: adjust for the reftable changes

 Makefile                                   |   46 +-
 config.mak.uname                           |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm |   11 +-
 reftable/.gitattributes                    |    1 +
 reftable/LICENSE                           |   31 +
 reftable/VERSION                           |    1 +
 reftable/basics.c                          |  131 +++
 reftable/basics.h                          |   48 +
 reftable/block.c                           |  443 +++++++
 reftable/block.h                           |  129 +++
 reftable/block_test.c                      |  158 +++
 reftable/blocksource.c                     |  148 +++
 reftable/blocksource.h                     |   22 +
 reftable/compat.c                          |  110 ++
 reftable/compat.h                          |   48 +
 reftable/constants.h                       |   21 +
 reftable/dump.c                            |  212 ++++
 reftable/iter.c                            |  242 ++++
 reftable/iter.h                            |   72 ++
 reftable/merged.c                          |  358 ++++++
 reftable/merged.h                          |   39 +
 reftable/merged_test.c                     |  331 ++++++
 reftable/pq.c                              |  115 ++
 reftable/pq.h                              |   34 +
 reftable/publicbasics.c                    |  100 ++
 reftable/reader.c                          |  733 ++++++++++++
 reftable/reader.h                          |   78 ++
 reftable/record.c                          | 1114 ++++++++++++++++++
 reftable/record.h                          |  143 +++
 reftable/record_test.c                     |  410 +++++++
 reftable/refname.c                         |  209 ++++
 reftable/refname.h                         |   38 +
 reftable/refname_test.c                    |  100 ++
 reftable/reftable-tests.h                  |   22 +
 reftable/reftable.c                        |  104 ++
 reftable/reftable.h                        |  585 ++++++++++
 reftable/reftable_test.c                   |  585 ++++++++++
 reftable/stack.c                           | 1224 ++++++++++++++++++++
 reftable/stack.h                           |   50 +
 reftable/stack_test.c                      |  787 +++++++++++++
 reftable/strbuf.c                          |  142 +++
 reftable/strbuf.h                          |   80 ++
 reftable/strbuf_test.c                     |   37 +
 reftable/system.h                          |   51 +
 reftable/test_framework.c                  |   68 ++
 reftable/test_framework.h                  |   60 +
 reftable/tree.c                            |   63 +
 reftable/tree.h                            |   34 +
 reftable/tree_test.c                       |   62 +
 reftable/update.sh                         |   22 +
 reftable/writer.c                          |  664 +++++++++++
 reftable/writer.h                          |   60 +
 reftable/zlib-compat.c                     |   92 ++
 t/helper/test-reftable.c                   |   20 +
 t/helper/test-tool.c                       |    2 +
 t/helper/test-tool.h                       |    2 +
 56 files changed, 10489 insertions(+), 5 deletions(-)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/compat.c
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable.h
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/strbuf.c
 create mode 100644 reftable/strbuf.h
 create mode 100644 reftable/strbuf_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100755 reftable/update.sh
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c


base-commit: 54e85e7af1ac9e9a92888060d6811ae767fea1bc
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v1
Pull-Request: https://github.com/git/git/pull/847
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH 01/13] reftable: add LICENSE
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 0000000000..402e0f9356
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 02/13] reftable: define the public API
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/reftable.h | 585 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 585 insertions(+)
 create mode 100644 reftable/reftable.h

diff --git a/reftable/reftable.h b/reftable/reftable.h
new file mode 100644
index 0000000000..5f8ebf5540
--- /dev/null
+++ b/reftable/reftable.h
@@ -0,0 +1,585 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stdint.h>
+#include <stddef.h>
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+/****************************************************************
+ Basic data types
+
+ Reftables store the state of each ref in struct reftable_ref_record, and they
+ store a sequence of reflog updates in struct reftable_log_record.
+ ****************************************************************/
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				  written */
+	uint8_t *value; /* SHA1, or NULL. malloced. */
+	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
+	char *target; /* symref, or NULL. malloced. */
+};
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values. */
+void reftable_ref_record_clear(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+	uint8_t *new_hash;
+	uint8_t *old_hash;
+	char *name;
+	char *email;
+	uint64_t time;
+	int16_t tz_offset;
+	char *message;
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_clear(struct reftable_log_record *log);
+
+/* returns whether two records are equal. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+/****************************************************************
+ Error handling
+
+ Error are signaled with negative integer return values. 0 means success.
+ ****************************************************************/
+
+/* different types of errors */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data
+	 */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(),  because
+	   it needs special handling in stack.
+	*/
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	   - on writing a record with NULL refname.
+	   - on writing a reftable_ref_record outside the table limits
+	   - on writing a ref or log record before the stack's next_update_index
+	   - on writing a log record with multiline message with
+	   exact_log_message unset
+	   - on reading a reftable_ref_record from log iterator, or vice versa.
+	*/
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Illegal ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+/*
+ * Convert the numeric error code to an equivalent errno code.
+ */
+int reftable_error_to_errno(int err);
+
+/****************************************************************
+ Writing
+
+ Writing single reftables
+ ****************************************************************/
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	int unpadded;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	int skip_index_objects;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	int skip_name_check;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	int exact_log_message;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* write to a file descriptor. fdp should be an int* pointing to the fd. */
+int reftable_fd_write(void *fdp, const void *data, size_t size);
+
+/* Set the range of update indices for the records we will add.  When
+   writing a table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates, typically min==max. When converting an existing
+   ref database into a single reftable, this would be a range of update-index
+   timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/* adds a reftable_ref_record. Must be called in ascending
+   order. The update_index must be within the limits set by
+   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
+
+   It is an error to write a ref record after a log record.
+ */
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/* Convenience function to add multiple refs. Will sort the refs by
+   name before adding. */
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/* adds a reftable_log_record. Must be called in ascending order (with more
+   recent log entries first.)
+ */
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/* Convenience function to add multiple logs. Will sort the records by
+   key before adding. */
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+/****************************************************************
+ * ITERATING
+ ****************************************************************/
+
+/* iterator is the generic interface for walking over data stored in a
+   reftable. It is generally passed around by value.
+*/
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+/****************************************************************
+ Reading single tables
+
+ The follow routines are for reading single files. For an application-level
+ interface, skip ahead to struct reftable_merged_table and struct
+ reftable_stack.
+ ****************************************************************/
+
+/* block_source is a generic wrapper for a seekable readable file.
+   It is generally passed around by value.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+   so it can return itself into the pool.
+*/
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
+ * code and sets pp. The name is used for creating a stack. Typically, it is the
+ * basename of the file. The block source `src` is owned by the reader, and is
+ * closed on calling reftable_reader_destroy().
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+     err = reftable_iterator_next_ref(&it, &ref);
+     if (err > 0) {
+       break;
+     }
+     if (err < 0) {
+       ..error handling..
+     }
+     ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_clear(&ref);
+ */
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+ */
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+/****************************************************************
+ Merged tables
+
+ A ref database kept in a sequence of table files. The merged_table presents a
+ unified view to reading (seeking, iterating) a sequence of immutable tables.
+ ****************************************************************/
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+/****************************************************************
+ Generic tables
+
+ A unified API for reading tables, either merged tables, or single readers.
+ ****************************************************************/
+
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+/****************************************************************
+ Mutable ref database
+
+ The stack presents an interface to a mutable sequence of reftables.
+ ****************************************************************/
+
+/* a stack is a stack of reftables, which can be mutated by pushing a table to
+ * the top of the stack */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+   stored in 'dir'. Typically, this should be .git/reftables.
+*/
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+  returns a new transaction to add reftables to the given stack. As a side
+  effect, the ref database is locked.
+*/
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+   transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+   reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+   next write or reload, and should not be closed or deleted.
+*/
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 03/13] vcxproj: adjust for the reftable changes
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Johannes Schindelin via GitGitGadget
  2020-09-16 19:10 ` [PATCH 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This allows Git to be compiled via Visual Studio again after integrating
the `hn/reftable` branch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname                           |  2 +-
 contrib/buildsystems/Generators/Vcxproj.pm | 11 ++++++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index c7eba69e54..ae4e25a1a4 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -709,7 +709,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba..1a25789d28 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 04/13] reftable: add a barebones unittest framework
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/test_framework.c | 68 +++++++++++++++++++++++++++++++++++++++
 reftable/test_framework.h | 60 ++++++++++++++++++++++++++++++++++
 2 files changed, 128 insertions(+)
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h

diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 0000000000..f304a2773a
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,68 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "test_framework.h"
+
+#include "system.h"
+#include "basics.h"
+
+struct test_case **test_cases;
+int test_case_len;
+int test_case_cap;
+
+struct test_case *new_test_case(const char *name, void (*testfunc)(void))
+{
+	struct test_case *tc = reftable_malloc(sizeof(struct test_case));
+	tc->name = name;
+	tc->testfunc = testfunc;
+	return tc;
+}
+
+struct test_case *add_test_case(const char *name, void (*testfunc)(void))
+{
+	struct test_case *tc = new_test_case(name, testfunc);
+	if (test_case_len == test_case_cap) {
+		test_case_cap = 2 * test_case_cap + 1;
+		test_cases = reftable_realloc(
+			test_cases, sizeof(struct test_case) * test_case_cap);
+	}
+
+	test_cases[test_case_len++] = tc;
+	return tc;
+}
+
+int test_main(int argc, const char *argv[])
+{
+	const char *filter = NULL;
+	int i = 0;
+	if (argc > 1) {
+		filter = argv[1];
+	}
+
+	for (i = 0; i < test_case_len; i++) {
+		const char *name = test_cases[i]->name;
+		if (filter == NULL || strstr(name, filter) != NULL) {
+			printf("case %s\n", name);
+			test_cases[i]->testfunc();
+		} else {
+			printf("skip %s\n", name);
+		}
+
+		reftable_free(test_cases[i]);
+	}
+	reftable_free(test_cases);
+	test_cases = 0;
+	test_case_len = 0;
+	test_case_cap = 0;
+	return 0;
+}
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 0000000000..3c74d287aa
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+
+#ifdef NDEBUG
+#undef NDEBUG
+#endif
+
+#ifdef assert
+#undef assert
+#endif
+
+#define assert_err(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define assert_streq(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define assert(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+struct test_case {
+	const char *name;
+	void (*testfunc)(void);
+};
+
+struct test_case *new_test_case(const char *name, void (*testfunc)(void));
+struct test_case *add_test_case(const char *name, void (*testfunc)(void));
+int test_main(int argc, const char *argv[]);
+
+void set_test_hash(uint8_t *p, int i);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 05/13] reftable: utility functions
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Since the reftable library must compile standalone, there may be some overlap
with git-core utility functions.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |  26 ++++++-
 reftable/basics.c         | 131 +++++++++++++++++++++++++++++++++
 reftable/basics.h         |  48 +++++++++++++
 reftable/blocksource.c    | 148 ++++++++++++++++++++++++++++++++++++++
 reftable/blocksource.h    |  22 ++++++
 reftable/compat.c         | 110 ++++++++++++++++++++++++++++
 reftable/compat.h         |  48 +++++++++++++
 reftable/publicbasics.c   | 100 ++++++++++++++++++++++++++
 reftable/reftable-tests.h |  22 ++++++
 reftable/strbuf.c         | 142 ++++++++++++++++++++++++++++++++++++
 reftable/strbuf.h         |  80 +++++++++++++++++++++
 reftable/strbuf_test.c    |  37 ++++++++++
 reftable/system.h         |  51 +++++++++++++
 t/helper/test-reftable.c  |   8 +++
 t/helper/test-tool.c      |   1 +
 t/helper/test-tool.h      |   1 +
 16 files changed, 972 insertions(+), 3 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/compat.c
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/strbuf.c
 create mode 100644 reftable/strbuf.h
 create mode 100644 reftable/strbuf_test.c
 create mode 100644 reftable/system.h
 create mode 100644 t/helper/test-reftable.c

diff --git a/Makefile b/Makefile
index 86e5411f39..66158525a9 100644
--- a/Makefile
+++ b/Makefile
@@ -720,6 +720,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -799,6 +800,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += config-list.h
 GENERATED_H += command-list.h
@@ -1160,7 +1163,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2341,10 +2344,21 @@ XDIFF_OBJS += xdiff/xpatience.o
 XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/strbuf.o
+
+REFTABLE_TEST_OBJS += reftable/strbuf_test.o
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
 	$(XDIFF_OBJS) \
 	$(FUZZ_OBJS) \
+	$(REFTABLE_OBJS) \
+	$(REFTABLE_TEST_OBJS) \
 	common-main.o \
 	git.o
 ifndef NO_CURL
@@ -2474,6 +2488,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2752,7 +2772,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3080,7 +3100,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 0000000000..c429055d15
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,131 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* invariant: (hi == sz) || f(hi) == true
+	   (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		int val = f(mid, args);
+		if (val) {
+			hi = mid;
+		} else {
+			lo = mid;
+		}
+	}
+
+	if (lo == 0) {
+		if (f(0, args)) {
+			return 0;
+		} else {
+			return 1;
+		}
+	}
+
+	return hi;
+}
+
+void free_names(char **a)
+{
+	char **p = a;
+	if (p == NULL) {
+		return;
+	}
+	while (*p) {
+		reftable_free(*p);
+		p++;
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	int len = 0;
+	char **p = names;
+	while (*p) {
+		p++;
+		len++;
+	}
+	return len;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next != NULL) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(char *));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	if (names_len == names_cap) {
+		names_cap = 2 * names_cap + 1;
+		names = reftable_realloc(names, names_cap * sizeof(char *));
+	}
+
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	while (*a && *b) {
+		if (strcmp(*a, *b)) {
+			return 0;
+		}
+
+		a++;
+		b++;
+	}
+
+	return *a == *b;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 0000000000..90639865a7
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+  find smallest index i in [0, sz) at which f(i) is true, assuming
+  that f is ascending. Return sz if f(i) is false for all indices.
+*/
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+  Frees a NULL terminated array of malloced strings. The array itself is also
+  freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+#endif
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 0000000000..f12cea2472
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "strbuf.h"
+#include "reftable.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 0000000000..3faf83fa9d
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "strbuf.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/compat.c b/reftable/compat.c
new file mode 100644
index 0000000000..a48c5aa5e3
--- /dev/null
+++ b/reftable/compat.c
@@ -0,0 +1,110 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+
+*/
+
+/* compat.c - compatibility functions for standalone compilation */
+
+#include "system.h"
+#include "basics.h"
+
+#ifdef REFTABLE_STANDALONE
+
+#include <dirent.h>
+
+void put_be32(void *p, uint32_t i)
+{
+	uint8_t *out = (uint8_t *)p;
+
+	out[0] = (uint8_t)((i >> 24) & 0xff);
+	out[1] = (uint8_t)((i >> 16) & 0xff);
+	out[2] = (uint8_t)((i >> 8) & 0xff);
+	out[3] = (uint8_t)((i)&0xff);
+}
+
+uint32_t get_be32(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 24 | (uint32_t)(in[1]) << 16 |
+	       (uint32_t)(in[2]) << 8 | (uint32_t)(in[3]);
+}
+
+void put_be64(void *p, uint64_t v)
+{
+	uint8_t *out = (uint8_t *)p;
+	int i = sizeof(uint64_t);
+	while (i--) {
+		out[i] = (uint8_t)(v & 0xff);
+		v >>= 8;
+	}
+}
+
+uint64_t get_be64(void *out)
+{
+	uint8_t *bytes = (uint8_t *)out;
+	uint64_t v = 0;
+	int i = 0;
+	for (i = 0; i < sizeof(uint64_t); i++) {
+		v = (v << 8) | (uint8_t)(bytes[i] & 0xff);
+	}
+	return v;
+}
+
+uint16_t get_be16(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 8 | (uint32_t)(in[1]);
+}
+
+char *xstrdup(const char *s)
+{
+	int l = strlen(s);
+	char *dest = (char *)reftable_malloc(l + 1);
+	strncpy(dest, s, l + 1);
+	return dest;
+}
+
+void sleep_millisec(int millisecs)
+{
+	usleep(millisecs * 1000);
+}
+
+void reftable_clear_dir(const char *dirname)
+{
+	DIR *dir = opendir(dirname);
+	struct dirent *ent = NULL;
+	assert(dir);
+	while ((ent = readdir(dir)) != NULL) {
+		unlinkat(dirfd(dir), ent->d_name, 0);
+	}
+	closedir(dir);
+	rmdir(dirname);
+}
+
+#else
+
+#include "../dir.h"
+
+void reftable_clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+#endif
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/compat.h b/reftable/compat.h
new file mode 100644
index 0000000000..a765c57e96
--- /dev/null
+++ b/reftable/compat.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef COMPAT_H
+#define COMPAT_H
+
+#include "system.h"
+
+#ifdef REFTABLE_STANDALONE
+
+/* functions that git-core provides, for standalone compilation */
+#include <stdint.h>
+
+uint64_t get_be64(void *in);
+void put_be64(void *out, uint64_t i);
+
+void put_be32(void *out, uint32_t i);
+uint32_t get_be32(uint8_t *in);
+
+uint16_t get_be16(uint8_t *in);
+
+#define ARRAY_SIZE(a) sizeof((a)) / sizeof((a)[0])
+#define FREE_AND_NULL(x)          \
+	do {                      \
+		reftable_free(x); \
+		(x) = NULL;       \
+	} while (0)
+#define QSORT(arr, n, cmp) qsort(arr, n, sizeof(arr[0]), cmp)
+#define SWAP(a, b)                              \
+	{                                       \
+		char tmp[sizeof(a)];            \
+		assert(sizeof(a) == sizeof(b)); \
+		memcpy(&tmp[0], &a, sizeof(a)); \
+		memcpy(&a, &b, sizeof(a));      \
+		memcpy(&b, &tmp[0], sizeof(a)); \
+	}
+
+char *xstrdup(const char *s);
+
+void sleep_millisec(int millisecs);
+
+#endif
+#endif
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 0000000000..a31463ff9a
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "basics.h"
+#include "system.h"
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
+
+int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 0000000000..e38471888f
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int strbuf_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/strbuf.c b/reftable/strbuf.c
new file mode 100644
index 0000000000..136bf65591
--- /dev/null
+++ b/reftable/strbuf.c
@@ -0,0 +1,142 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "strbuf.h"
+
+#ifdef REFTABLE_STANDALONE
+
+void strbuf_init(struct strbuf *s, size_t alloc)
+{
+	struct strbuf empty = STRBUF_INIT;
+	*s = empty;
+}
+
+void strbuf_grow(struct strbuf *s, size_t extra)
+{
+	size_t newcap = s->len + extra + 1;
+	if (newcap > s->cap) {
+		s->buf = reftable_realloc(s->buf, newcap);
+		s->cap = newcap;
+	}
+}
+
+static void strbuf_resize(struct strbuf *s, int l)
+{
+	int zl = l + 1; /* one uint8_t for 0 termination. */
+	assert(s->canary == STRBUF_CANARY);
+	if (s->cap < zl) {
+		int c = s->cap * 2;
+		if (c < zl) {
+			c = zl;
+		}
+		s->cap = c;
+		s->buf = reftable_realloc(s->buf, s->cap);
+	}
+	s->len = l;
+	s->buf[l] = 0;
+}
+
+void strbuf_setlen(struct strbuf *s, size_t l)
+{
+	assert(s->cap >= l + 1);
+	s->len = l;
+	s->buf[l] = 0;
+}
+
+void strbuf_reset(struct strbuf *s)
+{
+	strbuf_resize(s, 0);
+}
+
+void strbuf_addstr(struct strbuf *d, const char *s)
+{
+	int l1 = d->len;
+	int l2 = strlen(s);
+	assert(d->canary == STRBUF_CANARY);
+
+	strbuf_resize(d, l2 + l1);
+	memcpy(d->buf + l1, s, l2);
+}
+
+void strbuf_addbuf(struct strbuf *s, struct strbuf *a)
+{
+	int end = s->len;
+	assert(s->canary == STRBUF_CANARY);
+	strbuf_resize(s, s->len + a->len);
+	memcpy(s->buf + end, a->buf, a->len);
+}
+
+char *strbuf_detach(struct strbuf *s, size_t *sz)
+{
+	char *p = NULL;
+	p = (char *)s->buf;
+	if (sz)
+		*sz = s->len;
+	s->buf = NULL;
+	s->cap = 0;
+	s->len = 0;
+	return p;
+}
+
+void strbuf_release(struct strbuf *s)
+{
+	assert(s->canary == STRBUF_CANARY);
+	s->cap = 0;
+	s->len = 0;
+	reftable_free(s->buf);
+	s->buf = NULL;
+}
+
+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b)
+{
+	int min = a->len < b->len ? a->len : b->len;
+	int res = memcmp(a->buf, b->buf, min);
+	assert(a->canary == STRBUF_CANARY);
+	assert(b->canary == STRBUF_CANARY);
+	if (res != 0)
+		return res;
+	if (a->len < b->len)
+		return -1;
+	else if (a->len > b->len)
+		return 1;
+	else
+		return 0;
+}
+
+int strbuf_add(struct strbuf *b, const void *data, size_t sz)
+{
+	assert(b->canary == STRBUF_CANARY);
+	strbuf_grow(b, sz);
+	memcpy(b->buf + b->len, data, sz);
+	b->len += sz;
+	b->buf[b->len] = 0;
+	return sz;
+}
+
+#endif
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	while (p < a->len && p < b->len) {
+		if (a->buf[p] != b->buf[p]) {
+			break;
+		}
+		p++;
+	}
+
+	return p;
+}
+
+struct strbuf reftable_empty_strbuf = STRBUF_INIT;
diff --git a/reftable/strbuf.h b/reftable/strbuf.h
new file mode 100644
index 0000000000..c2d7aca8dd
--- /dev/null
+++ b/reftable/strbuf.h
@@ -0,0 +1,80 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SLICE_H
+#define SLICE_H
+
+#ifdef REFTABLE_STANDALONE
+
+#include "basics.h"
+
+/*
+  Provides a bounds-checked, growable byte ranges. To use, initialize as "strbuf
+  x = STRBUF_INIT;"
+ */
+struct strbuf {
+	size_t len;
+	size_t cap;
+	char *buf;
+
+	/* Used to enforce initialization with STRBUF_INIT */
+	uint8_t canary;
+};
+
+#define STRBUF_CANARY 0x42
+#define STRBUF_INIT                       \
+	{                                 \
+		0, 0, NULL, STRBUF_CANARY \
+	}
+
+void strbuf_addstr(struct strbuf *dest, const char *src);
+
+/* Deallocate and clear strbuf */
+void strbuf_release(struct strbuf *strbuf);
+
+/* Set strbuf to 0 length, but retain buffer. */
+void strbuf_reset(struct strbuf *strbuf);
+
+/* Initializes a strbuf. Accepts a strbuf with random garbage. */
+void strbuf_init(struct strbuf *strbuf, size_t alloc);
+
+/* Return `buf`, clearing out `s`. Optionally return len (not cap) in `sz`.  */
+char *strbuf_detach(struct strbuf *s, size_t *sz);
+
+/* Set length of the slace to `l`, but don't reallocated. */
+void strbuf_setlen(struct strbuf *s, size_t l);
+
+/* Ensure `l` bytes beyond current length are available */
+void strbuf_grow(struct strbuf *s, size_t l);
+
+/* Signed comparison */
+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b);
+
+/* Append `data` to the `dest` strbuf.  */
+int strbuf_add(struct strbuf *dest, const void *data, size_t sz);
+
+/* Append `add` to `dest. */
+void strbuf_addbuf(struct strbuf *dest, struct strbuf *add);
+
+#else
+
+#include "../git-compat-util.h"
+#include "../strbuf.h"
+
+#endif
+
+extern struct strbuf reftable_empty_strbuf;
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/strbuf_test.c b/reftable/strbuf_test.c
new file mode 100644
index 0000000000..39f561c81a
--- /dev/null
+++ b/reftable/strbuf_test.c
@@ -0,0 +1,37 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "strbuf.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_strbuf(void)
+{
+	struct strbuf s = STRBUF_INIT;
+	struct strbuf t = STRBUF_INIT;
+
+	strbuf_addstr(&s, "abc");
+	assert(0 == strcmp("abc", s.buf));
+
+	strbuf_addstr(&t, "pqr");
+	strbuf_addbuf(&s, &t);
+	assert(0 == strcmp("abcpqr", s.buf));
+
+	strbuf_release(&s);
+	strbuf_release(&t);
+}
+
+int strbuf_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_strbuf", &test_strbuf);
+	return test_main(argc, argv);
+}
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 0000000000..567eb8a87d
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,51 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#ifndef REFTABLE_STANDALONE
+
+#include "git-compat-util.h"
+#include "cache.h"
+#include <zlib.h>
+
+#else
+
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <zlib.h>
+
+#include "compat.h"
+
+#endif /* REFTABLE_STANDALONE */
+
+void reftable_clear_dir(const char *dirname);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 0000000000..7d50aa6bcc
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,8 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	strbuf_test_main(argc, argv);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 590b2efca7..10366b7b76 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -52,6 +52,7 @@ static struct test_cmd cmds[] = {
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ddc8e990e9..d52ba2f5e5 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -41,6 +41,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (4 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-20  1:00   ` Junio C Hamano
  2020-09-16 19:10 ` [PATCH 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |    2 +
 reftable/constants.h     |   21 +
 reftable/record.c        | 1114 ++++++++++++++++++++++++++++++++++++++
 reftable/record.h        |  143 +++++
 reftable/record_test.c   |  410 ++++++++++++++
 t/helper/test-reftable.c |    1 +
 6 files changed, 1691 insertions(+)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c

diff --git a/Makefile b/Makefile
index 66158525a9..0fb0048c34 100644
--- a/Makefile
+++ b/Makefile
@@ -2348,8 +2348,10 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 0000000000..5eee72c4c1
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 0000000000..21c9bba077
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1114 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	   fields. */
+	reftable_ref_record_clear(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+
+	if (src->target != NULL) {
+		ref->target = xstrdup(src->target);
+	}
+
+	if (src->target_value != NULL) {
+		ref->target_value = reftable_malloc(hash_size);
+		memcpy(ref->target_value, src->target_value, hash_size);
+	}
+
+	if (src->value != NULL) {
+		ref->value = reftable_malloc(hash_size);
+		memcpy(ref->value, src->value, hash_size);
+	}
+	ref->update_index = src->update_index;
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	if (ref->value != NULL) {
+		hex_format(hex, ref->value, hash_size(hash_id));
+		printf("%s", hex);
+	}
+	if (ref->target_value != NULL) {
+		hex_format(hex, ref->target_value, hash_size(hash_id));
+		printf(" (T %s)", hex);
+	}
+	if (ref->target != NULL) {
+		printf("=> %s", ref->target);
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_clear_void(void *rec)
+{
+	reftable_ref_record_clear((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_clear(struct reftable_ref_record *ref)
+{
+	reftable_free(ref->refname);
+	reftable_free(ref->target);
+	reftable_free(ref->target_value);
+	reftable_free(ref->value);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	if (r->value != NULL) {
+		if (r->target_value != NULL) {
+			return 2;
+		} else {
+			return 1;
+		}
+	} else if (r->target != NULL)
+		return 3;
+	return 0;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (r->value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target_value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->target_value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target != NULL) {
+		int n = encode_string(r->target, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	int seen_value = 0;
+	int seen_target_value = 0;
+	int seen_target = 0;
+
+	int n = get_var_int(&r->update_index, &in);
+	if (n < 0)
+		return n;
+	assert(hash_size > 0);
+
+	string_view_consume(&in, n);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->refname[key.len] = 0;
+
+	switch (val_type) {
+	case 1:
+	case 2:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		if (r->value == NULL) {
+			r->value = reftable_malloc(hash_size);
+		}
+		seen_value = 1;
+		memcpy(r->value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		if (val_type == 1) {
+			break;
+		}
+		if (r->target_value == NULL) {
+			r->target_value = reftable_malloc(hash_size);
+		}
+		seen_target_value = 1;
+		memcpy(r->target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+	case 3: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		seen_target = 1;
+		if (r->target != NULL) {
+			reftable_free(r->target);
+		}
+		r->target = dest.buf;
+	} break;
+
+	case 0:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	if (!seen_target && r->target != NULL) {
+		FREE_AND_NULL(r->target);
+	}
+	if (!seen_target_value && r->target_value != NULL) {
+		FREE_AND_NULL(r->target_value);
+	}
+	if (!seen_value && r->value != NULL) {
+		FREE_AND_NULL(r->value);
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.clear = &reftable_ref_record_clear_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_clear(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_clear(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.clear = &reftable_obj_record_clear,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname,
+	       log->update_index, log->name, log->email, log->time,
+	       log->tz_offset);
+	hex_format(hex, log->old_hash, hash_size(hash_id));
+	printf("%s => ", hex);
+	hex_format(hex, log->new_hash, hash_size(hash_id));
+	printf("%s\n\n%s\n}\n", hex, log->message);
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_clear(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	if (dst->email != NULL) {
+		dst->email = xstrdup(dst->email);
+	}
+	if (dst->name != NULL) {
+		dst->name = xstrdup(dst->name);
+	}
+	if (dst->message != NULL) {
+		dst->message = xstrdup(dst->message);
+	}
+
+	if (dst->new_hash != NULL) {
+		dst->new_hash = reftable_malloc(hash_size);
+		memcpy(dst->new_hash, src->new_hash, hash_size);
+	}
+	if (dst->old_hash != NULL) {
+		dst->old_hash = reftable_malloc(hash_size);
+		memcpy(dst->old_hash, src->old_hash, hash_size);
+	}
+}
+
+static void reftable_log_record_clear_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_clear(r);
+}
+
+void reftable_log_record_clear(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	reftable_free(r->new_hash);
+	reftable_free(r->old_hash);
+	reftable_free(r->name);
+	reftable_free(r->email);
+	reftable_free(r->message);
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = r->old_hash;
+	uint8_t *newh = r->new_hash;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->name ? r->name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->email ? r->email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->message ? r->message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type == 0) {
+		FREE_AND_NULL(r->old_hash);
+		FREE_AND_NULL(r->new_hash);
+		FREE_AND_NULL(r->message);
+		FREE_AND_NULL(r->email);
+		FREE_AND_NULL(r->name);
+		return 0;
+	}
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->old_hash = reftable_realloc(r->old_hash, hash_size);
+	r->new_hash = reftable_realloc(r->new_hash, hash_size);
+
+	memcpy(r->old_hash, in.buf, hash_size);
+	memcpy(r->new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->name = reftable_realloc(r->name, dest.len + 1);
+	memcpy(r->name, dest.buf, dest.len);
+	r->name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->email = reftable_realloc(r->email, dest.len + 1);
+	memcpy(r->email, dest.buf, dest.len);
+	r->email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->message = reftable_realloc(r->message, dest.len + 1);
+	memcpy(r->message, dest.buf, dest.len);
+	r->message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	return null_streq(a->name, b->name) && null_streq(a->email, b->email) &&
+	       null_streq(a->message, b->message) &&
+	       zero_hash_eq(a->old_hash, b->old_hash, hash_size) &&
+	       zero_hash_eq(a->new_hash, b->new_hash, hash_size) &&
+	       a->time == b->time && a->tz_offset == b->tz_offset &&
+	       a->update_index == b->update_index;
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.clear = &reftable_log_record_clear_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_clear(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_clear(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.clear = &reftable_index_record_clear,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_clear(struct reftable_record *rec)
+{
+	rec->ops->clear(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+static int str_equal(char *a, char *b)
+{
+	if (a != NULL && b != NULL)
+		return 0 == strcmp(a, b);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	return 0 == strcmp(a->refname, b->refname) &&
+	       a->update_index == b->update_index &&
+	       hash_equal(a->value, b->value, hash_size) &&
+	       hash_equal(a->target_value, b->target_value, hash_size) &&
+	       str_equal(a->target, b->target);
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value == NULL && ref->target == NULL &&
+	       ref->target_value == NULL;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->new_hash == NULL && log->old_hash == NULL &&
+		log->name == NULL && log->email == NULL &&
+		log->message == NULL && log->time == 0 && log->tz_offset == 0 &&
+		log->message == NULL);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 0000000000..0c4725cd3c
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,143 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "reftable.h"
+#include "strbuf.h"
+#include "system.h"
+
+/*
+  A substring of existing string data. This structure takes no responsibility
+  for the lifetime of the data it points to.
+*/
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*clear)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+extern struct reftable_record_vtable reftable_ref_record_vtable;
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+   number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_clear(struct reftable_record *rec);
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+void *reftable_record_yield(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 0000000000..6da80d33cf
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,410 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		assert(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		assert(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		assert(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		assert(n > 0);
+
+		assert(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		assert(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = 0; i <= 3; i++) {
+		struct reftable_ref_record in = { 0 };
+		struct reftable_ref_record out = {
+			.refname = xstrdup("old name"),
+			.value = reftable_calloc(SHA1_SIZE),
+			.target_value = reftable_calloc(SHA1_SIZE),
+			.target = xstrdup("old value"),
+		};
+		struct reftable_record rec_out = { 0 };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { 0 };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		switch (i) {
+		case 0:
+			break;
+		case 1:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			break;
+		case 2:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			in.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.target_value, 2);
+			break;
+		case 3:
+			in.target = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		assert(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		assert(n == m);
+
+		assert((out.value != NULL) == (in.value != NULL));
+		assert((out.target_value != NULL) == (in.target_value != NULL));
+		assert((out.target != NULL) == (in.target != NULL));
+		reftable_record_clear(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_clear(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	assert(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	assert(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_clear(&in[0]);
+	reftable_log_record_clear(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.old_hash = reftable_malloc(SHA1_SIZE),
+			.new_hash = reftable_malloc(SHA1_SIZE),
+			.name = xstrdup("han-wen"),
+			.email = xstrdup("hanwen@google.com"),
+			.message = xstrdup("test"),
+			.update_index = 42,
+			.time = 1577123507,
+			.tz_offset = 100,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+	set_test_hash(in[0].new_hash, 1);
+	set_test_hash(in[0].old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { 0 };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.new_hash = reftable_calloc(SHA1_SIZE),
+			.old_hash = reftable_calloc(SHA1_SIZE),
+			.name = xstrdup("old name"),
+			.email = xstrdup("old@email"),
+			.message = xstrdup("old message"),
+		};
+		struct reftable_record rec_out = { 0 };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		assert(n == m);
+
+		assert(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_clear(&in[i]);
+		strbuf_release(&key);
+		reftable_record_clear(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	assert(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	assert(!restart);
+	assert(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	assert(n == m);
+	assert(0 == strbuf_cmp(&key, &roundtrip));
+	assert(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { 0 };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { 0 };
+		struct reftable_record rec_out = { 0 };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		assert(n == m);
+
+		assert(in.hash_prefix_len == out.hash_prefix_len);
+		assert(in.offset_len == out.offset_len);
+
+		assert(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		assert(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_clear(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { 0 };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	assert(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	assert(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	assert(m == n);
+
+	assert(in.offset == out.offset);
+
+	reftable_record_clear(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_reftable_log_record_equal",
+		      &test_reftable_log_record_equal);
+	add_test_case("test_reftable_log_record_roundtrip",
+		      &test_reftable_log_record_roundtrip);
+	add_test_case("test_reftable_ref_record_roundtrip",
+		      &test_reftable_ref_record_roundtrip);
+	add_test_case("test_varint_roundtrip", &test_varint_roundtrip);
+	add_test_case("test_key_roundtrip", &test_key_roundtrip);
+	add_test_case("test_common_prefix", &test_common_prefix);
+	add_test_case("test_reftable_obj_record_roundtrip",
+		      &test_reftable_obj_record_roundtrip);
+	add_test_case("test_reftable_index_record_roundtrip",
+		      &test_reftable_index_record_roundtrip);
+	add_test_case("test_u24_roundtrip", &test_u24_roundtrip);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 7d50aa6bcc..9341272089 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -3,6 +3,7 @@
 
 int cmd__reftable(int argc, const char **argv)
 {
+	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 07/13] reftable: reading/writing blocks
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (5 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 443 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 129 ++++++++++++
 reftable/block_test.c    | 158 ++++++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 827 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index 0fb0048c34..a897818a7e 100644
--- a/Makefile
+++ b/Makefile
@@ -2345,12 +2345,15 @@ XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 0000000000..f44451a379
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 0000000000..d61faa0fd4
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,443 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+int block_writer_register_restart(struct block_writer *w, int n, int restart,
+				  struct strbuf *key);
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_register_restart(struct block_writer *w, int n, int is_restart,
+				  struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_clear(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 0000000000..14a6f835f4
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,129 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable.h"
+
+/*
+  Writes reftable blocks. The block_writer is reused across blocks to minimize
+  allocation overhead.
+*/
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+  initializes the blockwriter to write `typ` entries, using `buf` as temporary
+  storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/*
+  returns the block type (eg. 'r' for ref records.
+*/
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_clear(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 0000000000..1ca327b3c7
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,158 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			assert(args.key < arr[res]);
+			if (res > 0) {
+				assert(args.key >= arr[res - 1]);
+			}
+		} else {
+			assert(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { 0 };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_record rec = { 0 };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value = hash;
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value = NULL;
+		assert(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	assert(n > 0);
+
+	block_writer_clear(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		assert(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		assert_streq(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_clear(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		assert(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		assert(n == 0);
+
+		assert_streq(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		assert(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		assert(n == 0);
+		assert_streq(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_clear(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	add_test_case("binsearch", &test_binsearch);
+	add_test_case("block_read_write", &test_block_read_write);
+	return test_main(argc, argv);
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 0000000000..3e0b0f24f1
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 9341272089..81a9bd5667 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -3,6 +3,7 @@
 
 int cmd__reftable(int argc, const char **argv)
 {
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 08/13] reftable: a generic binary tree implementation
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (6 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This is necessary for building a OID => ref map on write

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  2 ++
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 62 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index a897818a7e..b96c235087 100644
--- a/Makefile
+++ b/Makefile
@@ -2351,12 +2351,14 @@ REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 0000000000..0061d14e30
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 0000000000..954512e9a3
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+   is set, insert the key if it's not found. Else, return NULL.
+*/
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+  deallocates the tree nodes recursively. Keys should be deallocated separately
+  by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 0000000000..c6d448cbe8
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { 0 };
+	struct tree_node *nodes[11] = { 0 };
+	int i = 1;
+	struct curry c = { 0 };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_tree", &test_tree);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 81a9bd5667..9c4e0f42dc 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 09/13] reftable: write reftable files
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (7 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile          |   1 +
 reftable/writer.c | 664 ++++++++++++++++++++++++++++++++++++++++++++++
 reftable/writer.h |  60 +++++
 3 files changed, 725 insertions(+)
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index b96c235087..a200633945 100644
--- a/Makefile
+++ b/Makefile
@@ -2352,6 +2352,7 @@ REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 0000000000..44ddcc6757
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,664 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "tree.h"
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { 0 };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects && ref->value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)ref->value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects && ref->target_value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, ref->target_value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { 0 };
+	char *input_log_message = log->message;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err;
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (!w->opts.exact_log_message && log->message != NULL) {
+		strbuf_addstr(&cleaned_message, log->message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->message = cleaned_message.buf;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	err = writer_add_record(w, &rec);
+
+done:
+	log->message = input_log_message;
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { 0 };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { 0 };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { 0 };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_clear(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 0000000000..c69ecffea0
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "reftable.h"
+#include "strbuf.h"
+#include "tree.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	  tree for use with tsearch; used to populate the 'o' inverse OID
+	  map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+/* finishes a block, and writes it to storage */
+int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+int writer_finish_public_section(struct reftable_writer *w);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 10/13] reftable: read reftable files
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (8 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile            |   3 +
 reftable/iter.c     | 242 +++++++++++++++
 reftable/iter.h     |  72 +++++
 reftable/reader.c   | 733 ++++++++++++++++++++++++++++++++++++++++++++
 reftable/reader.h   |  78 +++++
 reftable/reftable.c | 104 +++++++
 6 files changed, 1232 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable.c

diff --git a/Makefile b/Makefile
index a200633945..275fc4227c 100644
--- a/Makefile
+++ b/Makefile
@@ -2349,7 +2349,10 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 0000000000..6631408ef8
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,242 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { 0 };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { 0 };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if ((ref->target_value != NULL &&
+		     !memcmp(fri->oid.buf, ref->target_value, fri->oid.len)) ||
+		    (ref->value != NULL &&
+		     !memcmp(fri->oid.buf, ref->value, fri->oid.len))) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_clear(ref);
+	return err;
+}
+
+struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+
+		if (!memcmp(it->oid.buf, ref->target_value, it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 0000000000..c92a178773
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "block.h"
+#include "record.h"
+#include "strbuf.h"
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+   iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+   but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 0000000000..fae2dbb64e
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,733 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { 0 };
+	struct reftable_block header = { 0 };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { 0 };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { 0 };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { 0 };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_clear(&want_index_rec);
+	reftable_record_clear(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { 0 };
+	struct reftable_iterator oit = { 0 };
+	struct reftable_obj_record got = { 0 };
+	struct reftable_record got_rec = { 0 };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_clear(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 0000000000..8170258881
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,78 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+int reftable_table_seek_record(struct reftable_table *tab,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 0000000000..1425aef6f2
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,104 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+#include "record.h"
+#include "reader.h"
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { 0 };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_clear(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+int reftable_table_seek_record(struct reftable_table *tab,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 11/13] reftable: file level tests
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (9 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 585 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 587 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index 275fc4227c..22dd31e7ff 100644
--- a/Makefile
+++ b/Makefile
@@ -2360,6 +2360,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 0000000000..73efeaf523
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,585 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { 0 };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	assert(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	assert(n == sizeof(in));
+	assert(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	assert(n == 2);
+	assert(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { 0 };
+	int i = 0, n;
+	struct reftable_log_record log = { 0 };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.value = hash;
+		ref.update_index = update_index;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		assert(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.new_hash = hash;
+		log.update_index = update_index;
+		log.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		assert(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		assert(buf->buf[off] == 'r');
+	}
+
+	assert(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = {
+		.refname = "refs/heads/master",
+		.name = "Han-Wen Nienhuys",
+		.email = "hanwen@google.com",
+		.tz_offset = 100,
+		.time = 0x5e430672,
+		.update_index = 0xa,
+		.message = "commit: 9\n",
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.old_hash = hash1;
+	log.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	assert_err(err);
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { 0 };
+	int i = 0;
+	struct reftable_log_record log = { 0 };
+	int n;
+	struct reftable_iterator it = { 0 };
+	struct reftable_reader rd = { 0 };
+	struct reftable_block_source source = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { 0 };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		assert_err(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { 0 };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.old_hash = hash1;
+		log.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		assert_err(err);
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	stats = writer_stats(w);
+	assert(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert_err(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	assert_err(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		assert_err(err);
+		assert_streq(names[i], log.refname);
+		assert(i == log.update_index);
+		i++;
+		reftable_log_record_clear(&log);
+	}
+
+	assert(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { 0 };
+	struct reftable_block_source source = { 0 };
+	struct reftable_reader rd = { 0 };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	assert_err(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { 0 };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		assert(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		assert(0 == strcmp(names[j], ref.refname));
+		assert(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_clear(&ref);
+	}
+	assert(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	assert(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { 0 };
+	struct reftable_block_source source = { 0 };
+	int err;
+	int i;
+	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { 0 };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	assert_err(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	assert(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { 0 };
+	struct reftable_block_source source = { 0 };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { 0 };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { 0 };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+	assert(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		assert(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		assert_err(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		assert_err(err);
+		assert(0 == strcmp(names[i], ref.refname));
+		assert(i == ref.value[0]);
+
+		reftable_ref_record_clear(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { 0 };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		assert(err > 0);
+	} else {
+		assert(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { 0 };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { 0 };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { 0 };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { 0 };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value = hash1;
+		ref.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		assert(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	assert_err(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	assert_err(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		assert(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		assert(j < want_names_len);
+		assert(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_clear(&ref);
+	}
+	assert(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { 0 };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { 0 };
+	struct reftable_iterator it = { 0 };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	assert(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	assert(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	assert(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_log_write_read", test_log_write_read);
+	add_test_case("test_table_read_write_seek_linear_sha256",
+		      &test_table_read_write_seek_linear_sha256);
+	add_test_case("test_log_buffer_size", test_log_buffer_size);
+	add_test_case("test_table_write_small_table",
+		      &test_table_write_small_table);
+	add_test_case("test_buffer", &test_buffer);
+	add_test_case("test_table_read_api", &test_table_read_api);
+	add_test_case("test_table_read_write_sequential",
+		      &test_table_read_write_sequential);
+	add_test_case("test_table_read_write_seek_linear",
+		      &test_table_read_write_seek_linear);
+	add_test_case("test_table_read_write_seek_index",
+		      &test_table_read_write_seek_index);
+	add_test_case("test_table_read_write_refs_for_no_index",
+		      &test_table_refs_for_no_index);
+	add_test_case("test_table_read_write_refs_for_obj_index",
+		      &test_table_refs_for_obj_index);
+	add_test_case("test_table_empty", &test_table_empty);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 9c4e0f42dc..2273fdfe39 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 12/13] reftable: rest of library
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (10 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-09-16 19:10 ` [PATCH 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This will be further split up once preceding commits have passed review.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |    8 +-
 reftable/VERSION         |    1 +
 reftable/dump.c          |  212 +++++++
 reftable/merged.c        |  358 +++++++++++
 reftable/merged.h        |   39 ++
 reftable/merged_test.c   |  331 +++++++++++
 reftable/pq.c            |  115 ++++
 reftable/pq.h            |   34 ++
 reftable/refname.c       |  209 +++++++
 reftable/refname.h       |   38 ++
 reftable/refname_test.c  |  100 ++++
 reftable/stack.c         | 1224 ++++++++++++++++++++++++++++++++++++++
 reftable/stack.h         |   50 ++
 reftable/stack_test.c    |  787 ++++++++++++++++++++++++
 reftable/update.sh       |   22 +
 t/helper/test-reftable.c |    3 +
 16 files changed, 3530 insertions(+), 1 deletion(-)
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100755 reftable/update.sh

diff --git a/Makefile b/Makefile
index 22dd31e7ff..6652259baf 100644
--- a/Makefile
+++ b/Makefile
@@ -2347,12 +2347,16 @@ XDIFF_OBJS += xdiff/xutils.o
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
-REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/merged.o
+REFTABLE_OBJS += reftable/pq.o
+REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
@@ -2360,7 +2364,9 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
diff --git a/reftable/VERSION b/reftable/VERSION
new file mode 100644
index 0000000000..7553c395eb
--- /dev/null
+++ b/reftable/VERSION
@@ -0,0 +1 @@
+b8ec0f74c74cb6752eb2033ad8e755a9c19aad15 C: add missing header
diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 0000000000..00b444e8c9
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { 0 };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 0000000000..22b071cb5d
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,358 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_clear(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_clear(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_clear(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+int merged_table_seek_record(struct reftable_merged_table *mt,
+			     struct reftable_iterator *it,
+			     struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 0000000000..66a1d71830
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,39 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+#include "reftable.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_clear(struct reftable_merged_table *mt);
+int merged_table_seek_record(struct reftable_merged_table *mt,
+			     struct reftable_iterator *it,
+			     struct reftable_record *rec);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 0000000000..18a0fe1dac
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,331 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { 0 };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { 0 };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_clear(&pq);
+}
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		assert_err(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	assert_err(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_iterator it = { 0 };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert_err(err);
+	assert(ref.update_index == 2);
+	reftable_ref_record_clear(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = { {
+						    .refname = "a",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "b",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "c",
+						    .update_index = 1,
+						    .value = hash1,
+					    } };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { 0 };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	assert_err(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { 0 };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_clear(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { 0 };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	assert_err(err);
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	assert_err(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_merged_between", &test_merged_between);
+	add_test_case("test_pq", &test_pq);
+	add_test_case("test_merged", &test_merged);
+	add_test_case("test_default_write_opts", &test_default_write_opts);
+	return test_main(argc, argv);
+}
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 0000000000..5fa1b5f437
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable.h"
+#include "system.h"
+#include "basics.h"
+
+int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 0000000000..918ac3990d
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+int pq_less(struct pq_entry a, struct pq_entry b);
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 0000000000..423e2b06bc
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable.h"
+#include "basics.h"
+#include "refname.h"
+#include "strbuf.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { 0 };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_clear(&ref);
+	return err;
+}
+
+static void modification_clear(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+int modification_has_ref_with_prefix(struct modification *mod,
+				     const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_clear(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_clear(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 0000000000..13bf1e6861
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,38 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+// -1 = error, 0 = found, 1 = not found
+int modification_has_ref(struct modification *mod, const char *name);
+
+// -1 = error, 0 = found, 1 = not found.
+int modification_has_ref_with_prefix(struct modification *mod,
+				     const char *prefix);
+
+// 0 = OK.
+int validate_refname(const char *name);
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 0000000000..f3dcaade2a
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.target = "destination", /* make sure it's not a symref. */
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { 0 };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	assert_err(err);
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		assert(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_conflict", &test_conflict);
+	return test_main(argc, argv);
+}
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 0000000000..cf8a8e854b
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1224 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable.h"
+#include "writer.h"
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		for (i = 0; i < st->readers_len; i++) {
+			reftable_reader_free(st->readers[i]);
+		}
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { 0 };
+			struct strbuf table_path = STRBUF_INIT;
+			strbuf_addstr(&table_path, st->reftable_dir);
+			strbuf_addstr(&table_path, "/");
+			strbuf_addstr(&table_path, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_clear(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64, min, max);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+	char **names;
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_reset(&nm);
+		strbuf_addstr(&nm, add->stack->list_file);
+		strbuf_addstr(&nm, "/");
+		strbuf_addstr(&nm, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	free_names(add->names);
+	add->names = NULL;
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = reftable_stack_reload(add->stack);
+
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+int stack_try_add(struct reftable_stack *st,
+		  int (*write_table)(struct reftable_writer *wr, void *arg),
+		  void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&temp_tab_file_name, "/");
+	strbuf_addbuf(&temp_tab_file_name, &next_name);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&tab_file_name, "/");
+	strbuf_addbuf(&tab_file_name, &next_name);
+
+	/* TODO: should check destination out of paranoia */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[first]));
+
+	strbuf_reset(temp_tab);
+	strbuf_addstr(temp_tab, st->reftable_dir);
+	strbuf_addstr(temp_tab, "/");
+	strbuf_addbuf(temp_tab, &next_name);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+int stack_write_compact(struct reftable_stack *st, struct reftable_writer *wr,
+			int first, int last,
+			struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.time < config->time) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_clear(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_clear(&ref);
+	reftable_log_record_clear(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
+		strbuf_addstr(&subtab_file_name, "/");
+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	strbuf_reset(&new_table_path);
+	strbuf_addstr(&new_table_path, st->reftable_dir);
+	strbuf_addstr(&new_table_path, "/");
+	strbuf_addbuf(&new_table_path, &new_table_name);
+
+	if (!is_empty_table) {
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { 0 };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_clear(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+int stack_check_addition(struct reftable_stack *st, const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { 0 };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { 0 };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_clear(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 0000000000..5ef138b025
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "reftable.h"
+#include "system.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+int stack_try_add(struct reftable_stack *st,
+		  int (*write_table)(struct reftable_writer *wr, void *arg),
+		  void *arg);
+int stack_write_compact(struct reftable_stack *st, struct reftable_writer *wr,
+			int first, int last,
+			struct reftable_log_expiry_config *config);
+int fastlog2(uint64_t sz);
+int stack_check_addition(struct reftable_stack *st, const char *new_tab_name);
+void reftable_addition_close(struct reftable_addition *add);
+int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+				      int reuse_open);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 0000000000..de806441dc
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,787 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void test_read_file(void)
+{
+	char fn[256] = "/tmp/stack.test_read_file.XXXXXX";
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	assert(fd > 0);
+	n = write(fd, out, strlen(out));
+	assert(n == strlen(out));
+	err = close(fd);
+	assert(err >= 0);
+
+	err = read_lines(fn, &names);
+	assert_err(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		assert(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	assert(NULL != names[0]);
+	assert(0 == strcmp(names[0], "line"));
+	assert(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	assert(names_equal(a, a));
+	assert(!names_equal(a, b));
+	assert(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { 0 };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.target = "master",
+	};
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	assert_err(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	assert_err(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	assert(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	assert_err(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	assert_err(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { 0 };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	assert_err(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	assert_err(err);
+
+	err = reftable_addition_commit(add);
+	assert_err(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.target = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.target = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		assert(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.target = "master",
+	};
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	assert(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		assert(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_ref_record refs[2] = { { 0 } };
+	struct reftable_log_record logs[2] = { { 0 } };
+	int N = ARRAY_SIZE(refs);
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].value = reftable_malloc(SHA1_SIZE);
+		refs[i].update_index = i + 1;
+		set_test_hash(refs[i].value, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		assert_err(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { 0 };
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		assert_err(err);
+		assert(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_clear(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { 0 };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		assert_err(err);
+		assert(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_clear(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_clear(&refs[i]);
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = {
+		.refname = "branch",
+		.update_index = 1,
+		.new_hash = h1,
+		.old_hash = h2,
+	};
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	input.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert(err == REFTABLE_API_ERROR);
+
+	input.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp(dest.message, "one\n"));
+
+	input.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert_err(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp(dest.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_clear(&dest);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { 0 } };
+	struct reftable_log_record logs[2] = { { 0 } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { 0 };
+	struct reftable_log_record log_dest = { 0 };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value, i);
+		}
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].new_hash, i);
+			logs[i].email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		assert_err(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	assert(err == 1);
+	reftable_ref_record_clear(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	assert(err == 1);
+	reftable_log_record_clear(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	assert(err == 1);
+	reftable_ref_record_clear(&dest);
+	reftable_log_record_clear(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_clear(&refs[i]);
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.target = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { 0 };
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	assert(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	assert_err(err);
+
+	assert(!strcmp(dest.target, ref.target));
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	reftable_clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	assert(1 == fastlog2(3));
+	assert(2 == fastlog2(4));
+	assert(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(segs[2].log == 3);
+	assert(segs[2].start == 5);
+	assert(segs[2].end == 6);
+
+	assert(segs[1].log == 2);
+	assert(segs[1].start == 2);
+	assert(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	uint64_t sizes[0];
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(seglen == 1);
+	assert(segs[0].start == 0);
+	assert(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	assert(min.start == 2);
+	assert(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	assert(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char dir[256] = "/tmp/stack.test_reflog_expire.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { 0 } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { 0 };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].time = i;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	assert_err(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	assert_err(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+	reftable_log_record_clear(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_stack *st2 = NULL;
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	assert_err(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	assert_err(err);
+	reftable_clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err, i;
+	int N = 100;
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.target = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		assert_err(err);
+
+		assert(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	assert(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_reftable_stack_uptodate",
+		      &test_reftable_stack_uptodate);
+	add_test_case("test_reftable_stack_transaction_api",
+		      &test_reftable_stack_transaction_api);
+	add_test_case("test_reftable_stack_hash_id",
+		      &test_reftable_stack_hash_id);
+	add_test_case("test_sizes_to_segments_all_equal",
+		      &test_sizes_to_segments_all_equal);
+	add_test_case("test_reftable_stack_auto_compaction",
+		      &test_reftable_stack_auto_compaction);
+	add_test_case("test_reftable_stack_validate_refname",
+		      &test_reftable_stack_validate_refname);
+	add_test_case("test_reftable_stack_update_index_check",
+		      &test_reftable_stack_update_index_check);
+	add_test_case("test_reftable_stack_lock_failure",
+		      &test_reftable_stack_lock_failure);
+	add_test_case("test_reftable_stack_log_normalize",
+		      &test_reftable_stack_log_normalize);
+	add_test_case("test_reftable_stack_tombstone",
+		      &test_reftable_stack_tombstone);
+	add_test_case("test_reftable_stack_add_one",
+		      &test_reftable_stack_add_one);
+	add_test_case("test_empty_add", test_empty_add);
+	add_test_case("test_reflog_expire", test_reflog_expire);
+	add_test_case("test_suggest_compaction_segment",
+		      &test_suggest_compaction_segment);
+	add_test_case("test_suggest_compaction_segment_nothing",
+		      &test_suggest_compaction_segment_nothing);
+	add_test_case("test_sizes_to_segments", &test_sizes_to_segments);
+	add_test_case("test_sizes_to_segments_empty",
+		      &test_sizes_to_segments_empty);
+	add_test_case("test_log2", &test_log2);
+	add_test_case("test_parse_names", &test_parse_names);
+	add_test_case("test_read_file", &test_read_file);
+	add_test_case("test_names_equal", &test_names_equal);
+	add_test_case("test_reftable_stack_add", &test_reftable_stack_add);
+	return test_main(argc, argv);
+}
diff --git a/reftable/update.sh b/reftable/update.sh
new file mode 100755
index 0000000000..4dadd875f4
--- /dev/null
+++ b/reftable/update.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+set -eu
+
+# Override this to import from somewhere else, say "../reftable".
+SRC=${SRC:-origin}
+BRANCH=${BRANCH:-master}
+
+((git --git-dir reftable-repo/.git fetch -f ${SRC} ${BRANCH}:import && cd reftable-repo && git checkout -f $(git rev-parse import) ) ||
+   git clone https://github.com/google/reftable reftable-repo)
+
+cp reftable-repo/c/*.[ch] reftable/
+cp reftable-repo/c/include/*.[ch] reftable/
+cp reftable-repo/LICENSE reftable/
+
+git --git-dir reftable-repo/.git show --no-patch --format=oneline HEAD \
+  > reftable/VERSION
+
+mv reftable/system.h reftable/system.h~
+sed 's|if REFTABLE_IN_GITCORE|if 1 /* REFTABLE_IN_GITCORE */|'  < reftable/system.h~ > reftable/system.h
+
+git add reftable/*.[ch] reftable/LICENSE reftable/VERSION
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 2273fdfe39..def8883439 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,9 +4,12 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH 13/13] reftable: "test-tool dump-reftable" command.
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (11 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2020-09-16 19:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-09-16 19:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 | 2 ++
 t/helper/test-reftable.c | 5 +++++
 t/helper/test-tool.c     | 1 +
 t/helper/test-tool.h     | 1 +
 4 files changed, 9 insertions(+)

diff --git a/Makefile b/Makefile
index 6652259baf..ee82ff2915 100644
--- a/Makefile
+++ b/Makefile
@@ -2363,6 +2363,8 @@ REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index def8883439..aff4fbccda 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 10366b7b76..9e689f9d2b 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -53,6 +53,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index d52ba2f5e5..bf833e01d4 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -17,6 +17,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-16 19:10 ` [PATCH 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-09-20  1:00   ` Junio C Hamano
  2020-09-21 13:13     ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-09-20  1:00 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +static void reftable_log_record_key(const void *r, struct strbuf *dest)
> +{
> +	const struct reftable_log_record *rec =
> +		(const struct reftable_log_record *)r;
> +	int len = strlen(rec->refname);
> +	uint8_t i64[8];
> +	uint64_t ts = 0;
> +	strbuf_reset(dest);
> +	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
> +
> +	ts = (~ts) - rec->update_index;
> +	put_be64(&i64[0], ts);
> +	strbuf_add(dest, i64, sizeof(i64));
> +}

We seem to be getting

reftable/record.c: In function 'reftable_log_record_key':
reftable/record.c:578:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
  put_be64(&i64[0], ts);
    ^
        CC reftable/refname.o

when this series is merged to 'seen'.

cf. e.g. https://travis-ci.org/github/git/git/jobs/728655368


Thanks.




^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-20  1:00   ` Junio C Hamano
@ 2020-09-21 13:13     ` Han-Wen Nienhuys
  2020-09-24  7:21       ` Jeff King
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-09-21 13:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Sun, Sep 20, 2020 at 3:00 AM Junio C Hamano <gitster@pobox.com> wrote:
> > +static void reftable_log_record_key(const void *r, struct strbuf *dest)
> > +{
> > +     const struct reftable_log_record *rec =
> > +             (const struct reftable_log_record *)r;
> > +     int len = strlen(rec->refname);
> > +     uint8_t i64[8];
> > +     uint64_t ts = 0;
> > +     strbuf_reset(dest);
> > +     strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
> > +
> > +     ts = (~ts) - rec->update_index;
> > +     put_be64(&i64[0], ts);
> > +     strbuf_add(dest, i64, sizeof(i64));
> > +}
>
> We seem to be getting
>
> reftable/record.c: In function 'reftable_log_record_key':
> reftable/record.c:578:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
>   put_be64(&i64[0], ts);
>     ^
>         CC reftable/refname.o
>
> when this series is merged to 'seen'.

Thanks for bringing this up. I did see this before, but I've been
unable to reproduce this locally, and I forgot about it.

The problem is actually triggered by the Git-provided put_be64()
(which appears unused in the Git source code). The definition

  #define put_be64(p, v) do { *(uint64_t *)(p) = htonll(v); } while (0)

is trying to reinterpret a char* as a uint64_t* , which is illegal in
strict aliasing rules? I originally had

+void put_be64(uint8_t *out, uint64_t v)
+{
+       int i = sizeof(uint64_t);
+        while (i--) {
+               out[i] = (uint8_t)(v & 0xff);
+               v >>= 8;
+       }
+}

in my reftable library, which is portable. Is there a reason for the
magic with htonll and friends?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-21 13:13     ` Han-Wen Nienhuys
@ 2020-09-24  7:21       ` Jeff King
  2020-09-24  7:31         ` Jeff King
  0 siblings, 1 reply; 251+ messages in thread
From: Jeff King @ 2020-09-24  7:21 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Junio C Hamano, Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Mon, Sep 21, 2020 at 03:13:05PM +0200, Han-Wen Nienhuys wrote:

> The problem is actually triggered by the Git-provided put_be64()
> (which appears unused in the Git source code). The definition
> 
>   #define put_be64(p, v) do { *(uint64_t *)(p) = htonll(v); } while (0)
> 
> is trying to reinterpret a char* as a uint64_t* , which is illegal in
> strict aliasing rules?

Ah, thanks for finding that. I think this resolves the mystery I had
around aliasing warnings with put_be32() in f39ad38410 (fast-export: use
local array to store anonymized oid, 2020-06-25). I got confused because
when I looked at the implementation of put_be32(), I stupidly looked at
the NO_ALIGNED_LOADS fallback code, which indeed does not alias. But the
one we use on x86 does.

I thought this would be OK because C allows aliasing through "char *"
pointers. But I think it may only go the other way. I.e., you can
type-pun a uint64_t as a char, but not the other way around. But then
why doesn't basically every use of put_be32() and put_be64() complain?

> I originally had
> 
> +void put_be64(uint8_t *out, uint64_t v)
> +{
> +       int i = sizeof(uint64_t);
> +        while (i--) {
> +               out[i] = (uint8_t)(v & 0xff);
> +               v >>= 8;
> +       }
> +}
> 
> in my reftable library, which is portable. Is there a reason for the
> magic with htonll and friends?

Presumably it was thought to be faster. This comes originally from the
block-sha1 code in 660231aa97 (block-sha1: support for architectures
with memory alignment restrictions, 2009-08-12). I don't know how it
compares in practice, and especially these days.

Our fallback routines are similar to an unrolled version of what you
wrote above.

-Peff

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-24  7:21       ` Jeff King
@ 2020-09-24  7:31         ` Jeff King
  2020-09-24 17:22           ` Junio C Hamano
  0 siblings, 1 reply; 251+ messages in thread
From: Jeff King @ 2020-09-24  7:31 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Junio C Hamano, Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Thu, Sep 24, 2020 at 03:21:51AM -0400, Jeff King wrote:

> > I originally had
> > 
> > +void put_be64(uint8_t *out, uint64_t v)
> > +{
> > +       int i = sizeof(uint64_t);
> > +        while (i--) {
> > +               out[i] = (uint8_t)(v & 0xff);
> > +               v >>= 8;
> > +       }
> > +}
> > 
> > in my reftable library, which is portable. Is there a reason for the
> > magic with htonll and friends?
> 
> Presumably it was thought to be faster. This comes originally from the
> block-sha1 code in 660231aa97 (block-sha1: support for architectures
> with memory alignment restrictions, 2009-08-12). I don't know how it
> compares in practice, and especially these days.
> 
> Our fallback routines are similar to an unrolled version of what you
> wrote above.

We should be able to measure it pretty easily, since block-sha1 uses a
lot of get_be32/put_be32. I generated a 4GB random file, built with
BLK_SHA1=Yes and -O2, and timed:

  t/helper/test-tool sha1 <foo.rand

Then I did the same, but building with -DNO_UNALIGNED_LOADS. The latter
actually ran faster, by a small margin. Here are the hyperfine results:

  [stock]
  Time (mean ± σ):      6.638 s ±  0.081 s    [User: 6.269 s, System: 0.368 s]
  Range (min … max):    6.550 s …  6.841 s    10 runs

  [-DNO_UNALIGNED_LOADS]
  Time (mean ± σ):      6.418 s ±  0.015 s    [User: 6.058 s, System: 0.360 s]
  Range (min … max):    6.394 s …  6.447 s    10 runs

For casual use as in reftables I doubt the difference is even
measurable. But this result implies that perhaps we ought to just be
using the fallback version all the time.

-Peff

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-09-24  7:31         ` Jeff King
@ 2020-09-24 17:22           ` Junio C Hamano
  0 siblings, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2020-09-24 17:22 UTC (permalink / raw)
  To: Jeff King
  Cc: Han-Wen Nienhuys, Han-Wen Nienhuys via GitGitGadget, git,
	Han-Wen Nienhuys

Jeff King <peff@peff.net> writes:

> Then I did the same, but building with -DNO_UNALIGNED_LOADS. The latter
> actually ran faster, by a small margin. Here are the hyperfine results:
>
>   [stock]
>   Time (mean ± σ):      6.638 s ±  0.081 s    [User: 6.269 s, System: 0.368 s]
>   Range (min … max):    6.550 s …  6.841 s    10 runs
>
>   [-DNO_UNALIGNED_LOADS]
>   Time (mean ± σ):      6.418 s ±  0.015 s    [User: 6.058 s, System: 0.360 s]
>   Range (min … max):    6.394 s …  6.447 s    10 runs
>
> For casual use as in reftables I doubt the difference is even
> measurable. But this result implies that perhaps we ought to just be
> using the fallback version all the time.

I like that one.  One less configurable knob that makes us execute
different codepaths is one less thing to be worried about.


^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v2 00/13] reftable library
  2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                   ` (12 preceding siblings ...)
  2020-09-16 19:10 ` [PATCH 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10 ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                     ` (13 more replies)
  13 siblings, 14 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

This splits the giant commit from 
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

The final commit should also be split up, but I want to wait until we have
consensus that the bottom commits look good.

Han-Wen Nienhuys (12):
  reftable: add LICENSE
  reftable: define the public API
  reftable: add a barebones unittest framework
  reftable: utility functions
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: read reftable files
  reftable: file level tests
  reftable: rest of library
  reftable: "test-tool dump-reftable" command.

Johannes Schindelin (1):
  vcxproj: adjust for the reftable changes

 Makefile                                   |   46 +-
 config.mak.uname                           |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm |   11 +-
 reftable/.gitattributes                    |    1 +
 reftable/LICENSE                           |   31 +
 reftable/VERSION                           |    1 +
 reftable/basics.c                          |  131 +++
 reftable/basics.h                          |   48 +
 reftable/block.c                           |  441 +++++++
 reftable/block.h                           |  129 ++
 reftable/block_test.c                      |  158 +++
 reftable/blocksource.c                     |  148 +++
 reftable/blocksource.h                     |   22 +
 reftable/compat.c                          |  110 ++
 reftable/compat.h                          |   48 +
 reftable/constants.h                       |   21 +
 reftable/dump.c                            |  212 ++++
 reftable/iter.c                            |  242 ++++
 reftable/iter.h                            |   72 ++
 reftable/merged.c                          |  358 ++++++
 reftable/merged.h                          |   36 +
 reftable/merged_test.c                     |  331 ++++++
 reftable/pq.c                              |  115 ++
 reftable/pq.h                              |   32 +
 reftable/publicbasics.c                    |  100 ++
 reftable/reader.c                          |  733 ++++++++++++
 reftable/reader.h                          |   78 ++
 reftable/record.c                          | 1116 ++++++++++++++++++
 reftable/record.h                          |  137 +++
 reftable/record_test.c                     |  410 +++++++
 reftable/refname.c                         |  209 ++++
 reftable/refname.h                         |   28 +
 reftable/refname_test.c                    |  100 ++
 reftable/reftable-tests.h                  |   22 +
 reftable/reftable.c                        |  104 ++
 reftable/reftable.h                        |  585 +++++++++
 reftable/reftable_test.c                   |  585 +++++++++
 reftable/stack.c                           | 1240 ++++++++++++++++++++
 reftable/stack.h                           |   40 +
 reftable/stack_test.c                      |  788 +++++++++++++
 reftable/strbuf.c                          |  142 +++
 reftable/strbuf.h                          |   80 ++
 reftable/strbuf_test.c                     |   37 +
 reftable/system.h                          |   51 +
 reftable/test_framework.c                  |   68 ++
 reftable/test_framework.h                  |   59 +
 reftable/tree.c                            |   63 +
 reftable/tree.h                            |   34 +
 reftable/tree_test.c                       |   62 +
 reftable/update.sh                         |   22 +
 reftable/writer.c                          |  673 +++++++++++
 reftable/writer.h                          |   51 +
 reftable/zlib-compat.c                     |   92 ++
 t/helper/test-reftable.c                   |   20 +
 t/helper/test-tool.c                       |    2 +
 t/helper/test-tool.h                       |    2 +
 56 files changed, 10474 insertions(+), 5 deletions(-)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/compat.c
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable.h
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/strbuf.c
 create mode 100644 reftable/strbuf.h
 create mode 100644 reftable/strbuf_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100755 reftable/update.sh
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c


base-commit: 306ee63a703ad67c54ba1209dc11dd9ea500dc1f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v2
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v1:

  1:  f697eac228 =  1:  6228103b4a reftable: add LICENSE
  2:  e2dbc84833 =  2:  5d1b946ab5 reftable: define the public API
  3:  0f6efc065c =  3:  01a669a731 vcxproj: adjust for the reftable changes
  4:  7bf59d491d !  4:  b94c5f5c61 reftable: add a barebones unittest framework
     @@ reftable/test_framework.c (new)
      +#include "system.h"
      +#include "basics.h"
      +
     -+struct test_case **test_cases;
     -+int test_case_len;
     -+int test_case_cap;
     ++static struct test_case **test_cases;
     ++static int test_case_len;
     ++static int test_case_cap;
      +
     -+struct test_case *new_test_case(const char *name, void (*testfunc)(void))
     ++static struct test_case *new_test_case(const char *name, void (*testfunc)(void))
      +{
      +	struct test_case *tc = reftable_malloc(sizeof(struct test_case));
      +	tc->name = name;
     @@ reftable/test_framework.c (new)
      +		reftable_free(test_cases[i]);
      +	}
      +	reftable_free(test_cases);
     -+	test_cases = 0;
     ++	test_cases = NULL;
      +	test_case_len = 0;
      +	test_case_cap = 0;
      +	return 0;
     @@ reftable/test_framework.h (new)
      +	void (*testfunc)(void);
      +};
      +
     -+struct test_case *new_test_case(const char *name, void (*testfunc)(void));
      +struct test_case *add_test_case(const char *name, void (*testfunc)(void));
      +int test_main(int argc, const char *argv[]);
      +
  5:  570a8c4bca !  5:  4190da597e reftable: utility functions
     @@ reftable/blocksource.c (new)
      +	return ((struct strbuf *)b)->len;
      +}
      +
     -+struct reftable_block_source_vtable strbuf_vtable = {
     ++static struct reftable_block_source_vtable strbuf_vtable = {
      +	.size = &strbuf_size,
      +	.read_block = &strbuf_read_block,
      +	.return_block = &strbuf_return_block,
     @@ reftable/blocksource.c (new)
      +	reftable_free(dest->data);
      +}
      +
     -+struct reftable_block_source_vtable malloc_vtable = {
     ++static struct reftable_block_source_vtable malloc_vtable = {
      +	.return_block = &malloc_return_block,
      +};
      +
     -+struct reftable_block_source malloc_block_source_instance = {
     ++static struct reftable_block_source malloc_block_source_instance = {
      +	.ops = &malloc_vtable,
      +};
      +
     @@ reftable/blocksource.c (new)
      +	return size;
      +}
      +
     -+struct reftable_block_source_vtable file_vtable = {
     ++static struct reftable_block_source_vtable file_vtable = {
      +	.size = &file_size,
      +	.read_block = &file_read_block,
      +	.return_block = &file_return_block,
     @@ reftable/publicbasics.c (new)
      +	}
      +}
      +
     -+void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
     -+void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
     -+void (*reftable_free_ptr)(void *) = &free;
     ++static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
     ++static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
     ++static void (*reftable_free_ptr)(void *) = &free;
      +
      +void *reftable_malloc(size_t sz)
      +{
  6:  791f69c000 !  6:  8eb944ea9b reftable: (de)serialization for the polymorphic record type.
     @@ reftable/record.c (new)
      +		(const struct reftable_ref_record *)p);
      +}
      +
     -+struct reftable_record_vtable reftable_ref_record_vtable = {
     ++static struct reftable_record_vtable reftable_ref_record_vtable = {
      +	.key = &reftable_ref_record_key,
      +	.type = BLOCK_TYPE_REF,
      +	.copy_from = &reftable_ref_record_copy_from,
     @@ reftable/record.c (new)
      +	return 0;
      +}
      +
     -+struct reftable_record_vtable reftable_obj_record_vtable = {
     ++static struct reftable_record_vtable reftable_obj_record_vtable = {
      +	.key = &reftable_obj_record_key,
      +	.type = BLOCK_TYPE_OBJ,
      +	.copy_from = &reftable_obj_record_copy_from,
     @@ reftable/record.c (new)
      +		(const struct reftable_log_record *)p);
      +}
      +
     -+struct reftable_record_vtable reftable_log_record_vtable = {
     ++static struct reftable_record_vtable reftable_log_record_vtable = {
      +	.key = &reftable_log_record_key,
      +	.type = BLOCK_TYPE_LOG,
      +	.copy_from = &reftable_log_record_copy_from,
     @@ reftable/record.c (new)
      +	return rec;
      +}
      +
     -+void *reftable_record_yield(struct reftable_record *rec)
     ++/* clear out the record, yielding the reftable_record data that was
     ++ * encapsulated. */
     ++static void *reftable_record_yield(struct reftable_record *rec)
      +{
      +	void *p = rec->data;
      +	rec->data = NULL;
     @@ reftable/record.c (new)
      +	return start.len - in.len;
      +}
      +
     -+struct reftable_record_vtable reftable_index_record_vtable = {
     ++static struct reftable_record_vtable reftable_index_record_vtable = {
      +	.key = &reftable_index_record_key,
      +	.type = BLOCK_TYPE_INDEX,
      +	.copy_from = &reftable_index_record_copy_from,
     @@ reftable/record.h (new)
      +/* creates a malloced record of the given type. Dispose with record_destroy */
      +struct reftable_record reftable_new_record(uint8_t typ);
      +
     -+extern struct reftable_record_vtable reftable_ref_record_vtable;
     -+
      +/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
      +   number of bytes written. */
      +int reftable_encode_key(int *is_restart, struct string_view dest,
     @@ reftable/record.h (new)
      +/* zeroes out the embedded record */
      +void reftable_record_clear(struct reftable_record *rec);
      +
     -+/* clear out the record, yielding the reftable_record data that was
     -+ * encapsulated. */
     -+void *reftable_record_yield(struct reftable_record *rec);
     -+
      +/* clear and deallocate embedded record, and zero `rec`. */
      +void reftable_record_destroy(struct reftable_record *rec);
      +
     @@ reftable/record_test.c (new)
      +	int i = 0;
      +
      +	for (i = 0; i <= 3; i++) {
     -+		struct reftable_ref_record in = { 0 };
     ++		struct reftable_ref_record in = { NULL };
      +		struct reftable_ref_record out = {
      +			.refname = xstrdup("old name"),
      +			.value = reftable_calloc(SHA1_SIZE),
      +			.target_value = reftable_calloc(SHA1_SIZE),
      +			.target = xstrdup("old value"),
      +		};
     -+		struct reftable_record rec_out = { 0 };
     ++		struct reftable_record rec_out = { NULL };
      +		struct strbuf key = STRBUF_INIT;
     -+		struct reftable_record rec = { 0 };
     ++		struct reftable_record rec = { NULL };
      +		uint8_t buffer[1024] = { 0 };
      +		struct string_view dest = {
      +			.buf = buffer,
     @@ reftable/record_test.c (new)
      +	set_test_hash(in[0].new_hash, 1);
      +	set_test_hash(in[0].old_hash, 2);
      +	for (int i = 0; i < ARRAY_SIZE(in); i++) {
     -+		struct reftable_record rec = { 0 };
     ++		struct reftable_record rec = { NULL };
      +		struct strbuf key = STRBUF_INIT;
      +		uint8_t buffer[1024] = { 0 };
      +		struct string_view dest = {
     @@ reftable/record_test.c (new)
      +			.email = xstrdup("old@email"),
      +			.message = xstrdup("old message"),
      +		};
     -+		struct reftable_record rec_out = { 0 };
     ++		struct reftable_record rec_out = { NULL };
      +		int n, m, valtype;
      +
      +		reftable_record_from_log(&rec, &in[i]);
     @@ reftable/record_test.c (new)
      +			.buf = buffer,
      +			.len = sizeof(buffer),
      +		};
     -+		struct reftable_record rec = { 0 };
     ++		struct reftable_record rec = { NULL };
      +		struct strbuf key = STRBUF_INIT;
     -+		struct reftable_obj_record out = { 0 };
     -+		struct reftable_record rec_out = { 0 };
     ++		struct reftable_obj_record out = { NULL };
     ++		struct reftable_record rec_out = { NULL };
      +		int n, m;
      +		uint8_t extra;
      +
     @@ reftable/record_test.c (new)
      +		.len = sizeof(buffer),
      +	};
      +	struct strbuf key = STRBUF_INIT;
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	struct reftable_index_record out = { .last_key = STRBUF_INIT };
      +	struct reftable_record out_rec = { NULL };
      +	int n, m;
  7:  1ba8e3eb30 !  7:  757dd30fe2 reftable: reading/writing blocks
     @@ reftable/block.c (new)
      +	abort();
      +}
      +
     -+int block_writer_register_restart(struct block_writer *w, int n, int restart,
     -+				  struct strbuf *key);
     ++static int block_writer_register_restart(struct block_writer *w, int n,
     ++					 int is_restart, struct strbuf *key)
     ++{
     ++	int rlen = w->restart_len;
     ++	if (rlen >= MAX_RESTARTS) {
     ++		is_restart = 0;
     ++	}
     ++
     ++	if (is_restart) {
     ++		rlen++;
     ++	}
     ++	if (2 + 3 * rlen + n > w->block_size - w->next)
     ++		return -1;
     ++	if (is_restart) {
     ++		if (w->restart_len == w->restart_cap) {
     ++			w->restart_cap = w->restart_cap * 2 + 1;
     ++			w->restarts = reftable_realloc(
     ++				w->restarts, sizeof(uint32_t) * w->restart_cap);
     ++		}
     ++
     ++		w->restarts[w->restart_len++] = w->next;
     ++	}
     ++
     ++	w->next += n;
     ++
     ++	strbuf_reset(&w->last_key);
     ++	strbuf_addbuf(&w->last_key, key);
     ++	w->entries++;
     ++	return 0;
     ++}
      +
      +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
      +		       uint32_t block_size, uint32_t header_off, int hash_size)
     @@ reftable/block.c (new)
      +	return -1;
      +}
      +
     -+int block_writer_register_restart(struct block_writer *w, int n, int is_restart,
     -+				  struct strbuf *key)
     -+{
     -+	int rlen = w->restart_len;
     -+	if (rlen >= MAX_RESTARTS) {
     -+		is_restart = 0;
     -+	}
     -+
     -+	if (is_restart) {
     -+		rlen++;
     -+	}
     -+	if (2 + 3 * rlen + n > w->block_size - w->next)
     -+		return -1;
     -+	if (is_restart) {
     -+		if (w->restart_len == w->restart_cap) {
     -+			w->restart_cap = w->restart_cap * 2 + 1;
     -+			w->restarts = reftable_realloc(
     -+				w->restarts, sizeof(uint32_t) * w->restart_cap);
     -+		}
     -+
     -+		w->restarts[w->restart_len++] = w->next;
     -+	}
     -+
     -+	w->next += n;
     -+
     -+	strbuf_reset(&w->last_key);
     -+	strbuf_addbuf(&w->last_key, key);
     -+	w->entries++;
     -+	return 0;
     -+}
      +
      +int block_writer_finish(struct block_writer *w)
      +{
     @@ reftable/block_test.c (new)
      +	char *names[30];
      +	const int N = ARRAY_SIZE(names);
      +	const int block_size = 1024;
     -+	struct reftable_block block = { 0 };
     ++	struct reftable_block block = { NULL };
      +	struct block_writer bw = {
      +		.last_key = STRBUF_INIT,
      +	};
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_record rec = { NULL };
      +	int i = 0;
      +	int n;
      +	struct block_reader br = { 0 };
  8:  17fb8d050d !  8:  e30a7e0281 reftable: a generic binary tree implementation
     @@ reftable/tree_test.c (new)
      +{
      +	struct tree_node *root = NULL;
      +
     -+	void *values[11] = { 0 };
     -+	struct tree_node *nodes[11] = { 0 };
     ++	void *values[11] = { NULL };
     ++	struct tree_node *nodes[11] = { NULL };
      +	int i = 1;
     -+	struct curry c = { 0 };
     ++	struct curry c = { NULL };
      +	do {
      +		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
      +		i = (i * 7) % 11;
  9:  4e40fc3b40 !  9:  68aee16e60 reftable: write reftable files
     @@ reftable/writer.c (new)
      +#include "reftable.h"
      +#include "tree.h"
      +
     ++/* finishes a block, and writes it to storage */
     ++static int writer_flush_block(struct reftable_writer *w);
     ++
     ++/* deallocates memory related to the index */
     ++static void writer_clear_index(struct reftable_writer *w);
     ++
     ++/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
     ++static int writer_finish_public_section(struct reftable_writer *w);
     ++
      +static struct reftable_block_stats *
      +writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
      +{
     @@ reftable/writer.c (new)
      +int reftable_writer_add_ref(struct reftable_writer *w,
      +			    struct reftable_ref_record *ref)
      +{
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	struct reftable_ref_record copy = *ref;
      +	int err = 0;
      +
     @@ reftable/writer.c (new)
      +int reftable_writer_add_log(struct reftable_writer *w,
      +			    struct reftable_log_record *log)
      +{
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	char *input_log_message = log->message;
      +	struct strbuf cleaned_message = STRBUF_INIT;
      +	int err;
     @@ reftable/writer.c (new)
      +		w->index_len = 0;
      +		w->index_cap = 0;
      +		for (i = 0; i < idx_len; i++) {
     -+			struct reftable_record rec = { 0 };
     ++			struct reftable_record rec = { NULL };
      +			reftable_record_from_index(&rec, idx + i);
      +			if (block_writer_add(w->block_writer, &rec) == 0) {
      +				continue;
     @@ reftable/writer.c (new)
      +		.offsets = entry->offsets,
      +		.offset_len = entry->offset_len,
      +	};
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	if (arg->err < 0)
      +		goto done;
      +
     @@ reftable/writer.c (new)
      +static int writer_dump_object_index(struct reftable_writer *w)
      +{
      +	struct write_record_arg closure = { .w = w };
     -+	struct common_prefix_arg common = { 0 };
     ++	struct common_prefix_arg common = { NULL };
      +	if (w->obj_index_tree != NULL) {
      +		infix_walk(w->obj_index_tree, &update_common, &common);
      +	}
     @@ reftable/writer.c (new)
      +	return writer_finish_section(w);
      +}
      +
     -+int writer_finish_public_section(struct reftable_writer *w)
     ++static int writer_finish_public_section(struct reftable_writer *w)
      +{
      +	uint8_t typ = 0;
      +	int err = 0;
     @@ reftable/writer.c (new)
      +	return err;
      +}
      +
     -+void writer_clear_index(struct reftable_writer *w)
     ++static void writer_clear_index(struct reftable_writer *w)
      +{
      +	int i = 0;
      +	for (i = 0; i < w->index_len; i++) {
     @@ reftable/writer.c (new)
      +	w->index_cap = 0;
      +}
      +
     -+const int debug = 0;
     ++static const int debug = 0;
      +
      +static int writer_flush_nonempty_block(struct reftable_writer *w)
      +{
     @@ reftable/writer.c (new)
      +	return 0;
      +}
      +
     -+int writer_flush_block(struct reftable_writer *w)
     ++static int writer_flush_block(struct reftable_writer *w)
      +{
      +	if (w->block_writer == NULL)
      +		return 0;
     @@ reftable/writer.h (new)
      +	struct reftable_stats stats;
      +};
      +
     -+/* finishes a block, and writes it to storage */
     -+int writer_flush_block(struct reftable_writer *w);
     -+
     -+/* deallocates memory related to the index */
     -+void writer_clear_index(struct reftable_writer *w);
     -+
     -+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
     -+int writer_finish_public_section(struct reftable_writer *w);
     -+
      +#endif
 10:  25259b683b ! 10:  c196de7f06 reftable: read reftable files
     @@ reftable/iter.c (new)
      +{
      +}
      +
     -+struct reftable_iterator_vtable empty_vtable = {
     ++static struct reftable_iterator_vtable empty_vtable = {
      +	.next = &empty_iterator_next,
      +	.close = &empty_iterator_close,
      +};
     @@ reftable/iter.c (new)
      +	return err;
      +}
      +
     -+struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
     ++static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
      +	.next = &filtering_ref_iterator_next,
      +	.close = &filtering_ref_iterator_close,
      +};
     @@ reftable/iter.c (new)
      +	return err;
      +}
      +
     -+struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
     ++static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
      +	.next = &indexed_table_ref_iter_next,
      +	.close = &indexed_table_ref_iter_close,
      +};
     @@ reftable/reader.c (new)
      +	block_iter_close(&ti->bi);
      +}
      +
     -+struct reftable_iterator_vtable table_iter_vtable = {
     ++static struct reftable_iterator_vtable table_iter_vtable = {
      +	.next = &table_iter_next_void,
      +	.close = &table_iter_close,
      +};
 11:  3448fcf828 ! 11:  bf6b929b86 reftable: file level tests
     @@ reftable/reftable_test.c (new)
      +{
      +	struct strbuf buf = STRBUF_INIT;
      +	struct reftable_block_source source = { NULL };
     -+	struct reftable_block out = { 0 };
     ++	struct reftable_block out = { NULL };
      +	int n;
      +	uint8_t in[] = "hello";
      +	strbuf_add(&buf, in, sizeof(in));
     @@ reftable/reftable_test.c (new)
      +	};
      +	struct reftable_writer *w =
      +		reftable_new_writer(&strbuf_add_void, buf, &opts);
     -+	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
      +	int i = 0, n;
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_log_record log = { NULL };
      +	const struct reftable_stats *stats = NULL;
      +	*names = reftable_calloc(sizeof(char *) * (N + 1));
      +	reftable_writer_set_limits(w, update_index, update_index);
     @@ reftable/reftable_test.c (new)
      +	struct reftable_write_options opts = {
      +		.block_size = 256,
      +	};
     -+	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
      +	int i = 0;
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_log_record log = { NULL };
      +	int n;
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_reader rd = { 0 };
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_reader rd = { NULL };
     ++	struct reftable_block_source source = { NULL };
      +	struct strbuf buf = STRBUF_INIT;
      +	struct reftable_writer *w =
      +		reftable_new_writer(&strbuf_add_void, &buf, &opts);
     @@ reftable/reftable_test.c (new)
      +	reftable_writer_set_limits(w, 0, N);
      +	for (i = 0; i < N; i++) {
      +		char name[256];
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
      +		names[i] = xstrdup(name);
      +		ref.refname = name;
     @@ reftable/reftable_test.c (new)
      +	}
      +	for (i = 0; i < N; i++) {
      +		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
     -+		struct reftable_log_record log = { 0 };
     ++		struct reftable_log_record log = { NULL };
      +		set_test_hash(hash1, i);
      +		set_test_hash(hash2, i + 1);
      +
     @@ reftable/reftable_test.c (new)
      +	char **names;
      +	struct strbuf buf = STRBUF_INIT;
      +	int N = 50;
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_block_source source = { 0 };
     -+	struct reftable_reader rd = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_block_source source = { NULL };
     ++	struct reftable_reader rd = { NULL };
      +	int err = 0;
      +	int j = 0;
      +
     @@ reftable/reftable_test.c (new)
      +	assert_err(err);
      +
      +	while (1) {
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +		int r = reftable_iterator_next_ref(&it, &ref);
      +		assert(r >= 0);
      +		if (r > 0) {
     @@ reftable/reftable_test.c (new)
      +	char **names;
      +	struct strbuf buf = STRBUF_INIT;
      +	int N = 50;
     -+	struct reftable_reader rd = { 0 };
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_reader rd = { NULL };
     ++	struct reftable_block_source source = { NULL };
      +	int err;
      +	int i;
     -+	struct reftable_log_record log = { 0 };
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_log_record log = { NULL };
     ++	struct reftable_iterator it = { NULL };
      +
      +	write_table(&names, &buf, N, 256, SHA1_ID);
      +
     @@ reftable/reftable_test.c (new)
      +	char **names;
      +	struct strbuf buf = STRBUF_INIT;
      +	int N = 50;
     -+	struct reftable_reader rd = { 0 };
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_reader rd = { NULL };
     ++	struct reftable_block_source source = { NULL };
      +	int err;
      +	int i = 0;
      +
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_iterator it = { NULL };
      +	struct strbuf pastLast = STRBUF_INIT;
     -+	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
      +
      +	write_table(&names, &buf, N, 256, hash_id);
      +
     @@ reftable/reftable_test.c (new)
      +
      +	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
      +	if (err == 0) {
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +		int err = reftable_iterator_next_ref(&it, &ref);
      +		assert(err > 0);
      +	} else {
     @@ reftable/reftable_test.c (new)
      +	struct reftable_write_options opts = {
      +		.block_size = 256,
      +	};
     -+	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
      +	int i = 0;
      +	int n;
      +	int err;
      +	struct reftable_reader rd;
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_block_source source = { NULL };
      +
      +	struct strbuf buf = STRBUF_INIT;
      +	struct reftable_writer *w =
      +		reftable_new_writer(&strbuf_add_void, &buf, &opts);
      +
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_iterator it = { NULL };
      +	int j;
      +
      +	set_test_hash(want_hash, 4);
     @@ reftable/reftable_test.c (new)
      +		char name[100];
      +		uint8_t hash1[SHA1_SIZE];
      +		uint8_t hash2[SHA1_SIZE];
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +
      +		memset(hash, i, sizeof(hash));
      +		memset(fill, 'x', 50);
     @@ reftable/reftable_test.c (new)
      +	struct strbuf buf = STRBUF_INIT;
      +	struct reftable_writer *w =
      +		reftable_new_writer(&strbuf_add_void, &buf, &opts);
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_block_source source = { NULL };
      +	struct reftable_reader *rd = NULL;
     -+	struct reftable_ref_record rec = { 0 };
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_ref_record rec = { NULL };
     ++	struct reftable_iterator it = { NULL };
      +	int err;
      +
      +	reftable_writer_set_limits(w, 1, 1);
 12:  64d98e60b2 ! 12:  4e38db7f48 reftable: rest of library
     @@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
      
       ## reftable/VERSION (new) ##
      @@
     -+b8ec0f74c74cb6752eb2033ad8e755a9c19aad15 C: add missing header
     ++7134eb9f8171a9759800f4187f9e6dde997335e7 C: NULL iso 0 for init
      
       ## reftable/dump.c (new) ##
      @@
     @@ reftable/dump.c (new)
      +
      +static int dump_table(const char *tablename)
      +{
     -+	struct reftable_block_source src = { 0 };
     ++	struct reftable_block_source src = { NULL };
      +	int err = reftable_block_source_from_file(&src, tablename);
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
      +	struct reftable_reader *r = NULL;
      +
      +	if (err < 0)
     @@ reftable/dump.c (new)
      +{
      +	struct reftable_stack *stack = NULL;
      +	struct reftable_write_options cfg = {};
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
      +	struct reftable_merged_table *merged = NULL;
      +
      +	int err = reftable_new_stack(&stack, stackdir, cfg);
     @@ reftable/merged.c (new)
      +	return merged_iter_next(mi, rec);
      +}
      +
     -+struct reftable_iterator_vtable merged_iter_vtable = {
     ++static struct reftable_iterator_vtable merged_iter_vtable = {
      +	.next = &merged_iter_next_void,
      +	.close = &merged_iter_close,
      +};
     @@ reftable/merged.c (new)
      +	return mt->min;
      +}
      +
     -+int merged_table_seek_record(struct reftable_merged_table *mt,
     -+			     struct reftable_iterator *it,
     -+			     struct reftable_record *rec)
     ++static int merged_table_seek_record(struct reftable_merged_table *mt,
     ++				    struct reftable_iterator *it,
     ++				    struct reftable_record *rec)
      +{
      +	struct reftable_iterator *iters = reftable_calloc(
      +		sizeof(struct reftable_iterator) * mt->stack_len);
     @@ reftable/merged.c (new)
      +	struct reftable_ref_record ref = {
      +		.refname = (char *)name,
      +	};
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_ref(&rec, &ref);
      +	return merged_table_seek_record(mt, it, &rec);
      +}
     @@ reftable/merged.c (new)
      +		.refname = (char *)name,
      +		.update_index = update_index,
      +	};
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_log(&rec, &log);
      +	return merged_table_seek_record(mt, it, &rec);
      +}
     @@ reftable/merged.h (new)
      +};
      +
      +void merged_table_clear(struct reftable_merged_table *mt);
     -+int merged_table_seek_record(struct reftable_merged_table *mt,
     -+			     struct reftable_iterator *it,
     -+			     struct reftable_record *rec);
      +
      +#endif
      
     @@ reftable/merged_test.c (new)
      +
      +static void test_pq(void)
      +{
     -+	char *names[54] = { 0 };
     ++	char *names[54] = { NULL };
      +	int N = ARRAY_SIZE(names) - 1;
      +
     -+	struct merged_iter_pqueue pq = { 0 };
     ++	struct merged_iter_pqueue pq = { NULL };
      +	const char *last = NULL;
      +
      +	int i = 0;
     @@ reftable/merged_test.c (new)
      +	struct reftable_merged_table *mt =
      +		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
      +	int i;
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_iterator it = { NULL };
      +	int err = reftable_merged_table_seek_ref(mt, &it, "a");
      +	assert_err(err);
      +
     @@ reftable/merged_test.c (new)
      +	struct reftable_merged_table *mt =
      +		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
      +
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_iterator it = { NULL };
      +	int err = reftable_merged_table_seek_ref(mt, &it, "a");
      +	struct reftable_ref_record *out = NULL;
      +	size_t len = 0;
     @@ reftable/merged_test.c (new)
      +
      +	assert_err(err);
      +	while (len < 100) { /* cap loops/recursion. */
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +		int err = reftable_iterator_next_ref(&it, &ref);
      +		if (err > 0) {
      +			break;
     @@ reftable/merged_test.c (new)
      +		.update_index = 1,
      +	};
      +	int err;
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_block_source source = { NULL };
      +	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
      +	uint32_t hash_id;
      +	struct reftable_reader *rd = NULL;
     @@ reftable/pq.c (new)
      +#include "system.h"
      +#include "basics.h"
      +
     -+int pq_less(struct pq_entry a, struct pq_entry b)
     ++static int pq_less(struct pq_entry a, struct pq_entry b)
      +{
      +	struct strbuf ak = STRBUF_INIT;
      +	struct strbuf bk = STRBUF_INIT;
     @@ reftable/pq.h (new)
      +	struct reftable_record rec;
      +};
      +
     -+int pq_less(struct pq_entry a, struct pq_entry b);
     -+
      +struct merged_iter_pqueue {
      +	struct pq_entry *heap;
      +	size_t len;
     @@ reftable/refname.c (new)
      +	return strcmp(f_arg->names[k], f_arg->want) >= 0;
      +}
      +
     -+int modification_has_ref(struct modification *mod, const char *name)
     ++static int modification_has_ref(struct modification *mod, const char *name)
      +{
     -+	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_ref_record ref = { NULL };
      +	int err = 0;
      +
      +	if (mod->add_len > 0) {
     @@ reftable/refname.c (new)
      +	mod->del_len = 0;
      +}
      +
     -+int modification_has_ref_with_prefix(struct modification *mod,
     -+				     const char *prefix)
     ++static int modification_has_ref_with_prefix(struct modification *mod,
     ++					    const char *prefix)
      +{
      +	struct reftable_iterator it = { NULL };
      +	struct reftable_ref_record ref = { NULL };
     @@ reftable/refname.c (new)
      +	return err;
      +}
      +
     -+int validate_refname(const char *name)
     ++static int validate_refname(const char *name)
      +{
      +	while (1) {
      +		char *next = strchr(name, '/');
     @@ reftable/refname.h (new)
      +	size_t del_len;
      +};
      +
     -+// -1 = error, 0 = found, 1 = not found
     -+int modification_has_ref(struct modification *mod, const char *name);
     -+
     -+// -1 = error, 0 = found, 1 = not found.
     -+int modification_has_ref_with_prefix(struct modification *mod,
     -+				     const char *prefix);
     -+
     -+// 0 = OK.
     -+int validate_refname(const char *name);
     -+
      +int validate_ref_record_addition(struct reftable_table tab,
      +				 struct reftable_ref_record *recs, size_t sz);
      +
     @@ reftable/refname_test.c (new)
      +	};
      +	int err;
      +	int i;
     -+	struct reftable_block_source source = { 0 };
     ++	struct reftable_block_source source = { NULL };
      +	struct reftable_reader *rd = NULL;
      +	struct reftable_table tab = { NULL };
      +	struct testcase cases[] = {
     @@ reftable/refname_test.c (new)
      +	return test_main(argc, argv);
      +}
      
     + ## reftable/reftable.c ##
     +@@ reftable/reftable.c: int reftable_table_seek_ref(struct reftable_table *tab,
     + 	struct reftable_ref_record ref = {
     + 		.refname = (char *)name,
     + 	};
     +-	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
     + 	reftable_record_from_ref(&rec, &ref);
     + 	return tab->ops->seek_record(tab->table_arg, it, &rec);
     + }
     +@@ reftable/reftable.c: void reftable_table_from_reader(struct reftable_table *tab,
     + int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     + 			    struct reftable_ref_record *ref)
     + {
     +-	struct reftable_iterator it = { 0 };
     ++	struct reftable_iterator it = { NULL };
     + 	int err = reftable_table_seek_ref(tab, &it, name);
     + 	if (err)
     + 		goto done;
     +
       ## reftable/stack.c (new) ##
      @@
      +/*
     @@ reftable/stack.c (new)
      +#include "reftable.h"
      +#include "writer.h"
      +
     ++static int stack_try_add(struct reftable_stack *st,
     ++			 int (*write_table)(struct reftable_writer *wr,
     ++					    void *arg),
     ++			 void *arg);
     ++static int stack_write_compact(struct reftable_stack *st,
     ++			       struct reftable_writer *wr, int first, int last,
     ++			       struct reftable_log_expiry_config *config);
     ++static int stack_check_addition(struct reftable_stack *st,
     ++				const char *new_tab_name);
     ++static void reftable_addition_close(struct reftable_addition *add);
     ++static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
     ++					     int reuse_open);
     ++
      +int reftable_new_stack(struct reftable_stack **dest, const char *dir,
      +		       struct reftable_write_options config)
      +{
     @@ reftable/stack.c (new)
      +		}
      +
      +		if (rd == NULL) {
     -+			struct reftable_block_source src = { 0 };
     ++			struct reftable_block_source src = { NULL };
      +			struct strbuf table_path = STRBUF_INIT;
      +			strbuf_addstr(&table_path, st->reftable_dir);
      +			strbuf_addstr(&table_path, "/");
     @@ reftable/stack.c (new)
      +	return udiff;
      +}
      +
     -+int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, int reuse_open)
     ++static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
     ++					     int reuse_open)
      +{
      +	struct timeval deadline = { 0 };
      +	int err = gettimeofday(&deadline, NULL);
     @@ reftable/stack.c (new)
      +	return err;
      +}
      +
     -+void reftable_addition_close(struct reftable_addition *add)
     ++static void reftable_addition_close(struct reftable_addition *add)
      +{
      +	int i = 0;
      +	struct strbuf nm = STRBUF_INIT;
     @@ reftable/stack.c (new)
      +	return err;
      +}
      +
     -+int stack_try_add(struct reftable_stack *st,
     -+		  int (*write_table)(struct reftable_writer *wr, void *arg),
     -+		  void *arg)
     ++static int stack_try_add(struct reftable_stack *st,
     ++			 int (*write_table)(struct reftable_writer *wr,
     ++					    void *arg),
     ++			 void *arg)
      +{
      +	struct reftable_addition add = REFTABLE_ADDITION_INIT;
      +	int err = reftable_stack_init_addition(&add, st);
     @@ reftable/stack.c (new)
      +	return err;
      +}
      +
     -+int stack_write_compact(struct reftable_stack *st, struct reftable_writer *wr,
     -+			int first, int last,
     -+			struct reftable_log_expiry_config *config)
     ++static int stack_write_compact(struct reftable_stack *st,
     ++			       struct reftable_writer *wr, int first, int last,
     ++			       struct reftable_log_expiry_config *config)
      +{
      +	int subtabs_len = last - first + 1;
      +	struct reftable_table *subtabs = reftable_calloc(
      +		sizeof(struct reftable_table) * (last - first + 1));
      +	struct reftable_merged_table *mt = NULL;
      +	int err = 0;
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
      +
      +	uint64_t entries = 0;
      +
     @@ reftable/stack.c (new)
      +int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
      +			    struct reftable_log_record *log)
      +{
     -+	struct reftable_iterator it = { 0 };
     ++	struct reftable_iterator it = { NULL };
      +	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
      +	int err = reftable_merged_table_seek_log(mt, &it, refname);
      +	if (err)
     @@ reftable/stack.c (new)
      +	return err;
      +}
      +
     -+int stack_check_addition(struct reftable_stack *st, const char *new_tab_name)
     ++static int stack_check_addition(struct reftable_stack *st,
     ++				const char *new_tab_name)
      +{
      +	int err = 0;
     -+	struct reftable_block_source src = { 0 };
     ++	struct reftable_block_source src = { NULL };
      +	struct reftable_reader *rd = NULL;
      +	struct reftable_table tab = { NULL };
      +	struct reftable_ref_record *refs = NULL;
     @@ reftable/stack.c (new)
      +		goto done;
      +
      +	while (1) {
     -+		struct reftable_ref_record ref = { 0 };
     ++		struct reftable_ref_record ref = { NULL };
      +		err = reftable_iterator_next_ref(&it, &ref);
      +		if (err > 0) {
      +			break;
     @@ reftable/stack.h (new)
      +};
      +
      +int read_lines(const char *filename, char ***lines);
     -+int stack_try_add(struct reftable_stack *st,
     -+		  int (*write_table)(struct reftable_writer *wr, void *arg),
     -+		  void *arg);
     -+int stack_write_compact(struct reftable_stack *st, struct reftable_writer *wr,
     -+			int first, int last,
     -+			struct reftable_log_expiry_config *config);
     -+int fastlog2(uint64_t sz);
     -+int stack_check_addition(struct reftable_stack *st, const char *new_tab_name);
     -+void reftable_addition_close(struct reftable_addition *add);
     -+int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
     -+				      int reuse_open);
      +
      +struct segment {
      +	int start, end;
     @@ reftable/stack.h (new)
      +	uint64_t bytes;
      +};
      +
     ++int fastlog2(uint64_t sz);
      +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
      +struct segment suggest_compaction_segment(uint64_t *sizes, int n);
      +
     @@ reftable/stack_test.c (new)
      +		.update_index = 1,
      +		.target = "master",
      +	};
     -+	struct reftable_ref_record dest = { 0 };
     ++	struct reftable_ref_record dest = { NULL };
      +
      +	assert(mkdtemp(dir));
      +
     @@ reftable/stack_test.c (new)
      +		.update_index = 1,
      +		.target = "master",
      +	};
     -+	struct reftable_ref_record dest = { 0 };
     ++	struct reftable_ref_record dest = { NULL };
      +
      +	assert(mkdtemp(dir));
      +
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_stack *st = NULL;
      +	char dir[256] = "/tmp/stack_test.XXXXXX";
     -+	struct reftable_ref_record refs[2] = { { 0 } };
     -+	struct reftable_log_record logs[2] = { { 0 } };
     ++	struct reftable_ref_record refs[2] = { { NULL } };
     ++	struct reftable_log_record logs[2] = { { NULL } };
      +	int N = ARRAY_SIZE(refs);
      +
      +	assert(mkdtemp(dir));
     @@ reftable/stack_test.c (new)
      +	assert_err(err);
      +
      +	for (i = 0; i < N; i++) {
     -+		struct reftable_ref_record dest = { 0 };
     ++		struct reftable_ref_record dest = { NULL };
     ++
      +		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
      +		assert_err(err);
      +		assert(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
     @@ reftable/stack_test.c (new)
      +	}
      +
      +	for (i = 0; i < N; i++) {
     -+		struct reftable_log_record dest = { 0 };
     ++		struct reftable_log_record dest = { NULL };
      +		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
      +		assert_err(err);
      +		assert(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     -+	struct reftable_ref_record refs[2] = { { 0 } };
     -+	struct reftable_log_record logs[2] = { { 0 } };
     ++	struct reftable_ref_record refs[2] = { { NULL } };
     ++	struct reftable_log_record logs[2] = { { NULL } };
      +	int N = ARRAY_SIZE(refs);
     -+	struct reftable_ref_record dest = { 0 };
     -+	struct reftable_log_record log_dest = { 0 };
     ++	struct reftable_ref_record dest = { NULL };
     ++	struct reftable_log_record log_dest = { NULL };
      +
      +	assert(mkdtemp(dir));
      +
     @@ reftable/stack_test.c (new)
      +	struct reftable_stack *st32 = NULL;
      +	struct reftable_write_options cfg_default = { 0 };
      +	struct reftable_stack *st_default = NULL;
     -+	struct reftable_ref_record dest = { 0 };
     ++	struct reftable_ref_record dest = { NULL };
      +
      +	assert(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
     @@ reftable/stack_test.c (new)
      +	char dir[256] = "/tmp/stack.test_reflog_expire.XXXXXX";
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
     -+	struct reftable_log_record logs[20] = { { 0 } };
     ++	struct reftable_log_record logs[20] = { { NULL } };
      +	int N = ARRAY_SIZE(logs) - 1;
      +	int i = 0;
      +	int err;
      +	struct reftable_log_expiry_config expiry = {
      +		.time = 10,
      +	};
     -+	struct reftable_log_record log = { 0 };
     ++	struct reftable_log_record log = { NULL };
      +
      +	assert(mkdtemp(dir));
      +
 13:  b8729fee9e ! 13:  c535f838d6 reftable: "test-tool dump-reftable" command.
     @@ Makefile: REFTABLE_OBJS += reftable/writer.o
       REFTABLE_TEST_OBJS += reftable/refname_test.o
       REFTABLE_TEST_OBJS += reftable/reftable_test.o
      
     + ## reftable/iter.c ##
     +@@ reftable/iter.c: void reftable_iterator_destroy(struct reftable_iterator *it)
     + int reftable_iterator_next_ref(struct reftable_iterator *it,
     + 			       struct reftable_ref_record *ref)
     + {
     +-	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
     + 	reftable_record_from_ref(&rec, ref);
     + 	return iterator_next(it, &rec);
     + }
     +@@ reftable/iter.c: int reftable_iterator_next_ref(struct reftable_iterator *it,
     + int reftable_iterator_next_log(struct reftable_iterator *it,
     + 			       struct reftable_log_record *log)
     + {
     +-	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
     + 	reftable_record_from_log(&rec, log);
     + 	return iterator_next(it, &rec);
     + }
     +@@ reftable/iter.c: static int filtering_ref_iterator_next(void *iter_arg,
     + 		}
     + 
     + 		if (fri->double_check) {
     +-			struct reftable_iterator it = { 0 };
     ++			struct reftable_iterator it = { NULL };
     + 
     + 			err = reftable_table_seek_ref(&fri->tab, &it,
     + 						      ref->refname);
     +
     + ## reftable/reader.c ##
     +@@ reftable/reader.c: static int parse_footer(struct reftable_reader *r, uint8_t *footer,
     + int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
     + 		const char *name)
     + {
     +-	struct reftable_block footer = { 0 };
     +-	struct reftable_block header = { 0 };
     ++	struct reftable_block footer = { NULL };
     ++	struct reftable_block header = { NULL };
     + 	int err = 0;
     + 
     + 	memset(r, 0, sizeof(struct reftable_reader));
     +@@ reftable/reader.c: int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
     + {
     + 	int32_t guess_block_size = r->block_size ? r->block_size :
     + 							 DEFAULT_BLOCK_SIZE;
     +-	struct reftable_block block = { 0 };
     ++	struct reftable_block block = { NULL };
     + 	uint8_t block_typ = 0;
     + 	int err = 0;
     + 	uint32_t header_off = next_off ? 0 : header_size(r->version);
     +@@ reftable/reader.c: static int reader_seek_indexed(struct reftable_reader *r,
     + 			       struct reftable_record *rec)
     + {
     + 	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
     +-	struct reftable_record want_index_rec = { 0 };
     ++	struct reftable_record want_index_rec = { NULL };
     + 	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
     +-	struct reftable_record index_result_rec = { 0 };
     ++	struct reftable_record index_result_rec = { NULL };
     + 	struct table_iter index_iter = TABLE_ITER_INIT;
     + 	struct table_iter next = TABLE_ITER_INIT;
     + 	int err = 0;
     +@@ reftable/reader.c: int reftable_reader_seek_ref(struct reftable_reader *r,
     + 	struct reftable_ref_record ref = {
     + 		.refname = (char *)name,
     + 	};
     +-	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
     + 	reftable_record_from_ref(&rec, &ref);
     + 	return reader_seek(r, it, &rec);
     + }
     +@@ reftable/reader.c: int reftable_reader_seek_log_at(struct reftable_reader *r,
     + 		.refname = (char *)name,
     + 		.update_index = update_index,
     + 	};
     +-	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
     + 	reftable_record_from_log(&rec, &log);
     + 	return reader_seek(r, it, &rec);
     + }
     +@@ reftable/reader.c: static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
     + 		.hash_prefix = oid,
     + 		.hash_prefix_len = r->object_id_len,
     + 	};
     +-	struct reftable_record want_rec = { 0 };
     +-	struct reftable_iterator oit = { 0 };
     +-	struct reftable_obj_record got = { 0 };
     +-	struct reftable_record got_rec = { 0 };
     ++	struct reftable_record want_rec = { NULL };
     ++	struct reftable_iterator oit = { NULL };
     ++	struct reftable_obj_record got = { NULL };
     ++	struct reftable_record got_rec = { NULL };
     + 	int err = 0;
     + 	struct indexed_table_ref_iter *itr = NULL;
     + 
     +
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	tree_test_main(argc, argv);

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v2 01/13] reftable: add LICENSE
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-02  3:18     ` Jonathan Nieder
  2020-10-01 16:10   ` [PATCH v2 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 0000000000..402e0f9356
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 02/13] reftable: define the public API
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-02  3:58     ` Jonathan Nieder
  2020-10-08  1:41     ` Jonathan Tan
  2020-10-01 16:10   ` [PATCH v2 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
                     ` (11 subsequent siblings)
  13 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/reftable.h | 585 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 585 insertions(+)
 create mode 100644 reftable/reftable.h

diff --git a/reftable/reftable.h b/reftable/reftable.h
new file mode 100644
index 0000000000..5f8ebf5540
--- /dev/null
+++ b/reftable/reftable.h
@@ -0,0 +1,585 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stdint.h>
+#include <stddef.h>
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+/****************************************************************
+ Basic data types
+
+ Reftables store the state of each ref in struct reftable_ref_record, and they
+ store a sequence of reflog updates in struct reftable_log_record.
+ ****************************************************************/
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				  written */
+	uint8_t *value; /* SHA1, or NULL. malloced. */
+	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
+	char *target; /* symref, or NULL. malloced. */
+};
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values. */
+void reftable_ref_record_clear(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+	uint8_t *new_hash;
+	uint8_t *old_hash;
+	char *name;
+	char *email;
+	uint64_t time;
+	int16_t tz_offset;
+	char *message;
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_clear(struct reftable_log_record *log);
+
+/* returns whether two records are equal. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+/****************************************************************
+ Error handling
+
+ Error are signaled with negative integer return values. 0 means success.
+ ****************************************************************/
+
+/* different types of errors */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data
+	 */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(),  because
+	   it needs special handling in stack.
+	*/
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	   - on writing a record with NULL refname.
+	   - on writing a reftable_ref_record outside the table limits
+	   - on writing a ref or log record before the stack's next_update_index
+	   - on writing a log record with multiline message with
+	   exact_log_message unset
+	   - on reading a reftable_ref_record from log iterator, or vice versa.
+	*/
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Illegal ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+/*
+ * Convert the numeric error code to an equivalent errno code.
+ */
+int reftable_error_to_errno(int err);
+
+/****************************************************************
+ Writing
+
+ Writing single reftables
+ ****************************************************************/
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	int unpadded;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	int skip_index_objects;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	int skip_name_check;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	int exact_log_message;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* write to a file descriptor. fdp should be an int* pointing to the fd. */
+int reftable_fd_write(void *fdp, const void *data, size_t size);
+
+/* Set the range of update indices for the records we will add.  When
+   writing a table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates, typically min==max. When converting an existing
+   ref database into a single reftable, this would be a range of update-index
+   timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/* adds a reftable_ref_record. Must be called in ascending
+   order. The update_index must be within the limits set by
+   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
+
+   It is an error to write a ref record after a log record.
+ */
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/* Convenience function to add multiple refs. Will sort the refs by
+   name before adding. */
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/* adds a reftable_log_record. Must be called in ascending order (with more
+   recent log entries first.)
+ */
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/* Convenience function to add multiple logs. Will sort the records by
+   key before adding. */
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+/****************************************************************
+ * ITERATING
+ ****************************************************************/
+
+/* iterator is the generic interface for walking over data stored in a
+   reftable. It is generally passed around by value.
+*/
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+/****************************************************************
+ Reading single tables
+
+ The follow routines are for reading single files. For an application-level
+ interface, skip ahead to struct reftable_merged_table and struct
+ reftable_stack.
+ ****************************************************************/
+
+/* block_source is a generic wrapper for a seekable readable file.
+   It is generally passed around by value.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+   so it can return itself into the pool.
+*/
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
+ * code and sets pp. The name is used for creating a stack. Typically, it is the
+ * basename of the file. The block source `src` is owned by the reader, and is
+ * closed on calling reftable_reader_destroy().
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+     err = reftable_iterator_next_ref(&it, &ref);
+     if (err > 0) {
+       break;
+     }
+     if (err < 0) {
+       ..error handling..
+     }
+     ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_clear(&ref);
+ */
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+ */
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+/****************************************************************
+ Merged tables
+
+ A ref database kept in a sequence of table files. The merged_table presents a
+ unified view to reading (seeking, iterating) a sequence of immutable tables.
+ ****************************************************************/
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+/****************************************************************
+ Generic tables
+
+ A unified API for reading tables, either merged tables, or single readers.
+ ****************************************************************/
+
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+/****************************************************************
+ Mutable ref database
+
+ The stack presents an interface to a mutable sequence of reftables.
+ ****************************************************************/
+
+/* a stack is a stack of reftables, which can be mutated by pushing a table to
+ * the top of the stack */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+   stored in 'dir'. Typically, this should be .git/reftables.
+*/
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+  returns a new transaction to add reftables to the given stack. As a side
+  effect, the ref database is locked.
+*/
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+   transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+   reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+   next write or reload, and should not be closed or deleted.
+*/
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 03/13] vcxproj: adjust for the reftable changes
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Johannes Schindelin via GitGitGadget
  2020-10-02  4:02     ` Jonathan Nieder
  2020-10-01 16:10   ` [PATCH v2 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
                     ` (10 subsequent siblings)
  13 siblings, 1 reply; 251+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This allows Git to be compiled via Visual Studio again after integrating
the `hn/reftable` branch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.mak.uname                           |  2 +-
 contrib/buildsystems/Generators/Vcxproj.pm | 11 ++++++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/config.mak.uname b/config.mak.uname
index c7eba69e54..ae4e25a1a4 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -709,7 +709,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba..1a25789d28 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 04/13] reftable: add a barebones unittest framework
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-02  4:05     ` Jonathan Nieder
  2020-10-08  1:45     ` Jonathan Tan
  2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                     ` (9 subsequent siblings)
  13 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/test_framework.c | 68 +++++++++++++++++++++++++++++++++++++++
 reftable/test_framework.h | 59 +++++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+)
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h

diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 0000000000..8d718f2f06
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,68 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "test_framework.h"
+
+#include "system.h"
+#include "basics.h"
+
+static struct test_case **test_cases;
+static int test_case_len;
+static int test_case_cap;
+
+static struct test_case *new_test_case(const char *name, void (*testfunc)(void))
+{
+	struct test_case *tc = reftable_malloc(sizeof(struct test_case));
+	tc->name = name;
+	tc->testfunc = testfunc;
+	return tc;
+}
+
+struct test_case *add_test_case(const char *name, void (*testfunc)(void))
+{
+	struct test_case *tc = new_test_case(name, testfunc);
+	if (test_case_len == test_case_cap) {
+		test_case_cap = 2 * test_case_cap + 1;
+		test_cases = reftable_realloc(
+			test_cases, sizeof(struct test_case) * test_case_cap);
+	}
+
+	test_cases[test_case_len++] = tc;
+	return tc;
+}
+
+int test_main(int argc, const char *argv[])
+{
+	const char *filter = NULL;
+	int i = 0;
+	if (argc > 1) {
+		filter = argv[1];
+	}
+
+	for (i = 0; i < test_case_len; i++) {
+		const char *name = test_cases[i]->name;
+		if (filter == NULL || strstr(name, filter) != NULL) {
+			printf("case %s\n", name);
+			test_cases[i]->testfunc();
+		} else {
+			printf("skip %s\n", name);
+		}
+
+		reftable_free(test_cases[i]);
+	}
+	reftable_free(test_cases);
+	test_cases = NULL;
+	test_case_len = 0;
+	test_case_cap = 0;
+	return 0;
+}
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 0000000000..f0a208e880
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,59 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+
+#ifdef NDEBUG
+#undef NDEBUG
+#endif
+
+#ifdef assert
+#undef assert
+#endif
+
+#define assert_err(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define assert_streq(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define assert(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+struct test_case {
+	const char *name;
+	void (*testfunc)(void);
+};
+
+struct test_case *add_test_case(const char *name, void (*testfunc)(void));
+int test_main(int argc, const char *argv[]);
+
+void set_test_hash(uint8_t *p, int i);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 05/13] reftable: utility functions
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (3 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-02  4:12     ` Jonathan Nieder
                       ` (2 more replies)
  2020-10-01 16:10   ` [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                     ` (8 subsequent siblings)
  13 siblings, 3 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Since the reftable library must compile standalone, there may be some overlap
with git-core utility functions.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |  26 ++++++-
 reftable/basics.c         | 131 +++++++++++++++++++++++++++++++++
 reftable/basics.h         |  48 +++++++++++++
 reftable/blocksource.c    | 148 ++++++++++++++++++++++++++++++++++++++
 reftable/blocksource.h    |  22 ++++++
 reftable/compat.c         | 110 ++++++++++++++++++++++++++++
 reftable/compat.h         |  48 +++++++++++++
 reftable/publicbasics.c   | 100 ++++++++++++++++++++++++++
 reftable/reftable-tests.h |  22 ++++++
 reftable/strbuf.c         | 142 ++++++++++++++++++++++++++++++++++++
 reftable/strbuf.h         |  80 +++++++++++++++++++++
 reftable/strbuf_test.c    |  37 ++++++++++
 reftable/system.h         |  51 +++++++++++++
 t/helper/test-reftable.c  |   8 +++
 t/helper/test-tool.c      |   1 +
 t/helper/test-tool.h      |   1 +
 16 files changed, 972 insertions(+), 3 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/compat.c
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/strbuf.c
 create mode 100644 reftable/strbuf.h
 create mode 100644 reftable/strbuf_test.c
 create mode 100644 reftable/system.h
 create mode 100644 t/helper/test-reftable.c

diff --git a/Makefile b/Makefile
index de53954590..e40d55cd87 100644
--- a/Makefile
+++ b/Makefile
@@ -727,6 +727,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -806,6 +807,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += config-list.h
 GENERATED_H += command-list.h
@@ -1167,7 +1170,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2369,10 +2372,21 @@ XDIFF_OBJS += xdiff/xpatience.o
 XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/strbuf.o
+
+REFTABLE_TEST_OBJS += reftable/strbuf_test.o
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
 	$(XDIFF_OBJS) \
 	$(FUZZ_OBJS) \
+	$(REFTABLE_OBJS) \
+	$(REFTABLE_TEST_OBJS) \
 	common-main.o \
 	git.o
 ifndef NO_CURL
@@ -2524,6 +2538,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2802,7 +2822,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3133,7 +3153,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 0000000000..c429055d15
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,131 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* invariant: (hi == sz) || f(hi) == true
+	   (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		int val = f(mid, args);
+		if (val) {
+			hi = mid;
+		} else {
+			lo = mid;
+		}
+	}
+
+	if (lo == 0) {
+		if (f(0, args)) {
+			return 0;
+		} else {
+			return 1;
+		}
+	}
+
+	return hi;
+}
+
+void free_names(char **a)
+{
+	char **p = a;
+	if (p == NULL) {
+		return;
+	}
+	while (*p) {
+		reftable_free(*p);
+		p++;
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	int len = 0;
+	char **p = names;
+	while (*p) {
+		p++;
+		len++;
+	}
+	return len;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next != NULL) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(char *));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	if (names_len == names_cap) {
+		names_cap = 2 * names_cap + 1;
+		names = reftable_realloc(names, names_cap * sizeof(char *));
+	}
+
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	while (*a && *b) {
+		if (strcmp(*a, *b)) {
+			return 0;
+		}
+
+		a++;
+		b++;
+	}
+
+	return *a == *b;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 0000000000..90639865a7
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+  find smallest index i in [0, sz) at which f(i) is true, assuming
+  that f is ascending. Return sz if f(i) is false for all indices.
+*/
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+  Frees a NULL terminated array of malloced strings. The array itself is also
+  freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+#endif
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 0000000000..7f29b864f9
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "strbuf.h"
+#include "reftable.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 0000000000..3faf83fa9d
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "strbuf.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/compat.c b/reftable/compat.c
new file mode 100644
index 0000000000..a48c5aa5e3
--- /dev/null
+++ b/reftable/compat.c
@@ -0,0 +1,110 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+
+*/
+
+/* compat.c - compatibility functions for standalone compilation */
+
+#include "system.h"
+#include "basics.h"
+
+#ifdef REFTABLE_STANDALONE
+
+#include <dirent.h>
+
+void put_be32(void *p, uint32_t i)
+{
+	uint8_t *out = (uint8_t *)p;
+
+	out[0] = (uint8_t)((i >> 24) & 0xff);
+	out[1] = (uint8_t)((i >> 16) & 0xff);
+	out[2] = (uint8_t)((i >> 8) & 0xff);
+	out[3] = (uint8_t)((i)&0xff);
+}
+
+uint32_t get_be32(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 24 | (uint32_t)(in[1]) << 16 |
+	       (uint32_t)(in[2]) << 8 | (uint32_t)(in[3]);
+}
+
+void put_be64(void *p, uint64_t v)
+{
+	uint8_t *out = (uint8_t *)p;
+	int i = sizeof(uint64_t);
+	while (i--) {
+		out[i] = (uint8_t)(v & 0xff);
+		v >>= 8;
+	}
+}
+
+uint64_t get_be64(void *out)
+{
+	uint8_t *bytes = (uint8_t *)out;
+	uint64_t v = 0;
+	int i = 0;
+	for (i = 0; i < sizeof(uint64_t); i++) {
+		v = (v << 8) | (uint8_t)(bytes[i] & 0xff);
+	}
+	return v;
+}
+
+uint16_t get_be16(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 8 | (uint32_t)(in[1]);
+}
+
+char *xstrdup(const char *s)
+{
+	int l = strlen(s);
+	char *dest = (char *)reftable_malloc(l + 1);
+	strncpy(dest, s, l + 1);
+	return dest;
+}
+
+void sleep_millisec(int millisecs)
+{
+	usleep(millisecs * 1000);
+}
+
+void reftable_clear_dir(const char *dirname)
+{
+	DIR *dir = opendir(dirname);
+	struct dirent *ent = NULL;
+	assert(dir);
+	while ((ent = readdir(dir)) != NULL) {
+		unlinkat(dirfd(dir), ent->d_name, 0);
+	}
+	closedir(dir);
+	rmdir(dirname);
+}
+
+#else
+
+#include "../dir.h"
+
+void reftable_clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+#endif
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/compat.h b/reftable/compat.h
new file mode 100644
index 0000000000..a765c57e96
--- /dev/null
+++ b/reftable/compat.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef COMPAT_H
+#define COMPAT_H
+
+#include "system.h"
+
+#ifdef REFTABLE_STANDALONE
+
+/* functions that git-core provides, for standalone compilation */
+#include <stdint.h>
+
+uint64_t get_be64(void *in);
+void put_be64(void *out, uint64_t i);
+
+void put_be32(void *out, uint32_t i);
+uint32_t get_be32(uint8_t *in);
+
+uint16_t get_be16(uint8_t *in);
+
+#define ARRAY_SIZE(a) sizeof((a)) / sizeof((a)[0])
+#define FREE_AND_NULL(x)          \
+	do {                      \
+		reftable_free(x); \
+		(x) = NULL;       \
+	} while (0)
+#define QSORT(arr, n, cmp) qsort(arr, n, sizeof(arr[0]), cmp)
+#define SWAP(a, b)                              \
+	{                                       \
+		char tmp[sizeof(a)];            \
+		assert(sizeof(a) == sizeof(b)); \
+		memcpy(&tmp[0], &a, sizeof(a)); \
+		memcpy(&a, &b, sizeof(a));      \
+		memcpy(&b, &tmp[0], sizeof(a)); \
+	}
+
+char *xstrdup(const char *s);
+
+void sleep_millisec(int millisecs);
+
+#endif
+#endif
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 0000000000..12d547d70d
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "basics.h"
+#include "system.h"
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
+
+int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 0000000000..e38471888f
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int strbuf_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/strbuf.c b/reftable/strbuf.c
new file mode 100644
index 0000000000..136bf65591
--- /dev/null
+++ b/reftable/strbuf.c
@@ -0,0 +1,142 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "strbuf.h"
+
+#ifdef REFTABLE_STANDALONE
+
+void strbuf_init(struct strbuf *s, size_t alloc)
+{
+	struct strbuf empty = STRBUF_INIT;
+	*s = empty;
+}
+
+void strbuf_grow(struct strbuf *s, size_t extra)
+{
+	size_t newcap = s->len + extra + 1;
+	if (newcap > s->cap) {
+		s->buf = reftable_realloc(s->buf, newcap);
+		s->cap = newcap;
+	}
+}
+
+static void strbuf_resize(struct strbuf *s, int l)
+{
+	int zl = l + 1; /* one uint8_t for 0 termination. */
+	assert(s->canary == STRBUF_CANARY);
+	if (s->cap < zl) {
+		int c = s->cap * 2;
+		if (c < zl) {
+			c = zl;
+		}
+		s->cap = c;
+		s->buf = reftable_realloc(s->buf, s->cap);
+	}
+	s->len = l;
+	s->buf[l] = 0;
+}
+
+void strbuf_setlen(struct strbuf *s, size_t l)
+{
+	assert(s->cap >= l + 1);
+	s->len = l;
+	s->buf[l] = 0;
+}
+
+void strbuf_reset(struct strbuf *s)
+{
+	strbuf_resize(s, 0);
+}
+
+void strbuf_addstr(struct strbuf *d, const char *s)
+{
+	int l1 = d->len;
+	int l2 = strlen(s);
+	assert(d->canary == STRBUF_CANARY);
+
+	strbuf_resize(d, l2 + l1);
+	memcpy(d->buf + l1, s, l2);
+}
+
+void strbuf_addbuf(struct strbuf *s, struct strbuf *a)
+{
+	int end = s->len;
+	assert(s->canary == STRBUF_CANARY);
+	strbuf_resize(s, s->len + a->len);
+	memcpy(s->buf + end, a->buf, a->len);
+}
+
+char *strbuf_detach(struct strbuf *s, size_t *sz)
+{
+	char *p = NULL;
+	p = (char *)s->buf;
+	if (sz)
+		*sz = s->len;
+	s->buf = NULL;
+	s->cap = 0;
+	s->len = 0;
+	return p;
+}
+
+void strbuf_release(struct strbuf *s)
+{
+	assert(s->canary == STRBUF_CANARY);
+	s->cap = 0;
+	s->len = 0;
+	reftable_free(s->buf);
+	s->buf = NULL;
+}
+
+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b)
+{
+	int min = a->len < b->len ? a->len : b->len;
+	int res = memcmp(a->buf, b->buf, min);
+	assert(a->canary == STRBUF_CANARY);
+	assert(b->canary == STRBUF_CANARY);
+	if (res != 0)
+		return res;
+	if (a->len < b->len)
+		return -1;
+	else if (a->len > b->len)
+		return 1;
+	else
+		return 0;
+}
+
+int strbuf_add(struct strbuf *b, const void *data, size_t sz)
+{
+	assert(b->canary == STRBUF_CANARY);
+	strbuf_grow(b, sz);
+	memcpy(b->buf + b->len, data, sz);
+	b->len += sz;
+	b->buf[b->len] = 0;
+	return sz;
+}
+
+#endif
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	while (p < a->len && p < b->len) {
+		if (a->buf[p] != b->buf[p]) {
+			break;
+		}
+		p++;
+	}
+
+	return p;
+}
+
+struct strbuf reftable_empty_strbuf = STRBUF_INIT;
diff --git a/reftable/strbuf.h b/reftable/strbuf.h
new file mode 100644
index 0000000000..c2d7aca8dd
--- /dev/null
+++ b/reftable/strbuf.h
@@ -0,0 +1,80 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SLICE_H
+#define SLICE_H
+
+#ifdef REFTABLE_STANDALONE
+
+#include "basics.h"
+
+/*
+  Provides a bounds-checked, growable byte ranges. To use, initialize as "strbuf
+  x = STRBUF_INIT;"
+ */
+struct strbuf {
+	size_t len;
+	size_t cap;
+	char *buf;
+
+	/* Used to enforce initialization with STRBUF_INIT */
+	uint8_t canary;
+};
+
+#define STRBUF_CANARY 0x42
+#define STRBUF_INIT                       \
+	{                                 \
+		0, 0, NULL, STRBUF_CANARY \
+	}
+
+void strbuf_addstr(struct strbuf *dest, const char *src);
+
+/* Deallocate and clear strbuf */
+void strbuf_release(struct strbuf *strbuf);
+
+/* Set strbuf to 0 length, but retain buffer. */
+void strbuf_reset(struct strbuf *strbuf);
+
+/* Initializes a strbuf. Accepts a strbuf with random garbage. */
+void strbuf_init(struct strbuf *strbuf, size_t alloc);
+
+/* Return `buf`, clearing out `s`. Optionally return len (not cap) in `sz`.  */
+char *strbuf_detach(struct strbuf *s, size_t *sz);
+
+/* Set length of the slace to `l`, but don't reallocated. */
+void strbuf_setlen(struct strbuf *s, size_t l);
+
+/* Ensure `l` bytes beyond current length are available */
+void strbuf_grow(struct strbuf *s, size_t l);
+
+/* Signed comparison */
+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b);
+
+/* Append `data` to the `dest` strbuf.  */
+int strbuf_add(struct strbuf *dest, const void *data, size_t sz);
+
+/* Append `add` to `dest. */
+void strbuf_addbuf(struct strbuf *dest, struct strbuf *add);
+
+#else
+
+#include "../git-compat-util.h"
+#include "../strbuf.h"
+
+#endif
+
+extern struct strbuf reftable_empty_strbuf;
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/strbuf_test.c b/reftable/strbuf_test.c
new file mode 100644
index 0000000000..39f561c81a
--- /dev/null
+++ b/reftable/strbuf_test.c
@@ -0,0 +1,37 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "strbuf.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_strbuf(void)
+{
+	struct strbuf s = STRBUF_INIT;
+	struct strbuf t = STRBUF_INIT;
+
+	strbuf_addstr(&s, "abc");
+	assert(0 == strcmp("abc", s.buf));
+
+	strbuf_addstr(&t, "pqr");
+	strbuf_addbuf(&s, &t);
+	assert(0 == strcmp("abcpqr", s.buf));
+
+	strbuf_release(&s);
+	strbuf_release(&t);
+}
+
+int strbuf_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_strbuf", &test_strbuf);
+	return test_main(argc, argv);
+}
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 0000000000..567eb8a87d
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,51 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#ifndef REFTABLE_STANDALONE
+
+#include "git-compat-util.h"
+#include "cache.h"
+#include <zlib.h>
+
+#else
+
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <zlib.h>
+
+#include "compat.h"
+
+#endif /* REFTABLE_STANDALONE */
+
+void reftable_clear_dir(const char *dirname);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 0000000000..7d50aa6bcc
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,8 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	strbuf_test_main(argc, argv);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index a0d3966b29..c7ab5bc4a6 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -53,6 +53,7 @@ static struct test_cmd cmds[] = {
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 07034d3f38..fa2b11ace6 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -42,6 +42,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (4 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 19:23     ` Junio C Hamano
  2020-10-01 16:10   ` [PATCH v2 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                     ` (7 subsequent siblings)
  13 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |    2 +
 reftable/constants.h     |   21 +
 reftable/record.c        | 1116 ++++++++++++++++++++++++++++++++++++++
 reftable/record.h        |  137 +++++
 reftable/record_test.c   |  410 ++++++++++++++
 t/helper/test-reftable.c |    1 +
 6 files changed, 1687 insertions(+)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c

diff --git a/Makefile b/Makefile
index e40d55cd87..b5e43d79ed 100644
--- a/Makefile
+++ b/Makefile
@@ -2376,8 +2376,10 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 0000000000..5eee72c4c1
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 0000000000..f050667ea4
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1116 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	   fields. */
+	reftable_ref_record_clear(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+
+	if (src->target != NULL) {
+		ref->target = xstrdup(src->target);
+	}
+
+	if (src->target_value != NULL) {
+		ref->target_value = reftable_malloc(hash_size);
+		memcpy(ref->target_value, src->target_value, hash_size);
+	}
+
+	if (src->value != NULL) {
+		ref->value = reftable_malloc(hash_size);
+		memcpy(ref->value, src->value, hash_size);
+	}
+	ref->update_index = src->update_index;
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	if (ref->value != NULL) {
+		hex_format(hex, ref->value, hash_size(hash_id));
+		printf("%s", hex);
+	}
+	if (ref->target_value != NULL) {
+		hex_format(hex, ref->target_value, hash_size(hash_id));
+		printf(" (T %s)", hex);
+	}
+	if (ref->target != NULL) {
+		printf("=> %s", ref->target);
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_clear_void(void *rec)
+{
+	reftable_ref_record_clear((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_clear(struct reftable_ref_record *ref)
+{
+	reftable_free(ref->refname);
+	reftable_free(ref->target);
+	reftable_free(ref->target_value);
+	reftable_free(ref->value);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	if (r->value != NULL) {
+		if (r->target_value != NULL) {
+			return 2;
+		} else {
+			return 1;
+		}
+	} else if (r->target != NULL)
+		return 3;
+	return 0;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (r->value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target_value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->target_value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target != NULL) {
+		int n = encode_string(r->target, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	int seen_value = 0;
+	int seen_target_value = 0;
+	int seen_target = 0;
+
+	int n = get_var_int(&r->update_index, &in);
+	if (n < 0)
+		return n;
+	assert(hash_size > 0);
+
+	string_view_consume(&in, n);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->refname[key.len] = 0;
+
+	switch (val_type) {
+	case 1:
+	case 2:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		if (r->value == NULL) {
+			r->value = reftable_malloc(hash_size);
+		}
+		seen_value = 1;
+		memcpy(r->value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		if (val_type == 1) {
+			break;
+		}
+		if (r->target_value == NULL) {
+			r->target_value = reftable_malloc(hash_size);
+		}
+		seen_target_value = 1;
+		memcpy(r->target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+	case 3: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		seen_target = 1;
+		if (r->target != NULL) {
+			reftable_free(r->target);
+		}
+		r->target = dest.buf;
+	} break;
+
+	case 0:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	if (!seen_target && r->target != NULL) {
+		FREE_AND_NULL(r->target);
+	}
+	if (!seen_target_value && r->target_value != NULL) {
+		FREE_AND_NULL(r->target_value);
+	}
+	if (!seen_value && r->value != NULL) {
+		FREE_AND_NULL(r->value);
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.clear = &reftable_ref_record_clear_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_clear(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_clear(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.clear = &reftable_obj_record_clear,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname,
+	       log->update_index, log->name, log->email, log->time,
+	       log->tz_offset);
+	hex_format(hex, log->old_hash, hash_size(hash_id));
+	printf("%s => ", hex);
+	hex_format(hex, log->new_hash, hash_size(hash_id));
+	printf("%s\n\n%s\n}\n", hex, log->message);
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_clear(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	if (dst->email != NULL) {
+		dst->email = xstrdup(dst->email);
+	}
+	if (dst->name != NULL) {
+		dst->name = xstrdup(dst->name);
+	}
+	if (dst->message != NULL) {
+		dst->message = xstrdup(dst->message);
+	}
+
+	if (dst->new_hash != NULL) {
+		dst->new_hash = reftable_malloc(hash_size);
+		memcpy(dst->new_hash, src->new_hash, hash_size);
+	}
+	if (dst->old_hash != NULL) {
+		dst->old_hash = reftable_malloc(hash_size);
+		memcpy(dst->old_hash, src->old_hash, hash_size);
+	}
+}
+
+static void reftable_log_record_clear_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_clear(r);
+}
+
+void reftable_log_record_clear(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	reftable_free(r->new_hash);
+	reftable_free(r->old_hash);
+	reftable_free(r->name);
+	reftable_free(r->email);
+	reftable_free(r->message);
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = r->old_hash;
+	uint8_t *newh = r->new_hash;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->name ? r->name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->email ? r->email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->message ? r->message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type == 0) {
+		FREE_AND_NULL(r->old_hash);
+		FREE_AND_NULL(r->new_hash);
+		FREE_AND_NULL(r->message);
+		FREE_AND_NULL(r->email);
+		FREE_AND_NULL(r->name);
+		return 0;
+	}
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->old_hash = reftable_realloc(r->old_hash, hash_size);
+	r->new_hash = reftable_realloc(r->new_hash, hash_size);
+
+	memcpy(r->old_hash, in.buf, hash_size);
+	memcpy(r->new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->name = reftable_realloc(r->name, dest.len + 1);
+	memcpy(r->name, dest.buf, dest.len);
+	r->name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->email = reftable_realloc(r->email, dest.len + 1);
+	memcpy(r->email, dest.buf, dest.len);
+	r->email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->message = reftable_realloc(r->message, dest.len + 1);
+	memcpy(r->message, dest.buf, dest.len);
+	r->message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	return null_streq(a->name, b->name) && null_streq(a->email, b->email) &&
+	       null_streq(a->message, b->message) &&
+	       zero_hash_eq(a->old_hash, b->old_hash, hash_size) &&
+	       zero_hash_eq(a->new_hash, b->new_hash, hash_size) &&
+	       a->time == b->time && a->tz_offset == b->tz_offset &&
+	       a->update_index == b->update_index;
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.clear = &reftable_log_record_clear_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_clear(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_clear(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.clear = &reftable_index_record_clear,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_clear(struct reftable_record *rec)
+{
+	rec->ops->clear(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+static int str_equal(char *a, char *b)
+{
+	if (a != NULL && b != NULL)
+		return 0 == strcmp(a, b);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	return 0 == strcmp(a->refname, b->refname) &&
+	       a->update_index == b->update_index &&
+	       hash_equal(a->value, b->value, hash_size) &&
+	       hash_equal(a->target_value, b->target_value, hash_size) &&
+	       str_equal(a->target, b->target);
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value == NULL && ref->target == NULL &&
+	       ref->target_value == NULL;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->new_hash == NULL && log->old_hash == NULL &&
+		log->name == NULL && log->email == NULL &&
+		log->message == NULL && log->time == 0 && log->tz_offset == 0 &&
+		log->message == NULL);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 0000000000..dd5881ed75
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,137 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "reftable.h"
+#include "strbuf.h"
+#include "system.h"
+
+/*
+  A substring of existing string data. This structure takes no responsibility
+  for the lifetime of the data it points to.
+*/
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*clear)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+   number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_clear(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 0000000000..dcf1c74cb7
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,410 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		assert(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		assert(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		assert(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		assert(n > 0);
+
+		assert(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		assert(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = 0; i <= 3; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = {
+			.refname = xstrdup("old name"),
+			.value = reftable_calloc(SHA1_SIZE),
+			.target_value = reftable_calloc(SHA1_SIZE),
+			.target = xstrdup("old value"),
+		};
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		switch (i) {
+		case 0:
+			break;
+		case 1:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			break;
+		case 2:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			in.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.target_value, 2);
+			break;
+		case 3:
+			in.target = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		assert(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		assert(n == m);
+
+		assert((out.value != NULL) == (in.value != NULL));
+		assert((out.target_value != NULL) == (in.target_value != NULL));
+		assert((out.target != NULL) == (in.target != NULL));
+		reftable_record_clear(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_clear(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	assert(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	assert(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_clear(&in[0]);
+	reftable_log_record_clear(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.old_hash = reftable_malloc(SHA1_SIZE),
+			.new_hash = reftable_malloc(SHA1_SIZE),
+			.name = xstrdup("han-wen"),
+			.email = xstrdup("hanwen@google.com"),
+			.message = xstrdup("test"),
+			.update_index = 42,
+			.time = 1577123507,
+			.tz_offset = 100,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+	set_test_hash(in[0].new_hash, 1);
+	set_test_hash(in[0].old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.new_hash = reftable_calloc(SHA1_SIZE),
+			.old_hash = reftable_calloc(SHA1_SIZE),
+			.name = xstrdup("old name"),
+			.email = xstrdup("old@email"),
+			.message = xstrdup("old message"),
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		assert(n == m);
+
+		assert(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_clear(&in[i]);
+		strbuf_release(&key);
+		reftable_record_clear(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	assert(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	assert(!restart);
+	assert(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	assert(n == m);
+	assert(0 == strbuf_cmp(&key, &roundtrip));
+	assert(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		assert(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		assert(n == m);
+
+		assert(in.hash_prefix_len == out.hash_prefix_len);
+		assert(in.offset_len == out.offset_len);
+
+		assert(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		assert(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_clear(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	assert(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	assert(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	assert(m == n);
+
+	assert(in.offset == out.offset);
+
+	reftable_record_clear(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_reftable_log_record_equal",
+		      &test_reftable_log_record_equal);
+	add_test_case("test_reftable_log_record_roundtrip",
+		      &test_reftable_log_record_roundtrip);
+	add_test_case("test_reftable_ref_record_roundtrip",
+		      &test_reftable_ref_record_roundtrip);
+	add_test_case("test_varint_roundtrip", &test_varint_roundtrip);
+	add_test_case("test_key_roundtrip", &test_key_roundtrip);
+	add_test_case("test_common_prefix", &test_common_prefix);
+	add_test_case("test_reftable_obj_record_roundtrip",
+		      &test_reftable_obj_record_roundtrip);
+	add_test_case("test_reftable_index_record_roundtrip",
+		      &test_reftable_index_record_roundtrip);
+	add_test_case("test_u24_roundtrip", &test_u24_roundtrip);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 7d50aa6bcc..9341272089 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -3,6 +3,7 @@
 
 int cmd__reftable(int argc, const char **argv)
 {
+	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 07/13] reftable: reading/writing blocks
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (5 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 441 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 129 ++++++++++++
 reftable/block_test.c    | 158 ++++++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 825 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index b5e43d79ed..eda92b00ef 100644
--- a/Makefile
+++ b/Makefile
@@ -2373,12 +2373,15 @@ XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 0000000000..f44451a379
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 0000000000..d49e80df7c
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,441 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_clear(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 0000000000..14a6f835f4
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,129 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable.h"
+
+/*
+  Writes reftable blocks. The block_writer is reused across blocks to minimize
+  allocation overhead.
+*/
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+  initializes the blockwriter to write `typ` entries, using `buf` as temporary
+  storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/*
+  returns the block type (eg. 'r' for ref records.
+*/
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_clear(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 0000000000..2fa8ca9831
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,158 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			assert(args.key < arr[res]);
+			if (res > 0) {
+				assert(args.key >= arr[res - 1]);
+			}
+		} else {
+			assert(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value = hash;
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value = NULL;
+		assert(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	assert(n > 0);
+
+	block_writer_clear(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		assert(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		assert_streq(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_clear(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		assert(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		assert(n == 0);
+
+		assert_streq(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		assert(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		assert(n == 0);
+		assert_streq(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_clear(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	add_test_case("binsearch", &test_binsearch);
+	add_test_case("block_read_write", &test_block_read_write);
+	return test_main(argc, argv);
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 0000000000..3e0b0f24f1
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 9341272089..81a9bd5667 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -3,6 +3,7 @@
 
 int cmd__reftable(int argc, const char **argv)
 {
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 08/13] reftable: a generic binary tree implementation
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (6 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:10   ` [PATCH v2 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This is necessary for building a OID => ref map on write

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  2 ++
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 62 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index eda92b00ef..870e540f22 100644
--- a/Makefile
+++ b/Makefile
@@ -2379,12 +2379,14 @@ REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 0000000000..0061d14e30
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 0000000000..954512e9a3
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+   is set, insert the key if it's not found. Else, return NULL.
+*/
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+  deallocates the tree nodes recursively. Keys should be deallocated separately
+  by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 0000000000..320fd02a0f
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_tree", &test_tree);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 81a9bd5667..9c4e0f42dc 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 09/13] reftable: write reftable files
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (7 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:10   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:11   ` [PATCH v2 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:10 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile          |   1 +
 reftable/writer.c | 673 ++++++++++++++++++++++++++++++++++++++++++++++
 reftable/writer.h |  51 ++++
 3 files changed, 725 insertions(+)
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 870e540f22..c1dd046155 100644
--- a/Makefile
+++ b/Makefile
@@ -2380,6 +2380,7 @@ REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 0000000000..436756ad24
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,673 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "tree.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects && ref->value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)ref->value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects && ref->target_value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, ref->target_value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	char *input_log_message = log->message;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err;
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (!w->opts.exact_log_message && log->message != NULL) {
+		strbuf_addstr(&cleaned_message, log->message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->message = cleaned_message.buf;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	err = writer_add_record(w, &rec);
+
+done:
+	log->message = input_log_message;
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_clear(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 0000000000..06e0705a79
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,51 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "reftable.h"
+#include "strbuf.h"
+#include "tree.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	  tree for use with tsearch; used to populate the 'o' inverse OID
+	  map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 10/13] reftable: read reftable files
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (8 preceding siblings ...)
  2020-10-01 16:10   ` [PATCH v2 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:11   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:11   ` [PATCH v2 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:11 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile            |   3 +
 reftable/iter.c     | 242 +++++++++++++++
 reftable/iter.h     |  72 +++++
 reftable/reader.c   | 733 ++++++++++++++++++++++++++++++++++++++++++++
 reftable/reader.h   |  78 +++++
 reftable/reftable.c | 104 +++++++
 6 files changed, 1232 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable.c

diff --git a/Makefile b/Makefile
index c1dd046155..8b09a56d91 100644
--- a/Makefile
+++ b/Makefile
@@ -2377,7 +2377,10 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
+REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 0000000000..2cff447323
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,242 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { 0 };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { 0 };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if ((ref->target_value != NULL &&
+		     !memcmp(fri->oid.buf, ref->target_value, fri->oid.len)) ||
+		    (ref->value != NULL &&
+		     !memcmp(fri->oid.buf, ref->value, fri->oid.len))) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_clear(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+
+		if (!memcmp(it->oid.buf, ref->target_value, it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 0000000000..c92a178773
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "block.h"
+#include "record.h"
+#include "strbuf.h"
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+   iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+   but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 0000000000..c7f56b5fdc
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,733 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { 0 };
+	struct reftable_block header = { 0 };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { 0 };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { 0 };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { 0 };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_clear(&want_index_rec);
+	reftable_record_clear(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { 0 };
+	struct reftable_iterator oit = { 0 };
+	struct reftable_obj_record got = { 0 };
+	struct reftable_record got_rec = { 0 };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_clear(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 0000000000..8170258881
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,78 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+int reftable_table_seek_record(struct reftable_table *tab,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 0000000000..1425aef6f2
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,104 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+#include "record.h"
+#include "reader.h"
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { 0 };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { 0 };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_clear(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+int reftable_table_seek_record(struct reftable_table *tab,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 11/13] reftable: file level tests
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (9 preceding siblings ...)
  2020-10-01 16:11   ` [PATCH v2 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:11   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-01 16:11   ` [PATCH v2 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:11 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 585 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 587 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index 8b09a56d91..17d5fba0a0 100644
--- a/Makefile
+++ b/Makefile
@@ -2388,6 +2388,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 0000000000..9fea692905
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,585 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	assert(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	assert(n == sizeof(in));
+	assert(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	assert(n == 2);
+	assert(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.value = hash;
+		ref.update_index = update_index;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		assert(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.new_hash = hash;
+		log.update_index = update_index;
+		log.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		assert(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		assert(buf->buf[off] == 'r');
+	}
+
+	assert(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = {
+		.refname = "refs/heads/master",
+		.name = "Han-Wen Nienhuys",
+		.email = "hanwen@google.com",
+		.tz_offset = 100,
+		.time = 0x5e430672,
+		.update_index = 0xa,
+		.message = "commit: 9\n",
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.old_hash = hash1;
+	log.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	assert_err(err);
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		assert_err(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.old_hash = hash1;
+		log.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		assert_err(err);
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	stats = writer_stats(w);
+	assert(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert_err(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	assert_err(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		assert_err(err);
+		assert_streq(names[i], log.refname);
+		assert(i == log.update_index);
+		i++;
+		reftable_log_record_clear(&log);
+	}
+
+	assert(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	assert_err(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		assert(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		assert(0 == strcmp(names[j], ref.refname));
+		assert(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_clear(&ref);
+	}
+	assert(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	assert(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	assert_err(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	assert(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+	assert(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		assert(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		assert_err(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		assert_err(err);
+		assert(0 == strcmp(names[i], ref.refname));
+		assert(i == ref.value[0]);
+
+		reftable_ref_record_clear(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		assert(err > 0);
+	} else {
+		assert(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value = hash1;
+		ref.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		assert(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	assert(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	assert_err(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	assert_err(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	assert_err(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		assert(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		assert(j < want_names_len);
+		assert(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_clear(&ref);
+	}
+	assert(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	assert(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	assert(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	assert(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_log_write_read", test_log_write_read);
+	add_test_case("test_table_read_write_seek_linear_sha256",
+		      &test_table_read_write_seek_linear_sha256);
+	add_test_case("test_log_buffer_size", test_log_buffer_size);
+	add_test_case("test_table_write_small_table",
+		      &test_table_write_small_table);
+	add_test_case("test_buffer", &test_buffer);
+	add_test_case("test_table_read_api", &test_table_read_api);
+	add_test_case("test_table_read_write_sequential",
+		      &test_table_read_write_sequential);
+	add_test_case("test_table_read_write_seek_linear",
+		      &test_table_read_write_seek_linear);
+	add_test_case("test_table_read_write_seek_index",
+		      &test_table_read_write_seek_index);
+	add_test_case("test_table_read_write_refs_for_no_index",
+		      &test_table_refs_for_no_index);
+	add_test_case("test_table_read_write_refs_for_obj_index",
+		      &test_table_refs_for_obj_index);
+	add_test_case("test_table_empty", &test_table_empty);
+	return test_main(argc, argv);
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 9c4e0f42dc..2273fdfe39 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 12/13] reftable: rest of library
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (10 preceding siblings ...)
  2020-10-01 16:11   ` [PATCH v2 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:11   ` Han-Wen Nienhuys via GitGitGadget
  2020-10-02 13:57     ` Johannes Schindelin
  2020-10-01 16:11   ` [PATCH v2 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
  13 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:11 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This will be further split up once preceding commits have passed review.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |    8 +-
 reftable/VERSION         |    1 +
 reftable/dump.c          |  212 +++++++
 reftable/merged.c        |  358 +++++++++++
 reftable/merged.h        |   36 ++
 reftable/merged_test.c   |  331 ++++++++++
 reftable/pq.c            |  115 ++++
 reftable/pq.h            |   32 +
 reftable/refname.c       |  209 +++++++
 reftable/refname.h       |   28 +
 reftable/refname_test.c  |  100 +++
 reftable/reftable.c      |    4 +-
 reftable/stack.c         | 1240 ++++++++++++++++++++++++++++++++++++++
 reftable/stack.h         |   40 ++
 reftable/stack_test.c    |  788 ++++++++++++++++++++++++
 reftable/update.sh       |   22 +
 t/helper/test-reftable.c |    3 +
 17 files changed, 3524 insertions(+), 3 deletions(-)
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100755 reftable/update.sh

diff --git a/Makefile b/Makefile
index 17d5fba0a0..05021f79a1 100644
--- a/Makefile
+++ b/Makefile
@@ -2375,12 +2375,16 @@ XDIFF_OBJS += xdiff/xutils.o
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
-REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/compat.o
 REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/merged.o
+REFTABLE_OBJS += reftable/pq.o
+REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/strbuf.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
@@ -2388,7 +2392,9 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/strbuf_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
diff --git a/reftable/VERSION b/reftable/VERSION
new file mode 100644
index 0000000000..cf8ae4958f
--- /dev/null
+++ b/reftable/VERSION
@@ -0,0 +1 @@
+7134eb9f8171a9759800f4187f9e6dde997335e7 C: NULL iso 0 for init
diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 0000000000..ed09f2e2f9
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { NULL };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 0000000000..bfa89cfaee
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,358 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_clear(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_clear(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_clear(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 0000000000..76c29928c3
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,36 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+#include "reftable.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_clear(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 0000000000..66c65a8ea4
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,331 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_clear(&pq);
+}
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		assert_err(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	assert_err(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	assert_err(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	assert_err(err);
+	assert(ref.update_index == 2);
+	reftable_ref_record_clear(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = { {
+						    .refname = "a",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "b",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "c",
+						    .update_index = 1,
+						    .value = hash1,
+					    } };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	assert_err(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_clear(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	assert_err(err);
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	assert_err(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_merged_between", &test_merged_between);
+	add_test_case("test_pq", &test_pq);
+	add_test_case("test_merged", &test_merged);
+	add_test_case("test_default_write_opts", &test_default_write_opts);
+	return test_main(argc, argv);
+}
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 0000000000..81f7745f36
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 0000000000..9c2c0e0bfe
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 0000000000..9051c1f574
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable.h"
+#include "basics.h"
+#include "refname.h"
+#include "strbuf.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_clear(&ref);
+	return err;
+}
+
+static void modification_clear(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_clear(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_clear(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 0000000000..fcd645a99f
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,28 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 0000000000..6e9a15da38
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.target = "destination", /* make sure it's not a symref. */
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	assert_err(err);
+
+	err = reftable_writer_close(w);
+	assert_err(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	assert_err(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		assert(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_conflict", &test_conflict);
+	return test_main(argc, argv);
+}
diff --git a/reftable/reftable.c b/reftable/reftable.c
index 1425aef6f2..0f3bf0291e 100644
--- a/reftable/reftable.c
+++ b/reftable/reftable.c
@@ -44,7 +44,7 @@ int reftable_table_seek_ref(struct reftable_table *tab,
 	struct reftable_ref_record ref = {
 		.refname = (char *)name,
 	};
-	struct reftable_record rec = { 0 };
+	struct reftable_record rec = { NULL };
 	reftable_record_from_ref(&rec, &ref);
 	return tab->ops->seek_record(tab->table_arg, it, &rec);
 }
@@ -60,7 +60,7 @@ void reftable_table_from_reader(struct reftable_table *tab,
 int reftable_table_read_ref(struct reftable_table *tab, const char *name,
 			    struct reftable_ref_record *ref)
 {
-	struct reftable_iterator it = { 0 };
+	struct reftable_iterator it = { NULL };
 	int err = reftable_table_seek_ref(tab, &it, name);
 	if (err)
 		goto done;
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 0000000000..b20bee407b
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1240 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		for (i = 0; i < st->readers_len; i++) {
+			reftable_reader_free(st->readers[i]);
+		}
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			strbuf_addstr(&table_path, st->reftable_dir);
+			strbuf_addstr(&table_path, "/");
+			strbuf_addstr(&table_path, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_clear(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64, min, max);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+	char **names;
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_reset(&nm);
+		strbuf_addstr(&nm, add->stack->list_file);
+		strbuf_addstr(&nm, "/");
+		strbuf_addstr(&nm, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	free_names(add->names);
+	add->names = NULL;
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = reftable_stack_reload(add->stack);
+
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&temp_tab_file_name, "/");
+	strbuf_addbuf(&temp_tab_file_name, &next_name);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&tab_file_name, "/");
+	strbuf_addbuf(&tab_file_name, &next_name);
+
+	/* TODO: should check destination out of paranoia */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[first]));
+
+	strbuf_reset(temp_tab);
+	strbuf_addstr(temp_tab, st->reftable_dir);
+	strbuf_addstr(temp_tab, "/");
+	strbuf_addbuf(temp_tab, &next_name);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.time < config->time) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_clear(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_clear(&ref);
+	reftable_log_record_clear(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
+		strbuf_addstr(&subtab_file_name, "/");
+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	strbuf_reset(&new_table_path);
+	strbuf_addstr(&new_table_path, st->reftable_dir);
+	strbuf_addstr(&new_table_path, "/");
+	strbuf_addbuf(&new_table_path, &new_table_name);
+
+	if (!is_empty_table) {
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_clear(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_clear(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 0000000000..4e9557f08d
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,40 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "reftable.h"
+#include "system.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 0000000000..534463829c
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,788 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void test_read_file(void)
+{
+	char fn[256] = "/tmp/stack.test_read_file.XXXXXX";
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	assert(fd > 0);
+	n = write(fd, out, strlen(out));
+	assert(n == strlen(out));
+	err = close(fd);
+	assert(err >= 0);
+
+	err = read_lines(fn, &names);
+	assert_err(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		assert(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	assert(NULL != names[0]);
+	assert(0 == strcmp(names[0], "line"));
+	assert(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	assert(names_equal(a, a));
+	assert(!names_equal(a, b));
+	assert(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.target = "master",
+	};
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	assert_err(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	assert_err(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	assert(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	assert_err(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	assert_err(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	assert_err(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	assert_err(err);
+
+	err = reftable_addition_commit(add);
+	assert_err(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.target = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.target = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		assert(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.target = "master",
+	};
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	assert(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		assert(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].value = reftable_malloc(SHA1_SIZE);
+		refs[i].update_index = i + 1;
+		set_test_hash(refs[i].value, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		assert_err(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		assert_err(err);
+		assert(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_clear(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		assert_err(err);
+		assert(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_clear(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_clear(&refs[i]);
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = {
+		.refname = "branch",
+		.update_index = 1,
+		.new_hash = h1,
+		.old_hash = h2,
+	};
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	input.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert(err == REFTABLE_API_ERROR);
+
+	input.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp(dest.message, "one\n"));
+
+	input.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	assert_err(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	assert_err(err);
+	assert(0 == strcmp(dest.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_clear(&dest);
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value, i);
+		}
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].new_hash, i);
+			logs[i].email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		assert_err(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	assert(err == 1);
+	reftable_ref_record_clear(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	assert(err == 1);
+	reftable_log_record_clear(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	assert(err == 1);
+	reftable_ref_record_clear(&dest);
+	reftable_log_record_clear(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_clear(&refs[i]);
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.target = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	assert(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	assert_err(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	assert(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	assert_err(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	assert_err(err);
+
+	assert(!strcmp(dest.target, ref.target));
+	reftable_ref_record_clear(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	reftable_clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	assert(1 == fastlog2(3));
+	assert(2 == fastlog2(4));
+	assert(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(segs[2].log == 3);
+	assert(segs[2].start == 5);
+	assert(segs[2].end == 6);
+
+	assert(segs[1].log == 2);
+	assert(segs[1].start == 2);
+	assert(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	uint64_t sizes[0];
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	assert(seglen == 1);
+	assert(segs[0].start == 0);
+	assert(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	assert(min.start == 2);
+	assert(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	assert(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char dir[256] = "/tmp/stack.test_reflog_expire.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].time = i;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		assert_err(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	assert_err(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	assert_err(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	assert_err(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	assert(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	assert_err(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_clear(&logs[i]);
+	}
+	reftable_clear_dir(dir);
+	reftable_log_record_clear(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_stack *st2 = NULL;
+
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	assert_err(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	assert_err(err);
+	reftable_clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err, i;
+	int N = 100;
+	assert(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	assert_err(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.target = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		assert_err(err);
+
+		assert(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	assert(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	reftable_clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	add_test_case("test_reftable_stack_uptodate",
+		      &test_reftable_stack_uptodate);
+	add_test_case("test_reftable_stack_transaction_api",
+		      &test_reftable_stack_transaction_api);
+	add_test_case("test_reftable_stack_hash_id",
+		      &test_reftable_stack_hash_id);
+	add_test_case("test_sizes_to_segments_all_equal",
+		      &test_sizes_to_segments_all_equal);
+	add_test_case("test_reftable_stack_auto_compaction",
+		      &test_reftable_stack_auto_compaction);
+	add_test_case("test_reftable_stack_validate_refname",
+		      &test_reftable_stack_validate_refname);
+	add_test_case("test_reftable_stack_update_index_check",
+		      &test_reftable_stack_update_index_check);
+	add_test_case("test_reftable_stack_lock_failure",
+		      &test_reftable_stack_lock_failure);
+	add_test_case("test_reftable_stack_log_normalize",
+		      &test_reftable_stack_log_normalize);
+	add_test_case("test_reftable_stack_tombstone",
+		      &test_reftable_stack_tombstone);
+	add_test_case("test_reftable_stack_add_one",
+		      &test_reftable_stack_add_one);
+	add_test_case("test_empty_add", test_empty_add);
+	add_test_case("test_reflog_expire", test_reflog_expire);
+	add_test_case("test_suggest_compaction_segment",
+		      &test_suggest_compaction_segment);
+	add_test_case("test_suggest_compaction_segment_nothing",
+		      &test_suggest_compaction_segment_nothing);
+	add_test_case("test_sizes_to_segments", &test_sizes_to_segments);
+	add_test_case("test_sizes_to_segments_empty",
+		      &test_sizes_to_segments_empty);
+	add_test_case("test_log2", &test_log2);
+	add_test_case("test_parse_names", &test_parse_names);
+	add_test_case("test_read_file", &test_read_file);
+	add_test_case("test_names_equal", &test_names_equal);
+	add_test_case("test_reftable_stack_add", &test_reftable_stack_add);
+	return test_main(argc, argv);
+}
diff --git a/reftable/update.sh b/reftable/update.sh
new file mode 100755
index 0000000000..4dadd875f4
--- /dev/null
+++ b/reftable/update.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+set -eu
+
+# Override this to import from somewhere else, say "../reftable".
+SRC=${SRC:-origin}
+BRANCH=${BRANCH:-master}
+
+((git --git-dir reftable-repo/.git fetch -f ${SRC} ${BRANCH}:import && cd reftable-repo && git checkout -f $(git rev-parse import) ) ||
+   git clone https://github.com/google/reftable reftable-repo)
+
+cp reftable-repo/c/*.[ch] reftable/
+cp reftable-repo/c/include/*.[ch] reftable/
+cp reftable-repo/LICENSE reftable/
+
+git --git-dir reftable-repo/.git show --no-patch --format=oneline HEAD \
+  > reftable/VERSION
+
+mv reftable/system.h reftable/system.h~
+sed 's|if REFTABLE_IN_GITCORE|if 1 /* REFTABLE_IN_GITCORE */|'  < reftable/system.h~ > reftable/system.h
+
+git add reftable/*.[ch] reftable/LICENSE reftable/VERSION
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 2273fdfe39..def8883439 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,9 +4,12 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
 	strbuf_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v2 13/13] reftable: "test-tool dump-reftable" command.
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (11 preceding siblings ...)
  2020-10-01 16:11   ` [PATCH v2 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 16:11   ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
  13 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-10-01 16:11 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  2 ++
 reftable/iter.c          |  6 +++---
 reftable/reader.c        | 22 +++++++++++-----------
 t/helper/test-reftable.c |  5 +++++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 6 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 05021f79a1..5f92e5cfc2 100644
--- a/Makefile
+++ b/Makefile
@@ -2391,6 +2391,8 @@ REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
diff --git a/reftable/iter.c b/reftable/iter.c
index 2cff447323..80bc65e07d 100644
--- a/reftable/iter.c
+++ b/reftable/iter.c
@@ -59,7 +59,7 @@ void reftable_iterator_destroy(struct reftable_iterator *it)
 int reftable_iterator_next_ref(struct reftable_iterator *it,
 			       struct reftable_ref_record *ref)
 {
-	struct reftable_record rec = { 0 };
+	struct reftable_record rec = { NULL };
 	reftable_record_from_ref(&rec, ref);
 	return iterator_next(it, &rec);
 }
@@ -67,7 +67,7 @@ int reftable_iterator_next_ref(struct reftable_iterator *it,
 int reftable_iterator_next_log(struct reftable_iterator *it,
 			       struct reftable_log_record *log)
 {
-	struct reftable_record rec = { 0 };
+	struct reftable_record rec = { NULL };
 	reftable_record_from_log(&rec, log);
 	return iterator_next(it, &rec);
 }
@@ -95,7 +95,7 @@ static int filtering_ref_iterator_next(void *iter_arg,
 		}
 
 		if (fri->double_check) {
-			struct reftable_iterator it = { 0 };
+			struct reftable_iterator it = { NULL };
 
 			err = reftable_table_seek_ref(&fri->tab, &it,
 						      ref->refname);
diff --git a/reftable/reader.c b/reftable/reader.c
index c7f56b5fdc..8d3f63b0d8 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -161,8 +161,8 @@ static int parse_footer(struct reftable_reader *r, uint8_t *footer,
 int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
 		const char *name)
 {
-	struct reftable_block footer = { 0 };
-	struct reftable_block header = { 0 };
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
 	int err = 0;
 
 	memset(r, 0, sizeof(struct reftable_reader));
@@ -270,7 +270,7 @@ int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
 {
 	int32_t guess_block_size = r->block_size ? r->block_size :
 							 DEFAULT_BLOCK_SIZE;
-	struct reftable_block block = { 0 };
+	struct reftable_block block = { NULL };
 	uint8_t block_typ = 0;
 	int err = 0;
 	uint32_t header_off = next_off ? 0 : header_size(r->version);
@@ -477,9 +477,9 @@ static int reader_seek_indexed(struct reftable_reader *r,
 			       struct reftable_record *rec)
 {
 	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
-	struct reftable_record want_index_rec = { 0 };
+	struct reftable_record want_index_rec = { NULL };
 	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
-	struct reftable_record index_result_rec = { 0 };
+	struct reftable_record index_result_rec = { NULL };
 	struct table_iter index_iter = TABLE_ITER_INIT;
 	struct table_iter next = TABLE_ITER_INIT;
 	int err = 0;
@@ -584,7 +584,7 @@ int reftable_reader_seek_ref(struct reftable_reader *r,
 	struct reftable_ref_record ref = {
 		.refname = (char *)name,
 	};
-	struct reftable_record rec = { 0 };
+	struct reftable_record rec = { NULL };
 	reftable_record_from_ref(&rec, &ref);
 	return reader_seek(r, it, &rec);
 }
@@ -597,7 +597,7 @@ int reftable_reader_seek_log_at(struct reftable_reader *r,
 		.refname = (char *)name,
 		.update_index = update_index,
 	};
-	struct reftable_record rec = { 0 };
+	struct reftable_record rec = { NULL };
 	reftable_record_from_log(&rec, &log);
 	return reader_seek(r, it, &rec);
 }
@@ -644,10 +644,10 @@ static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
 		.hash_prefix = oid,
 		.hash_prefix_len = r->object_id_len,
 	};
-	struct reftable_record want_rec = { 0 };
-	struct reftable_iterator oit = { 0 };
-	struct reftable_obj_record got = { 0 };
-	struct reftable_record got_rec = { 0 };
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
 	int err = 0;
 	struct indexed_table_ref_iter *itr = NULL;
 
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index def8883439..aff4fbccda 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index c7ab5bc4a6..ce27041ab2 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -54,6 +54,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index fa2b11ace6..8b5790d93f 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -17,6 +17,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-10-01 16:10   ` [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-10-01 19:23     ` Junio C Hamano
  2020-10-01 19:59       ` Ramsay Jones
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-10-01 19:23 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget, Ramsay Jones
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +struct reftable_record reftable_new_record(uint8_t typ)
> +{
> +	struct reftable_record rec = { NULL };

I can see a lot of "sparse" inspired work went into this, but we
would want to take the previous discussion into account, as nothing
has changed since then.

https://lore.kernel.org/git/41a60e60-d2c0-7d54-5456-e44d106548a4@kdbg.org/

https://lore.kernel.org/git/1df91aa4-dda5-64da-6ae3-5d65e50a55c5@ramsayjones.plus.com/

Unfortunately we didn't reach a clear consensus back then; I thought
we saw a fix to make sparse silent when we use the "= { 0 }" idiom
in recent versions of sparse, but the above threads do not have any
mention of it, so either I am misremembering or there were other
discussions on the same topic where it was also mentioned.

> +	switch (typ) {

Need blank before this line.

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type.
  2020-10-01 19:23     ` Junio C Hamano
@ 2020-10-01 19:59       ` Ramsay Jones
  0 siblings, 0 replies; 251+ messages in thread
From: Ramsay Jones @ 2020-10-01 19:59 UTC (permalink / raw)
  To: Junio C Hamano, Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys



On 01/10/2020 20:23, Junio C Hamano wrote:
> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> +struct reftable_record reftable_new_record(uint8_t typ)
>> +{
>> +	struct reftable_record rec = { NULL };
> 
> I can see a lot of "sparse" inspired work went into this, but we
> would want to take the previous discussion into account, as nothing
> has changed since then.
> 
> https://lore.kernel.org/git/41a60e60-d2c0-7d54-5456-e44d106548a4@kdbg.org/
> 
> https://lore.kernel.org/git/1df91aa4-dda5-64da-6ae3-5d65e50a55c5@ramsayjones.plus.com/

yep, sparse is equally happy with = { 0 }; here - at least any sparse
version after 'v0.6.1-246-g41f651b4'. ;-) So, as I said yesterday, the
last official release v0.6.2 should be fine, and I _think_ Debian Testing
has a suitable version. (or you could build from source).

> Unfortunately we didn't reach a clear consensus back then; I thought
> we saw a fix to make sparse silent when we use the "= { 0 }" idiom
> in recent versions of sparse, but the above threads do not have any
> mention of it, so either I am misremembering or there were other
> discussions on the same topic where it was also mentioned.

See commit 1c96642326 (sparse: allow '{ 0 }' to be used without warnings,
22-05-2020). We could actually remove that now, since '-Wno-universal-initializer'
is now the default (but it doesn't hurt).

ATB,
Ramsay Jones




^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 01/13] reftable: add LICENSE
  2020-10-01 16:10   ` [PATCH v2 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-10-02  3:18     ` Jonathan Nieder
  0 siblings, 0 replies; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-02  3:18 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

Hi,

Han-Wen Nienhuys wrote:

> TODO: relicense?

This line presumably doesn't want to be part of permanent history. :)

Instead, it can say a little about rationale.  For example, does this
make the code easier for other projects like libgit2 to share?

> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>  create mode 100644 reftable/LICENSE

Should there be a reftable/README as well?

[...]
> --- /dev/null
> +++ b/reftable/LICENSE
> @@ -0,0 +1,31 @@
> +BSD License
> +
> +Copyright (c) 2020, Google LLC
> +All rights reserved.

nit: Google seems to prefer a slightly more spartan form of this
license text:
https://opensource.google/docs/releasing/licenses/#bsd-license-file

Nits aside, except for the TODO line, this patch looks reasonable to
me.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-01 16:10   ` [PATCH v2 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
@ 2020-10-02  3:58     ` Jonathan Nieder
  2020-10-09 21:13       ` Emily Shaffer
  2020-10-10 13:43       ` Han-Wen Nienhuys
  2020-10-08  1:41     ` Jonathan Tan
  1 sibling, 2 replies; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-02  3:58 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

Hi,

Han-Wen Nienhuys wrote:

>  reftable/reftable.h | 585 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 585 insertions(+)
>  create mode 100644 reftable/reftable.h

Adding a header in a separate patch from the implementation doesn't
match the usual practice.  Can we add declarations in the same patch
as the functions being declared instead?

We could still introduce the header early in its own patch if we want,
but it would be a skeleton of a header that gets filled out by later
patches.

[...]
> --- /dev/null
> +++ b/reftable/reftable.h
> @@ -0,0 +1,585 @@
[...]
> +#ifndef REFTABLE_H
> +#define REFTABLE_H
> +
> +#include <stdint.h>
> +#include <stddef.h>

Git tends to prefer to rely on the caller to have #include-ed
git-compat-util.h instead.  That helps avoid having to worry about
portability gotchas around what feature test macros need to have been
set before the first #include of a standard header.

> +
> +void reftable_set_alloc(void *(*malloc)(size_t),
> +			void *(*realloc)(void *, size_t), void (*free)(void *));
> +
> +/****************************************************************
> + Basic data types
> +
> + Reftables store the state of each ref in struct reftable_ref_record, and they
> + store a sequence of reflog updates in struct reftable_log_record.
> + ****************************************************************/

nit: Git style for comments would be

/*
 * Basic data types
 *
 * Reftables store the state of each [etc]
 */

In other words:

- no lines of stars
- "/*" with no text after it as first line
- " *" leading subsequent lines
- " */" as last line


> +
> +/* reftable_ref_record holds a ref database entry target_value */
> +struct reftable_ref_record {
> +	char *refname; /* Name of the ref, malloced. */
> +	uint64_t update_index; /* Logical timestamp at which this value is
> +				  written */

Likewise: Git doesn't have this style of multi-line comment, but this
could be e.g. a single-line comment before

	/* logical timestamp at which this value was written */
	uint64_t update_index;

> +	uint8_t *value; /* SHA1, or NULL. malloced. */

What does "NULL. malloced" mean?  Is the point that the
reftable_ref_record owns the memory?  (We don't do as good a job of
documenting that as we should.  Often a non-const pointer is owned,
but in this example it's probably better to be explicit that this
struct owns the memory pointed to by all its fields in the comment
describing the struct as a whole.)

It would be more idiomatic for this to be a 'struct object_id' inline
in the reftable_ref_record.

> +	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
> +	char *target; /* symref, or NULL. malloced. */
> +};
> +
> +/* returns whether 'ref' represents a deletion */
> +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);

Nice, this looks pleasant to use.

> +
> +/* prints a reftable_ref_record onto stdout */
> +void reftable_ref_record_print(struct reftable_ref_record *ref,
> +			       uint32_t hash_id);

Probably worth being explicit about the purpose: is this for
debugging?

What is a hash_id?

> +
> +/* frees and nulls all pointer values. */
> +void reftable_ref_record_clear(struct reftable_ref_record *ref);
> +
> +/* returns whether two reftable_ref_records are the same */
> +int reftable_ref_record_equal(struct reftable_ref_record *a,
> +			      struct reftable_ref_record *b, int hash_size);
> +
> +/* reftable_log_record holds a reflog entry */
> +struct reftable_log_record {

Should some of this documentation mention the reftable spec?  E.g. an
opening comment at the beginning of the header.  That might help avoid
having to duplicate too much explanation when terminology matches.

> +	char *refname;
> +	uint64_t update_index; /* logical timestamp of a transactional update.
> +				*/

multiline comment (I won't mention the rest).

> +	uint8_t *new_hash;
> +	uint8_t *old_hash;
> +	char *name;
> +	char *email;
> +	uint64_t time;
> +	int16_t tz_offset;
> +	char *message;
> +};
> +
> +/* returns whether 'ref' represents the deletion of a log record. */

These are a bit subtle --- should the comment include a little more
explanation (e.g. about how this is mostly for tools like "git
stash")?

> +int reftable_log_record_is_deletion(const struct reftable_log_record *log);
> +
> +/* frees and nulls all pointer values. */
> +void reftable_log_record_clear(struct reftable_log_record *log);
> +
> +/* returns whether two records are equal. */
> +int reftable_log_record_equal(struct reftable_log_record *a,
> +			      struct reftable_log_record *b, int hash_size);
> +
> +/* dumps a reftable_log_record on stdout, for debugging/testing. */
> +void reftable_log_record_print(struct reftable_log_record *log,
> +			       uint32_t hash_id);
> +
> +/****************************************************************
> + Error handling
> +
> + Error are signaled with negative integer return values. 0 means success.
> + ****************************************************************/
> +
> +/* different types of errors */
> +enum reftable_error {

Reminds me of zlib and liblzma.  I like it.

> +	/* Unexpected file system behavior */
> +	REFTABLE_IO_ERROR = -2,
> +
> +	/* Format inconsistency on reading data
> +	 */
> +	REFTABLE_FORMAT_ERROR = -3,
> +
> +	/* File does not exist. Returned from block_source_from_file(),  because
> +	   it needs special handling in stack.
> +	*/
> +	REFTABLE_NOT_EXIST_ERROR = -4,
> +
> +	/* Trying to write out-of-date data. */
> +	REFTABLE_LOCK_ERROR = -5,
> +
> +	/* Misuse of the API:
> +	   - on writing a record with NULL refname.
> +	   - on writing a reftable_ref_record outside the table limits
> +	   - on writing a ref or log record before the stack's next_update_index
> +	   - on writing a log record with multiline message with
> +	   exact_log_message unset
> +	   - on reading a reftable_ref_record from log iterator, or vice versa.
> +	*/
> +	REFTABLE_API_ERROR = -6,

Should these call BUG()?  Or is it useful for some callers to be able
to recover from these errors?

> +
> +	/* Decompression error */
> +	REFTABLE_ZLIB_ERROR = -7,
> +
> +	/* Wrote a table without blocks. */
> +	REFTABLE_EMPTY_TABLE_ERROR = -8,
> +
> +	/* Dir/file conflict. */

This is when adding a ref?

> +	REFTABLE_NAME_CONFLICT = -9,
> +
> +	/* Illegal ref name. */
> +	REFTABLE_REFNAME_ERROR = -10,
> +};
> +
> +/* convert the numeric error code to a string. The string should not be
> + * deallocated. */
> +const char *reftable_error_str(int err);

In the ZLIB_ERROR case this would produce a frustratingly vague
message.  I think I don't mind, but it might be worth a NEEDSWORK
comment to revisit in the future.

> +
> +/*
> + * Convert the numeric error code to an equivalent errno code.
> + */
> +int reftable_error_to_errno(int err);

What is the intended use of this function?

[...]
> +/* reftable_block_stats holds statistics for a single block type */
> +struct reftable_block_stats {

Are these for use in tracing, debugging, or some other purpose?

[...]
> +/* reftable_new_writer creates a new writer */
> +struct reftable_writer *
> +reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
> +		    void *writer_arg, struct reftable_write_options *opts);

What is a writer?  What is a writer_func?  Is this a handle that
allows me to determine where I write my reftable?  (Sounds reasonable
but the comment could be more explicit.)

[...]
> +
> +/* write to a file descriptor. fdp should be an int* pointing to the fd. */
> +int reftable_fd_write(void *fdp, const void *data, size_t size);

What does this do for interrupted writes?

> +
> +/* Set the range of update indices for the records we will add.  When
> +   writing a table into a stack, the min should be at least
> +   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> +
> +   For transactional updates, typically min==max. When converting an existing
> +   ref database into a single reftable, this would be a range of update-index
> +   timestamps.
> + */
> +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +				uint64_t max);

What happens if I write to a reftable_writer without setting limits
first?

> +
> +/* adds a reftable_ref_record. Must be called in ascending
> +   order. The update_index must be within the limits set by
> +   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
> +
> +   It is an error to write a ref record after a log record.
> + */
> +int reftable_writer_add_ref(struct reftable_writer *w,
> +			    struct reftable_ref_record *ref);

optional: since it's called a writer, _write_ref() feels like a
slightly more natural verb (likewise for the others).

[...]
> +/* adds a reftable_log_record. Must be called in ascending order (with more
> +   recent log entries first.)
> + */
> +int reftable_writer_add_log(struct reftable_writer *w,
> +			    struct reftable_log_record *log);

If I want to write multiple log records with the same update_index to
represent a multi-ref transaction, do I have to do anything careful?

> +
> +/* Convenience function to add multiple logs. Will sort the records by
> +   key before adding. */
> +int reftable_writer_add_logs(struct reftable_writer *w,
> +			     struct reftable_log_record *logs, int n);
> +
> +/* reftable_writer_close finalizes the reftable. The writer is retained so
> + * statistics can be inspected. */
> +int reftable_writer_close(struct reftable_writer *w);

Nice.

> +
> +/* writer_stats returns the statistics on the reftable being written.
> +
> +   This struct becomes invalid when the writer is freed.
> + */
> +const struct reftable_stats *writer_stats(struct reftable_writer *w);

Also nice.

[...]
> +/* iterator is the generic interface for walking over data stored in a
> +   reftable. It is generally passed around by value.
> +*/
> +struct reftable_iterator {
> +	struct reftable_iterator_vtable *ops;
> +	void *iter_arg;
> +};

Even small structs like this one tend not to be passed by value in the
Git codebase.  The only counterexample I found with a quick search
(builtin/stash.c::save_untracked_files) appears to be a typo.

[...]
> +/* a contiguous segment of bytes. It keeps track of its generating block_source
> +   so it can return itself into the pool.
> +*/
> +struct reftable_block {
> +	uint8_t *data;
> +	int len;
> +	struct reftable_block_source source;
> +};

Is this something that could use mem-pool.h?

> +/* block_source_vtable are the operations that make up block_source */
> +struct reftable_block_source_vtable {
> +	/* returns the size of a block source */
> +	uint64_t (*size)(void *source);
> +
> +	/* reads a segment from the block source. It is an error to read
> +	   beyond the end of the block */
> +	int (*read_block)(void *source, struct reftable_block *dest,
> +			  uint64_t off, uint32_t size);
> +	/* mark the block as read; may return the data back to malloc */
> +	void (*return_block)(void *source, struct reftable_block *blockp);
> +
> +	/* release all resources associated with the block source */
> +	void (*close)(void *source);
> +};

This is abstracting file input?  I wonder if some name emphasizing the
'read' operation like block_reader might be clearer.

Do I pass in the 'struct block_source *' as the source arg?  If so, why
are these declared as void *?

Is the reason this manages the buffer instead of requiring a
caller-supplied buffer to support zero-copy?

[...]
> +
> +/* opens a file on the file system as a block_source */
> +int reftable_block_source_from_file(struct reftable_block_source *block_src,
> +				    const char *name);
> +
> +/* The reader struct is a handle to an open reftable file. */
> +struct reftable_reader;

... ah, this is why the underlying file handle isn't called a reader.
Hmm.

[...]
> +/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
> +   in the table.  To seek to the start of the table, use name = "".

nit: s/iterator where/iterator pointing where/

> +
> +   example:
> +
> +   struct reftable_reader *r = NULL;
> +   int err = reftable_new_reader(&r, &src, "filename");
> +   if (err < 0) { ... }
> +   struct reftable_iterator it  = {0};
> +   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
> +   if (err < 0) { ... }
> +   struct reftable_ref_record ref  = {0};
> +   while (1) {
> +     err = reftable_iterator_next_ref(&it, &ref);
> +     if (err > 0) {
> +       break;
> +     }
> +     if (err < 0) {
> +       ..error handling..
> +     }
> +     ..found..
> +   }
> +   reftable_iterator_destroy(&it);
> +   reftable_ref_record_clear(&ref);

Nice.

> + */
> +int reftable_reader_seek_ref(struct reftable_reader *r,
> +			     struct reftable_iterator *it, const char *name);
> +
> +/* returns the hash ID used in this table. */
> +uint32_t reftable_reader_hash_id(struct reftable_reader *r);

What is a hash id?

[...]
> + Merged tables
> +
> + A ref database kept in a sequence of table files. The merged_table presents a
> + unified view to reading (seeking, iterating) a sequence of immutable tables.

Nice as well.

[...]
> +/* returns the max update_index covered by this merged table. */
> +uint64_t
> +reftable_merged_table_max_update_index(struct reftable_merged_table *mt);

nit: return type should be on the same line as the function name

	uint64_t reftable_merged_table_max_update_index(
					struct reftable_merged_table *mt);

[...]
> + Generic tables
> +
> + A unified API for reading tables, either merged tables, or single readers.

Are there callers/helpers that don't know whether they want one or the
other?

[...]
> + * a stack is a stack of reftables, which can be mutated by pushing a table to
> + * the top of the stack

Nice.

[...]
> +/* holds a transaction to add tables at the top of a stack. */
> +struct reftable_addition;

When would I add multiple tables at once to the top?  Is this used
during compacting, for example?

[...]
> +/* heuristically compact unbalanced table stack. */
> +int reftable_stack_auto_compact(struct reftable_stack *st);

When do I call this?  What are the semantics?  (By this I don't mean
what heuristic does it use, but what expectation should I have as a
caller?)

> +/* convenience function to read a single log. Returns < 0 for error, 0
> +   for success, and 1 if ref not found. */
> +int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
> +			    struct reftable_log_record *log);

nit: single log *record*?

> +/* statistics on past compactions. */
> +struct reftable_compaction_stats {
> +	uint64_t bytes; /* total number of bytes written */
> +	uint64_t entries_written; /* total number of entries written, including
> +				     failures. */
> +	int attempts; /* how often we tried to compact */
> +	int failures; /* failures happen on concurrent updates */
> +};
> +
> +/* return statistics for compaction up till now. */
> +struct reftable_compaction_stats *
> +reftable_stack_compaction_stats(struct reftable_stack *st);

All in all, I like this API.  Thanks for putting it together
thoughtfully.

The API builds up in layers (blocks, reftables, merged reftables, etc),
suggesting a natural division for the patch series.  I think the rest of
the series follows that --- let's see.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 03/13] vcxproj: adjust for the reftable changes
  2020-10-01 16:10   ` [PATCH v2 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
@ 2020-10-02  4:02     ` Jonathan Nieder
  2020-10-02 11:43       ` Johannes Schindelin
  0 siblings, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-02  4:02 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Johannes Schindelin

Han-Wen Nienhuys wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> This allows Git to be compiled via Visual Studio again after integrating
> the `hn/reftable` branch.

nit: This branch name is no longer meaningful to the primary audience
for the commit message (people discovering this commit in git history
later).

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  config.mak.uname                           |  2 +-
>  contrib/buildsystems/Generators/Vcxproj.pm | 11 ++++++++++-
>  2 files changed, 11 insertions(+), 2 deletions(-)

Can this be squashed into or put immediately after patch 5 which
introduces the Makefile?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 04/13] reftable: add a barebones unittest framework
  2020-10-01 16:10   ` [PATCH v2 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
@ 2020-10-02  4:05     ` Jonathan Nieder
  2020-10-08  1:45     ` Jonathan Tan
  1 sibling, 0 replies; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-02  4:05 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

Hi,

Han-Wen Nienhuys wrote:

> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  reftable/test_framework.c | 68 +++++++++++++++++++++++++++++++++++++++
>  reftable/test_framework.h | 59 +++++++++++++++++++++++++++++++++
>  2 files changed, 127 insertions(+)
>  create mode 100644 reftable/test_framework.c
>  create mode 100644 reftable/test_framework.h

Hm, can the commit message say a little about the motivation for this
test framework and what it allows?

For example, does it allow unit tests to write output in TAP format?
How does it compare to existing frameworks like
https://www.eyrie.org/~eagle/software/c-tap-harness/?  How does this
approach compare to what Git's existing unit-style tests with helpers
in t/helper/ driven by scripts in t/ do?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-10-02  4:12     ` Jonathan Nieder
  2020-10-10 17:32       ` Han-Wen Nienhuys
  2020-10-02 14:01     ` Johannes Schindelin
  2020-10-08  1:48     ` Jonathan Tan
  2 siblings, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-02  4:12 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys

Han-Wen Nienhuys wrote:

> --- /dev/null
> +++ b/reftable/basics.c
> @@ -0,0 +1,131 @@
> +/*
> +Copyright 2020 Google LLC
> +
> +Use of this source code is governed by a BSD-style
> +license that can be found in the LICENSE file or at
> +https://developers.google.com/open-source/licenses/bsd
> +*/
> +
> +#include "basics.h"

Git's source files start with #include-ing "git-compat-util.h"; how do
we approach that here?

E.g. would we pass -include on the command line, or could we #include
some kind of compat-util.h that #include-s "git-compat-util.h" in Git
and does the appropriate steps for enabling feature test macros and
including system headers when used outside?

[...]
> +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)

How does this compare to stdlib's bsearch?

[...]
> +void free_names(char **a)
> +{
> +	char **p = a;
> +	if (p == NULL) {
> +		return;
> +	}
> +	while (*p) {
> +		reftable_free(*p);
> +		p++;
> +	}
> +	reftable_free(a);
> +}

Are there other callers that need custom free?

[...]
> +int names_length(char **names)
> +{
> +	int len = 0;
> +	char **p = names;
> +	while (*p) {
> +		p++;
> +		len++;
> +	}
> +	return len;
> +}

The rest are probably easier to evaluate when I look at the callers, and
it's time for me to go to sleep now.  I'll try to find some time soon to
pick the review back up.

Thanks for a thoughtfully put together library so far.

Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 03/13] vcxproj: adjust for the reftable changes
  2020-10-02  4:02     ` Jonathan Nieder
@ 2020-10-02 11:43       ` Johannes Schindelin
  0 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-02 11:43 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Johannes Schindelin via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

Hi Jonathan,

On Thu, 1 Oct 2020, Jonathan Nieder wrote:

> Han-Wen Nienhuys wrote:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > This allows Git to be compiled via Visual Studio again after integrating
> > the `hn/reftable` branch.
>
> nit: This branch name is no longer meaningful to the primary audience
> for the commit message (people discovering this commit in git history
> later).
>
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  config.mak.uname                           |  2 +-
> >  contrib/buildsystems/Generators/Vcxproj.pm | 11 ++++++++++-
> >  2 files changed, 11 insertions(+), 2 deletions(-)
>
> Can this be squashed into or put immediately after patch 5 which
> introduces the Makefile?

I never intended this to be a stand-alone patch. So yes, I would be very
much in favor of squashing these changes into the appropriate commit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 12/13] reftable: rest of library
  2020-10-01 16:11   ` [PATCH v2 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2020-10-02 13:57     ` Johannes Schindelin
  2020-10-02 17:08       ` Junio C Hamano
  0 siblings, 1 reply; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-02 13:57 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys, Han-Wen Nienhuys

Hi Han-Wen,

On Thu, 1 Oct 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> This will be further split up once preceding commits have passed review.

Before you further split it up, I encourage you to include these patches
without which the CI builds will continue to fail (Junio, could I ask you
to either cherry-pick them from https://github.com/git-for-windows/git's
shears/seen branch, or apply them from the mbox?):

-- snipsnap --
From e485e006f34922439f2e971a1c5c38b8ca56c011 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 30 Sep 2020 14:46:59 +0200
Subject: [PATCH 1/3] fixup??? reftable: rest of library

	struct abc x = {}

is a GNUism.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 reftable/dump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/reftable/dump.c b/reftable/dump.c
index ed09f2e2f94..233f0434e39 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -75,7 +75,7 @@ static int dump_table(const char *tablename)
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
+	struct reftable_write_options cfg = { 0 };

 	int err = reftable_new_stack(&stack, stackdir, cfg);
 	if (err < 0)
@@ -94,7 +94,7 @@ static int compact_stack(const char *stackdir)
 static int dump_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
+	struct reftable_write_options cfg = { 0 };
 	struct reftable_iterator it = { NULL };
 	struct reftable_ref_record ref = { NULL };
 	struct reftable_log_record log = { NULL };
--
2.28.0.windows.1.18.g5300e52e185


From d5faa818a1bc00016e310d27602551127db620fb Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 30 Sep 2020 14:55:28 +0200
Subject: [PATCH 2/3] fixup??? reftable: rest of library

0-sized arrays are actually not portable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 reftable/stack_test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 534463829cb..e863cc3c0a2 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -572,11 +572,11 @@ static void test_sizes_to_segments(void)

 static void test_sizes_to_segments_empty(void)
 {
-	uint64_t sizes[0];
+	uint64_t sizes[1];

 	int seglen = 0;
 	struct segment *segs =
-		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+		sizes_to_segments(&seglen, sizes, 0);
 	assert(seglen == 0);
 	reftable_free(segs);
 }
--
2.28.0.windows.1.18.g5300e52e185


From d446e1a7354c676d60114b50ba96a6ea083441ae Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 30 Sep 2020 15:19:28 +0200
Subject: [PATCH 3/3] fixup??? reftable: rest of library

Avoid using `getopt()`: it might be POSIX, but Git's audience is much
larger than POSIX. MSVC, for example, does not support `getopt()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 reftable/dump.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/reftable/dump.c b/reftable/dump.c
index 233f0434e39..b63c9fe9e81 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -160,40 +160,34 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
-	int opt;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
-	const char *arg = NULL;
-	while ((opt = getopt(argc, argv, "2chts")) != -1) {
-		switch (opt) {
-		case '2':
-			hash_id = 0x73323536;
+	const char *arg = NULL, *argv0 = argv[0];
+
+	for (; argc > 1; argv++, argc--)
+		if (*argv[1] != '-')
 			break;
-		case 't':
+		else if (!strcmp("-2", argv[1]))
+			hash_id = 0x73323536;
+		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
-			break;
-		case 's':
+		else if (!strcmp("-s", argv[1]))
 			opt_dump_stack = 1;
-			break;
-		case 'c':
+		else if (!strcmp("-c", argv[1]))
 			opt_compact = 1;
-			break;
-		case '?':
-		case 'h':
+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
 			print_help();
 			return 2;
-			break;
 		}
-	}

-	if (argv[optind] == NULL) {
+	if (argc != 2) {
 		fprintf(stderr, "need argument\n");
 		print_help();
 		return 2;
 	}

-	arg = argv[optind];
+	arg = argv[1];

 	if (opt_dump_table) {
 		err = dump_table(arg);
@@ -204,7 +198,7 @@ int reftable_dump_main(int argc, char *const *argv)
 	}

 	if (err < 0) {
-		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
 			reftable_error_str(err));
 		return 1;
 	}
--
2.28.0.windows.1.18.g5300e52e185


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
  2020-10-02  4:12     ` Jonathan Nieder
@ 2020-10-02 14:01     ` Johannes Schindelin
  2020-10-02 20:47       ` Junio C Hamano
  2020-10-08  1:48     ` Jonathan Tan
  2 siblings, 1 reply; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-02 14:01 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Han-Wen Nienhuys,
	Han-Wen Nienhuys, gitster

Hi Han-Wen,

On Thu, 1 Oct 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> This commit provides basic utility classes for the reftable library.
>
> Since the reftable library must compile standalone, there may be some overlap
> with git-core utility functions.

My position on this idea to duplicate functionality in order to somehow
pretend that the reftable code is independent of Git's source code has not
changed.

Be that as it may, the CI build failures impacted my notifications'
signal/noise ratio too much, so here goes (Junio, I would be delighted if
you could apply this on top of your branch):

-- snipsnap --
From 1698c3450b5e47927c2e8fe35cad2634963bad89 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 30 Sep 2020 16:53:53 +0200
Subject: [PATCH] fixup??? reftable: utility functions

Let's not forget our CMake configuration.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 contrib/buildsystems/CMakeLists.txt | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 1bd53c4ad51..284bf4402d7 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -591,6 +591,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
 list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
 add_library(xdiff STATIC ${libxdiff_SOURCES})

+#reftable
+parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
+
+list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+add_library(reftable STATIC ${reftable_SOURCES})
+
 if(WIN32)
 	if(NOT MSVC)#use windres when compiling with gcc and clang
 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
@@ -613,7 +619,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)

-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()
@@ -848,11 +854,15 @@ if(BUILD_TESTING)
 add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
 target_link_libraries(test-fake-ssh common-main)

+#reftable-tests
+parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
+list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+
 #test-tool
 parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")

 list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
+add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
 target_link_libraries(test-tool common-main)

 set_target_properties(test-fake-ssh test-tool
--
2.28.0.windows.1.18.g5300e52e185


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 12/13] reftable: rest of library
  2020-10-02 13:57     ` Johannes Schindelin
@ 2020-10-02 17:08       ` Junio C Hamano
  2020-10-04 18:39         ` Johannes Schindelin
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-10-02 17:08 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Subject: [PATCH 1/3] fixup??? reftable: rest of library

This is unambiguously good.

> Subject: [PATCH 2/3] fixup??? reftable: rest of library
>
> 0-sized arrays are actually not portable.
> ...
>  static void test_sizes_to_segments_empty(void)
>  {
> -	uint64_t sizes[0];
> +	uint64_t sizes[1];
>
>  	int seglen = 0;
>  	struct segment *segs =
> -		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
> +		sizes_to_segments(&seglen, sizes, 0);
>  	assert(seglen == 0);
>  	reftable_free(segs);

Question to Han-Wen.

It is unclear what this test wants to test.  Do we even need sizes[]
array if we know we are passing a hardcoded 0?  IOW, I would
understand if the test were

	sizes_to_segments(&seglen, NULL, 0);

to ensure that sizes_to_segments do not even attempt to look at sizes[]
array when the number of elements is 0.

> Subject: [PATCH 3/3] fixup??? reftable: rest of library
>
> Avoid using `getopt()`: it might be POSIX, but Git's audience is much
> larger than POSIX. MSVC, for example, does not support `getopt()`.

Either that, or we could use parse-options().  I do not care either
way, as this seems to be purely for debugging?


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-02 14:01     ` Johannes Schindelin
@ 2020-10-02 20:47       ` Junio C Hamano
  2020-10-03  8:07         ` Johannes Schindelin
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-10-02 20:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> This commit provides basic utility classes for the reftable library.
>>
>> Since the reftable library must compile standalone, there may be some overlap
>> with git-core utility functions.
>
> My position on this idea to duplicate functionality in order to somehow
> pretend that the reftable code is independent of Git's source code has not
> changed.

The above may be sufficient between you and Han-Wen, and a few
selected others who still remember, but please do not forget that we
are collaborating in the open.  It would help those who are learning
open source interactions by watching from sidelines, if you spelled
out what your position is or at least left a pointer to a previous
discussion.

FWIW, I think it is a mistake to try to make this directory so that
it can be lifted out of our code base and used independently, as it
has to create unnecessary friction to the code when used here.  It
is not like other code that we are not their primary intended
audience and we simply "borrow" from them (e.g. xdiff/ & sha1dc/).

The previous paragraph agrees with my guess of your position, but
perhaps you have something else in mind.  I dunno.

> Be that as it may, the CI build failures impacted my notifications'
> signal/noise ratio too much, so here goes (Junio, I would be delighted if
> you could apply this on top of your branch):

Quite honestly, this close to the first preview release for 2.29,
I'd rather drop this round from 'seen' and expect an updated topic,
than piling fixup on top of fixups.

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-02 20:47       ` Junio C Hamano
@ 2020-10-03  8:07         ` Johannes Schindelin
  0 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-03  8:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

Hi Junio,

On Fri, 2 Oct 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> This commit provides basic utility classes for the reftable library.
> >>
> >> Since the reftable library must compile standalone, there may be some overlap
> >> with git-core utility functions.
> >
> > My position on this idea to duplicate functionality in order to somehow
> > pretend that the reftable code is independent of Git's source code has not
> > changed.
>
> The above may be sufficient between you and Han-Wen, and a few
> selected others who still remember, but please do not forget that we
> are collaborating in the open.  It would help those who are learning
> open source interactions by watching from sidelines, if you spelled
> out what your position is or at least left a pointer to a previous
> discussion.

You're right, sorry.

> FWIW, I think it is a mistake to try to make this directory so that
> it can be lifted out of our code base and used independently, as it
> has to create unnecessary friction to the code when used here.  It
> is not like other code that we are not their primary intended
> audience and we simply "borrow" from them (e.g. xdiff/ & sha1dc/).
>
> The previous paragraph agrees with my guess of your position, but
> perhaps you have something else in mind.  I dunno.

Yes, you were spot on.

I am rather strongly opposed to introducing duplicated functionality.
Rather than spending time on finding half a dozen links to conversations
in the past, let me try to make the case afresh:

I have a very concrete example of the undesirable consequences: the
commit-graph breakage Stolee and I debugged yesterday, where a bug hid in
`in_merge_bases_many()` for two years, hidden by the fact that there was
_duplicate logic_ in `get_reachable_subset()`.

Pretty much _all_ of the code in `reftable/` is _very_ new, has not seen
any real-world testing, and as such cannot really be trusted, in
particular the code that duplicates functionality already present in
libgit.a. Note that I do not claim that Han-Wen cannot be trusted: to the
contrary, I trust him very much (we have a history going all the way back
to Lilypond, the musical typesetter). The code, however, cannot be
trusted, and needs to be reviewed and tested in practice.

The simple fact is that even the thorough review that the commit-graph
patches received did not prevent the `min_generation`/`max_generation` bug
in `in_merge_bases_many()` from entering the code base, and if we had not
had _another_ function doing very, very similar things, that bug would
most likely not have lingered for _this_ long.

Likewise for the reimplementation of the convenient functionality like
`strbuf`. This kind of duplication (in the case of `struct strbuf`, quite
_literally_: there are now confusingly _two_ `strbuf.h` files that even
try to implement the _exact same_ API) is _prone_ to lead to stale, or
long undiscovered, bugs in one of the duplicated implementations.

Which means that we're likely to introduce bugs with those new functions
introduced in `reftable/`. For the reftable functionality, that has to be
accepted. For functions that do the same as equivalents in libgit.a, we do
not have to accept it.

In any case, duplicated functionality is a maintenance burden, and
especially in this case, an unnecessary one at that: unlike xdiff or
compat/regex/, the reftable functionality is co-dependent with `libgit.a`.
Yes, you can implement libreftable as a stand-alone library (including
duplicated functionality from libgit.a, as I pointed out), but you _cannot
use it independently of Git_.

The reftable concerns _Git_ refs.

This is not a compressor that could come in handy in a messenger app, or a
diff engine that could potentially help with a semantic merging tool.

It is code whose entire purpose is to manage boatloads of Git refs
efficiently. Therefore, I see no convincing reason to insist on
maintaining this code outside of git/git.

It would make more sense to maintain parse-options or strbuf or
t/test-lib.sh outside of git/git (because at least that functionality is
independent enough of Git to be potentially useful to other projects) and
we don't. We maintain them inside git/git, and for good reason: it makes
development and maintenance a hell of a lot easier.

And once we agree that reftable is intended to be so integral a part of
Git that it should absolutely have its home inside git/git, all of that
duplicated functionality (with almost totally untested implementations, at
least when it comes to "battle testing in production") can totally
evaporate and does not need to worry us any more.

And then we can start to spend reviewers' time on the actual code
revolving around the actual reftables rather than having to bother with
reviewing, say, a close, but non-identical `strbuf` that might even have
"borrowed" parts of the `strbuf` code and put it under a more permissive
license (I don't know, and I don't want to know, that's part of the reason
why I don't want to review the reftable patches in their current form).

In other words, we can then finally start to actually review the reftable
code on its merits of teaching Git a valuable new feature.

> > Be that as it may, the CI build failures impacted my notifications'
> > signal/noise ratio too much, so here goes (Junio, I would be delighted if
> > you could apply this on top of your branch):
>
> Quite honestly, this close to the first preview release for 2.29,
> I'd rather drop this round from 'seen' and expect an updated topic,
> than piling fixup on top of fixups.

Sounds good to me.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 12/13] reftable: rest of library
  2020-10-02 17:08       ` Junio C Hamano
@ 2020-10-04 18:39         ` Johannes Schindelin
  0 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-04 18:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

Hi Junio,

On Fri, 2 Oct 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > Subject: [PATCH 3/3] fixup??? reftable: rest of library
> >
> > Avoid using `getopt()`: it might be POSIX, but Git's audience is much
> > larger than POSIX. MSVC, for example, does not support `getopt()`.
>
> Either that, or we could use parse-options().  I do not care either
> way, as this seems to be purely for debugging?

I actually wanted to use `parse_options()`, but due to the way `reftable/`
insists on being completely separate from `libgit.a`, it is not really
trivial, as `#include "parse-options.h"` fails (because
`git-compat-util.h` was not included, and IIRC including _that_ resulted
in more compile errors, at which stage I opted for a more minimal
solution).

So yeah, I would strongly expect `reftable/` to become considerably
simpler once it starts using `libgit.a`'s goodies.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-01 16:10   ` [PATCH v2 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
  2020-10-02  3:58     ` Jonathan Nieder
@ 2020-10-08  1:41     ` Jonathan Tan
  2020-10-10 16:57       ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Jonathan Tan @ 2020-10-08  1:41 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, hanwen, peff, Jonathan Tan

My initial impression is that this public API is much larger than I
would expect from reading the reftable spec. Let me look at it in
detail...

> +/****************************************************************
> + Basic data types
> +
> + Reftables store the state of each ref in struct reftable_ref_record, and they
> + store a sequence of reflog updates in struct reftable_log_record.
> + ****************************************************************/
> +
> +/* reftable_ref_record holds a ref database entry target_value */
> +struct reftable_ref_record {
> +	char *refname; /* Name of the ref, malloced. */
> +	uint64_t update_index; /* Logical timestamp at which this value is
> +				  written */
> +	uint8_t *value; /* SHA1, or NULL. malloced. */
> +	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
> +	char *target; /* symref, or NULL. malloced. */
> +};

A typical user would just want to do refname->hash, refname->log, or possibly
hash->refname lookups, so a ref record would be meant for low-level
users. I can't think of a typical use case, but even if such existed, I
wouldn't expect this API - in particular, the refname is not written as
such on disk (it is composed from a prefix and a suffix, if I remember
correctly), so I would expect a function that one can call to write the
refname, but not for it to appear in a data structure.

> +/* returns whether 'ref' represents a deletion */
> +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);

This looks unorthogonal - looking at the spec, the "value type" can
represent a deletion, one object name (value of the ref), two object
names (value of the ref + peeled target), or symbolic reference. If
we're willing to allocate memory for a ref record, I think we can afford
the extra 4 bytes (maybe 8) and use a tagged union instead.

> +/* prints a reftable_ref_record onto stdout */
> +void reftable_ref_record_print(struct reftable_ref_record *ref,
> +			       uint32_t hash_id);

Not sure if this belongs here.

> +/* frees and nulls all pointer values. */
> +void reftable_ref_record_clear(struct reftable_ref_record *ref);

Usually we call this "release" in Git.

> +/* returns whether two reftable_ref_records are the same */
> +int reftable_ref_record_equal(struct reftable_ref_record *a,
> +			      struct reftable_ref_record *b, int hash_size);

Not sure what this will be used for.

> +/* reftable_log_record holds a reflog entry */
> +struct reftable_log_record {
> +	char *refname;
> +	uint64_t update_index; /* logical timestamp of a transactional update.
> +				*/
> +	uint8_t *new_hash;
> +	uint8_t *old_hash;
> +	char *name;
> +	char *email;
> +	uint64_t time;
> +	int16_t tz_offset;
> +	char *message;
> +};

Mostly same comments as above.

[snip]

> +/****************************************************************
> + Error handling
> +
> + Error are signaled with negative integer return values. 0 means success.
> + ****************************************************************/

I didn't look at this section deeply, but this looks OK at first glance.

> +/****************************************************************
> + Writing
> +
> + Writing single reftables
> + ****************************************************************/

OK - writing a single reftable makes sense.

> +/* reftable_write_options sets options for writing a single reftable. */
> +struct reftable_write_options {
> +	/* boolean: do not pad out blocks to block size. */
> +	int unpadded;
> +
> +	/* the blocksize. Should be less than 2^24. */
> +	uint32_t block_size;
> +
> +	/* boolean: do not generate a SHA1 => ref index. */
> +	int skip_index_objects;
> +
> +	/* how often to write complete keys in each block. */
> +	int restart_interval;
> +
> +	/* 4-byte identifier ("sha1", "s256") of the hash.
> +	 * Defaults to SHA1 if unset
> +	 */
> +	uint32_t hash_id;
> +
> +	/* boolean: do not check ref names for validity or dir/file conflicts.
> +	 */
> +	int skip_name_check;
> +
> +	/* boolean: copy log messages exactly. If unset, check that the message
> +	 *   is a single line, and add '\n' if missing.
> +	 */
> +	int exact_log_message;
> +};

I'm not sure we need so many options, but nothing jumps out to me
off-hand. Some minor things - "hash_id" should probably be an enum and
the booleans should be "unsigned variable_name : 1".

[snip statistics]

Statistics look useful.

> +/* reftable_new_writer creates a new writer */
> +struct reftable_writer *
> +reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
> +		    void *writer_arg, struct reftable_write_options *opts);
> +
> +/* write to a file descriptor. fdp should be an int* pointing to the fd. */
> +int reftable_fd_write(void *fdp, const void *data, size_t size);

Not sure if we'll need to write to things other than an fd, but if we
do, this makes sense.

> +/* Set the range of update indices for the records we will add.  When
> +   writing a table into a stack, the min should be at least
> +   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> +
> +   For transactional updates, typically min==max. When converting an existing
> +   ref database into a single reftable, this would be a range of update-index
> +   timestamps.
> + */
> +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +				uint64_t max);

This seems to be here because we want to write the file in a single
pass, and the update index maximum and minimum appear in the header. If
we were allowed to seek while writing, could the update index maximum
and minimum be deduced instead? That might be a compelling reason to
only support FDs, and require that the FDs be seekable.

> +/* adds a reftable_ref_record. Must be called in ascending
> +   order. The update_index must be within the limits set by
> +   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
> +
> +   It is an error to write a ref record after a log record.
> + */
> +int reftable_writer_add_ref(struct reftable_writer *w,
> +			    struct reftable_ref_record *ref);
> +
> +/* Convenience function to add multiple refs. Will sort the refs by
> +   name before adding. */
> +int reftable_writer_add_refs(struct reftable_writer *w,
> +			     struct reftable_ref_record *refs, int n);

Since refs is an array of objects and not an array of pointers, these
two functions could be combined - the first is the same as calling the
second with n=1.

Also, ascending order of what?

The user will also need to know where to get the update index from - I
presume that this will be the maximum update index of any record with
the given refname + 1.

[snip similar functions for log]

> +/* reftable_writer_close finalizes the reftable. The writer is retained so
> + * statistics can be inspected. */
> +int reftable_writer_close(struct reftable_writer *w);
> +
> +/* writer_stats returns the statistics on the reftable being written.
> +
> +   This struct becomes invalid when the writer is freed.
> + */
> +const struct reftable_stats *writer_stats(struct reftable_writer *w);
> +
> +/* reftable_writer_free deallocates memory for the writer */
> +void reftable_writer_free(struct reftable_writer *w);

OK.

> +/****************************************************************
> + * ITERATING
> + ****************************************************************/

[snip]

> +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
> +   end of iteration.
> +*/
> +int reftable_iterator_next_ref(struct reftable_iterator *it,
> +			       struct reftable_ref_record *ref);
> +
> +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
> +   end of iteration.
> +*/
> +int reftable_iterator_next_log(struct reftable_iterator *it,
> +			       struct reftable_log_record *log);

From my recollection, in Git, we typically use foreach functions with a
callback that is invoked once for each result. I think that's preferable
to this approach.

> +/****************************************************************
> + Reading single tables
> +
> + The follow routines are for reading single files. For an application-level
> + interface, skip ahead to struct reftable_merged_table and struct
> + reftable_stack.
> + ****************************************************************/

I think that this whole section could be skipped - if we really wanted
to read a single table, we could just use the merged-table interface
with one table.

[snip]

> +/****************************************************************
> + Merged tables
> +
> + A ref database kept in a sequence of table files. The merged_table presents a
> + unified view to reading (seeking, iterating) a sequence of immutable tables.
> + ****************************************************************/
> +
> +/* A merged table is implements seeking/iterating over a stack of tables. */
> +struct reftable_merged_table;
> +
> +/* A generic reftable; see below. */
> +struct reftable_table;

Why would we need to see an individual reftable? Could we just represent
a merged table as a virtual concatenation of blocks (as if all the
blocks were in the same file)?

> +/* reftable_new_merged_table creates a new merged table. It takes ownership of
> +   the stack array.
> +*/
> +int reftable_new_merged_table(struct reftable_merged_table **dest,
> +			      struct reftable_table *stack, int n,
> +			      uint32_t hash_id);

I presume this would be used for things like compacting a few reftables
together? In which case, I would expect this function to just take a
list of filenames.

> +/* returns an iterator positioned just before 'name' */
> +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
> +				   struct reftable_iterator *it,
> +				   const char *name);
> +
> +/* returns an iterator for log entry, at given update_index */
> +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
> +				      struct reftable_iterator *it,
> +				      const char *name, uint64_t update_index);
> +
> +/* like reftable_merged_table_seek_log_at but look for the newest entry. */
> +int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
> +				   struct reftable_iterator *it,
> +				   const char *name);

Why do iterators need to be precisely positioned if this merged table is
immutable?

> +/* returns the max update_index covered by this merged table. */
> +uint64_t
> +reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
> +
> +/* returns the min update_index covered by this merged table. */
> +uint64_t
> +reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
> +
> +/* releases memory for the merged_table */
> +void reftable_merged_table_free(struct reftable_merged_table *m);
> +
> +/* return the hash ID of the merged table. */
> +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);

OK - makes sense.

> +/****************************************************************
> + Generic tables
> +
> + A unified API for reading tables, either merged tables, or single readers.
> + ****************************************************************/

Again, we could just use the merged table API.

[snip]

> +/* convenience function to read a single ref. Returns < 0 for error, 0
> +   for success, and 1 if ref not found. */
> +int reftable_table_read_ref(struct reftable_table *tab, const char *name,
> +			    struct reftable_ref_record *ref);

I presume this returns the most up-to-date record in the (possibly
merged) table? So we can just read the hash off the record to know what
this ref points to.

> +/****************************************************************
> + Mutable ref database
> +
> + The stack presents an interface to a mutable sequence of reftables.
> + ****************************************************************/

So I would expect ref mutations to be done by opening the existing
reftables as a merged table, open a reftable writer, write all the
changes that need to be written, and update the tables.list file...

> +/* a stack is a stack of reftables, which can be mutated by pushing a table to
> + * the top of the stack */
> +struct reftable_stack;
> +
> +/* open a new reftable stack. The tables along with the table list will be
> +   stored in 'dir'. Typically, this should be .git/reftables.
> +*/
> +int reftable_new_stack(struct reftable_stack **dest, const char *dir,
> +		       struct reftable_write_options config);
> +
> +/* returns the update_index at which a next table should be written. */
> +uint64_t reftable_stack_next_update_index(struct reftable_stack *st);

...so I don't see why we need a stack.

> +/* holds a transaction to add tables at the top of a stack. */
> +struct reftable_addition;
> +
> +/*
> +  returns a new transaction to add reftables to the given stack. As a side
> +  effect, the ref database is locked.
> +*/
> +int reftable_stack_new_addition(struct reftable_addition **dest,
> +				struct reftable_stack *st);
> +
> +/* Adds a reftable to transaction. */
> +int reftable_addition_add(struct reftable_addition *add,
> +			  int (*write_table)(struct reftable_writer *wr,
> +					     void *arg),
> +			  void *arg);
> +
> +/* Commits the transaction, releasing the lock. */
> +int reftable_addition_commit(struct reftable_addition *add);
> +
> +/* Release all non-committed data from the transaction, and deallocate the
> +   transaction. Releases the lock if held. */
> +void reftable_addition_destroy(struct reftable_addition *add);

We do need a transaction to write the new tables.list and then
atomically update the repository with it, but I don't think we need one
just to add a reftable to the stack.

[snip rest of stack functions]

> +/* Policy for expiring reflog entries. */
> +struct reftable_log_expiry_config {
> +	/* Drop entries older than this timestamp */
> +	uint64_t time;
> +
> +	/* Drop older entries */
> +	uint64_t min_update_index;
> +};
> +
> +/* compacts all reftables into a giant table. Expire reflog entries if config is
> + * non-NULL */
> +int reftable_stack_compact_all(struct reftable_stack *st,
> +			       struct reftable_log_expiry_config *config);

Ah, the stack is used for compacting as well. I don't think compacting
belongs here though - it should be its own thing that can make use of
the merged-table reader and the single-table writer.

[snip]

> +/* convenience function to read a single ref. Returns < 0 for error, 0
> +   for success, and 1 if ref not found. */
> +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
> +			    struct reftable_ref_record *ref);
> +
> +/* convenience function to read a single log. Returns < 0 for error, 0
> +   for success, and 1 if ref not found. */
> +int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
> +			    struct reftable_log_record *log);

Other things that belong to the merged table reader instead of the
stack.

Some things that I think are missing:

 - foreach all latest record of all refs (in any order) (I see some
   functions above that can set the position of the iterator, but I
   don't think that the iterator skips over irrelevant refs? So we can't
   use it for iterating over all refs if we don't care about the history
   of refs)
 - foreach all log entries of one ref (or all refs)

In summary, I would have expected a mechanism to read multiple (possibly
one) reftable and perform queries on it, a mechanism to write a single
reftable, and some functions to update tables.list.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 04/13] reftable: add a barebones unittest framework
  2020-10-01 16:10   ` [PATCH v2 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
  2020-10-02  4:05     ` Jonathan Nieder
@ 2020-10-08  1:45     ` Jonathan Tan
  2020-10-08 22:31       ` Josh Steadmon
  1 sibling, 1 reply; 251+ messages in thread
From: Jonathan Tan @ 2020-10-08  1:45 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, hanwen, peff, hanwenn, Jonathan Tan

> From: Han-Wen Nienhuys <hanwen@google.com>
> 
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  reftable/test_framework.c | 68 +++++++++++++++++++++++++++++++++++++++
>  reftable/test_framework.h | 59 +++++++++++++++++++++++++++++++++

Even if reftable will be a project maintained by the Git project but
independent enough to be used by other projects, I don't think it's
worth creating a separate test framework. E.g. as far as I can tell,
when we import sha1dc, we don't import any tests, so I don't think they
need to import tests either - they can run the tests in the Git repo
themselves if they need to.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
  2020-10-02  4:12     ` Jonathan Nieder
  2020-10-02 14:01     ` Johannes Schindelin
@ 2020-10-08  1:48     ` Jonathan Tan
  2020-10-10 17:28       ` Han-Wen Nienhuys
  2 siblings, 1 reply; 251+ messages in thread
From: Jonathan Tan @ 2020-10-08  1:48 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, hanwen, peff, hanwenn, Jonathan Tan

> From: Han-Wen Nienhuys <hanwen@google.com>
> 
> This commit provides basic utility classes for the reftable library.
> 
> Since the reftable library must compile standalone, there may be some overlap
> with git-core utility functions.
> 
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  Makefile                  |  26 ++++++-
>  reftable/basics.c         | 131 +++++++++++++++++++++++++++++++++
>  reftable/basics.h         |  48 +++++++++++++
>  reftable/blocksource.c    | 148 ++++++++++++++++++++++++++++++++++++++
>  reftable/blocksource.h    |  22 ++++++
>  reftable/compat.c         | 110 ++++++++++++++++++++++++++++
>  reftable/compat.h         |  48 +++++++++++++
>  reftable/publicbasics.c   | 100 ++++++++++++++++++++++++++
>  reftable/reftable-tests.h |  22 ++++++
>  reftable/strbuf.c         | 142 ++++++++++++++++++++++++++++++++++++
>  reftable/strbuf.h         |  80 +++++++++++++++++++++
>  reftable/strbuf_test.c    |  37 ++++++++++
>  reftable/system.h         |  51 +++++++++++++
>  t/helper/test-reftable.c  |   8 +++
>  t/helper/test-tool.c      |   1 +
>  t/helper/test-tool.h      |   1 +

I think duplicating things like strbuf is an unnecessary burden if Git
is to maintain this library. Something like "reftable will only import
git-compat-util.h and strbuf.h, and any project that wants to use
reftable must make sure that these functions and data structures are
available" would be more plausible.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 04/13] reftable: add a barebones unittest framework
  2020-10-08  1:45     ` Jonathan Tan
@ 2020-10-08 22:31       ` Josh Steadmon
  0 siblings, 0 replies; 251+ messages in thread
From: Josh Steadmon @ 2020-10-08 22:31 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: gitgitgadget, git, hanwen, peff, hanwenn

On 2020.10.07 18:45, Jonathan Tan wrote:
> > From: Han-Wen Nienhuys <hanwen@google.com>
> > 
> > Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> > ---
> >  reftable/test_framework.c | 68 +++++++++++++++++++++++++++++++++++++++
> >  reftable/test_framework.h | 59 +++++++++++++++++++++++++++++++++
> 
> Even if reftable will be a project maintained by the Git project but
> independent enough to be used by other projects, I don't think it's
> worth creating a separate test framework. E.g. as far as I can tell,
> when we import sha1dc, we don't import any tests, so I don't think they
> need to import tests either - they can run the tests in the Git repo
> themselves if they need to.

I agree with Jonathan here, I think we should stick with the existing
test framework.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-02  3:58     ` Jonathan Nieder
@ 2020-10-09 21:13       ` Emily Shaffer
  2020-10-10 17:03         ` Han-Wen Nienhuys
  2020-11-30 14:44         ` Han-Wen Nienhuys
  2020-10-10 13:43       ` Han-Wen Nienhuys
  1 sibling, 2 replies; 251+ messages in thread
From: Emily Shaffer @ 2020-10-09 21:13 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Han-Wen Nienhuys

On Thu, Oct 01, 2020 at 08:58:51PM -0700, Jonathan Nieder wrote:
> 
> Hi,
> 
> Han-Wen Nienhuys wrote:
> 
> >  reftable/reftable.h | 585 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 585 insertions(+)
> >  create mode 100644 reftable/reftable.h
> 
> Adding a header in a separate patch from the implementation doesn't
> match the usual practice.  Can we add declarations in the same patch
> as the functions being declared instead?
> 
> We could still introduce the header early in its own patch if we want,
> but it would be a skeleton of a header that gets filled out by later
> patches.

To poke a little deeper into what I think Jonathan is trying to say:

When we reviewed this for review club, the general feeling was that A)
this commit structure is still pretty difficult to review, and B) we
want to make sure you aren't spending a ton of time reorganizing the
code only to find that people still don't like the structure (as may
have happened with the current iteration).

One point I think Jonathan is making is that each commit should stand
alone: compile, be tested, and do something succinct. Another point (or
maybe the same point from a different angle) is that a series of commits
should tell a progressive story. That is, something like

  add infrastructure to support reftable library
  reftable: read/write blocks
  reftable: write files
  reftable: read files
  reftable: generate binary tree from file
  ...

is much more compelling (and easier to review) than

  reftable: LICENSE
  reftable: headers
  reftable: tests
  ...

Towards the latter end of your series it seems like you started to take
that approach; but some commits, like this one (reftable: define the
public API) are not quite so. That is, this commit is hard to review
without the context of the rest of the series: I read it, I say, "well
WHY do we need these functions?", and then I cannot continue my review
of this patch until I've completed my review of the rest of the series.

Of course, taking a completed project - like your initial reftable
submission - and then chopping it up into a cute story of commits is a
pain in the ***. Doing it twice - or more - is just aggravating. So I
wonder whether we can bikeshed what story would look nice before you
even pick up your 'git rebase -i'? Doing that bikeshedding here on list
means that we also have a chance for someone to interrupt and say, "no,
that organization doesn't make sense" - or even for someone to say
"Emily, there is no need to reorganize these commits, go sit down" ;)

> All in all, I like this API.  Thanks for putting it together
> thoughtfully.
> 
> The API builds up in layers (blocks, reftables, merged reftables, etc),
> suggesting a natural division for the patch series.  I think the rest of
> the series follows that --- let's see.

Or maybe what I said runs contrary to what Jonathan was actually saying.
Please, correct me :)

 - Emily

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-02  3:58     ` Jonathan Nieder
  2020-10-09 21:13       ` Emily Shaffer
@ 2020-10-10 13:43       ` Han-Wen Nienhuys
  2020-10-12 16:57         ` Jonathan Nieder
  1 sibling, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-10-10 13:43 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Han-Wen Nienhuys

On Fri, Oct 2, 2020 at 5:58 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> Adding a header in a separate patch from the implementation doesn't
> match the usual practice.  Can we add declarations in the same patch
> as the functions being declared instead?

We could, but it would considerably complicate work on this patch
series, as the commit boundary then doesn't fall on file boundaries
anymore. Would you be open to having multiple headers for the public
interface? eg. reftable-record.h, reftable-reader.h etc. ?

> > +     /* Misuse of the API:
> > +        - on writing a record with NULL refname.
> > +        - on writing a reftable_ref_record outside the table limits
> > +        - on writing a ref or log record before the stack's next_update_index
> > +        - on writing a log record with multiline message with
> > +        exact_log_message unset
> > +        - on reading a reftable_ref_record from log iterator, or vice versa.
> > +     */
> > +     REFTABLE_API_ERROR = -6,
>
> Should these call BUG()?  Or is it useful for some callers to be able
> to recover from these errors?

Since this was written as a standalone library, it's up to the caller
to decide what should be done. It also simplifies testing: you can
verify that incorrect API usage returns a certain error code, but a
BUG() that prints an error or crashes the program is much harder to
verify.

> > +     /* Dir/file conflict. */
>
> This is when adding a ref?

yes.

> > +
> > +/*
> > + * Convert the numeric error code to an equivalent errno code.
> > + */
> > +int reftable_error_to_errno(int err);
>
> What is the intended use of this function?

The read_raw_ref method in the ref backend API uses errno values as
out-of-band communication mechanism.

> > +/* Set the range of update indices for the records we will add.  When
> > +   writing a table into a stack, the min should be at least
> > +   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> > +
> > +   For transactional updates, typically min==max. When converting an existing
> > +   ref database into a single reftable, this would be a range of update-index
> > +   timestamps.
> > + */
> > +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> > +                             uint64_t max);
>
> What happens if I write to a reftable_writer without setting limits
> first?

you get an REFTABLE_API_ERROR.

> > +/* block_source_vtable are the operations that make up block_source */
> > +struct reftable_block_source_vtable {
> > +     /* returns the size of a block source */
> > +     uint64_t (*size)(void *source);
> > +
> > +     /* reads a segment from the block source. It is an error to read
> > +        beyond the end of the block */
> > +     int (*read_block)(void *source, struct reftable_block *dest,
> > +                       uint64_t off, uint32_t size);
> > +     /* mark the block as read; may return the data back to malloc */
> > +     void (*return_block)(void *source, struct reftable_block *blockp);
> > +
> > +     /* release all resources associated with the block source */
> > +     void (*close)(void *source);
> > +};
>
> This is abstracting file input?  I wonder if some name emphasizing the
> 'read' operation like block_reader might be clearer.
>
> Do I pass in the 'struct block_source *' as the source arg?  If so, why
> are these declared as void *?

you pass in block_source->arg. Should struct implementations of
polymorphic types carry a pointer to their own vtable instead?

> Is the reason this manages the buffer instead of requiring a
> caller-supplied buffer to support zero-copy?

Log blocks are compressed, so the caller doesn't know the correct size
to supply. By letting the block source handle the management, we can
swap out the block read from the file for a block managed by malloc on
decompressing a log block.

> > +/* returns the hash ID used in this table. */
> > +uint32_t reftable_reader_hash_id(struct reftable_reader *r);
>
> What is a hash id?

The hash identifier, eg. "sha1" for SHA1 and "s256" for SHA-256.

> [...]
> > + Generic tables
> > +
> > + A unified API for reading tables, either merged tables, or single readers.
>
> Are there callers/helpers that don't know whether they want one or the
> other?

The setup with per-worktree refs means that there are two reftable
stacks in a .git repo. In order to iterate over the entire ref space,
you have to merge two merged tables, but the stack itslef merges a set
of simple reftables.

> [...]
> > +/* holds a transaction to add tables at the top of a stack. */
> > +struct reftable_addition;
>
> When would I add multiple tables at once to the top?  Is this used
> during compacting, for example?

No. You'd usually add only one table, but at the same time, it seemed
kind of arbitrary to allow adding only one table. FWIW, I added this
because libgit2 wants to be able to lock a ref database: creating the
reftable_addition institutes the lock.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-08  1:41     ` Jonathan Tan
@ 2020-10-10 16:57       ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-10-10 16:57 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King

Hi Jonathan,

thanks for a detailed look at the API. I'm replying to some queries
here, but I won't have time for another 2 weeks to revisit the code.

On Thu, Oct 8, 2020 at 3:41 AM Jonathan Tan <jonathantanmy@google.com> wrote:
> A typical user would just want to do refname->hash, refname->log, or possibly
> hash->refname lookups, so a ref record would be meant for low-level

You are forgetting refname -> symref target and refname -> peeled tag.

> users. I can't think of a typical use case, but even if such existed, I
> wouldn't expect this API - in particular, the refname is not written as
> such on disk (it is composed from a prefix and a suffix, if I remember
> correctly), so I would expect a function that one can call to write the
> refname, but not for it to appear in a data structure.

It works with records, because it is more practical for the low level
code, but also because it obviates defining many functions for the
different types of lookups and seeks.

> > +/* returns whether 'ref' represents a deletion */
> > +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
>
> This looks unorthogonal - looking at the spec, the "value type" can
> represent a deletion, one object name (value of the ref), two object
> names (value of the ref + peeled target), or symbolic reference. If
> we're willing to allocate memory for a ref record, I think we can afford
> the extra 4 bytes (maybe 8) and use a tagged union instead.

I guess you want something like

  struct {
     enum TYPE  { REF_DEL, REF_1VAL, REF_2VAL, REF_SYM } record_type;
     char *ref_name;
     uint64_t update_index;
     union  {
          uint_8 *val1;
          struct { uint8_t *val1, *val2; } val2;
          char *sym;
     };
  };

I suspect it makes initialization more verbose, but yes, I could do that.

> > +/* returns whether two reftable_ref_records are the same */
> > +int reftable_ref_record_equal(struct reftable_ref_record *a,
> > +                           struct reftable_ref_record *b, int hash_size);
>
> Not sure what this will be used for.

This is useful for testing.

> > +/* Set the range of update indices for the records we will add.  When
> > +   writing a table into a stack, the min should be at least
> > +   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> > +
> > +   For transactional updates, typically min==max. When converting an existing
> > +   ref database into a single reftable, this would be a range of update-index
> > +   timestamps.
> > + */
> > +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> > +                             uint64_t max);
>
> This seems to be here because we want to write the file in a single
> pass, and the update index maximum and minimum appear in the header. If
> we were allowed to seek while writing, could the update index maximum
> and minimum be deduced instead?

No. The ref record update_index is delta encoded against the min
update_index, so you have to know it upfront.

If I could redesign the format, I'd leave out the max update_index
from the header (because it can be determined afterwards.), but it's
too late now.

>
> > +/* adds a reftable_ref_record. Must be called in ascending
> > +   order. The update_index must be within the limits set by
> > +   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
> > +
> > +   It is an error to write a ref record after a log record.
> > + */
> > +int reftable_writer_add_ref(struct reftable_writer *w,
> > +                         struct reftable_ref_record *ref);
> > +
> > +/* Convenience function to add multiple refs. Will sort the refs by
> > +   name before adding. */
> > +int reftable_writer_add_refs(struct reftable_writer *w,
> > +                          struct reftable_ref_record *refs, int n);
>
> Since refs is an array of objects and not an array of pointers, these
> two functions could be combined - the first is the same as calling the
> second with n=1.
>
> Also, ascending order of what?

Of record keys, ie. refname in case of ref records.

> The user will also need to know where to get the update index from - I
> presume that this will be the maximum update index of any record with
> the given refname + 1.

there is a reftable_stack_next_update_index() for that.

> > +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
> > +   end of iteration.
> > +*/
> > +int reftable_iterator_next_ref(struct reftable_iterator *it,
> > +                            struct reftable_ref_record *ref);
> > +
> > +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
> > +   end of iteration.
> > +*/
> > +int reftable_iterator_next_log(struct reftable_iterator *it,
> > +                            struct reftable_log_record *log);
>
> From my recollection, in Git, we typically use foreach functions with a
> callback that is invoked once for each result. I think that's preferable
> to this approach.

The iterator interface has the advantage that it captures reading both
individual entries (read_raw_ref) and iteration. It's also a natural
structure, because internally the iteration state has to be stored
(iterating merged tables stores the iterators in a priority queue.)


> [snip]
>
> > +/****************************************************************
> > + Merged tables
> > +
> > + A ref database kept in a sequence of table files. The merged_table presents a
> > + unified view to reading (seeking, iterating) a sequence of immutable tables.
> > + ****************************************************************/
> > +
> > +/* A merged table is implements seeking/iterating over a stack of tables. */
> > +struct reftable_merged_table;
> > +
> > +/* A generic reftable; see below. */
> > +struct reftable_table;
>
> Why would we need to see an individual reftable?

It could be useful diagnostics and troubleshooting. For example, if
you are debugging or trying to troubleshoot/fsck a repo, it might be
interesting to read a single file from .git/reftable/

Philosophically, since we have to expose the functionality to write a
single table, it seems reasonable to read single tables for symmetry
as well.

> Could we just represent
> a merged table as a virtual concatenation of blocks (as if all the
> blocks were in the same file)?

No. The format doesn't work like that.

> > +/* reftable_new_merged_table creates a new merged table. It takes ownership of
> > +   the stack array.
> > +*/
> > +int reftable_new_merged_table(struct reftable_merged_table **dest,
> > +                           struct reftable_table *stack, int n,
> > +                           uint32_t hash_id);
>
> I presume this would be used for things like compacting a few reftables
> together? In which case, I would expect this function to just take a
> list of filenames.

The merging of tables has nothing to do with the filesystem, and could
be done with reftables represented as in memory structure. This is
also how the unittests work.

> > +/* returns an iterator positioned just before 'name' */
> > +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
> > +                                struct reftable_iterator *it,
> > +                                const char *name);
> > +
> > +/* returns an iterator for log entry, at given update_index */
> > +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
> > +                                   struct reftable_iterator *it,
> > +                                   const char *name, uint64_t update_index);
> > +
> > +/* like reftable_merged_table_seek_log_at but look for the newest entry. */
> > +int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
> > +                                struct reftable_iterator *it,
> > +                                const char *name);
>
> Why do iterators need to be precisely positioned if this merged table is
> immutable?

I don't understand the question. How would you read a single ref from
a merged table without a seek function?

> > +/* convenience function to read a single ref. Returns < 0 for error, 0
> > +   for success, and 1 if ref not found. */
> > +int reftable_table_read_ref(struct reftable_table *tab, const char *name,
> > +                         struct reftable_ref_record *ref);
>
> I presume this returns the most up-to-date record in the (possibly
> merged) table? So we can just read the hash off the record to know what
> this ref points to.

correct.

> > +/****************************************************************
> > + Mutable ref database
> > +
> > + The stack presents an interface to a mutable sequence of reftables.
> > + ****************************************************************/
>
> So I would expect ref mutations to be done by opening the existing
> reftables as a merged table, open a reftable writer, write all the
> changes that need to be written, and update the tables.list file...

Roughly, yes, but there are other considerations: you have to check
for concurrent updates (by the time you finish reading tables.list, a
compaction may have deleted some of the tables referenced, so you have
to retry). If there are any failures, the transaction fails and you
have to clean up already written tables. If the write succeeds, but
the stack becomes unbalanced, you have to do a compaction. If you have
to reload data, you want to avoid closing and opening the same file.

If you are curious what the stack does, have a look at stack.c.

Also, this is for separation of logic. The merged table is composed of
reftables that have no particular type of storage, but support the
read interface. The stack is what ties the merged table to a
posix-like file system, and provides mutations.

> > +/* holds a transaction to add tables at the top of a stack. */
> > +struct reftable_addition;
> > +
> > +/*
> > +  returns a new transaction to add reftables to the given stack. As a side
> > +  effect, the ref database is locked.
> > +*/
> > +int reftable_stack_new_addition(struct reftable_addition **dest,
> > +                             struct reftable_stack *st);
> > +
> > +/* Adds a reftable to transaction. */
> > +int reftable_addition_add(struct reftable_addition *add,
> > +                       int (*write_table)(struct reftable_writer *wr,
> > +                                          void *arg),
> > +                       void *arg);
> > +
> > +/* Commits the transaction, releasing the lock. */
> > +int reftable_addition_commit(struct reftable_addition *add);
> > +
> > +/* Release all non-committed data from the transaction, and deallocate the
> > +   transaction. Releases the lock if held. */
> > +void reftable_addition_destroy(struct reftable_addition *add);
>
> We do need a transaction to write the new tables.list and then
> atomically update the repository with it, but I don't think we need one
> just to add a reftable to the stack.

see above.

> > +/* compacts all reftables into a giant table. Expire reflog entries if config is
> > + * non-NULL */
> > +int reftable_stack_compact_all(struct reftable_stack *st,
> > +                            struct reftable_log_expiry_config *config);
>
> Ah, the stack is used for compacting as well. I don't think compacting
> belongs here though - it should be its own thing that can make use of
> the merged-table reader and the single-table writer.

Why should it be like that, and where should compaction live?

> Some things that I think are missing:
>
>  - foreach all latest record of all refs (in any order) (I see some

Ref records are keyed by name, so the iteration produces each ref only once.

>    functions above that can set the position of the iterator, but I
>    don't think that the iterator skips over irrelevant refs?

What makes you think that?

> So we can't
>    use it for iterating over all refs if we don't care about the history
>    of refs)

I think you might be confusing the refs with reflogs here. The history
of a ref is kept in the reflog. This is why functions that want to
seek to a log record need an update_index timestamp.

>  - foreach all log entries of one ref (or all refs)

Before reviewing the API header here, I think it might be useful to
also look at the commit in the other series that uses the API to
implement the git ref backend. It shows how the foreach functions can
be implemented using the iterator interface.

> In summary, I would have expected a mechanism to read multiple (possibly
> one) reftable and perform queries on it, a mechanism to write a single
> reftable, and some functions to update tables.list.

I understand your expectation, but I think that expectation glosses
over important details of how it actually works.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-09 21:13       ` Emily Shaffer
@ 2020-10-10 17:03         ` Han-Wen Nienhuys
  2020-11-30 14:44         ` Han-Wen Nienhuys
  1 sibling, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-10-10 17:03 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Jonathan Nieder, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Han-Wen Nienhuys

On Fri, Oct 9, 2020 at 11:13 PM Emily Shaffer <emilyshaffer@google.com> wrote:
> Of course, taking a completed project - like your initial reftable
> submission - and then chopping it up into a cute story of commits is a
> pain in the ***. Doing it twice - or more - is just aggravating. So I
> wonder whether we can bikeshed what story would look nice before you
> even pick up your 'git rebase -i'? Doing that bikeshedding here on list
> means that we also have a chance for someone to interrupt and say, "no,
> that organization doesn't make sense" - or even for someone to say

It would be great if you could bikeshed about this while I'm on holiday :)

That said, I think it's worthwhile to first look at the commit that
uses the library to implement a ref backend, because that will show
better how the reftable library fits with the way that Git likes to
consume its data.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-08  1:48     ` Jonathan Tan
@ 2020-10-10 17:28       ` Han-Wen Nienhuys
  2020-10-11 10:52         ` Johannes Schindelin
  2020-10-23  9:13         ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-10-10 17:28 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Han-Wen Nienhuys

On Thu, Oct 8, 2020 at 3:48 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > This commit provides basic utility classes for the reftable library.
> >
> > Since the reftable library must compile standalone, there may be some overlap
> > with git-core utility functions.

> I think duplicating things like strbuf is an unnecessary burden if Git
> is to maintain this library. Something like "reftable will only import
> git-compat-util.h and strbuf.h, and any project that wants to use
> reftable must make sure that these functions and data structures are
> available" would be more plausible.

Sure, but how do we ensure that the directory won't take on
dependencies beyond these headers? I am worried that I will be
involved in a tedious back & forth process to keep updates going into
libgit2 and/or also have to keep maintaining
github.com/google/reftable.

FWIW, the duplication is really tiny: according to

 $ wc $(grep -l REFTABLE_STANDALONE *[ch])

it's just 431 lines of code.

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-02  4:12     ` Jonathan Nieder
@ 2020-10-10 17:32       ` Han-Wen Nienhuys
  2020-10-12 15:25         ` Jonathan Nieder
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-10-10 17:32 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Han-Wen Nienhuys

On Fri, Oct 2, 2020 at 6:12 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> [...]
> > +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
>
> How does this compare to stdlib's bsearch?

bsearch gives you back NULL if it doesn't find an exact match.

> > +     reftable_free(a);
> > +}
>
> Are there other callers that need custom free?

The libgit2 folks requested the ability to set memory allocation
routines, hence reftable_free().

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-10 17:28       ` Han-Wen Nienhuys
@ 2020-10-11 10:52         ` Johannes Schindelin
  2020-10-12 15:19           ` Jonathan Nieder
  2020-10-12 16:42           ` Junio C Hamano
  2020-10-23  9:13         ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-11 10:52 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Jonathan Tan, Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Han-Wen,

On Sat, 10 Oct 2020, Han-Wen Nienhuys wrote:

> On Thu, Oct 8, 2020 at 3:48 AM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > > From: Han-Wen Nienhuys <hanwen@google.com>
> > >
> > > This commit provides basic utility classes for the reftable library.
> > >
> > > Since the reftable library must compile standalone, there may be some overlap
> > > with git-core utility functions.
>
> > I think duplicating things like strbuf is an unnecessary burden if Git
> > is to maintain this library. Something like "reftable will only import
> > git-compat-util.h and strbuf.h, and any project that wants to use
> > reftable must make sure that these functions and data structures are
> > available" would be more plausible.
>
> Sure, but how do we ensure that the directory won't take on
> dependencies beyond these headers? I am worried that I will be
> involved in a tedious back & forth process to keep updates going into
> libgit2 and/or also have to keep maintaining
> github.com/google/reftable.
>
> FWIW, the duplication is really tiny: according to
>
>  $ wc $(grep -l REFTABLE_STANDALONE *[ch])
>
> it's just 431 lines of code.

The `merge_bases_many()` function has only 33 lines of code, partially
duplicating `get_reachable_subset()`. Yet, it had a bug in it for two
years that was not found.

How much worse will the situation be with your 431 lines of code.

Even more so when you consider the fact that you intend to shove the same
duplication down libgit2's throat. It's "triplicating" code.

So I find the argument you made above quite unconvincing.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-11 10:52         ` Johannes Schindelin
@ 2020-10-12 15:19           ` Jonathan Nieder
  2020-10-12 18:44             ` Johannes Schindelin
  2020-10-12 16:42           ` Junio C Hamano
  1 sibling, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-12 15:19 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi,

Johannes Schindelin wrote:

> The `merge_bases_many()` function has only 33 lines of code, partially
> duplicating `get_reachable_subset()`. Yet, it had a bug in it for two
> years that was not found.
>
> How much worse will the situation be with your 431 lines of code.
>
> Even more so when you consider the fact that you intend to shove the same
> duplication down libgit2's throat. It's "triplicating" code.

Careful: you seem to be making a bunch of assumptions here (for
example, around Han-Wen having some intent around shoving things down
libgit2's throat), and you seem to be focusing on the person instead
of the contribution.

Would you mind restating your point in a way that is a little easier
to process, for example by focusing on the effect that this patch
would have on you as a Git developer?

Thanks,
Jonathan

> So I find the argument you made above quite unconvincing.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-10 17:32       ` Han-Wen Nienhuys
@ 2020-10-12 15:25         ` Jonathan Nieder
  2020-10-12 17:05           ` Patrick Steinhardt
  0 siblings, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-12 15:25 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys, Patrick Steinhardt

(+cc: Patrick Steinhardt from libgit2)
Hi,

Han-Wen Nienhuys wrote[1]:
> On Fri, Oct 2, 2020 at 6:12 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
>> Han-Wen Nienhuys wrote:

>>> +     reftable_free(a);
>>> +}
>>
>> Are there other callers that need custom free?
>
> The libgit2 folks requested the ability to set memory allocation
> routines, hence reftable_free().

Thanks.  Patrick or Han-Wen, can you say a little more about this use
case?  That would help with making sure we are making an API that
meets its needs.

For example, is a custom allocator something that would be set
globally or something attached to a handle?  If the former, would code
that uses xmalloc and free and gets #define-d away when used in
libgit2 work?  If the latter, what does the handle look like?

Thanks,
Jonathan

[1] context: https://lore.kernel.org/git/pull.847.git.git.1600283416.gitgitgadget@gmail.com/T/#r2337ddc80848c80f6c9699d904ab64d4297d2c54

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-11 10:52         ` Johannes Schindelin
  2020-10-12 15:19           ` Jonathan Nieder
@ 2020-10-12 16:42           ` Junio C Hamano
  2020-10-12 19:01             ` Johannes Schindelin
  1 sibling, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-10-12 16:42 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> FWIW, the duplication is really tiny: according to
>>
>>  $ wc $(grep -l REFTABLE_STANDALONE *[ch])
>>
>> it's just 431 lines of code.
>
> The `merge_bases_many()` function has only 33 lines of code, partially
> duplicating `get_reachable_subset()`. Yet, it had a bug in it for two
> years that was not found.

It does not affect the current discussion, but what you are giving
is a revisionist view of the world.  The latter function came MUCH
later to do a bit more than the former.  The bug was caused by the
fact that those that added the latter neglected the responsibility
of maintaining the former to the same degree when new feature like
commit-graph were added to the latter.

The root cause was that the latter one did not share the code with
the former one when it was introduced.  That does make it appear
similar to the situation we have at hand with duplicated utility
functions.

> How much worse will the situation be with your 431 lines of code.
>
> Even more so when you consider the fact that you intend to shove the same
> duplication down libgit2's throat. It's "triplicating" code.
>
> So I find the argument you made above quite unconvincing.
>
> Ciao,
> Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-10 13:43       ` Han-Wen Nienhuys
@ 2020-10-12 16:57         ` Jonathan Nieder
  2020-11-30 14:55           ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-12 16:57 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Han-Wen Nienhuys

Han-Wen Nienhuys wrote:
> On Fri, Oct 2, 2020 at 5:58 AM Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Adding a header in a separate patch from the implementation doesn't
>> match the usual practice.  Can we add declarations in the same patch
>> as the functions being declared instead?
>
> We could, but it would considerably complicate work on this patch
> series, as the commit boundary then doesn't fall on file boundaries
> anymore. Would you be open to having multiple headers for the public
> interface? eg. reftable-record.h, reftable-reader.h etc. ?

I'd be more than open to it: I think that would make for a clearer API.

For example, a caller like refs.c wouldn't have any need to use
reftable-block.h because that's a lower-level detail that operates
behind the scenes when operating on a stack of reftables.  So I think
this would be a good step toward addressing the feedback Jonathan gave
about the set of functions exposed by the API being overwhelming.

>>> +     /* Misuse of the API:
>>> +        - on writing a record with NULL refname.
>>> +        - on writing a reftable_ref_record outside the table limits
>>> +        - on writing a ref or log record before the stack's next_update_index
>>> +        - on writing a log record with multiline message with
>>> +        exact_log_message unset
>>> +        - on reading a reftable_ref_record from log iterator, or vice versa.
>>> +     */
>>> +     REFTABLE_API_ERROR = -6,
>>
>> Should these call BUG()?  Or is it useful for some callers to be able
>> to recover from these errors?
>
> Since this was written as a standalone library, it's up to the caller
> to decide what should be done.

I'm not strongly opinionated about this, but just a quick note: this
implies that the library would need to make sure it is producing a
valid state in error cases.

Does the API documentation describe what state a handle is in after
an error --- e.g., what operations are permitted after that?

>>> +
>>> +/*
>>> + * Convert the numeric error code to an equivalent errno code.
>>> + */
>>> +int reftable_error_to_errno(int err);
>>
>> What is the intended use of this function?
>
> The read_raw_ref method in the ref backend API uses errno values as
> out-of-band communication mechanism.

Could we change that?  It sounds error-prone.

[...]
>>> +/* Set the range of update indices for the records we will add.  When
>>> +   writing a table into a stack, the min should be at least
>>> +   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
>>> +
>>> +   For transactional updates, typically min==max. When converting an existing
>>> +   ref database into a single reftable, this would be a range of update-index
>>> +   timestamps.
>>> + */
>>> +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
>>> +                             uint64_t max);
>>
>> What happens if I write to a reftable_writer without setting limits
>> first?
>
> you get an REFTABLE_API_ERROR.

(To be clear, with all of these questions, I am asking not only for my
own curiosity but so that the documentation, naming, etc can be
cleared up to help the next person who wonders the same thing.)

[...]
>> Do I pass in the 'struct block_source *' as the source arg?  If so, why
>> are these declared as void *?
>
> you pass in block_source->arg. Should struct implementations of
> polymorphic types carry a pointer to their own vtable instead?

I think a pointer to block_source would make it more self-explanatory,
yes.

(Aside: this is a difference between how Go and e.g. Java and C++
implement polymorphism.  In Go, an interface object contains a vtable
and a pointer to the object.  In Java and C++, the object contains a
vptr.)

>> Is the reason this manages the buffer instead of requiring a
>> caller-supplied buffer to support zero-copy?
>
> Log blocks are compressed, so the caller doesn't know the correct size
> to supply. By letting the block source handle the management, we can
> swap out the block read from the file for a block managed by malloc on
> decompressing a log block.

Hm, my naive assumption would have been that we'd use different
buffers for the compressed and uncompressed data.

>>> + Generic tables
>>> +
>>> + A unified API for reading tables, either merged tables, or single readers.
>>
>> Are there callers/helpers that don't know whether they want one or the
>> other?
>
> The setup with per-worktree refs means that there are two reftable
> stacks in a .git repo. In order to iterate over the entire ref space,
> you have to merge two merged tables, but the stack itslef merges a set
> of simple reftables.

I see.  That's subtle, so it seems worth documenting for the next
person reading this documentation and wondering why.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 15:25         ` Jonathan Nieder
@ 2020-10-12 17:05           ` Patrick Steinhardt
  2020-10-12 17:45             ` Jonathan Nieder
  2020-10-13 12:12             ` Johannes Schindelin
  0 siblings, 2 replies; 251+ messages in thread
From: Patrick Steinhardt @ 2020-10-12 17:05 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 2266 bytes --]

On Mon, Oct 12, 2020 at 08:25:05AM -0700, Jonathan Nieder wrote:
> (+cc: Patrick Steinhardt from libgit2)
> Hi,
> 
> Han-Wen Nienhuys wrote[1]:
> > On Fri, Oct 2, 2020 at 6:12 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> >> Han-Wen Nienhuys wrote:
> 
> >>> +     reftable_free(a);
> >>> +}
> >>
> >> Are there other callers that need custom free?
> >
> > The libgit2 folks requested the ability to set memory allocation
> > routines, hence reftable_free().
> 
> Thanks.  Patrick or Han-Wen, can you say a little more about this use
> case?  That would help with making sure we are making an API that
> meets its needs.
> 
> For example, is a custom allocator something that would be set
> globally or something attached to a handle?  If the former, would code
> that uses xmalloc and free and gets #define-d away when used in
> libgit2 work?  If the latter, what does the handle look like?

We have global pluggable allocators in libgit2 which can be set up
before calling `git_libgit2_init()`. The user of libgit2 can fill a
`git_allocator` structure with a set of funtcion pointers, most
importantly with implementations of `free` and `malloc`. Those then get
used across all of libgit2 for all subsequent allocations.

In order to be as thorough as possible, we thus also need to replace
these function pointers for libgit2's dependencies. As registration of
the allocator happens at runtime, we need to also be able to replace
function pointers of dependencies at runtime. E.g. for OpenSSL, it
provides an API `CRYPTO_set_mem_functions(malloc, realloc, free)` which
we call on global initialization of the libgit2 library.

So, to answer your questions:

    - The allocator is global and cannot be changed after initialization
      of libgit2.

    - It is pluggable, users can set up their own allocators by filling
      a structure with function pointers for `free`, `malloc`, `realloc`
      etc.

    - Due to the pluggable nature, we need to be able to set up those
      pointers at runtime. We can provide a set of static wrappers
      though which then call into the pluggable functions, so defines
      would probably work for us, too.

I hope that answers all of your questions.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 17:05           ` Patrick Steinhardt
@ 2020-10-12 17:45             ` Jonathan Nieder
  2020-10-13 12:12             ` Johannes Schindelin
  1 sibling, 0 replies; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-12 17:45 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Han-Wen Nienhuys, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Han-Wen Nienhuys

Patrick Steinhardt wrote:

> In order to be as thorough as possible, we thus also need to replace
> these function pointers for libgit2's dependencies. As registration of
> the allocator happens at runtime, we need to also be able to replace
> function pointers of dependencies at runtime. E.g. for OpenSSL, it
> provides an API `CRYPTO_set_mem_functions(malloc, realloc, free)` which
> we call on global initialization of the libgit2 library.
>
> So, to answer your questions:
>
>     - The allocator is global and cannot be changed after initialization
>       of libgit2.
>
>     - It is pluggable, users can set up their own allocators by filling
>       a structure with function pointers for `free`, `malloc`, `realloc`
>       etc.
>
>     - Due to the pluggable nature, we need to be able to set up those
>       pointers at runtime. We can provide a set of static wrappers
>       though which then call into the pluggable functions, so defines
>       would probably work for us, too.
>
> I hope that answers all of your questions.

Thanks!  Yes, that's very helpful.

Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 15:19           ` Jonathan Nieder
@ 2020-10-12 18:44             ` Johannes Schindelin
  2020-10-12 19:41               ` Jonathan Nieder
  0 siblings, 1 reply; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-12 18:44 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 2367 bytes --]

Hi Jonathan,

On Mon, 12 Oct 2020, Jonathan Nieder wrote:

> Johannes Schindelin wrote:
>
> > The `merge_bases_many()` function has only 33 lines of code, partially
> > duplicating `get_reachable_subset()`. Yet, it had a bug in it for two
> > years that was not found.
> >
> > How much worse will the situation be with your 431 lines of code.
> >
> > Even more so when you consider the fact that you intend to shove the same
> > duplication down libgit2's throat. It's "triplicating" code.
>
> Careful: you seem to be making a bunch of assumptions here (for
> example, around Han-Wen having some intent around shoving things down
> libgit2's throat), and you seem to be focusing on the person instead
> of the contribution.
>
> Would you mind restating your point in a way that is a little easier
> to process, for example by focusing on the effect that this patch
> would have on you as a Git developer?

I really hope that Han-Wen did not take this as a personal attack.

In case he did, I'll gladly try to re-phrase my point: It does not matter
how much code is duplicated. The fact that we run into bugs even in
unintentionally duplicated code is reason enough to not lightly accept
this design decision as being set in stone.

Which is why I, you, Emily and Josh pointed that out, and it is the same
reason why Peff and Junio had made the same point in the past.

I was quite unpleasantly surprised to see that all of those objections
seemed to be insufficient, hence my rather forceful reply.

As to the libgit2 angle: libgit2 also has `strbuf` (which it calls
`git_buf` IIRC), and I am certain that it also has those other helper
functions à la `put_be32()`. So yes, the triplication I mentioned is a
very real, undesirable feature of this patch series.

To be honest, I would much rather spend my time reviewing patches that
added `reftable` support using as much of libgit.a's functionality as
possible rather than repeating the same point over and over again: that it
makes very little sense _not_ to use that functionality. Of course, at
this stage I understand that my feedback is not very welcome, so I will
try to refrain from commenting on this patch series further (I do reserve
the right to send patches that fix CI failures, just like I've done quite
a few times up to this point).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 16:42           ` Junio C Hamano
@ 2020-10-12 19:01             ` Johannes Schindelin
  0 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-12 19:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Junio,

On Mon, 12 Oct 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> FWIW, the duplication is really tiny: according to
> >>
> >>  $ wc $(grep -l REFTABLE_STANDALONE *[ch])
> >>
> >> it's just 431 lines of code.
> >
> > The `merge_bases_many()` function has only 33 lines of code, partially
> > duplicating `get_reachable_subset()`. Yet, it had a bug in it for two
> > years that was not found.
>
> It does not affect the current discussion, but what you are giving
> is a revisionist view of the world.  The latter function came MUCH
> later to do a bit more than the former.  The bug was caused by the
> fact that those that added the latter neglected the responsibility
> of maintaining the former to the same degree when new feature like
> commit-graph were added to the latter.

You're right, I mixed up the direction: by introducing new code, the old
code became under-tested and a bug creeped in that was not detected for
two years.

> The root cause was that the latter one did not share the code with
> the former one when it was introduced.  That does make it appear
> similar to the situation we have at hand with duplicated utility
> functions.

Indeed, and even if this is just one concrete example, experience (and
teachers and instructors and mentors) repeat the lesson that code
duplication should be avoided because it _does_ cause problems.

Thanks for setting the record straight,
Dscho

>
> > How much worse will the situation be with your 431 lines of code.
> >
> > Even more so when you consider the fact that you intend to shove the same
> > duplication down libgit2's throat. It's "triplicating" code.
> >
> > So I find the argument you made above quite unconvincing.
> >
> > Ciao,
> > Dscho
>

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 18:44             ` Johannes Schindelin
@ 2020-10-12 19:41               ` Jonathan Nieder
  2020-10-12 20:27                 ` Johannes Schindelin
  0 siblings, 1 reply; 251+ messages in thread
From: Jonathan Nieder @ 2020-10-12 19:41 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Johannes Schindelin wrote:

>                                                          Of course, at
> this stage I understand that my feedback is not very welcome,

Do you mean because you don't feel that your feedback has been
sufficiently acted upon, or is there something else you are alluding
to?

Puzzled,
Jonathan

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 19:41               ` Jonathan Nieder
@ 2020-10-12 20:27                 ` Johannes Schindelin
  0 siblings, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-12 20:27 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Jonathan,

On Mon, 12 Oct 2020, Jonathan Nieder wrote:

> Johannes Schindelin wrote:
>
> > Of course, at this stage I understand that my feedback is not very
> > welcome,
>
> Do you mean because you don't feel that your feedback has been
> sufficiently acted upon, or is there something else you are alluding to?

My feedback has not been acted upon, and my arguments have been dismissed
(or there were attempts to dismiss them, I should say, that did not
convince me). At least I saw that when gitster repeated some of my
concerns in #git-devel, there were attempts to address them. While I feel
bad about leaving the burden on gitster's shoulders, at least I am glad
that those concerns are being addressed.

Ciao,
Dscho


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-12 17:05           ` Patrick Steinhardt
  2020-10-12 17:45             ` Jonathan Nieder
@ 2020-10-13 12:12             ` Johannes Schindelin
  2020-10-13 15:47               ` Junio C Hamano
  1 sibling, 1 reply; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-13 12:12 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Patrick,

On Mon, 12 Oct 2020, Patrick Steinhardt wrote:

> On Mon, Oct 12, 2020 at 08:25:05AM -0700, Jonathan Nieder wrote:
> > (+cc: Patrick Steinhardt from libgit2)
> > Hi,
> >
> > Han-Wen Nienhuys wrote[1]:
> > > On Fri, Oct 2, 2020 at 6:12 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
> > >> Han-Wen Nienhuys wrote:
> >
> > >>> +     reftable_free(a);
> > >>> +}
> > >>
> > >> Are there other callers that need custom free?
> > >
> > > The libgit2 folks requested the ability to set memory allocation
> > > routines, hence reftable_free().
> >
> > Thanks.  Patrick or Han-Wen, can you say a little more about this use
> > case?  That would help with making sure we are making an API that
> > meets its needs.
> >
> > For example, is a custom allocator something that would be set
> > globally or something attached to a handle?  If the former, would code
> > that uses xmalloc and free and gets #define-d away when used in
> > libgit2 work?  If the latter, what does the handle look like?
>
> We have global pluggable allocators in libgit2 which can be set up
> before calling `git_libgit2_init()`. The user of libgit2 can fill a
> `git_allocator` structure with a set of funtcion pointers, most
> importantly with implementations of `free` and `malloc`. Those then get
> used across all of libgit2 for all subsequent allocations.

I did not find out how those are used in the `deps/` part of libgit2's
source code. For example, I see a couple of instances where `malloc()` is
used in `ntlmclient` and in `pcre`.

The thing I was looking for would have been something like

	#define malloc(size) git__malloc(size)
	...

This would also have been what I imagined to be the best strategy to
integrate the reftable code once it is properly embedded in libgit.a (and
of course using libgit.a's API as much as it can).

Somewhat related: I was wondering whether it would make sense for git.git
to rename `strbuf` to `git_buf`? Would that make it easier to exchange
code between the two projects? Or would it just be unnecessary churn?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-13 12:12             ` Johannes Schindelin
@ 2020-10-13 15:47               ` Junio C Hamano
  2020-10-15 11:46                 ` Johannes Schindelin
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2020-10-13 15:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Patrick Steinhardt, Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Somewhat related: I was wondering whether it would make sense for git.git
> to rename `strbuf` to `git_buf`? Would that make it easier to exchange
> code between the two projects? Or would it just be unnecessary churn?

To us, "git_buf" is just as descriptive as "buf" and does not say
anything about the nature of 'buf' (other than apparently it was
invented and widely used here).  "git_strbuf" I can understand, but
why should we?


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-13 15:47               ` Junio C Hamano
@ 2020-10-15 11:46                 ` Johannes Schindelin
  2020-10-15 16:23                   ` Junio C Hamano
  0 siblings, 1 reply; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-15 11:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Patrick Steinhardt, Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Junio,

On Tue, 13 Oct 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > Somewhat related: I was wondering whether it would make sense for git.git
> > to rename `strbuf` to `git_buf`? Would that make it easier to exchange
> > code between the two projects? Or would it just be unnecessary churn?
>
> To us, "git_buf" is just as descriptive as "buf" and does not say
> anything about the nature of 'buf' (other than apparently it was
> invented and widely used here).  "git_strbuf" I can understand, but
> why should we?

If it makes code sharing between git.git and libgit2 easier, why shouldn't
we ;-)

Obviously, if it doesn't make life easier, we shouldn't bother.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-15 11:46                 ` Johannes Schindelin
@ 2020-10-15 16:23                   ` Junio C Hamano
  2020-10-15 19:39                     ` Johannes Schindelin
  2020-10-16  9:15                     ` Patrick Steinhardt
  0 siblings, 2 replies; 251+ messages in thread
From: Junio C Hamano @ 2020-10-15 16:23 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Patrick Steinhardt, Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Tue, 13 Oct 2020, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > Somewhat related: I was wondering whether it would make sense for git.git
>> > to rename `strbuf` to `git_buf`? Would that make it easier to exchange
>> > code between the two projects? Or would it just be unnecessary churn?
>>
>> To us, "git_buf" is just as descriptive as "buf" and does not say
>> anything about the nature of 'buf' (other than apparently it was
>> invented and widely used here).  "git_strbuf" I can understand, but
>> why should we?
>
> If it makes code sharing between git.git and libgit2 easier, why shouldn't
> we ;-)

I see no reasonably explanation why libgit2 is not the one who uses
"#define strbuf git_buf" to make "sharing easier", though.


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-15 16:23                   ` Junio C Hamano
@ 2020-10-15 19:39                     ` Johannes Schindelin
  2020-10-16  9:15                     ` Patrick Steinhardt
  1 sibling, 0 replies; 251+ messages in thread
From: Johannes Schindelin @ 2020-10-15 19:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Patrick Steinhardt, Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

Hi Junio,

On Thu, 15 Oct 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > Hi Junio,
> >
> > On Tue, 13 Oct 2020, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >>
> >> > Somewhat related: I was wondering whether it would make sense for git.git
> >> > to rename `strbuf` to `git_buf`? Would that make it easier to exchange
> >> > code between the two projects? Or would it just be unnecessary churn?
> >>
> >> To us, "git_buf" is just as descriptive as "buf" and does not say
> >> anything about the nature of 'buf' (other than apparently it was
> >> invented and widely used here).  "git_strbuf" I can understand, but
> >> why should we?
> >
> > If it makes code sharing between git.git and libgit2 easier, why shouldn't
> > we ;-)
>
> I see no reasonably explanation why libgit2 is not the one who uses
> "#define strbuf git_buf" to make "sharing easier", though.

That's a fair point, thanks for bringing it up!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-15 16:23                   ` Junio C Hamano
  2020-10-15 19:39                     ` Johannes Schindelin
@ 2020-10-16  9:15                     ` Patrick Steinhardt
  1 sibling, 0 replies; 251+ messages in thread
From: Patrick Steinhardt @ 2020-10-16  9:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Jonathan Nieder, Han-Wen Nienhuys,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 1245 bytes --]

On Thu, Oct 15, 2020 at 09:23:28AM -0700, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Hi Junio,
> >
> > On Tue, 13 Oct 2020, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >>
> >> > Somewhat related: I was wondering whether it would make sense for git.git
> >> > to rename `strbuf` to `git_buf`? Would that make it easier to exchange
> >> > code between the two projects? Or would it just be unnecessary churn?
> >>
> >> To us, "git_buf" is just as descriptive as "buf" and does not say
> >> anything about the nature of 'buf' (other than apparently it was
> >> invented and widely used here).  "git_strbuf" I can understand, but
> >> why should we?
> >
> > If it makes code sharing between git.git and libgit2 easier, why shouldn't
> > we ;-)
> 
> I see no reasonably explanation why libgit2 is not the one who uses
> "#define strbuf git_buf" to make "sharing easier", though.

It probably wouldn't help much anyway. We already have our own buffer
type which we can't easily replace with yours as it's part of the public
interface. If the need arises, providing a compatibility interface for
it shouldn't be too hard.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-10 17:28       ` Han-Wen Nienhuys
  2020-10-11 10:52         ` Johannes Schindelin
@ 2020-10-23  9:13         ` Ævar Arnfjörð Bjarmason
  2020-10-23 17:36           ` Junio C Hamano
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-23  9:13 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Jonathan Tan, Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys, brian m. carlson


On Sat, Oct 10 2020, Han-Wen Nienhuys wrote:

> On Thu, Oct 8, 2020 at 3:48 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>>
>> > From: Han-Wen Nienhuys <hanwen@google.com>
>> >
>> > This commit provides basic utility classes for the reftable library.
>> >
>> > Since the reftable library must compile standalone, there may be some overlap
>> > with git-core utility functions.
>
>> I think duplicating things like strbuf is an unnecessary burden if Git
>> is to maintain this library. Something like "reftable will only import
>> git-compat-util.h and strbuf.h, and any project that wants to use
>> reftable must make sure that these functions and data structures are
>> available" would be more plausible.
>
> Sure, but how do we ensure that the directory won't take on
> dependencies beyond these headers? I am worried that I will be
> involved in a tedious back & forth process to keep updates going into
> libgit2 and/or also have to keep maintaining
> github.com/google/reftable.
>
> FWIW, the duplication is really tiny: according to
>
>  $ wc $(grep -l REFTABLE_STANDALONE *[ch])
>
> it's just 431 lines of code.

Why import the dead code at all? If I add this to your update script:

    # Not a general solution, but works for the specific code here.
    perl -0777 -pi -e 's[(#ifndef REFTABLE_STANDALONE\n.*?\n#else\n).*?(?=\n#endif)][$1/* Removed upstream code for git.git import */]gs' reftable/system.h
    perl -0777 -pi -e 's[(?<=#ifdef REFTABLE_STANDALONE\n).*?(?=\n#endif)][/* Removed upstream code for git.git import */]gs' reftable/strbuf.c reftable/compat.h
    perl -0777 -pi -e 's[(?<=#ifdef REFTABLE_STANDALONE\n).*?(?=\n#else)][/* Removed upstream code for git.git import */]gs' reftable/compat.c reftable/strbuf.h

It's now 157 lines instead of 431.

I think doing that with a tiny bit more complexity in the update.sh
script is a much lower maintenance burden.

It's not just about the number of lines, but things coming up in grep,
and now unique code really stands out (e.g. strbuf_add_void, should
probably be just added to the main strbuf.h if it's needed...), and of
course the cost of attention of eyeballing an already big series on the
ML & the churn every time we have updates.

Overall I'm all for having this carved out in its own directory at a
cost of a bit more maintenance burden if it can be shared with libgit2 &
be a standalone library.

I am concerned that it seems this code can't be maintained in git.git by
anyone not willing to sign a contract with Google. I sent a tiny PR for
a typo fix at [1] and got directed to sign [2] before someone at Google
could look at it. I see brian raised this before in [3] but there wasn't
a reply to that point.

Is there some summary of how this part of integrating it is supposed to
work going forward?

At first glance that seems like a recipe for perma-forking this pretty
much from day one, i.e.:

 A. Upstream changes happen
 B. We get a patch to the bundled library to this ML
 ==> Google employees maintaining A can't even look at B
 ==> Patch B can't be integrated into A

1. https://github.com/google/reftable/pull/2
2. https://cla.developers.google.com/about/google-individual
3. https://lore.kernel.org/git/20200512233741.GB6605@camp.crustytoothpaste.net/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 05/13] reftable: utility functions
  2020-10-23  9:13         ` Ævar Arnfjörð Bjarmason
@ 2020-10-23 17:36           ` Junio C Hamano
  0 siblings, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2020-10-23 17:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys, Jonathan Tan,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King,
	Han-Wen Nienhuys, brian m. carlson

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> It's not just about the number of lines, but things coming up in grep,
> and now unique code really stands out (e.g. strbuf_add_void, should
> probably be just added to the main strbuf.h if it's needed...), and of
> course the cost of attention of eyeballing an already big series on the
> ML & the churn every time we have updates.

These are all valid concerns.

> Overall I'm all for having this carved out in its own directory at a
> cost of a bit more maintenance burden if it can be shared with libgit2 &
> be a standalone library.

Do you mean by "this" the reftable stuff as a whole, or only the
duplicated support library part?  If the latter is split out of what
we import as our reftable/ directory, and stored in a separate
directory that we do not have to import (because we have strbuf and
other stuff) but other people may choose to use (because they may
not have strbuf and other stuff), that may work.  Also, if the
contents of reftable/ directory wants a pluggable allocator, the
main code can be written to call reftable_malloc (or whatever), with
a thin shim to interface with us (i.e. reftable_malloc() would be
implemented as a thin wrapper around xmalloc() for us) which is
stored in a separate directory just for us to interface with the
main reftable library.  For libgit2, there will be a separate
directory that uses a different implementation of reftable_malloc()
that would let them plug their preferred allocator.  An arrangement
like that might work.

I do not offhand know if that kind of overhead is worth the trouble
or if there are better ways, though.

> I am concerned that it seems this code can't be maintained in git.git by
> anyone not willing to sign a contract with Google.

It can be maintained in git.git; the trouble comes when they want to
update us.

I however suspect that, as the primary intended audience, it is hard
to imagine that the reftable library as a standalone project will be
successful without going through the usual reviews and testing in
git.git.  So even though there exists github.com/google/reftable
repository, its contents may not matter very much in practice,
unless they come here and beg for the change we make ourselves
anyway.  Perhaps I am being naive.  I dunno.

> I sent a tiny PR for
> a typo fix at [1] and got directed to sign [2] before someone at Google
> could look at it. I see brian raised this before in [3] but there wasn't
> a reply to that point.
>
> Is there some summary of how this part of integrating it is supposed to
> work going forward?

Yeah, thanks for raising a good point.  We definitely need to figure
this part out.

> At first glance that seems like a recipe for perma-forking this pretty
> much from day one, i.e.:
>
>  A. Upstream changes happen
>  B. We get a patch to the bundled library to this ML
>  ==> Google employees maintaining A can't even look at B
>  ==> Patch B can't be integrated into A
>
> 1. https://github.com/google/reftable/pull/2
> 2. https://cla.developers.google.com/about/google-individual
> 3. https://lore.kernel.org/git/20200512233741.GB6605@camp.crustytoothpaste.net/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v3 00/16] reftable library
  2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
                     ` (12 preceding siblings ...)
  2020-10-01 16:11   ` [PATCH v2 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42   ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 01/16] move sleep_millisec to git-compat-util.h Han-Wen Nienhuys via GitGitGadget
                       ` (16 more replies)
  13 siblings, 17 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys

This splits the giant commit from 
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

The final commit should also be split up, but I want to wait until we have
consensus that the bottom commits look good.

version 26 Nov 2020:

 * split public API header into multiple headers
 * minor adaptions in response to review feedback
 * filter out code for standalone compile
 * get rid of test framework

Han-Wen Nienhuys (15):
  move sleep_millisec to git-compat-util.h
  init-db: set the_repository->hash_algo early on
  reftable: add LICENSE
  reftable: add error related functionality
  reftable: utility functions
  reftable: add blocksource, an abstraction for random access reads
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: read reftable files
  reftable: reftable file level tests
  reftable: rest of library
  Reftable support for git-core
  Add "test-tool dump-reftable" command.

SZEDER Gábor (1):
  git-prompt: prepare for reftable refs backend

 .../technical/repository-version.txt          |    7 +
 Makefile                                      |   46 +-
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   56 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    9 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 contrib/completion/git-prompt.sh              |    7 +-
 git-compat-util.h                             |    2 +
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1418 +++++++++++++++++
 reftable/.gitattributes                       |    1 +
 reftable/LICENSE                              |   31 +
 reftable/VERSION                              |    1 +
 reftable/basics.c                             |  144 ++
 reftable/basics.h                             |   56 +
 reftable/basics_test.c                        |   56 +
 reftable/block.c                              |  440 +++++
 reftable/block.h                              |  129 ++
 reftable/block_test.c                         |  119 ++
 reftable/blocksource.c                        |  148 ++
 reftable/blocksource.h                        |   22 +
 reftable/compat.h                             |   48 +
 reftable/constants.h                          |   21 +
 reftable/dump.c                               |  219 +++
 reftable/error.c                              |   39 +
 reftable/iter.c                               |  242 +++
 reftable/iter.h                               |   75 +
 reftable/merged.c                             |  366 +++++
 reftable/merged.h                             |   35 +
 reftable/merged_test.c                        |  333 ++++
 reftable/pq.c                                 |  115 ++
 reftable/pq.h                                 |   32 +
 reftable/publicbasics.c                       |   58 +
 reftable/reader.c                             |  733 +++++++++
 reftable/reader.h                             |   75 +
 reftable/record.c                             | 1116 +++++++++++++
 reftable/record.h                             |  139 ++
 reftable/record_test.c                        |  404 +++++
 reftable/refname.c                            |  209 +++
 reftable/refname.h                            |   29 +
 reftable/refname_test.c                       |  100 ++
 reftable/reftable-blocksource.h               |   50 +
 reftable/reftable-error.h                     |   62 +
 reftable/reftable-generic.h                   |   49 +
 reftable/reftable-iterator.h                  |   37 +
 reftable/reftable-malloc.h                    |   20 +
 reftable/reftable-merged.h                    |   68 +
 reftable/reftable-reader.h                    |   89 ++
 reftable/reftable-record.h                    |   73 +
 reftable/reftable-stack.h                     |  116 ++
 reftable/reftable-tests.h                     |   22 +
 reftable/reftable-writer.h                    |  149 ++
 reftable/reftable.c                           |   98 ++
 reftable/reftable_test.c                      |  577 +++++++
 reftable/stack.c                              | 1247 +++++++++++++++
 reftable/stack.h                              |   41 +
 reftable/stack_test.c                         |  781 +++++++++
 reftable/system.h                             |   32 +
 reftable/test_framework.c                     |   23 +
 reftable/test_framework.h                     |   48 +
 reftable/tree.c                               |   63 +
 reftable/tree.h                               |   34 +
 reftable/tree_test.c                          |   61 +
 reftable/writer.c                             |  676 ++++++++
 reftable/writer.h                             |   50 +
 reftable/zlib-compat.c                        |   92 ++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/helper/test-reftable.c                      |   20 +
 t/helper/test-tool.c                          |    4 +-
 t/helper/test-tool.h                          |    2 +
 t/t0031-reftable.sh                           |  199 +++
 t/t0032-reftable-unittest.sh                  |   14 +
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 82 files changed, 11951 insertions(+), 39 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/error.c
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-blocksource.h
 create mode 100644 reftable/reftable-error.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-reader.h
 create mode 100644 reftable/reftable-record.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0031-reftable.sh
 create mode 100755 t/t0032-reftable-unittest.sh


base-commit: faefdd61ec7c7f6f3c8c9907891465ac9a2a1475
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v3
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v2:

  -:  ---------- >  1:  030998a732 move sleep_millisec to git-compat-util.h
  -:  ---------- >  2:  cf667653df init-db: set the_repository->hash_algo early on
  1:  6228103b4a =  3:  91c3ac2449 reftable: add LICENSE
  2:  5d1b946ab5 !  4:  2aa30f536f reftable: define the public API
     @@ Metadata
      Author: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Commit message ##
     -    reftable: define the public API
     +    reftable: add error related functionality
     +
     +    The reftable/ directory is structured as a library, so it cannot
     +    crash on misuse. Instead, it returns an error codes.
     +
     +    In addition, the error code can be used to signal conditions from lower levels
     +    of the library to be handled by higher levels of the library. For example, a
     +    transaction might legitimately write an empty reftable file, but in that case,
     +    we'd want to shortcut the transaction overhead.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
     - ## reftable/reftable.h (new) ##
     + ## reftable/error.c (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/reftable.h (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#ifndef REFTABLE_H
     -+#define REFTABLE_H
     -+
     -+#include <stdint.h>
     -+#include <stddef.h>
     -+
     -+void reftable_set_alloc(void *(*malloc)(size_t),
     -+			void *(*realloc)(void *, size_t), void (*free)(void *));
     -+
     -+/****************************************************************
     -+ Basic data types
     -+
     -+ Reftables store the state of each ref in struct reftable_ref_record, and they
     -+ store a sequence of reflog updates in struct reftable_log_record.
     -+ ****************************************************************/
     -+
     -+/* reftable_ref_record holds a ref database entry target_value */
     -+struct reftable_ref_record {
     -+	char *refname; /* Name of the ref, malloced. */
     -+	uint64_t update_index; /* Logical timestamp at which this value is
     -+				  written */
     -+	uint8_t *value; /* SHA1, or NULL. malloced. */
     -+	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
     -+	char *target; /* symref, or NULL. malloced. */
     -+};
     -+
     -+/* returns whether 'ref' represents a deletion */
     -+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
     -+
     -+/* prints a reftable_ref_record onto stdout */
     -+void reftable_ref_record_print(struct reftable_ref_record *ref,
     -+			       uint32_t hash_id);
     -+
     -+/* frees and nulls all pointer values. */
     -+void reftable_ref_record_clear(struct reftable_ref_record *ref);
     -+
     -+/* returns whether two reftable_ref_records are the same */
     -+int reftable_ref_record_equal(struct reftable_ref_record *a,
     -+			      struct reftable_ref_record *b, int hash_size);
     -+
     -+/* reftable_log_record holds a reflog entry */
     -+struct reftable_log_record {
     -+	char *refname;
     -+	uint64_t update_index; /* logical timestamp of a transactional update.
     -+				*/
     -+	uint8_t *new_hash;
     -+	uint8_t *old_hash;
     -+	char *name;
     -+	char *email;
     -+	uint64_t time;
     -+	int16_t tz_offset;
     -+	char *message;
     -+};
     -+
     -+/* returns whether 'ref' represents the deletion of a log record. */
     -+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
     -+
     -+/* frees and nulls all pointer values. */
     -+void reftable_log_record_clear(struct reftable_log_record *log);
     -+
     -+/* returns whether two records are equal. */
     -+int reftable_log_record_equal(struct reftable_log_record *a,
     -+			      struct reftable_log_record *b, int hash_size);
     -+
     -+/* dumps a reftable_log_record on stdout, for debugging/testing. */
     -+void reftable_log_record_print(struct reftable_log_record *log,
     -+			       uint32_t hash_id);
     ++#include "reftable-error.h"
     ++
     ++#include <stdio.h>
     ++
     ++const char *reftable_error_str(int err)
     ++{
     ++	static char buf[250];
     ++	switch (err) {
     ++	case REFTABLE_IO_ERROR:
     ++		return "I/O error";
     ++	case REFTABLE_FORMAT_ERROR:
     ++		return "corrupt reftable file";
     ++	case REFTABLE_NOT_EXIST_ERROR:
     ++		return "file does not exist";
     ++	case REFTABLE_LOCK_ERROR:
     ++		return "data is outdated";
     ++	case REFTABLE_API_ERROR:
     ++		return "misuse of the reftable API";
     ++	case REFTABLE_ZLIB_ERROR:
     ++		return "zlib failure";
     ++	case REFTABLE_NAME_CONFLICT:
     ++		return "file/directory conflict";
     ++	case REFTABLE_REFNAME_ERROR:
     ++		return "invalid refname";
     ++	case -1:
     ++		return "general error";
     ++	default:
     ++		snprintf(buf, sizeof(buf), "unknown error code %d", err);
     ++		return buf;
     ++	}
     ++}
     +
     + ## reftable/reftable-error.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
      +
     -+/****************************************************************
     -+ Error handling
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
      +
     -+ Error are signaled with negative integer return values. 0 means success.
     -+ ****************************************************************/
     ++#ifndef REFTABLE_ERROR_H
     ++#define REFTABLE_ERROR_H
      +
     -+/* different types of errors */
     ++/*
     ++ Errors in reftable calls are signaled with negative integer return values. 0
     ++ means success.
     ++*/
      +enum reftable_error {
      +	/* Unexpected file system behavior */
      +	REFTABLE_IO_ERROR = -2,
     @@ reftable/reftable.h (new)
      +	   - on writing a log record with multiline message with
      +	   exact_log_message unset
      +	   - on reading a reftable_ref_record from log iterator, or vice versa.
     ++
     ++	  When a call misuses the API, the internal state of the library is kept
     ++	  unchanged.
      +	*/
      +	REFTABLE_API_ERROR = -6,
      +
     @@ reftable/reftable.h (new)
      + * deallocated. */
      +const char *reftable_error_str(int err);
      +
     -+/*
     -+ * Convert the numeric error code to an equivalent errno code.
     -+ */
     -+int reftable_error_to_errno(int err);
     -+
     -+/****************************************************************
     -+ Writing
     -+
     -+ Writing single reftables
     -+ ****************************************************************/
     -+
     -+/* reftable_write_options sets options for writing a single reftable. */
     -+struct reftable_write_options {
     -+	/* boolean: do not pad out blocks to block size. */
     -+	int unpadded;
     -+
     -+	/* the blocksize. Should be less than 2^24. */
     -+	uint32_t block_size;
     -+
     -+	/* boolean: do not generate a SHA1 => ref index. */
     -+	int skip_index_objects;
     -+
     -+	/* how often to write complete keys in each block. */
     -+	int restart_interval;
     -+
     -+	/* 4-byte identifier ("sha1", "s256") of the hash.
     -+	 * Defaults to SHA1 if unset
     -+	 */
     -+	uint32_t hash_id;
     -+
     -+	/* boolean: do not check ref names for validity or dir/file conflicts.
     -+	 */
     -+	int skip_name_check;
     -+
     -+	/* boolean: copy log messages exactly. If unset, check that the message
     -+	 *   is a single line, and add '\n' if missing.
     -+	 */
     -+	int exact_log_message;
     -+};
     -+
     -+/* reftable_block_stats holds statistics for a single block type */
     -+struct reftable_block_stats {
     -+	/* total number of entries written */
     -+	int entries;
     -+	/* total number of key restarts */
     -+	int restarts;
     -+	/* total number of blocks */
     -+	int blocks;
     -+	/* total number of index blocks */
     -+	int index_blocks;
     -+	/* depth of the index */
     -+	int max_index_level;
     -+
     -+	/* offset of the first block for this type */
     -+	uint64_t offset;
     -+	/* offset of the top level index block for this type, or 0 if not
     -+	 * present */
     -+	uint64_t index_offset;
     -+};
     -+
     -+/* stats holds overall statistics for a single reftable */
     -+struct reftable_stats {
     -+	/* total number of blocks written. */
     -+	int blocks;
     -+	/* stats for ref data */
     -+	struct reftable_block_stats ref_stats;
     -+	/* stats for the SHA1 to ref map. */
     -+	struct reftable_block_stats obj_stats;
     -+	/* stats for index blocks */
     -+	struct reftable_block_stats idx_stats;
     -+	/* stats for log blocks */
     -+	struct reftable_block_stats log_stats;
     -+
     -+	/* disambiguation length of shortened object IDs. */
     -+	int object_id_len;
     -+};
     -+
     -+/* reftable_new_writer creates a new writer */
     -+struct reftable_writer *
     -+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
     -+		    void *writer_arg, struct reftable_write_options *opts);
     -+
     -+/* write to a file descriptor. fdp should be an int* pointing to the fd. */
     -+int reftable_fd_write(void *fdp, const void *data, size_t size);
     -+
     -+/* Set the range of update indices for the records we will add.  When
     -+   writing a table into a stack, the min should be at least
     -+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
     -+
     -+   For transactional updates, typically min==max. When converting an existing
     -+   ref database into a single reftable, this would be a range of update-index
     -+   timestamps.
     -+ */
     -+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
     -+				uint64_t max);
     -+
     -+/* adds a reftable_ref_record. Must be called in ascending
     -+   order. The update_index must be within the limits set by
     -+   reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned.
     -+
     -+   It is an error to write a ref record after a log record.
     -+ */
     -+int reftable_writer_add_ref(struct reftable_writer *w,
     -+			    struct reftable_ref_record *ref);
     -+
     -+/* Convenience function to add multiple refs. Will sort the refs by
     -+   name before adding. */
     -+int reftable_writer_add_refs(struct reftable_writer *w,
     -+			     struct reftable_ref_record *refs, int n);
     -+
     -+/* adds a reftable_log_record. Must be called in ascending order (with more
     -+   recent log entries first.)
     -+ */
     -+int reftable_writer_add_log(struct reftable_writer *w,
     -+			    struct reftable_log_record *log);
     -+
     -+/* Convenience function to add multiple logs. Will sort the records by
     -+   key before adding. */
     -+int reftable_writer_add_logs(struct reftable_writer *w,
     -+			     struct reftable_log_record *logs, int n);
     -+
     -+/* reftable_writer_close finalizes the reftable. The writer is retained so
     -+ * statistics can be inspected. */
     -+int reftable_writer_close(struct reftable_writer *w);
     -+
     -+/* writer_stats returns the statistics on the reftable being written.
     -+
     -+   This struct becomes invalid when the writer is freed.
     -+ */
     -+const struct reftable_stats *writer_stats(struct reftable_writer *w);
     -+
     -+/* reftable_writer_free deallocates memory for the writer */
     -+void reftable_writer_free(struct reftable_writer *w);
     -+
     -+/****************************************************************
     -+ * ITERATING
     -+ ****************************************************************/
     -+
     -+/* iterator is the generic interface for walking over data stored in a
     -+   reftable. It is generally passed around by value.
     -+*/
     -+struct reftable_iterator {
     -+	struct reftable_iterator_vtable *ops;
     -+	void *iter_arg;
     -+};
     -+
     -+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
     -+   end of iteration.
     -+*/
     -+int reftable_iterator_next_ref(struct reftable_iterator *it,
     -+			       struct reftable_ref_record *ref);
     -+
     -+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
     -+   end of iteration.
     -+*/
     -+int reftable_iterator_next_log(struct reftable_iterator *it,
     -+			       struct reftable_log_record *log);
     -+
     -+/* releases resources associated with an iterator. */
     -+void reftable_iterator_destroy(struct reftable_iterator *it);
     -+
     -+/****************************************************************
     -+ Reading single tables
     -+
     -+ The follow routines are for reading single files. For an application-level
     -+ interface, skip ahead to struct reftable_merged_table and struct
     -+ reftable_stack.
     -+ ****************************************************************/
     -+
     -+/* block_source is a generic wrapper for a seekable readable file.
     -+   It is generally passed around by value.
     -+ */
     -+struct reftable_block_source {
     -+	struct reftable_block_source_vtable *ops;
     -+	void *arg;
     -+};
     -+
     -+/* a contiguous segment of bytes. It keeps track of its generating block_source
     -+   so it can return itself into the pool.
     -+*/
     -+struct reftable_block {
     -+	uint8_t *data;
     -+	int len;
     -+	struct reftable_block_source source;
     -+};
     -+
     -+/* block_source_vtable are the operations that make up block_source */
     -+struct reftable_block_source_vtable {
     -+	/* returns the size of a block source */
     -+	uint64_t (*size)(void *source);
     -+
     -+	/* reads a segment from the block source. It is an error to read
     -+	   beyond the end of the block */
     -+	int (*read_block)(void *source, struct reftable_block *dest,
     -+			  uint64_t off, uint32_t size);
     -+	/* mark the block as read; may return the data back to malloc */
     -+	void (*return_block)(void *source, struct reftable_block *blockp);
     -+
     -+	/* release all resources associated with the block source */
     -+	void (*close)(void *source);
     -+};
     -+
     -+/* opens a file on the file system as a block_source */
     -+int reftable_block_source_from_file(struct reftable_block_source *block_src,
     -+				    const char *name);
     -+
     -+/* The reader struct is a handle to an open reftable file. */
     -+struct reftable_reader;
     -+
     -+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
     -+ * code and sets pp. The name is used for creating a stack. Typically, it is the
     -+ * basename of the file. The block source `src` is owned by the reader, and is
     -+ * closed on calling reftable_reader_destroy().
     -+ */
     -+int reftable_new_reader(struct reftable_reader **pp,
     -+			struct reftable_block_source *src, const char *name);
     -+
     -+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
     -+   in the table.  To seek to the start of the table, use name = "".
     -+
     -+   example:
     -+
     -+   struct reftable_reader *r = NULL;
     -+   int err = reftable_new_reader(&r, &src, "filename");
     -+   if (err < 0) { ... }
     -+   struct reftable_iterator it  = {0};
     -+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
     -+   if (err < 0) { ... }
     -+   struct reftable_ref_record ref  = {0};
     -+   while (1) {
     -+     err = reftable_iterator_next_ref(&it, &ref);
     -+     if (err > 0) {
     -+       break;
     -+     }
     -+     if (err < 0) {
     -+       ..error handling..
     -+     }
     -+     ..found..
     -+   }
     -+   reftable_iterator_destroy(&it);
     -+   reftable_ref_record_clear(&ref);
     -+ */
     -+int reftable_reader_seek_ref(struct reftable_reader *r,
     -+			     struct reftable_iterator *it, const char *name);
     -+
     -+/* returns the hash ID used in this table. */
     -+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
     -+
     -+/* seek to logs for the given name, older than update_index. To seek to the
     -+   start of the table, use name = "".
     -+ */
     -+int reftable_reader_seek_log_at(struct reftable_reader *r,
     -+				struct reftable_iterator *it, const char *name,
     -+				uint64_t update_index);
     -+
     -+/* seek to newest log entry for given name. */
     -+int reftable_reader_seek_log(struct reftable_reader *r,
     -+			     struct reftable_iterator *it, const char *name);
     -+
     -+/* closes and deallocates a reader. */
     -+void reftable_reader_free(struct reftable_reader *);
     -+
     -+/* return an iterator for the refs pointing to `oid`. */
     -+int reftable_reader_refs_for(struct reftable_reader *r,
     -+			     struct reftable_iterator *it, uint8_t *oid);
     -+
     -+/* return the max_update_index for a table */
     -+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
     -+
     -+/* return the min_update_index for a table */
     -+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
     -+
     -+/****************************************************************
     -+ Merged tables
     -+
     -+ A ref database kept in a sequence of table files. The merged_table presents a
     -+ unified view to reading (seeking, iterating) a sequence of immutable tables.
     -+ ****************************************************************/
     -+
     -+/* A merged table is implements seeking/iterating over a stack of tables. */
     -+struct reftable_merged_table;
     -+
     -+/* A generic reftable; see below. */
     -+struct reftable_table;
     -+
     -+/* reftable_new_merged_table creates a new merged table. It takes ownership of
     -+   the stack array.
     -+*/
     -+int reftable_new_merged_table(struct reftable_merged_table **dest,
     -+			      struct reftable_table *stack, int n,
     -+			      uint32_t hash_id);
     -+
     -+/* returns an iterator positioned just before 'name' */
     -+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name);
     -+
     -+/* returns an iterator for log entry, at given update_index */
     -+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
     -+				      struct reftable_iterator *it,
     -+				      const char *name, uint64_t update_index);
     -+
     -+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
     -+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name);
     -+
     -+/* returns the max update_index covered by this merged table. */
     -+uint64_t
     -+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
     -+
     -+/* returns the min update_index covered by this merged table. */
     -+uint64_t
     -+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
     -+
     -+/* releases memory for the merged_table */
     -+void reftable_merged_table_free(struct reftable_merged_table *m);
     -+
     -+/* return the hash ID of the merged table. */
     -+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
     -+
     -+/****************************************************************
     -+ Generic tables
     -+
     -+ A unified API for reading tables, either merged tables, or single readers.
     -+ ****************************************************************/
     -+
     -+struct reftable_table {
     -+	struct reftable_table_vtable *ops;
     -+	void *table_arg;
     -+};
     -+
     -+int reftable_table_seek_ref(struct reftable_table *tab,
     -+			    struct reftable_iterator *it, const char *name);
     -+
     -+void reftable_table_from_reader(struct reftable_table *tab,
     -+				struct reftable_reader *reader);
     -+
     -+/* returns the hash ID from a generic reftable_table */
     -+uint32_t reftable_table_hash_id(struct reftable_table *tab);
     -+
     -+/* create a generic table from reftable_merged_table */
     -+void reftable_table_from_merged_table(struct reftable_table *tab,
     -+				      struct reftable_merged_table *table);
     -+
     -+/* returns the max update_index covered by this table. */
     -+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
     -+
     -+/* returns the min update_index covered by this table. */
     -+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
     -+
     -+/* convenience function to read a single ref. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     -+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     -+			    struct reftable_ref_record *ref);
     -+
     -+/****************************************************************
     -+ Mutable ref database
     -+
     -+ The stack presents an interface to a mutable sequence of reftables.
     -+ ****************************************************************/
     -+
     -+/* a stack is a stack of reftables, which can be mutated by pushing a table to
     -+ * the top of the stack */
     -+struct reftable_stack;
     -+
     -+/* open a new reftable stack. The tables along with the table list will be
     -+   stored in 'dir'. Typically, this should be .git/reftables.
     -+*/
     -+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
     -+		       struct reftable_write_options config);
     -+
     -+/* returns the update_index at which a next table should be written. */
     -+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
     -+
     -+/* holds a transaction to add tables at the top of a stack. */
     -+struct reftable_addition;
     -+
     -+/*
     -+  returns a new transaction to add reftables to the given stack. As a side
     -+  effect, the ref database is locked.
     -+*/
     -+int reftable_stack_new_addition(struct reftable_addition **dest,
     -+				struct reftable_stack *st);
     -+
     -+/* Adds a reftable to transaction. */
     -+int reftable_addition_add(struct reftable_addition *add,
     -+			  int (*write_table)(struct reftable_writer *wr,
     -+					     void *arg),
     -+			  void *arg);
     -+
     -+/* Commits the transaction, releasing the lock. */
     -+int reftable_addition_commit(struct reftable_addition *add);
     -+
     -+/* Release all non-committed data from the transaction, and deallocate the
     -+   transaction. Releases the lock if held. */
     -+void reftable_addition_destroy(struct reftable_addition *add);
     -+
     -+/* add a new table to the stack. The write_table function must call
     -+   reftable_writer_set_limits, add refs and return an error value. */
     -+int reftable_stack_add(struct reftable_stack *st,
     -+		       int (*write_table)(struct reftable_writer *wr,
     -+					  void *write_arg),
     -+		       void *write_arg);
     -+
     -+/* returns the merged_table for seeking. This table is valid until the
     -+   next write or reload, and should not be closed or deleted.
     -+*/
     -+struct reftable_merged_table *
     -+reftable_stack_merged_table(struct reftable_stack *st);
     -+
     -+/* frees all resources associated with the stack. */
     -+void reftable_stack_destroy(struct reftable_stack *st);
     -+
     -+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
     -+ * to date */
     -+int reftable_stack_reload(struct reftable_stack *st);
     -+
     -+/* Policy for expiring reflog entries. */
     -+struct reftable_log_expiry_config {
     -+	/* Drop entries older than this timestamp */
     -+	uint64_t time;
     -+
     -+	/* Drop older entries */
     -+	uint64_t min_update_index;
     -+};
     -+
     -+/* compacts all reftables into a giant table. Expire reflog entries if config is
     -+ * non-NULL */
     -+int reftable_stack_compact_all(struct reftable_stack *st,
     -+			       struct reftable_log_expiry_config *config);
     -+
     -+/* heuristically compact unbalanced table stack. */
     -+int reftable_stack_auto_compact(struct reftable_stack *st);
     -+
     -+/* convenience function to read a single ref. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     -+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
     -+			    struct reftable_ref_record *ref);
     -+
     -+/* convenience function to read a single log. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     -+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
     -+			    struct reftable_log_record *log);
     -+
     -+/* statistics on past compactions. */
     -+struct reftable_compaction_stats {
     -+	uint64_t bytes; /* total number of bytes written */
     -+	uint64_t entries_written; /* total number of entries written, including
     -+				     failures. */
     -+	int attempts; /* how often we tried to compact */
     -+	int failures; /* failures happen on concurrent updates */
     -+};
     -+
     -+/* return statistics for compaction up till now. */
     -+struct reftable_compaction_stats *
     -+reftable_stack_compaction_stats(struct reftable_stack *st);
     -+
      +#endif
  3:  01a669a731 <  -:  ---------- vcxproj: adjust for the reftable changes
  5:  4190da597e !  5:  987d1d186f reftable: utility functions
     @@ Commit message
      
          This commit provides basic utility classes for the reftable library.
      
     -    Since the reftable library must compile standalone, there may be some overlap
     -    with git-core utility functions.
     -
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     @@ Makefile: TEST_SHELL_PATH = $(SHELL_PATH)
      +REFTABLE_LIB = reftable/libreftable.a
      +REFTABLE_TEST_LIB = reftable/libreftable_test.a
       
     - GENERATED_H += config-list.h
       GENERATED_H += command-list.h
     + GENERATED_H += config-list.h
      @@ Makefile: THIRD_PARTY_SOURCES += compat/regex/%
       THIRD_PARTY_SOURCES += sha1collisiondetection/%
       THIRD_PARTY_SOURCES += sha1dc/%
     @@ Makefile: XDIFF_OBJS += xdiff/xpatience.o
       XDIFF_OBJS += xdiff/xutils.o
       
      +REFTABLE_OBJS += reftable/basics.o
     -+REFTABLE_OBJS += reftable/blocksource.o
     ++REFTABLE_OBJS += reftable/error.o
      +REFTABLE_OBJS += reftable/publicbasics.o
     -+REFTABLE_OBJS += reftable/compat.o
     -+REFTABLE_OBJS += reftable/strbuf.o
      +
     -+REFTABLE_TEST_OBJS += reftable/strbuf_test.o
      +REFTABLE_TEST_OBJS += reftable/test_framework.o
     ++REFTABLE_TEST_OBJS += reftable/basics_test.o
      +
       TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
       OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
     @@ reftable/basics.c (new)
      +	}
      +
      +	return *a == *b;
     ++}
     ++
     ++int common_prefix_size(struct strbuf *a, struct strbuf *b)
     ++{
     ++	int p = 0;
     ++	while (p < a->len && p < b->len) {
     ++		if (a->buf[p] != b->buf[p]) {
     ++			break;
     ++		}
     ++		p++;
     ++	}
     ++
     ++	return p;
      +}
      
       ## reftable/basics.h (new) ##
     @@ reftable/basics.h (new)
      +#ifndef BASICS_H
      +#define BASICS_H
      +
     ++/*
     ++ * miscellaneous utilities that are not provided by Git.
     ++ */
     ++
      +#include "system.h"
      +
      +/* Bigendian en/decoding of integers */
     @@ reftable/basics.h (new)
      +void reftable_free(void *p);
      +void *reftable_calloc(size_t sz);
      +
     ++/* Find the longest shared prefix size of `a` and `b` */
     ++struct strbuf;
     ++int common_prefix_size(struct strbuf *a, struct strbuf *b);
     ++
      +#endif
      
     - ## reftable/blocksource.c (new) ##
     + ## reftable/basics_test.c (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/blocksource.c (new)
      +#include "system.h"
      +
      +#include "basics.h"
     -+#include "blocksource.h"
     -+#include "strbuf.h"
     -+#include "reftable.h"
     -+
     -+static void strbuf_return_block(void *b, struct reftable_block *dest)
     -+{
     -+	memset(dest->data, 0xff, dest->len);
     -+	reftable_free(dest->data);
     -+}
     -+
     -+static void strbuf_close(void *b)
     -+{
     -+}
     -+
     -+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
     -+			     uint32_t size)
     -+{
     -+	struct strbuf *b = (struct strbuf *)v;
     -+	assert(off + size <= b->len);
     -+	dest->data = reftable_calloc(size);
     -+	memcpy(dest->data, b->buf + off, size);
     -+	dest->len = size;
     -+	return size;
     -+}
     -+
     -+static uint64_t strbuf_size(void *b)
     -+{
     -+	return ((struct strbuf *)b)->len;
     -+}
     -+
     -+static struct reftable_block_source_vtable strbuf_vtable = {
     -+	.size = &strbuf_size,
     -+	.read_block = &strbuf_read_block,
     -+	.return_block = &strbuf_return_block,
     -+	.close = &strbuf_close,
     -+};
     -+
     -+void block_source_from_strbuf(struct reftable_block_source *bs,
     -+			      struct strbuf *buf)
     -+{
     -+	assert(bs->ops == NULL);
     -+	bs->ops = &strbuf_vtable;
     -+	bs->arg = buf;
     -+}
     -+
     -+static void malloc_return_block(void *b, struct reftable_block *dest)
     -+{
     -+	memset(dest->data, 0xff, dest->len);
     -+	reftable_free(dest->data);
     -+}
     -+
     -+static struct reftable_block_source_vtable malloc_vtable = {
     -+	.return_block = &malloc_return_block,
     -+};
     -+
     -+static struct reftable_block_source malloc_block_source_instance = {
     -+	.ops = &malloc_vtable,
     -+};
     -+
     -+struct reftable_block_source malloc_block_source(void)
     -+{
     -+	return malloc_block_source_instance;
     -+}
     ++#include "test_framework.h"
     ++#include "reftable-tests.h"
      +
     -+struct file_block_source {
     -+	int fd;
     -+	uint64_t size;
     ++struct binsearch_args {
     ++	int key;
     ++	int *arr;
      +};
      +
     -+static uint64_t file_size(void *b)
     ++static int binsearch_func(size_t i, void *void_args)
      +{
     -+	return ((struct file_block_source *)b)->size;
     -+}
     ++	struct binsearch_args *args = (struct binsearch_args *)void_args;
      +
     -+static void file_return_block(void *b, struct reftable_block *dest)
     -+{
     -+	memset(dest->data, 0xff, dest->len);
     -+	reftable_free(dest->data);
     ++	return args->key < args->arr[i];
      +}
      +
     -+static void file_close(void *b)
     -+{
     -+	int fd = ((struct file_block_source *)b)->fd;
     -+	if (fd > 0) {
     -+		close(fd);
     -+		((struct file_block_source *)b)->fd = 0;
     -+	}
     -+
     -+	reftable_free(b);
     -+}
     -+
     -+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
     -+			   uint32_t size)
     -+{
     -+	struct file_block_source *b = (struct file_block_source *)v;
     -+	assert(off + size <= b->size);
     -+	dest->data = reftable_malloc(size);
     -+	if (pread(b->fd, dest->data, size, off) != size)
     -+		return -1;
     -+	dest->len = size;
     -+	return size;
     -+}
     -+
     -+static struct reftable_block_source_vtable file_vtable = {
     -+	.size = &file_size,
     -+	.read_block = &file_read_block,
     -+	.return_block = &file_return_block,
     -+	.close = &file_close,
     -+};
     -+
     -+int reftable_block_source_from_file(struct reftable_block_source *bs,
     -+				    const char *name)
     ++static void test_binsearch(void)
      +{
     -+	struct stat st = { 0 };
     -+	int err = 0;
     -+	int fd = open(name, O_RDONLY);
     -+	struct file_block_source *p = NULL;
     -+	if (fd < 0) {
     -+		if (errno == ENOENT) {
     -+			return REFTABLE_NOT_EXIST_ERROR;
     -+		}
     -+		return -1;
     -+	}
     -+
     -+	err = fstat(fd, &st);
     -+	if (err < 0)
     -+		return -1;
     -+
     -+	p = reftable_calloc(sizeof(struct file_block_source));
     -+	p->size = st.st_size;
     -+	p->fd = fd;
     -+
     -+	assert(bs->ops == NULL);
     -+	bs->ops = &file_vtable;
     -+	bs->arg = p;
     -+	return 0;
     -+}
     -
     - ## reftable/blocksource.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef BLOCKSOURCE_H
     -+#define BLOCKSOURCE_H
     -+
     -+#include "strbuf.h"
     -+
     -+struct reftable_block_source;
     -+
     -+/* Create an in-memory block source for reading reftables */
     -+void block_source_from_strbuf(struct reftable_block_source *bs,
     -+			      struct strbuf *buf);
     ++	int arr[] = { 2, 4, 6, 8, 10 };
     ++	size_t sz = ARRAY_SIZE(arr);
     ++	struct binsearch_args args = {
     ++		.arr = arr,
     ++	};
      +
     -+struct reftable_block_source malloc_block_source(void);
     -+
     -+#endif
     -
     - ## reftable/compat.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+
     -+*/
     -+
     -+/* compat.c - compatibility functions for standalone compilation */
     -+
     -+#include "system.h"
     -+#include "basics.h"
     -+
     -+#ifdef REFTABLE_STANDALONE
     -+
     -+#include <dirent.h>
     -+
     -+void put_be32(void *p, uint32_t i)
     -+{
     -+	uint8_t *out = (uint8_t *)p;
     -+
     -+	out[0] = (uint8_t)((i >> 24) & 0xff);
     -+	out[1] = (uint8_t)((i >> 16) & 0xff);
     -+	out[2] = (uint8_t)((i >> 8) & 0xff);
     -+	out[3] = (uint8_t)((i)&0xff);
     -+}
     -+
     -+uint32_t get_be32(uint8_t *in)
     -+{
     -+	return (uint32_t)(in[0]) << 24 | (uint32_t)(in[1]) << 16 |
     -+	       (uint32_t)(in[2]) << 8 | (uint32_t)(in[3]);
     -+}
     -+
     -+void put_be64(void *p, uint64_t v)
     -+{
     -+	uint8_t *out = (uint8_t *)p;
     -+	int i = sizeof(uint64_t);
     -+	while (i--) {
     -+		out[i] = (uint8_t)(v & 0xff);
     -+		v >>= 8;
     -+	}
     -+}
     -+
     -+uint64_t get_be64(void *out)
     -+{
     -+	uint8_t *bytes = (uint8_t *)out;
     -+	uint64_t v = 0;
      +	int i = 0;
     -+	for (i = 0; i < sizeof(uint64_t); i++) {
     -+		v = (v << 8) | (uint8_t)(bytes[i] & 0xff);
     -+	}
     -+	return v;
     -+}
     -+
     -+uint16_t get_be16(uint8_t *in)
     -+{
     -+	return (uint32_t)(in[0]) << 8 | (uint32_t)(in[1]);
     -+}
     -+
     -+char *xstrdup(const char *s)
     -+{
     -+	int l = strlen(s);
     -+	char *dest = (char *)reftable_malloc(l + 1);
     -+	strncpy(dest, s, l + 1);
     -+	return dest;
     -+}
     -+
     -+void sleep_millisec(int millisecs)
     -+{
     -+	usleep(millisecs * 1000);
     -+}
     -+
     -+void reftable_clear_dir(const char *dirname)
     -+{
     -+	DIR *dir = opendir(dirname);
     -+	struct dirent *ent = NULL;
     -+	assert(dir);
     -+	while ((ent = readdir(dir)) != NULL) {
     -+		unlinkat(dirfd(dir), ent->d_name, 0);
     ++	for (i = 1; i < 11; i++) {
     ++		int res;
     ++		args.key = i;
     ++		res = binsearch(sz, &binsearch_func, &args);
     ++
     ++		if (res < sz) {
     ++			EXPECT(args.key < arr[res]);
     ++			if (res > 0) {
     ++				EXPECT(args.key >= arr[res - 1]);
     ++			}
     ++		} else {
     ++			EXPECT(args.key == 10 || args.key == 11);
     ++		}
      +	}
     -+	closedir(dir);
     -+	rmdir(dirname);
      +}
      +
     -+#else
     -+
     -+#include "../dir.h"
     -+
     -+void reftable_clear_dir(const char *dirname)
     ++int basics_test_main(int argc, const char *argv[])
      +{
     -+	struct strbuf path = STRBUF_INIT;
     -+	strbuf_addstr(&path, dirname);
     -+	remove_dir_recursively(&path, 0);
     -+	strbuf_release(&path);
     -+}
     -+
     -+#endif
     -+
     -+int hash_size(uint32_t id)
     -+{
     -+	switch (id) {
     -+	case 0:
     -+	case SHA1_ID:
     -+		return SHA1_SIZE;
     -+	case SHA256_ID:
     -+		return SHA256_SIZE;
     -+	}
     -+	abort();
     ++	test_binsearch();
     ++	return 0;
      +}
      
       ## reftable/compat.h (new) ##
     @@ reftable/publicbasics.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "reftable.h"
     ++#include "reftable-malloc.h"
      +
      +#include "basics.h"
      +#include "system.h"
      +
     -+const char *reftable_error_str(int err)
     -+{
     -+	static char buf[250];
     -+	switch (err) {
     -+	case REFTABLE_IO_ERROR:
     -+		return "I/O error";
     -+	case REFTABLE_FORMAT_ERROR:
     -+		return "corrupt reftable file";
     -+	case REFTABLE_NOT_EXIST_ERROR:
     -+		return "file does not exist";
     -+	case REFTABLE_LOCK_ERROR:
     -+		return "data is outdated";
     -+	case REFTABLE_API_ERROR:
     -+		return "misuse of the reftable API";
     -+	case REFTABLE_ZLIB_ERROR:
     -+		return "zlib failure";
     -+	case REFTABLE_NAME_CONFLICT:
     -+		return "file/directory conflict";
     -+	case REFTABLE_REFNAME_ERROR:
     -+		return "invalid refname";
     -+	case -1:
     -+		return "general error";
     -+	default:
     -+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
     -+		return buf;
     -+	}
     -+}
     -+
     -+int reftable_error_to_errno(int err)
     -+{
     -+	switch (err) {
     -+	case REFTABLE_IO_ERROR:
     -+		return EIO;
     -+	case REFTABLE_FORMAT_ERROR:
     -+		return EFAULT;
     -+	case REFTABLE_NOT_EXIST_ERROR:
     -+		return ENOENT;
     -+	case REFTABLE_LOCK_ERROR:
     -+		return EBUSY;
     -+	case REFTABLE_API_ERROR:
     -+		return EINVAL;
     -+	case REFTABLE_ZLIB_ERROR:
     -+		return EDOM;
     -+	default:
     -+		return ERANGE;
     -+	}
     -+}
     -+
      +static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
      +static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
      +static void (*reftable_free_ptr)(void *) = &free;
     @@ reftable/publicbasics.c (new)
      +	reftable_free_ptr = free;
      +}
      +
     -+int reftable_fd_write(void *arg, const void *data, size_t sz)
     ++int hash_size(uint32_t id)
      +{
     -+	int *fdp = (int *)arg;
     -+	return write(*fdp, data, sz);
     ++	switch (id) {
     ++	case 0:
     ++	case SHA1_ID:
     ++		return SHA1_SIZE;
     ++	case SHA256_ID:
     ++		return SHA256_SIZE;
     ++	}
     ++	abort();
      +}
      
     - ## reftable/reftable-tests.h (new) ##
     + ## reftable/reftable-malloc.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/reftable-tests.h (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#ifndef REFTABLE_TESTS_H
     -+#define REFTABLE_TESTS_H
     ++#ifndef REFTABLE_H
     ++#define REFTABLE_H
      +
     -+int block_test_main(int argc, const char **argv);
     -+int merged_test_main(int argc, const char **argv);
     -+int record_test_main(int argc, const char **argv);
     -+int refname_test_main(int argc, const char **argv);
     -+int reftable_test_main(int argc, const char **argv);
     -+int strbuf_test_main(int argc, const char **argv);
     -+int stack_test_main(int argc, const char **argv);
     -+int tree_test_main(int argc, const char **argv);
     -+int reftable_dump_main(int argc, char *const *argv);
     ++#include <stddef.h>
     ++
     ++/*
     ++  Overrides the functions to use for memory management.
     ++ */
     ++void reftable_set_alloc(void *(*malloc)(size_t),
     ++			void *(*realloc)(void *, size_t), void (*free)(void *));
      +
      +#endif
      
     - ## reftable/strbuf.c (new) ##
     + ## reftable/reftable-tests.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/strbuf.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "strbuf.h"
     -+
     -+#ifdef REFTABLE_STANDALONE
     -+
     -+void strbuf_init(struct strbuf *s, size_t alloc)
     -+{
     -+	struct strbuf empty = STRBUF_INIT;
     -+	*s = empty;
     -+}
     -+
     -+void strbuf_grow(struct strbuf *s, size_t extra)
     -+{
     -+	size_t newcap = s->len + extra + 1;
     -+	if (newcap > s->cap) {
     -+		s->buf = reftable_realloc(s->buf, newcap);
     -+		s->cap = newcap;
     -+	}
     -+}
     -+
     -+static void strbuf_resize(struct strbuf *s, int l)
     -+{
     -+	int zl = l + 1; /* one uint8_t for 0 termination. */
     -+	assert(s->canary == STRBUF_CANARY);
     -+	if (s->cap < zl) {
     -+		int c = s->cap * 2;
     -+		if (c < zl) {
     -+			c = zl;
     -+		}
     -+		s->cap = c;
     -+		s->buf = reftable_realloc(s->buf, s->cap);
     -+	}
     -+	s->len = l;
     -+	s->buf[l] = 0;
     -+}
     -+
     -+void strbuf_setlen(struct strbuf *s, size_t l)
     -+{
     -+	assert(s->cap >= l + 1);
     -+	s->len = l;
     -+	s->buf[l] = 0;
     -+}
     -+
     -+void strbuf_reset(struct strbuf *s)
     -+{
     -+	strbuf_resize(s, 0);
     -+}
     -+
     -+void strbuf_addstr(struct strbuf *d, const char *s)
     -+{
     -+	int l1 = d->len;
     -+	int l2 = strlen(s);
     -+	assert(d->canary == STRBUF_CANARY);
     -+
     -+	strbuf_resize(d, l2 + l1);
     -+	memcpy(d->buf + l1, s, l2);
     -+}
     -+
     -+void strbuf_addbuf(struct strbuf *s, struct strbuf *a)
     -+{
     -+	int end = s->len;
     -+	assert(s->canary == STRBUF_CANARY);
     -+	strbuf_resize(s, s->len + a->len);
     -+	memcpy(s->buf + end, a->buf, a->len);
     -+}
     -+
     -+char *strbuf_detach(struct strbuf *s, size_t *sz)
     -+{
     -+	char *p = NULL;
     -+	p = (char *)s->buf;
     -+	if (sz)
     -+		*sz = s->len;
     -+	s->buf = NULL;
     -+	s->cap = 0;
     -+	s->len = 0;
     -+	return p;
     -+}
     -+
     -+void strbuf_release(struct strbuf *s)
     -+{
     -+	assert(s->canary == STRBUF_CANARY);
     -+	s->cap = 0;
     -+	s->len = 0;
     -+	reftable_free(s->buf);
     -+	s->buf = NULL;
     -+}
     -+
     -+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b)
     -+{
     -+	int min = a->len < b->len ? a->len : b->len;
     -+	int res = memcmp(a->buf, b->buf, min);
     -+	assert(a->canary == STRBUF_CANARY);
     -+	assert(b->canary == STRBUF_CANARY);
     -+	if (res != 0)
     -+		return res;
     -+	if (a->len < b->len)
     -+		return -1;
     -+	else if (a->len > b->len)
     -+		return 1;
     -+	else
     -+		return 0;
     -+}
     ++#ifndef REFTABLE_TESTS_H
     ++#define REFTABLE_TESTS_H
      +
     -+int strbuf_add(struct strbuf *b, const void *data, size_t sz)
     -+{
     -+	assert(b->canary == STRBUF_CANARY);
     -+	strbuf_grow(b, sz);
     -+	memcpy(b->buf + b->len, data, sz);
     -+	b->len += sz;
     -+	b->buf[b->len] = 0;
     -+	return sz;
     -+}
     ++int basics_test_main(int argc, const char **argv);
     ++int block_test_main(int argc, const char **argv);
     ++int merged_test_main(int argc, const char **argv);
     ++int record_test_main(int argc, const char **argv);
     ++int refname_test_main(int argc, const char **argv);
     ++int reftable_test_main(int argc, const char **argv);
     ++int stack_test_main(int argc, const char **argv);
     ++int tree_test_main(int argc, const char **argv);
     ++int reftable_dump_main(int argc, char *const *argv);
      +
      +#endif
     -+
     -+int strbuf_add_void(void *b, const void *data, size_t sz)
     -+{
     -+	strbuf_add((struct strbuf *)b, data, sz);
     -+	return sz;
     -+}
     -+
     -+int common_prefix_size(struct strbuf *a, struct strbuf *b)
     -+{
     -+	int p = 0;
     -+	while (p < a->len && p < b->len) {
     -+		if (a->buf[p] != b->buf[p]) {
     -+			break;
     -+		}
     -+		p++;
     -+	}
     -+
     -+	return p;
     -+}
     -+
     -+struct strbuf reftable_empty_strbuf = STRBUF_INIT;
      
     - ## reftable/strbuf.h (new) ##
     + ## reftable/system.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/strbuf.h (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#ifndef SLICE_H
     -+#define SLICE_H
     -+
     -+#ifdef REFTABLE_STANDALONE
     -+
     -+#include "basics.h"
     -+
     -+/*
     -+  Provides a bounds-checked, growable byte ranges. To use, initialize as "strbuf
     -+  x = STRBUF_INIT;"
     -+ */
     -+struct strbuf {
     -+	size_t len;
     -+	size_t cap;
     -+	char *buf;
     -+
     -+	/* Used to enforce initialization with STRBUF_INIT */
     -+	uint8_t canary;
     -+};
     -+
     -+#define STRBUF_CANARY 0x42
     -+#define STRBUF_INIT                       \
     -+	{                                 \
     -+		0, 0, NULL, STRBUF_CANARY \
     -+	}
     -+
     -+void strbuf_addstr(struct strbuf *dest, const char *src);
     -+
     -+/* Deallocate and clear strbuf */
     -+void strbuf_release(struct strbuf *strbuf);
     -+
     -+/* Set strbuf to 0 length, but retain buffer. */
     -+void strbuf_reset(struct strbuf *strbuf);
     -+
     -+/* Initializes a strbuf. Accepts a strbuf with random garbage. */
     -+void strbuf_init(struct strbuf *strbuf, size_t alloc);
     -+
     -+/* Return `buf`, clearing out `s`. Optionally return len (not cap) in `sz`.  */
     -+char *strbuf_detach(struct strbuf *s, size_t *sz);
     -+
     -+/* Set length of the slace to `l`, but don't reallocated. */
     -+void strbuf_setlen(struct strbuf *s, size_t l);
     -+
     -+/* Ensure `l` bytes beyond current length are available */
     -+void strbuf_grow(struct strbuf *s, size_t l);
     -+
     -+/* Signed comparison */
     -+int strbuf_cmp(const struct strbuf *a, const struct strbuf *b);
     -+
     -+/* Append `data` to the `dest` strbuf.  */
     -+int strbuf_add(struct strbuf *dest, const void *data, size_t sz);
     ++#ifndef SYSTEM_H
     ++#define SYSTEM_H
      +
     -+/* Append `add` to `dest. */
     -+void strbuf_addbuf(struct strbuf *dest, struct strbuf *add);
     ++#include "git-compat-util.h"
     ++#include "strbuf.h"
      +
     -+#else
     ++#include <zlib.h>
      +
     -+#include "../git-compat-util.h"
     -+#include "../strbuf.h"
     ++struct strbuf;
     ++/* In git, this is declared in dir.h */
     ++int remove_dir_recursively(struct strbuf *path, int flags);
      +
     -+#endif
     -+
     -+extern struct strbuf reftable_empty_strbuf;
     ++#define SHA1_ID 0x73686131
     ++#define SHA256_ID 0x73323536
     ++#define SHA1_SIZE 20
     ++#define SHA256_SIZE 32
      +
     -+/* Like strbuf_add, but suitable for passing to reftable_new_writer
     ++/* This is uncompress2, which is only available in zlib as of 2017.
      + */
     -+int strbuf_add_void(void *b, const void *data, size_t sz);
     -+
     -+/* Find the longest shared prefix size of `a` and `b` */
     -+int common_prefix_size(struct strbuf *a, struct strbuf *b);
     ++int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
     ++			       const Bytef *source, uLong *sourceLen);
     ++int hash_size(uint32_t id);
      +
      +#endif
      
     - ## reftable/strbuf_test.c (new) ##
     + ## reftable/test_framework.c (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/strbuf_test.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "strbuf.h"
     -+
      +#include "system.h"
     ++#include "test_framework.h"
      +
      +#include "basics.h"
     -+#include "test_framework.h"
     -+#include "reftable-tests.h"
      +
     -+static void test_strbuf(void)
     ++void set_test_hash(uint8_t *p, int i)
      +{
     -+	struct strbuf s = STRBUF_INIT;
     -+	struct strbuf t = STRBUF_INIT;
     -+
     -+	strbuf_addstr(&s, "abc");
     -+	assert(0 == strcmp("abc", s.buf));
     -+
     -+	strbuf_addstr(&t, "pqr");
     -+	strbuf_addbuf(&s, &t);
     -+	assert(0 == strcmp("abcpqr", s.buf));
     -+
     -+	strbuf_release(&s);
     -+	strbuf_release(&t);
     ++	memset(p, (uint8_t)i, hash_size(SHA1_ID));
      +}
      +
     -+int strbuf_test_main(int argc, const char *argv[])
     ++int strbuf_add_void(void *b, const void *data, size_t sz)
      +{
     -+	add_test_case("test_strbuf", &test_strbuf);
     -+	return test_main(argc, argv);
     ++	strbuf_add((struct strbuf *)b, data, sz);
     ++	return sz;
      +}
      
     - ## reftable/system.h (new) ##
     + ## reftable/test_framework.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/system.h (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#ifndef SYSTEM_H
     -+#define SYSTEM_H
     -+
     -+#ifndef REFTABLE_STANDALONE
     -+
     -+#include "git-compat-util.h"
     -+#include "cache.h"
     -+#include <zlib.h>
     -+
     -+#else
     -+
     -+#include <assert.h>
     -+#include <errno.h>
     -+#include <fcntl.h>
     -+#include <inttypes.h>
     -+#include <stdint.h>
     -+#include <stdio.h>
     -+#include <stdlib.h>
     -+#include <string.h>
     -+#include <sys/stat.h>
     -+#include <sys/time.h>
     -+#include <sys/types.h>
     -+#include <unistd.h>
     -+#include <zlib.h>
     ++#ifndef TEST_FRAMEWORK_H
     ++#define TEST_FRAMEWORK_H
      +
     -+#include "compat.h"
     ++#include "system.h"
     ++#include "reftable-error.h"
     ++
     ++#define EXPECT_ERR(c)                                                  \
     ++	if (c != 0) {                                                  \
     ++		fflush(stderr);                                        \
     ++		fflush(stdout);                                        \
     ++		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
     ++			__FILE__, __LINE__, c, reftable_error_str(c)); \
     ++		abort();                                               \
     ++	}
      +
     -+#endif /* REFTABLE_STANDALONE */
     ++#define EXPECT_STREQ(a, b)                                               \
     ++	if (strcmp(a, b)) {                                              \
     ++		fflush(stderr);                                          \
     ++		fflush(stdout);                                          \
     ++		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
     ++			__LINE__, #a, a, #b, b);                         \
     ++		abort();                                                 \
     ++	}
      +
     -+void reftable_clear_dir(const char *dirname);
     ++#define EXPECT(c)                                                          \
     ++	if (!(c)) {                                                        \
     ++		fflush(stderr);                                            \
     ++		fflush(stdout);                                            \
     ++		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
     ++			__LINE__, #c);                                     \
     ++		abort();                                                   \
     ++	}
      +
     -+#define SHA1_ID 0x73686131
     -+#define SHA256_ID 0x73323536
     -+#define SHA1_SIZE 20
     -+#define SHA256_SIZE 32
     ++void set_test_hash(uint8_t *p, int i);
      +
     -+/* This is uncompress2, which is only available in zlib as of 2017.
     ++/* Like strbuf_add, but suitable for passing to reftable_new_writer
      + */
     -+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
     -+			       const Bytef *source, uLong *sourceLen);
     -+int hash_size(uint32_t id);
     ++int strbuf_add_void(void *b, const void *data, size_t sz);
      +
      +#endif
      
     @@ t/helper/test-reftable.c (new)
      +
      +int cmd__reftable(int argc, const char **argv)
      +{
     -+	strbuf_test_main(argc, argv);
     ++	basics_test_main(argc, argv);
     ++
      +	return 0;
      +}
      
       ## t/helper/test-tool.c ##
      @@ t/helper/test-tool.c: static struct test_cmd cmds[] = {
     + 	{ "path-utils", cmd__path_utils },
     + 	{ "pkt-line", cmd__pkt_line },
     + 	{ "prio-queue", cmd__prio_queue },
     +-	{ "proc-receive", cmd__proc_receive},
     ++	{ "proc-receive", cmd__proc_receive },
     + 	{ "progress", cmd__progress },
     + 	{ "reach", cmd__reach },
     + 	{ "read-cache", cmd__read_cache },
       	{ "read-graph", cmd__read_graph },
       	{ "read-midx", cmd__read_midx },
       	{ "ref-store", cmd__ref_store },
     @@ t/helper/test-tool.h: int cmd__read_cache(int argc, const char **argv);
       int cmd__regex(int argc, const char **argv);
       int cmd__repository(int argc, const char **argv);
       int cmd__revision_walking(int argc, const char **argv);
     +
     + ## t/t0032-reftable-unittest.sh (new) ##
     +@@
     ++#!/bin/sh
     ++#
     ++# Copyright (c) 2020 Google LLC
     ++#
     ++
     ++test_description='reftable unittests'
     ++
     ++. ./test-lib.sh
     ++
     ++test_expect_success 'unittests' '
     ++       test-tool reftable
     ++'
     ++
     ++test_done
  4:  b94c5f5c61 !  6:  f71e82b995 reftable: add a barebones unittest framework
     @@ Metadata
      Author: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Commit message ##
     -    reftable: add a barebones unittest framework
     +    reftable: add blocksource, an abstraction for random access reads
     +
     +    The reftable format is usually used with files for storage. However, we abstract
     +    away this using the blocksource data structure. This has two advantages:
     +
     +    * log blocks are zlib compressed, and handling them is simplified if we can
     +      discard byte segments from within the block layer.
     +
     +    * for unittests, it is useful to read and write in-memory. The blocksource
     +      allows us to abstract the data away from on-disk files.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
     - ## reftable/test_framework.c (new) ##
     + ## Makefile ##
     +@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
     + 
     + REFTABLE_OBJS += reftable/basics.o
     + REFTABLE_OBJS += reftable/error.o
     ++REFTABLE_OBJS += reftable/blocksource.o
     + REFTABLE_OBJS += reftable/publicbasics.o
     + 
     + REFTABLE_TEST_OBJS += reftable/test_framework.o
     +
     + ## reftable/blocksource.c (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/test_framework.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "test_framework.h"
     -+
      +#include "system.h"
     ++
      +#include "basics.h"
     ++#include "blocksource.h"
     ++#include "reftable-blocksource.h"
     ++#include "reftable-error.h"
      +
     -+static struct test_case **test_cases;
     -+static int test_case_len;
     -+static int test_case_cap;
     ++static void strbuf_return_block(void *b, struct reftable_block *dest)
     ++{
     ++	memset(dest->data, 0xff, dest->len);
     ++	reftable_free(dest->data);
     ++}
      +
     -+static struct test_case *new_test_case(const char *name, void (*testfunc)(void))
     ++static void strbuf_close(void *b)
      +{
     -+	struct test_case *tc = reftable_malloc(sizeof(struct test_case));
     -+	tc->name = name;
     -+	tc->testfunc = testfunc;
     -+	return tc;
      +}
      +
     -+struct test_case *add_test_case(const char *name, void (*testfunc)(void))
     ++static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
     ++			     uint32_t size)
      +{
     -+	struct test_case *tc = new_test_case(name, testfunc);
     -+	if (test_case_len == test_case_cap) {
     -+		test_case_cap = 2 * test_case_cap + 1;
     -+		test_cases = reftable_realloc(
     -+			test_cases, sizeof(struct test_case) * test_case_cap);
     -+	}
     ++	struct strbuf *b = (struct strbuf *)v;
     ++	assert(off + size <= b->len);
     ++	dest->data = reftable_calloc(size);
     ++	memcpy(dest->data, b->buf + off, size);
     ++	dest->len = size;
     ++	return size;
     ++}
      +
     -+	test_cases[test_case_len++] = tc;
     -+	return tc;
     ++static uint64_t strbuf_size(void *b)
     ++{
     ++	return ((struct strbuf *)b)->len;
      +}
      +
     -+int test_main(int argc, const char *argv[])
     ++static struct reftable_block_source_vtable strbuf_vtable = {
     ++	.size = &strbuf_size,
     ++	.read_block = &strbuf_read_block,
     ++	.return_block = &strbuf_return_block,
     ++	.close = &strbuf_close,
     ++};
     ++
     ++void block_source_from_strbuf(struct reftable_block_source *bs,
     ++			      struct strbuf *buf)
      +{
     -+	const char *filter = NULL;
     -+	int i = 0;
     -+	if (argc > 1) {
     -+		filter = argv[1];
     -+	}
     ++	assert(bs->ops == NULL);
     ++	bs->ops = &strbuf_vtable;
     ++	bs->arg = buf;
     ++}
      +
     -+	for (i = 0; i < test_case_len; i++) {
     -+		const char *name = test_cases[i]->name;
     -+		if (filter == NULL || strstr(name, filter) != NULL) {
     -+			printf("case %s\n", name);
     -+			test_cases[i]->testfunc();
     -+		} else {
     -+			printf("skip %s\n", name);
     -+		}
     ++static void malloc_return_block(void *b, struct reftable_block *dest)
     ++{
     ++	memset(dest->data, 0xff, dest->len);
     ++	reftable_free(dest->data);
     ++}
     ++
     ++static struct reftable_block_source_vtable malloc_vtable = {
     ++	.return_block = &malloc_return_block,
     ++};
     ++
     ++static struct reftable_block_source malloc_block_source_instance = {
     ++	.ops = &malloc_vtable,
     ++};
      +
     -+		reftable_free(test_cases[i]);
     ++struct reftable_block_source malloc_block_source(void)
     ++{
     ++	return malloc_block_source_instance;
     ++}
     ++
     ++struct file_block_source {
     ++	int fd;
     ++	uint64_t size;
     ++};
     ++
     ++static uint64_t file_size(void *b)
     ++{
     ++	return ((struct file_block_source *)b)->size;
     ++}
     ++
     ++static void file_return_block(void *b, struct reftable_block *dest)
     ++{
     ++	memset(dest->data, 0xff, dest->len);
     ++	reftable_free(dest->data);
     ++}
     ++
     ++static void file_close(void *b)
     ++{
     ++	int fd = ((struct file_block_source *)b)->fd;
     ++	if (fd > 0) {
     ++		close(fd);
     ++		((struct file_block_source *)b)->fd = 0;
      +	}
     -+	reftable_free(test_cases);
     -+	test_cases = NULL;
     -+	test_case_len = 0;
     -+	test_case_cap = 0;
     -+	return 0;
     ++
     ++	reftable_free(b);
      +}
      +
     -+void set_test_hash(uint8_t *p, int i)
     ++static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
     ++			   uint32_t size)
      +{
     -+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
     ++	struct file_block_source *b = (struct file_block_source *)v;
     ++	assert(off + size <= b->size);
     ++	dest->data = reftable_malloc(size);
     ++	if (pread(b->fd, dest->data, size, off) != size)
     ++		return -1;
     ++	dest->len = size;
     ++	return size;
     ++}
     ++
     ++static struct reftable_block_source_vtable file_vtable = {
     ++	.size = &file_size,
     ++	.read_block = &file_read_block,
     ++	.return_block = &file_return_block,
     ++	.close = &file_close,
     ++};
     ++
     ++int reftable_block_source_from_file(struct reftable_block_source *bs,
     ++				    const char *name)
     ++{
     ++	struct stat st = { 0 };
     ++	int err = 0;
     ++	int fd = open(name, O_RDONLY);
     ++	struct file_block_source *p = NULL;
     ++	if (fd < 0) {
     ++		if (errno == ENOENT) {
     ++			return REFTABLE_NOT_EXIST_ERROR;
     ++		}
     ++		return -1;
     ++	}
     ++
     ++	err = fstat(fd, &st);
     ++	if (err < 0)
     ++		return -1;
     ++
     ++	p = reftable_calloc(sizeof(struct file_block_source));
     ++	p->size = st.st_size;
     ++	p->fd = fd;
     ++
     ++	assert(bs->ops == NULL);
     ++	bs->ops = &file_vtable;
     ++	bs->arg = p;
     ++	return 0;
      +}
      
     - ## reftable/test_framework.h (new) ##
     + ## reftable/blocksource.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/test_framework.h (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#ifndef TEST_FRAMEWORK_H
     -+#define TEST_FRAMEWORK_H
     ++#ifndef BLOCKSOURCE_H
     ++#define BLOCKSOURCE_H
      +
      +#include "system.h"
      +
     -+#ifdef NDEBUG
     -+#undef NDEBUG
     -+#endif
     ++struct reftable_block_source;
     ++
     ++/* Create an in-memory block source for reading reftables */
     ++void block_source_from_strbuf(struct reftable_block_source *bs,
     ++			      struct strbuf *buf);
     ++
     ++struct reftable_block_source malloc_block_source(void);
      +
     -+#ifdef assert
     -+#undef assert
      +#endif
     +
     + ## reftable/reftable-blocksource.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
      +
     -+#define assert_err(c)                                                  \
     -+	if (c != 0) {                                                  \
     -+		fflush(stderr);                                        \
     -+		fflush(stdout);                                        \
     -+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
     -+			__FILE__, __LINE__, c, reftable_error_str(c)); \
     -+		abort();                                               \
     -+	}
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
      +
     -+#define assert_streq(a, b)                                               \
     -+	if (strcmp(a, b)) {                                              \
     -+		fflush(stderr);                                          \
     -+		fflush(stdout);                                          \
     -+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
     -+			__LINE__, #a, a, #b, b);                         \
     -+		abort();                                                 \
     -+	}
     ++#ifndef REFTABLE_BLOCKSOURCE_H
     ++#define REFTABLE_BLOCKSOURCE_H
      +
     -+#define assert(c)                                                          \
     -+	if (!(c)) {                                                        \
     -+		fflush(stderr);                                            \
     -+		fflush(stdout);                                            \
     -+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
     -+			__LINE__, #c);                                     \
     -+		abort();                                                   \
     -+	}
     ++#include <stdint.h>
     ++
     ++/* block_source is a generic wrapper for a seekable readable file.
     ++ */
     ++struct reftable_block_source {
     ++	struct reftable_block_source_vtable *ops;
     ++	void *arg;
     ++};
      +
     -+struct test_case {
     -+	const char *name;
     -+	void (*testfunc)(void);
     ++/* a contiguous segment of bytes. It keeps track of its generating block_source
     ++   so it can return itself into the pool.
     ++*/
     ++struct reftable_block {
     ++	uint8_t *data;
     ++	int len;
     ++	struct reftable_block_source source;
      +};
      +
     -+struct test_case *add_test_case(const char *name, void (*testfunc)(void));
     -+int test_main(int argc, const char *argv[]);
     ++/* block_source_vtable are the operations that make up block_source */
     ++struct reftable_block_source_vtable {
     ++	/* returns the size of a block source */
     ++	uint64_t (*size)(void *source);
     ++
     ++	/* reads a segment from the block source. It is an error to read
     ++	   beyond the end of the block */
     ++	int (*read_block)(void *source, struct reftable_block *dest,
     ++			  uint64_t off, uint32_t size);
     ++	/* mark the block as read; may return the data back to malloc */
     ++	void (*return_block)(void *source, struct reftable_block *blockp);
     ++
     ++	/* release all resources associated with the block source */
     ++	void (*close)(void *source);
     ++};
      +
     -+void set_test_hash(uint8_t *p, int i);
     ++/* opens a file on the file system as a block_source */
     ++int reftable_block_source_from_file(struct reftable_block_source *block_src,
     ++				    const char *name);
      +
      +#endif
  6:  8eb944ea9b !  7:  bbe2ed7173 reftable: (de)serialization for the polymorphic record type.
     @@ Metadata
       ## Commit message ##
          reftable: (de)serialization for the polymorphic record type.
      
     +    The reftable format is structured as a sequence of blocks, and each block
     +    contains a sequence of prefix-compressed key-value records. There are 4 types of
     +    records, and they have similarities in how they must be handled. This is
     +    achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
     +    index and object records.
     +
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
      @@ Makefile: REFTABLE_OBJS += reftable/basics.o
     + REFTABLE_OBJS += reftable/error.o
       REFTABLE_OBJS += reftable/blocksource.o
       REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/compat.o
      +REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/strbuf.o
       
      +REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/strbuf_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
     + REFTABLE_TEST_OBJS += reftable/basics_test.o
       
      
       ## reftable/constants.h (new) ##
     @@ reftable/record.c (new)
      +
      +#include "system.h"
      +#include "constants.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
      +#include "basics.h"
      +
      +int get_var_int(uint64_t *dest, struct string_view *in)
     @@ reftable/record.c (new)
      +
      +	/* This is simple and correct, but we could probably reuse the hash
      +	   fields. */
     -+	reftable_ref_record_clear(ref);
     ++	reftable_ref_record_release(ref);
      +	if (src->refname != NULL) {
      +		ref->refname = xstrdup(src->refname);
      +	}
     @@ reftable/record.c (new)
      +	printf("}\n");
      +}
      +
     -+static void reftable_ref_record_clear_void(void *rec)
     ++static void reftable_ref_record_release_void(void *rec)
      +{
     -+	reftable_ref_record_clear((struct reftable_ref_record *)rec);
     ++	reftable_ref_record_release((struct reftable_ref_record *)rec);
      +}
      +
     -+void reftable_ref_record_clear(struct reftable_ref_record *ref)
     ++void reftable_ref_record_release(struct reftable_ref_record *ref)
      +{
      +	reftable_free(ref->refname);
      +	reftable_free(ref->target);
     @@ reftable/record.c (new)
      +	.val_type = &reftable_ref_record_val_type,
      +	.encode = &reftable_ref_record_encode,
      +	.decode = &reftable_ref_record_decode,
     -+	.clear = &reftable_ref_record_clear_void,
     ++	.release = &reftable_ref_record_release_void,
      +	.is_deletion = &reftable_ref_record_is_deletion_void,
      +};
      +
     @@ reftable/record.c (new)
      +	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
      +}
      +
     -+static void reftable_obj_record_clear(void *rec)
     ++static void reftable_obj_record_release(void *rec)
      +{
      +	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
      +	FREE_AND_NULL(obj->hash_prefix);
     @@ reftable/record.c (new)
      +		(const struct reftable_obj_record *)src_rec;
      +	int olen;
      +
     -+	reftable_obj_record_clear(obj);
     ++	reftable_obj_record_release(obj);
      +	*obj = *src;
      +	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
      +	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
     @@ reftable/record.c (new)
      +	.val_type = &reftable_obj_record_val_type,
      +	.encode = &reftable_obj_record_encode,
      +	.decode = &reftable_obj_record_decode,
     -+	.clear = &reftable_obj_record_clear,
     ++	.release = &reftable_obj_record_release,
      +	.is_deletion = not_a_deletion,
      +};
      +
     @@ reftable/record.c (new)
      +	const struct reftable_log_record *src =
      +		(const struct reftable_log_record *)src_rec;
      +
     -+	reftable_log_record_clear(dst);
     ++	reftable_log_record_release(dst);
      +	*dst = *src;
      +	if (dst->refname != NULL) {
      +		dst->refname = xstrdup(dst->refname);
     @@ reftable/record.c (new)
      +	}
      +}
      +
     -+static void reftable_log_record_clear_void(void *rec)
     ++static void reftable_log_record_release_void(void *rec)
      +{
      +	struct reftable_log_record *r = (struct reftable_log_record *)rec;
     -+	reftable_log_record_clear(r);
     ++	reftable_log_record_release(r);
      +}
      +
     -+void reftable_log_record_clear(struct reftable_log_record *r)
     ++void reftable_log_record_release(struct reftable_log_record *r)
      +{
      +	reftable_free(r->refname);
      +	reftable_free(r->new_hash);
     @@ reftable/record.c (new)
      +	.val_type = &reftable_log_record_val_type,
      +	.encode = &reftable_log_record_encode,
      +	.decode = &reftable_log_record_decode,
     -+	.clear = &reftable_log_record_clear_void,
     ++	.release = &reftable_log_record_release_void,
      +	.is_deletion = &reftable_log_record_is_deletion_void,
      +};
      +
     @@ reftable/record.c (new)
      +
      +void reftable_record_destroy(struct reftable_record *rec)
      +{
     -+	reftable_record_clear(rec);
     ++	reftable_record_release(rec);
      +	reftable_free(reftable_record_yield(rec));
      +}
      +
     @@ reftable/record.c (new)
      +	dst->offset = src->offset;
      +}
      +
     -+static void reftable_index_record_clear(void *rec)
     ++static void reftable_index_record_release(void *rec)
      +{
      +	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
      +	strbuf_release(&idx->last_key);
     @@ reftable/record.c (new)
      +	.val_type = &reftable_index_record_val_type,
      +	.encode = &reftable_index_record_encode,
      +	.decode = &reftable_index_record_decode,
     -+	.clear = &reftable_index_record_clear,
     ++	.release = &reftable_index_record_release,
      +	.is_deletion = &not_a_deletion,
      +};
      +
     @@ reftable/record.c (new)
      +	return rec->ops->decode(rec->data, key, extra, src, hash_size);
      +}
      +
     -+void reftable_record_clear(struct reftable_record *rec)
     ++void reftable_record_release(struct reftable_record *rec)
      +{
     -+	rec->ops->clear(rec->data);
     ++	rec->ops->release(rec->data);
      +}
      +
      +int reftable_record_is_deletion(struct reftable_record *rec)
     @@ reftable/record.h (new)
      +#ifndef RECORD_H
      +#define RECORD_H
      +
     -+#include "reftable.h"
     -+#include "strbuf.h"
      +#include "system.h"
      +
     ++#include <stdint.h>
     ++
     ++#include "reftable-record.h"
     ++
      +/*
      +  A substring of existing string data. This structure takes no responsibility
      +  for the lifetime of the data it points to.
     @@ reftable/record.h (new)
      +		      struct string_view src, int hash_size);
      +
      +	/* deallocate and null the record. */
     -+	void (*clear)(void *rec);
     ++	void (*release)(void *rec);
      +
      +	/* is this a tombstone? */
      +	int (*is_deletion)(const void *rec);
     @@ reftable/record.h (new)
      +int reftable_record_is_deletion(struct reftable_record *rec);
      +
      +/* zeroes out the embedded record */
     -+void reftable_record_clear(struct reftable_record *rec);
     ++void reftable_record_release(struct reftable_record *rec);
      +
      +/* clear and deallocate embedded record, and zero `rec`. */
      +void reftable_record_destroy(struct reftable_record *rec);
     @@ reftable/record_test.c (new)
      +#include "system.h"
      +#include "basics.h"
      +#include "constants.h"
     -+#include "reftable.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
      +
     @@ reftable/record_test.c (new)
      +	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
      +	switch (reftable_record_type(&copy)) {
      +	case BLOCK_TYPE_REF:
     -+		assert(reftable_ref_record_equal(reftable_record_as_ref(&copy),
     ++		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
      +						 reftable_record_as_ref(rec),
      +						 SHA1_SIZE));
      +		break;
      +	case BLOCK_TYPE_LOG:
     -+		assert(reftable_log_record_equal(reftable_record_as_log(&copy),
     ++		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
      +						 reftable_record_as_log(rec),
      +						 SHA1_SIZE));
      +		break;
     @@ reftable/record_test.c (new)
      +		int n = put_var_int(&out, in);
      +		uint64_t got = 0;
      +
     -+		assert(n > 0);
     ++		EXPECT(n > 0);
      +		out.len = n;
      +		n = get_var_int(&got, &out);
     -+		assert(n > 0);
     ++		EXPECT(n > 0);
      +
     -+		assert(got == in);
     ++		EXPECT(got == in);
      +	}
      +}
      +
     @@ reftable/record_test.c (new)
      +		struct strbuf b = STRBUF_INIT;
      +		strbuf_addstr(&a, cases[i].a);
      +		strbuf_addstr(&b, cases[i].b);
     -+		assert(common_prefix_size(&a, &b) == cases[i].want);
     ++		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
      +
      +		strbuf_release(&a);
      +		strbuf_release(&b);
     @@ reftable/record_test.c (new)
      +		reftable_record_from_ref(&rec, &in);
      +		test_copy(&rec);
      +
     -+		assert(reftable_record_val_type(&rec) == i);
     ++		EXPECT(reftable_record_val_type(&rec) == i);
      +
      +		reftable_record_key(&rec, &key);
      +		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     -+		assert(n > 0);
     ++		EXPECT(n > 0);
      +
      +		/* decode into a non-zero reftable_record to test for leaks. */
      +
      +		reftable_record_from_ref(&rec_out, &out);
      +		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
     -+		assert(n == m);
     ++		EXPECT(n == m);
      +
     -+		assert((out.value != NULL) == (in.value != NULL));
     -+		assert((out.target_value != NULL) == (in.target_value != NULL));
     -+		assert((out.target != NULL) == (in.target != NULL));
     -+		reftable_record_clear(&rec_out);
     ++		EXPECT((out.value != NULL) == (in.value != NULL));
     ++		EXPECT((out.target_value != NULL) == (in.target_value != NULL));
     ++		EXPECT((out.target != NULL) == (in.target != NULL));
     ++		reftable_record_release(&rec_out);
      +
      +		strbuf_release(&key);
     -+		reftable_ref_record_clear(&in);
     ++		reftable_ref_record_release(&in);
      +	}
      +}
      +
     @@ reftable/record_test.c (new)
      +		}
      +	};
      +
     -+	assert(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
     ++	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
      +	in[1].update_index = in[0].update_index;
     -+	assert(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
     -+	reftable_log_record_clear(&in[0]);
     -+	reftable_log_record_clear(&in[1]);
     ++	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
     ++	reftable_log_record_release(&in[0]);
     ++	reftable_log_record_release(&in[1]);
      +}
      +
      +static void test_reftable_log_record_roundtrip(void)
     @@ reftable/record_test.c (new)
      +		reftable_record_key(&rec, &key);
      +
      +		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     -+		assert(n >= 0);
     ++		EXPECT(n >= 0);
      +		reftable_record_from_log(&rec_out, &out);
      +		valtype = reftable_record_val_type(&rec);
      +		m = reftable_record_decode(&rec_out, key, valtype, dest,
      +					   SHA1_SIZE);
     -+		assert(n == m);
     ++		EXPECT(n == m);
      +
     -+		assert(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
     -+		reftable_log_record_clear(&in[i]);
     ++		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
     ++		reftable_log_record_release(&in[i]);
      +		strbuf_release(&key);
     -+		reftable_record_clear(&rec_out);
     ++		reftable_record_release(&rec_out);
      +	}
      +}
      +
     @@ reftable/record_test.c (new)
      +	uint32_t out;
      +	put_be24(dest, in);
      +	out = get_be24(dest);
     -+	assert(in == out);
     ++	EXPECT(in == out);
      +}
      +
      +static void test_key_roundtrip(void)
     @@ reftable/record_test.c (new)
      +	strbuf_addstr(&key, "refs/tags/bla");
      +	extra = 6;
      +	n = reftable_encode_key(&restart, dest, last_key, key, extra);
     -+	assert(!restart);
     -+	assert(n > 0);
     ++	EXPECT(!restart);
     ++	EXPECT(n > 0);
      +
      +	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
     -+	assert(n == m);
     -+	assert(0 == strbuf_cmp(&key, &roundtrip));
     -+	assert(rt_extra == extra);
     ++	EXPECT(n == m);
     ++	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
     ++	EXPECT(rt_extra == extra);
      +
      +	strbuf_release(&last_key);
      +	strbuf_release(&key);
     @@ reftable/record_test.c (new)
      +		test_copy(&rec);
      +		reftable_record_key(&rec, &key);
      +		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     -+		assert(n > 0);
     ++		EXPECT(n > 0);
      +		extra = reftable_record_val_type(&rec);
      +		reftable_record_from_obj(&rec_out, &out);
      +		m = reftable_record_decode(&rec_out, key, extra, dest,
      +					   SHA1_SIZE);
     -+		assert(n == m);
     ++		EXPECT(n == m);
      +
     -+		assert(in.hash_prefix_len == out.hash_prefix_len);
     -+		assert(in.offset_len == out.offset_len);
     ++		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
     ++		EXPECT(in.offset_len == out.offset_len);
      +
     -+		assert(!memcmp(in.hash_prefix, out.hash_prefix,
     ++		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
      +			       in.hash_prefix_len));
     -+		assert(0 == memcmp(in.offsets, out.offsets,
     ++		EXPECT(0 == memcmp(in.offsets, out.offsets,
      +				   sizeof(uint64_t) * in.offset_len));
      +		strbuf_release(&key);
     -+		reftable_record_clear(&rec_out);
     ++		reftable_record_release(&rec_out);
      +	}
      +}
      +
     @@ reftable/record_test.c (new)
      +	reftable_record_key(&rec, &key);
      +	test_copy(&rec);
      +
     -+	assert(0 == strbuf_cmp(&key, &in.last_key));
     ++	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
      +	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     -+	assert(n > 0);
     ++	EXPECT(n > 0);
      +
      +	extra = reftable_record_val_type(&rec);
      +	reftable_record_from_index(&out_rec, &out);
      +	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
     -+	assert(m == n);
     ++	EXPECT(m == n);
      +
     -+	assert(in.offset == out.offset);
     ++	EXPECT(in.offset == out.offset);
      +
     -+	reftable_record_clear(&out_rec);
     ++	reftable_record_release(&out_rec);
      +	strbuf_release(&key);
      +	strbuf_release(&in.last_key);
      +}
      +
      +int record_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_reftable_log_record_equal",
     -+		      &test_reftable_log_record_equal);
     -+	add_test_case("test_reftable_log_record_roundtrip",
     -+		      &test_reftable_log_record_roundtrip);
     -+	add_test_case("test_reftable_ref_record_roundtrip",
     -+		      &test_reftable_ref_record_roundtrip);
     -+	add_test_case("test_varint_roundtrip", &test_varint_roundtrip);
     -+	add_test_case("test_key_roundtrip", &test_key_roundtrip);
     -+	add_test_case("test_common_prefix", &test_common_prefix);
     -+	add_test_case("test_reftable_obj_record_roundtrip",
     -+		      &test_reftable_obj_record_roundtrip);
     -+	add_test_case("test_reftable_index_record_roundtrip",
     -+		      &test_reftable_index_record_roundtrip);
     -+	add_test_case("test_u24_roundtrip", &test_u24_roundtrip);
     -+	return test_main(argc, argv);
     ++	test_reftable_log_record_equal();
     ++	test_reftable_log_record_roundtrip();
     ++	test_reftable_ref_record_roundtrip();
     ++	test_varint_roundtrip();
     ++	test_key_roundtrip();
     ++	test_common_prefix();
     ++	test_reftable_obj_record_roundtrip();
     ++	test_reftable_index_record_roundtrip();
     ++	test_u24_roundtrip();
     ++	return 0;
      +}
      
     + ## reftable/reftable-record.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#ifndef REFTABLE_RECORD_H
     ++#define REFTABLE_RECORD_H
     ++
     ++#include <stdint.h>
     ++
     ++/*
     ++ Basic data types
     ++
     ++ Reftables store the state of each ref in struct reftable_ref_record, and they
     ++ store a sequence of reflog updates in struct reftable_log_record.
     ++*/
     ++
     ++/* reftable_ref_record holds a ref database entry target_value */
     ++struct reftable_ref_record {
     ++	char *refname; /* Name of the ref, malloced. */
     ++	uint64_t update_index; /* Logical timestamp at which this value is
     ++				  written */
     ++	uint8_t *value; /* SHA1, or NULL. malloced. */
     ++	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
     ++	char *target; /* symref, or NULL. malloced. */
     ++};
     ++
     ++/* returns whether 'ref' represents a deletion */
     ++int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
     ++
     ++/* prints a reftable_ref_record onto stdout */
     ++void reftable_ref_record_print(struct reftable_ref_record *ref,
     ++			       uint32_t hash_id);
     ++
     ++/* frees and nulls all pointer values. */
     ++void reftable_ref_record_release(struct reftable_ref_record *ref);
     ++
     ++/* returns whether two reftable_ref_records are the same */
     ++int reftable_ref_record_equal(struct reftable_ref_record *a,
     ++			      struct reftable_ref_record *b, int hash_size);
     ++
     ++/* reftable_log_record holds a reflog entry */
     ++struct reftable_log_record {
     ++	char *refname;
     ++	uint64_t update_index; /* logical timestamp of a transactional update.
     ++				*/
     ++	uint8_t *new_hash;
     ++	uint8_t *old_hash;
     ++	char *name;
     ++	char *email;
     ++	uint64_t time;
     ++	int16_t tz_offset;
     ++	char *message;
     ++};
     ++
     ++/* returns whether 'ref' represents the deletion of a log record. */
     ++int reftable_log_record_is_deletion(const struct reftable_log_record *log);
     ++
     ++/* frees and nulls all pointer values. */
     ++void reftable_log_record_release(struct reftable_log_record *log);
     ++
     ++/* returns whether two records are equal. Useful for testing. */
     ++int reftable_log_record_equal(struct reftable_log_record *a,
     ++			      struct reftable_log_record *b, int hash_size);
     ++
     ++/* dumps a reftable_log_record on stdout, for debugging/testing. */
     ++void reftable_log_record_print(struct reftable_log_record *log,
     ++			       uint32_t hash_id);
     ++
     ++#endif
     +
       ## t/helper/test-reftable.c ##
      @@
     - 
       int cmd__reftable(int argc, const char **argv)
       {
     + 	basics_test_main(argc, argv);
     +-
      +	record_test_main(argc, argv);
     - 	strbuf_test_main(argc, argv);
       	return 0;
       }
  7:  757dd30fe2 !  8:  a358b052d5 reftable: reading/writing blocks
     @@ Metadata
       ## Commit message ##
          reftable: reading/writing blocks
      
     +    The reftable format is structured as a sequence of block. Within a block,
     +    records are prefix compressed, with an index of offsets for fully expand keys to
     +    enable binary search within blocks.
     +
     +    This commit provides the logic to read and write these blocks.
     +
          Includes a code snippet copied from zlib
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: XDIFF_OBJS += xdiff/xprepare.o
     - XDIFF_OBJS += xdiff/xutils.o
     +@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
       
       REFTABLE_OBJS += reftable/basics.o
     + REFTABLE_OBJS += reftable/error.o
      +REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/blocksource.o
       REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/compat.o
       REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/strbuf.o
      +REFTABLE_OBJS += reftable/zlib-compat.o
       
      +REFTABLE_TEST_OBJS += reftable/block_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/strbuf_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
     + REFTABLE_TEST_OBJS += reftable/basics_test.o
      
       ## reftable/.gitattributes (new) ##
      @@
     @@ reftable/block.c (new)
      +#include "blocksource.h"
      +#include "constants.h"
      +#include "record.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
      +#include "system.h"
      +#include "zlib.h"
      +
     @@ reftable/block.c (new)
      +	return -1;
      +}
      +
     -+
      +int block_writer_finish(struct block_writer *w)
      +{
      +	int i = 0;
     @@ reftable/block.c (new)
      +	return err;
      +}
      +
     -+void block_writer_clear(struct block_writer *bw)
     ++void block_writer_release(struct block_writer *bw)
      +{
      +	FREE_AND_NULL(bw->restarts);
      +	strbuf_release(&bw->last_key);
     @@ reftable/block.h (new)
      +
      +#include "basics.h"
      +#include "record.h"
     -+#include "reftable.h"
     ++#include "reftable-blocksource.h"
      +
      +/*
      +  Writes reftable blocks. The block_writer is reused across blocks to minimize
     @@ reftable/block.h (new)
      +int block_writer_finish(struct block_writer *w);
      +
      +/* clears out internally allocated block_writer members. */
     -+void block_writer_clear(struct block_writer *bw);
     ++void block_writer_release(struct block_writer *bw);
      +
      +/* Read a block. */
      +struct block_reader {
     @@ reftable/block_test.c (new)
      +#include "basics.h"
      +#include "constants.h"
      +#include "record.h"
     -+#include "reftable.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
      +
     -+struct binsearch_args {
     -+	int key;
     -+	int *arr;
     -+};
     -+
     -+static int binsearch_func(size_t i, void *void_args)
     -+{
     -+	struct binsearch_args *args = (struct binsearch_args *)void_args;
     -+
     -+	return args->key < args->arr[i];
     -+}
     -+
     -+static void test_binsearch(void)
     -+{
     -+	int arr[] = { 2, 4, 6, 8, 10 };
     -+	size_t sz = ARRAY_SIZE(arr);
     -+	struct binsearch_args args = {
     -+		.arr = arr,
     -+	};
     -+
     -+	int i = 0;
     -+	for (i = 1; i < 11; i++) {
     -+		int res;
     -+		args.key = i;
     -+		res = binsearch(sz, &binsearch_func, &args);
     -+
     -+		if (res < sz) {
     -+			assert(args.key < arr[res]);
     -+			if (res > 0) {
     -+				assert(args.key >= arr[res - 1]);
     -+			}
     -+		} else {
     -+			assert(args.key == 10 || args.key == 11);
     -+		}
     -+	}
     -+}
     -+
      +static void test_block_read_write(void)
      +{
      +	const int header_off = 21; /* random */
     @@ reftable/block_test.c (new)
      +		n = block_writer_add(&bw, &rec);
      +		ref.refname = NULL;
      +		ref.value = NULL;
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +	}
      +
      +	n = block_writer_finish(&bw);
     -+	assert(n > 0);
     ++	EXPECT(n > 0);
      +
     -+	block_writer_clear(&bw);
     ++	block_writer_release(&bw);
      +
      +	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
      +
     @@ reftable/block_test.c (new)
      +
      +	while (1) {
      +		int r = block_iter_next(&it, &rec);
     -+		assert(r >= 0);
     ++		EXPECT(r >= 0);
      +		if (r > 0) {
      +			break;
      +		}
     -+		assert_streq(names[j], ref.refname);
     ++		EXPECT_STREQ(names[j], ref.refname);
      +		j++;
      +	}
      +
     -+	reftable_record_clear(&rec);
     ++	reftable_record_release(&rec);
      +	block_iter_close(&it);
      +
      +	for (i = 0; i < N; i++) {
     @@ reftable/block_test.c (new)
      +		strbuf_addstr(&want, names[i]);
      +
      +		n = block_reader_seek(&br, &it, &want);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +
      +		n = block_iter_next(&it, &rec);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +
     -+		assert_streq(names[i], ref.refname);
     ++		EXPECT_STREQ(names[i], ref.refname);
      +
      +		want.len--;
      +		n = block_reader_seek(&br, &it, &want);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +
      +		n = block_iter_next(&it, &rec);
     -+		assert(n == 0);
     -+		assert_streq(names[10 * (i / 10)], ref.refname);
     ++		EXPECT(n == 0);
     ++		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
      +
      +		block_iter_close(&it);
      +	}
      +
     -+	reftable_record_clear(&rec);
     ++	reftable_record_release(&rec);
      +	reftable_block_done(&br.block);
      +	strbuf_release(&want);
      +	for (i = 0; i < N; i++) {
     @@ reftable/block_test.c (new)
      +
      +int block_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("binsearch", &test_binsearch);
     -+	add_test_case("block_read_write", &test_block_read_write);
     -+	return test_main(argc, argv);
     ++	test_block_read_write();
     ++	return 0;
      +}
      
       ## reftable/zlib-compat.c (new) ##
     @@ reftable/zlib-compat.c (new)
      
       ## t/helper/test-reftable.c ##
      @@
     - 
       int cmd__reftable(int argc, const char **argv)
       {
     + 	basics_test_main(argc, argv);
      +	block_test_main(argc, argv);
       	record_test_main(argc, argv);
     - 	strbuf_test_main(argc, argv);
       	return 0;
     + }
  8:  e30a7e0281 !  9:  24afac9c91 reftable: a generic binary tree implementation
     @@ Metadata
       ## Commit message ##
          reftable: a generic binary tree implementation
      
     -    This is necessary for building a OID => ref map on write
     +    The reftable format includes support for an (OID => ref) map. This map can speed
     +    up visibility and reachability checks. In particular, various operations along
     +    the fetch/push path within Gerrit have ben sped up by using this structure.
     +
     +    The map is constructed with help of a binary tree. Object IDs are hashes, so
     +    they are uniformly distributed. Hence, the tree does not attempt forced
     +    rebalancing.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/block.o
     + REFTABLE_OBJS += reftable/blocksource.o
     + REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/strbuf.o
      +REFTABLE_OBJS += reftable/tree.o
       REFTABLE_OBJS += reftable/zlib-compat.o
       
     ++REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/strbuf_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
     +-REFTABLE_TEST_OBJS += reftable/basics_test.o
      +REFTABLE_TEST_OBJS += reftable/tree_test.o
       
       TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
     @@ reftable/tree_test.c (new)
      +
      +#include "basics.h"
      +#include "record.h"
     -+#include "reftable.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
      +
     @@ reftable/tree_test.c (new)
      +
      +int tree_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_tree", &test_tree);
     -+	return test_main(argc, argv);
     ++	test_tree();
     ++	return 0;
      +}
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
     + 	basics_test_main(argc, argv);
       	block_test_main(argc, argv);
       	record_test_main(argc, argv);
     - 	strbuf_test_main(argc, argv);
      +	tree_test_main(argc, argv);
       	return 0;
       }
  9:  68aee16e60 ! 10:  5f0b7c5592 reftable: write reftable files
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/blocksource.o
     + REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/strbuf.o
       REFTABLE_OBJS += reftable/tree.o
      +REFTABLE_OBJS += reftable/writer.o
       REFTABLE_OBJS += reftable/zlib-compat.o
       
     - REFTABLE_TEST_OBJS += reftable/block_test.o
     + REFTABLE_TEST_OBJS += reftable/basics_test.o
     +
     + ## reftable/reftable-writer.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#ifndef REFTABLE_WRITER_H
     ++#define REFTABLE_WRITER_H
     ++
     ++#include <stdint.h>
     ++
     ++#include "reftable-record.h"
     ++
     ++/*
     ++ Writing single reftables
     ++*/
     ++
     ++/* reftable_write_options sets options for writing a single reftable. */
     ++struct reftable_write_options {
     ++	/* boolean: do not pad out blocks to block size. */
     ++	unsigned unpadded : 1;
     ++
     ++	/* the blocksize. Should be less than 2^24. */
     ++	uint32_t block_size;
     ++
     ++	/* boolean: do not generate a SHA1 => ref index. */
     ++	unsigned skip_index_objects : 1;
     ++
     ++	/* how often to write complete keys in each block. */
     ++	int restart_interval;
     ++
     ++	/* 4-byte identifier ("sha1", "s256") of the hash.
     ++	 * Defaults to SHA1 if unset
     ++	 */
     ++	uint32_t hash_id;
     ++
     ++	/* boolean: do not check ref names for validity or dir/file conflicts.
     ++	 */
     ++	unsigned skip_name_check : 1;
     ++
     ++	/* boolean: copy log messages exactly. If unset, check that the message
     ++	 *   is a single line, and add '\n' if missing.
     ++	 */
     ++	unsigned exact_log_message : 1;
     ++};
     ++
     ++/* reftable_block_stats holds statistics for a single block type */
     ++struct reftable_block_stats {
     ++	/* total number of entries written */
     ++	int entries;
     ++	/* total number of key restarts */
     ++	int restarts;
     ++	/* total number of blocks */
     ++	int blocks;
     ++	/* total number of index blocks */
     ++	int index_blocks;
     ++	/* depth of the index */
     ++	int max_index_level;
     ++
     ++	/* offset of the first block for this type */
     ++	uint64_t offset;
     ++	/* offset of the top level index block for this type, or 0 if not
     ++	 * present */
     ++	uint64_t index_offset;
     ++};
     ++
     ++/* stats holds overall statistics for a single reftable */
     ++struct reftable_stats {
     ++	/* total number of blocks written. */
     ++	int blocks;
     ++	/* stats for ref data */
     ++	struct reftable_block_stats ref_stats;
     ++	/* stats for the SHA1 to ref map. */
     ++	struct reftable_block_stats obj_stats;
     ++	/* stats for index blocks */
     ++	struct reftable_block_stats idx_stats;
     ++	/* stats for log blocks */
     ++	struct reftable_block_stats log_stats;
     ++
     ++	/* disambiguation length of shortened object IDs. */
     ++	int object_id_len;
     ++};
     ++
     ++/* reftable_new_writer creates a new writer */
     ++struct reftable_writer *
     ++reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
     ++		    void *writer_arg, struct reftable_write_options *opts);
     ++
     ++/* Set the range of update indices for the records we will add. When writing a
     ++   table into a stack, the min should be at least
     ++   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
     ++
     ++   For transactional updates to a stack, typically min==max, and the
     ++   update_index can be obtained by inspeciting the stack. When converting an
     ++   existing ref database into a single reftable, this would be a range of
     ++   update-index timestamps.
     ++ */
     ++void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
     ++				uint64_t max);
     ++
     ++/*
     ++  Add a reftable_ref_record. The record should have names that come after
     ++  already added records.
     ++
     ++  The update_index must be within the limits set by
     ++  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
     ++  REFTABLE_API_ERROR error to write a ref record after a log record.
     ++*/
     ++int reftable_writer_add_ref(struct reftable_writer *w,
     ++			    struct reftable_ref_record *ref);
     ++
     ++/*
     ++  Convenience function to add multiple reftable_ref_records; the function sorts
     ++  the records before adding them, reordering the records array passed in.
     ++*/
     ++int reftable_writer_add_refs(struct reftable_writer *w,
     ++			     struct reftable_ref_record *refs, int n);
     ++
     ++/*
     ++  adds reftable_log_records. Log records are keyed by (refname, decreasing
     ++  update_index). The key for the record added must come after the already added
     ++  log records.
     ++*/
     ++int reftable_writer_add_log(struct reftable_writer *w,
     ++			    struct reftable_log_record *log);
     ++
     ++/*
     ++  Convenience function to add multiple reftable_log_records; the function sorts
     ++  the records before adding them, reordering records array passed in.
     ++*/
     ++int reftable_writer_add_logs(struct reftable_writer *w,
     ++			     struct reftable_log_record *logs, int n);
     ++
     ++/* reftable_writer_close finalizes the reftable. The writer is retained so
     ++ * statistics can be inspected. */
     ++int reftable_writer_close(struct reftable_writer *w);
     ++
     ++/* writer_stats returns the statistics on the reftable being written.
     ++
     ++   This struct becomes invalid when the writer is freed.
     ++ */
     ++const struct reftable_stats *writer_stats(struct reftable_writer *w);
     ++
     ++/* reftable_writer_free deallocates memory for the writer */
     ++void reftable_writer_free(struct reftable_writer *w);
     ++
     ++#endif
      
       ## reftable/writer.c (new) ##
      @@
     @@ reftable/writer.c (new)
      +#include "block.h"
      +#include "constants.h"
      +#include "record.h"
     -+#include "reftable.h"
      +#include "tree.h"
     ++#include "reftable-error.h"
      +
      +/* finishes a block, and writes it to storage */
      +static int writer_flush_block(struct reftable_writer *w);
     @@ reftable/writer.c (new)
      +	w->block_writer->restart_interval = w->opts.restart_interval;
      +}
      +
     ++static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
     ++
      +struct reftable_writer *
      +reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
      +		    void *writer_arg, struct reftable_write_options *opts)
     @@ reftable/writer.c (new)
      +	int err = 0;
      +	int i = 0;
      +	QSORT(logs, n, reftable_log_record_compare_key);
     ++
      +	for (i = 0; err == 0 && i < n; i++) {
      +		err = reftable_writer_add_log(w, &logs[i]);
      +	}
     @@ reftable/writer.c (new)
      +
      +done:
      +	/* free up memory. */
     -+	block_writer_clear(&w->block_writer_data);
     ++	block_writer_release(&w->block_writer_data);
      +	writer_clear_index(w);
      +	strbuf_release(&w->last_key);
      +	return err;
     @@ reftable/writer.h (new)
      +
      +#include "basics.h"
      +#include "block.h"
     -+#include "reftable.h"
     -+#include "strbuf.h"
      +#include "tree.h"
     ++#include "reftable-writer.h"
      +
      +struct reftable_writer {
      +	int (*write)(void *, const void *, size_t);
 10:  c196de7f06 ! 11:  9aa2caade8 reftable: read reftable files
     @@ Metadata
       ## Commit message ##
          reftable: read reftable files
      
     +    This supports reading a single reftable file.
     +
     +    The commit introduces an abstract iterator type, which captures the usecases
     +    both of reading individual refs, and iterating over a segment of the ref
     +    namespace.
     +
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/block.o
     +@@ Makefile: REFTABLE_OBJS += reftable/basics.o
     + REFTABLE_OBJS += reftable/error.o
     + REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/blocksource.o
     - REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/compat.o
      +REFTABLE_OBJS += reftable/iter.o
     + REFTABLE_OBJS += reftable/publicbasics.o
      +REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
      +REFTABLE_OBJS += reftable/reftable.o
     - REFTABLE_OBJS += reftable/strbuf.o
       REFTABLE_OBJS += reftable/tree.o
       REFTABLE_OBJS += reftable/writer.o
     + REFTABLE_OBJS += reftable/zlib-compat.o
      
       ## reftable/iter.c (new) ##
      @@
     @@ reftable/iter.c (new)
      +#include "block.h"
      +#include "constants.h"
      +#include "reader.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
      +
      +int iterator_is_null(struct reftable_iterator *it)
      +{
     @@ reftable/iter.c (new)
      +int reftable_iterator_next_ref(struct reftable_iterator *it,
      +			       struct reftable_ref_record *ref)
      +{
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_ref(&rec, ref);
      +	return iterator_next(it, &rec);
      +}
     @@ reftable/iter.c (new)
      +int reftable_iterator_next_log(struct reftable_iterator *it,
      +			       struct reftable_log_record *log)
      +{
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_log(&rec, log);
      +	return iterator_next(it, &rec);
      +}
     @@ reftable/iter.c (new)
      +		}
      +
      +		if (fri->double_check) {
     -+			struct reftable_iterator it = { 0 };
     ++			struct reftable_iterator it = { NULL };
      +
      +			err = reftable_table_seek_ref(&fri->tab, &it,
      +						      ref->refname);
     @@ reftable/iter.c (new)
      +		}
      +	}
      +
     -+	reftable_ref_record_clear(ref);
     ++	reftable_ref_record_release(ref);
      +	return err;
      +}
      +
     @@ reftable/iter.h (new)
      +#ifndef ITER_H
      +#define ITER_H
      +
     ++#include "system.h"
      +#include "block.h"
      +#include "record.h"
     -+#include "strbuf.h"
     ++
     ++#include "reftable-iterator.h"
     ++#include "reftable-generic.h"
      +
      +struct reftable_iterator_vtable {
      +	int (*next)(void *iter_arg, struct reftable_record *rec);
     @@ reftable/reader.c (new)
      +#include "constants.h"
      +#include "iter.h"
      +#include "record.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
      +#include "tree.h"
      +
      +uint64_t block_source_size(struct reftable_block_source *source)
     @@ reftable/reader.c (new)
      +int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
      +		const char *name)
      +{
     -+	struct reftable_block footer = { 0 };
     -+	struct reftable_block header = { 0 };
     ++	struct reftable_block footer = { NULL };
     ++	struct reftable_block header = { NULL };
      +	int err = 0;
      +
      +	memset(r, 0, sizeof(struct reftable_reader));
     @@ reftable/reader.c (new)
      +{
      +	int32_t guess_block_size = r->block_size ? r->block_size :
      +							 DEFAULT_BLOCK_SIZE;
     -+	struct reftable_block block = { 0 };
     ++	struct reftable_block block = { NULL };
      +	uint8_t block_typ = 0;
      +	int err = 0;
      +	uint32_t header_off = next_off ? 0 : header_size(r->version);
     @@ reftable/reader.c (new)
      +			       struct reftable_record *rec)
      +{
      +	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
     -+	struct reftable_record want_index_rec = { 0 };
     ++	struct reftable_record want_index_rec = { NULL };
      +	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
     -+	struct reftable_record index_result_rec = { 0 };
     ++	struct reftable_record index_result_rec = { NULL };
      +	struct table_iter index_iter = TABLE_ITER_INIT;
      +	struct table_iter next = TABLE_ITER_INIT;
      +	int err = 0;
     @@ reftable/reader.c (new)
      +done:
      +	block_iter_close(&next.bi);
      +	table_iter_close(&index_iter);
     -+	reftable_record_clear(&want_index_rec);
     -+	reftable_record_clear(&index_result_rec);
     ++	reftable_record_release(&want_index_rec);
     ++	reftable_record_release(&index_result_rec);
      +	return err;
      +}
      +
     @@ reftable/reader.c (new)
      +	struct reftable_ref_record ref = {
      +		.refname = (char *)name,
      +	};
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_ref(&rec, &ref);
      +	return reader_seek(r, it, &rec);
      +}
     @@ reftable/reader.c (new)
      +		.refname = (char *)name,
      +		.update_index = update_index,
      +	};
     -+	struct reftable_record rec = { 0 };
     ++	struct reftable_record rec = { NULL };
      +	reftable_record_from_log(&rec, &log);
      +	return reader_seek(r, it, &rec);
      +}
     @@ reftable/reader.c (new)
      +		.hash_prefix = oid,
      +		.hash_prefix_len = r->object_id_len,
      +	};
     -+	struct reftable_record want_rec = { 0 };
     -+	struct reftable_iterator oit = { 0 };
     -+	struct reftable_obj_record got = { 0 };
     -+	struct reftable_record got_rec = { 0 };
     ++	struct reftable_record want_rec = { NULL };
     ++	struct reftable_iterator oit = { NULL };
     ++	struct reftable_obj_record got = { NULL };
     ++	struct reftable_record got_rec = { NULL };
      +	int err = 0;
      +	struct indexed_table_ref_iter *itr = NULL;
      +
     @@ reftable/reader.c (new)
      +
      +done:
      +	reftable_iterator_destroy(&oit);
     -+	reftable_record_clear(&got_rec);
     ++	reftable_record_release(&got_rec);
      +	return err;
      +}
      +
     @@ reftable/reader.h (new)
      +
      +#include "block.h"
      +#include "record.h"
     -+#include "reftable.h"
     ++#include "reftable-iterator.h"
     ++#include "reftable-reader.h"
      +
      +uint64_t block_source_size(struct reftable_block_source *source);
      +
     @@ reftable/reader.h (new)
      +	uint64_t (*max_update_index)(void *tab);
      +};
      +
     -+int reftable_table_seek_record(struct reftable_table *tab,
     -+			       struct reftable_iterator *it,
     -+			       struct reftable_record *rec);
     -+
      +#endif
      
     - ## reftable/reftable.c (new) ##
     + ## reftable/reftable-iterator.h (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/reftable.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "reftable.h"
     -+#include "record.h"
     -+#include "reader.h"
     ++#ifndef REFTABLE_ITERATOR_H
     ++#define REFTABLE_ITERATOR_H
      +
     -+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
     -+				     struct reftable_record *rec)
     -+{
     -+	return reader_seek((struct reftable_reader *)tab, it, rec);
     -+}
     ++#include "reftable-record.h"
      +
     -+static uint32_t reftable_reader_hash_id_void(void *tab)
     -+{
     -+	return reftable_reader_hash_id((struct reftable_reader *)tab);
     -+}
     ++/* iterator is the generic interface for walking over data stored in a
     ++   reftable.
     ++*/
     ++struct reftable_iterator {
     ++	struct reftable_iterator_vtable *ops;
     ++	void *iter_arg;
     ++};
      +
     -+static uint64_t reftable_reader_min_update_index_void(void *tab)
     -+{
     -+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
     -+}
     ++/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
     ++   end of iteration.
     ++*/
     ++int reftable_iterator_next_ref(struct reftable_iterator *it,
     ++			       struct reftable_ref_record *ref);
      +
     -+static uint64_t reftable_reader_max_update_index_void(void *tab)
     -+{
     -+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
     -+}
     ++/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
     ++   end of iteration.
     ++*/
     ++int reftable_iterator_next_log(struct reftable_iterator *it,
     ++			       struct reftable_log_record *log);
      +
     -+static struct reftable_table_vtable reader_vtable = {
     -+	.seek_record = reftable_reader_seek_void,
     -+	.hash_id = reftable_reader_hash_id_void,
     -+	.min_update_index = reftable_reader_min_update_index_void,
     -+	.max_update_index = reftable_reader_max_update_index_void,
     -+};
     ++/* releases resources associated with an iterator. */
     ++void reftable_iterator_destroy(struct reftable_iterator *it);
      +
     -+int reftable_table_seek_ref(struct reftable_table *tab,
     -+			    struct reftable_iterator *it, const char *name)
     -+{
     -+	struct reftable_ref_record ref = {
     -+		.refname = (char *)name,
     -+	};
     -+	struct reftable_record rec = { 0 };
     -+	reftable_record_from_ref(&rec, &ref);
     -+	return tab->ops->seek_record(tab->table_arg, it, &rec);
     -+}
     ++#endif
     +
     + ## reftable/reftable-reader.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
      +
     -+void reftable_table_from_reader(struct reftable_table *tab,
     -+				struct reftable_reader *reader)
     -+{
     -+	assert(tab->ops == NULL);
     -+	tab->ops = &reader_vtable;
     -+	tab->table_arg = reader;
     -+}
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
      +
     -+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     -+			    struct reftable_ref_record *ref)
     -+{
     -+	struct reftable_iterator it = { 0 };
     -+	int err = reftable_table_seek_ref(tab, &it, name);
     -+	if (err)
     -+		goto done;
     ++#ifndef REFTABLE_READER_H
     ++#define REFTABLE_READER_H
      +
     -+	err = reftable_iterator_next_ref(&it, ref);
     -+	if (err)
     -+		goto done;
     ++#include "reftable-iterator.h"
     ++#include "reftable-blocksource.h"
      +
     -+	if (strcmp(ref->refname, name) ||
     -+	    reftable_ref_record_is_deletion(ref)) {
     -+		reftable_ref_record_clear(ref);
     -+		err = 1;
     -+		goto done;
     -+	}
     ++/*
     ++ Reading single tables
      +
     -+done:
     -+	reftable_iterator_destroy(&it);
     -+	return err;
     -+}
     ++ The follow routines are for reading single files. For an application-level
     ++ interface, skip ahead to struct reftable_merged_table and struct
     ++ reftable_stack.
     ++*/
      +
     -+int reftable_table_seek_record(struct reftable_table *tab,
     -+			       struct reftable_iterator *it,
     -+			       struct reftable_record *rec)
     -+{
     -+	return tab->ops->seek_record(tab->table_arg, it, rec);
     -+}
     ++/* The reader struct is a handle to an open reftable file. */
     ++struct reftable_reader;
      +
     -+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
     -+{
     -+	return tab->ops->max_update_index(tab->table_arg);
     -+}
     ++/* reftable_new_reader opens a reftable for reading. If successful, returns 0
     ++ * code and sets pp. The name is used for creating a stack. Typically, it is the
     ++ * basename of the file. The block source `src` is owned by the reader, and is
     ++ * closed on calling reftable_reader_destroy().
     ++ */
     ++int reftable_new_reader(struct reftable_reader **pp,
     ++			struct reftable_block_source *src, const char *name);
     ++
     ++/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
     ++   in the table.  To seek to the start of the table, use name = "".
     ++
     ++   example:
     ++
     ++   struct reftable_reader *r = NULL;
     ++   int err = reftable_new_reader(&r, &src, "filename");
     ++   if (err < 0) { ... }
     ++   struct reftable_iterator it  = {0};
     ++   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
     ++   if (err < 0) { ... }
     ++   struct reftable_ref_record ref  = {0};
     ++   while (1) {
     ++     err = reftable_iterator_next_ref(&it, &ref);
     ++     if (err > 0) {
     ++       break;
     ++     }
     ++     if (err < 0) {
     ++       ..error handling..
     ++     }
     ++     ..found..
     ++   }
     ++   reftable_iterator_destroy(&it);
     ++   reftable_ref_record_release(&ref);
     ++ */
     ++int reftable_reader_seek_ref(struct reftable_reader *r,
     ++			     struct reftable_iterator *it, const char *name);
      +
     -+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
     -+{
     -+	return tab->ops->min_update_index(tab->table_arg);
     -+}
     ++/* returns the hash ID used in this table. */
     ++uint32_t reftable_reader_hash_id(struct reftable_reader *r);
      +
     -+uint32_t reftable_table_hash_id(struct reftable_table *tab)
     -+{
     -+	return tab->ops->hash_id(tab->table_arg);
     -+}
     ++/* seek to logs for the given name, older than update_index. To seek to the
     ++   start of the table, use name = "".
     ++ */
     ++int reftable_reader_seek_log_at(struct reftable_reader *r,
     ++				struct reftable_iterator *it, const char *name,
     ++				uint64_t update_index);
     ++
     ++/* seek to newest log entry for given name. */
     ++int reftable_reader_seek_log(struct reftable_reader *r,
     ++			     struct reftable_iterator *it, const char *name);
     ++
     ++/* closes and deallocates a reader. */
     ++void reftable_reader_free(struct reftable_reader *);
     ++
     ++/* return an iterator for the refs pointing to `oid`. */
     ++int reftable_reader_refs_for(struct reftable_reader *r,
     ++			     struct reftable_iterator *it, uint8_t *oid);
     ++
     ++/* return the max_update_index for a table */
     ++uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
     ++
     ++/* return the min_update_index for a table */
     ++uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
     ++
     ++#endif
 11:  bf6b929b86 ! 12:  581a4200d6 reftable: file level tests
     @@ Metadata
      Author: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Commit message ##
     -    reftable: file level tests
     +    reftable: reftable file level tests
     +
     +    With support for reading and writing files in place, we can construct files (in
     +    memory) and attempt to read them back.
     +
     +    Because some sections of the format are optional (eg. indices, log entries), we
     +    have to exercise this code using multiple sizes of input data
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
      @@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     - 
     + REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
      +REFTABLE_TEST_OBJS += reftable/reftable_test.o
     - REFTABLE_TEST_OBJS += reftable/strbuf_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
       REFTABLE_TEST_OBJS += reftable/tree_test.o
     + 
      
       ## reftable/reftable_test.c (new) ##
      @@
     @@ reftable/reftable_test.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "reftable.h"
     -+
      +#include "system.h"
      +
      +#include "basics.h"
     @@ reftable/reftable_test.c (new)
      +#include "record.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
     ++#include "reftable-stack.h"
      +
      +static const int update_index = 5;
      +
     @@ reftable/reftable_test.c (new)
      +	uint8_t in[] = "hello";
      +	strbuf_add(&buf, in, sizeof(in));
      +	block_source_from_strbuf(&source, &buf);
     -+	assert(block_source_size(&source) == 6);
     ++	EXPECT(block_source_size(&source) == 6);
      +	n = block_source_read_block(&source, &out, 0, sizeof(in));
     -+	assert(n == sizeof(in));
     -+	assert(!memcmp(in, out.data, n));
     ++	EXPECT(n == sizeof(in));
     ++	EXPECT(!memcmp(in, out.data, n));
      +	reftable_block_done(&out);
      +
      +	n = block_source_read_block(&source, &out, 1, 2);
     -+	assert(n == 2);
     -+	assert(!memcmp(out.data, "el", 2));
     ++	EXPECT(n == 2);
     ++	EXPECT(!memcmp(out.data, "el", 2));
      +
      +	reftable_block_done(&out);
      +	block_source_close(&source);
     @@ reftable/reftable_test.c (new)
      +		(*names)[i] = xstrdup(name);
      +
      +		n = reftable_writer_add_ref(w, &ref);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +	}
      +
      +	for (i = 0; i < N; i++) {
     @@ reftable/reftable_test.c (new)
      +		log.message = "message";
      +
      +		n = reftable_writer_add_log(w, &log);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +	}
      +
      +	n = reftable_writer_close(w);
     -+	assert(n == 0);
     ++	EXPECT(n == 0);
      +
      +	stats = writer_stats(w);
      +	for (i = 0; i < stats->ref_stats.blocks; i++) {
     @@ reftable/reftable_test.c (new)
      +		if (off == 0) {
      +			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
      +		}
     -+		assert(buf->buf[off] == 'r');
     ++		EXPECT(buf->buf[off] == 'r');
      +	}
      +
     -+	assert(stats->log_stats.blocks > 0);
     ++	EXPECT(stats->log_stats.blocks > 0);
      +	reftable_writer_free(w);
      +}
      +
     @@ reftable/reftable_test.c (new)
      +	log.new_hash = hash2;
      +	reftable_writer_set_limits(w, update_index, update_index);
      +	err = reftable_writer_add_log(w, &log);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	err = reftable_writer_close(w);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	reftable_writer_free(w);
      +	strbuf_release(&buf);
      +}
     @@ reftable/reftable_test.c (new)
      +		ref.update_index = i;
      +
      +		err = reftable_writer_add_ref(w, &ref);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +	for (i = 0; i < N; i++) {
      +		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
     @@ reftable/reftable_test.c (new)
      +		log.new_hash = hash2;
      +
      +		err = reftable_writer_add_log(w, &log);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +
      +	n = reftable_writer_close(w);
     -+	assert(n == 0);
     ++	EXPECT(n == 0);
      +
      +	stats = writer_stats(w);
     -+	assert(stats->log_stats.blocks > 0);
     ++	EXPECT(stats->log_stats.blocks > 0);
      +	reftable_writer_free(w);
      +	w = NULL;
      +
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = init_reader(&rd, &source, "file.log");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_iterator_next_ref(&it, &ref);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	/* end of iteration. */
      +	err = reftable_iterator_next_ref(&it, &ref);
     -+	assert(0 < err);
     ++	EXPECT(0 < err);
      +
      +	reftable_iterator_destroy(&it);
     -+	reftable_ref_record_clear(&ref);
     ++	reftable_ref_record_release(&ref);
      +
      +	err = reftable_reader_seek_log(&rd, &it, "");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	i = 0;
      +	while (1) {
     @@ reftable/reftable_test.c (new)
      +			break;
      +		}
      +
     -+		assert_err(err);
     -+		assert_streq(names[i], log.refname);
     -+		assert(i == log.update_index);
     ++		EXPECT_ERR(err);
     ++		EXPECT_STREQ(names[i], log.refname);
     ++		EXPECT(i == log.update_index);
      +		i++;
     -+		reftable_log_record_clear(&log);
     ++		reftable_log_record_release(&log);
      +	}
      +
     -+	assert(i == N);
     ++	EXPECT(i == N);
      +	reftable_iterator_destroy(&it);
      +
      +	/* cleanup. */
     @@ reftable/reftable_test.c (new)
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = init_reader(&rd, &source, "file.ref");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_reader_seek_ref(&rd, &it, "");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	while (1) {
      +		struct reftable_ref_record ref = { NULL };
      +		int r = reftable_iterator_next_ref(&it, &ref);
     -+		assert(r >= 0);
     ++		EXPECT(r >= 0);
      +		if (r > 0) {
      +			break;
      +		}
     -+		assert(0 == strcmp(names[j], ref.refname));
     -+		assert(update_index == ref.update_index);
     ++		EXPECT(0 == strcmp(names[j], ref.refname));
     ++		EXPECT(update_index == ref.update_index);
      +
      +		j++;
     -+		reftable_ref_record_clear(&ref);
     ++		reftable_ref_record_release(&ref);
      +	}
     -+	assert(j == N);
     ++	EXPECT(j == N);
      +	reftable_iterator_destroy(&it);
      +	strbuf_release(&buf);
      +	free_names(names);
     @@ reftable/reftable_test.c (new)
      +	struct strbuf buf = STRBUF_INIT;
      +	int N = 1;
      +	write_table(&names, &buf, N, 4096, SHA1_ID);
     -+	assert(buf.len < 200);
     ++	EXPECT(buf.len < 200);
      +	strbuf_release(&buf);
      +	free_names(names);
      +}
     @@ reftable/reftable_test.c (new)
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = init_reader(&rd, &source, "file.ref");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_reader_seek_ref(&rd, &it, names[0]);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_iterator_next_log(&it, &log);
     -+	assert(err == REFTABLE_API_ERROR);
     ++	EXPECT(err == REFTABLE_API_ERROR);
      +
      +	strbuf_release(&buf);
      +	for (i = 0; i < N; i++) {
     @@ reftable/reftable_test.c (new)
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = init_reader(&rd, &source, "file.ref");
     -+	assert_err(err);
     -+	assert(hash_id == reftable_reader_hash_id(&rd));
     ++	EXPECT_ERR(err);
     ++	EXPECT(hash_id == reftable_reader_hash_id(&rd));
      +
      +	if (!index) {
      +		rd.ref_offsets.index_offset = 0;
      +	} else {
     -+		assert(rd.ref_offsets.index_offset > 0);
     ++		EXPECT(rd.ref_offsets.index_offset > 0);
      +	}
      +
      +	for (i = 1; i < N; i++) {
      +		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +		err = reftable_iterator_next_ref(&it, &ref);
     -+		assert_err(err);
     -+		assert(0 == strcmp(names[i], ref.refname));
     -+		assert(i == ref.value[0]);
     ++		EXPECT_ERR(err);
     ++		EXPECT(0 == strcmp(names[i], ref.refname));
     ++		EXPECT(i == ref.value[0]);
      +
     -+		reftable_ref_record_clear(&ref);
     ++		reftable_ref_record_release(&ref);
      +		reftable_iterator_destroy(&it);
      +	}
      +
     @@ reftable/reftable_test.c (new)
      +	if (err == 0) {
      +		struct reftable_ref_record ref = { NULL };
      +		int err = reftable_iterator_next_ref(&it, &ref);
     -+		assert(err > 0);
     ++		EXPECT(err > 0);
      +	} else {
     -+		assert(err > 0);
     ++		EXPECT(err > 0);
      +	}
      +
      +	strbuf_release(&pastLast);
     @@ reftable/reftable_test.c (new)
      +		 */
      +		/* blocks. */
      +		n = reftable_writer_add_ref(w, &ref);
     -+		assert(n == 0);
     ++		EXPECT(n == 0);
      +
      +		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
      +		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
     @@ reftable/reftable_test.c (new)
      +	}
      +
      +	n = reftable_writer_close(w);
     -+	assert(n == 0);
     ++	EXPECT(n == 0);
      +
      +	reftable_writer_free(w);
      +	w = NULL;
     @@ reftable/reftable_test.c (new)
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = init_reader(&rd, &source, "file.ref");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	if (!indexed) {
      +		rd.obj_offsets.is_present = 0;
      +	}
      +
      +	err = reftable_reader_seek_ref(&rd, &it, "");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	reftable_iterator_destroy(&it);
      +
      +	err = reftable_reader_refs_for(&rd, &it, want_hash);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	j = 0;
      +	while (1) {
      +		int err = reftable_iterator_next_ref(&it, &ref);
     -+		assert(err >= 0);
     ++		EXPECT(err >= 0);
      +		if (err > 0) {
      +			break;
      +		}
      +
     -+		assert(j < want_names_len);
     -+		assert(0 == strcmp(ref.refname, want_names[j]));
     ++		EXPECT(j < want_names_len);
     ++		EXPECT(0 == strcmp(ref.refname, want_names[j]));
      +		j++;
     -+		reftable_ref_record_clear(&ref);
     ++		reftable_ref_record_release(&ref);
      +	}
     -+	assert(j == want_names_len);
     ++	EXPECT(j == want_names_len);
      +
      +	strbuf_release(&buf);
      +	free_names(want_names);
     @@ reftable/reftable_test.c (new)
      +	reftable_writer_set_limits(w, 1, 1);
      +
      +	err = reftable_writer_close(w);
     -+	assert(err == REFTABLE_EMPTY_TABLE_ERROR);
     ++	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
      +	reftable_writer_free(w);
      +
     -+	assert(buf.len == header_size(1) + footer_size(1));
     ++	EXPECT(buf.len == header_size(1) + footer_size(1));
      +
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = reftable_new_reader(&rd, &source, "filename");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_reader_seek_ref(rd, &it, "");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_iterator_next_ref(&it, &rec);
     -+	assert(err > 0);
     ++	EXPECT(err > 0);
      +
      +	reftable_iterator_destroy(&it);
      +	reftable_reader_free(rd);
     @@ reftable/reftable_test.c (new)
      +
      +int reftable_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_log_write_read", test_log_write_read);
     -+	add_test_case("test_table_read_write_seek_linear_sha256",
     -+		      &test_table_read_write_seek_linear_sha256);
     -+	add_test_case("test_log_buffer_size", test_log_buffer_size);
     -+	add_test_case("test_table_write_small_table",
     -+		      &test_table_write_small_table);
     -+	add_test_case("test_buffer", &test_buffer);
     -+	add_test_case("test_table_read_api", &test_table_read_api);
     -+	add_test_case("test_table_read_write_sequential",
     -+		      &test_table_read_write_sequential);
     -+	add_test_case("test_table_read_write_seek_linear",
     -+		      &test_table_read_write_seek_linear);
     -+	add_test_case("test_table_read_write_seek_index",
     -+		      &test_table_read_write_seek_index);
     -+	add_test_case("test_table_read_write_refs_for_no_index",
     -+		      &test_table_refs_for_no_index);
     -+	add_test_case("test_table_read_write_refs_for_obj_index",
     -+		      &test_table_refs_for_obj_index);
     -+	add_test_case("test_table_empty", &test_table_empty);
     -+	return test_main(argc, argv);
     ++	test_log_write_read();
     ++	test_table_read_write_seek_linear_sha256();
     ++	test_log_buffer_size();
     ++	test_table_write_small_table();
     ++	test_buffer();
     ++	test_table_read_api();
     ++	test_table_read_write_sequential();
     ++	test_table_read_write_seek_linear();
     ++	test_table_read_write_seek_index();
     ++	test_table_refs_for_no_index();
     ++	test_table_refs_for_obj_index();
     ++	test_table_empty();
     ++	return 0;
      +}
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
     - {
     + 	basics_test_main(argc, argv);
       	block_test_main(argc, argv);
       	record_test_main(argc, argv);
      +	reftable_test_main(argc, argv);
     - 	strbuf_test_main(argc, argv);
       	tree_test_main(argc, argv);
       	return 0;
     + }
 12:  4e38db7f48 ! 13:  a26fb180ad reftable: rest of library
     @@ Metadata
       ## Commit message ##
          reftable: rest of library
      
     -    This will be further split up once preceding commits have passed review.
     -
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
     - REFTABLE_OBJS += reftable/basics.o
     +@@ Makefile: REFTABLE_OBJS += reftable/error.o
       REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/blocksource.o
     --REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/compat.o
       REFTABLE_OBJS += reftable/iter.o
      +REFTABLE_OBJS += reftable/merged.o
      +REFTABLE_OBJS += reftable/pq.o
     -+REFTABLE_OBJS += reftable/publicbasics.o
     + REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
      +REFTABLE_OBJS += reftable/refname.o
       REFTABLE_OBJS += reftable/reftable.o
      +REFTABLE_OBJS += reftable/stack.o
     - REFTABLE_OBJS += reftable/strbuf.o
       REFTABLE_OBJS += reftable/tree.o
       REFTABLE_OBJS += reftable/writer.o
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     - 
     - REFTABLE_TEST_OBJS += reftable/block_test.o
     - REFTABLE_TEST_OBJS += reftable/record_test.o
     -+REFTABLE_TEST_OBJS += reftable/refname_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
     -+REFTABLE_TEST_OBJS += reftable/stack_test.o
     - REFTABLE_TEST_OBJS += reftable/strbuf_test.o
     - REFTABLE_TEST_OBJS += reftable/test_framework.o
     - REFTABLE_TEST_OBJS += reftable/tree_test.o
     + REFTABLE_OBJS += reftable/zlib-compat.o
      
       ## reftable/VERSION (new) ##
      @@
     -+7134eb9f8171a9759800f4187f9e6dde997335e7 C: NULL iso 0 for init
     ++b8ec0f74c74cb6752eb2033ad8e755a9c19aad15 C: add missing header
      
       ## reftable/dump.c (new) ##
      @@
     @@ reftable/dump.c (new)
      +
      +static int dump_table(const char *tablename)
      +{
     -+	struct reftable_block_source src = { NULL };
     ++	struct reftable_block_source src = { 0 };
      +	int err = reftable_block_source_from_file(&src, tablename);
     -+	struct reftable_iterator it = { NULL };
     -+	struct reftable_ref_record ref = { NULL };
     -+	struct reftable_log_record log = { NULL };
     ++	struct reftable_iterator it = { 0 };
     ++	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_log_record log = { 0 };
      +	struct reftable_reader *r = NULL;
      +
      +	if (err < 0)
     @@ reftable/dump.c (new)
      +{
      +	struct reftable_stack *stack = NULL;
      +	struct reftable_write_options cfg = {};
     -+	struct reftable_iterator it = { NULL };
     -+	struct reftable_ref_record ref = { NULL };
     -+	struct reftable_log_record log = { NULL };
     ++	struct reftable_iterator it = { 0 };
     ++	struct reftable_ref_record ref = { 0 };
     ++	struct reftable_log_record log = { 0 };
      +	struct reftable_merged_table *merged = NULL;
      +
      +	int err = reftable_new_stack(&stack, stackdir, cfg);
     @@ reftable/merged.c (new)
      +#include "pq.h"
      +#include "reader.h"
      +#include "record.h"
     -+#include "reftable.h"
     ++#include "reftable-merged.h"
     ++#include "reftable-error.h"
      +#include "system.h"
      +
      +static int merged_iter_init(struct merged_iter *mi)
     @@ reftable/merged.c (new)
      +{
      +	struct merged_iter *mi = (struct merged_iter *)p;
      +	int i = 0;
     -+	merged_iter_pqueue_clear(&mi->pq);
     ++	merged_iter_pqueue_release(&mi->pq);
      +	for (i = 0; i < mi->stack_len; i++) {
      +		reftable_iterator_destroy(&mi->stack[i]);
      +	}
     @@ reftable/merged.c (new)
      +}
      +
      +/* clears the list of subtable, without affecting the readers themselves. */
     -+void merged_table_clear(struct reftable_merged_table *mt)
     ++void merged_table_release(struct reftable_merged_table *mt)
      +{
      +	FREE_AND_NULL(mt->stack);
      +	mt->stack_len = 0;
     @@ reftable/merged.c (new)
      +	if (mt == NULL) {
      +		return;
      +	}
     -+	merged_table_clear(mt);
     ++	merged_table_release(mt);
      +	reftable_free(mt);
      +}
      +
     @@ reftable/merged.c (new)
      +	return mt->min;
      +}
      +
     ++static int reftable_table_seek_record(struct reftable_table *tab,
     ++				      struct reftable_iterator *it,
     ++				      struct reftable_record *rec)
     ++{
     ++	return tab->ops->seek_record(tab->table_arg, it, rec);
     ++}
     ++
      +static int merged_table_seek_record(struct reftable_merged_table *mt,
      +				    struct reftable_iterator *it,
      +				    struct reftable_record *rec)
     @@ reftable/merged.h (new)
      +#define MERGED_H
      +
      +#include "pq.h"
     -+#include "reftable.h"
      +
      +struct reftable_merged_table {
      +	struct reftable_table *stack;
     @@ reftable/merged.h (new)
      +	struct merged_iter_pqueue pq;
      +};
      +
     -+void merged_table_clear(struct reftable_merged_table *mt);
     ++void merged_table_release(struct reftable_merged_table *mt);
      +
      +#endif
      
     @@ reftable/merged_test.c (new)
      +#include "pq.h"
      +#include "reader.h"
      +#include "record.h"
     -+#include "reftable.h"
      +#include "test_framework.h"
     ++#include "reftable-merged.h"
      +#include "reftable-tests.h"
     ++#include "reftable-generic.h"
     ++#include "reftable-stack.h"
      +
      +static void test_pq(void)
      +{
     @@ reftable/merged_test.c (new)
      +		reftable_free(names[i]);
      +	}
      +
     -+	merged_iter_pqueue_clear(&pq);
     ++	merged_iter_pqueue_release(&pq);
      +}
      +
      +static void write_test_table(struct strbuf *buf,
     @@ reftable/merged_test.c (new)
      +	}
      +
      +	err = reftable_writer_close(w);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	reftable_writer_free(w);
      +}
     @@ reftable/merged_test.c (new)
      +
      +		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
      +					  "name");
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +		reftable_table_from_reader(&tabs[i], (*readers)[i]);
      +	}
      +
      +	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	return mt;
      +}
      +
     @@ reftable/merged_test.c (new)
      +	struct reftable_ref_record ref = { NULL };
      +	struct reftable_iterator it = { NULL };
      +	int err = reftable_merged_table_seek_ref(mt, &it, "a");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_iterator_next_ref(&it, &ref);
     -+	assert_err(err);
     -+	assert(ref.update_index == 2);
     -+	reftable_ref_record_clear(&ref);
     ++	EXPECT_ERR(err);
     ++	EXPECT(ref.update_index == 2);
     ++	reftable_ref_record_release(&ref);
      +	reftable_iterator_destroy(&it);
      +	readers_destroy(readers, 2);
      +	reftable_merged_table_free(mt);
     @@ reftable/merged_test.c (new)
      +	size_t cap = 0;
      +	int i = 0;
      +
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	while (len < 100) { /* cap loops/recursion. */
      +		struct reftable_ref_record ref = { NULL };
      +		int err = reftable_iterator_next_ref(&it, &ref);
     @@ reftable/merged_test.c (new)
      +		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
      +	}
      +	for (i = 0; i < len; i++) {
     -+		reftable_ref_record_clear(&out[i]);
     ++		reftable_ref_record_release(&out[i]);
      +	}
      +	reftable_free(out);
      +
     @@ reftable/merged_test.c (new)
      +	reftable_writer_set_limits(w, 1, 1);
      +
      +	err = reftable_writer_add_ref(w, &rec);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_writer_close(w);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	reftable_writer_free(w);
      +
      +	block_source_from_strbuf(&source, &buf);
      +
      +	err = reftable_new_reader(&rd, &source, "filename");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	hash_id = reftable_reader_hash_id(rd);
      +	assert(hash_id == SHA1_ID);
      +
      +	reftable_table_from_reader(&tab[0], rd);
      +	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	reftable_reader_free(rd);
      +	reftable_merged_table_free(merged);
     @@ reftable/merged_test.c (new)
      +
      +int merged_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_merged_between", &test_merged_between);
     -+	add_test_case("test_pq", &test_pq);
     -+	add_test_case("test_merged", &test_merged);
     -+	add_test_case("test_default_write_opts", &test_default_write_opts);
     -+	return test_main(argc, argv);
     ++	test_merged_between();
     ++	test_pq();
     ++	test_merged();
     ++	test_default_write_opts();
     ++	return 0;
      +}
      
       ## reftable/pq.c (new) ##
     @@ reftable/pq.c (new)
      +
      +#include "pq.h"
      +
     -+#include "reftable.h"
     ++#include "reftable-record.h"
      +#include "system.h"
      +#include "basics.h"
      +
     @@ reftable/pq.c (new)
      +	}
      +}
      +
     -+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq)
     ++void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
      +{
      +	int i = 0;
      +	for (i = 0; i < pq->len; i++) {
     @@ reftable/pq.h (new)
      +void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
      +struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
      +void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
     -+void merged_iter_pqueue_clear(struct merged_iter_pqueue *pq);
     ++void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
      +
      +#endif
      
     @@ reftable/refname.c (new)
      +*/
      +
      +#include "system.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
      +#include "basics.h"
      +#include "refname.h"
     -+#include "strbuf.h"
     ++#include "reftable-iterator.h"
      +
      +struct find_arg {
      +	char **names;
     @@ reftable/refname.c (new)
      +	}
      +
      +	err = reftable_table_read_ref(&mod->tab, name, &ref);
     -+	reftable_ref_record_clear(&ref);
     ++	reftable_ref_record_release(&ref);
      +	return err;
      +}
      +
     -+static void modification_clear(struct modification *mod)
     ++static void modification_release(struct modification *mod)
      +{
      +	/* don't delete the strings themselves; they're owned by ref records.
      +	 */
     @@ reftable/refname.c (new)
      +	}
      +
      +done:
     -+	reftable_ref_record_clear(&ref);
     ++	reftable_ref_record_release(&ref);
      +	reftable_iterator_destroy(&it);
      +	return err;
      +}
     @@ reftable/refname.c (new)
      +	}
      +
      +	err = modification_validate(&mod);
     -+	modification_clear(&mod);
     ++	modification_release(&mod);
      +	return err;
      +}
      +
     @@ reftable/refname.h (new)
      +#ifndef REFNAME_H
      +#define REFNAME_H
      +
     -+#include "reftable.h"
     ++#include "reftable-record.h"
     ++#include "reftable-generic.h"
      +
      +struct modification {
      +	struct reftable_table tab;
     @@ reftable/refname_test.c (new)
      +https://developers.google.com/open-source/licenses/bsd
      +*/
      +
     -+#include "reftable.h"
     -+
      +#include "basics.h"
      +#include "block.h"
      +#include "blocksource.h"
     @@ reftable/refname_test.c (new)
      +#include "reader.h"
      +#include "record.h"
      +#include "refname.h"
     ++#include "reftable-error.h"
     ++#include "reftable-writer.h"
      +#include "system.h"
      +
      +#include "test_framework.h"
     @@ reftable/refname_test.c (new)
      +	reftable_writer_set_limits(w, 1, 1);
      +
      +	err = reftable_writer_add_ref(w, &rec);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_writer_close(w);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	reftable_writer_free(w);
      +
      +	block_source_from_strbuf(&source, &buf);
      +	err = reftable_new_reader(&rd, &source, "filename");
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	reftable_table_from_reader(&tab, rd);
      +
     @@ reftable/refname_test.c (new)
      +		}
      +
      +		err = modification_validate(&mod);
     -+		assert(err == cases[i].error_code);
     ++		EXPECT(err == cases[i].error_code);
      +	}
      +
      +	reftable_reader_free(rd);
     @@ reftable/refname_test.c (new)
      +
      +int refname_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_conflict", &test_conflict);
     -+	return test_main(argc, argv);
     ++	test_conflict();
     ++	return 0;
      +}
      
     - ## reftable/reftable.c ##
     -@@ reftable/reftable.c: int reftable_table_seek_ref(struct reftable_table *tab,
     - 	struct reftable_ref_record ref = {
     - 		.refname = (char *)name,
     - 	};
     --	struct reftable_record rec = { 0 };
     + ## reftable/reftable-generic.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#ifndef REFTABLE_GENERIC_H
     ++#define REFTABLE_GENERIC_H
     ++
     ++#include "reftable-iterator.h"
     ++#include "reftable-reader.h"
     ++#include "reftable-merged.h"
     ++
     ++/*
     ++ Provides a unified API for reading tables, either merged tables, or single
     ++ readers.
     ++*/
     ++struct reftable_table {
     ++	struct reftable_table_vtable *ops;
     ++	void *table_arg;
     ++};
     ++
     ++int reftable_table_seek_ref(struct reftable_table *tab,
     ++			    struct reftable_iterator *it, const char *name);
     ++
     ++void reftable_table_from_reader(struct reftable_table *tab,
     ++				struct reftable_reader *reader);
     ++
     ++/* returns the hash ID from a generic reftable_table */
     ++uint32_t reftable_table_hash_id(struct reftable_table *tab);
     ++
     ++/* create a generic table from reftable_merged_table */
     ++void reftable_table_from_merged_table(struct reftable_table *tab,
     ++				      struct reftable_merged_table *table);
     ++
     ++/* returns the max update_index covered by this table. */
     ++uint64_t reftable_table_max_update_index(struct reftable_table *tab);
     ++
     ++/* returns the min update_index covered by this table. */
     ++uint64_t reftable_table_min_update_index(struct reftable_table *tab);
     ++
     ++/* convenience function to read a single ref. Returns < 0 for error, 0
     ++   for success, and 1 if ref not found. */
     ++int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     ++			    struct reftable_ref_record *ref);
     ++
     ++#endif
     +
     + ## reftable/reftable-merged.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#ifndef REFTABLE_MERGED_H
     ++#define REFTABLE_MERGED_H
     ++
     ++#include "reftable-iterator.h"
     ++
     ++/*
     ++ Merged tables
     ++
     ++ A ref database kept in a sequence of table files. The merged_table presents a
     ++ unified view to reading (seeking, iterating) a sequence of immutable tables.
     ++
     ++ The merged tables are on purpose kept disconnected from their actual storage
     ++ (eg. files on disk), because it is useful to merge tables aren't files. For
     ++ example, the per-workspace and global ref namespace can be implemented as a
     ++ merged table of two stacks of file-backed reftables.
     ++*/
     ++
     ++/* A merged table is implements seeking/iterating over a stack of tables. */
     ++struct reftable_merged_table;
     ++
     ++/* A generic reftable; see below. */
     ++struct reftable_table;
     ++
     ++/* reftable_new_merged_table creates a new merged table. It takes ownership of
     ++   the stack array.
     ++*/
     ++int reftable_new_merged_table(struct reftable_merged_table **dest,
     ++			      struct reftable_table *stack, int n,
     ++			      uint32_t hash_id);
     ++
     ++/* returns an iterator positioned just before 'name' */
     ++int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
     ++				   struct reftable_iterator *it,
     ++				   const char *name);
     ++
     ++/* returns an iterator for log entry, at given update_index */
     ++int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
     ++				      struct reftable_iterator *it,
     ++				      const char *name, uint64_t update_index);
     ++
     ++/* like reftable_merged_table_seek_log_at but look for the newest entry. */
     ++int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
     ++				   struct reftable_iterator *it,
     ++				   const char *name);
     ++
     ++/* returns the max update_index covered by this merged table. */
     ++uint64_t
     ++reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
     ++
     ++/* returns the min update_index covered by this merged table. */
     ++uint64_t
     ++reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
     ++
     ++/* releases memory for the merged_table */
     ++void reftable_merged_table_free(struct reftable_merged_table *m);
     ++
     ++/* return the hash ID of the merged table. */
     ++uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
     ++
     ++#endif
     +
     + ## reftable/reftable-stack.h (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#ifndef REFTABLE_STACK_H
     ++#define REFTABLE_STACK_H
     ++
     ++#include "reftable-writer.h"
     ++
     ++/*
     ++  The stack presents an interface to a mutable sequence of reftables.
     ++
     ++  A stack can be mutated by pushing a table to the top of the stack.
     ++
     ++  The reftable_stack automatically compacts files on disk to ensure good
     ++  amortized performance.
     ++*/
     ++struct reftable_stack;
     ++
     ++/* open a new reftable stack. The tables along with the table list will be
     ++   stored in 'dir'. Typically, this should be .git/reftables.
     ++*/
     ++int reftable_new_stack(struct reftable_stack **dest, const char *dir,
     ++		       struct reftable_write_options config);
     ++
     ++/* returns the update_index at which a next table should be written. */
     ++uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
     ++
     ++/* holds a transaction to add tables at the top of a stack. */
     ++struct reftable_addition;
     ++
     ++/*
     ++  returns a new transaction to add reftables to the given stack. As a side
     ++  effect, the ref database is locked.
     ++*/
     ++int reftable_stack_new_addition(struct reftable_addition **dest,
     ++				struct reftable_stack *st);
     ++
     ++/* Adds a reftable to transaction. */
     ++int reftable_addition_add(struct reftable_addition *add,
     ++			  int (*write_table)(struct reftable_writer *wr,
     ++					     void *arg),
     ++			  void *arg);
     ++
     ++/* Commits the transaction, releasing the lock. */
     ++int reftable_addition_commit(struct reftable_addition *add);
     ++
     ++/* Release all non-committed data from the transaction, and deallocate the
     ++   transaction. Releases the lock if held. */
     ++void reftable_addition_destroy(struct reftable_addition *add);
     ++
     ++/* add a new table to the stack. The write_table function must call
     ++   reftable_writer_set_limits, add refs and return an error value. */
     ++int reftable_stack_add(struct reftable_stack *st,
     ++		       int (*write_table)(struct reftable_writer *wr,
     ++					  void *write_arg),
     ++		       void *write_arg);
     ++
     ++/* returns the merged_table for seeking. This table is valid until the
     ++   next write or reload, and should not be closed or deleted.
     ++*/
     ++struct reftable_merged_table *
     ++reftable_stack_merged_table(struct reftable_stack *st);
     ++
     ++/* frees all resources associated with the stack. */
     ++void reftable_stack_destroy(struct reftable_stack *st);
     ++
     ++/* Reloads the stack if necessary. This is very cheap to run if the stack was up
     ++ * to date */
     ++int reftable_stack_reload(struct reftable_stack *st);
     ++
     ++/* Policy for expiring reflog entries. */
     ++struct reftable_log_expiry_config {
     ++	/* Drop entries older than this timestamp */
     ++	uint64_t time;
     ++
     ++	/* Drop older entries */
     ++	uint64_t min_update_index;
     ++};
     ++
     ++/* compacts all reftables into a giant table. Expire reflog entries if config is
     ++ * non-NULL */
     ++int reftable_stack_compact_all(struct reftable_stack *st,
     ++			       struct reftable_log_expiry_config *config);
     ++
     ++/* heuristically compact unbalanced table stack. */
     ++int reftable_stack_auto_compact(struct reftable_stack *st);
     ++
     ++/* convenience function to read a single ref. Returns < 0 for error, 0
     ++   for success, and 1 if ref not found. */
     ++int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
     ++			    struct reftable_ref_record *ref);
     ++
     ++/* convenience function to read a single log. Returns < 0 for error, 0
     ++   for success, and 1 if ref not found. */
     ++int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
     ++			    struct reftable_log_record *log);
     ++
     ++/* statistics on past compactions. */
     ++struct reftable_compaction_stats {
     ++	uint64_t bytes; /* total number of bytes written */
     ++	uint64_t entries_written; /* total number of entries written, including
     ++				     failures. */
     ++	int attempts; /* how often we tried to compact */
     ++	int failures; /* failures happen on concurrent updates */
     ++};
     ++
     ++/* return statistics for compaction up till now. */
     ++struct reftable_compaction_stats *
     ++reftable_stack_compaction_stats(struct reftable_stack *st);
     ++
     ++#endif
     +
     + ## reftable/reftable.c (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#include "record.h"
     ++#include "reader.h"
     ++#include "reftable-iterator.h"
     ++#include "reftable-generic.h"
     ++
     ++static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
     ++				     struct reftable_record *rec)
     ++{
     ++	return reader_seek((struct reftable_reader *)tab, it, rec);
     ++}
     ++
     ++static uint32_t reftable_reader_hash_id_void(void *tab)
     ++{
     ++	return reftable_reader_hash_id((struct reftable_reader *)tab);
     ++}
     ++
     ++static uint64_t reftable_reader_min_update_index_void(void *tab)
     ++{
     ++	return reftable_reader_min_update_index((struct reftable_reader *)tab);
     ++}
     ++
     ++static uint64_t reftable_reader_max_update_index_void(void *tab)
     ++{
     ++	return reftable_reader_max_update_index((struct reftable_reader *)tab);
     ++}
     ++
     ++static struct reftable_table_vtable reader_vtable = {
     ++	.seek_record = reftable_reader_seek_void,
     ++	.hash_id = reftable_reader_hash_id_void,
     ++	.min_update_index = reftable_reader_min_update_index_void,
     ++	.max_update_index = reftable_reader_max_update_index_void,
     ++};
     ++
     ++int reftable_table_seek_ref(struct reftable_table *tab,
     ++			    struct reftable_iterator *it, const char *name)
     ++{
     ++	struct reftable_ref_record ref = {
     ++		.refname = (char *)name,
     ++	};
      +	struct reftable_record rec = { NULL };
     - 	reftable_record_from_ref(&rec, &ref);
     - 	return tab->ops->seek_record(tab->table_arg, it, &rec);
     - }
     -@@ reftable/reftable.c: void reftable_table_from_reader(struct reftable_table *tab,
     - int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     - 			    struct reftable_ref_record *ref)
     - {
     --	struct reftable_iterator it = { 0 };
     ++	reftable_record_from_ref(&rec, &ref);
     ++	return tab->ops->seek_record(tab->table_arg, it, &rec);
     ++}
     ++
     ++void reftable_table_from_reader(struct reftable_table *tab,
     ++				struct reftable_reader *reader)
     ++{
     ++	assert(tab->ops == NULL);
     ++	tab->ops = &reader_vtable;
     ++	tab->table_arg = reader;
     ++}
     ++
     ++int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     ++			    struct reftable_ref_record *ref)
     ++{
      +	struct reftable_iterator it = { NULL };
     - 	int err = reftable_table_seek_ref(tab, &it, name);
     - 	if (err)
     - 		goto done;
     ++	int err = reftable_table_seek_ref(tab, &it, name);
     ++	if (err)
     ++		goto done;
     ++
     ++	err = reftable_iterator_next_ref(&it, ref);
     ++	if (err)
     ++		goto done;
     ++
     ++	if (strcmp(ref->refname, name) ||
     ++	    reftable_ref_record_is_deletion(ref)) {
     ++		reftable_ref_record_release(ref);
     ++		err = 1;
     ++		goto done;
     ++	}
     ++
     ++done:
     ++	reftable_iterator_destroy(&it);
     ++	return err;
     ++}
     ++
     ++uint64_t reftable_table_max_update_index(struct reftable_table *tab)
     ++{
     ++	return tab->ops->max_update_index(tab->table_arg);
     ++}
     ++
     ++uint64_t reftable_table_min_update_index(struct reftable_table *tab)
     ++{
     ++	return tab->ops->min_update_index(tab->table_arg);
     ++}
     ++
     ++uint32_t reftable_table_hash_id(struct reftable_table *tab)
     ++{
     ++	return tab->ops->hash_id(tab->table_arg);
     ++}
      
       ## reftable/stack.c (new) ##
      @@
     @@ reftable/stack.c (new)
      +#include "merged.h"
      +#include "reader.h"
      +#include "refname.h"
     -+#include "reftable.h"
     ++#include "reftable-error.h"
     ++#include "reftable-record.h"
      +#include "writer.h"
      +
      +static int stack_try_add(struct reftable_stack *st,
     @@ reftable/stack.c (new)
      +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
      +					     int reuse_open);
      +
     ++static int reftable_fd_write(void *arg, const void *data, size_t sz)
     ++{
     ++	int *fdp = (int *)arg;
     ++	return write(*fdp, data, sz);
     ++}
     ++
      +int reftable_new_stack(struct reftable_stack **dest, const char *dir,
      +		       struct reftable_write_options config)
      +{
     @@ reftable/stack.c (new)
      +	new_tables = NULL;
      +	st->readers_len = new_readers_len;
      +	if (st->merged != NULL) {
     -+		merged_table_clear(st->merged);
     ++		merged_table_release(st->merged);
      +		reftable_merged_table_free(st->merged);
      +	}
      +	if (st->readers != NULL) {
     @@ reftable/stack.c (new)
      +done:
      +	reftable_iterator_destroy(&it);
      +	if (mt != NULL) {
     -+		merged_table_clear(mt);
     ++		merged_table_release(mt);
      +		reftable_merged_table_free(mt);
      +	}
     -+	reftable_ref_record_clear(&ref);
     -+	reftable_log_record_clear(&log);
     ++	reftable_ref_record_release(&ref);
     ++	reftable_log_record_release(&log);
      +	st->stats.entries_written += entries;
      +	return err;
      +}
     @@ reftable/stack.c (new)
      +
      +done:
      +	if (err) {
     -+		reftable_log_record_clear(log);
     ++		reftable_log_record_release(log);
      +	}
      +	reftable_iterator_destroy(&it);
      +	return err;
     @@ reftable/stack.c (new)
      +
      +done:
      +	for (i = 0; i < len; i++) {
     -+		reftable_ref_record_clear(&refs[i]);
     ++		reftable_ref_record_release(&refs[i]);
      +	}
      +
      +	free(refs);
     @@ reftable/stack.h (new)
      +#ifndef STACK_H
      +#define STACK_H
      +
     -+#include "reftable.h"
      +#include "system.h"
     ++#include "reftable-writer.h"
     ++#include "reftable-stack.h"
      +
      +struct reftable_stack {
      +	char *list_file;
     @@ reftable/stack_test.c (new)
      +#include "basics.h"
      +#include "constants.h"
      +#include "record.h"
     -+#include "reftable.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
      +
      +#include <sys/types.h>
      +#include <dirent.h>
      +
     ++static void clear_dir(const char *dirname)
     ++{
     ++	struct strbuf path = STRBUF_INIT;
     ++	strbuf_addstr(&path, dirname);
     ++	remove_dir_recursively(&path, 0);
     ++	strbuf_release(&path);
     ++}
     ++
      +static void test_read_file(void)
      +{
      +	char fn[256] = "/tmp/stack.test_read_file.XXXXXX";
     @@ reftable/stack_test.c (new)
      +	char *want[] = { "line1", "line2", "line3" };
      +	int i = 0;
      +
     -+	assert(fd > 0);
     ++	EXPECT(fd > 0);
      +	n = write(fd, out, strlen(out));
     -+	assert(n == strlen(out));
     ++	EXPECT(n == strlen(out));
      +	err = close(fd);
     -+	assert(err >= 0);
     ++	EXPECT(err >= 0);
      +
      +	err = read_lines(fn, &names);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 0; names[i] != NULL; i++) {
     -+		assert(0 == strcmp(want[i], names[i]));
     ++		EXPECT(0 == strcmp(want[i], names[i]));
      +	}
      +	free_names(names);
      +	remove(fn);
     @@ reftable/stack_test.c (new)
      +	char **names = NULL;
      +	parse_names(buf, strlen(buf), &names);
      +
     -+	assert(NULL != names[0]);
     -+	assert(0 == strcmp(names[0], "line"));
     -+	assert(NULL == names[1]);
     ++	EXPECT(NULL != names[0]);
     ++	EXPECT(0 == strcmp(names[0], "line"));
     ++	EXPECT(NULL == names[1]);
      +	free_names(names);
      +}
      +
     @@ reftable/stack_test.c (new)
      +	char *b[] = { "a", "b", "d", NULL };
      +	char *c[] = { "a", "b", NULL };
      +
     -+	assert(names_equal(a, a));
     -+	assert(!names_equal(a, b));
     -+	assert(!names_equal(a, c));
     ++	EXPECT(names_equal(a, a));
     ++	EXPECT(!names_equal(a, b));
     ++	EXPECT(!names_equal(a, c));
      +}
      +
      +static int write_test_ref(struct reftable_writer *wr, void *arg)
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_test_ref, &ref);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_ref(st, ref.refname, &dest);
     -+	assert_err(err);
     -+	assert(0 == strcmp("master", dest.target));
     ++	EXPECT_ERR(err);
     ++	EXPECT(0 == strcmp("master", dest.target));
      +
     -+	reftable_ref_record_clear(&dest);
     ++	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_uptodate(void)
     @@ reftable/stack_test.c (new)
      +		.target = "master",
      +	};
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st1, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_new_stack(&st2, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st1, &write_test_ref, &ref1);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st2, &write_test_ref, &ref2);
     -+	assert(err == REFTABLE_LOCK_ERROR);
     ++	EXPECT(err == REFTABLE_LOCK_ERROR);
      +
      +	err = reftable_stack_reload(st2);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st2, &write_test_ref, &ref2);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	reftable_stack_destroy(st1);
      +	reftable_stack_destroy(st2);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_transaction_api(void)
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	reftable_addition_destroy(add);
      +
      +	err = reftable_stack_new_addition(&add, st);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_addition_add(add, &write_test_ref, &ref);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_addition_commit(add);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	reftable_addition_destroy(add);
      +
      +	err = reftable_stack_read_ref(st, ref.refname, &dest);
     -+	assert_err(err);
     -+	assert(0 == strcmp("master", dest.target));
     ++	EXPECT_ERR(err);
     ++	EXPECT(0 == strcmp("master", dest.target));
      +
     -+	reftable_ref_record_clear(&dest);
     ++	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_validate_refname(void)
     @@ reftable/stack_test.c (new)
      +	};
      +	char *additions[] = { "a", "a/b/c" };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_test_ref, &ref);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 0; i < ARRAY_SIZE(additions); i++) {
      +		struct reftable_ref_record ref = {
     @@ reftable/stack_test.c (new)
      +		};
      +
      +		err = reftable_stack_add(st, &write_test_ref, &ref);
     -+		assert(err == REFTABLE_NAME_CONFLICT);
     ++		EXPECT(err == REFTABLE_NAME_CONFLICT);
      +	}
      +
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static int write_error(struct reftable_writer *wr, void *arg)
     @@ reftable/stack_test.c (new)
      +		.update_index = 1,
      +		.target = "master",
      +	};
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_test_ref, &ref1);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_test_ref, &ref2);
     -+	assert(err == REFTABLE_API_ERROR);
     ++	EXPECT(err == REFTABLE_API_ERROR);
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_lock_failure(void)
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err, i;
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
      +		err = reftable_stack_add(st, &write_error, &i);
     -+		assert(err == i);
     ++		EXPECT(err == i);
      +	}
      +
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_add(void)
     @@ reftable/stack_test.c (new)
      +	struct reftable_log_record logs[2] = { { NULL } };
      +	int N = ARRAY_SIZE(refs);
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	st->disable_auto_compact = 1;
      +
      +	for (i = 0; i < N; i++) {
     @@ reftable/stack_test.c (new)
      +
      +	for (i = 0; i < N; i++) {
      +		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +
      +	for (i = 0; i < N; i++) {
     @@ reftable/stack_test.c (new)
      +			.update_index = reftable_stack_next_update_index(st),
      +		};
      +		int err = reftable_stack_add(st, &write_test_log, &arg);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +
      +	err = reftable_stack_compact_all(st, NULL);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 0; i < N; i++) {
      +		struct reftable_ref_record dest = { NULL };
      +
      +		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
     -+		assert_err(err);
     -+		assert(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
     -+		reftable_ref_record_clear(&dest);
     ++		EXPECT_ERR(err);
     ++		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
     ++		reftable_ref_record_release(&dest);
      +	}
      +
      +	for (i = 0; i < N; i++) {
      +		struct reftable_log_record dest = { NULL };
      +		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
     -+		assert_err(err);
     -+		assert(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
     -+		reftable_log_record_clear(&dest);
     ++		EXPECT_ERR(err);
     ++		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
     ++		reftable_log_record_release(&dest);
      +	}
      +
      +	/* cleanup */
      +	reftable_stack_destroy(st);
      +	for (i = 0; i < N; i++) {
     -+		reftable_ref_record_clear(&refs[i]);
     -+		reftable_log_record_clear(&logs[i]);
     ++		reftable_ref_record_release(&refs[i]);
     ++		reftable_log_record_release(&logs[i]);
      +	}
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_log_normalize(void)
     @@ reftable/stack_test.c (new)
      +		.update_index = 1,
      +	};
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	input.message = "one\ntwo";
      +	err = reftable_stack_add(st, &write_test_log, &arg);
     -+	assert(err == REFTABLE_API_ERROR);
     ++	EXPECT(err == REFTABLE_API_ERROR);
      +
      +	input.message = "one";
      +	err = reftable_stack_add(st, &write_test_log, &arg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_log(st, input.refname, &dest);
     -+	assert_err(err);
     -+	assert(0 == strcmp(dest.message, "one\n"));
     ++	EXPECT_ERR(err);
     ++	EXPECT(0 == strcmp(dest.message, "one\n"));
      +
      +	input.message = "two\n";
      +	arg.update_index = 2;
      +	err = reftable_stack_add(st, &write_test_log, &arg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +	err = reftable_stack_read_log(st, input.refname, &dest);
     -+	assert_err(err);
     -+	assert(0 == strcmp(dest.message, "two\n"));
     ++	EXPECT_ERR(err);
     ++	EXPECT(0 == strcmp(dest.message, "two\n"));
      +
      +	/* cleanup */
      +	reftable_stack_destroy(st);
     -+	reftable_log_record_clear(&dest);
     -+	reftable_clear_dir(dir);
     ++	reftable_log_record_release(&dest);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_tombstone(void)
     @@ reftable/stack_test.c (new)
      +	struct reftable_ref_record dest = { NULL };
      +	struct reftable_log_record log_dest = { NULL };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 0; i < N; i++) {
      +		const char *buf = "branch";
     @@ reftable/stack_test.c (new)
      +	}
      +	for (i = 0; i < N; i++) {
      +		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +	for (i = 0; i < N; i++) {
      +		struct write_log_arg arg = {
     @@ reftable/stack_test.c (new)
      +			.update_index = reftable_stack_next_update_index(st),
      +		};
      +		int err = reftable_stack_add(st, &write_test_log, &arg);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +
      +	err = reftable_stack_read_ref(st, "branch", &dest);
     -+	assert(err == 1);
     -+	reftable_ref_record_clear(&dest);
     ++	EXPECT(err == 1);
     ++	reftable_ref_record_release(&dest);
      +
      +	err = reftable_stack_read_log(st, "branch", &log_dest);
     -+	assert(err == 1);
     -+	reftable_log_record_clear(&log_dest);
     ++	EXPECT(err == 1);
     ++	reftable_log_record_release(&log_dest);
      +
      +	err = reftable_stack_compact_all(st, NULL);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_ref(st, "branch", &dest);
     -+	assert(err == 1);
     ++	EXPECT(err == 1);
      +
      +	err = reftable_stack_read_log(st, "branch", &log_dest);
     -+	assert(err == 1);
     -+	reftable_ref_record_clear(&dest);
     -+	reftable_log_record_clear(&log_dest);
     ++	EXPECT(err == 1);
     ++	reftable_ref_record_release(&dest);
     ++	reftable_log_record_release(&log_dest);
      +
      +	/* cleanup */
      +	reftable_stack_destroy(st);
      +	for (i = 0; i < N; i++) {
     -+		reftable_ref_record_clear(&refs[i]);
     -+		reftable_log_record_clear(&logs[i]);
     ++		reftable_ref_record_release(&refs[i]);
     ++		reftable_log_record_release(&logs[i]);
      +	}
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_reftable_stack_hash_id(void)
     @@ reftable/stack_test.c (new)
      +	struct reftable_stack *st_default = NULL;
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_test_ref, &ref);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	/* can't read it with the wrong hash ID. */
      +	err = reftable_new_stack(&st32, dir, cfg32);
     -+	assert(err == REFTABLE_FORMAT_ERROR);
     ++	EXPECT(err == REFTABLE_FORMAT_ERROR);
      +
      +	/* check that we can read it back with default config too. */
      +	err = reftable_new_stack(&st_default, dir, cfg_default);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_ref(st_default, "master", &dest);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
     -+	assert(!strcmp(dest.target, ref.target));
     -+	reftable_ref_record_clear(&dest);
     ++	EXPECT(!strcmp(dest.target, ref.target));
     ++	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
      +	reftable_stack_destroy(st_default);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +static void test_log2(void)
      +{
     -+	assert(1 == fastlog2(3));
     -+	assert(2 == fastlog2(4));
     -+	assert(2 == fastlog2(5));
     ++	EXPECT(1 == fastlog2(3));
     ++	EXPECT(2 == fastlog2(4));
     ++	EXPECT(2 == fastlog2(5));
      +}
      +
      +static void test_sizes_to_segments(void)
     @@ reftable/stack_test.c (new)
      +	int seglen = 0;
      +	struct segment *segs =
      +		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
     -+	assert(segs[2].log == 3);
     -+	assert(segs[2].start == 5);
     -+	assert(segs[2].end == 6);
     ++	EXPECT(segs[2].log == 3);
     ++	EXPECT(segs[2].start == 5);
     ++	EXPECT(segs[2].end == 6);
      +
     -+	assert(segs[1].log == 2);
     -+	assert(segs[1].start == 2);
     -+	assert(segs[1].end == 5);
     ++	EXPECT(segs[1].log == 2);
     ++	EXPECT(segs[1].start == 2);
     ++	EXPECT(segs[1].end == 5);
      +	reftable_free(segs);
      +}
      +
     @@ reftable/stack_test.c (new)
      +	int seglen = 0;
      +	struct segment *segs =
      +		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
     -+	assert(seglen == 0);
     ++	EXPECT(seglen == 0);
      +	reftable_free(segs);
      +}
      +
     @@ reftable/stack_test.c (new)
      +	int seglen = 0;
      +	struct segment *segs =
      +		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
     -+	assert(seglen == 1);
     -+	assert(segs[0].start == 0);
     -+	assert(segs[0].end == 2);
     ++	EXPECT(seglen == 1);
     ++	EXPECT(segs[0].start == 0);
     ++	EXPECT(segs[0].end == 2);
      +	reftable_free(segs);
      +}
      +
     @@ reftable/stack_test.c (new)
      +	/* .................0    1    2  3   4  5  6 */
      +	struct segment min =
      +		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
     -+	assert(min.start == 2);
     -+	assert(min.end == 7);
     ++	EXPECT(min.start == 2);
     ++	EXPECT(min.end == 7);
      +}
      +
      +static void test_suggest_compaction_segment_nothing(void)
     @@ reftable/stack_test.c (new)
      +	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
      +	struct segment result =
      +		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
     -+	assert(result.start == result.end);
     ++	EXPECT(result.start == result.end);
      +}
      +
      +static void test_reflog_expire(void)
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_log_record log = { NULL };
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 1; i <= N; i++) {
      +		char buf[256];
     @@ reftable/stack_test.c (new)
      +			.update_index = reftable_stack_next_update_index(st),
      +		};
      +		int err = reftable_stack_add(st, &write_test_log, &arg);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +	}
      +
      +	err = reftable_stack_compact_all(st, NULL);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_compact_all(st, &expiry);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_log(st, logs[9].refname, &log);
     -+	assert(err == 1);
     ++	EXPECT(err == 1);
      +
      +	err = reftable_stack_read_log(st, logs[11].refname, &log);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	expiry.min_update_index = 15;
      +	err = reftable_stack_compact_all(st, &expiry);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_log(st, logs[14].refname, &log);
     -+	assert(err == 1);
     ++	EXPECT(err == 1);
      +
      +	err = reftable_stack_read_log(st, logs[16].refname, &log);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	/* cleanup */
      +	reftable_stack_destroy(st);
      +	for (i = 0; i <= N; i++) {
     -+		reftable_log_record_clear(&logs[i]);
     ++		reftable_log_record_release(&logs[i]);
      +	}
     -+	reftable_clear_dir(dir);
     -+	reftable_log_record_clear(&log);
     ++	clear_dir(dir);
     ++	reftable_log_record_release(&log);
      +}
      +
      +static int write_nothing(struct reftable_writer *wr, void *arg)
     @@ reftable/stack_test.c (new)
      +	char dir[256] = "/tmp/stack_test.XXXXXX";
      +	struct reftable_stack *st2 = NULL;
      +
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_stack_add(st, &write_nothing, NULL);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	err = reftable_new_stack(&st2, dir, cfg);
     -+	assert_err(err);
     -+	reftable_clear_dir(dir);
     ++	EXPECT_ERR(err);
     ++	clear_dir(dir);
      +	reftable_stack_destroy(st);
      +	reftable_stack_destroy(st2);
      +}
     @@ reftable/stack_test.c (new)
      +	char dir[256] = "/tmp/stack_test.XXXXXX";
      +	int err, i;
      +	int N = 100;
     -+	assert(mkdtemp(dir));
     ++	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
     -+	assert_err(err);
     ++	EXPECT_ERR(err);
      +
      +	for (i = 0; i < N; i++) {
      +		char name[100];
     @@ reftable/stack_test.c (new)
      +		snprintf(name, sizeof(name), "branch%04d", i);
      +
      +		err = reftable_stack_add(st, &write_test_ref, &ref);
     -+		assert_err(err);
     ++		EXPECT_ERR(err);
      +
     -+		assert(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
     ++		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
      +	}
      +
     -+	assert(reftable_stack_compaction_stats(st)->entries_written <
     ++	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
      +	       (uint64_t)(N * fastlog2(N)));
      +
      +	reftable_stack_destroy(st);
     -+	reftable_clear_dir(dir);
     ++	clear_dir(dir);
      +}
      +
      +int stack_test_main(int argc, const char *argv[])
      +{
     -+	add_test_case("test_reftable_stack_uptodate",
     -+		      &test_reftable_stack_uptodate);
     -+	add_test_case("test_reftable_stack_transaction_api",
     -+		      &test_reftable_stack_transaction_api);
     -+	add_test_case("test_reftable_stack_hash_id",
     -+		      &test_reftable_stack_hash_id);
     -+	add_test_case("test_sizes_to_segments_all_equal",
     -+		      &test_sizes_to_segments_all_equal);
     -+	add_test_case("test_reftable_stack_auto_compaction",
     -+		      &test_reftable_stack_auto_compaction);
     -+	add_test_case("test_reftable_stack_validate_refname",
     -+		      &test_reftable_stack_validate_refname);
     -+	add_test_case("test_reftable_stack_update_index_check",
     -+		      &test_reftable_stack_update_index_check);
     -+	add_test_case("test_reftable_stack_lock_failure",
     -+		      &test_reftable_stack_lock_failure);
     -+	add_test_case("test_reftable_stack_log_normalize",
     -+		      &test_reftable_stack_log_normalize);
     -+	add_test_case("test_reftable_stack_tombstone",
     -+		      &test_reftable_stack_tombstone);
     -+	add_test_case("test_reftable_stack_add_one",
     -+		      &test_reftable_stack_add_one);
     -+	add_test_case("test_empty_add", test_empty_add);
     -+	add_test_case("test_reflog_expire", test_reflog_expire);
     -+	add_test_case("test_suggest_compaction_segment",
     -+		      &test_suggest_compaction_segment);
     -+	add_test_case("test_suggest_compaction_segment_nothing",
     -+		      &test_suggest_compaction_segment_nothing);
     -+	add_test_case("test_sizes_to_segments", &test_sizes_to_segments);
     -+	add_test_case("test_sizes_to_segments_empty",
     -+		      &test_sizes_to_segments_empty);
     -+	add_test_case("test_log2", &test_log2);
     -+	add_test_case("test_parse_names", &test_parse_names);
     -+	add_test_case("test_read_file", &test_read_file);
     -+	add_test_case("test_names_equal", &test_names_equal);
     -+	add_test_case("test_reftable_stack_add", &test_reftable_stack_add);
     -+	return test_main(argc, argv);
     ++	test_reftable_stack_uptodate();
     ++	test_reftable_stack_transaction_api();
     ++	test_reftable_stack_hash_id();
     ++	test_sizes_to_segments_all_equal();
     ++	test_reftable_stack_auto_compaction();
     ++	test_reftable_stack_validate_refname();
     ++	test_reftable_stack_update_index_check();
     ++	test_reftable_stack_lock_failure();
     ++	test_reftable_stack_log_normalize();
     ++	test_reftable_stack_tombstone();
     ++	test_reftable_stack_add_one();
     ++	test_empty_add();
     ++	test_reflog_expire();
     ++	test_suggest_compaction_segment();
     ++	test_suggest_compaction_segment_nothing();
     ++	test_sizes_to_segments();
     ++	test_sizes_to_segments_empty();
     ++	test_log2();
     ++	test_parse_names();
     ++	test_read_file();
     ++	test_names_equal();
     ++	test_reftable_stack_add();
     ++	return 0;
      +}
      
     - ## reftable/update.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+set -eu
     -+
     -+# Override this to import from somewhere else, say "../reftable".
     -+SRC=${SRC:-origin}
     -+BRANCH=${BRANCH:-master}
     -+
     -+((git --git-dir reftable-repo/.git fetch -f ${SRC} ${BRANCH}:import && cd reftable-repo && git checkout -f $(git rev-parse import) ) ||
     -+   git clone https://github.com/google/reftable reftable-repo)
     -+
     -+cp reftable-repo/c/*.[ch] reftable/
     -+cp reftable-repo/c/include/*.[ch] reftable/
     -+cp reftable-repo/LICENSE reftable/
     -+
     -+git --git-dir reftable-repo/.git show --no-patch --format=oneline HEAD \
     -+  > reftable/VERSION
     -+
     -+mv reftable/system.h reftable/system.h~
     -+sed 's|if REFTABLE_IN_GITCORE|if 1 /* REFTABLE_IN_GITCORE */|'  < reftable/system.h~ > reftable/system.h
     -+
     -+git add reftable/*.[ch] reftable/LICENSE reftable/VERSION
     -
       ## t/helper/test-reftable.c ##
     -@@
     - int cmd__reftable(int argc, const char **argv)
     +@@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       {
     + 	basics_test_main(argc, argv);
       	block_test_main(argc, argv);
      +	merged_test_main(argc, argv);
       	record_test_main(argc, argv);
      +	refname_test_main(argc, argv);
       	reftable_test_main(argc, argv);
     - 	strbuf_test_main(argc, argv);
      +	stack_test_main(argc, argv);
       	tree_test_main(argc, argv);
       	return 0;
  -:  ---------- > 14:  a590865a70 Reftable support for git-core
  -:  ---------- > 15:  57626bfe2d git-prompt: prepare for reftable refs backend
 13:  c535f838d6 ! 16:  6229da992e reftable: "test-tool dump-reftable" command.
     @@ Metadata
      Author: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Commit message ##
     -    reftable: "test-tool dump-reftable" command.
     +    Add "test-tool dump-reftable" command.
      
          This command dumps individual tables or a stack of of tables.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/writer.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
       
     + REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
      +REFTABLE_TEST_OBJS += reftable/dump.o
     -+REFTABLE_TEST_OBJS += reftable/merged_test.o
     + REFTABLE_TEST_OBJS += reftable/merged_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
       REFTABLE_TEST_OBJS += reftable/refname_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
      
     - ## reftable/iter.c ##
     -@@ reftable/iter.c: void reftable_iterator_destroy(struct reftable_iterator *it)
     - int reftable_iterator_next_ref(struct reftable_iterator *it,
     - 			       struct reftable_ref_record *ref)
     - {
     --	struct reftable_record rec = { 0 };
     -+	struct reftable_record rec = { NULL };
     - 	reftable_record_from_ref(&rec, ref);
     - 	return iterator_next(it, &rec);
     - }
     -@@ reftable/iter.c: int reftable_iterator_next_ref(struct reftable_iterator *it,
     - int reftable_iterator_next_log(struct reftable_iterator *it,
     - 			       struct reftable_log_record *log)
     - {
     --	struct reftable_record rec = { 0 };
     -+	struct reftable_record rec = { NULL };
     - 	reftable_record_from_log(&rec, log);
     - 	return iterator_next(it, &rec);
     - }
     -@@ reftable/iter.c: static int filtering_ref_iterator_next(void *iter_arg,
     - 		}
     + ## reftable/dump.c ##
     +@@ reftable/dump.c: license that can be found in the LICENSE file or at
     + #include <unistd.h>
     + #include <string.h>
       
     - 		if (fri->double_check) {
     --			struct reftable_iterator it = { 0 };
     -+			struct reftable_iterator it = { NULL };
     +-#include "reftable.h"
     ++#include "reftable-blocksource.h"
     ++#include "reftable-error.h"
     ++#include "reftable-merged.h"
     ++#include "reftable-record.h"
     + #include "reftable-tests.h"
     ++#include "reftable-writer.h"
     ++#include "reftable-iterator.h"
     ++#include "reftable-reader.h"
     ++#include "reftable-stack.h"
       
     - 			err = reftable_table_seek_ref(&fri->tab, &it,
     - 						      ref->refname);
     -
     - ## reftable/reader.c ##
     -@@ reftable/reader.c: static int parse_footer(struct reftable_reader *r, uint8_t *footer,
     - int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
     - 		const char *name)
     - {
     --	struct reftable_block footer = { 0 };
     --	struct reftable_block header = { 0 };
     -+	struct reftable_block footer = { NULL };
     -+	struct reftable_block header = { NULL };
     - 	int err = 0;
     + static uint32_t hash_id;
       
     - 	memset(r, 0, sizeof(struct reftable_reader));
     -@@ reftable/reader.c: int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
     + static int dump_table(const char *tablename)
       {
     - 	int32_t guess_block_size = r->block_size ? r->block_size :
     - 							 DEFAULT_BLOCK_SIZE;
     --	struct reftable_block block = { 0 };
     -+	struct reftable_block block = { NULL };
     - 	uint8_t block_typ = 0;
     - 	int err = 0;
     - 	uint32_t header_off = next_off ? 0 : header_size(r->version);
     -@@ reftable/reader.c: static int reader_seek_indexed(struct reftable_reader *r,
     - 			       struct reftable_record *rec)
     +-	struct reftable_block_source src = { 0 };
     ++	struct reftable_block_source src = { NULL };
     + 	int err = reftable_block_source_from_file(&src, tablename);
     +-	struct reftable_iterator it = { 0 };
     +-	struct reftable_ref_record ref = { 0 };
     +-	struct reftable_log_record log = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
     + 	struct reftable_reader *r = NULL;
     + 
     + 	if (err < 0)
     +@@ reftable/dump.c: static int dump_table(const char *tablename)
     + 		reftable_ref_record_print(&ref, hash_id);
     + 	}
     + 	reftable_iterator_destroy(&it);
     +-	reftable_ref_record_clear(&ref);
     ++	reftable_ref_record_release(&ref);
     + 
     + 	err = reftable_reader_seek_log(r, &it, "");
     + 	if (err < 0) {
     +@@ reftable/dump.c: static int dump_table(const char *tablename)
     + 		reftable_log_record_print(&log, hash_id);
     + 	}
     + 	reftable_iterator_destroy(&it);
     +-	reftable_log_record_clear(&log);
     ++	reftable_log_record_release(&log);
     + 
     + 	reftable_reader_free(r);
     + 	return 0;
     +@@ reftable/dump.c: static int dump_stack(const char *stackdir)
       {
     - 	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
     --	struct reftable_record want_index_rec = { 0 };
     -+	struct reftable_record want_index_rec = { NULL };
     - 	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
     --	struct reftable_record index_result_rec = { 0 };
     -+	struct reftable_record index_result_rec = { NULL };
     - 	struct table_iter index_iter = TABLE_ITER_INIT;
     - 	struct table_iter next = TABLE_ITER_INIT;
     - 	int err = 0;
     -@@ reftable/reader.c: int reftable_reader_seek_ref(struct reftable_reader *r,
     - 	struct reftable_ref_record ref = {
     - 		.refname = (char *)name,
     - 	};
     --	struct reftable_record rec = { 0 };
     -+	struct reftable_record rec = { NULL };
     - 	reftable_record_from_ref(&rec, &ref);
     - 	return reader_seek(r, it, &rec);
     - }
     -@@ reftable/reader.c: int reftable_reader_seek_log_at(struct reftable_reader *r,
     - 		.refname = (char *)name,
     - 		.update_index = update_index,
     - 	};
     --	struct reftable_record rec = { 0 };
     -+	struct reftable_record rec = { NULL };
     - 	reftable_record_from_log(&rec, &log);
     - 	return reader_seek(r, it, &rec);
     - }
     -@@ reftable/reader.c: static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
     - 		.hash_prefix = oid,
     - 		.hash_prefix_len = r->object_id_len,
     - 	};
     --	struct reftable_record want_rec = { 0 };
     --	struct reftable_iterator oit = { 0 };
     --	struct reftable_obj_record got = { 0 };
     --	struct reftable_record got_rec = { 0 };
     -+	struct reftable_record want_rec = { NULL };
     -+	struct reftable_iterator oit = { NULL };
     -+	struct reftable_obj_record got = { NULL };
     -+	struct reftable_record got_rec = { NULL };
     - 	int err = 0;
     - 	struct indexed_table_ref_iter *itr = NULL;
     + 	struct reftable_stack *stack = NULL;
     + 	struct reftable_write_options cfg = {};
     +-	struct reftable_iterator it = { 0 };
     +-	struct reftable_ref_record ref = { 0 };
     +-	struct reftable_log_record log = { 0 };
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
     + 	struct reftable_merged_table *merged = NULL;
       
     + 	int err = reftable_new_stack(&stack, stackdir, cfg);
     +@@ reftable/dump.c: static int dump_stack(const char *stackdir)
     + 		reftable_ref_record_print(&ref, hash_id);
     + 	}
     + 	reftable_iterator_destroy(&it);
     +-	reftable_ref_record_clear(&ref);
     ++	reftable_ref_record_release(&ref);
     + 
     + 	err = reftable_merged_table_seek_log(merged, &it, "");
     + 	if (err < 0) {
     +@@ reftable/dump.c: static int dump_stack(const char *stackdir)
     + 		reftable_log_record_print(&log, hash_id);
     + 	}
     + 	reftable_iterator_destroy(&it);
     +-	reftable_log_record_clear(&log);
     ++	reftable_log_record_release(&log);
     + 
     + 	reftable_stack_destroy(stack);
     + 	return 0;
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
     @@ t/helper/test-tool.h: int cmd__dump_cache_tree(int argc, const char **argv);
       int cmd__dump_untracked_cache(int argc, const char **argv);
      +int cmd__dump_reftable(int argc, const char **argv);
       int cmd__example_decorate(int argc, const char **argv);
     + int cmd__fast_rebase(int argc, const char **argv);
       int cmd__genrandom(int argc, const char **argv);
     - int cmd__genzeros(int argc, const char **argv);

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v3 01/16] move sleep_millisec to git-compat-util.h
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 02/16] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
                       ` (15 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The sleep function is defined in wrapper.c, so it makes more sense to be a in
system compatibility header.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 cache.h           | 1 -
 git-compat-util.h | 2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index c0072d43b1..e986cf4ea9 100644
--- a/cache.h
+++ b/cache.h
@@ -1960,7 +1960,6 @@ int stat_validity_check(struct stat_validity *sv, const char *path);
 void stat_validity_update(struct stat_validity *sv, int fd);
 
 int versioncmp(const char *s1, const char *s2);
-void sleep_millisec(int millisec);
 
 /*
  * Create a directory and (if share is nonzero) adjust its permissions
diff --git a/git-compat-util.h b/git-compat-util.h
index adfea06897..7d509c5022 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1354,4 +1354,6 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset)
 	((uintptr_t)&(ptr)->member - (uintptr_t)(ptr))
 #endif /* !__GNUC__ */
 
+void sleep_millisec(int millisec);
+
 #endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 02/16] init-db: set the_repository->hash_algo early on
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 01/16] move sleep_millisec to git-compat-util.h Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-27 10:22       ` Ævar Arnfjörð Bjarmason
  2020-11-26 19:42     ` [PATCH v3 03/16] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                       ` (14 subsequent siblings)
  16 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of hash value, but this is an exceptional circumstance, and
the reftable library does not cater for this exceptional. It insists that all
tables in the stack have a consistent format ID for the hash algorithm.

Call set_repo_hash_algo directly after reading out GIT_DEFAULT_HASH.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 builtin/init-db.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 01bc648d41..5c8c67fec6 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -391,6 +391,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 			die(_("unknown hash algorithm '%s'"), env);
 		repo_fmt->hash_algo = env_algo;
 	}
+	repo_set_hash_algo(the_repository, repo_fmt->hash_algo);
 }
 
 int init_db(const char *git_dir, const char *real_git_dir,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 03/16] reftable: add LICENSE
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 01/16] move sleep_millisec to git-compat-util.h Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 02/16] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-27 10:23       ` Ævar Arnfjörð Bjarmason
  2020-11-26 19:42     ` [PATCH v3 04/16] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
                       ` (13 subsequent siblings)
  16 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 0000000000..402e0f9356
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 04/16] reftable: add error related functionality
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 03/16] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-27  9:13       ` Felipe Contreras
  2020-11-27 10:25       ` Ævar Arnfjörð Bjarmason
  2020-11-26 19:42     ` [PATCH v3 05/16] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                       ` (12 subsequent siblings)
  16 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/error.c          | 39 ++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 0000000000..1d60137ea7
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,39 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 0000000000..7e55a16aac
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ Errors in reftable calls are signaled with negative integer return values. 0
+ means success.
+*/
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data
+	 */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(),  because
+	   it needs special handling in stack.
+	*/
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	   - on writing a record with NULL refname.
+	   - on writing a reftable_ref_record outside the table limits
+	   - on writing a ref or log record before the stack's next_update_index
+	   - on writing a log record with multiline message with
+	   exact_log_message unset
+	   - on reading a reftable_ref_record from log iterator, or vice versa.
+
+	  When a call misuses the API, the internal state of the library is kept
+	  unchanged.
+	*/
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Illegal ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 05/16] reftable: utility functions
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (3 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 04/16] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-27 10:18       ` Felipe Contreras
  2020-11-27 10:33       ` Ævar Arnfjörð Bjarmason
  2020-11-26 19:42     ` [PATCH v3 06/16] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
                       ` (11 subsequent siblings)
  16 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |  24 +++++-
 reftable/basics.c            | 144 +++++++++++++++++++++++++++++++++++
 reftable/basics.h            |  56 ++++++++++++++
 reftable/basics_test.c       |  56 ++++++++++++++
 reftable/compat.h            |  48 ++++++++++++
 reftable/publicbasics.c      |  58 ++++++++++++++
 reftable/reftable-malloc.h   |  20 +++++
 reftable/reftable-tests.h    |  22 ++++++
 reftable/system.h            |  32 ++++++++
 reftable/test_framework.c    |  23 ++++++
 reftable/test_framework.h    |  48 ++++++++++++
 t/helper/test-reftable.c     |   9 +++
 t/helper/test-tool.c         |   3 +-
 t/helper/test-tool.h         |   1 +
 t/t0032-reftable-unittest.sh |  14 ++++
 15 files changed, 554 insertions(+), 4 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/compat.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index dd1cf41e00..5ce78772fd 100644
--- a/Makefile
+++ b/Makefile
@@ -732,6 +732,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -821,6 +822,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1185,7 +1188,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2386,10 +2389,19 @@ XDIFF_OBJS += xdiff/xpatience.o
 XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/publicbasics.o
+
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/basics_test.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
 	$(XDIFF_OBJS) \
 	$(FUZZ_OBJS) \
+	$(REFTABLE_OBJS) \
+	$(REFTABLE_TEST_OBJS) \
 	common-main.o \
 	git.o
 ifndef NO_CURL
@@ -2541,6 +2553,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2822,7 +2840,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3151,7 +3169,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 0000000000..915c55e0ee
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,144 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* invariant: (hi == sz) || f(hi) == true
+	   (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		int val = f(mid, args);
+		if (val) {
+			hi = mid;
+		} else {
+			lo = mid;
+		}
+	}
+
+	if (lo == 0) {
+		if (f(0, args)) {
+			return 0;
+		} else {
+			return 1;
+		}
+	}
+
+	return hi;
+}
+
+void free_names(char **a)
+{
+	char **p = a;
+	if (p == NULL) {
+		return;
+	}
+	while (*p) {
+		reftable_free(*p);
+		p++;
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	int len = 0;
+	char **p = names;
+	while (*p) {
+		p++;
+		len++;
+	}
+	return len;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next != NULL) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(char *));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	if (names_len == names_cap) {
+		names_cap = 2 * names_cap + 1;
+		names = reftable_realloc(names, names_cap * sizeof(char *));
+	}
+
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	while (*a && *b) {
+		if (strcmp(*a, *b)) {
+			return 0;
+		}
+
+		a++;
+		b++;
+	}
+
+	return *a == *b;
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	while (p < a->len && p < b->len) {
+		if (a->buf[p] != b->buf[p]) {
+			break;
+		}
+		p++;
+	}
+
+	return p;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 0000000000..0948926637
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,56 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+/*
+ * miscellaneous utilities that are not provided by Git.
+ */
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+  find smallest index i in [0, sz) at which f(i) is true, assuming
+  that f is ascending. Return sz if f(i) is false for all indices.
+*/
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+  Frees a NULL terminated array of malloced strings. The array itself is also
+  freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+struct strbuf;
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/basics_test.c b/reftable/basics_test.c
new file mode 100644
index 0000000000..593d3e608a
--- /dev/null
+++ b/reftable/basics_test.c
@@ -0,0 +1,56 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			EXPECT(args.key < arr[res]);
+			if (res > 0) {
+				EXPECT(args.key >= arr[res - 1]);
+			}
+		} else {
+			EXPECT(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+int basics_test_main(int argc, const char *argv[])
+{
+	test_binsearch();
+	return 0;
+}
diff --git a/reftable/compat.h b/reftable/compat.h
new file mode 100644
index 0000000000..a765c57e96
--- /dev/null
+++ b/reftable/compat.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef COMPAT_H
+#define COMPAT_H
+
+#include "system.h"
+
+#ifdef REFTABLE_STANDALONE
+
+/* functions that git-core provides, for standalone compilation */
+#include <stdint.h>
+
+uint64_t get_be64(void *in);
+void put_be64(void *out, uint64_t i);
+
+void put_be32(void *out, uint32_t i);
+uint32_t get_be32(uint8_t *in);
+
+uint16_t get_be16(uint8_t *in);
+
+#define ARRAY_SIZE(a) sizeof((a)) / sizeof((a)[0])
+#define FREE_AND_NULL(x)          \
+	do {                      \
+		reftable_free(x); \
+		(x) = NULL;       \
+	} while (0)
+#define QSORT(arr, n, cmp) qsort(arr, n, sizeof(arr[0]), cmp)
+#define SWAP(a, b)                              \
+	{                                       \
+		char tmp[sizeof(a)];            \
+		assert(sizeof(a) == sizeof(b)); \
+		memcpy(&tmp[0], &a, sizeof(a)); \
+		memcpy(&a, &b, sizeof(a));      \
+		memcpy(&b, &tmp[0], sizeof(a)); \
+	}
+
+char *xstrdup(const char *s);
+
+void sleep_millisec(int millisecs);
+
+#endif
+#endif
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 0000000000..25639f61af
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,58 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-malloc.h"
+
+#include "basics.h"
+#include "system.h"
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
new file mode 100644
index 0000000000..f3571b5f4f
--- /dev/null
+++ b/reftable/reftable-malloc.h
@@ -0,0 +1,20 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stddef.h>
+
+/*
+  Overrides the functions to use for memory management.
+ */
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+#endif
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 0000000000..5e7698ae65
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int basics_test_main(int argc, const char **argv);
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 0000000000..07277ca062
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#include <zlib.h>
+
+struct strbuf;
+/* In git, this is declared in dir.h */
+int remove_dir_recursively(struct strbuf *path, int flags);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 0000000000..a5ff4e2a2d
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,23 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "test_framework.h"
+
+#include "basics.h"
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 0000000000..18eec29aba
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+#include "reftable-error.h"
+
+#define EXPECT_ERR(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define EXPECT_STREQ(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define EXPECT(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+void set_test_hash(uint8_t *p, int i);
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 0000000000..3b58e423e7
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,9 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	basics_test_main(argc, argv);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 9d6d14d929..0208a0a41c 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -48,13 +48,14 @@ static struct test_cmd cmds[] = {
 	{ "path-utils", cmd__path_utils },
 	{ "pkt-line", cmd__pkt_line },
 	{ "prio-queue", cmd__prio_queue },
-	{ "proc-receive", cmd__proc_receive},
+	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index a6470ff62c..1de39ce5b5 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -44,6 +44,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh
new file mode 100755
index 0000000000..10e21a0590
--- /dev/null
+++ b/t/t0032-reftable-unittest.sh
@@ -0,0 +1,14 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable unittests'
+
+. ./test-lib.sh
+
+test_expect_success 'unittests' '
+       test-tool reftable
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 06/16] reftable: add blocksource, an abstraction for random access reads
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (4 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 05/16] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 07/16] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                       ` (10 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:

* log blocks are zlib compressed, and handling them is simplified if we can
  discard byte segments from within the block layer.

* for unittests, it is useful to read and write in-memory. The blocksource
  allows us to abstract the data away from on-disk files.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                        |   1 +
 reftable/blocksource.c          | 148 ++++++++++++++++++++++++++++++++
 reftable/blocksource.h          |  22 +++++
 reftable/reftable-blocksource.h |  50 +++++++++++
 4 files changed, 221 insertions(+)
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/reftable-blocksource.h

diff --git a/Makefile b/Makefile
index 5ce78772fd..030b3c0204 100644
--- a/Makefile
+++ b/Makefile
@@ -2391,6 +2391,7 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 0000000000..25d4d95b52
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 0000000000..072e2727ad
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "system.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
new file mode 100644
index 0000000000..fffd19deb2
--- /dev/null
+++ b/reftable/reftable-blocksource.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_BLOCKSOURCE_H
+#define REFTABLE_BLOCKSOURCE_H
+
+#include <stdint.h>
+
+/* block_source is a generic wrapper for a seekable readable file.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+   so it can return itself into the pool.
+*/
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 07/16] reftable: (de)serialization for the polymorphic record type.
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (5 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 06/16] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 08/16] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                       ` (9 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |    2 +
 reftable/constants.h       |   21 +
 reftable/record.c          | 1116 ++++++++++++++++++++++++++++++++++++
 reftable/record.h          |  139 +++++
 reftable/record_test.c     |  404 +++++++++++++
 reftable/reftable-record.h |   73 +++
 t/helper/test-reftable.c   |    2 +-
 7 files changed, 1756 insertions(+), 1 deletion(-)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/reftable-record.h

diff --git a/Makefile b/Makefile
index 030b3c0204..199aec9267 100644
--- a/Makefile
+++ b/Makefile
@@ -2393,7 +2393,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 0000000000..5eee72c4c1
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 0000000000..dbe7a5571e
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1116 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable-error.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	   fields. */
+	reftable_ref_record_release(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+
+	if (src->target != NULL) {
+		ref->target = xstrdup(src->target);
+	}
+
+	if (src->target_value != NULL) {
+		ref->target_value = reftable_malloc(hash_size);
+		memcpy(ref->target_value, src->target_value, hash_size);
+	}
+
+	if (src->value != NULL) {
+		ref->value = reftable_malloc(hash_size);
+		memcpy(ref->value, src->value, hash_size);
+	}
+	ref->update_index = src->update_index;
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	if (ref->value != NULL) {
+		hex_format(hex, ref->value, hash_size(hash_id));
+		printf("%s", hex);
+	}
+	if (ref->target_value != NULL) {
+		hex_format(hex, ref->target_value, hash_size(hash_id));
+		printf(" (T %s)", hex);
+	}
+	if (ref->target != NULL) {
+		printf("=> %s", ref->target);
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_release_void(void *rec)
+{
+	reftable_ref_record_release((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_release(struct reftable_ref_record *ref)
+{
+	reftable_free(ref->refname);
+	reftable_free(ref->target);
+	reftable_free(ref->target_value);
+	reftable_free(ref->value);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	if (r->value != NULL) {
+		if (r->target_value != NULL) {
+			return 2;
+		} else {
+			return 1;
+		}
+	} else if (r->target != NULL)
+		return 3;
+	return 0;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (r->value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target_value != NULL) {
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->target_value, hash_size);
+		string_view_consume(&s, hash_size);
+	}
+
+	if (r->target != NULL) {
+		int n = encode_string(r->target, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	int seen_value = 0;
+	int seen_target_value = 0;
+	int seen_target = 0;
+
+	int n = get_var_int(&r->update_index, &in);
+	if (n < 0)
+		return n;
+	assert(hash_size > 0);
+
+	string_view_consume(&in, n);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->refname[key.len] = 0;
+
+	switch (val_type) {
+	case 1:
+	case 2:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		if (r->value == NULL) {
+			r->value = reftable_malloc(hash_size);
+		}
+		seen_value = 1;
+		memcpy(r->value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		if (val_type == 1) {
+			break;
+		}
+		if (r->target_value == NULL) {
+			r->target_value = reftable_malloc(hash_size);
+		}
+		seen_target_value = 1;
+		memcpy(r->target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+	case 3: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		seen_target = 1;
+		if (r->target != NULL) {
+			reftable_free(r->target);
+		}
+		r->target = dest.buf;
+	} break;
+
+	case 0:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	if (!seen_target && r->target != NULL) {
+		FREE_AND_NULL(r->target);
+	}
+	if (!seen_target_value && r->target_value != NULL) {
+		FREE_AND_NULL(r->target_value);
+	}
+	if (!seen_value && r->value != NULL) {
+		FREE_AND_NULL(r->value);
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.release = &reftable_ref_record_release_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_release(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_release(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.release = &reftable_obj_record_release,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname,
+	       log->update_index, log->name, log->email, log->time,
+	       log->tz_offset);
+	hex_format(hex, log->old_hash, hash_size(hash_id));
+	printf("%s => ", hex);
+	hex_format(hex, log->new_hash, hash_size(hash_id));
+	printf("%s\n\n%s\n}\n", hex, log->message);
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_release(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	if (dst->email != NULL) {
+		dst->email = xstrdup(dst->email);
+	}
+	if (dst->name != NULL) {
+		dst->name = xstrdup(dst->name);
+	}
+	if (dst->message != NULL) {
+		dst->message = xstrdup(dst->message);
+	}
+
+	if (dst->new_hash != NULL) {
+		dst->new_hash = reftable_malloc(hash_size);
+		memcpy(dst->new_hash, src->new_hash, hash_size);
+	}
+	if (dst->old_hash != NULL) {
+		dst->old_hash = reftable_malloc(hash_size);
+		memcpy(dst->old_hash, src->old_hash, hash_size);
+	}
+}
+
+static void reftable_log_record_release_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_release(r);
+}
+
+void reftable_log_record_release(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	reftable_free(r->new_hash);
+	reftable_free(r->old_hash);
+	reftable_free(r->name);
+	reftable_free(r->email);
+	reftable_free(r->message);
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = r->old_hash;
+	uint8_t *newh = r->new_hash;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->name ? r->name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->email ? r->email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->message ? r->message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type == 0) {
+		FREE_AND_NULL(r->old_hash);
+		FREE_AND_NULL(r->new_hash);
+		FREE_AND_NULL(r->message);
+		FREE_AND_NULL(r->email);
+		FREE_AND_NULL(r->name);
+		return 0;
+	}
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->old_hash = reftable_realloc(r->old_hash, hash_size);
+	r->new_hash = reftable_realloc(r->new_hash, hash_size);
+
+	memcpy(r->old_hash, in.buf, hash_size);
+	memcpy(r->new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->name = reftable_realloc(r->name, dest.len + 1);
+	memcpy(r->name, dest.buf, dest.len);
+	r->name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->email = reftable_realloc(r->email, dest.len + 1);
+	memcpy(r->email, dest.buf, dest.len);
+	r->email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->message = reftable_realloc(r->message, dest.len + 1);
+	memcpy(r->message, dest.buf, dest.len);
+	r->message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	return null_streq(a->name, b->name) && null_streq(a->email, b->email) &&
+	       null_streq(a->message, b->message) &&
+	       zero_hash_eq(a->old_hash, b->old_hash, hash_size) &&
+	       zero_hash_eq(a->new_hash, b->new_hash, hash_size) &&
+	       a->time == b->time && a->tz_offset == b->tz_offset &&
+	       a->update_index == b->update_index;
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.release = &reftable_log_record_release_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_release(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_release(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.release = &reftable_index_record_release,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_release(struct reftable_record *rec)
+{
+	rec->ops->release(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+static int str_equal(char *a, char *b)
+{
+	if (a != NULL && b != NULL)
+		return 0 == strcmp(a, b);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	return 0 == strcmp(a->refname, b->refname) &&
+	       a->update_index == b->update_index &&
+	       hash_equal(a->value, b->value, hash_size) &&
+	       hash_equal(a->target_value, b->target_value, hash_size) &&
+	       str_equal(a->target, b->target);
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value == NULL && ref->target == NULL &&
+	       ref->target_value == NULL;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->new_hash == NULL && log->old_hash == NULL &&
+		log->name == NULL && log->email == NULL &&
+		log->message == NULL && log->time == 0 && log->tz_offset == 0 &&
+		log->message == NULL);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 0000000000..c78b62e564
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,139 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "system.h"
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+  A substring of existing string data. This structure takes no responsibility
+  for the lifetime of the data it points to.
+*/
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*release)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+   number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_release(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 0000000000..6dad76b0ac
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,404 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		EXPECT(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		EXPECT(n > 0);
+
+		EXPECT(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = 0; i <= 3; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = {
+			.refname = xstrdup("old name"),
+			.value = reftable_calloc(SHA1_SIZE),
+			.target_value = reftable_calloc(SHA1_SIZE),
+			.target = xstrdup("old value"),
+		};
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		switch (i) {
+		case 0:
+			break;
+		case 1:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			break;
+		case 2:
+			in.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value, 1);
+			in.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.target_value, 2);
+			break;
+		case 3:
+			in.target = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		EXPECT(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT((out.value != NULL) == (in.value != NULL));
+		EXPECT((out.target_value != NULL) == (in.target_value != NULL));
+		EXPECT((out.target != NULL) == (in.target != NULL));
+		reftable_record_release(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_release(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_release(&in[0]);
+	reftable_log_record_release(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.old_hash = reftable_malloc(SHA1_SIZE),
+			.new_hash = reftable_malloc(SHA1_SIZE),
+			.name = xstrdup("han-wen"),
+			.email = xstrdup("hanwen@google.com"),
+			.message = xstrdup("test"),
+			.update_index = 42,
+			.time = 1577123507,
+			.tz_offset = 100,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+	set_test_hash(in[0].new_hash, 1);
+	set_test_hash(in[0].old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.new_hash = reftable_calloc(SHA1_SIZE),
+			.old_hash = reftable_calloc(SHA1_SIZE),
+			.name = xstrdup("old name"),
+			.email = xstrdup("old@email"),
+			.message = xstrdup("old message"),
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_release(&in[i]);
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	EXPECT(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	EXPECT(!restart);
+	EXPECT(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	EXPECT(n == m);
+	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
+	EXPECT(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
+		EXPECT(in.offset_len == out.offset_len);
+
+		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		EXPECT(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	EXPECT(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	EXPECT(m == n);
+
+	EXPECT(in.offset == out.offset);
+
+	reftable_record_release(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	test_reftable_log_record_equal();
+	test_reftable_log_record_roundtrip();
+	test_reftable_ref_record_roundtrip();
+	test_varint_roundtrip();
+	test_key_roundtrip();
+	test_common_prefix();
+	test_reftable_obj_record_roundtrip();
+	test_reftable_index_record_roundtrip();
+	test_u24_roundtrip();
+	return 0;
+}
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
new file mode 100644
index 0000000000..7057ceba35
--- /dev/null
+++ b/reftable/reftable-record.h
@@ -0,0 +1,73 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_RECORD_H
+#define REFTABLE_RECORD_H
+
+#include <stdint.h>
+
+/*
+ Basic data types
+
+ Reftables store the state of each ref in struct reftable_ref_record, and they
+ store a sequence of reflog updates in struct reftable_log_record.
+*/
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				  written */
+	uint8_t *value; /* SHA1, or NULL. malloced. */
+	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
+	char *target; /* symref, or NULL. malloced. */
+};
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values. */
+void reftable_ref_record_release(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+	uint8_t *new_hash;
+	uint8_t *old_hash;
+	char *name;
+	char *email;
+	uint64_t time;
+	int16_t tz_offset;
+	char *message;
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_release(struct reftable_log_record *log);
+
+/* returns whether two records are equal. Useful for testing. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b58e423e7..09d4b83ef9 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,6 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
-
+	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 08/16] reftable: reading/writing blocks
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (6 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 07/16] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 09/16] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                       ` (8 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.

This commit provides the logic to read and write these blocks.

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 440 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 129 ++++++++++++
 reftable/block_test.c    | 119 +++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 785 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index 199aec9267..fcfe4075c3 100644
--- a/Makefile
+++ b/Makefile
@@ -2391,10 +2391,13 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 0000000000..f44451a379
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 0000000000..5b4e49fb2b
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,440 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_release(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 0000000000..85dfd493d2
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,129 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable-blocksource.h"
+
+/*
+  Writes reftable blocks. The block_writer is reused across blocks to minimize
+  allocation overhead.
+*/
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+  initializes the blockwriter to write `typ` entries, using `buf` as temporary
+  storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/*
+  returns the block type (eg. 'r' for ref records.
+*/
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_release(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 0000000000..597aa10b34
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,119 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value = hash;
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value = NULL;
+		EXPECT(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	EXPECT(n > 0);
+
+	block_writer_release(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT_STREQ(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_release(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+
+		EXPECT_STREQ(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_release(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	test_block_read_write();
+	return 0;
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 0000000000..3e0b0f24f1
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 09d4b83ef9..c9deeaf08c 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,7 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 09/16] reftable: a generic binary tree implementation
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (7 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 08/16] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 10/16] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                       ` (7 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.

The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  4 ++-
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 61 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index fcfe4075c3..89564d4c06 100644
--- a/Makefile
+++ b/Makefile
@@ -2395,12 +2395,14 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
-REFTABLE_TEST_OBJS += reftable/basics_test.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 0000000000..0061d14e30
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 0000000000..954512e9a3
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+   is set, insert the key if it's not found. Else, return NULL.
+*/
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+  deallocates the tree nodes recursively. Keys should be deallocated separately
+  by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 0000000000..6de9c9dbe5
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,61 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	test_tree();
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c9deeaf08c..050551fa69 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 10/16] reftable: write reftable files
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (8 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 09/16] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 11/16] reftable: read " Han-Wen Nienhuys via GitGitGadget
                       ` (6 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   1 +
 reftable/reftable-writer.h | 149 ++++++++
 reftable/writer.c          | 676 +++++++++++++++++++++++++++++++++++++
 reftable/writer.h          |  50 +++
 4 files changed, 876 insertions(+)
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 89564d4c06..3e33a409f0 100644
--- a/Makefile
+++ b/Makefile
@@ -2396,6 +2396,7 @@ REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
new file mode 100644
index 0000000000..b736614684
--- /dev/null
+++ b/reftable/reftable-writer.h
@@ -0,0 +1,149 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_WRITER_H
+#define REFTABLE_WRITER_H
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+ Writing single reftables
+*/
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	unsigned unpadded : 1;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	unsigned skip_index_objects : 1;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	unsigned skip_name_check : 1;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	unsigned exact_log_message : 1;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* Set the range of update indices for the records we will add. When writing a
+   table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates to a stack, typically min==max, and the
+   update_index can be obtained by inspeciting the stack. When converting an
+   existing ref database into a single reftable, this would be a range of
+   update-index timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/*
+  Add a reftable_ref_record. The record should have names that come after
+  already added records.
+
+  The update_index must be within the limits set by
+  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
+  REFTABLE_API_ERROR error to write a ref record after a log record.
+*/
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/*
+  Convenience function to add multiple reftable_ref_records; the function sorts
+  the records before adding them, reordering the records array passed in.
+*/
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/*
+  adds reftable_log_records. Log records are keyed by (refname, decreasing
+  update_index). The key for the record added must come after the already added
+  log records.
+*/
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/*
+  Convenience function to add multiple reftable_log_records; the function sorts
+  the records before adding them, reordering records array passed in.
+*/
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+#endif
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 0000000000..8f30e7a114
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,676 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "tree.h"
+#include "reftable-error.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects && ref->value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)ref->value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects && ref->target_value != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, ref->target_value, hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	char *input_log_message = log->message;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err;
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (!w->opts.exact_log_message && log->message != NULL) {
+		strbuf_addstr(&cleaned_message, log->message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->message = cleaned_message.buf;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	err = writer_add_record(w, &rec);
+
+done:
+	log->message = input_log_message;
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_release(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 0000000000..f2ac732358
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "tree.h"
+#include "reftable-writer.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	  tree for use with tsearch; used to populate the 'o' inverse OID
+	  map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 11/16] reftable: read reftable files
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (9 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 10/16] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 12/16] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
                       ` (5 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This supports reading a single reftable file.

The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |   3 +
 reftable/iter.c              | 242 ++++++++++++
 reftable/iter.h              |  75 ++++
 reftable/reader.c            | 733 +++++++++++++++++++++++++++++++++++
 reftable/reader.h            |  75 ++++
 reftable/reftable-iterator.h |  37 ++
 reftable/reftable-reader.h   |  89 +++++
 7 files changed, 1254 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-reader.h

diff --git a/Makefile b/Makefile
index 3e33a409f0..51dbcb1b9c 100644
--- a/Makefile
+++ b/Makefile
@@ -2393,8 +2393,11 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 0000000000..63d87b758c
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,242 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable-error.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { NULL };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if ((ref->target_value != NULL &&
+		     !memcmp(fri->oid.buf, ref->target_value, fri->oid.len)) ||
+		    (ref->value != NULL &&
+		     !memcmp(fri->oid.buf, ref->value, fri->oid.len))) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_release(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+
+		if (!memcmp(it->oid.buf, ref->target_value, it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 0000000000..258bc107af
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "system.h"
+#include "block.h"
+#include "record.h"
+
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+   iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+   but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 0000000000..a5b7b1d4fe
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,733 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { NULL };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { NULL };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { NULL };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_release(&want_index_rec);
+	reftable_record_release(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_release(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 0000000000..6d4927e1c5
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+#endif
diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h
new file mode 100644
index 0000000000..a3d6be7ae0
--- /dev/null
+++ b/reftable/reftable-iterator.h
@@ -0,0 +1,37 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ITERATOR_H
+#define REFTABLE_ITERATOR_H
+
+#include "reftable-record.h"
+
+/* iterator is the generic interface for walking over data stored in a
+   reftable.
+*/
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+   end of iteration.
+*/
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+#endif
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
new file mode 100644
index 0000000000..827957a8fb
--- /dev/null
+++ b/reftable/reftable-reader.h
@@ -0,0 +1,89 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_READER_H
+#define REFTABLE_READER_H
+
+#include "reftable-iterator.h"
+#include "reftable-blocksource.h"
+
+/*
+ Reading single tables
+
+ The follow routines are for reading single files. For an application-level
+ interface, skip ahead to struct reftable_merged_table and struct
+ reftable_stack.
+*/
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
+ * code and sets pp. The name is used for creating a stack. Typically, it is the
+ * basename of the file. The block source `src` is owned by the reader, and is
+ * closed on calling reftable_reader_destroy().
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+     err = reftable_iterator_next_ref(&it, &ref);
+     if (err > 0) {
+       break;
+     }
+     if (err < 0) {
+       ..error handling..
+     }
+     ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_release(&ref);
+ */
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+ */
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 12/16] reftable: reftable file level tests
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (10 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 11/16] reftable: read " Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 13/16] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
                       ` (4 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.

Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 577 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 579 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index 51dbcb1b9c..b20acdac0f 100644
--- a/Makefile
+++ b/Makefile
@@ -2405,6 +2405,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 0000000000..a6fe633b1d
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,577 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-stack.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.value = hash;
+		ref.update_index = update_index;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.new_hash = hash;
+		log.update_index = update_index;
+		log.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = {
+		.refname = "refs/heads/master",
+		.name = "Han-Wen Nienhuys",
+		.email = "hanwen@google.com",
+		.tz_offset = 100,
+		.time = 0x5e430672,
+		.update_index = 0xa,
+		.message = "commit: 9\n",
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.old_hash = hash1;
+	log.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.old_hash = hash1;
+		log.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(i == ref.value[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value = hash1;
+		ref.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	test_log_write_read();
+	test_table_read_write_seek_linear_sha256();
+	test_log_buffer_size();
+	test_table_write_small_table();
+	test_buffer();
+	test_table_read_api();
+	test_table_read_write_sequential();
+	test_table_read_write_seek_linear();
+	test_table_read_write_seek_index();
+	test_table_refs_for_no_index();
+	test_table_refs_for_obj_index();
+	test_table_empty();
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa69..fdf9258673 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 13/16] reftable: rest of library
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (11 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 12/16] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 14/16] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
                       ` (3 subsequent siblings)
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                    |    4 +
 reftable/VERSION            |    1 +
 reftable/dump.c             |  212 ++++++
 reftable/merged.c           |  366 ++++++++++
 reftable/merged.h           |   35 +
 reftable/merged_test.c      |  333 ++++++++++
 reftable/pq.c               |  115 ++++
 reftable/pq.h               |   32 +
 reftable/refname.c          |  209 ++++++
 reftable/refname.h          |   29 +
 reftable/refname_test.c     |  100 +++
 reftable/reftable-generic.h |   49 ++
 reftable/reftable-merged.h  |   68 ++
 reftable/reftable-stack.h   |  116 ++++
 reftable/reftable.c         |   98 +++
 reftable/stack.c            | 1247 +++++++++++++++++++++++++++++++++++
 reftable/stack.h            |   41 ++
 reftable/stack_test.c       |  781 ++++++++++++++++++++++
 t/helper/test-reftable.c    |    3 +
 19 files changed, 3839 insertions(+)
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c

diff --git a/Makefile b/Makefile
index b20acdac0f..278c489da4 100644
--- a/Makefile
+++ b/Makefile
@@ -2394,10 +2394,14 @@ REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/merged.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/VERSION b/reftable/VERSION
new file mode 100644
index 0000000000..7553c395eb
--- /dev/null
+++ b/reftable/VERSION
@@ -0,0 +1 @@
+b8ec0f74c74cb6752eb2033ad8e755a9c19aad15 C: add missing header
diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 0000000000..00b444e8c9
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { 0 };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 0000000000..ed02625bb5
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,366 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable-merged.h"
+#include "reftable-error.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_release(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_release(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_release(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int reftable_table_seek_record(struct reftable_table *tab,
+				      struct reftable_iterator *it,
+				      struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 0000000000..8c4d4d58d7
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,35 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_release(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 0000000000..fbb2e5ac10
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,333 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-merged.h"
+#include "reftable-tests.h"
+#include "reftable-generic.h"
+#include "reftable-stack.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		EXPECT_ERR(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	EXPECT_ERR(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+	EXPECT(ref.update_index == 2);
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = { {
+						    .refname = "a",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "b",
+						    .update_index = 1,
+						    .value = hash1,
+					    },
+					    {
+						    .refname = "c",
+						    .update_index = 1,
+						    .value = hash1,
+					    } };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	EXPECT_ERR(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	EXPECT_ERR(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	test_merged_between();
+	test_pq();
+	test_merged();
+	test_default_write_opts();
+	return 0;
+}
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 0000000000..8918d158e2
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 0000000000..385d2fb139
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 0000000000..0f4eb3b292
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable-error.h"
+#include "basics.h"
+#include "refname.h"
+#include "reftable-iterator.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static void modification_release(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_release(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 0000000000..a24b40fcb4
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,29 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable-record.h"
+#include "reftable-generic.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 0000000000..af04bc9a59
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-writer.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.target = "destination", /* make sure it's not a symref. */
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		EXPECT(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	test_conflict();
+	return 0;
+}
diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h
new file mode 100644
index 0000000000..964d25f040
--- /dev/null
+++ b/reftable/reftable-generic.h
@@ -0,0 +1,49 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_GENERIC_H
+#define REFTABLE_GENERIC_H
+
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-merged.h"
+
+/*
+ Provides a unified API for reading tables, either merged tables, or single
+ readers.
+*/
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+#endif
diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h
new file mode 100644
index 0000000000..5ee16aa02a
--- /dev/null
+++ b/reftable/reftable-merged.h
@@ -0,0 +1,68 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_MERGED_H
+#define REFTABLE_MERGED_H
+
+#include "reftable-iterator.h"
+
+/*
+ Merged tables
+
+ A ref database kept in a sequence of table files. The merged_table presents a
+ unified view to reading (seeking, iterating) a sequence of immutable tables.
+
+ The merged tables are on purpose kept disconnected from their actual storage
+ (eg. files on disk), because it is useful to merge tables aren't files. For
+ example, the per-workspace and global ref namespace can be implemented as a
+ merged table of two stacks of file-backed reftables.
+*/
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+#endif
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
new file mode 100644
index 0000000000..140a28b790
--- /dev/null
+++ b/reftable/reftable-stack.h
@@ -0,0 +1,116 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_STACK_H
+#define REFTABLE_STACK_H
+
+#include "reftable-writer.h"
+
+/*
+  The stack presents an interface to a mutable sequence of reftables.
+
+  A stack can be mutated by pushing a table to the top of the stack.
+
+  The reftable_stack automatically compacts files on disk to ensure good
+  amortized performance.
+*/
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+   stored in 'dir'. Typically, this should be .git/reftables.
+*/
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+  returns a new transaction to add reftables to the given stack. As a side
+  effect, the ref database is locked.
+*/
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+   transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+   reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+   next write or reload, and should not be closed or deleted.
+*/
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 0000000000..dc4fd03d5b
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+#include "reader.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 0000000000..1d632937d7
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1247 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-record.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+static int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		for (i = 0; i < st->readers_len; i++) {
+			reftable_reader_free(st->readers[i]);
+		}
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			strbuf_addstr(&table_path, st->reftable_dir);
+			strbuf_addstr(&table_path, "/");
+			strbuf_addstr(&table_path, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_release(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64, min, max);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+	char **names;
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_reset(&nm);
+		strbuf_addstr(&nm, add->stack->list_file);
+		strbuf_addstr(&nm, "/");
+		strbuf_addstr(&nm, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	free_names(add->names);
+	add->names = NULL;
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = reftable_stack_reload(add->stack);
+
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&temp_tab_file_name, "/");
+	strbuf_addbuf(&temp_tab_file_name, &next_name);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&tab_file_name, "/");
+	strbuf_addbuf(&tab_file_name, &next_name);
+
+	/* TODO: should check destination out of paranoia */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[first]));
+
+	strbuf_reset(temp_tab);
+	strbuf_addstr(temp_tab, st->reftable_dir);
+	strbuf_addstr(temp_tab, "/");
+	strbuf_addbuf(temp_tab, &next_name);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.time < config->time) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_release(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_release(&ref);
+	reftable_log_record_release(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
+		strbuf_addstr(&subtab_file_name, "/");
+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	strbuf_reset(&new_table_path);
+	strbuf_addstr(&new_table_path, st->reftable_dir);
+	strbuf_addstr(&new_table_path, "/");
+	strbuf_addbuf(&new_table_path, &new_table_name);
+
+	if (!is_empty_table) {
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_release(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 0000000000..f57005846e
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "system.h"
+#include "reftable-writer.h"
+#include "reftable-stack.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 0000000000..11d3d30799
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,781 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+static void test_read_file(void)
+{
+	char fn[256] = "/tmp/stack.test_read_file.XXXXXX";
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	EXPECT(fd > 0);
+	n = write(fd, out, strlen(out));
+	EXPECT(n == strlen(out));
+	err = close(fd);
+	EXPECT(err >= 0);
+
+	err = read_lines(fn, &names);
+	EXPECT_ERR(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		EXPECT(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	EXPECT(NULL != names[0]);
+	EXPECT(0 == strcmp(names[0], "line"));
+	EXPECT(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	EXPECT(names_equal(a, a));
+	EXPECT(!names_equal(a, b));
+	EXPECT(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.target = "master",
+	};
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT_ERR(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_commit(add);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.target));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.target = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.target = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.target = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.target = "master",
+	};
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		EXPECT(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].value = reftable_malloc(SHA1_SIZE);
+		refs[i].update_index = i + 1;
+		set_test_hash(refs[i].value, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_release(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_release(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = {
+		.refname = "branch",
+		.update_index = 1,
+		.new_hash = h1,
+		.old_hash = h2,
+	};
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	input.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	input.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.message, "one\n"));
+
+	input.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_release(&dest);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value, i);
+		}
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].new_hash, i);
+			logs[i].email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_log_record_release(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+	reftable_log_record_release(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.target = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	EXPECT_ERR(err);
+
+	EXPECT(!strcmp(dest.target, ref.target));
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	EXPECT(1 == fastlog2(3));
+	EXPECT(2 == fastlog2(4));
+	EXPECT(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(segs[2].log == 3);
+	EXPECT(segs[2].start == 5);
+	EXPECT(segs[2].end == 6);
+
+	EXPECT(segs[1].log == 2);
+	EXPECT(segs[1].start == 2);
+	EXPECT(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	uint64_t sizes[0];
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 1);
+	EXPECT(segs[0].start == 0);
+	EXPECT(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(min.start == 2);
+	EXPECT(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char dir[256] = "/tmp/stack.test_reflog_expire.XXXXXX";
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].time = i;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	EXPECT_ERR(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	EXPECT_ERR(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+	reftable_log_record_release(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	struct reftable_stack *st2 = NULL;
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+	clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char dir[256] = "/tmp/stack_test.XXXXXX";
+	int err, i;
+	int N = 100;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.target = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+
+		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	test_reftable_stack_uptodate();
+	test_reftable_stack_transaction_api();
+	test_reftable_stack_hash_id();
+	test_sizes_to_segments_all_equal();
+	test_reftable_stack_auto_compaction();
+	test_reftable_stack_validate_refname();
+	test_reftable_stack_update_index_check();
+	test_reftable_stack_lock_failure();
+	test_reftable_stack_log_normalize();
+	test_reftable_stack_tombstone();
+	test_reftable_stack_add_one();
+	test_empty_add();
+	test_reflog_expire();
+	test_suggest_compaction_segment();
+	test_suggest_compaction_segment_nothing();
+	test_sizes_to_segments();
+	test_sizes_to_segments_empty();
+	test_log2();
+	test_parse_names();
+	test_read_file();
+	test_names_equal();
+	test_reftable_stack_add();
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index fdf9258673..3b702f4855 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,8 +5,11 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 14/16] Reftable support for git-core
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (12 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 13/16] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-11-27 10:59       ` Ævar Arnfjörð Bjarmason
  2020-11-26 19:42     ` [PATCH v3 15/16] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
                       ` (2 subsequent siblings)
  16 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a reftable-powered
ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or setting
GIT_TEST_REFTABLE in the environment.

Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Jeff King <peff@peff.net>
---
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    4 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1418 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  199 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 20 files changed, 1771 insertions(+), 33 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ff..7257623583 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index 278c489da4..ec91fa14f4 100644
--- a/Makefile
+++ b/Makefile
@@ -978,6 +978,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
@@ -2408,8 +2409,11 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..974a374e9c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1138,7 +1138,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1273,7 +1273,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo, 1);
+		initialize_repository_version(hash_algo, 1,
+					      default_ref_storage());
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 5c8c67fec6..8b5873ec56 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -179,12 +179,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    !strcmp(ref_storage_format, "reftable"))
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -237,6 +239,7 @@ static int create_default_files(const char *template_path,
 	is_bare_repository_cfg = init_is_bare_repository;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -246,6 +249,24 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
+	/*
+	 * Check to see if .git/HEAD exists; this must happen before
+	 * initializing the ref db, because we want to see if there is an
+	 * existing HEAD.
+	 */
+	path = git_path_buf(&buf, "HEAD");
+	reinit = (!access(path, R_OK) ||
+		  readlink(path, junk, sizeof(junk) - 1) != -1);
+
+	/*
+	 * refs/heads is a file when using reftable. We can't reinitialize with
+	 * a reftable because it will overwrite HEAD
+	 */
+	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
+			      is_directory(git_path_buf(&buf, "refs/heads"))) {
+		die("cannot switch ref storage format.");
+	}
+
 	/*
 	 * We need to create a "refs" dir in any case so that older
 	 * versions of git can tell that this is a repository.
@@ -260,9 +281,6 @@ static int create_default_files(const char *template_path,
 	 * Point the HEAD symref to the initial branch with if HEAD does
 	 * not yet exist.
 	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
 	if (!reinit) {
 		char *ref;
 
@@ -279,7 +297,7 @@ static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
@@ -396,7 +414,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash, const char *initial_branch,
-	    unsigned int flags)
+	    const char *ref_storage_format, unsigned int flags)
 {
 	int reinit;
 	int exist_ok = flags & INIT_DB_EXIST_OK;
@@ -435,6 +453,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
 	check_repository_format(&repo_fmt);
+	repo_fmt.ref_storage = xstrdup(ref_storage_format);
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
@@ -467,6 +486,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
 		git_config_set("receive.denyNonFastforwards", "true");
 	}
 
+	if (!strcmp(ref_storage_format, "reftable"))
+		git_config_set("extensions.refStorage", ref_storage_format);
+
 	if (!(flags & INIT_DB_QUIET)) {
 		int len = strlen(git_dir);
 
@@ -540,6 +562,7 @@ static const char *const init_db_usage[] = {
 int cmd_init_db(int argc, const char **argv, const char *prefix)
 {
 	const char *git_dir;
+	const char *ref_storage_format = default_ref_storage();
 	const char *real_git_dir = NULL;
 	const char *work_tree;
 	const char *template_dir = NULL;
@@ -548,15 +571,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
 	const struct option init_db_options[] = {
-		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
-				N_("directory from which templates will be used")),
+		OPT_STRING(0, "template", &template_dir,
+			   N_("template-directory"),
+			   N_("directory from which templates will be used")),
 		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
-				N_("create a bare repository"), 1),
+			    N_("create a bare repository"), 1),
 		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
-			N_("permissions"),
-			N_("specify that the git repository is to be shared amongst several users"),
-			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
+		  N_("permissions"),
+		  N_("specify that the git repository is to be shared amongst several users"),
+		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
 		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
+		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
+			   N_("the ref storage format to use")),
 		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 			   N_("separate git dir from working tree")),
 		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
@@ -697,10 +723,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	}
 
 	UNLEAK(real_git_dir);
+	UNLEAK(ref_storage_format);
 	UNLEAK(git_dir);
 	UNLEAK(work_tree);
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, flags);
+		       initial_branch, ref_storage_format, flags);
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index ce56fdaaa9..7931620788 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -12,6 +12,7 @@
 #include "submodule.h"
 #include "utf8.h"
 #include "worktree.h"
+#include "../refs/refs-internal.h"
 
 static const char * const worktree_usage[] = {
 	N_("git worktree add [<options>] <path> [<commit-ish>]"),
@@ -402,9 +403,29 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+	} else {
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+		strbuf_reset(&sb);
+	}
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/cache.h b/cache.h
index e986cf4ea9..545d2b7260 100644
--- a/cache.h
+++ b/cache.h
@@ -627,9 +627,10 @@ int path_inside_repo(const char *prefix, const char *path);
 #define INIT_DB_EXIST_OK 0x0002
 
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash_algo,
-	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+	    const char *template_dir, int hash_algo, const char *initial_branch,
+	    const char *ref_storage_format, unsigned int flags);
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format);
 
 void sanitize_stdfds(void);
 int daemonize(void);
@@ -1043,6 +1044,7 @@ struct repository_format {
 	int is_bare;
 	int hash_algo;
 	char *work_tree;
+	char *ref_storage;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
 };
diff --git a/config.mak.uname b/config.mak.uname
index c7eba69e54..ae4e25a1a4 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -709,7 +709,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba..1a25789d28 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
diff --git a/refs.c b/refs.c
index 392f0bbf68..1b874db334 100644
--- a/refs.c
+++ b/refs.c
@@ -19,10 +19,16 @@
 #include "repository.h"
 #include "sigchain.h"
 
+const char *default_ref_storage(void)
+{
+	const char *test = getenv("GIT_TEST_REFTABLE");
+	return test ? "reftable" : "files";
+}
+
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static struct ref_storage_be *refs_backends = &refs_be_reftable;
 
 static struct ref_storage_be *find_ref_storage_backend(const char *name)
 {
@@ -1754,13 +1760,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(be_name);
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
@@ -1776,7 +1782,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir,
+					 r->ref_storage_format ?
+						       r->ref_storage_format :
+						       default_ref_storage(),
+					 REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1832,7 +1842,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 		goto done;
 
 	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1846,6 +1856,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 
 struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 {
+	const char *format = default_ref_storage();
 	struct ref_store *refs;
 	const char *id;
 
@@ -1859,9 +1870,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 
 	if (wt->id)
 		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
-				      REF_STORE_ALL_CAPS);
+				      format, REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(get_git_common_dir(), format,
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs.h b/refs.h
index 6695518156..7dc60472c9 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+/* Returns the ref storage backend to use by default. */
+const char *default_ref_storage(void);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c93..28166bf1f8 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -669,6 +669,7 @@ struct ref_storage_be {
 };
 
 extern struct ref_storage_be refs_be_files;
+extern struct ref_storage_be refs_be_reftable;
 extern struct ref_storage_be refs_be_packed;
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
new file mode 100644
index 0000000000..894c72ddc4
--- /dev/null
+++ b/refs/reftable-backend.c
@@ -0,0 +1,1418 @@
+#include "../cache.h"
+#include "../chdir-notify.h"
+#include "../config.h"
+#include "../iterator.h"
+#include "../lockfile.h"
+#include "../refs.h"
+#include "../reftable/reftable-stack.h"
+#include "../reftable/reftable-record.h"
+#include "../reftable/reftable-error.h"
+#include "../reftable/reftable-blocksource.h"
+#include "../reftable/reftable-reader.h"
+#include "../reftable/reftable-iterator.h"
+#include "../reftable/reftable-merged.h"
+#include "../reftable/reftable-generic.h"
+#include "../worktree.h"
+#include "refs-internal.h"
+
+extern struct ref_storage_be refs_be_reftable;
+
+struct git_reftable_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	int err;
+	char *repo_dir;
+
+	char *reftable_dir;
+	char *worktree_reftable_dir;
+
+	struct reftable_stack *main_stack;
+	struct reftable_stack *worktree_stack;
+};
+
+static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
+					const char *refname)
+{
+	if (store->worktree_stack == NULL)
+		return store->main_stack;
+
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+		return store->worktree_stack;
+	default:
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return store->main_stack;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type);
+
+static void clear_reftable_log_record(struct reftable_log_record *log)
+{
+	log->old_hash = NULL;
+	log->new_hash = NULL;
+	log->message = NULL;
+	log->refname = NULL;
+	reftable_log_record_release(log);
+}
+
+static void fill_reftable_log_record(struct reftable_log_record *log)
+{
+	const char *info = git_committer_info(0);
+	struct ident_split split = { NULL };
+	int result = split_ident_line(&split, info, strlen(info));
+	int sign = 1;
+	assert(0 == result);
+
+	reftable_log_record_release(log);
+	log->name =
+		xstrndup(split.name_begin, split.name_end - split.name_begin);
+	log->email =
+		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
+	log->time = atol(split.date_begin);
+	if (*split.tz_begin == '-') {
+		sign = -1;
+		split.tz_begin++;
+	}
+	if (*split.tz_begin == '+') {
+		sign = 1;
+		split.tz_begin++;
+	}
+
+	log->tz_offset = sign * atoi(split.tz_begin);
+}
+
+static int has_suffix(struct strbuf *b, const char *suffix)
+{
+	size_t len = strlen(suffix);
+
+	if (len > b->len) {
+		return 0;
+	}
+
+	return 0 == strncmp(b->buf + b->len - len, suffix, len);
+}
+
+/* trims the last path component of b. Returns -1 if it is not
+ * present, or 0 on success
+ */
+static int trim_component(struct strbuf *b)
+{
+	char *last;
+	last = strrchr(b->buf, '/');
+	if (!last)
+		return -1;
+	strbuf_setlen(b, last - b->buf);
+	return 0;
+}
+
+/* Returns whether `b` is a worktree path, trimming it to the gitdir
+ */
+static int is_worktree(struct strbuf *b)
+{
+	if (trim_component(b) < 0) {
+		return 0;
+	}
+	if (!has_suffix(b, "/worktrees")) {
+		return 0;
+	}
+	trim_component(b);
+	return 1;
+}
+
+static struct ref_store *git_reftable_ref_store_create(const char *path,
+						       unsigned int store_flags)
+{
+	struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+	struct reftable_write_options cfg = {
+		.block_size = 4096,
+		.hash_id = the_hash_algo->format_id,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	const char *gitdir = path;
+	struct strbuf wt_buf = STRBUF_INIT;
+	int wt = 0;
+
+	strbuf_addstr(&wt_buf, path);
+
+	/* this is clumsy, but the official worktree functions (eg.
+	 * get_worktrees()) function will try to initialize a ref storage
+	 * backend, leading to infinite recursion.  */
+	wt = is_worktree(&wt_buf);
+	if (wt) {
+		gitdir = wt_buf.buf;
+	}
+
+	base_ref_store_init(ref_store, &refs_be_reftable);
+	ref_store->gitdir = xstrdup(gitdir);
+	refs->store_flags = store_flags;
+	strbuf_addf(&sb, "%s/reftable", gitdir);
+	refs->reftable_dir = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	refs->err =
+		reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg);
+	assert(refs->err != REFTABLE_API_ERROR);
+
+	if (refs->err == 0 && wt) {
+		strbuf_addf(&sb, "%s/reftable", path);
+		refs->worktree_reftable_dir = xstrdup(sb.buf);
+
+		refs->err = reftable_new_stack(&refs->worktree_stack,
+					       refs->worktree_reftable_dir,
+					       cfg);
+		assert(refs->err != REFTABLE_API_ERROR);
+	}
+
+	strbuf_release(&sb);
+	strbuf_release(&wt_buf);
+	return ref_store;
+}
+
+static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct strbuf sb = STRBUF_INIT;
+
+	safe_create_dir(refs->reftable_dir, 1);
+	assert(refs->worktree_reftable_dir == NULL);
+
+	strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+	write_file(sb.buf, "ref: refs/heads/.invalid");
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+	safe_create_dir(sb.buf, 1);
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+	write_file(sb.buf, "this repository uses the reftable format");
+
+	return 0;
+}
+
+struct git_reftable_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_ref_record ref;
+	struct object_id oid;
+	struct ref_store *ref_store;
+
+	/* In case we must iterate over 2 stacks, this is non-null. */
+	struct reftable_merged_table *merged;
+	unsigned int flags;
+	int err;
+	const char *prefix;
+};
+
+static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	while (ri->err == 0) {
+		ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref);
+		if (ri->err) {
+			break;
+		}
+
+		if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) {
+			/*
+			  pseudorefs, eg. HEAD, FETCH_HEAD should not be
+			  produced, by default.
+			 */
+			continue;
+		}
+		ri->base.refname = ri->ref.refname;
+		if (ri->prefix != NULL &&
+		    strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) {
+			ri->err = 1;
+			break;
+		}
+		if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+		    ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE)
+			continue;
+
+		ri->base.flags = 0;
+		if (ri->ref.value != NULL) {
+			hashcpy(ri->oid.hash, ri->ref.value);
+		} else if (ri->ref.target != NULL) {
+			int out_flags = 0;
+			const char *resolved = refs_resolve_ref_unsafe(
+				ri->ref_store, ri->ref.refname,
+				RESOLVE_REF_READING, &ri->oid, &out_flags);
+			ri->base.flags = out_flags;
+			if (resolved == NULL &&
+			    !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+			    (ri->base.flags & REF_ISBROKEN)) {
+				continue;
+			}
+		}
+
+		ri->base.oid = &ri->oid;
+		if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		    !ref_resolves_to_object(ri->base.refname, ri->base.oid,
+					    ri->base.flags)) {
+			continue;
+		}
+
+		break;
+	}
+
+	if (ri->err > 0) {
+		return ITER_DONE;
+	}
+	if (ri->err < 0) {
+		return ITER_ERROR;
+	}
+
+	return ITER_OK;
+}
+
+static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	if (ri->ref.target_value != NULL) {
+		hashcpy(peeled->hash, ri->ref.target_value);
+		return 0;
+	}
+
+	return -1;
+}
+
+static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	reftable_ref_record_release(&ri->ref);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged) {
+		reftable_merged_table_free(ri->merged);
+	}
+	return 0;
+}
+
+static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
+	reftable_ref_iterator_advance, reftable_ref_iterator_peel,
+	reftable_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
+				unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri));
+
+	if (refs->err < 0) {
+		ri->err = refs->err;
+	} else if (refs->worktree_stack == NULL) {
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(refs->main_stack);
+		ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix);
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		ri->err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						    the_hash_algo->format_id);
+		if (ri->err == 0)
+			ri->err = reftable_merged_table_seek_ref(
+				ri->merged, &ri->iter, prefix);
+	}
+
+	base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1);
+	ri->prefix = prefix;
+	ri->base.oid = &ri->oid;
+	ri->flags = flags;
+	ri->ref_store = ref_store;
+	return &ri->base;
+}
+
+static int fixup_symrefs(struct ref_store *ref_store,
+			 struct ref_transaction *transaction)
+{
+	struct strbuf referent = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct object_id old_oid;
+
+		err = git_reftable_read_raw_ref(ref_store, update->refname,
+						&old_oid, &referent,
+						/* mutate input, like
+						   files-backend.c */
+						&update->type);
+		if (err < 0 && errno == ENOENT &&
+		    is_null_oid(&update->old_oid)) {
+			err = 0;
+		}
+		if (err < 0)
+			goto done;
+
+		if (!(update->type & REF_ISSYMREF))
+			continue;
+
+		if (update->flags & REF_NO_DEREF) {
+			/* what should happen here? See files-backend.c
+			 * lock_ref_for_update. */
+		} else {
+			/*
+			  If we are updating a symref (eg. HEAD), we should also
+			  update the branch that the symref points to.
+
+			  This is generic functionality, and would be better
+			  done in refs.c, but the current implementation is
+			  intertwined with the locking in files-backend.c.
+			*/
+			int new_flags = update->flags;
+			struct ref_update *new_update = NULL;
+
+			/* if this is an update for HEAD, should also record a
+			   log entry for HEAD? See files-backend.c,
+			   split_head_update()
+			*/
+			new_update = ref_transaction_add_update(
+				transaction, referent.buf, new_flags,
+				&update->new_oid, &update->old_oid,
+				update->msg);
+			new_update->parent_update = update;
+
+			/* files-backend sets REF_LOG_ONLY here. */
+			update->flags |= REF_NO_DEREF | REF_LOG_ONLY;
+			update->flags &= ~REF_HAVE_OLD;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	strbuf_release(&referent);
+	return err;
+}
+
+static int git_reftable_transaction_prepare(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *errbuf)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_addition *add = NULL;
+	struct reftable_stack *stack =
+		transaction->nr ?
+			      stack_for(refs, transaction->updates[0]->refname) :
+			      refs->main_stack;
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+
+	err = fixup_symrefs(ref_store, transaction);
+	if (err) {
+		goto done;
+	}
+
+	transaction->backend_data = add;
+	transaction->state = REF_TRANSACTION_PREPARED;
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	if (err < 0) {
+		transaction->state = REF_TRANSACTION_CLOSED;
+		strbuf_addf(errbuf, "reftable: transaction prepare: %s",
+			    reftable_error_str(err));
+	}
+
+	return err;
+}
+
+static int git_reftable_transaction_abort(struct ref_store *ref_store,
+					  struct ref_transaction *transaction,
+					  struct strbuf *err)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	reftable_addition_destroy(add);
+	transaction->backend_data = NULL;
+	return 0;
+}
+
+static int reftable_check_old_oid(struct ref_store *refs, const char *refname,
+				  struct object_id *want_oid)
+{
+	struct object_id out_oid;
+	int out_flags = 0;
+	const char *resolved = refs_resolve_ref_unsafe(
+		refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags);
+	if (is_null_oid(want_oid) != (resolved == NULL)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	if (resolved != NULL && !oideq(&out_oid, want_oid)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	return 0;
+}
+
+static int ref_update_cmp(const void *a, const void *b)
+{
+	return strcmp((*(struct ref_update **)a)->refname,
+		      (*(struct ref_update **)b)->refname);
+}
+
+static int write_transaction_table(struct reftable_writer *writer, void *arg)
+{
+	struct ref_transaction *transaction = (struct ref_transaction *)arg;
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)transaction->ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, transaction->updates[0]->refname);
+	uint64_t ts = reftable_stack_next_update_index(stack);
+	int err = 0;
+	int i = 0;
+	struct reftable_log_record *logs =
+		calloc(transaction->nr, sizeof(*logs));
+	struct ref_update **sorted =
+		malloc(transaction->nr * sizeof(struct ref_update *));
+	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
+	QSORT(sorted, transaction->nr, ref_update_cmp);
+	reftable_writer_set_limits(writer, ts, ts);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = sorted[i];
+		struct reftable_log_record *log = &logs[i];
+		fill_reftable_log_record(log);
+		log->refname = (char *)u->refname;
+		log->old_hash = u->old_oid.hash;
+		log->new_hash = u->new_oid.hash;
+		log->update_index = ts;
+		log->message = u->msg;
+
+		if (u->flags & REF_LOG_ONLY) {
+			continue;
+		}
+
+		if (u->flags & REF_HAVE_NEW) {
+			struct reftable_ref_record ref = { NULL };
+			struct object_id peeled;
+
+			int peel_error = peel_object(&u->new_oid, &peeled);
+			ref.refname = (char *)u->refname;
+
+			if (!is_null_oid(&u->new_oid)) {
+				ref.value = u->new_oid.hash;
+			}
+			ref.update_index = ts;
+			if (!peel_error) {
+				ref.target_value = peeled.hash;
+			}
+
+			err = reftable_writer_add_ref(writer, &ref);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+
+	for (i = 0; i < transaction->nr; i++) {
+		err = reftable_writer_add_log(writer, &logs[i]);
+		clear_reftable_log_record(&logs[i]);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	free(logs);
+	free(sorted);
+	return err;
+}
+
+static int git_reftable_transaction_finish(struct ref_store *ref_store,
+					   struct ref_transaction *transaction,
+					   struct strbuf *errmsg)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	int err = 0;
+	int i;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = transaction->updates[i];
+		if (u->flags & REF_HAVE_OLD) {
+			err = reftable_check_old_oid(transaction->ref_store,
+						     u->refname, &u->old_oid);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+	if (transaction->nr) {
+		err = reftable_addition_add(add, &write_transaction_table,
+					    transaction);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	err = reftable_addition_commit(add);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	transaction->state = REF_TRANSACTION_CLOSED;
+	transaction->backend_data = NULL;
+	if (err) {
+		strbuf_addf(errmsg, "reftable: transaction failure: %s",
+			    reftable_error_str(err));
+		return -1;
+	}
+	return err;
+}
+
+static int
+git_reftable_transaction_initial_commit(struct ref_store *ref_store,
+					struct ref_transaction *transaction,
+					struct strbuf *errmsg)
+{
+	int err = git_reftable_transaction_prepare(ref_store, transaction,
+						   errmsg);
+	if (err)
+		return err;
+
+	return git_reftable_transaction_finish(ref_store, transaction, errmsg);
+}
+
+struct write_delete_refs_arg {
+	struct reftable_stack *stack;
+	struct string_list *refnames;
+	const char *logmsg;
+	unsigned int flags;
+};
+
+static int write_delete_refs_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_delete_refs_arg *arg =
+		(struct write_delete_refs_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = 0;
+	int i = 0;
+
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_ref_record ref = {
+			.refname = (char *)arg->refnames->items[i].string,
+			.update_index = ts,
+		};
+		err = reftable_writer_add_ref(writer, &ref);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_log_record log = {
+			.update_index = ts,
+		};
+		struct reftable_ref_record current = { NULL };
+		fill_reftable_log_record(&log);
+		log.message = xstrdup(arg->logmsg);
+		log.new_hash = NULL;
+		log.old_hash = NULL;
+		log.update_index = ts;
+		log.refname = (char *)arg->refnames->items[i].string;
+
+		if (reftable_stack_read_ref(arg->stack, log.refname,
+					    &current) == 0) {
+			log.old_hash = current.value;
+		}
+		err = reftable_writer_add_log(writer, &log);
+		log.old_hash = NULL;
+		reftable_ref_record_release(&current);
+
+		clear_reftable_log_record(&log);
+		if (err < 0) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int git_reftable_delete_refs(struct ref_store *ref_store,
+				    const char *msg,
+				    struct string_list *refnames,
+				    unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, refnames->items[0].string);
+	struct write_delete_refs_arg arg = {
+		.stack = stack,
+		.refnames = refnames,
+		.logmsg = msg,
+		.flags = flags,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	string_list_sort(refnames);
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_delete_refs_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_pack_refs(struct ref_store *ref_store,
+				  unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	int err = refs->err;
+	if (err < 0) {
+		return err;
+	}
+	err = reftable_stack_compact_all(refs->main_stack, NULL);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
+	return err;
+}
+
+struct write_create_symref_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	const char *refname;
+	const char *target;
+	const char *logmsg;
+};
+
+static int write_create_symref_table(struct reftable_writer *writer, void *arg)
+{
+	struct write_create_symref_arg *create =
+		(struct write_create_symref_arg *)arg;
+	uint64_t ts = reftable_stack_next_update_index(create->stack);
+	int err = 0;
+
+	struct reftable_ref_record ref = {
+		.refname = (char *)create->refname,
+		.target = (char *)create->target,
+		.update_index = ts,
+	};
+	reftable_writer_set_limits(writer, ts, ts);
+	err = reftable_writer_add_ref(writer, &ref);
+	if (err == 0) {
+		struct reftable_log_record log = { NULL };
+		struct object_id new_oid;
+		struct object_id old_oid;
+
+		fill_reftable_log_record(&log);
+		log.refname = (char *)create->refname;
+		log.message = (char *)create->logmsg;
+		log.update_index = ts;
+		if (refs_resolve_ref_unsafe(
+			    (struct ref_store *)create->refs, create->refname,
+			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
+			log.old_hash = old_oid.hash;
+		}
+
+		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
+					    create->target, RESOLVE_REF_READING,
+					    &new_oid, NULL) != NULL) {
+			log.new_hash = new_oid.hash;
+		}
+
+		if (log.old_hash != NULL || log.new_hash != NULL) {
+			err = reftable_writer_add_log(writer, &log);
+		}
+		log.refname = NULL;
+		log.message = NULL;
+		log.old_hash = NULL;
+		log.new_hash = NULL;
+		clear_reftable_log_record(&log);
+	}
+	return err;
+}
+
+static int git_reftable_create_symref(struct ref_store *ref_store,
+				      const char *refname, const char *target,
+				      const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_create_symref_arg arg = { .refs = refs,
+					       .stack = stack,
+					       .refname = refname,
+					       .target = target,
+					       .logmsg = logmsg };
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_create_symref_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct write_rename_arg {
+	struct reftable_stack *stack;
+	const char *oldname;
+	const char *newname;
+	const char *logmsg;
+};
+
+static int write_rename_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record ref = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref);
+
+	if (err) {
+		goto done;
+	}
+
+	/* XXX do ref renames overwrite the target? */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) {
+		goto done;
+	}
+
+	free(ref.refname);
+	ref.refname = strdup(arg->newname);
+	reftable_writer_set_limits(writer, ts, ts);
+	ref.update_index = ts;
+
+	{
+		struct reftable_ref_record todo[2] = { { NULL } };
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		/* leave todo[0] empty */
+		todo[1] = ref;
+		todo[1].update_index = ts;
+
+		err = reftable_writer_add_refs(writer, todo, 2);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	if (ref.value != NULL) {
+		struct reftable_log_record todo[2] = { { NULL } };
+		fill_reftable_log_record(&todo[0]);
+		fill_reftable_log_record(&todo[1]);
+
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		todo[0].message = (char *)arg->logmsg;
+		todo[0].old_hash = ref.value;
+		todo[0].new_hash = NULL;
+
+		todo[1].refname = (char *)arg->newname;
+		todo[1].update_index = ts;
+		todo[1].old_hash = NULL;
+		todo[1].new_hash = ref.value;
+		todo[1].message = (char *)arg->logmsg;
+
+		err = reftable_writer_add_logs(writer, todo, 2);
+
+		clear_reftable_log_record(&todo[0]);
+		clear_reftable_log_record(&todo[1]);
+
+		if (err < 0) {
+			goto done;
+		}
+
+	} else {
+		/* XXX symrefs? */
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static int git_reftable_rename_ref(struct ref_store *ref_store,
+				   const char *oldrefname,
+				   const char *newrefname, const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_rename_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_copy_ref(struct ref_store *ref_store,
+				 const char *oldrefname, const char *newrefname,
+				 const char *logmsg)
+{
+	BUG("reftable reference store does not support copying references");
+}
+
+struct git_reftable_reflog_ref_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_log_record log;
+	struct object_id oid;
+
+	/* Used when iterating over worktree & main */
+	struct reftable_merged_table *merged;
+	char *last_name;
+};
+
+static int
+git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+
+	while (1) {
+		int err = reftable_iterator_next_log(&ri->iter, &ri->log);
+		if (err > 0) {
+			return ITER_DONE;
+		}
+		if (err < 0) {
+			return ITER_ERROR;
+		}
+
+		ri->base.refname = ri->log.refname;
+		if (ri->last_name != NULL &&
+		    !strcmp(ri->log.refname, ri->last_name)) {
+			/* we want the refnames that we have reflogs for, so we
+			 * skip if we've already produced this name. This could
+			 * be faster by seeking directly to
+			 * reflog@update_index==0.
+			 */
+			continue;
+		}
+
+		free(ri->last_name);
+		ri->last_name = xstrdup(ri->log.refname);
+		hashcpy(ri->oid.hash, ri->log.new_hash);
+		return ITER_OK;
+	}
+}
+
+static int
+git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	BUG("not supported.");
+	return -1;
+}
+
+static int
+git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+	reftable_log_record_release(&ri->log);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged)
+		reftable_merged_table_free(ri->merged);
+	return 0;
+}
+
+static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = {
+	git_reftable_reflog_ref_iterator_advance,
+	git_reftable_reflog_ref_iterator_peel,
+	git_reftable_reflog_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(sizeof(*ri), 1);
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+
+	if (refs->worktree_stack == NULL) {
+		struct reftable_stack *stack = refs->main_stack;
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(stack);
+		int err = reftable_merged_table_seek_log(mt, &ri->iter, "");
+		if (err < 0) {
+			free(ri);
+			/* XXX is this allowed? */
+			return NULL;
+		}
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		int err = 0;
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						the_hash_algo->format_id);
+		if (err < 0) {
+			free(tabs);
+			/* XXX see above */
+			return NULL;
+		}
+		err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, "");
+		if (err < 0) {
+			return NULL;
+		}
+	}
+	base_ref_iterator_init(&ri->base,
+			       &git_reftable_reflog_ref_iterator_vtable, 1);
+	ri->base.oid = &ri->oid;
+
+	return (struct ref_iterator *)ri;
+}
+
+static int git_reftable_for_each_reflog_ent_newest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_log_record log = { NULL };
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	while (err == 0) {
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		hashcpy(old_oid.hash, log.old_hash);
+		hashcpy(new_oid.hash, log.new_hash);
+
+		full_committer = fmt_ident(log.name, log.email,
+					   WANT_COMMITTER_IDENT,
+					   /*date*/ NULL, IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log.time,
+			 log.tz_offset, log.message, cb_data);
+		if (err)
+			break;
+	}
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_for_each_reflog_ent_oldest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reftable_log_record *logs = NULL;
+	int cap = 0;
+	int len = 0;
+	int err = 0;
+	int i = 0;
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+
+	while (err == 0) {
+		struct reftable_log_record log = { NULL };
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			logs = realloc(logs, cap * sizeof(*logs));
+		}
+
+		logs[len++] = log;
+	}
+
+	for (i = len; i--;) {
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		hashcpy(old_oid.hash, log->old_hash);
+		hashcpy(new_oid.hash, log->new_hash);
+
+		full_committer = fmt_ident(log->name, log->email,
+					   WANT_COMMITTER_IDENT, NULL,
+					   IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log->time,
+			 log->tz_offset, log->message, cb_data);
+		if (err) {
+			break;
+		}
+	}
+
+	for (i = 0; i < len; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	free(logs);
+
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_reflog_exists(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_log_record log = { NULL };
+	int err = refs->err;
+
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err) {
+		goto done;
+	}
+	err = reftable_iterator_next_log(&it, &log);
+	if (err) {
+		goto done;
+	}
+
+	if (strcmp(log.refname, refname)) {
+		err = 1;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return !err;
+}
+
+static int git_reftable_create_reflog(struct ref_store *ref_store,
+				      const char *refname, int force_create,
+				      struct strbuf *err)
+{
+	return 0;
+}
+
+static int git_reftable_delete_reflog(struct ref_store *ref_store,
+				      const char *refname)
+{
+	return 0;
+}
+
+struct reflog_expiry_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	struct reftable_log_record *tombstones;
+	int len;
+	int cap;
+};
+
+static void clear_log_tombstones(struct reflog_expiry_arg *arg)
+{
+	int i = 0;
+	for (; i < arg->len; i++) {
+		reftable_log_record_release(&arg->tombstones[i]);
+	}
+
+	FREE_AND_NULL(arg->tombstones);
+}
+
+static void add_log_tombstone(struct reflog_expiry_arg *arg,
+			      const char *refname, uint64_t ts)
+{
+	struct reftable_log_record tombstone = {
+		.refname = xstrdup(refname),
+		.update_index = ts,
+	};
+	if (arg->len == arg->cap) {
+		arg->cap = 2 * arg->cap + 1;
+		arg->tombstones =
+			realloc(arg->tombstones, arg->cap * sizeof(tombstone));
+	}
+	arg->tombstones[arg->len++] = tombstone;
+}
+
+static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
+{
+	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int i = 0;
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->len; i++) {
+		int err = reftable_writer_add_log(writer, &arg->tombstones[i]);
+		if (err) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname,
+			   const struct object_id *oid, unsigned int flags,
+			   reflog_expiry_prepare_fn prepare_fn,
+			   reflog_expiry_should_prune_fn should_prune_fn,
+			   reflog_expiry_cleanup_fn cleanup_fn,
+			   void *policy_cb_data)
+{
+	/*
+	  For log expiry, we write tombstones in place of the expired entries,
+	  This means that the entries are still retrievable by delving into the
+	  stack, and expiring entries paradoxically takes extra memory.
+
+	  This memory is only reclaimed when some operation issues a
+	  git_reftable_pack_refs(), which will compact the entire stack and get
+	  rid of deletion entries.
+
+	  It would be better if the refs backend supported an API that sets a
+	  criterion for all refs, passing the criterion to pack_refs().
+	*/
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reflog_expiry_arg arg = {
+		.stack = stack,
+		.refs = refs,
+	};
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err < 0) {
+		goto done;
+	}
+
+	while (1) {
+		struct object_id ooid;
+		struct object_id noid;
+
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, refname)) {
+			break;
+		}
+		hashcpy(ooid.hash, log.old_hash);
+		hashcpy(noid.hash, log.new_hash);
+
+		if (should_prune_fn(&ooid, &noid, log.email,
+				    (timestamp_t)log.time, log.tz_offset,
+				    log.message, policy_cb_data)) {
+			add_log_tombstone(&arg, refname, log.update_index);
+		}
+	}
+	err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	clear_log_tombstones(&arg);
+	return err;
+}
+
+static int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	/* This is usually not needed, but Git doesn't signal to ref backend if
+	   a subprocess updated the ref DB.  So we always check.
+	*/
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_read_ref(stack, refname, &ref);
+	if (err > 0) {
+		errno = ENOENT;
+		err = -1;
+		goto done;
+	}
+	if (err < 0) {
+		errno = reftable_error_to_errno(err);
+		err = -1;
+		goto done;
+	}
+	if (ref.target != NULL) {
+		strbuf_reset(referent);
+		strbuf_addstr(referent, ref.target);
+		*type |= REF_ISSYMREF;
+	} else if (ref.value != NULL) {
+		hashcpy(oid->hash, ref.value);
+	} else {
+		*type |= REF_ISBROKEN;
+		errno = EINVAL;
+		err = -1;
+	}
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+struct ref_storage_be refs_be_reftable = {
+	&refs_be_files,
+	"reftable",
+	git_reftable_ref_store_create,
+	git_reftable_init_db,
+	git_reftable_transaction_prepare,
+	git_reftable_transaction_finish,
+	git_reftable_transaction_abort,
+	git_reftable_transaction_initial_commit,
+
+	git_reftable_pack_refs,
+	git_reftable_create_symref,
+	git_reftable_delete_refs,
+	git_reftable_rename_ref,
+	git_reftable_copy_ref,
+
+	git_reftable_ref_iterator_begin,
+	git_reftable_read_raw_ref,
+
+	git_reftable_reflog_iterator_begin,
+	git_reftable_for_each_reflog_ent_oldest_first,
+	git_reftable_for_each_reflog_ent_newest_first,
+	git_reftable_reflog_exists,
+	git_reftable_create_reflog,
+	git_reftable_delete_reflog,
+	git_reftable_reflog_expire,
+};
diff --git a/repository.c b/repository.c
index a4174ddb06..ff0988dac8 100644
--- a/repository.c
+++ b/repository.c
@@ -174,6 +174,8 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	repo->ref_storage_format = xstrdup_or_null(format.ref_storage);
+
 	clear_repository_format(&format);
 	return 0;
 
diff --git a/repository.h b/repository.h
index b385ca3c94..8019a7d0a1 100644
--- a/repository.h
+++ b/repository.h
@@ -78,6 +78,9 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/* The format to use for the ref database. */
+	char *ref_storage_format;
+
 	/*
 	 * Contains path to often used file names.
 	 */
diff --git a/setup.c b/setup.c
index c04cd25a30..c6b57efd03 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		data->ref_storage = xstrdup(value);
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format)
 	string_list_clear(&format->v1_only_extensions, 0);
 	free(format->work_tree);
 	free(format->partial_clone);
+	free(format->ref_storage);
 	init_repository_format(format);
 }
 
@@ -1308,8 +1312,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			the_repository->ref_storage_format =
+				xstrdup_or_null(repo_fmt.ref_storage);
+		}
 	}
 
 	strbuf_release(&dir);
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
new file mode 100755
index 0000000000..58c7d5d4bc
--- /dev/null
+++ b/t/t0031-reftable.sh
@@ -0,0 +1,199 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable basics'
+
+. ./test-lib.sh
+
+INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+initialize ()  {
+	rm -rf .git &&
+	git init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled
+}
+
+test_expect_success 'SHA256 support, env' '
+	rm -rf .git &&
+	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
+	git init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'SHA256 support, option' '
+	rm -rf .git &&
+	git init --ref-storage=reftable --object-format=sha256 &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'delete ref' '
+	initialize &&
+	test_commit file &&
+	SHA=$(git show-ref -s --verify HEAD) &&
+	test_write_lines "$SHA refs/heads/master" "$SHA refs/tags/file" >expect &&
+	git show-ref > actual &&
+	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
+	test_cmp expect actual &&
+	git update-ref -d refs/tags/file $SHA  &&
+	test_write_lines "$SHA refs/heads/master" >expect &&
+	git show-ref > actual &&
+	test_cmp expect actual
+'
+
+
+test_expect_success 'clone calls transaction_initial_commit' '
+	test_commit message1 file1 &&
+	git clone . cloned &&
+	(test  -f cloned/file1 || echo "Fixme.")
+'
+
+test_expect_success 'basic operation of reftable storage: commit, show-ref' '
+	initialize &&
+	test_commit file &&
+	test_write_lines refs/heads/master refs/tags/file >expect &&
+	git show-ref &&
+	git show-ref | cut -f2 -d" " > actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'reflog, repack' '
+	initialize &&
+	for count in $(test_seq 1 10)
+	do
+		test_commit "number $count" file.t $count number-$count ||
+		return 1
+	done &&
+	git pack-refs &&
+	ls -1 .git/reftable >table-files &&
+	test_line_count = 2 table-files &&
+	git reflog refs/heads/master >output &&
+	test_line_count = 10 output &&
+	grep "commit (initial): number 1" output &&
+	grep "commit: number 10" output &&
+	git gc &&
+	git reflog refs/heads/master >output &&
+	test_line_count = 0 output
+'
+
+test_expect_success 'branch switch in reflog output' '
+	initialize &&
+	test_commit file1 &&
+	git checkout -b branch1 &&
+	test_commit file2 &&
+	git checkout -b branch2 &&
+	git switch - &&
+	git rev-parse --symbolic-full-name HEAD > actual &&
+	echo refs/heads/branch1 > expect &&
+	test_cmp actual expect
+'
+
+
+# This matches show-ref's output
+print_ref() {
+	echo "$(git rev-parse "$1") $1"
+}
+
+test_expect_success 'peeled tags are stored' '
+	initialize &&
+	test_commit file &&
+	git tag -m "annotated tag" test_tag HEAD &&
+	{
+		print_ref "refs/heads/master" &&
+		print_ref "refs/tags/file" &&
+		print_ref "refs/tags/test_tag" &&
+		print_ref "refs/tags/test_tag^{}"
+	} >expect &&
+	git show-ref -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show-ref works on fresh repo' '
+	initialize &&
+	rm -rf .git &&
+	git init --ref-storage=reftable &&
+	>expect &&
+	! git show-ref > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'checkout unborn branch' '
+	initialize &&
+	git checkout -b master
+'
+
+
+test_expect_success 'dir/file conflict' '
+	initialize &&
+	test_commit file &&
+	! git branch master/forbidden
+'
+
+
+test_expect_success 'do not clobber existing repo' '
+	rm -rf .git &&
+	git init --ref-storage=files &&
+	cat .git/HEAD > expect &&
+	test_commit file &&
+	(git init --ref-storage=reftable || true) &&
+	cat .git/HEAD > actual &&
+	test_cmp expect actual
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'pseudo refs' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git cherry-pick source &&
+	test -f file2
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'rebase' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git rebase source &&
+	test -f file2
+'
+
+test_expect_success 'worktrees' '
+	git init --ref-storage=reftable start &&
+	(cd start && test_commit file1 && git checkout -b branch1 &&
+	git checkout -b branch2 &&
+	git worktree add  ../wt
+	) &&
+	cd wt &&
+	git checkout branch1 &&
+	git branch
+'
+
+test_expect_success 'worktrees 2' '
+	initialize &&
+	test_commit file1 &&
+	mkdir existing_empty &&
+	git worktree add --detach existing_empty master
+'
+
+test_expect_success 'FETCH_HEAD' '
+	initialize &&
+	test_commit one &&
+	(git init sub && cd sub && test_commit two) &&
+	git --git-dir sub/.git rev-parse HEAD >expect &&
+	git fetch sub &&
+	git checkout FETCH_HEAD &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh
index be12fb6350..c6f7832556 100755
--- a/t/t1409-avoid-packing-refs.sh
+++ b/t/t1409-avoid-packing-refs.sh
@@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily'
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 # Add an identifying mark to the packed-refs file header line. This
 # shouldn't upset readers, and it should be omitted if the file is
 # ever rewritten.
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index b17f5c21fb..cc5d01571a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success setup '
 	git config gc.auto 0 &&
 	git config i18n.commitencoding ISO-8859-1 &&
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index f41b2afb99..edaef2c175 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -11,6 +11,12 @@ semantic is still the same.
 '
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success 'enable reflogs' '
 	git config core.logallrefupdates true
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index a863ccee7e..7b638e0b8c 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1520,6 +1520,11 @@ parisc* | hppa*)
 	;;
 esac
 
+if test -n "$GIT_TEST_REFTABLE"
+then
+  test_set_prereq REFTABLE
+fi
+
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 15/16] git-prompt: prepare for reftable refs backend
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (13 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 14/16] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2020-11-26 19:42     ` SZEDER Gábor via GitGitGadget
  2020-11-26 19:42     ` [PATCH v3 16/16] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  16 siblings, 0 replies; 251+ messages in thread
From: SZEDER Gábor via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, SZEDER Gábor

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

In our git-prompt script we strive to use Bash builtins wherever
possible, because fork()-ing subshells for command substitutions and
fork()+exec()-ing Git commands are expensive on some platforms.  We
even read and parse '.git/HEAD' using Bash builtins to get the name of
the current branch [1].  However, the upcoming reftable refs backend
won't use '.git/HEAD' at all, but will write an invalid refname as
placeholder for backwards compatibility instead, which will break our
git-prompt script.

Update the git-prompt script to recognize the placeholder '.git/HEAD'
written by the reftable backend (its content is specified in the
reftable specs), and then fall back to use 'git symbolic-ref' to get
the name of the current branch.

[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current
    branch, 2011-03-31)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/completion/git-prompt.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 4640a1535d..2e5a5d8027 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -478,10 +478,15 @@ __git_ps1 ()
 			if ! __git_eread "$g/HEAD" head; then
 				return $exit
 			fi
-			# is it a symbolic ref?
 			b="${head#ref: }"
 			if [ "$head" = "$b" ]; then
 				detached=yes
+			elif [ "$b" = "refs/heads/.invalid" ]; then
+				# Reftable
+				b="$(git symbolic-ref HEAD 2>/dev/null)" ||
+				detached=yes
+			fi
+			if [ "$detached" = yes ]; then
 				b="$(
 				case "${GIT_PS1_DESCRIBE_STYLE-}" in
 				(contains)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v3 16/16] Add "test-tool dump-reftable" command.
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (14 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 15/16] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
@ 2020-11-26 19:42     ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  16 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-11-26 19:42 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  1 +
 reftable/dump.c          | 31 +++++++++++++++++++------------
 t/helper/test-reftable.c |  5 +++++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 5 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/Makefile b/Makefile
index ec91fa14f4..6154db5615 100644
--- a/Makefile
+++ b/Makefile
@@ -2409,6 +2409,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
diff --git a/reftable/dump.c b/reftable/dump.c
index 00b444e8c9..0bd7a1cfa6 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -12,18 +12,25 @@ license that can be found in the LICENSE file or at
 #include <unistd.h>
 #include <string.h>
 
-#include "reftable.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+#include "reftable-merged.h"
+#include "reftable-record.h"
 #include "reftable-tests.h"
+#include "reftable-writer.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-stack.h"
 
 static uint32_t hash_id;
 
 static int dump_table(const char *tablename)
 {
-	struct reftable_block_source src = { 0 };
+	struct reftable_block_source src = { NULL };
 	int err = reftable_block_source_from_file(&src, tablename);
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_reader *r = NULL;
 
 	if (err < 0)
@@ -49,7 +56,7 @@ static int dump_table(const char *tablename)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_reader_seek_log(r, &it, "");
 	if (err < 0) {
@@ -66,7 +73,7 @@ static int dump_table(const char *tablename)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_reader_free(r);
 	return 0;
@@ -95,9 +102,9 @@ static int dump_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
 	struct reftable_write_options cfg = {};
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_merged_table *merged = NULL;
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
@@ -122,7 +129,7 @@ static int dump_stack(const char *stackdir)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_merged_table_seek_log(merged, &it, "");
 	if (err < 0) {
@@ -139,7 +146,7 @@ static int dump_stack(const char *stackdir)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_stack_destroy(stack);
 	return 0;
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b702f4855..c1ba131c3a 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 0208a0a41c..f064440a31 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -56,6 +56,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 1de39ce5b5..226af8c6b8 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -18,6 +18,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 04/16] reftable: add error related functionality
  2020-11-26 19:42     ` [PATCH v3 04/16] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2020-11-27  9:13       ` Felipe Contreras
  2020-11-27 10:25       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 251+ messages in thread
From: Felipe Contreras @ 2020-11-27  9:13 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: Git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys

On Fri, Nov 27, 2020 at 2:42 AM Han-Wen Nienhuys via GitGitGadget
<gitgitgadget@gmail.com> wrote:

> +/*
> + Errors in reftable calls are signaled with negative integer return values. 0
> + means success.
> +*/

This is the format of multi-line comments:

/*
* A very long
* multi-line comment.
*/

> +enum reftable_error {
> +       /* Unexpected file system behavior */
> +       REFTABLE_IO_ERROR = -2,
> +
> +       /* Format inconsistency on reading data
> +        */

No need for a multi-line comment here.

> +       REFTABLE_FORMAT_ERROR = -3,
> +
> +       /* File does not exist. Returned from block_source_from_file(),  because
> +          it needs special handling in stack.
> +       */

Once again, and for the rest of the file.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 05/16] reftable: utility functions
  2020-11-26 19:42     ` [PATCH v3 05/16] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-11-27 10:18       ` Felipe Contreras
  2020-11-27 10:33       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 251+ messages in thread
From: Felipe Contreras @ 2020-11-27 10:18 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: Git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys

On Fri, Nov 27, 2020 at 2:42 AM Han-Wen Nienhuys via GitGitGadget
<gitgitgadget@gmail.com> wrote:

> +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
> +{
> +       size_t lo = 0;
> +       size_t hi = sz;

Is there anything wrong with:

  size_t lo = 0, hi = sz;

> +       /* invariant: (hi == sz) || f(hi) == true
> +          (lo == 0 && f(0) == true) || fi(lo) == false
> +        */

This doesn't follow the multi-line format. And perhaps English would be better.

> +       while (hi - lo > 1) {
> +               size_t mid = lo + (hi - lo) / 2;
> +
> +               int val = f(mid, args);
> +               if (val) {
> +                       hi = mid;
> +               } else {
> +                       lo = mid;
> +               }

Unnecessary braces are frowned upon. Also, why store the value if you
are not going to use it later?

if (f(mid, args))
  hi = mid;
else
  lo = mid;

> +       }
> +
> +       if (lo == 0) {

if (!lo)

> +               if (f(0, args)) {
> +                       return 0;
> +               } else {
> +                       return 1;
> +               }

return f(0, args) ? 0 : 1;

Or just:

return !f(0, args);

> +       }
> +
> +       return hi;

It would be much clearer to just do:

if (lo)
  return hi;
else
  return !f(0, args);

> +}
> +
> +void free_names(char **a)
> +{
> +       char **p = a;
> +       if (p == NULL) {
> +               return;
> +       }

Once again:

  if (!p)
    return;

> +       while (*p) {
> +               reftable_free(*p);
> +               p++;
> +       }

This is the same thing as:

  if (!a)
    return;

  for (p = a; *p; p++)
    reftable_free(*p);

> +int names_length(char **names)
> +{
> +       int len = 0;
> +       char **p = names;
> +       while (*p) {
> +               p++;
> +               len++;
> +       }
> +       return len;
> +}

Is there any need to keep track of len? It's always p - names.

This should do it:

  char **p;
  for (p = names; *p; p++);
  return p - names;

> +void parse_names(char *buf, int size, char ***namesp)
> +{
> +       char **names = NULL;
> +       size_t names_cap = 0;
> +       size_t names_len = 0;
> +
> +       char *p = buf;
> +       char *end = buf + size;
> +       while (p < end) {

Yet another while that can be a for:

  for (p = buf; p < end; p = next + 1)

> +               char *next = strchr(p, '\n');
> +               if (next != NULL) {
> +                       *next = 0;
> +               } else {
> +                       next = end;
> +               }

if (next)
  *next = 0;
else
  next = end;

> +               if (p < next) {

Or just:

  if (p >= next)
    continue;

> +                       if (names_len == names_cap) {
> +                               names_cap = 2 * names_cap + 1;
> +                               names = reftable_realloc(
> +                                       names, names_cap * sizeof(char *));

sizeof(*names);

> +                       }
> +                       names[names_len++] = xstrdup(p);
> +               }
> +               p = next + 1;
> +       }
> +
> +       if (names_len == names_cap) {
> +               names_cap = 2 * names_cap + 1;
> +               names = reftable_realloc(names, names_cap * sizeof(char *));
> +       }

This is repeated, maybe a macro like ALLOC_GROW(), or literally ALLOC_GROW().

> +       names[names_len] = NULL;
> +       *namesp = names;
> +}
> +
> +int names_equal(char **a, char **b)
> +{
> +       while (*a && *b) {
> +               if (strcmp(*a, *b)) {
> +                       return 0;
> +               }
> +
> +               a++;
> +               b++;
> +       }

Again:

  for (; *a && *b, a++, b++) {
    if (strcmp(*a, *b)
      return 0;
  }

But instead of moving multiple pointers:

  for (i = 0; a[i] && b[i], i++) {
    if (strcmp(a[i], b[i])
      return 0;
  }

> +int common_prefix_size(struct strbuf *a, struct strbuf *b)
> +{
> +       int p = 0;
> +       while (p < a->len && p < b->len) {

Once again, this can easily be a for.

> +               if (a->buf[p] != b->buf[p]) {
> +                       break;
> +               }
> +               p++;
> +       }
> +
> +       return p;
> +}

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 02/16] init-db: set the_repository->hash_algo early on
  2020-11-26 19:42     ` [PATCH v3 02/16] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2020-11-27 10:22       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-27 10:22 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Han-Wen Nienhuys, Han-Wen Nienhuys


On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> The reftable backend needs to know the hash algorithm for writing the
> initialization hash table.
>
> The initial reftable contains a symref HEAD => "main" (or "master"), which is
> agnostic to the size of hash value, but this is an exceptional circumstance, and
> the reftable library does not cater for this exceptional. It insists that all

rephrase: "does not cater to this exception" maybe?


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-11-26 19:42     ` [PATCH v3 03/16] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-11-27 10:23       ` Ævar Arnfjörð Bjarmason
  2020-11-30 11:26         ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-27 10:23 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Han-Wen Nienhuys, Han-Wen Nienhuys


On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> TODO: relicense?

On the topics of TODOs, it would be nice to have an answer/some idea the
"I am concerned..." question in about
https://lore.kernel.org/git/873625i9tc.fsf@evledraar.gmail.com/ relevant
to the license etc.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 04/16] reftable: add error related functionality
  2020-11-26 19:42     ` [PATCH v3 04/16] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
  2020-11-27  9:13       ` Felipe Contreras
@ 2020-11-27 10:25       ` Ævar Arnfjörð Bjarmason
  2020-11-30 11:27         ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-27 10:25 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Han-Wen Nienhuys, Han-Wen Nienhuys


On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> The reftable/ directory is structured as a library, so it cannot
> crash on misuse. Instead, it returns an error codes.
>
> In addition, the error code can be used to signal conditions from lower levels
> of the library to be handled by higher levels of the library. For example, a
> transaction might legitimately write an empty reftable file, but in that case,
> we'd want to shortcut the transaction overhead.
> [...]
> +	static char buf[250];
> +	switch (err) {
> +	case REFTABLE_IO_ERROR:
> +		return "I/O error";
> +	case REFTABLE_FORMAT_ERROR:
> +		return "corrupt reftable file";
> +	case REFTABLE_NOT_EXIST_ERROR:
> +		return "file does not exist";
> +	case REFTABLE_LOCK_ERROR:
> +		return "data is outdated";
> +	case REFTABLE_API_ERROR:
> +		return "misuse of the reftable API";
> +	case REFTABLE_ZLIB_ERROR:
> +		return "zlib failure";
> +	case REFTABLE_NAME_CONFLICT:
> +		return "file/directory conflict";
> [...]

Not for this series, but I wonder if the eventual integration into the
i18n framework in git needs some thought, seems all the (including
presumably stuff show to users) UI strings are hardcoded English at the
moment.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 05/16] reftable: utility functions
  2020-11-26 19:42     ` [PATCH v3 05/16] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
  2020-11-27 10:18       ` Felipe Contreras
@ 2020-11-27 10:33       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-27 10:33 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Han-Wen Nienhuys, Han-Wen Nienhuys


On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> This commit provides basic utility classes for the reftable library.

Re Johannes's comments in v2
(https://lore.kernel.org/git/nycvar.QRO.7.76.6.2010021557570.50@tvgsbejvaqbjf.bet/)
about strbuf.c etc. being duplicated I see that's gone now.

It seems you did some version of my suggestion in
https://lore.kernel.org/git/873625i9tc.fsf@evledraar.gmail.com/ of
factoring out that code, except here:

> [...]
> +#ifdef REFTABLE_STANDALONE
> +
> +/* functions that git-core provides, for standalone compilation */
> +#include <stdint.h>
> +
> +uint64_t get_be64(void *in);
> +void put_be64(void *out, uint64_t i);

It's unclear whether that's an omission, or needed for some reason (is
git.git ever going to have REFTABLE_STANDALONE=true?

The update shellscript is also gone from v2->v3, is that out of tree now
somewhere? How does one test with the upstream code & presumably do some
munging on it to make it diff-able with what's going to be in git.git?

It would be really useful to have that sort of info/summary in the cover
letter between serieses. It's not that I mind e.g. having some of this
REFTABLE_STANDALONE code here. Just having to do archaeology trying to
figure out changes in this rather large series...

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 14/16] Reftable support for git-core
  2020-11-26 19:42     ` [PATCH v3 14/16] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2020-11-27 10:59       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-27 10:59 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Han-Wen Nienhuys, Han-Wen Nienhuys


On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> For background, see the previous commit introducing the library.
>
> This introduces the file refs/reftable-backend.c containing a reftable-powered
> ref storage backend.
>
> It can be activated by passing --ref-storage=reftable to "git init", or setting
> GIT_TEST_REFTABLE in the environment.
>
> Example use: see t/t0031-reftable.sh
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Co-authored-by: Jeff King <peff@peff.net>
> ---
>  .../technical/repository-version.txt          |    7 +
>  Makefile                                      |    4 +
>  builtin/clone.c                               |    5 +-
>  builtin/init-db.c                             |   55 +-
>  builtin/worktree.c                            |   27 +-
>  cache.h                                       |    8 +-
>  config.mak.uname                              |    2 +-
>  contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
>  refs.c                                        |   27 +-
>  refs.h                                        |    3 +
>  refs/refs-internal.h                          |    1 +
>  refs/reftable-backend.c                       | 1418 +++++++++++++++++
>  repository.c                                  |    2 +
>  repository.h                                  |    3 +
>  setup.c                                       |    9 +-
>  t/t0031-reftable.sh                           |  199 +++
>  t/t1409-avoid-packing-refs.sh                 |    6 +
>  t/t1450-fsck.sh                               |    6 +
>  t/t3210-pack-refs.sh                          |    6 +
>  t/test-lib.sh                                 |    5 +
>  20 files changed, 1771 insertions(+), 33 deletions(-)
>  create mode 100644 refs/reftable-backend.c
>  create mode 100755 t/t0031-reftable.sh
>
> diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
> index 7844ef30ff..7257623583 100644
> --- a/Documentation/technical/repository-version.txt
> +++ b/Documentation/technical/repository-version.txt
> @@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
>  multiple working directory mode, "config" file is shared while
>  "config.worktree" is per-working directory (i.e., it's in
>  GIT_COMMON_DIR/worktrees/<id>/config.worktree)
> +
> +==== `refStorage`
> +
> +Specifies the file format for the ref database. Values are `files`
> +(for the traditional packed + loose ref format) and `reftable` for the
> +binary reftable format. See https://github.com/google/reftable for
> +more information.

Should also be documented in Documentation/config/extensions.txt,
including maybe that setting this will hard die on older git
versions.

Then if you init with reftable and set it to "files" it'll say "warning:
ignoring broken ref refs/heads. Maybe amend that message to say the
config is wrong/detect it's a reftable file?

> +	/*
> +	 * Check to see if .git/HEAD exists; this must happen before
> +	 * initializing the ref db, because we want to see if there is an
> +	 * existing HEAD.
> +	 */
> +	path = git_path_buf(&buf, "HEAD");
> +	reinit = (!access(path, R_OK) ||
> +		  readlink(path, junk, sizeof(junk) - 1) != -1);
> +
> +	/*
> +	 * refs/heads is a file when using reftable. We can't reinitialize with
> +	 * a reftable because it will overwrite HEAD
> +	 */
> +	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
> +			      is_directory(git_path_buf(&buf, "refs/heads"))) {
> +		die("cannot switch ref storage format.");
> +	}
> +

I found this a bit weird, why would creating .git/refs/heads as a file
have to do with .git/HEAD, but reading on what this comment means is
that part of the initialization is to put "ref: refs/heads/.invalid" in
HEAD

And through further dissecting (nothing in this series documents this
AFAICT) that value is chosen because if you stray much from it git will
die with "not a git repository", i.e. it's invalid content, but not
*too* invalid.

I wonder if to achive that aim this isn't more useful &
self-documenting:

    $ cat .git/HEAD
    ref: refs/heads/[this repository uses a reftable for ref storage!]

Also since older versions of git will do this:

    $ __git_ps1
    ([this repository uses a reftable for ref storage!])

>  	/*
>  	 * We need to create a "refs" dir in any case so that older
>  	 * versions of git can tell that this is a repository.
> @@ -260,9 +281,6 @@ static int create_default_files(const char *template_path,
>  	 * Point the HEAD symref to the initial branch with if HEAD does
>  	 * not yet exist.
>  	 */
> -	path = git_path_buf(&buf, "HEAD");
> -	reinit = (!access(path, R_OK)
> -		  || readlink(path, junk, sizeof(junk)-1) != -1);
>  	if (!reinit) {
>  		char *ref;
>  
> @@ -279,7 +297,7 @@ static int create_default_files(const char *template_path,
>  		free(ref);
>  	}
>  
> -	initialize_repository_version(fmt->hash_algo, 0);
> +	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
>  
>  	/* Check filemode trustability */
>  	path = git_path_buf(&buf, "config");
> @@ -396,7 +414,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
>  
>  int init_db(const char *git_dir, const char *real_git_dir,
>  	    const char *template_dir, int hash, const char *initial_branch,
> -	    unsigned int flags)
> +	    const char *ref_storage_format, unsigned int flags)
>  {
>  	int reinit;
>  	int exist_ok = flags & INIT_DB_EXIST_OK;
> @@ -435,6 +453,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
>  	 * is an attempt to reinitialize new repository with an old tool.
>  	 */
>  	check_repository_format(&repo_fmt);
> +	repo_fmt.ref_storage = xstrdup(ref_storage_format);
>  
>  	validate_hash_algorithm(&repo_fmt, hash);
>  
> @@ -467,6 +486,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
>  		git_config_set("receive.denyNonFastforwards", "true");
>  	}
>  
> +	if (!strcmp(ref_storage_format, "reftable"))
> +		git_config_set("extensions.refStorage", ref_storage_format);
> +
>  	if (!(flags & INIT_DB_QUIET)) {
>  		int len = strlen(git_dir);
>  
> @@ -540,6 +562,7 @@ static const char *const init_db_usage[] = {
>  int cmd_init_db(int argc, const char **argv, const char *prefix)
>  {
>  	const char *git_dir;
> +	const char *ref_storage_format = default_ref_storage();
>  	const char *real_git_dir = NULL;
>  	const char *work_tree;
>  	const char *template_dir = NULL;
> @@ -548,15 +571,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
>  	const char *initial_branch = NULL;
>  	int hash_algo = GIT_HASH_UNKNOWN;
>  	const struct option init_db_options[] = {
> -		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
> -				N_("directory from which templates will be used")),
> +		OPT_STRING(0, "template", &template_dir,
> +			   N_("template-directory"),
> +			   N_("directory from which templates will be used")),
>  		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
> -				N_("create a bare repository"), 1),
> +			    N_("create a bare repository"), 1),
>  		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
> -			N_("permissions"),
> -			N_("specify that the git repository is to be shared amongst several users"),
> -			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
> +		  N_("permissions"),
> +		  N_("specify that the git repository is to be shared amongst several users"),
> +		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
>  		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
> +		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
> +			   N_("the ref storage format to use")),
>  		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
>  			   N_("separate git dir from working tree")),
>  		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
> @@ -697,10 +723,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	UNLEAK(real_git_dir);
> +	UNLEAK(ref_storage_format);
>  	UNLEAK(git_dir);
>  	UNLEAK(work_tree);
>  
>  	flags |= INIT_DB_EXIST_OK;
>  	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
> -		       initial_branch, flags);
> +		       initial_branch, ref_storage_format, flags);
>  }
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index ce56fdaaa9..7931620788 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -12,6 +12,7 @@
>  #include "submodule.h"
>  #include "utf8.h"
>  #include "worktree.h"
> +#include "../refs/refs-internal.h"
>  
>  static const char * const worktree_usage[] = {
>  	N_("git worktree add [<options>] <path> [<commit-ish>]"),
> @@ -402,9 +403,29 @@ static int add_worktree(const char *path, const char *refname,
>  	 * worktree.
>  	 */
>  	strbuf_reset(&sb);
> -	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
> -	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
> -	strbuf_reset(&sb);
> +	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
> +		/* XXX this is cut & paste from reftable_init_db. */
> +		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
> +		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
> +		strbuf_reset(&sb);
> +
> +		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
> +		safe_create_dir(sb.buf, 1);
> +		strbuf_reset(&sb);
> +
> +		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
> +		write_file(sb.buf, "this repository uses the reftable format");
> +		strbuf_reset(&sb);
> +
> +		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
> +		safe_create_dir(sb.buf, 1);
> +		strbuf_reset(&sb);
> +	} else {
> +		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
> +		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
> +		strbuf_reset(&sb);
> +	}
> +
>  	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
>  	write_file(sb.buf, "../..");
>  
> diff --git a/cache.h b/cache.h
> index e986cf4ea9..545d2b7260 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -627,9 +627,10 @@ int path_inside_repo(const char *prefix, const char *path);
>  #define INIT_DB_EXIST_OK 0x0002
>  
>  int init_db(const char *git_dir, const char *real_git_dir,
> -	    const char *template_dir, int hash_algo,
> -	    const char *initial_branch, unsigned int flags);
> -void initialize_repository_version(int hash_algo, int reinit);
> +	    const char *template_dir, int hash_algo, const char *initial_branch,
> +	    const char *ref_storage_format, unsigned int flags);
> +void initialize_repository_version(int hash_algo, int reinit,
> +				   const char *ref_storage_format);
>  
>  void sanitize_stdfds(void);
>  int daemonize(void);
> @@ -1043,6 +1044,7 @@ struct repository_format {
>  	int is_bare;
>  	int hash_algo;
>  	char *work_tree;
> +	char *ref_storage;
>  	struct string_list unknown_extensions;
>  	struct string_list v1_only_extensions;
>  };
> diff --git a/config.mak.uname b/config.mak.uname
> index c7eba69e54..ae4e25a1a4 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -709,7 +709,7 @@ vcxproj:
>  	# Make .vcxproj files and add them
>  	unset QUIET_GEN QUIET_BUILT_IN; \
>  	perl contrib/buildsystems/generate -g Vcxproj
> -	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
> +	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
>  
>  	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
>  	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
> diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
> index d2584450ba..1a25789d28 100644
> --- a/contrib/buildsystems/Generators/Vcxproj.pm
> +++ b/contrib/buildsystems/Generators/Vcxproj.pm
> @@ -77,7 +77,7 @@ sub createProject {
>      my $libs_release = "\n    ";
>      my $libs_debug = "\n    ";
>      if (!$static_library) {
> -      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
> +      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
>        $libs_debug = $libs_release;
>        $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
>        $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
> @@ -232,6 +232,7 @@ sub createProject {
>  EOM
>      if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
>        my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
> +      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
>        my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
>  
>        print F << "EOM";
> @@ -241,6 +242,14 @@ sub createProject {
>        <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
>      </ProjectReference>
>  EOM
> +      if (!($name =~ /xdiff|libreftable/)) {
> +        print F << "EOM";
> +    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
> +      <Project>$uuid_libreftable</Project>
> +      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
> +    </ProjectReference>
> +EOM
> +      }
>        if (!($name =~ 'xdiff')) {
>          print F << "EOM";
>      <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
> diff --git a/refs.c b/refs.c
> index 392f0bbf68..1b874db334 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -19,10 +19,16 @@
>  #include "repository.h"
>  #include "sigchain.h"
>  
> +const char *default_ref_storage(void)
> +{
> +	const char *test = getenv("GIT_TEST_REFTABLE");
> +	return test ? "reftable" : "files";
> +}
> +
>  /*
>   * List of all available backends
>   */
> -static struct ref_storage_be *refs_backends = &refs_be_files;
> +static struct ref_storage_be *refs_backends = &refs_be_reftable;
>  
>  static struct ref_storage_be *find_ref_storage_backend(const char *name)
>  {
> @@ -1754,13 +1760,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
>   * Create, record, and return a ref_store instance for the specified
>   * gitdir.
>   */
> -static struct ref_store *ref_store_init(const char *gitdir,
> +static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
>  					unsigned int flags)
>  {
> -	const char *be_name = "files";
> -	struct ref_storage_be *be = find_ref_storage_backend(be_name);
> +	struct ref_storage_be *be;
>  	struct ref_store *refs;
>  
> +	be = find_ref_storage_backend(be_name);
>  	if (!be)
>  		BUG("reference backend %s is unknown", be_name);
>  
> @@ -1776,7 +1782,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
>  	if (!r->gitdir)
>  		BUG("attempting to get main_ref_store outside of repository");
>  
> -	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
> +	r->refs_private = ref_store_init(r->gitdir,
> +					 r->ref_storage_format ?
> +						       r->ref_storage_format :
> +						       default_ref_storage(),
> +					 REF_STORE_ALL_CAPS);
>  	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
>  	return r->refs_private;
>  }
> @@ -1832,7 +1842,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
>  		goto done;
>  
>  	/* assume that add_submodule_odb() has been called */
> -	refs = ref_store_init(submodule_sb.buf,
> +	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
>  			      REF_STORE_READ | REF_STORE_ODB);
>  	register_ref_store_map(&submodule_ref_stores, "submodule",
>  			       refs, submodule);
> @@ -1846,6 +1856,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
>  
>  struct ref_store *get_worktree_ref_store(const struct worktree *wt)
>  {
> +	const char *format = default_ref_storage();
>  	struct ref_store *refs;
>  	const char *id;
>  
> @@ -1859,9 +1870,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
>  
>  	if (wt->id)
>  		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
> -				      REF_STORE_ALL_CAPS);
> +				      format, REF_STORE_ALL_CAPS);
>  	else
> -		refs = ref_store_init(get_git_common_dir(),
> +		refs = ref_store_init(get_git_common_dir(), format,
>  				      REF_STORE_ALL_CAPS);
>  
>  	if (refs)
> diff --git a/refs.h b/refs.h
> index 6695518156..7dc60472c9 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -11,6 +11,9 @@ struct string_list;
>  struct string_list_item;
>  struct worktree;
>  
> +/* Returns the ref storage backend to use by default. */
> +const char *default_ref_storage(void);
> +
>  /*
>   * Resolve a reference, recursively following symbolic refererences.
>   *
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index 467f4b3c93..28166bf1f8 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -669,6 +669,7 @@ struct ref_storage_be {
>  };
>  
>  extern struct ref_storage_be refs_be_files;
> +extern struct ref_storage_be refs_be_reftable;
>  extern struct ref_storage_be refs_be_packed;
>  
>  /*
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> new file mode 100644
> index 0000000000..894c72ddc4
> --- /dev/null
> +++ b/refs/reftable-backend.c
> @@ -0,0 +1,1418 @@
> +#include "../cache.h"
> +#include "../chdir-notify.h"
> +#include "../config.h"
> +#include "../iterator.h"
> +#include "../lockfile.h"
> +#include "../refs.h"
> +#include "../reftable/reftable-stack.h"
> +#include "../reftable/reftable-record.h"
> +#include "../reftable/reftable-error.h"
> +#include "../reftable/reftable-blocksource.h"
> +#include "../reftable/reftable-reader.h"
> +#include "../reftable/reftable-iterator.h"
> +#include "../reftable/reftable-merged.h"
> +#include "../reftable/reftable-generic.h"
> +#include "../worktree.h"
> +#include "refs-internal.h"
> +
> +extern struct ref_storage_be refs_be_reftable;
> +
> +struct git_reftable_ref_store {
> +	struct ref_store base;
> +	unsigned int store_flags;
> +
> +	int err;
> +	char *repo_dir;
> +
> +	char *reftable_dir;
> +	char *worktree_reftable_dir;
> +
> +	struct reftable_stack *main_stack;
> +	struct reftable_stack *worktree_stack;
> +};
> +
> +static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
> +					const char *refname)
> +{
> +	if (store->worktree_stack == NULL)
> +		return store->main_stack;
> +
> +	switch (ref_type(refname)) {
> +	case REF_TYPE_PER_WORKTREE:
> +	case REF_TYPE_PSEUDOREF:
> +	case REF_TYPE_OTHER_PSEUDOREF:
> +		return store->worktree_stack;
> +	default:
> +	case REF_TYPE_MAIN_PSEUDOREF:
> +	case REF_TYPE_NORMAL:
> +		return store->main_stack;
> +	}
> +}
> +
> +static int git_reftable_read_raw_ref(struct ref_store *ref_store,
> +				     const char *refname, struct object_id *oid,
> +				     struct strbuf *referent,
> +				     unsigned int *type);
> +
> +static void clear_reftable_log_record(struct reftable_log_record *log)
> +{
> +	log->old_hash = NULL;
> +	log->new_hash = NULL;
> +	log->message = NULL;
> +	log->refname = NULL;
> +	reftable_log_record_release(log);
> +}
> +
> +static void fill_reftable_log_record(struct reftable_log_record *log)
> +{
> +	const char *info = git_committer_info(0);
> +	struct ident_split split = { NULL };
> +	int result = split_ident_line(&split, info, strlen(info));
> +	int sign = 1;
> +	assert(0 == result);
> +
> +	reftable_log_record_release(log);
> +	log->name =
> +		xstrndup(split.name_begin, split.name_end - split.name_begin);
> +	log->email =
> +		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-11-27 10:23       ` Ævar Arnfjörð Bjarmason
@ 2020-11-30 11:26         ` Han-Wen Nienhuys
  2020-11-30 20:25           ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-11-30 11:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Fri, Nov 27, 2020 at 11:23 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Thu, Nov 26 2020, Han-Wen Nienhuys via GitGitGadget wrote:
>
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > TODO: relicense?
>
> On the topics of TODOs, it would be nice to have an answer/some idea the
> "I am concerned..." question in about
> https://lore.kernel.org/git/873625i9tc.fsf@evledraar.gmail.com/ relevant
> to the license etc.


Long term objective is for this to live in git proper, but remains
independently compilable, so it can be used by libgit2 as well.

This is achieved by having the code only depend on functions in
strbuf.h and git-compat-util.h.

I would like to enforce the separation by having an actual standalone
compile, but the discussion over the code to do so has caused more
heat than light, so I have deferred this by reworking standalone code
to to be on file level, and only importing the non-standalone code. In
one of the other patches, you found a place where I had overlooked a
REFTABLE_STANDALONE #ifdef; I'll remove that. I'll also remove the
update.sh script.

Until the reftable code is actually merged into git, I work on the
reftable library itself at github.com/google/reftable. I've asked our
opensource team if we could switch off the CLA enforcement for this
repo, to facilitate back & forth imports of fixes.

The current version of the patchseries mainly addresses overall
feedback from the Git team at Google to split the public API along
commits too, and updates the series so it applies to master again.


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 04/16] reftable: add error related functionality
  2020-11-27 10:25       ` Ævar Arnfjörð Bjarmason
@ 2020-11-30 11:27         ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-11-30 11:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Fri, Nov 27, 2020 at 11:25 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Not for this series, but I wonder if the eventual integration into the
> i18n framework in git needs some thought, seems all the (including
> presumably stuff show to users) UI strings are hardcoded English at the
> moment.

AFAIK, these are the only strings that potentially show up in the UI.

If these strings need to be i18n'd.  It is probably easier for
git-core to provide its own version of reftable_error_str() ?  Or we
could remove reftable_error_str from the official API?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-09 21:13       ` Emily Shaffer
  2020-10-10 17:03         ` Han-Wen Nienhuys
@ 2020-11-30 14:44         ` Han-Wen Nienhuys
  1 sibling, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-11-30 14:44 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Jonathan Nieder, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Han-Wen Nienhuys

On Fri, Oct 9, 2020 at 11:13 PM Emily Shaffer <emilyshaffer@google.com> wrote:
>
> On Thu, Oct 01, 2020 at 08:58:51PM -0700, Jonathan Nieder wrote:
> >
> > Hi,
> >
> > Han-Wen Nienhuys wrote:
> >
> > >  reftable/reftable.h | 585 ++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 585 insertions(+)
> > >  create mode 100644 reftable/reftable.h
> >
> > Adding a header in a separate patch from the implementation doesn't
> > match the usual practice.  Can we add declarations in the same patch
> > as the functions being declared instead?
> >
> > We could still introduce the header early in its own patch if we want,
> > but it would be a skeleton of a header that gets filled out by later
> > patches.
>
> To poke a little deeper into what I think Jonathan is trying to say:
>
> Of course, taking a completed project - like your initial reftable
> submission - and then chopping it up into a cute story of commits is a
> pain in the ***. Doing it twice - or more - is just aggravating. So I
> wonder whether we can bikeshed what story would look nice before you
> even pick up your 'git rebase -i'? Doing that bikeshedding here on list

Looks like this didn't happen.  I went with Jonathan's suggestion, and
chopped the public API header in smaller bits. PTAL.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v2 02/13] reftable: define the public API
  2020-10-12 16:57         ` Jonathan Nieder
@ 2020-11-30 14:55           ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-11-30 14:55 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Han-Wen Nienhuys

On Mon, Oct 12, 2020 at 6:57 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
> > We could, but it would considerably complicate work on this patch
> > series, as the commit boundary then doesn't fall on file boundaries
> > anymore. Would you be open to having multiple headers for the public
> > interface? eg. reftable-record.h, reftable-reader.h etc. ?
>
> I'd be more than open to it: I think that would make for a clearer API.

Done.

> >> Should these call BUG()?  Or is it useful for some callers to be able
> >> to recover from these errors?
> >
> > Since this was written as a standalone library, it's up to the caller
> > to decide what should be done.
>
> I'm not strongly opinionated about this, but just a quick note: this
> implies that the library would need to make sure it is producing a
> valid state in error cases.

This is what it already does. I tried to clarify this in some places.

> Does the API documentation describe what state a handle is in after
> an error --- e.g., what operations are permitted after that?

API_ERROR should not result in state changes.

> >>> +int reftable_error_to_errno(int err);
> >>
> >> What is the intended use of this function?
> >
> > The read_raw_ref method in the ref backend API uses errno values as
> > out-of-band communication mechanism.
>
> Could we change that?  It sounds error-prone.

I moved it into refs/reftable-backend.c; I'd rather not also have to
clean up the files backend as part of this work, though.

> >> Do I pass in the 'struct block_source *' as the source arg?  If so, why
> >> are these declared as void *?
> >
> > you pass in block_source->arg. Should struct implementations of
> > polymorphic types carry a pointer to their own vtable instead?
>
> I think a pointer to block_source would make it more self-explanatory,
> yes.

This has a number of downsides, though. It means I have to introduce
separate structs strbuf_blocksource and file_blocksource, along with
functions to create and deallocate them, so it makes the whole thing
more unwieldy to use.

> >> Is the reason this manages the buffer instead of requiring a
> >> caller-supplied buffer to support zero-copy?
> >
> > Log blocks are compressed, so the caller doesn't know the correct size
> > to supply. By letting the block source handle the management, we can
> > swap out the block read from the file for a block managed by malloc on
> > decompressing a log block.
>
> Hm, my naive assumption would have been that we'd use different
> buffers for the compressed and uncompressed data.

Maybe you can comment on how to solve this differently when we get to
the implementation of block reading/writing in the commit series.

> >>> + Generic tables
> >>> +
> >>> + A unified API for reading tables, either merged tables, or single readers.
> >>
> >> Are there callers/helpers that don't know whether they want one or the
> >> other?
> >
> > The setup with per-worktree refs means that there are two reftable
> > stacks in a .git repo. In order to iterate over the entire ref space,
> > you have to merge two merged tables, but the stack itslef merges a set
> > of simple reftables.
>
> I see.  That's subtle, so it seems worth documenting for the next
> person reading this documentation and wondering why.

Done.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-11-30 11:26         ` Han-Wen Nienhuys
@ 2020-11-30 20:25           ` Han-Wen Nienhuys
  2020-11-30 21:21             ` Felipe Contreras
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-11-30 20:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Mon, Nov 30, 2020 at 12:26 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> Until the reftable code is actually merged into git, I work on the
> reftable library itself at github.com/google/reftable. I've asked our
> opensource team if we could switch off the CLA enforcement for this
> repo, to facilitate back & forth imports of fixes.

I've asked our lawyers, but this is not currently possible, unfortunately.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-11-30 20:25           ` Han-Wen Nienhuys
@ 2020-11-30 21:21             ` Felipe Contreras
  2020-12-01  9:51               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Felipe Contreras @ 2020-11-30 21:21 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Mon, Nov 30, 2020 at 2:28 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
>
> On Mon, Nov 30, 2020 at 12:26 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> > Until the reftable code is actually merged into git, I work on the
> > reftable library itself at github.com/google/reftable. I've asked our
> > opensource team if we could switch off the CLA enforcement for this
> > repo, to facilitate back & forth imports of fixes.
>
> I've asked our lawyers, but this is not currently possible, unfortunately.

Maybe I'm not understanding something correctly.

Git's license is GPLv2. When you add your Signed-off-by line you are
saying you are authorized to provide this code to the Git project
under its open source license.

Therefore, I should be able to take your patches, and rework them in
any way I want as long as I keep Git's license.

I don't need your permission, you are supposedly giving it by adding
your Signed-off-by line.

Read: https://developercertificate.org/

If you are not authorized to provide these patches under an open
source license, then you should not add your Signed-off-by line.

And if these patches cannot be used by the Git project, nor can we
work on them in a collaborative manner, then what's the point in
providing them?

Sounds to me Google has not made its mind about actually contributing
these changes.

Or am I missing something?

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-11-30 21:21             ` Felipe Contreras
@ 2020-12-01  9:51               ` Han-Wen Nienhuys
  2020-12-01 10:38                 ` Felipe Contreras
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-12-01  9:51 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Mon, Nov 30, 2020 at 10:21 PM Felipe Contreras
<felipe.contreras@gmail.com> wrote:
>
> On Mon, Nov 30, 2020 at 2:28 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> >
> > On Mon, Nov 30, 2020 at 12:26 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> > > Until the reftable code is actually merged into git, I work on the
> > > reftable library itself at github.com/google/reftable. I've asked our
> > > opensource team if we could switch off the CLA enforcement for this
> > > repo, to facilitate back & forth imports of fixes.
> >
> > I've asked our lawyers, but this is not currently possible, unfortunately.
>
> Maybe I'm not understanding something correctly.

> Sounds to me Google has not made its mind about actually contributing
> these changes.
>
> Or am I missing something?

The restrictions are not about the patches themselves. They are about
restricting what gets posted under github.com/google/ .


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-12-01  9:51               ` Han-Wen Nienhuys
@ 2020-12-01 10:38                 ` Felipe Contreras
  2020-12-01 11:45                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Felipe Contreras @ 2020-12-01 10:38 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Tue, Dec 1, 2020 at 3:51 AM Han-Wen Nienhuys <hanwen@google.com> wrote:
> On Mon, Nov 30, 2020 at 10:21 PM Felipe Contreras
> <felipe.contreras@gmail.com> wrote:
> > On Mon, Nov 30, 2020 at 2:28 PM Han-Wen Nienhuys <hanwen@google.com> wrote:

> > Sounds to me Google has not made its mind about actually contributing
> > these changes.
> >
> > Or am I missing something?
>
> The restrictions are not about the patches themselves. They are about
> restricting what gets posted under github.com/google/ .

I see.

But it doesn't have to be posted on github.com/google/, it doesn't
have to be posted on github.com at all. If the code is under an open
source license, you (or anyone) can post it anywhere.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-12-01 10:38                 ` Felipe Contreras
@ 2020-12-01 11:45                   ` Ævar Arnfjörð Bjarmason
  2020-12-01 13:34                     ` Han-Wen Nienhuys
  2020-12-01 23:03                     ` Felipe Contreras
  0 siblings, 2 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-12-01 11:45 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Han-Wen Nienhuys, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys


On Tue, Dec 01 2020, Felipe Contreras wrote:

> On Tue, Dec 1, 2020 at 3:51 AM Han-Wen Nienhuys <hanwen@google.com> wrote:
>> On Mon, Nov 30, 2020 at 10:21 PM Felipe Contreras
>> <felipe.contreras@gmail.com> wrote:
>> > On Mon, Nov 30, 2020 at 2:28 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
>
>> > Sounds to me Google has not made its mind about actually contributing
>> > these changes.
>> >
>> > Or am I missing something?
>>
>> The restrictions are not about the patches themselves. They are about
>> restricting what gets posted under github.com/google/ .
>
> I see.
>
> But it doesn't have to be posted on github.com/google/, it doesn't
> have to be posted on github.com at all. If the code is under an open
> source license, you (or anyone) can post it anywhere.

The libgit2.git and git.git codebases are under different
licenses. Therefore if the goal is to have code that's used in both it's
not viable to e.g. store it in git.git under our current contribution
policies.

The first contributor to submit a fix to it under GPLv2 as opposed to
"any GPL or LGPL version" or whatever will preclude its subsequent
import into libgit2.

Assigning copyright to Google is a way around that. They own your work,
and then they re-license it under whatever license those two projects
use.

But as I pointed out in [1] it requires contributors to git.git's
reftable/ directory & Junio to play along with that scheme, least the
whole process come to a halt. Maybe that's worth it, maybe not. But
seems like something the series should very explicitly highlight and
document e.g. in Documentation/SubmittingPatches.

We do have some in-tree code in that state already, but it's
mostly/entirely one-off imports of GNU/FSF code like compat/regex/ or
sh-i18n--envsubst.c that we're not actively working on in any way, and
isn't core to git like the reftable would be.

There's individual contributors who just don't like the idea of
copyright assignment and wouldn't do it for that reason, but there's
also contributors who do paid-for work for their employers on git.git
sometimes.

E.g. I've done such work under terms that were basically "you can
contribute to open source as part of your job, and under the license the
relevant project uses". Going through the legal process of changing that
to "we, your employer, who own copyright to your work, agree to license
it to this third party project/company/whatever for the purposes of an
open source contribution" would in my experience be *very much* an
uphill battle.

So I think even if the individual contributor wouldn't mind the
copyright assignment (I'd personally be fine with that), e.g. I've been
in employment relationships where I'd just forget about sending a patch
if it needed assignment, because there's no way I'd be getting it past
legal, in the same way Han-Wen seems to be having an uphill battle with
Google's lawyers.

1. https://lore.kernel.org/git/873625i9tc.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-12-01 11:45                   ` Ævar Arnfjörð Bjarmason
@ 2020-12-01 13:34                     ` Han-Wen Nienhuys
  2020-12-01 23:13                       ` Felipe Contreras
  2020-12-01 23:03                     ` Felipe Contreras
  1 sibling, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2020-12-01 13:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Felipe Contreras, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Tue, Dec 1, 2020 at 12:45 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Dec 01 2020, Felipe Contreras wrote:
>
> > On Tue, Dec 1, 2020 at 3:51 AM Han-Wen Nienhuys <hanwen@google.com> wrote:
> >> On Mon, Nov 30, 2020 at 10:21 PM Felipe Contreras
> >> <felipe.contreras@gmail.com> wrote:
> >> > On Mon, Nov 30, 2020 at 2:28 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> >
> >> > Sounds to me Google has not made its mind about actually contributing
> >> > these changes.
> >> >
> >> > Or am I missing something?
> >>
> >> The restrictions are not about the patches themselves. They are about
> >> restricting what gets posted under github.com/google/ .
> >
> > I see.
> >
> > But it doesn't have to be posted on github.com/google/, it doesn't
> > have to be posted on github.com at all. If the code is under an open
> > source license, you (or anyone) can post it anywhere.

I am also bound by my employment contract to follow company
instructions. While you (Felipe) can post code where you want, the
same is not true for me.

> The libgit2.git and git.git codebases are under different
> licenses. Therefore if the goal is to have code that's used in both it's
> not viable to e.g. store it in git.git under our current contribution
> policies.
> [..]

Thanks for sketching the dilemmas here. Me and Jun are talking to the
open source folks here at Google, so I am optimistic that we will find
some solution. Please give us a few days to sort this out.

> The first contributor to submit a fix to it under GPLv2 as opposed to
> "any GPL or LGPL version" or whatever will preclude its subsequent
> import into libgit2.
>
> Assigning copyright to Google is a way around that. They own your work,
> and then they re-license it under whatever license those two projects
> use.
>
> But as I pointed out in [1] it requires contributors to git.git's
> reftable/ directory & Junio to play along with that scheme, least the
> whole process come to a halt. Maybe that's worth it, maybe not. But
> seems like something the series should very explicitly highlight and
> document e.g. in Documentation/SubmittingPatches.

FWIW, the plan of record is still for this code to have its source of
truth in the Git project. Once it's there, there is no need for Google
to have CLAs for contributions to the reftable/ directory.

There is still the open question of how to set the license terms for
this code, so it is OK for it to be copied into libgit2. I picked the
current license (BSD) because it was the most liberal I could find.
(Maybe I should document that in the commit message.)

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-12-01 11:45                   ` Ævar Arnfjörð Bjarmason
  2020-12-01 13:34                     ` Han-Wen Nienhuys
@ 2020-12-01 23:03                     ` Felipe Contreras
  1 sibling, 0 replies; 251+ messages in thread
From: Felipe Contreras @ 2020-12-01 23:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys, Han-Wen Nienhuys via GitGitGadget, git,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Tue, Dec 1, 2020 at 5:45 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> Assigning copyright to Google is a way around that. They own your work,
> and then they re-license it under whatever license those two projects
> use.

That's the default. Unless otherwise stated; you retain the copyright
while contributing code to open source projects.

Generally when open source projects want to change license they have
to get the agreement from the majority of copyright holders.

Some projects do ask you to give away your copyright, but I'm not
comfortable with that. If I want to contribute my code with GPLv2,
that's the license I want. If a project I contribute to wants to
switch to GPLv3, then they have to remove my code (or do something
illegal).

So I see nothing wrong with Google owning the copyright and
controlling under which license(s) their code is distributed. In fact,
I would see something wrong with the opposite; them giving away their
copyright.

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v3 03/16] reftable: add LICENSE
  2020-12-01 13:34                     ` Han-Wen Nienhuys
@ 2020-12-01 23:13                       ` Felipe Contreras
  0 siblings, 0 replies; 251+ messages in thread
From: Felipe Contreras @ 2020-12-01 23:13 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Han-Wen Nienhuys

On Tue, Dec 1, 2020 at 7:35 AM Han-Wen Nienhuys <hanwen@google.com> wrote:
> > On Tue, Dec 01 2020, Felipe Contreras wrote:

> > > But it doesn't have to be posted on github.com/google/, it doesn't
> > > have to be posted on github.com at all. If the code is under an open
> > > source license, you (or anyone) can post it anywhere.
>
> I am also bound by my employment contract to follow company
> instructions. While you (Felipe) can post code where you want, the
> same is not true for me.

I understand that, but what if a stubborn person decides to do that,
and the code starts diverging from github.com/google/? So essentially
a fork is created.

I don't think that's in the best interest of Google, and that would
force their hand to either a) allow you to contribute to the open
repo, or b) allow open source people to contribute to the google repo.

I find it silly that these thought experiments have to be explained to
corporate lawyers, or worse; they actually have to be done so the
lawyers get the point. But alas, that's what working in a big
corporation entails.

Good luck with that. (No sarcasm intended)

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v4 00/15] reftable library
  2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
                       ` (15 preceding siblings ...)
  2020-11-26 19:42     ` [PATCH v3 16/16] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00     ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
                         ` (15 more replies)
  16 siblings, 16 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys

This splits the giant commit from
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

The final commit should also be split up, but I want to wait until we have
consensus that the bottom commits look good.

version 8 Dec 2020:

 * fixes for windows: add random suffix to table names
 * tagged unions for reftable_ref_record
 * some fixes by dscho
 * some tweaks in response to Felipe Contreras
 * switched to github.com/hanwen/reftable (which does not need CLA)

Based on github.com/hanwen/reftable at
c33a40ad23dd622d8c72139c41a2be1cfa063e5f Add random suffix to table names

Han-Wen Nienhuys (14):
  init-db: set the_repository->hash_algo early on
  reftable: add LICENSE
  reftable: add error related functionality
  reftable: utility functions
  reftable: add blocksource, an abstraction for random access reads
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: read reftable files
  reftable: reftable file level tests
  reftable: rest of library
  Reftable support for git-core
  Add "test-tool dump-reftable" command.

SZEDER Gábor (1):
  git-prompt: prepare for reftable refs backend

 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |   46 +-
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   76 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/CMakeLists.txt           |   14 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 contrib/completion/git-prompt.sh              |    7 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1435 +++++++++++++++++
 reftable/.gitattributes                       |    1 +
 reftable/LICENSE                              |   31 +
 reftable/VERSION                              |    1 +
 reftable/basics.c                             |  128 ++
 reftable/basics.h                             |   60 +
 reftable/basics_test.c                        |   98 ++
 reftable/block.c                              |  440 +++++
 reftable/block.h                              |  127 ++
 reftable/block_test.c                         |  121 ++
 reftable/blocksource.c                        |  148 ++
 reftable/blocksource.h                        |   22 +
 reftable/constants.h                          |   21 +
 reftable/dump.c                               |  213 +++
 reftable/error.c                              |   41 +
 reftable/iter.c                               |  248 +++
 reftable/iter.h                               |   75 +
 reftable/merged.c                             |  366 +++++
 reftable/merged.h                             |   35 +
 reftable/merged_test.c                        |  343 ++++
 reftable/pq.c                                 |  115 ++
 reftable/pq.h                                 |   32 +
 reftable/publicbasics.c                       |   58 +
 reftable/reader.c                             |  733 +++++++++
 reftable/reader.h                             |   75 +
 reftable/record.c                             | 1157 +++++++++++++
 reftable/record.h                             |  139 ++
 reftable/record_test.c                        |  398 +++++
 reftable/refname.c                            |  209 +++
 reftable/refname.h                            |   29 +
 reftable/refname_test.c                       |  102 ++
 reftable/reftable-blocksource.h               |   49 +
 reftable/reftable-error.h                     |   62 +
 reftable/reftable-generic.h                   |   48 +
 reftable/reftable-iterator.h                  |   37 +
 reftable/reftable-malloc.h                    |   18 +
 reftable/reftable-merged.h                    |   68 +
 reftable/reftable-reader.h                    |   89 +
 reftable/reftable-record.h                    |  100 ++
 reftable/reftable-stack.h                     |  120 ++
 reftable/reftable-tests.h                     |   22 +
 reftable/reftable-writer.h                    |  147 ++
 reftable/reftable.c                           |   98 ++
 reftable/reftable_test.c                      |  580 +++++++
 reftable/stack.c                              | 1260 +++++++++++++++
 reftable/stack.h                              |   41 +
 reftable/stack_test.c                         |  803 +++++++++
 reftable/system.h                             |   32 +
 reftable/test_framework.c                     |   23 +
 reftable/test_framework.h                     |   53 +
 reftable/tree.c                               |   63 +
 reftable/tree.h                               |   34 +
 reftable/tree_test.c                          |   61 +
 reftable/writer.c                             |  681 ++++++++
 reftable/writer.h                             |   50 +
 reftable/zlib-compat.c                        |   92 ++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/helper/test-reftable.c                      |   20 +
 t/helper/test-tool.c                          |    4 +-
 t/helper/test-tool.h                          |    2 +
 t/t0031-reftable.sh                           |  199 +++
 t/t0032-reftable-unittest.sh                  |   15 +
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 82 files changed, 12112 insertions(+), 40 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/error.c
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-blocksource.h
 create mode 100644 reftable/reftable-error.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-reader.h
 create mode 100644 reftable/reftable-record.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0031-reftable.sh
 create mode 100755 t/t0032-reftable-unittest.sh


base-commit: 3a0b884caba2752da0af626fb2de7d597c844e8b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v4
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v3:

  1:  030998a732a <  -:  ----------- move sleep_millisec to git-compat-util.h
  2:  cf667653dff <  -:  ----------- init-db: set the_repository->hash_algo early on
  -:  ----------- >  1:  40ac041d0ef init-db: set the_repository->hash_algo early on
  3:  91c3ac2449e =  2:  cc0fe58bd33 reftable: add LICENSE
  4:  2aa30f536fb !  3:  798f680b800 reftable: add error related functionality
     @@ reftable/error.c (new)
      +		return "zlib failure";
      +	case REFTABLE_NAME_CONFLICT:
      +		return "file/directory conflict";
     ++	case REFTABLE_EMPTY_TABLE_ERROR:
     ++		return "wrote empty table";
      +	case REFTABLE_REFNAME_ERROR:
      +		return "invalid refname";
      +	case -1:
     @@ reftable/reftable-error.h (new)
      +#define REFTABLE_ERROR_H
      +
      +/*
     -+ Errors in reftable calls are signaled with negative integer return values. 0
     -+ means success.
     -+*/
     ++ * Errors in reftable calls are signaled with negative integer return values. 0
     ++ * means success.
     ++ */
      +enum reftable_error {
      +	/* Unexpected file system behavior */
      +	REFTABLE_IO_ERROR = -2,
      +
     -+	/* Format inconsistency on reading data
     -+	 */
     ++	/* Format inconsistency on reading data */
      +	REFTABLE_FORMAT_ERROR = -3,
      +
     -+	/* File does not exist. Returned from block_source_from_file(),  because
     -+	   it needs special handling in stack.
     -+	*/
     ++	/* File does not exist. Returned from block_source_from_file(), because
     ++	 * it needs special handling in stack.
     ++	 */
      +	REFTABLE_NOT_EXIST_ERROR = -4,
      +
      +	/* Trying to write out-of-date data. */
      +	REFTABLE_LOCK_ERROR = -5,
      +
      +	/* Misuse of the API:
     -+	   - on writing a record with NULL refname.
     -+	   - on writing a reftable_ref_record outside the table limits
     -+	   - on writing a ref or log record before the stack's next_update_index
     -+	   - on writing a log record with multiline message with
     -+	   exact_log_message unset
     -+	   - on reading a reftable_ref_record from log iterator, or vice versa.
     -+
     -+	  When a call misuses the API, the internal state of the library is kept
     -+	  unchanged.
     -+	*/
     ++	 *  - on writing a record with NULL refname.
     ++	 *  - on writing a reftable_ref_record outside the table limits
     ++	 *  - on writing a ref or log record before the stack's
     ++	 * next_update_inde*x
     ++	 *  - on writing a log record with multiline message with
     ++	 *  exact_log_message unset
     ++	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
     ++	 *
     ++	 * When a call misuses the API, the internal state of the library is
     ++	 * kept unchanged.
     ++	 */
      +	REFTABLE_API_ERROR = -6,
      +
      +	/* Decompression error */
     @@ reftable/reftable-error.h (new)
      +	/* Dir/file conflict. */
      +	REFTABLE_NAME_CONFLICT = -9,
      +
     -+	/* Illegal ref name. */
     ++	/* Invalid ref name. */
      +	REFTABLE_REFNAME_ERROR = -10,
      +};
      +
  5:  987d1d186fd !  4:  2f55ff6d240 reftable: utility functions
     @@ Commit message
          This commit provides basic utility classes for the reftable library.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
     +    Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## Makefile ##
      @@ Makefile: TEST_BUILTINS_OBJS += test-read-cache.o
     @@ Makefile: cocciclean:
       	$(RM) $(TEST_PROGRAMS)
       	$(RM) $(FUZZ_PROGRAMS)
      
     + ## contrib/buildsystems/CMakeLists.txt ##
     +@@ contrib/buildsystems/CMakeLists.txt: parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
     + list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
     + add_library(xdiff STATIC ${libxdiff_SOURCES})
     + 
     ++#reftable
     ++parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
     ++
     ++list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
     ++add_library(reftable STATIC ${reftable_SOURCES})
     ++
     + if(WIN32)
     + 	if(NOT MSVC)#use windres when compiling with gcc and clang
     + 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
     +@@ contrib/buildsystems/CMakeLists.txt: endif()
     + #link all required libraries to common-main
     + add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
     + 
     +-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
     ++target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
     + if(Intl_FOUND)
     + 	target_link_libraries(common-main ${Intl_LIBRARIES})
     + endif()
     +@@ contrib/buildsystems/CMakeLists.txt: if(BUILD_TESTING)
     + add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
     + target_link_libraries(test-fake-ssh common-main)
     + 
     ++#reftable-tests
     ++parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
     ++list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
     ++
     + #test-tool
     + parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")
     + 
     + list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
     +-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
     ++add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
     + target_link_libraries(test-tool common-main)
     + 
     + set_target_properties(test-fake-ssh test-tool
     +
       ## reftable/basics.c (new) ##
      @@
      +/*
     @@ reftable/basics.c (new)
      +	size_t lo = 0;
      +	size_t hi = sz;
      +
     -+	/* invariant: (hi == sz) || f(hi) == true
     -+	   (lo == 0 && f(0) == true) || fi(lo) == false
     ++	/* Invariants:
     ++	 *
     ++	 *  (hi == sz) || f(hi) == true
     ++	 *  (lo == 0 && f(0) == true) || fi(lo) == false
      +	 */
      +	while (hi - lo > 1) {
      +		size_t mid = lo + (hi - lo) / 2;
      +
     -+		int val = f(mid, args);
     -+		if (val) {
     ++		if (f(mid, args))
      +			hi = mid;
     -+		} else {
     ++		else
      +			lo = mid;
     -+		}
      +	}
      +
     -+	if (lo == 0) {
     -+		if (f(0, args)) {
     -+			return 0;
     -+		} else {
     -+			return 1;
     -+		}
     -+	}
     ++	if (lo)
     ++		return hi;
      +
     -+	return hi;
     ++	return f(0, args) ? 0 : 1;
      +}
      +
      +void free_names(char **a)
      +{
     -+	char **p = a;
     -+	if (p == NULL) {
     ++	char **p;
     ++	if (a == NULL) {
      +		return;
      +	}
     -+	while (*p) {
     ++	for (p = a; *p; p++) {
      +		reftable_free(*p);
     -+		p++;
      +	}
      +	reftable_free(a);
      +}
      +
      +int names_length(char **names)
      +{
     -+	int len = 0;
      +	char **p = names;
     -+	while (*p) {
     -+		p++;
     -+		len++;
     ++	for (; *p; p++) {
     ++		/* empty */
      +	}
     -+	return len;
     ++	return p - names;
      +}
      +
      +void parse_names(char *buf, int size, char ***namesp)
     @@ reftable/basics.c (new)
      +	char *end = buf + size;
      +	while (p < end) {
      +		char *next = strchr(p, '\n');
     -+		if (next != NULL) {
     ++		if (next && next < end) {
      +			*next = 0;
      +		} else {
      +			next = end;
     @@ reftable/basics.c (new)
      +			if (names_len == names_cap) {
      +				names_cap = 2 * names_cap + 1;
      +				names = reftable_realloc(
     -+					names, names_cap * sizeof(char *));
     ++					names, names_cap * sizeof(*names));
      +			}
      +			names[names_len++] = xstrdup(p);
      +		}
      +		p = next + 1;
      +	}
      +
     -+	if (names_len == names_cap) {
     -+		names_cap = 2 * names_cap + 1;
     -+		names = reftable_realloc(names, names_cap * sizeof(char *));
     -+	}
     -+
     ++	names = reftable_realloc(names, (names_len + 1) * sizeof(*names));
      +	names[names_len] = NULL;
      +	*namesp = names;
      +}
      +
      +int names_equal(char **a, char **b)
      +{
     -+	while (*a && *b) {
     -+		if (strcmp(*a, *b)) {
     ++	int i = 0;
     ++	for (; a[i] && b[i]; i++) {
     ++		if (strcmp(a[i], b[i])) {
      +			return 0;
      +		}
     -+
     -+		a++;
     -+		b++;
      +	}
      +
     -+	return *a == *b;
     ++	return a[i] == b[i];
      +}
      +
      +int common_prefix_size(struct strbuf *a, struct strbuf *b)
      +{
      +	int p = 0;
     -+	while (p < a->len && p < b->len) {
     -+		if (a->buf[p] != b->buf[p]) {
     ++	for (; p < a->len && p < b->len; p++) {
     ++		if (a->buf[p] != b->buf[p])
      +			break;
     -+		}
     -+		p++;
      +	}
      +
      +	return p;
     @@ reftable/basics.h (new)
      +void put_be16(uint8_t *out, uint16_t i);
      +
      +/*
     -+  find smallest index i in [0, sz) at which f(i) is true, assuming
     -+  that f is ascending. Return sz if f(i) is false for all indices.
     -+*/
     ++ * find smallest index i in [0, sz) at which f(i) is true, assuming
     ++ * that f is ascending. Return sz if f(i) is false for all indices.
     ++ *
     ++ * Contrary to bsearch(3), this returns something useful if the argument is not
     ++ * found.
     ++ */
      +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
      +
      +/*
     -+  Frees a NULL terminated array of malloced strings. The array itself is also
     -+  freed.
     ++ * Frees a NULL terminated array of malloced strings. The array itself is also
     ++ * freed.
      + */
      +void free_names(char **a);
      +
     -+/* parse a newline separated list of names. Empty names are discarded. */
     ++/* parse a newline separated list of names. `size` is the length of the buffer,
     ++ * without terminating '\0'. Empty names are discarded. */
      +void parse_names(char *buf, int size, char ***namesp);
      +
      +/* compares two NULL-terminated arrays of strings. */
     @@ reftable/basics_test.c (new)
      +	}
      +}
      +
     -+int basics_test_main(int argc, const char *argv[])
     ++static void test_names_length(void)
      +{
     -+	test_binsearch();
     -+	return 0;
     ++	char *a[] = { "a", "b", NULL };
     ++	EXPECT(names_length(a) == 2);
      +}
     -
     - ## reftable/compat.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef COMPAT_H
     -+#define COMPAT_H
      +
     -+#include "system.h"
     -+
     -+#ifdef REFTABLE_STANDALONE
     -+
     -+/* functions that git-core provides, for standalone compilation */
     -+#include <stdint.h>
     -+
     -+uint64_t get_be64(void *in);
     -+void put_be64(void *out, uint64_t i);
     -+
     -+void put_be32(void *out, uint32_t i);
     -+uint32_t get_be32(uint8_t *in);
     -+
     -+uint16_t get_be16(uint8_t *in);
     -+
     -+#define ARRAY_SIZE(a) sizeof((a)) / sizeof((a)[0])
     -+#define FREE_AND_NULL(x)          \
     -+	do {                      \
     -+		reftable_free(x); \
     -+		(x) = NULL;       \
     -+	} while (0)
     -+#define QSORT(arr, n, cmp) qsort(arr, n, sizeof(arr[0]), cmp)
     -+#define SWAP(a, b)                              \
     -+	{                                       \
     -+		char tmp[sizeof(a)];            \
     -+		assert(sizeof(a) == sizeof(b)); \
     -+		memcpy(&tmp[0], &a, sizeof(a)); \
     -+		memcpy(&a, &b, sizeof(a));      \
     -+		memcpy(&b, &tmp[0], sizeof(a)); \
     -+	}
     ++static void test_parse_names_normal(void)
     ++{
     ++	char in[] = "a\nb\n";
     ++	char **out = NULL;
     ++	parse_names(in, strlen(in), &out);
     ++	EXPECT(!strcmp(out[0], "a"));
     ++	EXPECT(!strcmp(out[1], "b"));
     ++	EXPECT(out[2] == NULL);
     ++	free_names(out);
     ++}
      +
     -+char *xstrdup(const char *s);
     ++static void test_parse_names_drop_empty(void)
     ++{
     ++	char in[] = "a\n\n";
     ++	char **out = NULL;
     ++	parse_names(in, strlen(in), &out);
     ++	EXPECT(!strcmp(out[0], "a"));
     ++	EXPECT(out[1] == NULL);
     ++	free_names(out);
     ++}
      +
     -+void sleep_millisec(int millisecs);
     ++static void test_common_prefix(void)
     ++{
     ++	struct strbuf s1 = STRBUF_INIT;
     ++	struct strbuf s2 = STRBUF_INIT;
     ++	strbuf_addstr(&s1, "abcdef");
     ++	strbuf_addstr(&s2, "abc");
     ++	EXPECT(common_prefix_size(&s1, &s2) == 3);
     ++	strbuf_release(&s1);
     ++	strbuf_release(&s2);
     ++}
      +
     -+#endif
     -+#endif
     ++int basics_test_main(int argc, const char *argv[])
     ++{
     ++	RUN_TEST(test_common_prefix);
     ++	RUN_TEST(test_parse_names_normal);
     ++	RUN_TEST(test_parse_names_drop_empty);
     ++	RUN_TEST(test_binsearch);
     ++	RUN_TEST(test_names_length);
     ++	return 0;
     ++}
      
       ## reftable/publicbasics.c (new) ##
      @@
     @@ reftable/reftable-malloc.h (new)
      +
      +#include <stddef.h>
      +
     -+/*
     -+  Overrides the functions to use for memory management.
     -+ */
     ++/* Overrides the functions to use for memory management. */
      +void reftable_set_alloc(void *(*malloc)(size_t),
      +			void *(*realloc)(void *, size_t), void (*free)(void *));
      +
     @@ reftable/test_framework.h (new)
      +		abort();                                                   \
      +	}
      +
     ++#define RUN_TEST(f)                          \
     ++	fprintf(stderr, "running %s\n", #f); \
     ++	fflush(stderr);                      \
     ++	f();
     ++
      +void set_test_hash(uint8_t *p, int i);
      +
      +/* Like strbuf_add, but suitable for passing to reftable_new_writer
     @@ t/t0032-reftable-unittest.sh (new)
      +. ./test-lib.sh
      +
      +test_expect_success 'unittests' '
     -+       test-tool reftable
     ++	TMPDIR=$(pwd) && export TMPDIR &&
     ++	test-tool reftable
      +'
      +
      +test_done
  6:  f71e82b995d !  5:  aaa817f8598 reftable: add blocksource, an abstraction for random access reads
     @@ reftable/reftable-blocksource.h (new)
      +};
      +
      +/* a contiguous segment of bytes. It keeps track of its generating block_source
     -+   so it can return itself into the pool.
     -+*/
     ++ * so it can return itself into the pool. */
      +struct reftable_block {
      +	uint8_t *data;
      +	int len;
  7:  bbe2ed71734 !  6:  e13afe31525 reftable: (de)serialization for the polymorphic record type.
     @@ reftable/record.c (new)
      +	return 0;
      +}
      +
     ++uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec)
     ++{
     ++	switch (rec->value_type) {
     ++	case REFTABLE_REF_VAL1:
     ++		return rec->value.val1;
     ++	case REFTABLE_REF_VAL2:
     ++		return rec->value.val2.value;
     ++	default:
     ++		return NULL;
     ++	}
     ++}
     ++
     ++uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec)
     ++{
     ++	switch (rec->value_type) {
     ++	case REFTABLE_REF_VAL2:
     ++		return rec->value.val2.target_value;
     ++	default:
     ++		return NULL;
     ++	}
     ++}
     ++
      +static int decode_string(struct strbuf *dest, struct string_view in)
      +{
      +	int start_len = in.len;
     @@ reftable/record.c (new)
      +	assert(hash_size > 0);
      +
      +	/* This is simple and correct, but we could probably reuse the hash
     -+	   fields. */
     ++	 * fields. */
      +	reftable_ref_record_release(ref);
      +	if (src->refname != NULL) {
      +		ref->refname = xstrdup(src->refname);
      +	}
     -+
     -+	if (src->target != NULL) {
     -+		ref->target = xstrdup(src->target);
     -+	}
     -+
     -+	if (src->target_value != NULL) {
     -+		ref->target_value = reftable_malloc(hash_size);
     -+		memcpy(ref->target_value, src->target_value, hash_size);
     -+	}
     -+
     -+	if (src->value != NULL) {
     -+		ref->value = reftable_malloc(hash_size);
     -+		memcpy(ref->value, src->value, hash_size);
     -+	}
      +	ref->update_index = src->update_index;
     ++	ref->value_type = src->value_type;
     ++	switch (src->value_type) {
     ++	case REFTABLE_REF_DELETION:
     ++		break;
     ++	case REFTABLE_REF_VAL1:
     ++		ref->value.val1 = reftable_malloc(hash_size);
     ++		memcpy(ref->value.val1, src->value.val1, hash_size);
     ++		break;
     ++	case REFTABLE_REF_VAL2:
     ++		ref->value.val2.value = reftable_malloc(hash_size);
     ++		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
     ++		ref->value.val2.target_value = reftable_malloc(hash_size);
     ++		memcpy(ref->value.val2.target_value,
     ++		       src->value.val2.target_value, hash_size);
     ++		break;
     ++	case REFTABLE_REF_SYMREF:
     ++		ref->value.symref = xstrdup(src->value.symref);
     ++		break;
     ++	}
      +}
      +
      +static char hexdigit(int c)
     @@ reftable/record.c (new)
      +void reftable_ref_record_print(struct reftable_ref_record *ref,
      +			       uint32_t hash_id)
      +{
     -+	char hex[SHA256_SIZE + 1] = { 0 };
     ++	char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */
      +	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
     -+	if (ref->value != NULL) {
     -+		hex_format(hex, ref->value, hash_size(hash_id));
     -+		printf("%s", hex);
     -+	}
     -+	if (ref->target_value != NULL) {
     -+		hex_format(hex, ref->target_value, hash_size(hash_id));
     -+		printf(" (T %s)", hex);
     -+	}
     -+	if (ref->target != NULL) {
     -+		printf("=> %s", ref->target);
     ++	switch (ref->value_type) {
     ++	case REFTABLE_REF_SYMREF:
     ++		printf("=> %s", ref->value.symref);
     ++		break;
     ++	case REFTABLE_REF_VAL2:
     ++		hex_format(hex, ref->value.val2.value, hash_size(hash_id));
     ++		printf("val 2 %s", hex);
     ++		hex_format(hex, ref->value.val2.target_value,
     ++			   hash_size(hash_id));
     ++		printf("(T %s)", hex);
     ++		break;
     ++	case REFTABLE_REF_VAL1:
     ++		hex_format(hex, ref->value.val1, hash_size(hash_id));
     ++		printf("val 1 %s", hex);
     ++		break;
     ++	case REFTABLE_REF_DELETION:
     ++		printf("delete");
     ++		break;
      +	}
      +	printf("}\n");
      +}
     @@ reftable/record.c (new)
      +
      +void reftable_ref_record_release(struct reftable_ref_record *ref)
      +{
     ++	switch (ref->value_type) {
     ++	case REFTABLE_REF_SYMREF:
     ++		reftable_free(ref->value.symref);
     ++		break;
     ++	case REFTABLE_REF_VAL2:
     ++		reftable_free(ref->value.val2.target_value);
     ++		reftable_free(ref->value.val2.value);
     ++		break;
     ++	case REFTABLE_REF_VAL1:
     ++		reftable_free(ref->value.val1);
     ++		break;
     ++	case REFTABLE_REF_DELETION:
     ++		break;
     ++	default:
     ++		abort();
     ++	}
     ++
      +	reftable_free(ref->refname);
     -+	reftable_free(ref->target);
     -+	reftable_free(ref->target_value);
     -+	reftable_free(ref->value);
      +	memset(ref, 0, sizeof(struct reftable_ref_record));
      +}
      +
     @@ reftable/record.c (new)
      +{
      +	const struct reftable_ref_record *r =
      +		(const struct reftable_ref_record *)rec;
     -+	if (r->value != NULL) {
     -+		if (r->target_value != NULL) {
     -+			return 2;
     -+		} else {
     -+			return 1;
     -+		}
     -+	} else if (r->target != NULL)
     -+		return 3;
     -+	return 0;
     ++	return r->value_type;
      +}
      +
      +static int reftable_ref_record_encode(const void *rec, struct string_view s,
     @@ reftable/record.c (new)
      +		return -1;
      +	string_view_consume(&s, n);
      +
     -+	if (r->value != NULL) {
     -+		if (s.len < hash_size) {
     ++	switch (r->value_type) {
     ++	case REFTABLE_REF_SYMREF:
     ++		n = encode_string(r->value.symref, s);
     ++		if (n < 0) {
      +			return -1;
      +		}
     -+		memcpy(s.buf, r->value, hash_size);
     -+		string_view_consume(&s, hash_size);
     -+	}
     -+
     -+	if (r->target_value != NULL) {
     -+		if (s.len < hash_size) {
     ++		string_view_consume(&s, n);
     ++		break;
     ++	case REFTABLE_REF_VAL2:
     ++		if (s.len < 2 * hash_size) {
      +			return -1;
      +		}
     -+		memcpy(s.buf, r->target_value, hash_size);
     ++		memcpy(s.buf, r->value.val2.value, hash_size);
      +		string_view_consume(&s, hash_size);
     -+	}
     -+
     -+	if (r->target != NULL) {
     -+		int n = encode_string(r->target, s);
     -+		if (n < 0) {
     ++		memcpy(s.buf, r->value.val2.target_value, hash_size);
     ++		string_view_consume(&s, hash_size);
     ++		break;
     ++	case REFTABLE_REF_VAL1:
     ++		if (s.len < hash_size) {
      +			return -1;
      +		}
     -+		string_view_consume(&s, n);
     ++		memcpy(s.buf, r->value.val1, hash_size);
     ++		string_view_consume(&s, hash_size);
     ++		break;
     ++	case REFTABLE_REF_DELETION:
     ++		break;
     ++	default:
     ++		abort();
      +	}
      +
      +	return start.len - s.len;
     @@ reftable/record.c (new)
      +{
      +	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
      +	struct string_view start = in;
     -+	int seen_value = 0;
     -+	int seen_target_value = 0;
     -+	int seen_target = 0;
     -+
     -+	int n = get_var_int(&r->update_index, &in);
     ++	uint64_t update_index = 0;
     ++	int n = get_var_int(&update_index, &in);
      +	if (n < 0)
      +		return n;
     -+	assert(hash_size > 0);
     -+
      +	string_view_consume(&in, n);
      +
     ++	reftable_ref_record_release(r);
     ++
     ++	assert(hash_size > 0);
     ++
      +	r->refname = reftable_realloc(r->refname, key.len + 1);
      +	memcpy(r->refname, key.buf, key.len);
     ++	r->update_index = update_index;
      +	r->refname[key.len] = 0;
     -+
     ++	r->value_type = val_type;
      +	switch (val_type) {
     -+	case 1:
     -+	case 2:
     ++	case REFTABLE_REF_VAL1:
      +		if (in.len < hash_size) {
      +			return -1;
      +		}
      +
     -+		if (r->value == NULL) {
     -+			r->value = reftable_malloc(hash_size);
     -+		}
     -+		seen_value = 1;
     -+		memcpy(r->value, in.buf, hash_size);
     ++		r->value.val1 = reftable_malloc(hash_size);
     ++		memcpy(r->value.val1, in.buf, hash_size);
      +		string_view_consume(&in, hash_size);
     -+		if (val_type == 1) {
     -+			break;
     -+		}
     -+		if (r->target_value == NULL) {
     -+			r->target_value = reftable_malloc(hash_size);
     ++		break;
     ++
     ++	case REFTABLE_REF_VAL2:
     ++		if (in.len < 2 * hash_size) {
     ++			return -1;
      +		}
     -+		seen_target_value = 1;
     -+		memcpy(r->target_value, in.buf, hash_size);
     ++
     ++		r->value.val2.value = reftable_malloc(hash_size);
     ++		memcpy(r->value.val2.value, in.buf, hash_size);
     ++		string_view_consume(&in, hash_size);
     ++
     ++		r->value.val2.target_value = reftable_malloc(hash_size);
     ++		memcpy(r->value.val2.target_value, in.buf, hash_size);
      +		string_view_consume(&in, hash_size);
      +		break;
     -+	case 3: {
     ++
     ++	case REFTABLE_REF_SYMREF: {
      +		struct strbuf dest = STRBUF_INIT;
      +		int n = decode_string(&dest, in);
      +		if (n < 0) {
      +			return -1;
      +		}
      +		string_view_consume(&in, n);
     -+		seen_target = 1;
     -+		if (r->target != NULL) {
     -+			reftable_free(r->target);
     -+		}
     -+		r->target = dest.buf;
     ++		r->value.symref = dest.buf;
      +	} break;
      +
     -+	case 0:
     ++	case REFTABLE_REF_DELETION:
      +		break;
      +	default:
      +		abort();
      +		break;
      +	}
      +
     -+	if (!seen_target && r->target != NULL) {
     -+		FREE_AND_NULL(r->target);
     -+	}
     -+	if (!seen_target_value && r->target_value != NULL) {
     -+		FREE_AND_NULL(r->target_value);
     -+	}
     -+	if (!seen_value && r->value != NULL) {
     -+		FREE_AND_NULL(r->value);
     -+	}
     -+
      +	return start.len - in.len;
      +}
      +
     @@ reftable/record.c (new)
      +	return a == b;
      +}
      +
     -+static int str_equal(char *a, char *b)
     -+{
     -+	if (a != NULL && b != NULL)
     -+		return 0 == strcmp(a, b);
     -+
     -+	return a == b;
     -+}
     -+
      +int reftable_ref_record_equal(struct reftable_ref_record *a,
      +			      struct reftable_ref_record *b, int hash_size)
      +{
      +	assert(hash_size > 0);
     -+	return 0 == strcmp(a->refname, b->refname) &&
     -+	       a->update_index == b->update_index &&
     -+	       hash_equal(a->value, b->value, hash_size) &&
     -+	       hash_equal(a->target_value, b->target_value, hash_size) &&
     -+	       str_equal(a->target, b->target);
     ++	if (!(0 == strcmp(a->refname, b->refname) &&
     ++	      a->update_index == b->update_index &&
     ++	      a->value_type == b->value_type))
     ++		return 0;
     ++
     ++	switch (a->value_type) {
     ++	case REFTABLE_REF_SYMREF:
     ++		return !strcmp(a->value.symref, b->value.symref);
     ++	case REFTABLE_REF_VAL2:
     ++		return hash_equal(a->value.val2.value, b->value.val2.value,
     ++				  hash_size) &&
     ++		       hash_equal(a->value.val2.target_value,
     ++				  b->value.val2.target_value, hash_size);
     ++	case REFTABLE_REF_VAL1:
     ++		return hash_equal(a->value.val1, b->value.val1, hash_size);
     ++	case REFTABLE_REF_DELETION:
     ++		return 1;
     ++	default:
     ++		abort();
     ++	}
      +}
      +
      +int reftable_ref_record_compare_name(const void *a, const void *b)
     @@ reftable/record.c (new)
      +
      +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
      +{
     -+	return ref->value == NULL && ref->target == NULL &&
     -+	       ref->target_value == NULL;
     ++	return ref->value_type == REFTABLE_REF_DELETION;
      +}
      +
      +int reftable_log_record_compare_key(const void *a, const void *b)
     @@ reftable/record.h (new)
      +#include "reftable-record.h"
      +
      +/*
     -+  A substring of existing string data. This structure takes no responsibility
     -+  for the lifetime of the data it points to.
     -+*/
     ++ * A substring of existing string data. This structure takes no responsibility
     ++ * for the lifetime of the data it points to.
     ++ */
      +struct string_view {
      +	uint8_t *buf;
      +	size_t len;
     @@ reftable/record.h (new)
      +struct reftable_record reftable_new_record(uint8_t typ);
      +
      +/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
     -+   number of bytes written. */
     ++ * number of bytes written. */
      +int reftable_encode_key(int *is_restart, struct string_view dest,
      +			struct strbuf prev_key, struct strbuf key,
      +			uint8_t extra);
     @@ reftable/record_test.c (new)
      +{
      +	int i = 0;
      +
     -+	for (i = 0; i <= 3; i++) {
     ++	for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) {
      +		struct reftable_ref_record in = { NULL };
     -+		struct reftable_ref_record out = {
     -+			.refname = xstrdup("old name"),
     -+			.value = reftable_calloc(SHA1_SIZE),
     -+			.target_value = reftable_calloc(SHA1_SIZE),
     -+			.target = xstrdup("old value"),
     -+		};
     ++		struct reftable_ref_record out = { NULL };
      +		struct reftable_record rec_out = { NULL };
      +		struct strbuf key = STRBUF_INIT;
      +		struct reftable_record rec = { NULL };
     @@ reftable/record_test.c (new)
      +
      +		int n, m;
      +
     ++		in.value_type = i;
      +		switch (i) {
     -+		case 0:
     ++		case REFTABLE_REF_DELETION:
      +			break;
     -+		case 1:
     -+			in.value = reftable_malloc(SHA1_SIZE);
     -+			set_hash(in.value, 1);
     ++		case REFTABLE_REF_VAL1:
     ++			in.value.val1 = reftable_malloc(SHA1_SIZE);
     ++			set_hash(in.value.val1, 1);
      +			break;
     -+		case 2:
     -+			in.value = reftable_malloc(SHA1_SIZE);
     -+			set_hash(in.value, 1);
     -+			in.target_value = reftable_malloc(SHA1_SIZE);
     -+			set_hash(in.target_value, 2);
     ++		case REFTABLE_REF_VAL2:
     ++			in.value.val2.value = reftable_malloc(SHA1_SIZE);
     ++			set_hash(in.value.val2.value, 1);
     ++			in.value.val2.target_value = reftable_malloc(SHA1_SIZE);
     ++			set_hash(in.value.val2.target_value, 2);
      +			break;
     -+		case 3:
     -+			in.target = xstrdup("target");
     ++		case REFTABLE_REF_SYMREF:
     ++			in.value.symref = xstrdup("target");
      +			break;
      +		}
      +		in.refname = xstrdup("refs/heads/master");
     @@ reftable/record_test.c (new)
      +		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
      +		EXPECT(n == m);
      +
     -+		EXPECT((out.value != NULL) == (in.value != NULL));
     -+		EXPECT((out.target_value != NULL) == (in.target_value != NULL));
     -+		EXPECT((out.target != NULL) == (in.target != NULL));
     ++		EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE));
      +		reftable_record_release(&rec_out);
      +
      +		strbuf_release(&key);
     @@ reftable/record_test.c (new)
      +
      +int record_test_main(int argc, const char *argv[])
      +{
     -+	test_reftable_log_record_equal();
     -+	test_reftable_log_record_roundtrip();
     -+	test_reftable_ref_record_roundtrip();
     -+	test_varint_roundtrip();
     -+	test_key_roundtrip();
     -+	test_common_prefix();
     -+	test_reftable_obj_record_roundtrip();
     -+	test_reftable_index_record_roundtrip();
     -+	test_u24_roundtrip();
     ++	RUN_TEST(test_reftable_log_record_equal);
     ++	RUN_TEST(test_reftable_log_record_roundtrip);
     ++	RUN_TEST(test_reftable_ref_record_roundtrip);
     ++	RUN_TEST(test_varint_roundtrip);
     ++	RUN_TEST(test_key_roundtrip);
     ++	RUN_TEST(test_common_prefix);
     ++	RUN_TEST(test_reftable_obj_record_roundtrip);
     ++	RUN_TEST(test_reftable_index_record_roundtrip);
     ++	RUN_TEST(test_u24_roundtrip);
      +	return 0;
      +}
      
     @@ reftable/reftable-record.h (new)
      +#include <stdint.h>
      +
      +/*
     -+ Basic data types
     -+
     -+ Reftables store the state of each ref in struct reftable_ref_record, and they
     -+ store a sequence of reflog updates in struct reftable_log_record.
     -+*/
     ++ * Basic data types
     ++ *
     ++ * Reftables store the state of each ref in struct reftable_ref_record, and they
     ++ * store a sequence of reflog updates in struct reftable_log_record.
     ++ */
      +
      +/* reftable_ref_record holds a ref database entry target_value */
      +struct reftable_ref_record {
      +	char *refname; /* Name of the ref, malloced. */
      +	uint64_t update_index; /* Logical timestamp at which this value is
     -+				  written */
     -+	uint8_t *value; /* SHA1, or NULL. malloced. */
     -+	uint8_t *target_value; /* peeled annotated tag, or NULL. malloced. */
     -+	char *target; /* symref, or NULL. malloced. */
     ++				* written */
     ++
     ++	enum {
     ++		/* tombstone to hide deletions from earlier tables */
     ++		REFTABLE_REF_DELETION = 0x0,
     ++
     ++		/* a simple ref */
     ++		REFTABLE_REF_VAL1 = 0x1,
     ++		/* a tag, plus its peeled hash */
     ++		REFTABLE_REF_VAL2 = 0x2,
     ++
     ++		/* a symbolic reference */
     ++		REFTABLE_REF_SYMREF = 0x3,
     ++#define REFTABLE_NR_REF_VALUETYPES 4
     ++	} value_type;
     ++	union {
     ++		uint8_t *val1; /* malloced hash. */
     ++		struct {
     ++			uint8_t *value; /* first value, malloced hash  */
     ++			uint8_t *target_value; /* second value, malloced hash */
     ++		} val2;
     ++		char *symref; /* referent, malloced 0-terminated string */
     ++	} value;
      +};
      +
     ++/* Returns the first hash, or NULL if `rec` is not of type
     ++ * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
     ++uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec);
     ++
     ++/* Returns the second hash, or NULL if `rec` is not of type
     ++ * REFTABLE_REF_VAL2. */
     ++uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec);
     ++
      +/* returns whether 'ref' represents a deletion */
      +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
      +
     -+/* prints a reftable_ref_record onto stdout */
     ++/* prints a reftable_ref_record onto stdout. Useful for debugging. */
      +void reftable_ref_record_print(struct reftable_ref_record *ref,
      +			       uint32_t hash_id);
      +
     -+/* frees and nulls all pointer values. */
     ++/* frees and nulls all pointer values inside `ref`. */
      +void reftable_ref_record_release(struct reftable_ref_record *ref);
      +
     -+/* returns whether two reftable_ref_records are the same */
     ++/* returns whether two reftable_ref_records are the same. Useful for testing. */
      +int reftable_ref_record_equal(struct reftable_ref_record *a,
      +			      struct reftable_ref_record *b, int hash_size);
      +
  8:  a358b052d56 !  7:  a48b9937642 reftable: reading/writing blocks
     @@ reftable/block.h (new)
      +#include "reftable-blocksource.h"
      +
      +/*
     -+  Writes reftable blocks. The block_writer is reused across blocks to minimize
     -+  allocation overhead.
     -+*/
     ++ * Writes reftable blocks. The block_writer is reused across blocks to minimize
     ++ * allocation overhead.
     ++ */
      +struct block_writer {
      +	uint8_t *buf;
      +	uint32_t block_size;
     @@ reftable/block.h (new)
      +};
      +
      +/*
     -+  initializes the blockwriter to write `typ` entries, using `buf` as temporary
     -+  storage. `buf` is not owned by the block_writer. */
     ++ * initializes the blockwriter to write `typ` entries, using `buf` as temporary
     ++ * storage. `buf` is not owned by the block_writer. */
      +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
      +		       uint32_t block_size, uint32_t header_off, int hash_size);
      +
     -+/*
     -+  returns the block type (eg. 'r' for ref records.
     -+*/
     ++/* returns the block type (eg. 'r' for ref records. */
      +uint8_t block_writer_type(struct block_writer *bw);
      +
      +/* appends the record, or -1 if it doesn't fit. */
     @@ reftable/block_test.c (new)
      +		memset(hash, i, sizeof(hash));
      +
      +		ref.refname = name;
     -+		ref.value = hash;
     ++		ref.value_type = REFTABLE_REF_VAL1;
     ++		ref.value.val1 = hash;
     ++
      +		names[i] = xstrdup(name);
      +		n = block_writer_add(&bw, &rec);
      +		ref.refname = NULL;
     -+		ref.value = NULL;
     ++		ref.value_type = REFTABLE_REF_DELETION;
      +		EXPECT(n == 0);
      +	}
      +
     @@ reftable/block_test.c (new)
      +
      +int block_test_main(int argc, const char *argv[])
      +{
     -+	test_block_read_write();
     ++	RUN_TEST(test_block_read_write);
      +	return 0;
      +}
      
  9:  24afac9c91a !  8:  6968dbc3828 reftable: a generic binary tree implementation
     @@ reftable/tree.h (new)
      +};
      +
      +/* looks for `key` in `rootp` using `compare` as comparison function. If insert
     -+   is set, insert the key if it's not found. Else, return NULL.
     -+*/
     ++ * is set, insert the key if it's not found. Else, return NULL.
     ++ */
      +struct tree_node *tree_search(void *key, struct tree_node **rootp,
      +			      int (*compare)(const void *, const void *),
      +			      int insert);
     @@ reftable/tree.h (new)
      +		void *arg);
      +
      +/*
     -+  deallocates the tree nodes recursively. Keys should be deallocated separately
     -+  by walking over the tree. */
     ++ * deallocates the tree nodes recursively. Keys should be deallocated separately
     ++ * by walking over the tree. */
      +void tree_free(struct tree_node *t);
      +
      +#endif
     @@ reftable/tree_test.c (new)
      +
      +int tree_test_main(int argc, const char *argv[])
      +{
     -+	test_tree();
     ++	RUN_TEST(test_tree);
      +	return 0;
      +}
      
 10:  5f0b7c55925 !  9:  ff5b424d12f reftable: write reftable files
     @@ reftable/reftable-writer.h (new)
      +
      +#include "reftable-record.h"
      +
     -+/*
     -+ Writing single reftables
     -+*/
     ++/* Writing single reftables */
      +
      +/* reftable_write_options sets options for writing a single reftable. */
      +struct reftable_write_options {
     @@ reftable/writer.c (new)
      +
      +	reftable_record_from_ref(&rec, &copy);
      +	copy.update_index -= w->min_update_index;
     ++
      +	err = writer_add_record(w, &rec);
      +	if (err < 0)
      +		return err;
      +
     -+	if (!w->opts.skip_index_objects && ref->value != NULL) {
     ++	if (!w->opts.skip_index_objects &&
     ++	    reftable_ref_record_val1(ref) != NULL) {
      +		struct strbuf h = STRBUF_INIT;
     -+		strbuf_add(&h, (char *)ref->value, hash_size(w->opts.hash_id));
     ++		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
     ++			   hash_size(w->opts.hash_id));
      +		writer_index_hash(w, &h);
      +		strbuf_release(&h);
      +	}
      +
     -+	if (!w->opts.skip_index_objects && ref->target_value != NULL) {
     ++	if (!w->opts.skip_index_objects &&
     ++	    reftable_ref_record_val2(ref) != NULL) {
      +		struct strbuf h = STRBUF_INIT;
     -+		strbuf_add(&h, ref->target_value, hash_size(w->opts.hash_id));
     ++		strbuf_add(&h, reftable_ref_record_val2(ref),
     ++			   hash_size(w->opts.hash_id));
      +		writer_index_hash(w, &h);
      +		strbuf_release(&h);
      +	}
     @@ reftable/writer.h (new)
      +	size_t index_cap;
      +
      +	/*
     -+	  tree for use with tsearch; used to populate the 'o' inverse OID
     -+	  map */
     ++	 * tree for use with tsearch; used to populate the 'o' inverse OID
     ++	 * map */
      +	struct tree_node *obj_index_tree;
      +
      +	struct reftable_stats stats;
 11:  9aa2caade8c ! 10:  3c2de7dcc65 reftable: read reftable files
     @@ reftable/iter.c (new)
      +			}
      +		}
      +
     -+		if ((ref->target_value != NULL &&
     -+		     !memcmp(fri->oid.buf, ref->target_value, fri->oid.len)) ||
     -+		    (ref->value != NULL &&
     -+		     !memcmp(fri->oid.buf, ref->value, fri->oid.len))) {
     ++		if (ref->value_type == REFTABLE_REF_VAL2 &&
     ++		    (!memcmp(fri->oid.buf, ref->value.val2.target_value,
     ++			     fri->oid.len) ||
     ++		     !memcmp(fri->oid.buf, ref->value.val2.value,
     ++			     fri->oid.len)))
     ++			return 0;
     ++
     ++		if (ref->value_type == REFTABLE_REF_VAL1 &&
     ++		    !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) {
      +			return 0;
      +		}
      +	}
     @@ reftable/iter.c (new)
      +			}
      +			continue;
      +		}
     -+
     -+		if (!memcmp(it->oid.buf, ref->target_value, it->oid.len) ||
     -+		    !memcmp(it->oid.buf, ref->value, it->oid.len)) {
     ++		/* BUG */
     ++		if (!memcmp(it->oid.buf, ref->value.val2.target_value,
     ++			    it->oid.len) ||
     ++		    !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) {
      +			return 0;
      +		}
      +	}
     @@ reftable/iter.h (new)
      +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
      +
      +/* Returns true for a zeroed out iterator, such as the one returned from
     -+   iterator_destroy. */
     ++ * iterator_destroy. */
      +int iterator_is_null(struct reftable_iterator *it);
      +
      +/* iterator that produces only ref records that point to `oid` */
     @@ reftable/iter.h (new)
      +					  struct filtering_ref_iterator *);
      +
      +/* iterator that produces only ref records that point to `oid`,
     -+   but using the object index.
     ++ * but using the object index.
      + */
      +struct indexed_table_ref_iter {
      +	struct reftable_reader *r;
     @@ reftable/reftable-iterator.h (new)
      +#include "reftable-record.h"
      +
      +/* iterator is the generic interface for walking over data stored in a
     -+   reftable.
     -+*/
     ++ * reftable.
     ++ */
      +struct reftable_iterator {
      +	struct reftable_iterator_vtable *ops;
      +	void *iter_arg;
      +};
      +
      +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
     -+   end of iteration.
     -+*/
     ++ * end of iteration.
     ++ */
      +int reftable_iterator_next_ref(struct reftable_iterator *it,
      +			       struct reftable_ref_record *ref);
      +
      +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
     -+   end of iteration.
     -+*/
     ++ * end of iteration.
     ++ */
      +int reftable_iterator_next_log(struct reftable_iterator *it,
      +			       struct reftable_log_record *log);
      +
     @@ reftable/reftable-reader.h (new)
      +#include "reftable-blocksource.h"
      +
      +/*
     -+ Reading single tables
     -+
     -+ The follow routines are for reading single files. For an application-level
     -+ interface, skip ahead to struct reftable_merged_table and struct
     -+ reftable_stack.
     -+*/
     ++ * Reading single tables
     ++ *
     ++ * The follow routines are for reading single files. For an application-level
     ++ * interface, skip ahead to struct reftable_merged_table and struct
     ++ * reftable_stack.
     ++ */
      +
      +/* The reader struct is a handle to an open reftable file. */
      +struct reftable_reader;
 12:  581a4200d64 ! 11:  03681b820d1 reftable: reftable file level tests
     @@ reftable/reftable_test.c (new)
      +		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
      +
      +		ref.refname = name;
     -+		ref.value = hash;
      +		ref.update_index = update_index;
     ++		ref.value_type = REFTABLE_REF_VAL1;
     ++		ref.value.val1 = hash;
      +		(*names)[i] = xstrdup(name);
      +
      +		n = reftable_writer_add_ref(w, &ref);
     @@ reftable/reftable_test.c (new)
      +		err = reftable_iterator_next_ref(&it, &ref);
      +		EXPECT_ERR(err);
      +		EXPECT(0 == strcmp(names[i], ref.refname));
     -+		EXPECT(i == ref.value[0]);
     ++		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
     ++		EXPECT(i == ref.value.val1[0]);
      +
      +		reftable_ref_record_release(&ref);
      +		reftable_iterator_destroy(&it);
     @@ reftable/reftable_test.c (new)
      +
      +		set_test_hash(hash1, i / 4);
      +		set_test_hash(hash2, 3 + i / 4);
     -+		ref.value = hash1;
     -+		ref.target_value = hash2;
     ++		ref.value_type = REFTABLE_REF_VAL2;
     ++		ref.value.val2.value = hash1;
     ++		ref.value.val2.target_value = hash2;
      +
      +		/* 80 bytes / entry, so 3 entries per block. Yields 17
      +		 */
     @@ reftable/reftable_test.c (new)
      +
      +int reftable_test_main(int argc, const char *argv[])
      +{
     -+	test_log_write_read();
     -+	test_table_read_write_seek_linear_sha256();
     -+	test_log_buffer_size();
     -+	test_table_write_small_table();
     -+	test_buffer();
     -+	test_table_read_api();
     -+	test_table_read_write_sequential();
     -+	test_table_read_write_seek_linear();
     -+	test_table_read_write_seek_index();
     -+	test_table_refs_for_no_index();
     -+	test_table_refs_for_obj_index();
     -+	test_table_empty();
     ++	RUN_TEST(test_log_write_read);
     ++	RUN_TEST(test_table_read_write_seek_linear_sha256);
     ++	RUN_TEST(test_log_buffer_size);
     ++	RUN_TEST(test_table_write_small_table);
     ++	RUN_TEST(test_buffer);
     ++	RUN_TEST(test_table_read_api);
     ++	RUN_TEST(test_table_read_write_sequential);
     ++	RUN_TEST(test_table_read_write_seek_linear);
     ++	RUN_TEST(test_table_read_write_seek_index);
     ++	RUN_TEST(test_table_refs_for_no_index);
     ++	RUN_TEST(test_table_refs_for_obj_index);
     ++	RUN_TEST(test_table_empty);
      +	return 0;
      +}
      
 13:  a26fb180add ! 12:  557183d3e3e reftable: rest of library
     @@ Makefile: REFTABLE_OBJS += reftable/error.o
      
       ## reftable/VERSION (new) ##
      @@
     -+b8ec0f74c74cb6752eb2033ad8e755a9c19aad15 C: add missing header
     ++9b4a54059db9a05c270c0a0587f245bc6868d576 C: use rand() rather than cobbled together random generator.
      
       ## reftable/dump.c (new) ##
      @@
     @@ reftable/merged_test.c (new)
      +	struct reftable_ref_record r1[] = { {
      +		.refname = "b",
      +		.update_index = 1,
     -+		.value = hash1,
     ++		.value_type = REFTABLE_REF_VAL1,
     ++		.value.val1 = hash1,
      +	} };
      +	struct reftable_ref_record r2[] = { {
      +		.refname = "a",
      +		.update_index = 2,
     ++		.value_type = REFTABLE_REF_DELETION,
      +	} };
      +
      +	struct reftable_ref_record *refs[] = { r1, r2 };
     @@ reftable/merged_test.c (new)
      +{
      +	uint8_t hash1[SHA1_SIZE] = { 1 };
      +	uint8_t hash2[SHA1_SIZE] = { 2 };
     -+	struct reftable_ref_record r1[] = { {
     -+						    .refname = "a",
     -+						    .update_index = 1,
     -+						    .value = hash1,
     -+					    },
     -+					    {
     -+						    .refname = "b",
     -+						    .update_index = 1,
     -+						    .value = hash1,
     -+					    },
     -+					    {
     -+						    .refname = "c",
     -+						    .update_index = 1,
     -+						    .value = hash1,
     -+					    } };
     ++	struct reftable_ref_record r1[] = {
     ++		{
     ++			.refname = "a",
     ++			.update_index = 1,
     ++			.value_type = REFTABLE_REF_VAL1,
     ++			.value.val1 = hash1,
     ++		},
     ++		{
     ++			.refname = "b",
     ++			.update_index = 1,
     ++			.value_type = REFTABLE_REF_VAL1,
     ++			.value.val1 = hash1,
     ++		},
     ++		{
     ++			.refname = "c",
     ++			.update_index = 1,
     ++			.value_type = REFTABLE_REF_VAL1,
     ++			.value.val1 = hash1,
     ++		}
     ++	};
      +	struct reftable_ref_record r2[] = { {
      +		.refname = "a",
      +		.update_index = 2,
     ++		.value_type = REFTABLE_REF_DELETION,
      +	} };
      +	struct reftable_ref_record r3[] = {
      +		{
      +			.refname = "c",
      +			.update_index = 3,
     -+			.value = hash2,
     ++			.value_type = REFTABLE_REF_VAL1,
     ++			.value.val1 = hash2,
      +		},
      +		{
      +			.refname = "d",
      +			.update_index = 3,
     -+			.value = hash1,
     ++			.value_type = REFTABLE_REF_VAL1,
     ++			.value.val1 = hash1,
      +		},
      +	};
      +
     @@ reftable/merged_test.c (new)
      +
      +int merged_test_main(int argc, const char *argv[])
      +{
     -+	test_merged_between();
     -+	test_pq();
     -+	test_merged();
     -+	test_default_write_opts();
     ++	RUN_TEST(test_merged_between);
     ++	RUN_TEST(test_pq);
     ++	RUN_TEST(test_merged);
     ++	RUN_TEST(test_default_write_opts);
      +	return 0;
      +}
      
     @@ reftable/refname_test.c (new)
      +		reftable_new_writer(&strbuf_add_void, &buf, &opts);
      +	struct reftable_ref_record rec = {
      +		.refname = "a/b",
     -+		.target = "destination", /* make sure it's not a symref. */
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "destination", /* make sure it's not a symref.
     ++						*/
      +		.update_index = 1,
      +	};
      +	int err;
     @@ reftable/refname_test.c (new)
      +
      +int refname_test_main(int argc, const char *argv[])
      +{
     -+	test_conflict();
     ++	RUN_TEST(test_conflict);
      +	return 0;
      +}
      
     @@ reftable/reftable-generic.h (new)
      +#include "reftable-merged.h"
      +
      +/*
     -+ Provides a unified API for reading tables, either merged tables, or single
     -+ readers.
     -+*/
     ++ * Provides a unified API for reading tables, either merged tables, or single
     ++ * readers. */
      +struct reftable_table {
      +	struct reftable_table_vtable *ops;
      +	void *table_arg;
     @@ reftable/reftable-merged.h (new)
      +#include "reftable-iterator.h"
      +
      +/*
     -+ Merged tables
     -+
     -+ A ref database kept in a sequence of table files. The merged_table presents a
     -+ unified view to reading (seeking, iterating) a sequence of immutable tables.
     -+
     -+ The merged tables are on purpose kept disconnected from their actual storage
     -+ (eg. files on disk), because it is useful to merge tables aren't files. For
     -+ example, the per-workspace and global ref namespace can be implemented as a
     -+ merged table of two stacks of file-backed reftables.
     -+*/
     ++ * Merged tables
     ++ *
     ++ * A ref database kept in a sequence of table files. The merged_table presents a
     ++ * unified view to reading (seeking, iterating) a sequence of immutable tables.
     ++ *
     ++ * The merged tables are on purpose kept disconnected from their actual storage
     ++ * (eg. files on disk), because it is useful to merge tables aren't files. For
     ++ * example, the per-workspace and global ref namespace can be implemented as a
     ++ * merged table of two stacks of file-backed reftables.
     ++ */
      +
      +/* A merged table is implements seeking/iterating over a stack of tables. */
      +struct reftable_merged_table;
     @@ reftable/reftable-stack.h (new)
      +#include "reftable-writer.h"
      +
      +/*
     -+  The stack presents an interface to a mutable sequence of reftables.
     ++ * The stack presents an interface to a mutable sequence of reftables.
      +
     -+  A stack can be mutated by pushing a table to the top of the stack.
     ++ * A stack can be mutated by pushing a table to the top of the stack.
      +
     -+  The reftable_stack automatically compacts files on disk to ensure good
     -+  amortized performance.
     -+*/
     ++ * The reftable_stack automatically compacts files on disk to ensure good
     ++ * amortized performance.
     ++ *
     ++ * For windows and other platforms that cannot have open files as rename
     ++ * destinations, concurrent access from multiple processes needs the rand()
     ++ * random seed to be randomized.
     ++ */
      +struct reftable_stack;
      +
      +/* open a new reftable stack. The tables along with the table list will be
     -+   stored in 'dir'. Typically, this should be .git/reftables.
     -+*/
     ++ *  stored in 'dir'. Typically, this should be .git/reftables.
     ++ */
      +int reftable_new_stack(struct reftable_stack **dest, const char *dir,
      +		       struct reftable_write_options config);
      +
     @@ reftable/reftable-stack.h (new)
      +struct reftable_addition;
      +
      +/*
     -+  returns a new transaction to add reftables to the given stack. As a side
     -+  effect, the ref database is locked.
     -+*/
     ++ * returns a new transaction to add reftables to the given stack. As a side
     ++ * effect, the ref database is locked.
     ++ */
      +int reftable_stack_new_addition(struct reftable_addition **dest,
      +				struct reftable_stack *st);
      +
     @@ reftable/reftable-stack.h (new)
      +int reftable_addition_commit(struct reftable_addition *add);
      +
      +/* Release all non-committed data from the transaction, and deallocate the
     -+   transaction. Releases the lock if held. */
     ++ * transaction. Releases the lock if held. */
      +void reftable_addition_destroy(struct reftable_addition *add);
      +
      +/* add a new table to the stack. The write_table function must call
     -+   reftable_writer_set_limits, add refs and return an error value. */
     ++ * reftable_writer_set_limits, add refs and return an error value. */
      +int reftable_stack_add(struct reftable_stack *st,
      +		       int (*write_table)(struct reftable_writer *wr,
      +					  void *write_arg),
      +		       void *write_arg);
      +
      +/* returns the merged_table for seeking. This table is valid until the
     -+   next write or reload, and should not be closed or deleted.
     -+*/
     ++ * next write or reload, and should not be closed or deleted.
     ++ */
      +struct reftable_merged_table *
      +reftable_stack_merged_table(struct reftable_stack *st);
      +
     @@ reftable/reftable-stack.h (new)
      +/* heuristically compact unbalanced table stack. */
      +int reftable_stack_auto_compact(struct reftable_stack *st);
      +
     -+/* convenience function to read a single ref. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     ++/* convenience function to read a single ref. Returns < 0 for error, 0 for
     ++ * success, and 1 if ref not found. */
      +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
      +			    struct reftable_ref_record *ref);
      +
     -+/* convenience function to read a single log. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     ++/* convenience function to read a single log. Returns < 0 for error, 0 for
     ++ * success, and 1 if ref not found. */
      +int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
      +			    struct reftable_log_record *log);
      +
     @@ reftable/stack.c (new)
      +static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
      +{
      +	char buf[100];
     -+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64, min, max);
     ++	uint32_t rnd = (uint32_t)rand();
     ++	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x",
     ++		 min, max, rnd);
      +	strbuf_reset(dest);
      +	strbuf_addstr(dest, buf);
      +}
     @@ reftable/stack.c (new)
      +	int lock_file_fd;
      +	struct strbuf lock_file_name;
      +	struct reftable_stack *stack;
     -+	char **names;
     ++
      +	char **new_tables;
      +	int new_tables_len;
      +	uint64_t next_update_index;
     @@ reftable/stack.c (new)
      +	struct strbuf nm = STRBUF_INIT;
      +	for (i = 0; i < add->new_tables_len; i++) {
      +		strbuf_reset(&nm);
     -+		strbuf_addstr(&nm, add->stack->list_file);
     ++		strbuf_addstr(&nm, add->stack->reftable_dir);
      +		strbuf_addstr(&nm, "/");
      +		strbuf_addstr(&nm, add->new_tables[i]);
      +		unlink(nm.buf);
     @@ reftable/stack.c (new)
      +		strbuf_release(&add->lock_file_name);
      +	}
      +
     -+	free_names(add->names);
     -+	add->names = NULL;
      +	strbuf_release(&nm);
      +}
      +
     @@ reftable/stack.c (new)
      +		goto done;
      +	}
      +
     -+	err = reftable_stack_reload(add->stack);
     ++	/* success, no more state to clean up. */
     ++	strbuf_release(&add->lock_file_name);
     ++	for (i = 0; i < add->new_tables_len; i++) {
     ++		reftable_free(add->new_tables[i]);
     ++	}
     ++	reftable_free(add->new_tables);
     ++	add->new_tables = NULL;
     ++	add->new_tables_len = 0;
      +
     ++	err = reftable_stack_reload(add->stack);
      +done:
      +	reftable_addition_close(add);
      +	return err;
     @@ reftable/stack.c (new)
      +	strbuf_addstr(&tab_file_name, "/");
      +	strbuf_addbuf(&tab_file_name, &next_name);
      +
     -+	/* TODO: should check destination out of paranoia */
     ++	/*
     ++	  On windows, this relies on rand() picking a unique destination name.
     ++	  Maybe we should do retry loop as well?
     ++	 */
      +	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
      +	if (err < 0) {
      +		err = REFTABLE_IO_ERROR;
     @@ reftable/stack.c (new)
      +
      +	format_name(&next_name,
      +		    reftable_reader_min_update_index(st->readers[first]),
     -+		    reftable_reader_max_update_index(st->readers[first]));
     ++		    reftable_reader_max_update_index(st->readers[last]));
      +
      +	strbuf_reset(temp_tab);
      +	strbuf_addstr(temp_tab, st->reftable_dir);
     @@ reftable/stack.c (new)
      +		if (err < 0) {
      +			break;
      +		}
     ++
      +		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
      +			continue;
      +		}
     @@ reftable/stack.c (new)
      +	strbuf_addbuf(&new_table_path, &new_table_name);
      +
      +	if (!is_empty_table) {
     ++		/* retry? */
      +		err = rename(temp_tab_file_name.buf, new_table_path.buf);
      +		if (err < 0) {
      +			err = REFTABLE_IO_ERROR;
     @@ reftable/stack_test.c (new)
      +	strbuf_release(&path);
      +}
      +
     ++static char *get_tmp_template(const char *prefix)
     ++{
     ++	const char *tmp = getenv("TMPDIR");
     ++	static char template[1024];
     ++	snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX",
     ++		 tmp ? tmp : "/tmp", prefix);
     ++	return template;
     ++}
     ++
      +static void test_read_file(void)
      +{
     -+	char fn[256] = "/tmp/stack.test_read_file.XXXXXX";
     ++	char *fn = get_tmp_template(__FUNCTION__);
      +	int fd = mkstemp(fn);
      +	char out[1024] = "line1\n\nline2\nline3";
      +	int n, err;
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_add_one(void)
      +{
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
      +	struct reftable_ref_record ref = {
      +		.refname = "HEAD",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     @@ reftable/stack_test.c (new)
      +
      +	err = reftable_stack_read_ref(st, ref.refname, &dest);
      +	EXPECT_ERR(err);
     -+	EXPECT(0 == strcmp("master", dest.target));
     ++	EXPECT(0 == strcmp("master", dest.value.symref));
      +
      +	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st1 = NULL;
      +	struct reftable_stack *st2 = NULL;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	int err;
      +	struct reftable_ref_record ref1 = {
      +		.refname = "HEAD",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	struct reftable_ref_record ref2 = {
      +		.refname = "branch2",
      +		.update_index = 2,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +
      +	EXPECT(mkdtemp(dir));
      +
     ++	/* simulate multi-process access to the same stack
     ++	   by creating two stacks for the same directory.
     ++	 */
      +	err = reftable_new_stack(&st1, dir, cfg);
      +	EXPECT_ERR(err);
      +
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_transaction_api(void)
      +{
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +	struct reftable_ref_record ref = {
      +		.refname = "HEAD",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     @@ reftable/stack_test.c (new)
      +
      +	err = reftable_stack_read_ref(st, ref.refname, &dest);
      +	EXPECT_ERR(err);
     -+	EXPECT(0 == strcmp("master", dest.target));
     ++	EXPECT(REFTABLE_REF_SYMREF == dest.value_type);
     ++	EXPECT(0 == strcmp("master", dest.value.symref));
      +
      +	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	int i;
      +	struct reftable_ref_record ref = {
      +		.refname = "a/b",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	char *additions[] = { "a", "a/b/c" };
      +
     @@ reftable/stack_test.c (new)
      +		struct reftable_ref_record ref = {
      +			.refname = additions[i],
      +			.update_index = 1,
     -+			.target = "master",
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value.symref = "master",
      +		};
      +
      +		err = reftable_stack_add(st, &write_test_ref, &ref);
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_update_index_check(void)
      +{
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
      +	struct reftable_ref_record ref1 = {
      +		.refname = "name1",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	struct reftable_ref_record ref2 = {
      +		.refname = "name2",
      +		.update_index = 1,
     -+		.target = "master",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "master",
      +	};
      +	EXPECT(mkdtemp(dir));
      +
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_lock_failure(void)
      +{
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err, i;
     @@ reftable/stack_test.c (new)
      +		.exact_log_message = 1,
      +	};
      +	struct reftable_stack *st = NULL;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_ref_record refs[2] = { { NULL } };
      +	struct reftable_log_record logs[2] = { { NULL } };
      +	int N = ARRAY_SIZE(refs);
     @@ reftable/stack_test.c (new)
      +		char buf[256];
      +		snprintf(buf, sizeof(buf), "branch%02d", i);
      +		refs[i].refname = xstrdup(buf);
     -+		refs[i].value = reftable_malloc(SHA1_SIZE);
      +		refs[i].update_index = i + 1;
     -+		set_test_hash(refs[i].value, i);
     ++		refs[i].value_type = REFTABLE_REF_VAL1;
     ++		refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
     ++		set_test_hash(refs[i].value.val1, i);
      +
      +		logs[i].refname = xstrdup(buf);
      +		logs[i].update_index = N + i + 1;
     @@ reftable/stack_test.c (new)
      +		0,
      +	};
      +	struct reftable_stack *st = NULL;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +
      +	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
      +
     @@ reftable/stack_test.c (new)
      +static void test_reftable_stack_tombstone(void)
      +{
      +	int i = 0;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +		refs[i].refname = xstrdup(buf);
      +		refs[i].update_index = i + 1;
      +		if (i % 2 == 0) {
     -+			refs[i].value = reftable_malloc(SHA1_SIZE);
     -+			set_test_hash(refs[i].value, i);
     ++			refs[i].value_type = REFTABLE_REF_VAL1;
     ++			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
     ++			set_test_hash(refs[i].value.val1, i);
      +		}
      +		logs[i].refname = xstrdup(buf);
      +		/* update_index is part of the key. */
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_hash_id(void)
      +{
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
      +
      +	struct reftable_ref_record ref = {
      +		.refname = "master",
     -+		.target = "target",
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = "target",
      +		.update_index = 1,
      +	};
      +	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
     @@ reftable/stack_test.c (new)
      +	err = reftable_stack_read_ref(st_default, "master", &dest);
      +	EXPECT_ERR(err);
      +
     -+	EXPECT(!strcmp(dest.target, ref.target));
     ++	EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE));
      +	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
      +	reftable_stack_destroy(st_default);
     @@ reftable/stack_test.c (new)
      +
      +static void test_sizes_to_segments_empty(void)
      +{
     -+	uint64_t sizes[0];
     -+
      +	int seglen = 0;
     -+	struct segment *segs =
     -+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
     ++	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
      +	EXPECT(seglen == 0);
      +	reftable_free(segs);
      +}
     @@ reftable/stack_test.c (new)
      +
      +static void test_reflog_expire(void)
      +{
     -+	char dir[256] = "/tmp/stack.test_reflog_expire.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	struct reftable_log_record logs[20] = { { NULL } };
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	struct reftable_stack *st2 = NULL;
      +
      +	EXPECT(mkdtemp(dir));
     @@ reftable/stack_test.c (new)
      +{
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
     -+	char dir[256] = "/tmp/stack_test.XXXXXX";
     ++	char *dir = get_tmp_template(__FUNCTION__);
      +	int err, i;
      +	int N = 100;
      +	EXPECT(mkdtemp(dir));
     @@ reftable/stack_test.c (new)
      +		struct reftable_ref_record ref = {
      +			.refname = name,
      +			.update_index = reftable_stack_next_update_index(st),
     -+			.target = "master",
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value.symref = "master",
      +		};
      +		snprintf(name, sizeof(name), "branch%04d", i);
      +
     @@ reftable/stack_test.c (new)
      +
      +int stack_test_main(int argc, const char *argv[])
      +{
     -+	test_reftable_stack_uptodate();
     -+	test_reftable_stack_transaction_api();
     -+	test_reftable_stack_hash_id();
     -+	test_sizes_to_segments_all_equal();
     -+	test_reftable_stack_auto_compaction();
     -+	test_reftable_stack_validate_refname();
     -+	test_reftable_stack_update_index_check();
     -+	test_reftable_stack_lock_failure();
     -+	test_reftable_stack_log_normalize();
     -+	test_reftable_stack_tombstone();
     -+	test_reftable_stack_add_one();
     -+	test_empty_add();
     -+	test_reflog_expire();
     -+	test_suggest_compaction_segment();
     -+	test_suggest_compaction_segment_nothing();
     -+	test_sizes_to_segments();
     -+	test_sizes_to_segments_empty();
     -+	test_log2();
     -+	test_parse_names();
     -+	test_read_file();
     -+	test_names_equal();
     -+	test_reftable_stack_add();
     ++	RUN_TEST(test_reftable_stack_uptodate);
     ++	RUN_TEST(test_reftable_stack_transaction_api);
     ++	RUN_TEST(test_reftable_stack_hash_id);
     ++	RUN_TEST(test_sizes_to_segments_all_equal);
     ++	RUN_TEST(test_reftable_stack_auto_compaction);
     ++	RUN_TEST(test_reftable_stack_validate_refname);
     ++	RUN_TEST(test_reftable_stack_update_index_check);
     ++	RUN_TEST(test_reftable_stack_lock_failure);
     ++	RUN_TEST(test_reftable_stack_log_normalize);
     ++	RUN_TEST(test_reftable_stack_tombstone);
     ++	RUN_TEST(test_reftable_stack_add_one);
     ++	RUN_TEST(test_empty_add);
     ++	RUN_TEST(test_reflog_expire);
     ++	RUN_TEST(test_suggest_compaction_segment);
     ++	RUN_TEST(test_suggest_compaction_segment_nothing);
     ++	RUN_TEST(test_sizes_to_segments);
     ++	RUN_TEST(test_sizes_to_segments_empty);
     ++	RUN_TEST(test_log2);
     ++	RUN_TEST(test_parse_names);
     ++	RUN_TEST(test_read_file);
     ++	RUN_TEST(test_names_equal);
     ++	RUN_TEST(test_reftable_stack_add);
      +	return 0;
      +}
      
 14:  a590865a700 ! 13:  d57023d9f13 Reftable support for git-core
     @@ Commit message
          Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Co-authored-by: Jeff King <peff@peff.net>
      
     + ## Documentation/config/extensions.txt ##
     +@@ Documentation/config/extensions.txt: extensions.objectFormat::
     + Note that this setting should only be set by linkgit:git-init[1] or
     + linkgit:git-clone[1].  Trying to change it after initialization will not
     + work and will produce hard-to-diagnose issues.
     +++
     ++extensions.refStorage::
     ++	Specify the ref storage mechanism to use.  The acceptable values are `files` and
     ++	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
     ++	this key unless `core.repositoryFormatVersion` is 1.
     +++
     ++Note that this setting should only be set by linkgit:git-init[1] or
     ++linkgit:git-clone[1].  Trying to change it after initialization will not
     ++work and will produce hard-to-diagnose issues.
     +
       ## Documentation/technical/repository-version.txt ##
      @@ Documentation/technical/repository-version.txt: If set, by default "git config" reads from both "config" and
       multiple working directory mode, "config" file is shared while
     @@ refs/reftable-backend.c (new)
      +			continue;
      +
      +		ri->base.flags = 0;
     -+		if (ri->ref.value != NULL) {
     -+			hashcpy(ri->oid.hash, ri->ref.value);
     -+		} else if (ri->ref.target != NULL) {
     ++		switch (ri->ref.value_type) {
     ++		case REFTABLE_REF_VAL1:
     ++			hashcpy(ri->oid.hash, ri->ref.value.val1);
     ++			break;
     ++		case REFTABLE_REF_VAL2:
     ++			hashcpy(ri->oid.hash, ri->ref.value.val2.value);
     ++			break;
     ++		case REFTABLE_REF_SYMREF: {
      +			int out_flags = 0;
      +			const char *resolved = refs_resolve_ref_unsafe(
      +				ri->ref_store, ri->ref.refname,
     @@ refs/reftable-backend.c (new)
      +			    (ri->base.flags & REF_ISBROKEN)) {
      +				continue;
      +			}
     ++			break;
     ++		}
     ++		default:
     ++			abort();
      +		}
      +
      +		ri->base.oid = &ri->oid;
     @@ refs/reftable-backend.c (new)
      +{
      +	struct git_reftable_iterator *ri =
      +		(struct git_reftable_iterator *)ref_iterator;
     -+	if (ri->ref.target_value != NULL) {
     -+		hashcpy(peeled->hash, ri->ref.target_value);
     ++	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
     ++		hashcpy(peeled->hash, ri->ref.value.val2.target_value);
      +		return 0;
      +	}
      +
     @@ refs/reftable-backend.c (new)
      +
      +			int peel_error = peel_object(&u->new_oid, &peeled);
      +			ref.refname = (char *)u->refname;
     -+
     -+			if (!is_null_oid(&u->new_oid)) {
     -+				ref.value = u->new_oid.hash;
     -+			}
      +			ref.update_index = ts;
     ++
      +			if (!peel_error) {
     -+				ref.target_value = peeled.hash;
     ++				ref.value_type = REFTABLE_REF_VAL2;
     ++				ref.value.val2.target_value = peeled.hash;
     ++				ref.value.val2.value = u->new_oid.hash;
     ++			} else if (!is_null_oid(&u->new_oid)) {
     ++				ref.value_type = REFTABLE_REF_VAL1;
     ++				ref.value.val1 = u->new_oid.hash;
      +			}
      +
      +			err = reftable_writer_add_ref(writer, &ref);
     @@ refs/reftable-backend.c (new)
      +	for (i = 0; i < arg->refnames->nr; i++) {
      +		struct reftable_ref_record ref = {
      +			.refname = (char *)arg->refnames->items[i].string,
     ++			.value_type = REFTABLE_REF_DELETION,
      +			.update_index = ts,
      +		};
      +		err = reftable_writer_add_ref(writer, &ref);
     @@ refs/reftable-backend.c (new)
      +
      +		if (reftable_stack_read_ref(arg->stack, log.refname,
      +					    &current) == 0) {
     -+			log.old_hash = current.value;
     ++			log.old_hash = reftable_ref_record_val1(&current);
      +		}
      +		err = reftable_writer_add_log(writer, &log);
      +		log.old_hash = NULL;
     @@ refs/reftable-backend.c (new)
      +
      +	struct reftable_ref_record ref = {
      +		.refname = (char *)create->refname,
     -+		.target = (char *)create->target,
     ++		.value_type = REFTABLE_REF_SYMREF,
     ++		.value.symref = (char *)create->target,
      +		.update_index = ts,
      +	};
      +	reftable_writer_set_limits(writer, ts, ts);
     @@ refs/reftable-backend.c (new)
      +		goto done;
      +	}
      +
     -+	free(ref.refname);
     -+	ref.refname = strdup(arg->newname);
      +	reftable_writer_set_limits(writer, ts, ts);
     -+	ref.update_index = ts;
      +
      +	{
     -+		struct reftable_ref_record todo[2] = { { NULL } };
     -+		todo[0].refname = (char *)arg->oldname;
     -+		todo[0].update_index = ts;
     -+		/* leave todo[0] empty */
     -+		todo[1] = ref;
     ++		struct reftable_ref_record todo[2] = {
     ++			{
     ++				.refname = (char *)arg->oldname,
     ++				.update_index = ts,
     ++				.value_type = REFTABLE_REF_DELETION,
     ++			},
     ++			ref,
     ++		};
      +		todo[1].update_index = ts;
     ++		free(todo[1].refname);
     ++		todo[1].refname = strdup(arg->newname);
      +
      +		err = reftable_writer_add_refs(writer, todo, 2);
      +		if (err < 0) {
     @@ refs/reftable-backend.c (new)
      +		}
      +	}
      +
     -+	if (ref.value != NULL) {
     ++	if (reftable_ref_record_val1(&ref)) {
     ++		uint8_t *val1 = reftable_ref_record_val1(&ref);
      +		struct reftable_log_record todo[2] = { { NULL } };
      +		fill_reftable_log_record(&todo[0]);
      +		fill_reftable_log_record(&todo[1]);
     @@ refs/reftable-backend.c (new)
      +		todo[0].refname = (char *)arg->oldname;
      +		todo[0].update_index = ts;
      +		todo[0].message = (char *)arg->logmsg;
     -+		todo[0].old_hash = ref.value;
     ++		todo[0].old_hash = val1;
      +		todo[0].new_hash = NULL;
      +
      +		todo[1].refname = (char *)arg->newname;
      +		todo[1].update_index = ts;
      +		todo[1].old_hash = NULL;
     -+		todo[1].new_hash = ref.value;
     ++		todo[1].new_hash = val1;
      +		todo[1].message = (char *)arg->logmsg;
      +
      +		err = reftable_writer_add_logs(writer, todo, 2);
     @@ refs/reftable-backend.c (new)
      +		err = -1;
      +		goto done;
      +	}
     -+	if (ref.target != NULL) {
     ++
     ++	if (ref.value_type == REFTABLE_REF_SYMREF) {
      +		strbuf_reset(referent);
     -+		strbuf_addstr(referent, ref.target);
     ++		strbuf_addstr(referent, ref.value.symref);
      +		*type |= REF_ISSYMREF;
     -+	} else if (ref.value != NULL) {
     -+		hashcpy(oid->hash, ref.value);
     ++	} else if (reftable_ref_record_val1(&ref) != NULL) {
     ++		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
      +	} else {
      +		*type |= REF_ISBROKEN;
      +		errno = EINVAL;
 15:  57626bfe2d4 = 14:  9df5bc69f97 git-prompt: prepare for reftable refs backend
 16:  6229da992e4 ! 15:  4076ef5d20b Add "test-tool dump-reftable" command.
     @@ reftable/dump.c: static int dump_table(const char *tablename)
       
       	reftable_reader_free(r);
       	return 0;
     -@@ reftable/dump.c: static int dump_stack(const char *stackdir)
     +@@ reftable/dump.c: static int dump_table(const char *tablename)
     + static int compact_stack(const char *stackdir)
       {
       	struct reftable_stack *stack = NULL;
     - 	struct reftable_write_options cfg = {};
     +-	struct reftable_write_options cfg = {};
     ++	struct reftable_write_options cfg = { 0 };
     + 
     + 	int err = reftable_new_stack(&stack, stackdir, cfg);
     + 	if (err < 0)
     +@@ reftable/dump.c: static int compact_stack(const char *stackdir)
     + static int dump_stack(const char *stackdir)
     + {
     + 	struct reftable_stack *stack = NULL;
     +-	struct reftable_write_options cfg = {};
      -	struct reftable_iterator it = { 0 };
      -	struct reftable_ref_record ref = { 0 };
      -	struct reftable_log_record log = { 0 };
     ++	struct reftable_write_options cfg = { 0 };
      +	struct reftable_iterator it = { NULL };
      +	struct reftable_ref_record ref = { NULL };
      +	struct reftable_log_record log = { NULL };
     @@ reftable/dump.c: static int dump_stack(const char *stackdir)
       
       	reftable_stack_destroy(stack);
       	return 0;
     +@@ reftable/dump.c: static void print_help(void)
     + int reftable_dump_main(int argc, char *const *argv)
     + {
     + 	int err = 0;
     +-	int opt;
     + 	int opt_dump_table = 0;
     + 	int opt_dump_stack = 0;
     + 	int opt_compact = 0;
     +-	const char *arg = NULL;
     +-	while ((opt = getopt(argc, argv, "2chts")) != -1) {
     +-		switch (opt) {
     +-		case '2':
     +-			hash_id = 0x73323536;
     ++	const char *arg = NULL, *argv0 = argv[0];
     ++
     ++	for (; argc > 1; argv++, argc--)
     ++		if (*argv[1] != '-')
     + 			break;
     +-		case 't':
     ++		else if (!strcmp("-2", argv[1]))
     ++			hash_id = 0x73323536;
     ++		else if (!strcmp("-t", argv[1]))
     + 			opt_dump_table = 1;
     +-			break;
     +-		case 's':
     ++		else if (!strcmp("-s", argv[1]))
     + 			opt_dump_stack = 1;
     +-			break;
     +-		case 'c':
     ++		else if (!strcmp("-c", argv[1]))
     + 			opt_compact = 1;
     +-			break;
     +-		case '?':
     +-		case 'h':
     ++		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
     + 			print_help();
     + 			return 2;
     +-			break;
     + 		}
     +-	}
     + 
     +-	if (argv[optind] == NULL) {
     ++	if (argc != 2) {
     + 		fprintf(stderr, "need argument\n");
     + 		print_help();
     + 		return 2;
     + 	}
     + 
     +-	arg = argv[optind];
     ++	arg = argv[1];
     + 
     + 	if (opt_dump_table) {
     + 		err = dump_table(arg);
     +@@ reftable/dump.c: int reftable_dump_main(int argc, char *const *argv)
     + 	}
     + 
     + 	if (err < 0) {
     +-		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
     ++		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
     + 			reftable_error_str(err));
     + 		return 1;
     + 	}
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v4 01/15] init-db: set the_repository->hash_algo early on
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                         ` (14 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of hash value, but this is an exceptional circumstance, and
the reftable library does not cater to this exception. It insists that all
tables in the stack have a consistent format ID for the hash algorithm.

Call set_repo_hash_algo directly after calling validate_hash_algorithm() (which
reads $GIT_DEFAULT_HASH).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 01bc648d416..dcb7015db48 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -437,6 +437,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt);
 	if (reinit && initial_branch)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 02/15] reftable: add LICENSE
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
                         ` (13 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 00000000000..402e0f9356b
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 03/15] reftable: add error related functionality
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                         ` (12 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 00000000000..f6f16def921
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 00000000000..6f89bedf1a5
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 * next_update_inde*x
+	 *  - on writing a log record with multiline message with
+	 *  exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 04/15] reftable: utility functions
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
                         ` (11 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                            |  24 +++++-
 contrib/buildsystems/CMakeLists.txt |  14 ++-
 reftable/basics.c                   | 128 ++++++++++++++++++++++++++++
 reftable/basics.h                   |  60 +++++++++++++
 reftable/basics_test.c              |  98 +++++++++++++++++++++
 reftable/publicbasics.c             |  58 +++++++++++++
 reftable/reftable-malloc.h          |  18 ++++
 reftable/reftable-tests.h           |  22 +++++
 reftable/system.h                   |  32 +++++++
 reftable/test_framework.c           |  23 +++++
 reftable/test_framework.h           |  53 ++++++++++++
 t/helper/test-reftable.c            |   9 ++
 t/helper/test-tool.c                |   3 +-
 t/helper/test-tool.h                |   1 +
 t/t0032-reftable-unittest.sh        |  15 ++++
 15 files changed, 552 insertions(+), 6 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index 45bce31016b..9b6d84af8f6 100644
--- a/Makefile
+++ b/Makefile
@@ -731,6 +731,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -820,6 +821,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1185,7 +1188,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2386,10 +2389,19 @@ XDIFF_OBJS += xdiff/xpatience.o
 XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/publicbasics.o
+
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/basics_test.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
 	$(XDIFF_OBJS) \
 	$(FUZZ_OBJS) \
+	$(REFTABLE_OBJS) \
+	$(REFTABLE_TEST_OBJS) \
 	common-main.o \
 	git.o
 ifndef NO_CURL
@@ -2541,6 +2553,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2821,7 +2839,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3150,7 +3168,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index df539a44fa0..f3a2fd35616 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -591,6 +591,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
 list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
 add_library(xdiff STATIC ${libxdiff_SOURCES})
 
+#reftable
+parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
+
+list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+add_library(reftable STATIC ${reftable_SOURCES})
+
 if(WIN32)
 	if(NOT MSVC)#use windres when compiling with gcc and clang
 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
@@ -613,7 +619,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
 
-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()
@@ -848,11 +854,15 @@ if(BUILD_TESTING)
 add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
 target_link_libraries(test-fake-ssh common-main)
 
+#reftable-tests
+parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
+list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+
 #test-tool
 parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")
 
 list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
+add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
 target_link_libraries(test-tool common-main)
 
 set_target_properties(test-fake-ssh test-tool
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 00000000000..abd027b9888
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,128 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* Invariants:
+	 *
+	 *  (hi == sz) || f(hi) == true
+	 *  (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		if (f(mid, args))
+			hi = mid;
+		else
+			lo = mid;
+	}
+
+	if (lo)
+		return hi;
+
+	return f(0, args) ? 0 : 1;
+}
+
+void free_names(char **a)
+{
+	char **p;
+	if (a == NULL) {
+		return;
+	}
+	for (p = a; *p; p++) {
+		reftable_free(*p);
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	char **p = names;
+	for (; *p; p++) {
+		/* empty */
+	}
+	return p - names;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next && next < end) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(*names));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	names = reftable_realloc(names, (names_len + 1) * sizeof(*names));
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	int i = 0;
+	for (; a[i] && b[i]; i++) {
+		if (strcmp(a[i], b[i])) {
+			return 0;
+		}
+	}
+
+	return a[i] == b[i];
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	for (; p < a->len && p < b->len; p++) {
+		if (a->buf[p] != b->buf[p])
+			break;
+	}
+
+	return p;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 00000000000..096b36862b9
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+/*
+ * miscellaneous utilities that are not provided by Git.
+ */
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+ * find smallest index i in [0, sz) at which f(i) is true, assuming
+ * that f is ascending. Return sz if f(i) is false for all indices.
+ *
+ * Contrary to bsearch(3), this returns something useful if the argument is not
+ * found.
+ */
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+ * Frees a NULL terminated array of malloced strings. The array itself is also
+ * freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. `size` is the length of the buffer,
+ * without terminating '\0'. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+struct strbuf;
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/basics_test.c b/reftable/basics_test.c
new file mode 100644
index 00000000000..6d52f0f9d5a
--- /dev/null
+++ b/reftable/basics_test.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			EXPECT(args.key < arr[res]);
+			if (res > 0) {
+				EXPECT(args.key >= arr[res - 1]);
+			}
+		} else {
+			EXPECT(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_names_length(void)
+{
+	char *a[] = { "a", "b", NULL };
+	EXPECT(names_length(a) == 2);
+}
+
+static void test_parse_names_normal(void)
+{
+	char in[] = "a\nb\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(!strcmp(out[1], "b"));
+	EXPECT(out[2] == NULL);
+	free_names(out);
+}
+
+static void test_parse_names_drop_empty(void)
+{
+	char in[] = "a\n\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(out[1] == NULL);
+	free_names(out);
+}
+
+static void test_common_prefix(void)
+{
+	struct strbuf s1 = STRBUF_INIT;
+	struct strbuf s2 = STRBUF_INIT;
+	strbuf_addstr(&s1, "abcdef");
+	strbuf_addstr(&s2, "abc");
+	EXPECT(common_prefix_size(&s1, &s2) == 3);
+	strbuf_release(&s1);
+	strbuf_release(&s2);
+}
+
+int basics_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_parse_names_normal);
+	RUN_TEST(test_parse_names_drop_empty);
+	RUN_TEST(test_binsearch);
+	RUN_TEST(test_names_length);
+	return 0;
+}
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 00000000000..25639f61af6
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,58 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-malloc.h"
+
+#include "basics.h"
+#include "system.h"
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
new file mode 100644
index 00000000000..5f2185f1f34
--- /dev/null
+++ b/reftable/reftable-malloc.h
@@ -0,0 +1,18 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stddef.h>
+
+/* Overrides the functions to use for memory management. */
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+#endif
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 00000000000..5e7698ae654
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int basics_test_main(int argc, const char **argv);
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 00000000000..07277ca0627
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#include <zlib.h>
+
+struct strbuf;
+/* In git, this is declared in dir.h */
+int remove_dir_recursively(struct strbuf *path, int flags);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 00000000000..a5ff4e2a2d2
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,23 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "test_framework.h"
+
+#include "basics.h"
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 00000000000..5fdc9519a5a
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,53 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+#include "reftable-error.h"
+
+#define EXPECT_ERR(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define EXPECT_STREQ(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define EXPECT(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+#define RUN_TEST(f)                          \
+	fprintf(stderr, "running %s\n", #f); \
+	fflush(stderr);                      \
+	f();
+
+void set_test_hash(uint8_t *p, int i);
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 00000000000..3b58e423e7b
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,9 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	basics_test_main(argc, argv);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 9d6d14d9293..0208a0a41cf 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -48,13 +48,14 @@ static struct test_cmd cmds[] = {
 	{ "path-utils", cmd__path_utils },
 	{ "pkt-line", cmd__pkt_line },
 	{ "prio-queue", cmd__prio_queue },
-	{ "proc-receive", cmd__proc_receive},
+	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index a6470ff62c4..1de39ce5b58 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -44,6 +44,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh
new file mode 100755
index 00000000000..0ed14971a58
--- /dev/null
+++ b/t/t0032-reftable-unittest.sh
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable unittests'
+
+. ./test-lib.sh
+
+test_expect_success 'unittests' '
+	TMPDIR=$(pwd) && export TMPDIR &&
+	test-tool reftable
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 05/15] reftable: add blocksource, an abstraction for random access reads
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (3 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                         ` (10 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:

* log blocks are zlib compressed, and handling them is simplified if we can
  discard byte segments from within the block layer.

* for unittests, it is useful to read and write in-memory. The blocksource
  allows us to abstract the data away from on-disk files.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                        |   1 +
 reftable/blocksource.c          | 148 ++++++++++++++++++++++++++++++++
 reftable/blocksource.h          |  22 +++++
 reftable/reftable-blocksource.h |  49 +++++++++++
 4 files changed, 220 insertions(+)
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/reftable-blocksource.h

diff --git a/Makefile b/Makefile
index 9b6d84af8f6..f623611bde2 100644
--- a/Makefile
+++ b/Makefile
@@ -2391,6 +2391,7 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 00000000000..25d4d95b52b
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 00000000000..072e2727ad2
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "system.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
new file mode 100644
index 00000000000..5aa3990a573
--- /dev/null
+++ b/reftable/reftable-blocksource.h
@@ -0,0 +1,49 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_BLOCKSOURCE_H
+#define REFTABLE_BLOCKSOURCE_H
+
+#include <stdint.h>
+
+/* block_source is a generic wrapper for a seekable readable file.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+ * so it can return itself into the pool. */
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 06/15] reftable: (de)serialization for the polymorphic record type.
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (4 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                         ` (9 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |    2 +
 reftable/constants.h       |   21 +
 reftable/record.c          | 1157 ++++++++++++++++++++++++++++++++++++
 reftable/record.h          |  139 +++++
 reftable/record_test.c     |  398 +++++++++++++
 reftable/reftable-record.h |  100 ++++
 t/helper/test-reftable.c   |    2 +-
 7 files changed, 1818 insertions(+), 1 deletion(-)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/reftable-record.h

diff --git a/Makefile b/Makefile
index f623611bde2..ce3052169b4 100644
--- a/Makefile
+++ b/Makefile
@@ -2393,7 +2393,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 00000000000..5eee72c4c11
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 00000000000..5381e9aaa89
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1157 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable-error.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL1:
+		return rec->value.val1;
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.value;
+	default:
+		return NULL;
+	}
+}
+
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.target_value;
+	default:
+		return NULL;
+	}
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	 * fields. */
+	reftable_ref_record_release(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+	ref->update_index = src->update_index;
+	ref->value_type = src->value_type;
+	switch (src->value_type) {
+	case REFTABLE_REF_DELETION:
+		break;
+	case REFTABLE_REF_VAL1:
+		ref->value.val1 = reftable_malloc(hash_size);
+		memcpy(ref->value.val1, src->value.val1, hash_size);
+		break;
+	case REFTABLE_REF_VAL2:
+		ref->value.val2.value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
+		ref->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.target_value,
+		       src->value.val2.target_value, hash_size);
+		break;
+	case REFTABLE_REF_SYMREF:
+		ref->value.symref = xstrdup(src->value.symref);
+		break;
+	}
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		printf("=> %s", ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		hex_format(hex, ref->value.val2.value, hash_size(hash_id));
+		printf("val 2 %s", hex);
+		hex_format(hex, ref->value.val2.target_value,
+			   hash_size(hash_id));
+		printf("(T %s)", hex);
+		break;
+	case REFTABLE_REF_VAL1:
+		hex_format(hex, ref->value.val1, hash_size(hash_id));
+		printf("val 1 %s", hex);
+		break;
+	case REFTABLE_REF_DELETION:
+		printf("delete");
+		break;
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_release_void(void *rec)
+{
+	reftable_ref_record_release((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_release(struct reftable_ref_record *ref)
+{
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		reftable_free(ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		reftable_free(ref->value.val2.target_value);
+		reftable_free(ref->value.val2.value);
+		break;
+	case REFTABLE_REF_VAL1:
+		reftable_free(ref->value.val1);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	reftable_free(ref->refname);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	return r->value_type;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	switch (r->value_type) {
+	case REFTABLE_REF_SYMREF:
+		n = encode_string(r->value.symref, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		break;
+	case REFTABLE_REF_VAL2:
+		if (s.len < 2 * hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val2.value, hash_size);
+		string_view_consume(&s, hash_size);
+		memcpy(s.buf, r->value.val2.target_value, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_VAL1:
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val1, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	uint64_t update_index = 0;
+	int n = get_var_int(&update_index, &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	reftable_ref_record_release(r);
+
+	assert(hash_size > 0);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->update_index = update_index;
+	r->refname[key.len] = 0;
+	r->value_type = val_type;
+	switch (val_type) {
+	case REFTABLE_REF_VAL1:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		r->value.val1 = reftable_malloc(hash_size);
+		memcpy(r->value.val1, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_VAL2:
+		if (in.len < 2 * hash_size) {
+			return -1;
+		}
+
+		r->value.val2.value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+
+		r->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_SYMREF: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		r->value.symref = dest.buf;
+	} break;
+
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.release = &reftable_ref_record_release_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_release(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_release(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.release = &reftable_obj_record_release,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname,
+	       log->update_index, log->name, log->email, log->time,
+	       log->tz_offset);
+	hex_format(hex, log->old_hash, hash_size(hash_id));
+	printf("%s => ", hex);
+	hex_format(hex, log->new_hash, hash_size(hash_id));
+	printf("%s\n\n%s\n}\n", hex, log->message);
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_release(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	if (dst->email != NULL) {
+		dst->email = xstrdup(dst->email);
+	}
+	if (dst->name != NULL) {
+		dst->name = xstrdup(dst->name);
+	}
+	if (dst->message != NULL) {
+		dst->message = xstrdup(dst->message);
+	}
+
+	if (dst->new_hash != NULL) {
+		dst->new_hash = reftable_malloc(hash_size);
+		memcpy(dst->new_hash, src->new_hash, hash_size);
+	}
+	if (dst->old_hash != NULL) {
+		dst->old_hash = reftable_malloc(hash_size);
+		memcpy(dst->old_hash, src->old_hash, hash_size);
+	}
+}
+
+static void reftable_log_record_release_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_release(r);
+}
+
+void reftable_log_record_release(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	reftable_free(r->new_hash);
+	reftable_free(r->old_hash);
+	reftable_free(r->name);
+	reftable_free(r->email);
+	reftable_free(r->message);
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = r->old_hash;
+	uint8_t *newh = r->new_hash;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->name ? r->name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->email ? r->email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->message ? r->message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type == 0) {
+		FREE_AND_NULL(r->old_hash);
+		FREE_AND_NULL(r->new_hash);
+		FREE_AND_NULL(r->message);
+		FREE_AND_NULL(r->email);
+		FREE_AND_NULL(r->name);
+		return 0;
+	}
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->old_hash = reftable_realloc(r->old_hash, hash_size);
+	r->new_hash = reftable_realloc(r->new_hash, hash_size);
+
+	memcpy(r->old_hash, in.buf, hash_size);
+	memcpy(r->new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->name = reftable_realloc(r->name, dest.len + 1);
+	memcpy(r->name, dest.buf, dest.len);
+	r->name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->email = reftable_realloc(r->email, dest.len + 1);
+	memcpy(r->email, dest.buf, dest.len);
+	r->email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->message = reftable_realloc(r->message, dest.len + 1);
+	memcpy(r->message, dest.buf, dest.len);
+	r->message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	return null_streq(a->name, b->name) && null_streq(a->email, b->email) &&
+	       null_streq(a->message, b->message) &&
+	       zero_hash_eq(a->old_hash, b->old_hash, hash_size) &&
+	       zero_hash_eq(a->new_hash, b->new_hash, hash_size) &&
+	       a->time == b->time && a->tz_offset == b->tz_offset &&
+	       a->update_index == b->update_index;
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.release = &reftable_log_record_release_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_release(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_release(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.release = &reftable_index_record_release,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_release(struct reftable_record *rec)
+{
+	rec->ops->release(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	if (!(0 == strcmp(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_REF_SYMREF:
+		return !strcmp(a->value.symref, b->value.symref);
+	case REFTABLE_REF_VAL2:
+		return hash_equal(a->value.val2.value, b->value.val2.value,
+				  hash_size) &&
+		       hash_equal(a->value.val2.target_value,
+				  b->value.val2.target_value, hash_size);
+	case REFTABLE_REF_VAL1:
+		return hash_equal(a->value.val1, b->value.val1, hash_size);
+	case REFTABLE_REF_DELETION:
+		return 1;
+	default:
+		abort();
+	}
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value_type == REFTABLE_REF_DELETION;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->new_hash == NULL && log->old_hash == NULL &&
+		log->name == NULL && log->email == NULL &&
+		log->message == NULL && log->time == 0 && log->tz_offset == 0 &&
+		log->message == NULL);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 00000000000..498e8c50bf4
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,139 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "system.h"
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+ * A substring of existing string data. This structure takes no responsibility
+ * for the lifetime of the data it points to.
+ */
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*release)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+ * number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_release(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 00000000000..1bfeafe7f44
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,398 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		EXPECT(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		EXPECT(n > 0);
+
+		EXPECT(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		in.value_type = i;
+		switch (i) {
+		case REFTABLE_REF_DELETION:
+			break;
+		case REFTABLE_REF_VAL1:
+			in.value.val1 = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val1, 1);
+			break;
+		case REFTABLE_REF_VAL2:
+			in.value.val2.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.value, 1);
+			in.value.val2.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.target_value, 2);
+			break;
+		case REFTABLE_REF_SYMREF:
+			in.value.symref = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		EXPECT(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE));
+		reftable_record_release(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_release(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_release(&in[0]);
+	reftable_log_record_release(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.old_hash = reftable_malloc(SHA1_SIZE),
+			.new_hash = reftable_malloc(SHA1_SIZE),
+			.name = xstrdup("han-wen"),
+			.email = xstrdup("hanwen@google.com"),
+			.message = xstrdup("test"),
+			.update_index = 42,
+			.time = 1577123507,
+			.tz_offset = 100,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+	set_test_hash(in[0].new_hash, 1);
+	set_test_hash(in[0].old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.new_hash = reftable_calloc(SHA1_SIZE),
+			.old_hash = reftable_calloc(SHA1_SIZE),
+			.name = xstrdup("old name"),
+			.email = xstrdup("old@email"),
+			.message = xstrdup("old message"),
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_release(&in[i]);
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	EXPECT(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	EXPECT(!restart);
+	EXPECT(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	EXPECT(n == m);
+	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
+	EXPECT(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
+		EXPECT(in.offset_len == out.offset_len);
+
+		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		EXPECT(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	EXPECT(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	EXPECT(m == n);
+
+	EXPECT(in.offset == out.offset);
+
+	reftable_record_release(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_log_record_equal);
+	RUN_TEST(test_reftable_log_record_roundtrip);
+	RUN_TEST(test_reftable_ref_record_roundtrip);
+	RUN_TEST(test_varint_roundtrip);
+	RUN_TEST(test_key_roundtrip);
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_reftable_obj_record_roundtrip);
+	RUN_TEST(test_reftable_index_record_roundtrip);
+	RUN_TEST(test_u24_roundtrip);
+	return 0;
+}
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
new file mode 100644
index 00000000000..7b1bdbf5e5f
--- /dev/null
+++ b/reftable/reftable-record.h
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_RECORD_H
+#define REFTABLE_RECORD_H
+
+#include <stdint.h>
+
+/*
+ * Basic data types
+ *
+ * Reftables store the state of each ref in struct reftable_ref_record, and they
+ * store a sequence of reflog updates in struct reftable_log_record.
+ */
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				* written */
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_REF_DELETION = 0x0,
+
+		/* a simple ref */
+		REFTABLE_REF_VAL1 = 0x1,
+		/* a tag, plus its peeled hash */
+		REFTABLE_REF_VAL2 = 0x2,
+
+		/* a symbolic reference */
+		REFTABLE_REF_SYMREF = 0x3,
+#define REFTABLE_NR_REF_VALUETYPES 4
+	} value_type;
+	union {
+		uint8_t *val1; /* malloced hash. */
+		struct {
+			uint8_t *value; /* first value, malloced hash  */
+			uint8_t *target_value; /* second value, malloced hash */
+		} val2;
+		char *symref; /* referent, malloced 0-terminated string */
+	} value;
+};
+
+/* Returns the first hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec);
+
+/* Returns the second hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec);
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout. Useful for debugging. */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values inside `ref`. */
+void reftable_ref_record_release(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same. Useful for testing. */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+	uint8_t *new_hash;
+	uint8_t *old_hash;
+	char *name;
+	char *email;
+	uint64_t time;
+	int16_t tz_offset;
+	char *message;
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_release(struct reftable_log_record *log);
+
+/* returns whether two records are equal. Useful for testing. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b58e423e7b..09d4b83ef9b 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,6 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
-
+	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 07/15] reftable: reading/writing blocks
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (5 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                         ` (8 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.

This commit provides the logic to read and write these blocks.

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 440 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 127 +++++++++++
 reftable/block_test.c    | 121 +++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 785 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index ce3052169b4..abe0dda5777 100644
--- a/Makefile
+++ b/Makefile
@@ -2391,10 +2391,13 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 00000000000..f44451a3795
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 00000000000..5b4e49fb2be
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,440 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_release(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 00000000000..e207706a644
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,127 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Writes reftable blocks. The block_writer is reused across blocks to minimize
+ * allocation overhead.
+ */
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+ * initializes the blockwriter to write `typ` entries, using `buf` as temporary
+ * storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/* returns the block type (eg. 'r' for ref records. */
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_release(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 00000000000..75fe198e677
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,121 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value_type = REFTABLE_REF_DELETION;
+		EXPECT(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	EXPECT(n > 0);
+
+	block_writer_release(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT_STREQ(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_release(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+
+		EXPECT_STREQ(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_release(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_block_read_write);
+	return 0;
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 00000000000..3e0b0f24f1c
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 09d4b83ef9b..c9deeaf08c7 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,7 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 08/15] reftable: a generic binary tree implementation
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (6 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                         ` (7 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.

The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  4 ++-
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 61 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index abe0dda5777..734c05655b2 100644
--- a/Makefile
+++ b/Makefile
@@ -2395,12 +2395,14 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
-REFTABLE_TEST_OBJS += reftable/basics_test.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 00000000000..0061d14e306
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 00000000000..fbdd002e23a
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+ * is set, insert the key if it's not found. Else, return NULL.
+ */
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+ * deallocates the tree nodes recursively. Keys should be deallocated separately
+ * by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 00000000000..26d1e694252
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,61 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_tree);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c9deeaf08c7..050551fa698 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 09/15] reftable: write reftable files
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (7 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
                         ` (6 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   1 +
 reftable/reftable-writer.h | 147 ++++++++
 reftable/writer.c          | 681 +++++++++++++++++++++++++++++++++++++
 reftable/writer.h          |  50 +++
 4 files changed, 879 insertions(+)
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 734c05655b2..e78908c976a 100644
--- a/Makefile
+++ b/Makefile
@@ -2396,6 +2396,7 @@ REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
new file mode 100644
index 00000000000..9d2f8d60555
--- /dev/null
+++ b/reftable/reftable-writer.h
@@ -0,0 +1,147 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_WRITER_H
+#define REFTABLE_WRITER_H
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/* Writing single reftables */
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	unsigned unpadded : 1;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	unsigned skip_index_objects : 1;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	unsigned skip_name_check : 1;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	unsigned exact_log_message : 1;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* Set the range of update indices for the records we will add. When writing a
+   table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates to a stack, typically min==max, and the
+   update_index can be obtained by inspeciting the stack. When converting an
+   existing ref database into a single reftable, this would be a range of
+   update-index timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/*
+  Add a reftable_ref_record. The record should have names that come after
+  already added records.
+
+  The update_index must be within the limits set by
+  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
+  REFTABLE_API_ERROR error to write a ref record after a log record.
+*/
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/*
+  Convenience function to add multiple reftable_ref_records; the function sorts
+  the records before adding them, reordering the records array passed in.
+*/
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/*
+  adds reftable_log_records. Log records are keyed by (refname, decreasing
+  update_index). The key for the record added must come after the already added
+  log records.
+*/
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/*
+  Convenience function to add multiple reftable_log_records; the function sorts
+  the records before adding them, reordering records array passed in.
+*/
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+#endif
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 00000000000..00a0a11440f
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,681 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "tree.h"
+#include "reftable-error.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val1(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val2(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, reftable_ref_record_val2(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	char *input_log_message = log->message;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err;
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (!w->opts.exact_log_message && log->message != NULL) {
+		strbuf_addstr(&cleaned_message, log->message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->message = cleaned_message.buf;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	err = writer_add_record(w, &rec);
+
+done:
+	log->message = input_log_message;
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_release(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 00000000000..4921c249d06
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "tree.h"
+#include "reftable-writer.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	 * tree for use with tsearch; used to populate the 'o' inverse OID
+	 * map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 10/15] reftable: read reftable files
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (8 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
                         ` (5 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This supports reading a single reftable file.

The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |   3 +
 reftable/iter.c              | 248 ++++++++++++
 reftable/iter.h              |  75 ++++
 reftable/reader.c            | 733 +++++++++++++++++++++++++++++++++++
 reftable/reader.h            |  75 ++++
 reftable/reftable-iterator.h |  37 ++
 reftable/reftable-reader.h   |  89 +++++
 7 files changed, 1260 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-reader.h

diff --git a/Makefile b/Makefile
index e78908c976a..2bafcb1b62a 100644
--- a/Makefile
+++ b/Makefile
@@ -2393,8 +2393,11 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 00000000000..b8c091ffd57
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,248 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable-error.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { NULL };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if (ref->value_type == REFTABLE_REF_VAL2 &&
+		    (!memcmp(fri->oid.buf, ref->value.val2.target_value,
+			     fri->oid.len) ||
+		     !memcmp(fri->oid.buf, ref->value.val2.value,
+			     fri->oid.len)))
+			return 0;
+
+		if (ref->value_type == REFTABLE_REF_VAL1 &&
+		    !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_release(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+		/* BUG */
+		if (!memcmp(it->oid.buf, ref->value.val2.target_value,
+			    it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 00000000000..2ba964c29a3
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "system.h"
+#include "block.h"
+#include "record.h"
+
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+ * iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+ * but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 00000000000..a5b7b1d4fef
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,733 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { NULL };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { NULL };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { NULL };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_release(&want_index_rec);
+	reftable_record_release(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_release(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 00000000000..6d4927e1c5d
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+#endif
diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h
new file mode 100644
index 00000000000..d8251a45ba1
--- /dev/null
+++ b/reftable/reftable-iterator.h
@@ -0,0 +1,37 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ITERATOR_H
+#define REFTABLE_ITERATOR_H
+
+#include "reftable-record.h"
+
+/* iterator is the generic interface for walking over data stored in a
+ * reftable.
+ */
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+#endif
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
new file mode 100644
index 00000000000..804e28543b3
--- /dev/null
+++ b/reftable/reftable-reader.h
@@ -0,0 +1,89 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_READER_H
+#define REFTABLE_READER_H
+
+#include "reftable-iterator.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Reading single tables
+ *
+ * The follow routines are for reading single files. For an application-level
+ * interface, skip ahead to struct reftable_merged_table and struct
+ * reftable_stack.
+ */
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
+ * code and sets pp. The name is used for creating a stack. Typically, it is the
+ * basename of the file. The block source `src` is owned by the reader, and is
+ * closed on calling reftable_reader_destroy().
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+     err = reftable_iterator_next_ref(&it, &ref);
+     if (err > 0) {
+       break;
+     }
+     if (err < 0) {
+       ..error handling..
+     }
+     ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_release(&ref);
+ */
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+ */
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 11/15] reftable: reftable file level tests
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (9 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
                         ` (4 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.

Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 580 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 582 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index 2bafcb1b62a..c913e90b643 100644
--- a/Makefile
+++ b/Makefile
@@ -2405,6 +2405,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 00000000000..c58f7061833
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,580 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-stack.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.update_index = update_index;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.new_hash = hash;
+		log.update_index = update_index;
+		log.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = {
+		.refname = "refs/heads/master",
+		.name = "Han-Wen Nienhuys",
+		.email = "hanwen@google.com",
+		.tz_offset = 100,
+		.time = 0x5e430672,
+		.update_index = 0xa,
+		.message = "commit: 9\n",
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.old_hash = hash1;
+	log.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.old_hash = hash1;
+		log.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
+		EXPECT(i == ref.value.val1[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value_type = REFTABLE_REF_VAL2;
+		ref.value.val2.value = hash1;
+		ref.value.val2.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_log_write_read);
+	RUN_TEST(test_table_read_write_seek_linear_sha256);
+	RUN_TEST(test_log_buffer_size);
+	RUN_TEST(test_table_write_small_table);
+	RUN_TEST(test_buffer);
+	RUN_TEST(test_table_read_api);
+	RUN_TEST(test_table_read_write_sequential);
+	RUN_TEST(test_table_read_write_seek_linear);
+	RUN_TEST(test_table_read_write_seek_index);
+	RUN_TEST(test_table_refs_for_no_index);
+	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_table_empty);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa698..fdf92586737 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 12/15] reftable: rest of library
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (10 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
                         ` (3 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                    |    4 +
 reftable/VERSION            |    1 +
 reftable/dump.c             |  212 ++++++
 reftable/merged.c           |  366 ++++++++++
 reftable/merged.h           |   35 +
 reftable/merged_test.c      |  343 ++++++++++
 reftable/pq.c               |  115 ++++
 reftable/pq.h               |   32 +
 reftable/refname.c          |  209 ++++++
 reftable/refname.h          |   29 +
 reftable/refname_test.c     |  102 +++
 reftable/reftable-generic.h |   48 ++
 reftable/reftable-merged.h  |   68 ++
 reftable/reftable-stack.h   |  120 ++++
 reftable/reftable.c         |   98 +++
 reftable/stack.c            | 1260 +++++++++++++++++++++++++++++++++++
 reftable/stack.h            |   41 ++
 reftable/stack_test.c       |  803 ++++++++++++++++++++++
 t/helper/test-reftable.c    |    3 +
 19 files changed, 3889 insertions(+)
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c

diff --git a/Makefile b/Makefile
index c913e90b643..18cc18c2153 100644
--- a/Makefile
+++ b/Makefile
@@ -2394,10 +2394,14 @@ REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/merged.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/VERSION b/reftable/VERSION
new file mode 100644
index 00000000000..a67c0682e1b
--- /dev/null
+++ b/reftable/VERSION
@@ -0,0 +1 @@
+9b4a54059db9a05c270c0a0587f245bc6868d576 C: use rand() rather than cobbled together random generator.
diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 00000000000..00b444e8c9b
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { 0 };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 00000000000..ed02625bb54
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,366 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable-merged.h"
+#include "reftable-error.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_release(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_release(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_release(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int reftable_table_seek_record(struct reftable_table *tab,
+				      struct reftable_iterator *it,
+				      struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 00000000000..8c4d4d58d77
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,35 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_release(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 00000000000..09a70aa0fda
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,343 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-merged.h"
+#include "reftable-tests.h"
+#include "reftable-generic.h"
+#include "reftable-stack.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		EXPECT_ERR(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	EXPECT_ERR(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+	EXPECT(ref.update_index == 2);
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = {
+		{
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "c",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		}
+	};
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	EXPECT_ERR(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	EXPECT_ERR(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_merged_between);
+	RUN_TEST(test_pq);
+	RUN_TEST(test_merged);
+	RUN_TEST(test_default_write_opts);
+	return 0;
+}
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 00000000000..8918d158e2d
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 00000000000..385d2fb139a
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 00000000000..0f4eb3b292d
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable-error.h"
+#include "basics.h"
+#include "refname.h"
+#include "reftable-iterator.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static void modification_release(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_release(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 00000000000..a24b40fcb42
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,29 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable-record.h"
+#include "reftable-generic.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 00000000000..5e005d6af31
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,102 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-writer.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "destination", /* make sure it's not a symref.
+						*/
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		EXPECT(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_conflict);
+	return 0;
+}
diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h
new file mode 100644
index 00000000000..77eca3b4eb0
--- /dev/null
+++ b/reftable/reftable-generic.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_GENERIC_H
+#define REFTABLE_GENERIC_H
+
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-merged.h"
+
+/*
+ * Provides a unified API for reading tables, either merged tables, or single
+ * readers. */
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+#endif
diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h
new file mode 100644
index 00000000000..0e8ebe5a995
--- /dev/null
+++ b/reftable/reftable-merged.h
@@ -0,0 +1,68 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_MERGED_H
+#define REFTABLE_MERGED_H
+
+#include "reftable-iterator.h"
+
+/*
+ * Merged tables
+ *
+ * A ref database kept in a sequence of table files. The merged_table presents a
+ * unified view to reading (seeking, iterating) a sequence of immutable tables.
+ *
+ * The merged tables are on purpose kept disconnected from their actual storage
+ * (eg. files on disk), because it is useful to merge tables aren't files. For
+ * example, the per-workspace and global ref namespace can be implemented as a
+ * merged table of two stacks of file-backed reftables.
+ */
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+#endif
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
new file mode 100644
index 00000000000..b7060f111e8
--- /dev/null
+++ b/reftable/reftable-stack.h
@@ -0,0 +1,120 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_STACK_H
+#define REFTABLE_STACK_H
+
+#include "reftable-writer.h"
+
+/*
+ * The stack presents an interface to a mutable sequence of reftables.
+
+ * A stack can be mutated by pushing a table to the top of the stack.
+
+ * The reftable_stack automatically compacts files on disk to ensure good
+ * amortized performance.
+ *
+ * For windows and other platforms that cannot have open files as rename
+ * destinations, concurrent access from multiple processes needs the rand()
+ * random seed to be randomized.
+ */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+ *  stored in 'dir'. Typically, this should be .git/reftables.
+ */
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+ * returns a new transaction to add reftables to the given stack. As a side
+ * effect, the ref database is locked.
+ */
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+ * transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+ * reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+ * next write or reload, and should not be closed or deleted.
+ */
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 00000000000..dc4fd03d5b2
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+#include "reader.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 00000000000..10608089347
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1260 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-record.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+static int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		for (i = 0; i < st->readers_len; i++) {
+			reftable_reader_free(st->readers[i]);
+		}
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			strbuf_addstr(&table_path, st->reftable_dir);
+			strbuf_addstr(&table_path, "/");
+			strbuf_addstr(&table_path, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_release(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	uint32_t rnd = (uint32_t)rand();
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x",
+		 min, max, rnd);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_reset(&nm);
+		strbuf_addstr(&nm, add->stack->reftable_dir);
+		strbuf_addstr(&nm, "/");
+		strbuf_addstr(&nm, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	/* success, no more state to clean up. */
+	strbuf_release(&add->lock_file_name);
+	for (i = 0; i < add->new_tables_len; i++) {
+		reftable_free(add->new_tables[i]);
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	err = reftable_stack_reload(add->stack);
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&temp_tab_file_name, "/");
+	strbuf_addbuf(&temp_tab_file_name, &next_name);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&tab_file_name, "/");
+	strbuf_addbuf(&tab_file_name, &next_name);
+
+	/*
+	  On windows, this relies on rand() picking a unique destination name.
+	  Maybe we should do retry loop as well?
+	 */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[last]));
+
+	strbuf_reset(temp_tab);
+	strbuf_addstr(temp_tab, st->reftable_dir);
+	strbuf_addstr(temp_tab, "/");
+	strbuf_addbuf(temp_tab, &next_name);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.time < config->time) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_release(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_release(&ref);
+	reftable_log_record_release(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
+		strbuf_addstr(&subtab_file_name, "/");
+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	strbuf_reset(&new_table_path);
+	strbuf_addstr(&new_table_path, st->reftable_dir);
+	strbuf_addstr(&new_table_path, "/");
+	strbuf_addbuf(&new_table_path, &new_table_name);
+
+	if (!is_empty_table) {
+		/* retry? */
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_release(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 00000000000..f57005846e5
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "system.h"
+#include "reftable-writer.h"
+#include "reftable-stack.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 00000000000..f88a18e51fb
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,803 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+static char *get_tmp_template(const char *prefix)
+{
+	const char *tmp = getenv("TMPDIR");
+	static char template[1024];
+	snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX",
+		 tmp ? tmp : "/tmp", prefix);
+	return template;
+}
+
+static void test_read_file(void)
+{
+	char *fn = get_tmp_template(__FUNCTION__);
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	EXPECT(fd > 0);
+	n = write(fd, out, strlen(out));
+	EXPECT(n == strlen(out));
+	err = close(fd);
+	EXPECT(err >= 0);
+
+	err = read_lines(fn, &names);
+	EXPECT_ERR(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		EXPECT(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	EXPECT(NULL != names[0]);
+	EXPECT(0 == strcmp(names[0], "line"));
+	EXPECT(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	EXPECT(names_equal(a, a));
+	EXPECT(!names_equal(a, b));
+	EXPECT(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+
+	EXPECT(mkdtemp(dir));
+
+	/* simulate multi-process access to the same stack
+	   by creating two stacks for the same directory.
+	 */
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT_ERR(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_commit(add);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(REFTABLE_REF_SYMREF == dest.value_type);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		EXPECT(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		refs[i].value_type = REFTABLE_REF_VAL1;
+		refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+		set_test_hash(refs[i].value.val1, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_release(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_release(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = {
+		.refname = "branch",
+		.update_index = 1,
+		.new_hash = h1,
+		.old_hash = h2,
+	};
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	input.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	input.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.message, "one\n"));
+
+	input.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_release(&dest);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value_type = REFTABLE_REF_VAL1;
+			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value.val1, i);
+		}
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].new_hash, i);
+			logs[i].email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_log_record_release(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+	reftable_log_record_release(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	EXPECT_ERR(err);
+
+	EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE));
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	EXPECT(1 == fastlog2(3));
+	EXPECT(2 == fastlog2(4));
+	EXPECT(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(segs[2].log == 3);
+	EXPECT(segs[2].start == 5);
+	EXPECT(segs[2].end == 6);
+
+	EXPECT(segs[1].log == 2);
+	EXPECT(segs[1].start == 2);
+	EXPECT(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
+	EXPECT(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 1);
+	EXPECT(segs[0].start == 0);
+	EXPECT(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(min.start == 2);
+	EXPECT(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].time = i;
+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	EXPECT_ERR(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	EXPECT_ERR(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+	reftable_log_record_release(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_stack *st2 = NULL;
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+	clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err, i;
+	int N = 100;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+
+		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_stack_uptodate);
+	RUN_TEST(test_reftable_stack_transaction_api);
+	RUN_TEST(test_reftable_stack_hash_id);
+	RUN_TEST(test_sizes_to_segments_all_equal);
+	RUN_TEST(test_reftable_stack_auto_compaction);
+	RUN_TEST(test_reftable_stack_validate_refname);
+	RUN_TEST(test_reftable_stack_update_index_check);
+	RUN_TEST(test_reftable_stack_lock_failure);
+	RUN_TEST(test_reftable_stack_log_normalize);
+	RUN_TEST(test_reftable_stack_tombstone);
+	RUN_TEST(test_reftable_stack_add_one);
+	RUN_TEST(test_empty_add);
+	RUN_TEST(test_reflog_expire);
+	RUN_TEST(test_suggest_compaction_segment);
+	RUN_TEST(test_suggest_compaction_segment_nothing);
+	RUN_TEST(test_sizes_to_segments);
+	RUN_TEST(test_sizes_to_segments_empty);
+	RUN_TEST(test_log2);
+	RUN_TEST(test_parse_names);
+	RUN_TEST(test_read_file);
+	RUN_TEST(test_names_equal);
+	RUN_TEST(test_reftable_stack_add);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index fdf92586737..3b702f4855e 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,8 +5,11 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 13/15] Reftable support for git-core
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (11 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
  2021-02-22  0:41         ` [PATCH] refs: introduce API function to write invalid null ref Stefan Beller
  2020-12-09 14:00       ` [PATCH v4 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
                         ` (2 subsequent siblings)
  15 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a reftable-powered
ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or setting
GIT_TEST_REFTABLE in the environment.

Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Jeff King <peff@peff.net>
---
 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    4 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1435 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  199 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 21 files changed, 1797 insertions(+), 33 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdca..82c5940f143 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,12 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1].  Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
++
+extensions.refStorage::
+	Specify the ref storage mechanism to use.  The acceptable values are `files` and
+	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
+	this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1].  Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ffd..72576235833 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index 18cc18c2153..6b7c8a165f4 100644
--- a/Makefile
+++ b/Makefile
@@ -977,6 +977,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
@@ -2408,8 +2409,11 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cfe..974a374e9c9 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1138,7 +1138,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1273,7 +1273,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo, 1);
+		initialize_repository_version(hash_algo, 1,
+					      default_ref_storage());
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index dcb7015db48..e10dc712b8d 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -179,12 +179,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    !strcmp(ref_storage_format, "reftable"))
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -237,6 +239,7 @@ static int create_default_files(const char *template_path,
 	is_bare_repository_cfg = init_is_bare_repository;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -246,6 +249,24 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
+	/*
+	 * Check to see if .git/HEAD exists; this must happen before
+	 * initializing the ref db, because we want to see if there is an
+	 * existing HEAD.
+	 */
+	path = git_path_buf(&buf, "HEAD");
+	reinit = (!access(path, R_OK) ||
+		  readlink(path, junk, sizeof(junk) - 1) != -1);
+
+	/*
+	 * refs/heads is a file when using reftable. We can't reinitialize with
+	 * a reftable because it will overwrite HEAD
+	 */
+	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
+			      is_directory(git_path_buf(&buf, "refs/heads"))) {
+		die("cannot switch ref storage format.");
+	}
+
 	/*
 	 * We need to create a "refs" dir in any case so that older
 	 * versions of git can tell that this is a repository.
@@ -260,9 +281,6 @@ static int create_default_files(const char *template_path,
 	 * Point the HEAD symref to the initial branch with if HEAD does
 	 * not yet exist.
 	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
 	if (!reinit) {
 		char *ref;
 
@@ -279,7 +297,7 @@ static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
@@ -395,7 +413,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash, const char *initial_branch,
-	    unsigned int flags)
+	    const char *ref_storage_format, unsigned int flags)
 {
 	int reinit;
 	int exist_ok = flags & INIT_DB_EXIST_OK;
@@ -434,6 +452,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
 	check_repository_format(&repo_fmt);
+	repo_fmt.ref_storage = xstrdup(ref_storage_format);
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
@@ -487,6 +506,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
 		git_config_set("receive.denyNonFastforwards", "true");
 	}
 
+	if (!strcmp(ref_storage_format, "reftable"))
+		git_config_set("extensions.refStorage", ref_storage_format);
+
 	if (!(flags & INIT_DB_QUIET)) {
 		int len = strlen(git_dir);
 
@@ -560,6 +582,7 @@ static const char *const init_db_usage[] = {
 int cmd_init_db(int argc, const char **argv, const char *prefix)
 {
 	const char *git_dir;
+	const char *ref_storage_format = default_ref_storage();
 	const char *real_git_dir = NULL;
 	const char *work_tree;
 	const char *template_dir = NULL;
@@ -568,15 +591,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
 	const struct option init_db_options[] = {
-		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
-				N_("directory from which templates will be used")),
+		OPT_STRING(0, "template", &template_dir,
+			   N_("template-directory"),
+			   N_("directory from which templates will be used")),
 		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
-				N_("create a bare repository"), 1),
+			    N_("create a bare repository"), 1),
 		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
-			N_("permissions"),
-			N_("specify that the git repository is to be shared amongst several users"),
-			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
+		  N_("permissions"),
+		  N_("specify that the git repository is to be shared amongst several users"),
+		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
 		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
+		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
+			   N_("the ref storage format to use")),
 		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 			   N_("separate git dir from working tree")),
 		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
@@ -717,10 +743,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	}
 
 	UNLEAK(real_git_dir);
+	UNLEAK(ref_storage_format);
 	UNLEAK(git_dir);
 	UNLEAK(work_tree);
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, flags);
+		       initial_branch, ref_storage_format, flags);
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 197fd24a555..a102aedfca7 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -12,6 +12,7 @@
 #include "submodule.h"
 #include "utf8.h"
 #include "worktree.h"
+#include "../refs/refs-internal.h"
 
 static const char * const worktree_usage[] = {
 	N_("git worktree add [<options>] <path> [<commit-ish>]"),
@@ -402,9 +403,29 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+	} else {
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+		strbuf_reset(&sb);
+	}
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/cache.h b/cache.h
index e986cf4ea9c..545d2b72607 100644
--- a/cache.h
+++ b/cache.h
@@ -627,9 +627,10 @@ int path_inside_repo(const char *prefix, const char *path);
 #define INIT_DB_EXIST_OK 0x0002
 
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash_algo,
-	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+	    const char *template_dir, int hash_algo, const char *initial_branch,
+	    const char *ref_storage_format, unsigned int flags);
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format);
 
 void sanitize_stdfds(void);
 int daemonize(void);
@@ -1043,6 +1044,7 @@ struct repository_format {
 	int is_bare;
 	int hash_algo;
 	char *work_tree;
+	char *ref_storage;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
 };
diff --git a/config.mak.uname b/config.mak.uname
index 5b30a9154ac..4ab0191f5e6 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -701,7 +701,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba1..1a25789d285 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
diff --git a/refs.c b/refs.c
index 392f0bbf68b..1b874db3345 100644
--- a/refs.c
+++ b/refs.c
@@ -19,10 +19,16 @@
 #include "repository.h"
 #include "sigchain.h"
 
+const char *default_ref_storage(void)
+{
+	const char *test = getenv("GIT_TEST_REFTABLE");
+	return test ? "reftable" : "files";
+}
+
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static struct ref_storage_be *refs_backends = &refs_be_reftable;
 
 static struct ref_storage_be *find_ref_storage_backend(const char *name)
 {
@@ -1754,13 +1760,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(be_name);
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
@@ -1776,7 +1782,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir,
+					 r->ref_storage_format ?
+						       r->ref_storage_format :
+						       default_ref_storage(),
+					 REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1832,7 +1842,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 		goto done;
 
 	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1846,6 +1856,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 
 struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 {
+	const char *format = default_ref_storage();
 	struct ref_store *refs;
 	const char *id;
 
@@ -1859,9 +1870,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 
 	if (wt->id)
 		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
-				      REF_STORE_ALL_CAPS);
+				      format, REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(get_git_common_dir(), format,
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs.h b/refs.h
index 66955181569..7dc60472c96 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+/* Returns the ref storage backend to use by default. */
+const char *default_ref_storage(void);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c936..28166bf1f82 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -669,6 +669,7 @@ struct ref_storage_be {
 };
 
 extern struct ref_storage_be refs_be_files;
+extern struct ref_storage_be refs_be_reftable;
 extern struct ref_storage_be refs_be_packed;
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
new file mode 100644
index 00000000000..b7d12ca91de
--- /dev/null
+++ b/refs/reftable-backend.c
@@ -0,0 +1,1435 @@
+#include "../cache.h"
+#include "../chdir-notify.h"
+#include "../config.h"
+#include "../iterator.h"
+#include "../lockfile.h"
+#include "../refs.h"
+#include "../reftable/reftable-stack.h"
+#include "../reftable/reftable-record.h"
+#include "../reftable/reftable-error.h"
+#include "../reftable/reftable-blocksource.h"
+#include "../reftable/reftable-reader.h"
+#include "../reftable/reftable-iterator.h"
+#include "../reftable/reftable-merged.h"
+#include "../reftable/reftable-generic.h"
+#include "../worktree.h"
+#include "refs-internal.h"
+
+extern struct ref_storage_be refs_be_reftable;
+
+struct git_reftable_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	int err;
+	char *repo_dir;
+
+	char *reftable_dir;
+	char *worktree_reftable_dir;
+
+	struct reftable_stack *main_stack;
+	struct reftable_stack *worktree_stack;
+};
+
+static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
+					const char *refname)
+{
+	if (store->worktree_stack == NULL)
+		return store->main_stack;
+
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+		return store->worktree_stack;
+	default:
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return store->main_stack;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type);
+
+static void clear_reftable_log_record(struct reftable_log_record *log)
+{
+	log->old_hash = NULL;
+	log->new_hash = NULL;
+	log->message = NULL;
+	log->refname = NULL;
+	reftable_log_record_release(log);
+}
+
+static void fill_reftable_log_record(struct reftable_log_record *log)
+{
+	const char *info = git_committer_info(0);
+	struct ident_split split = { NULL };
+	int result = split_ident_line(&split, info, strlen(info));
+	int sign = 1;
+	assert(0 == result);
+
+	reftable_log_record_release(log);
+	log->name =
+		xstrndup(split.name_begin, split.name_end - split.name_begin);
+	log->email =
+		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
+	log->time = atol(split.date_begin);
+	if (*split.tz_begin == '-') {
+		sign = -1;
+		split.tz_begin++;
+	}
+	if (*split.tz_begin == '+') {
+		sign = 1;
+		split.tz_begin++;
+	}
+
+	log->tz_offset = sign * atoi(split.tz_begin);
+}
+
+static int has_suffix(struct strbuf *b, const char *suffix)
+{
+	size_t len = strlen(suffix);
+
+	if (len > b->len) {
+		return 0;
+	}
+
+	return 0 == strncmp(b->buf + b->len - len, suffix, len);
+}
+
+/* trims the last path component of b. Returns -1 if it is not
+ * present, or 0 on success
+ */
+static int trim_component(struct strbuf *b)
+{
+	char *last;
+	last = strrchr(b->buf, '/');
+	if (!last)
+		return -1;
+	strbuf_setlen(b, last - b->buf);
+	return 0;
+}
+
+/* Returns whether `b` is a worktree path, trimming it to the gitdir
+ */
+static int is_worktree(struct strbuf *b)
+{
+	if (trim_component(b) < 0) {
+		return 0;
+	}
+	if (!has_suffix(b, "/worktrees")) {
+		return 0;
+	}
+	trim_component(b);
+	return 1;
+}
+
+static struct ref_store *git_reftable_ref_store_create(const char *path,
+						       unsigned int store_flags)
+{
+	struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+	struct reftable_write_options cfg = {
+		.block_size = 4096,
+		.hash_id = the_hash_algo->format_id,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	const char *gitdir = path;
+	struct strbuf wt_buf = STRBUF_INIT;
+	int wt = 0;
+
+	strbuf_addstr(&wt_buf, path);
+
+	/* this is clumsy, but the official worktree functions (eg.
+	 * get_worktrees()) function will try to initialize a ref storage
+	 * backend, leading to infinite recursion.  */
+	wt = is_worktree(&wt_buf);
+	if (wt) {
+		gitdir = wt_buf.buf;
+	}
+
+	base_ref_store_init(ref_store, &refs_be_reftable);
+	ref_store->gitdir = xstrdup(gitdir);
+	refs->store_flags = store_flags;
+	strbuf_addf(&sb, "%s/reftable", gitdir);
+	refs->reftable_dir = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	refs->err =
+		reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg);
+	assert(refs->err != REFTABLE_API_ERROR);
+
+	if (refs->err == 0 && wt) {
+		strbuf_addf(&sb, "%s/reftable", path);
+		refs->worktree_reftable_dir = xstrdup(sb.buf);
+
+		refs->err = reftable_new_stack(&refs->worktree_stack,
+					       refs->worktree_reftable_dir,
+					       cfg);
+		assert(refs->err != REFTABLE_API_ERROR);
+	}
+
+	strbuf_release(&sb);
+	strbuf_release(&wt_buf);
+	return ref_store;
+}
+
+static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct strbuf sb = STRBUF_INIT;
+
+	safe_create_dir(refs->reftable_dir, 1);
+	assert(refs->worktree_reftable_dir == NULL);
+
+	strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+	write_file(sb.buf, "ref: refs/heads/.invalid");
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+	safe_create_dir(sb.buf, 1);
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+	write_file(sb.buf, "this repository uses the reftable format");
+
+	return 0;
+}
+
+struct git_reftable_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_ref_record ref;
+	struct object_id oid;
+	struct ref_store *ref_store;
+
+	/* In case we must iterate over 2 stacks, this is non-null. */
+	struct reftable_merged_table *merged;
+	unsigned int flags;
+	int err;
+	const char *prefix;
+};
+
+static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	while (ri->err == 0) {
+		ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref);
+		if (ri->err) {
+			break;
+		}
+
+		if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) {
+			/*
+			  pseudorefs, eg. HEAD, FETCH_HEAD should not be
+			  produced, by default.
+			 */
+			continue;
+		}
+		ri->base.refname = ri->ref.refname;
+		if (ri->prefix != NULL &&
+		    strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) {
+			ri->err = 1;
+			break;
+		}
+		if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+		    ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE)
+			continue;
+
+		ri->base.flags = 0;
+		switch (ri->ref.value_type) {
+		case REFTABLE_REF_VAL1:
+			hashcpy(ri->oid.hash, ri->ref.value.val1);
+			break;
+		case REFTABLE_REF_VAL2:
+			hashcpy(ri->oid.hash, ri->ref.value.val2.value);
+			break;
+		case REFTABLE_REF_SYMREF: {
+			int out_flags = 0;
+			const char *resolved = refs_resolve_ref_unsafe(
+				ri->ref_store, ri->ref.refname,
+				RESOLVE_REF_READING, &ri->oid, &out_flags);
+			ri->base.flags = out_flags;
+			if (resolved == NULL &&
+			    !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+			    (ri->base.flags & REF_ISBROKEN)) {
+				continue;
+			}
+			break;
+		}
+		default:
+			abort();
+		}
+
+		ri->base.oid = &ri->oid;
+		if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		    !ref_resolves_to_object(ri->base.refname, ri->base.oid,
+					    ri->base.flags)) {
+			continue;
+		}
+
+		break;
+	}
+
+	if (ri->err > 0) {
+		return ITER_DONE;
+	}
+	if (ri->err < 0) {
+		return ITER_ERROR;
+	}
+
+	return ITER_OK;
+}
+
+static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
+		hashcpy(peeled->hash, ri->ref.value.val2.target_value);
+		return 0;
+	}
+
+	return -1;
+}
+
+static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	reftable_ref_record_release(&ri->ref);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged) {
+		reftable_merged_table_free(ri->merged);
+	}
+	return 0;
+}
+
+static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
+	reftable_ref_iterator_advance, reftable_ref_iterator_peel,
+	reftable_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
+				unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri));
+
+	if (refs->err < 0) {
+		ri->err = refs->err;
+	} else if (refs->worktree_stack == NULL) {
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(refs->main_stack);
+		ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix);
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		ri->err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						    the_hash_algo->format_id);
+		if (ri->err == 0)
+			ri->err = reftable_merged_table_seek_ref(
+				ri->merged, &ri->iter, prefix);
+	}
+
+	base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1);
+	ri->prefix = prefix;
+	ri->base.oid = &ri->oid;
+	ri->flags = flags;
+	ri->ref_store = ref_store;
+	return &ri->base;
+}
+
+static int fixup_symrefs(struct ref_store *ref_store,
+			 struct ref_transaction *transaction)
+{
+	struct strbuf referent = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct object_id old_oid;
+
+		err = git_reftable_read_raw_ref(ref_store, update->refname,
+						&old_oid, &referent,
+						/* mutate input, like
+						   files-backend.c */
+						&update->type);
+		if (err < 0 && errno == ENOENT &&
+		    is_null_oid(&update->old_oid)) {
+			err = 0;
+		}
+		if (err < 0)
+			goto done;
+
+		if (!(update->type & REF_ISSYMREF))
+			continue;
+
+		if (update->flags & REF_NO_DEREF) {
+			/* what should happen here? See files-backend.c
+			 * lock_ref_for_update. */
+		} else {
+			/*
+			  If we are updating a symref (eg. HEAD), we should also
+			  update the branch that the symref points to.
+
+			  This is generic functionality, and would be better
+			  done in refs.c, but the current implementation is
+			  intertwined with the locking in files-backend.c.
+			*/
+			int new_flags = update->flags;
+			struct ref_update *new_update = NULL;
+
+			/* if this is an update for HEAD, should also record a
+			   log entry for HEAD? See files-backend.c,
+			   split_head_update()
+			*/
+			new_update = ref_transaction_add_update(
+				transaction, referent.buf, new_flags,
+				&update->new_oid, &update->old_oid,
+				update->msg);
+			new_update->parent_update = update;
+
+			/* files-backend sets REF_LOG_ONLY here. */
+			update->flags |= REF_NO_DEREF | REF_LOG_ONLY;
+			update->flags &= ~REF_HAVE_OLD;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	strbuf_release(&referent);
+	return err;
+}
+
+static int git_reftable_transaction_prepare(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *errbuf)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_addition *add = NULL;
+	struct reftable_stack *stack =
+		transaction->nr ?
+			      stack_for(refs, transaction->updates[0]->refname) :
+			      refs->main_stack;
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+
+	err = fixup_symrefs(ref_store, transaction);
+	if (err) {
+		goto done;
+	}
+
+	transaction->backend_data = add;
+	transaction->state = REF_TRANSACTION_PREPARED;
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	if (err < 0) {
+		transaction->state = REF_TRANSACTION_CLOSED;
+		strbuf_addf(errbuf, "reftable: transaction prepare: %s",
+			    reftable_error_str(err));
+	}
+
+	return err;
+}
+
+static int git_reftable_transaction_abort(struct ref_store *ref_store,
+					  struct ref_transaction *transaction,
+					  struct strbuf *err)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	reftable_addition_destroy(add);
+	transaction->backend_data = NULL;
+	return 0;
+}
+
+static int reftable_check_old_oid(struct ref_store *refs, const char *refname,
+				  struct object_id *want_oid)
+{
+	struct object_id out_oid;
+	int out_flags = 0;
+	const char *resolved = refs_resolve_ref_unsafe(
+		refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags);
+	if (is_null_oid(want_oid) != (resolved == NULL)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	if (resolved != NULL && !oideq(&out_oid, want_oid)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	return 0;
+}
+
+static int ref_update_cmp(const void *a, const void *b)
+{
+	return strcmp((*(struct ref_update **)a)->refname,
+		      (*(struct ref_update **)b)->refname);
+}
+
+static int write_transaction_table(struct reftable_writer *writer, void *arg)
+{
+	struct ref_transaction *transaction = (struct ref_transaction *)arg;
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)transaction->ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, transaction->updates[0]->refname);
+	uint64_t ts = reftable_stack_next_update_index(stack);
+	int err = 0;
+	int i = 0;
+	struct reftable_log_record *logs =
+		calloc(transaction->nr, sizeof(*logs));
+	struct ref_update **sorted =
+		malloc(transaction->nr * sizeof(struct ref_update *));
+	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
+	QSORT(sorted, transaction->nr, ref_update_cmp);
+	reftable_writer_set_limits(writer, ts, ts);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = sorted[i];
+		struct reftable_log_record *log = &logs[i];
+		fill_reftable_log_record(log);
+		log->refname = (char *)u->refname;
+		log->old_hash = u->old_oid.hash;
+		log->new_hash = u->new_oid.hash;
+		log->update_index = ts;
+		log->message = u->msg;
+
+		if (u->flags & REF_LOG_ONLY) {
+			continue;
+		}
+
+		if (u->flags & REF_HAVE_NEW) {
+			struct reftable_ref_record ref = { NULL };
+			struct object_id peeled;
+
+			int peel_error = peel_object(&u->new_oid, &peeled);
+			ref.refname = (char *)u->refname;
+			ref.update_index = ts;
+
+			if (!peel_error) {
+				ref.value_type = REFTABLE_REF_VAL2;
+				ref.value.val2.target_value = peeled.hash;
+				ref.value.val2.value = u->new_oid.hash;
+			} else if (!is_null_oid(&u->new_oid)) {
+				ref.value_type = REFTABLE_REF_VAL1;
+				ref.value.val1 = u->new_oid.hash;
+			}
+
+			err = reftable_writer_add_ref(writer, &ref);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+
+	for (i = 0; i < transaction->nr; i++) {
+		err = reftable_writer_add_log(writer, &logs[i]);
+		clear_reftable_log_record(&logs[i]);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	free(logs);
+	free(sorted);
+	return err;
+}
+
+static int git_reftable_transaction_finish(struct ref_store *ref_store,
+					   struct ref_transaction *transaction,
+					   struct strbuf *errmsg)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	int err = 0;
+	int i;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = transaction->updates[i];
+		if (u->flags & REF_HAVE_OLD) {
+			err = reftable_check_old_oid(transaction->ref_store,
+						     u->refname, &u->old_oid);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+	if (transaction->nr) {
+		err = reftable_addition_add(add, &write_transaction_table,
+					    transaction);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	err = reftable_addition_commit(add);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	transaction->state = REF_TRANSACTION_CLOSED;
+	transaction->backend_data = NULL;
+	if (err) {
+		strbuf_addf(errmsg, "reftable: transaction failure: %s",
+			    reftable_error_str(err));
+		return -1;
+	}
+	return err;
+}
+
+static int
+git_reftable_transaction_initial_commit(struct ref_store *ref_store,
+					struct ref_transaction *transaction,
+					struct strbuf *errmsg)
+{
+	int err = git_reftable_transaction_prepare(ref_store, transaction,
+						   errmsg);
+	if (err)
+		return err;
+
+	return git_reftable_transaction_finish(ref_store, transaction, errmsg);
+}
+
+struct write_delete_refs_arg {
+	struct reftable_stack *stack;
+	struct string_list *refnames;
+	const char *logmsg;
+	unsigned int flags;
+};
+
+static int write_delete_refs_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_delete_refs_arg *arg =
+		(struct write_delete_refs_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = 0;
+	int i = 0;
+
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_ref_record ref = {
+			.refname = (char *)arg->refnames->items[i].string,
+			.value_type = REFTABLE_REF_DELETION,
+			.update_index = ts,
+		};
+		err = reftable_writer_add_ref(writer, &ref);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_log_record log = {
+			.update_index = ts,
+		};
+		struct reftable_ref_record current = { NULL };
+		fill_reftable_log_record(&log);
+		log.message = xstrdup(arg->logmsg);
+		log.new_hash = NULL;
+		log.old_hash = NULL;
+		log.update_index = ts;
+		log.refname = (char *)arg->refnames->items[i].string;
+
+		if (reftable_stack_read_ref(arg->stack, log.refname,
+					    &current) == 0) {
+			log.old_hash = reftable_ref_record_val1(&current);
+		}
+		err = reftable_writer_add_log(writer, &log);
+		log.old_hash = NULL;
+		reftable_ref_record_release(&current);
+
+		clear_reftable_log_record(&log);
+		if (err < 0) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int git_reftable_delete_refs(struct ref_store *ref_store,
+				    const char *msg,
+				    struct string_list *refnames,
+				    unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, refnames->items[0].string);
+	struct write_delete_refs_arg arg = {
+		.stack = stack,
+		.refnames = refnames,
+		.logmsg = msg,
+		.flags = flags,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	string_list_sort(refnames);
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_delete_refs_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_pack_refs(struct ref_store *ref_store,
+				  unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	int err = refs->err;
+	if (err < 0) {
+		return err;
+	}
+	err = reftable_stack_compact_all(refs->main_stack, NULL);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
+	return err;
+}
+
+struct write_create_symref_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	const char *refname;
+	const char *target;
+	const char *logmsg;
+};
+
+static int write_create_symref_table(struct reftable_writer *writer, void *arg)
+{
+	struct write_create_symref_arg *create =
+		(struct write_create_symref_arg *)arg;
+	uint64_t ts = reftable_stack_next_update_index(create->stack);
+	int err = 0;
+
+	struct reftable_ref_record ref = {
+		.refname = (char *)create->refname,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = (char *)create->target,
+		.update_index = ts,
+	};
+	reftable_writer_set_limits(writer, ts, ts);
+	err = reftable_writer_add_ref(writer, &ref);
+	if (err == 0) {
+		struct reftable_log_record log = { NULL };
+		struct object_id new_oid;
+		struct object_id old_oid;
+
+		fill_reftable_log_record(&log);
+		log.refname = (char *)create->refname;
+		log.message = (char *)create->logmsg;
+		log.update_index = ts;
+		if (refs_resolve_ref_unsafe(
+			    (struct ref_store *)create->refs, create->refname,
+			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
+			log.old_hash = old_oid.hash;
+		}
+
+		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
+					    create->target, RESOLVE_REF_READING,
+					    &new_oid, NULL) != NULL) {
+			log.new_hash = new_oid.hash;
+		}
+
+		if (log.old_hash != NULL || log.new_hash != NULL) {
+			err = reftable_writer_add_log(writer, &log);
+		}
+		log.refname = NULL;
+		log.message = NULL;
+		log.old_hash = NULL;
+		log.new_hash = NULL;
+		clear_reftable_log_record(&log);
+	}
+	return err;
+}
+
+static int git_reftable_create_symref(struct ref_store *ref_store,
+				      const char *refname, const char *target,
+				      const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_create_symref_arg arg = { .refs = refs,
+					       .stack = stack,
+					       .refname = refname,
+					       .target = target,
+					       .logmsg = logmsg };
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_create_symref_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct write_rename_arg {
+	struct reftable_stack *stack;
+	const char *oldname;
+	const char *newname;
+	const char *logmsg;
+};
+
+static int write_rename_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record ref = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref);
+
+	if (err) {
+		goto done;
+	}
+
+	/* XXX do ref renames overwrite the target? */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) {
+		goto done;
+	}
+
+	reftable_writer_set_limits(writer, ts, ts);
+
+	{
+		struct reftable_ref_record todo[2] = {
+			{
+				.refname = (char *)arg->oldname,
+				.update_index = ts,
+				.value_type = REFTABLE_REF_DELETION,
+			},
+			ref,
+		};
+		todo[1].update_index = ts;
+		free(todo[1].refname);
+		todo[1].refname = strdup(arg->newname);
+
+		err = reftable_writer_add_refs(writer, todo, 2);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	if (reftable_ref_record_val1(&ref)) {
+		uint8_t *val1 = reftable_ref_record_val1(&ref);
+		struct reftable_log_record todo[2] = { { NULL } };
+		fill_reftable_log_record(&todo[0]);
+		fill_reftable_log_record(&todo[1]);
+
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		todo[0].message = (char *)arg->logmsg;
+		todo[0].old_hash = val1;
+		todo[0].new_hash = NULL;
+
+		todo[1].refname = (char *)arg->newname;
+		todo[1].update_index = ts;
+		todo[1].old_hash = NULL;
+		todo[1].new_hash = val1;
+		todo[1].message = (char *)arg->logmsg;
+
+		err = reftable_writer_add_logs(writer, todo, 2);
+
+		clear_reftable_log_record(&todo[0]);
+		clear_reftable_log_record(&todo[1]);
+
+		if (err < 0) {
+			goto done;
+		}
+
+	} else {
+		/* XXX symrefs? */
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static int git_reftable_rename_ref(struct ref_store *ref_store,
+				   const char *oldrefname,
+				   const char *newrefname, const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_rename_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_copy_ref(struct ref_store *ref_store,
+				 const char *oldrefname, const char *newrefname,
+				 const char *logmsg)
+{
+	BUG("reftable reference store does not support copying references");
+}
+
+struct git_reftable_reflog_ref_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_log_record log;
+	struct object_id oid;
+
+	/* Used when iterating over worktree & main */
+	struct reftable_merged_table *merged;
+	char *last_name;
+};
+
+static int
+git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+
+	while (1) {
+		int err = reftable_iterator_next_log(&ri->iter, &ri->log);
+		if (err > 0) {
+			return ITER_DONE;
+		}
+		if (err < 0) {
+			return ITER_ERROR;
+		}
+
+		ri->base.refname = ri->log.refname;
+		if (ri->last_name != NULL &&
+		    !strcmp(ri->log.refname, ri->last_name)) {
+			/* we want the refnames that we have reflogs for, so we
+			 * skip if we've already produced this name. This could
+			 * be faster by seeking directly to
+			 * reflog@update_index==0.
+			 */
+			continue;
+		}
+
+		free(ri->last_name);
+		ri->last_name = xstrdup(ri->log.refname);
+		hashcpy(ri->oid.hash, ri->log.new_hash);
+		return ITER_OK;
+	}
+}
+
+static int
+git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	BUG("not supported.");
+	return -1;
+}
+
+static int
+git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+	reftable_log_record_release(&ri->log);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged)
+		reftable_merged_table_free(ri->merged);
+	return 0;
+}
+
+static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = {
+	git_reftable_reflog_ref_iterator_advance,
+	git_reftable_reflog_ref_iterator_peel,
+	git_reftable_reflog_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(sizeof(*ri), 1);
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+
+	if (refs->worktree_stack == NULL) {
+		struct reftable_stack *stack = refs->main_stack;
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(stack);
+		int err = reftable_merged_table_seek_log(mt, &ri->iter, "");
+		if (err < 0) {
+			free(ri);
+			/* XXX is this allowed? */
+			return NULL;
+		}
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		int err = 0;
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						the_hash_algo->format_id);
+		if (err < 0) {
+			free(tabs);
+			/* XXX see above */
+			return NULL;
+		}
+		err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, "");
+		if (err < 0) {
+			return NULL;
+		}
+	}
+	base_ref_iterator_init(&ri->base,
+			       &git_reftable_reflog_ref_iterator_vtable, 1);
+	ri->base.oid = &ri->oid;
+
+	return (struct ref_iterator *)ri;
+}
+
+static int git_reftable_for_each_reflog_ent_newest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_log_record log = { NULL };
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	while (err == 0) {
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		hashcpy(old_oid.hash, log.old_hash);
+		hashcpy(new_oid.hash, log.new_hash);
+
+		full_committer = fmt_ident(log.name, log.email,
+					   WANT_COMMITTER_IDENT,
+					   /*date*/ NULL, IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log.time,
+			 log.tz_offset, log.message, cb_data);
+		if (err)
+			break;
+	}
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_for_each_reflog_ent_oldest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reftable_log_record *logs = NULL;
+	int cap = 0;
+	int len = 0;
+	int err = 0;
+	int i = 0;
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+
+	while (err == 0) {
+		struct reftable_log_record log = { NULL };
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			logs = realloc(logs, cap * sizeof(*logs));
+		}
+
+		logs[len++] = log;
+	}
+
+	for (i = len; i--;) {
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		hashcpy(old_oid.hash, log->old_hash);
+		hashcpy(new_oid.hash, log->new_hash);
+
+		full_committer = fmt_ident(log->name, log->email,
+					   WANT_COMMITTER_IDENT, NULL,
+					   IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log->time,
+			 log->tz_offset, log->message, cb_data);
+		if (err) {
+			break;
+		}
+	}
+
+	for (i = 0; i < len; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	free(logs);
+
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_reflog_exists(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_log_record log = { NULL };
+	int err = refs->err;
+
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err) {
+		goto done;
+	}
+	err = reftable_iterator_next_log(&it, &log);
+	if (err) {
+		goto done;
+	}
+
+	if (strcmp(log.refname, refname)) {
+		err = 1;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return !err;
+}
+
+static int git_reftable_create_reflog(struct ref_store *ref_store,
+				      const char *refname, int force_create,
+				      struct strbuf *err)
+{
+	return 0;
+}
+
+static int git_reftable_delete_reflog(struct ref_store *ref_store,
+				      const char *refname)
+{
+	return 0;
+}
+
+struct reflog_expiry_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	struct reftable_log_record *tombstones;
+	int len;
+	int cap;
+};
+
+static void clear_log_tombstones(struct reflog_expiry_arg *arg)
+{
+	int i = 0;
+	for (; i < arg->len; i++) {
+		reftable_log_record_release(&arg->tombstones[i]);
+	}
+
+	FREE_AND_NULL(arg->tombstones);
+}
+
+static void add_log_tombstone(struct reflog_expiry_arg *arg,
+			      const char *refname, uint64_t ts)
+{
+	struct reftable_log_record tombstone = {
+		.refname = xstrdup(refname),
+		.update_index = ts,
+	};
+	if (arg->len == arg->cap) {
+		arg->cap = 2 * arg->cap + 1;
+		arg->tombstones =
+			realloc(arg->tombstones, arg->cap * sizeof(tombstone));
+	}
+	arg->tombstones[arg->len++] = tombstone;
+}
+
+static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
+{
+	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int i = 0;
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->len; i++) {
+		int err = reftable_writer_add_log(writer, &arg->tombstones[i]);
+		if (err) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname,
+			   const struct object_id *oid, unsigned int flags,
+			   reflog_expiry_prepare_fn prepare_fn,
+			   reflog_expiry_should_prune_fn should_prune_fn,
+			   reflog_expiry_cleanup_fn cleanup_fn,
+			   void *policy_cb_data)
+{
+	/*
+	  For log expiry, we write tombstones in place of the expired entries,
+	  This means that the entries are still retrievable by delving into the
+	  stack, and expiring entries paradoxically takes extra memory.
+
+	  This memory is only reclaimed when some operation issues a
+	  git_reftable_pack_refs(), which will compact the entire stack and get
+	  rid of deletion entries.
+
+	  It would be better if the refs backend supported an API that sets a
+	  criterion for all refs, passing the criterion to pack_refs().
+	*/
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reflog_expiry_arg arg = {
+		.stack = stack,
+		.refs = refs,
+	};
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err < 0) {
+		goto done;
+	}
+
+	while (1) {
+		struct object_id ooid;
+		struct object_id noid;
+
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, refname)) {
+			break;
+		}
+		hashcpy(ooid.hash, log.old_hash);
+		hashcpy(noid.hash, log.new_hash);
+
+		if (should_prune_fn(&ooid, &noid, log.email,
+				    (timestamp_t)log.time, log.tz_offset,
+				    log.message, policy_cb_data)) {
+			add_log_tombstone(&arg, refname, log.update_index);
+		}
+	}
+	err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	clear_log_tombstones(&arg);
+	return err;
+}
+
+static int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	/* This is usually not needed, but Git doesn't signal to ref backend if
+	   a subprocess updated the ref DB.  So we always check.
+	*/
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_read_ref(stack, refname, &ref);
+	if (err > 0) {
+		errno = ENOENT;
+		err = -1;
+		goto done;
+	}
+	if (err < 0) {
+		errno = reftable_error_to_errno(err);
+		err = -1;
+		goto done;
+	}
+
+	if (ref.value_type == REFTABLE_REF_SYMREF) {
+		strbuf_reset(referent);
+		strbuf_addstr(referent, ref.value.symref);
+		*type |= REF_ISSYMREF;
+	} else if (reftable_ref_record_val1(&ref) != NULL) {
+		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
+	} else {
+		*type |= REF_ISBROKEN;
+		errno = EINVAL;
+		err = -1;
+	}
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+struct ref_storage_be refs_be_reftable = {
+	&refs_be_files,
+	"reftable",
+	git_reftable_ref_store_create,
+	git_reftable_init_db,
+	git_reftable_transaction_prepare,
+	git_reftable_transaction_finish,
+	git_reftable_transaction_abort,
+	git_reftable_transaction_initial_commit,
+
+	git_reftable_pack_refs,
+	git_reftable_create_symref,
+	git_reftable_delete_refs,
+	git_reftable_rename_ref,
+	git_reftable_copy_ref,
+
+	git_reftable_ref_iterator_begin,
+	git_reftable_read_raw_ref,
+
+	git_reftable_reflog_iterator_begin,
+	git_reftable_for_each_reflog_ent_oldest_first,
+	git_reftable_for_each_reflog_ent_newest_first,
+	git_reftable_reflog_exists,
+	git_reftable_create_reflog,
+	git_reftable_delete_reflog,
+	git_reftable_reflog_expire,
+};
diff --git a/repository.c b/repository.c
index a4174ddb062..ff0988dac84 100644
--- a/repository.c
+++ b/repository.c
@@ -174,6 +174,8 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	repo->ref_storage_format = xstrdup_or_null(format.ref_storage);
+
 	clear_repository_format(&format);
 	return 0;
 
diff --git a/repository.h b/repository.h
index b385ca3c94b..8019a7d0a1f 100644
--- a/repository.h
+++ b/repository.h
@@ -78,6 +78,9 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/* The format to use for the ref database. */
+	char *ref_storage_format;
+
 	/*
 	 * Contains path to often used file names.
 	 */
diff --git a/setup.c b/setup.c
index c04cd25a30d..c6b57efd031 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		data->ref_storage = xstrdup(value);
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format)
 	string_list_clear(&format->v1_only_extensions, 0);
 	free(format->work_tree);
 	free(format->partial_clone);
+	free(format->ref_storage);
 	init_repository_format(format);
 }
 
@@ -1308,8 +1312,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			the_repository->ref_storage_format =
+				xstrdup_or_null(repo_fmt.ref_storage);
+		}
 	}
 
 	strbuf_release(&dir);
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
new file mode 100755
index 00000000000..58c7d5d4bcd
--- /dev/null
+++ b/t/t0031-reftable.sh
@@ -0,0 +1,199 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable basics'
+
+. ./test-lib.sh
+
+INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+initialize ()  {
+	rm -rf .git &&
+	git init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled
+}
+
+test_expect_success 'SHA256 support, env' '
+	rm -rf .git &&
+	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
+	git init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'SHA256 support, option' '
+	rm -rf .git &&
+	git init --ref-storage=reftable --object-format=sha256 &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'delete ref' '
+	initialize &&
+	test_commit file &&
+	SHA=$(git show-ref -s --verify HEAD) &&
+	test_write_lines "$SHA refs/heads/master" "$SHA refs/tags/file" >expect &&
+	git show-ref > actual &&
+	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
+	test_cmp expect actual &&
+	git update-ref -d refs/tags/file $SHA  &&
+	test_write_lines "$SHA refs/heads/master" >expect &&
+	git show-ref > actual &&
+	test_cmp expect actual
+'
+
+
+test_expect_success 'clone calls transaction_initial_commit' '
+	test_commit message1 file1 &&
+	git clone . cloned &&
+	(test  -f cloned/file1 || echo "Fixme.")
+'
+
+test_expect_success 'basic operation of reftable storage: commit, show-ref' '
+	initialize &&
+	test_commit file &&
+	test_write_lines refs/heads/master refs/tags/file >expect &&
+	git show-ref &&
+	git show-ref | cut -f2 -d" " > actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'reflog, repack' '
+	initialize &&
+	for count in $(test_seq 1 10)
+	do
+		test_commit "number $count" file.t $count number-$count ||
+		return 1
+	done &&
+	git pack-refs &&
+	ls -1 .git/reftable >table-files &&
+	test_line_count = 2 table-files &&
+	git reflog refs/heads/master >output &&
+	test_line_count = 10 output &&
+	grep "commit (initial): number 1" output &&
+	grep "commit: number 10" output &&
+	git gc &&
+	git reflog refs/heads/master >output &&
+	test_line_count = 0 output
+'
+
+test_expect_success 'branch switch in reflog output' '
+	initialize &&
+	test_commit file1 &&
+	git checkout -b branch1 &&
+	test_commit file2 &&
+	git checkout -b branch2 &&
+	git switch - &&
+	git rev-parse --symbolic-full-name HEAD > actual &&
+	echo refs/heads/branch1 > expect &&
+	test_cmp actual expect
+'
+
+
+# This matches show-ref's output
+print_ref() {
+	echo "$(git rev-parse "$1") $1"
+}
+
+test_expect_success 'peeled tags are stored' '
+	initialize &&
+	test_commit file &&
+	git tag -m "annotated tag" test_tag HEAD &&
+	{
+		print_ref "refs/heads/master" &&
+		print_ref "refs/tags/file" &&
+		print_ref "refs/tags/test_tag" &&
+		print_ref "refs/tags/test_tag^{}"
+	} >expect &&
+	git show-ref -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show-ref works on fresh repo' '
+	initialize &&
+	rm -rf .git &&
+	git init --ref-storage=reftable &&
+	>expect &&
+	! git show-ref > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'checkout unborn branch' '
+	initialize &&
+	git checkout -b master
+'
+
+
+test_expect_success 'dir/file conflict' '
+	initialize &&
+	test_commit file &&
+	! git branch master/forbidden
+'
+
+
+test_expect_success 'do not clobber existing repo' '
+	rm -rf .git &&
+	git init --ref-storage=files &&
+	cat .git/HEAD > expect &&
+	test_commit file &&
+	(git init --ref-storage=reftable || true) &&
+	cat .git/HEAD > actual &&
+	test_cmp expect actual
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'pseudo refs' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git cherry-pick source &&
+	test -f file2
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'rebase' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git rebase source &&
+	test -f file2
+'
+
+test_expect_success 'worktrees' '
+	git init --ref-storage=reftable start &&
+	(cd start && test_commit file1 && git checkout -b branch1 &&
+	git checkout -b branch2 &&
+	git worktree add  ../wt
+	) &&
+	cd wt &&
+	git checkout branch1 &&
+	git branch
+'
+
+test_expect_success 'worktrees 2' '
+	initialize &&
+	test_commit file1 &&
+	mkdir existing_empty &&
+	git worktree add --detach existing_empty master
+'
+
+test_expect_success 'FETCH_HEAD' '
+	initialize &&
+	test_commit one &&
+	(git init sub && cd sub && test_commit two) &&
+	git --git-dir sub/.git rev-parse HEAD >expect &&
+	git fetch sub &&
+	git checkout FETCH_HEAD &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh
index be12fb63506..c6f78325563 100755
--- a/t/t1409-avoid-packing-refs.sh
+++ b/t/t1409-avoid-packing-refs.sh
@@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily'
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 # Add an identifying mark to the packed-refs file header line. This
 # shouldn't upset readers, and it should be omitted if the file is
 # ever rewritten.
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index b17f5c21fbc..cc5d01571a4 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success setup '
 	git config gc.auto 0 &&
 	git config i18n.commitencoding ISO-8859-1 &&
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index f41b2afb996..edaef2c175a 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -11,6 +11,12 @@ semantic is still the same.
 '
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success 'enable reflogs' '
 	git config core.logallrefupdates true
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index a863ccee7e9..7b638e0b8c8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1520,6 +1520,11 @@ parisc* | hppa*)
 	;;
 esac
 
+if test -n "$GIT_TEST_REFTABLE"
+then
+  test_set_prereq REFTABLE
+fi
+
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 14/15] git-prompt: prepare for reftable refs backend
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (12 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2020-12-09 14:00       ` SZEDER Gábor via GitGitGadget
  2020-12-09 14:00       ` [PATCH v4 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  15 siblings, 0 replies; 251+ messages in thread
From: SZEDER Gábor via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, SZEDER Gábor

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

In our git-prompt script we strive to use Bash builtins wherever
possible, because fork()-ing subshells for command substitutions and
fork()+exec()-ing Git commands are expensive on some platforms.  We
even read and parse '.git/HEAD' using Bash builtins to get the name of
the current branch [1].  However, the upcoming reftable refs backend
won't use '.git/HEAD' at all, but will write an invalid refname as
placeholder for backwards compatibility instead, which will break our
git-prompt script.

Update the git-prompt script to recognize the placeholder '.git/HEAD'
written by the reftable backend (its content is specified in the
reftable specs), and then fall back to use 'git symbolic-ref' to get
the name of the current branch.

[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current
    branch, 2011-03-31)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/completion/git-prompt.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 4640a1535d1..2e5a5d80271 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -478,10 +478,15 @@ __git_ps1 ()
 			if ! __git_eread "$g/HEAD" head; then
 				return $exit
 			fi
-			# is it a symbolic ref?
 			b="${head#ref: }"
 			if [ "$head" = "$b" ]; then
 				detached=yes
+			elif [ "$b" = "refs/heads/.invalid" ]; then
+				# Reftable
+				b="$(git symbolic-ref HEAD 2>/dev/null)" ||
+				detached=yes
+			fi
+			if [ "$detached" = yes ]; then
 				b="$(
 				case "${GIT_PS1_DESCRIBE_STYLE-}" in
 				(contains)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v4 15/15] Add "test-tool dump-reftable" command.
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (13 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
@ 2020-12-09 14:00       ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2020-12-09 14:00 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  1 +
 reftable/dump.c          | 67 ++++++++++++++++++++--------------------
 t/helper/test-reftable.c |  5 +++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 5 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/Makefile b/Makefile
index 6b7c8a165f4..2f9bf61fffb 100644
--- a/Makefile
+++ b/Makefile
@@ -2409,6 +2409,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
diff --git a/reftable/dump.c b/reftable/dump.c
index 00b444e8c9b..7d620a3cf0f 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -12,18 +12,25 @@ license that can be found in the LICENSE file or at
 #include <unistd.h>
 #include <string.h>
 
-#include "reftable.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+#include "reftable-merged.h"
+#include "reftable-record.h"
 #include "reftable-tests.h"
+#include "reftable-writer.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-stack.h"
 
 static uint32_t hash_id;
 
 static int dump_table(const char *tablename)
 {
-	struct reftable_block_source src = { 0 };
+	struct reftable_block_source src = { NULL };
 	int err = reftable_block_source_from_file(&src, tablename);
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_reader *r = NULL;
 
 	if (err < 0)
@@ -49,7 +56,7 @@ static int dump_table(const char *tablename)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_reader_seek_log(r, &it, "");
 	if (err < 0) {
@@ -66,7 +73,7 @@ static int dump_table(const char *tablename)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_reader_free(r);
 	return 0;
@@ -75,7 +82,7 @@ static int dump_table(const char *tablename)
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
+	struct reftable_write_options cfg = { 0 };
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
 	if (err < 0)
@@ -94,10 +101,10 @@ static int compact_stack(const char *stackdir)
 static int dump_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_merged_table *merged = NULL;
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
@@ -122,7 +129,7 @@ static int dump_stack(const char *stackdir)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_merged_table_seek_log(merged, &it, "");
 	if (err < 0) {
@@ -139,7 +146,7 @@ static int dump_stack(const char *stackdir)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_stack_destroy(stack);
 	return 0;
@@ -160,40 +167,34 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
-	int opt;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
-	const char *arg = NULL;
-	while ((opt = getopt(argc, argv, "2chts")) != -1) {
-		switch (opt) {
-		case '2':
-			hash_id = 0x73323536;
+	const char *arg = NULL, *argv0 = argv[0];
+
+	for (; argc > 1; argv++, argc--)
+		if (*argv[1] != '-')
 			break;
-		case 't':
+		else if (!strcmp("-2", argv[1]))
+			hash_id = 0x73323536;
+		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
-			break;
-		case 's':
+		else if (!strcmp("-s", argv[1]))
 			opt_dump_stack = 1;
-			break;
-		case 'c':
+		else if (!strcmp("-c", argv[1]))
 			opt_compact = 1;
-			break;
-		case '?':
-		case 'h':
+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
 			print_help();
 			return 2;
-			break;
 		}
-	}
 
-	if (argv[optind] == NULL) {
+	if (argc != 2) {
 		fprintf(stderr, "need argument\n");
 		print_help();
 		return 2;
 	}
 
-	arg = argv[optind];
+	arg = argv[1];
 
 	if (opt_dump_table) {
 		err = dump_table(arg);
@@ -204,7 +205,7 @@ int reftable_dump_main(int argc, char *const *argv)
 	}
 
 	if (err < 0) {
-		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
 			reftable_error_str(err));
 		return 1;
 	}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b702f4855e..c1ba131c3a4 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 0208a0a41cf..f064440a319 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -56,6 +56,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 1de39ce5b58..226af8c6b89 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -18,6 +18,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2020-12-09 14:00       ` [PATCH v4 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
  2021-01-21 16:14           ` Han-Wen Nienhuys
  2021-04-23 10:22           ` Han-Wen Nienhuys
  2021-02-22  0:41         ` [PATCH] refs: introduce API function to write invalid null ref Stefan Beller
  1 sibling, 2 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-21 15:55 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Han-Wen Nienhuys,
	Han-Wen Nienhuys


On Wed, Dec 09 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
> [...]
>  refs.c                                        |   27 +-
>  refs.h                                        |    3 +
> [...]
>  refs/reftable-backend.c                       | 1435 +++++++++++++++++
> [...]
> +static int git_reftable_read_raw_ref(struct ref_store *ref_store,
> +				     const char *refname, struct object_id *oid,
> +				     struct strbuf *referent,
> +				     unsigned int *type)
> +{
> +	struct git_reftable_ref_store *refs =
> +		(struct git_reftable_ref_store *)ref_store;
> +	struct reftable_stack *stack = stack_for(refs, refname);
> +
> +	struct reftable_ref_record ref = { NULL };
> +	int err = 0;
> +	if (refs->err < 0) {
> +		return refs->err;
> +	}
> +
> +	/* This is usually not needed, but Git doesn't signal to ref backend if
> +	   a subprocess updated the ref DB.  So we always check.
> +	*/
> +	err = reftable_stack_reload(stack);
> +	if (err) {
> +		goto done;
> +	}
> +
> +	err = reftable_stack_read_ref(stack, refname, &ref);
> +	if (err > 0) {
> +		errno = ENOENT;
> +		err = -1;
> +		goto done;
> +	}
> +	if (err < 0) {
> +		errno = reftable_error_to_errno(err);
> +		err = -1;
> +		goto done;
> +	}
> +
> +	if (ref.value_type == REFTABLE_REF_SYMREF) {
> +		strbuf_reset(referent);
> +		strbuf_addstr(referent, ref.value.symref);
> +		*type |= REF_ISSYMREF;
> +	} else if (reftable_ref_record_val1(&ref) != NULL) {
> +		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
> +	} else {
> +		*type |= REF_ISBROKEN;
> +		errno = EINVAL;
> +		err = -1;

I've been chasing down some edge cases in refs.c as part of another WIP
series I have and found that for this particular "errno" stuff we don't
have any test coverage. And with master & hanwen/reftable if I apply:

    diff --git a/refs.c b/refs.c
    index af0d000082..8ac57908ea 100644
    --- a/refs.c
    +++ b/refs.c
    @@ -1589,3 +1589,2 @@ const char *refs_resolve_ref_unsafe(struct ref_store *refs,
                        !refname_is_safe(refname)) {
    -                       errno = EINVAL;
                            return NULL;
    @@ -1649,3 +1648,2 @@ const char *refs_resolve_ref_unsafe(struct ref_store *refs,
                                !refname_is_safe(refname)) {
    -                               errno = EINVAL;
                                    return NULL;
    @@ -1657,3 +1655,2 @@ const char *refs_resolve_ref_unsafe(struct ref_store *refs,
     
    -       errno = ELOOP;
            return NULL;
    diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
    index e98d22c826..e09ae21965 100644
    --- a/refs/reftable-backend.c
    +++ b/refs/reftable-backend.c
    @@ -1343,3 +1343,2 @@ static int git_reftable_read_raw_ref(struct ref_store *ref_store,
            if (err < 0) {
    -               errno = reftable_error_to_errno(err);
                    err = -1;
    @@ -1355,3 +1354,2 @@ static int git_reftable_read_raw_ref(struct ref_store *ref_store,
                    *type |= REF_ISBROKEN;
    -               errno = EINVAL;
                    err = -1;

All tests pass on both.

Anyway, in the case of the code sitting on "master" I find that less
concerning. It's some code that's drifted apart over the years blah blah
blah. Boring details but for various reasons we don't use some/any of
that error handling logic anymore, or to the extent that we do I'll need
to improve the tests.

I must say I do find it a bit more concerning in the case of the
reftable series. I.e. the git_reftable_read_raw_ref() function seems
like a faithful copy of the general logic (at least this part) of the
refs_resolve_ref_unsafe() function in refs.c.

That I can remove error checking/handling from this place means existing
general logic was faithfully copied without checking that we have a test
for it, or adding one. Or perhaps given the history of this codebase
there's out-of-tree tests for it somewhere?

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
@ 2021-01-21 16:14           ` Han-Wen Nienhuys
  2021-01-21 16:21             ` Han-Wen Nienhuys
  2021-04-23 10:22           ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-01-21 16:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys

On Thu, Jan 21, 2021 at 4:55 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I must say I do find it a bit more concerning in the case of the
> reftable series. I.e. the git_reftable_read_raw_ref() function seems
> like a faithful copy of the general logic (at least this part) of the
> refs_resolve_ref_unsafe() function in refs.c.
>
> That I can remove error checking/handling from this place means existing
> general logic was faithfully copied without checking that we have a test
> for it, or adding one. Or perhaps given the history of this codebase
> there's out-of-tree tests for it somewhere?

One of the earlier iterations of this code didn't have the errno
handling, and it caused a bug that I tracked down to the errno
handling. I'm fairly sure of this, because I remember being perplexed
and flabbergasted at using errno as an out-of-band API mechanism. The
first iteration of reftable code was about a year ago, so if you
checkout the Git source code from that time, and remove the errno
assignments, you would be able to see which test exercised this code
path.

Or you could check where errno is read-back (I believe ENOENT is the
case that has special meaning)

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-01-21 16:14           ` Han-Wen Nienhuys
@ 2021-01-21 16:21             ` Han-Wen Nienhuys
  2021-01-26 13:44               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-01-21 16:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys

On Thu, Jan 21, 2021 at 5:14 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
>
> On Thu, Jan 21, 2021 at 4:55 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> > I must say I do find it a bit more concerning in the case of the
> > reftable series. I.e. the git_reftable_read_raw_ref() function seems
> > like a faithful copy of the general logic (at least this part) of the
> > refs_resolve_ref_unsafe() function in refs.c.
> >
> > That I can remove error checking/handling from this place means existing
> > general logic was faithfully copied without checking that we have a test
> > for it, or adding one. Or perhaps given the history of this codebase
> > there's out-of-tree tests for it somewhere?
>
> One of the earlier iterations of this code didn't have the errno
> handling, and it caused a bug that I tracked down to the errno
> handling. I'm fairly sure of this, because I remember being perplexed
> and flabbergasted at using errno as an out-of-band API mechanism. The
> first iteration of reftable code was about a year ago, so if you
> checkout the Git source code from that time, and remove the errno
> assignments, you would be able to see which test exercised this code
> path.
>
> Or you could check where errno is read-back (I believe ENOENT is the
> case that has special meaning)

In refs.c, there is behavior that depends on errno,

if (refs_read_raw_ref(refs, refname,
      oid, &sb_refname, &read_flags)) {
..
/*
* Otherwise a missing ref is OK. But the files backend
* may show errors besides ENOENT if there are
* similarly-named refs.
*/
if (errno != ENOENT &&
    errno != EISDIR &&
    errno != ENOTDIR)
return NULL;

blame tells me it has something to do with dir/file conflicts and
commit a1c1d8170db.


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-01-21 16:21             ` Han-Wen Nienhuys
@ 2021-01-26 13:44               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-26 13:44 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys


On Thu, Jan 21 2021, Han-Wen Nienhuys wrote:

> On Thu, Jan 21, 2021 at 5:14 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
>>
>> On Thu, Jan 21, 2021 at 4:55 PM Ævar Arnfjörð Bjarmason
>> <avarab@gmail.com> wrote:
>> > I must say I do find it a bit more concerning in the case of the
>> > reftable series. I.e. the git_reftable_read_raw_ref() function seems
>> > like a faithful copy of the general logic (at least this part) of the
>> > refs_resolve_ref_unsafe() function in refs.c.
>> >
>> > That I can remove error checking/handling from this place means existing
>> > general logic was faithfully copied without checking that we have a test
>> > for it, or adding one. Or perhaps given the history of this codebase
>> > there's out-of-tree tests for it somewhere?
>>
>> One of the earlier iterations of this code didn't have the errno
>> handling, and it caused a bug that I tracked down to the errno
>> handling. I'm fairly sure of this, because I remember being perplexed
>> and flabbergasted at using errno as an out-of-band API mechanism. The
>> first iteration of reftable code was about a year ago, so if you
>> checkout the Git source code from that time, and remove the errno
>> assignments, you would be able to see which test exercised this code
>> path.
>>
>> Or you could check where errno is read-back (I believe ENOENT is the
>> case that has special meaning)
>
> In refs.c, there is behavior that depends on errno,
>
> if (refs_read_raw_ref(refs, refname,
>       oid, &sb_refname, &read_flags)) {
> ..
> /*
> * Otherwise a missing ref is OK. But the files backend
> * may show errors besides ENOENT if there are
> * similarly-named refs.
> */
> if (errno != ENOENT &&
>     errno != EISDIR &&
>     errno != ENOTDIR)
> return NULL;
>
> blame tells me it has something to do with dir/file conflicts and
> commit a1c1d8170db.

I'm aware mostly of what the code is doing. Yes some of it's used, some
of it's not.

My upthread [1] wasn't clear enough, the intended message was "hey this
new major refactoring/replacement of a core git format still has zero
coverage on at least some of the code it seeks to replace" not "hey
what's this thing in refs.c doing?".

I.e. lack of test coverage on refactored code as part of a big patch
series is a lot scarier than the same lack of coverage on existing
long-established code we'll either not update, or in small incremental
updates.

So it was a gentle prod to maybe look into what the state of that in the
reftable series as a whole is and fix the coverage gaps. I'd like the
feature in git.git, but this series has been stalled for a while. I for
one would feel a lot more confident in reviewing it if I knew I'd just
discovered an outlier case.

But in any case, when I wrote[1] I didn't know about / had forgotten
"make cover". As noted in t/README that can even give pretty HTML
output. It takes ages to run so I haven't run it on your series, but I
ran it on master a couple of days ago and it flagged those (and other)
parts of refs.c.

So perhaps running it on the reftables series will find some useful
cases to make up the test difference / potential bugs.

Of course 100% line coverage isn't practical (hitting every BUG(...)),
and even 100% line coverage != 100% test coverage. But it's a very good
first stop to see glaring omissions. E.g. the refs.c cases I mentioned
where we'resetting flag "xyz" in a() and then acting on "xyz/!xyz" in
b().

1. <871ree8eto.fsf@evledraar.gmail.com> 


^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH] refs: introduce API function to write invalid null ref
  2020-12-09 14:00       ` [PATCH v4 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
  2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
@ 2021-02-22  0:41         ` Stefan Beller
  2021-02-22  1:20           ` Eric Sunshine
  2021-02-22 18:38           ` Han-Wen Nienhuys
  1 sibling, 2 replies; 251+ messages in thread
From: Stefan Beller @ 2021-02-22  0:41 UTC (permalink / raw)
  To: gitgitgadget
  Cc: Johannes.Schindelin, avarab, emilyshaffer, felipe.contreras, git,
	hanwen, hanwenn, jonathantanmy, jrnieder, peff, ps, ramsay,
	steadmon, Stefan Beller

Different ref backends will have different ways to write out the invalid 00..00
ref when starting a new worktree. Encapsulate this into a function and expose
the function in the refs API.

Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
---

Hi Han-Wen,

it's been a while since I looked at git source code, but today is the day!
I was actually looking how the refs table work progresses and this patch
caught my attention.  I think the changes in builtin/worktree.c (that 
if/else depending on the actual refs backend used)
demonstrate that the refs API layer is leaking implementation details.

What do you think about rolling this patch first, and then implementing
the following part inside the reftable as a function?


int reftable_write_invalid_head_ref(struct *ref_store, const char *dir)
{
	struct sb  = STRBUF_INIT;
	/* the following adapted from this patch: */

+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", dir);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);

	return 0;
}

What do you think?

Cheers,
Stefan


 builtin/worktree.c    |  6 +++---
 refs.c                |  5 +++++
 refs.h                |  3 +++
 refs/debug.c          | 10 ++++++++++
 refs/files-backend.c  | 13 +++++++++++++
 refs/packed-backend.c |  7 +++++++
 refs/refs-internal.h  |  3 +++
 7 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 1cd5c2016e..6f5d79e5ec 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -330,9 +330,9 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+
+	write_invalid_head(get_main_ref_store(the_repository), sb_repo.buf);
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/refs.c b/refs.c
index a665ed5e10..a7b415c386 100644
--- a/refs.c
+++ b/refs.c
@@ -2454,3 +2454,8 @@ int copy_existing_ref(const char *oldref, const char *newref, const char *logmsg
 {
 	return refs_copy_existing_ref(get_main_ref_store(the_repository), oldref, newref, logmsg);
 }
+
+int write_invalid_head(struct ref_store *refs, const char *dir)
+{
+	return refs->be->write_invalid_head_ref(refs, dir);
+}
diff --git a/refs.h b/refs.h
index 48970dfc7e..7a64f777d5 100644
--- a/refs.h
+++ b/refs.h
@@ -456,6 +456,9 @@ int refs_delete_refs(struct ref_store *refs, const char *msg,
 int delete_refs(const char *msg, struct string_list *refnames,
 		unsigned int flags);
 
+
+int write_invalid_head(struct ref_store *refs, const char *dir);
+
 /** Delete a reflog */
 int refs_delete_reflog(struct ref_store *refs, const char *refname);
 int delete_reflog(const char *refname);
diff --git a/refs/debug.c b/refs/debug.c
index 922e64fa6a..9cfc3ca9f3 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -175,6 +175,15 @@ static int debug_copy_ref(struct ref_store *ref_store, const char *oldref,
 	return res;
 }
 
+static int debug_write_invalid_head_ref(struct ref_store *ref_store,
+					const char *dir)
+{
+	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
+	int res = drefs->refs->be->write_invalid_head_ref(drefs->refs, dir);
+	trace_printf_key(&trace_refs, "write_invalid_head: %s\n", dir);
+	return res;
+}
+
 struct debug_ref_iterator {
 	struct ref_iterator base;
 	struct ref_iterator *iter;
@@ -384,6 +393,7 @@ struct ref_storage_be refs_be_debug = {
 	debug_delete_refs,
 	debug_rename_ref,
 	debug_copy_ref,
+	debug_write_invalid_head_ref,
 
 	debug_ref_iterator_begin,
 	debug_read_raw_ref,
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4fdc68810b..b1c82a6f61 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3169,6 +3169,18 @@ static int files_init_db(struct ref_store *ref_store, struct strbuf *err)
 	return 0;
 }
 
+static int files_write_invalid_head_ref(struct ref_store *ref_store,
+					const char *dir)
+{
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_addf(&sb, "%s/HEAD", dir);
+	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+	strbuf_release(&sb);
+
+	return 0;
+}
+
 struct ref_storage_be refs_be_files = {
 	NULL,
 	"files",
@@ -3184,6 +3196,7 @@ struct ref_storage_be refs_be_files = {
 	files_delete_refs,
 	files_rename_ref,
 	files_copy_ref,
+	files_write_invalid_head_ref,
 
 	files_ref_iterator_begin,
 	files_read_raw_ref,
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b912f2505f..a1adb81777 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1591,6 +1591,12 @@ static int packed_copy_ref(struct ref_store *ref_store,
 	BUG("packed reference store does not support copying references");
 }
 
+static int packed_write_invalid_head_ref(struct ref_store *ref_store,
+					 const char *dir)
+{
+	BUG("packed reference store does not support writing an invalid head ref");
+}
+
 static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_store)
 {
 	return empty_ref_iterator_begin();
@@ -1656,6 +1662,7 @@ struct ref_storage_be refs_be_packed = {
 	packed_delete_refs,
 	packed_rename_ref,
 	packed_copy_ref,
+	packed_write_invalid_head_ref,
 
 	packed_ref_iterator_begin,
 	packed_read_raw_ref,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c93..5055226f75 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -556,6 +556,8 @@ typedef int rename_ref_fn(struct ref_store *ref_store,
 typedef int copy_ref_fn(struct ref_store *ref_store,
 			  const char *oldref, const char *newref,
 			  const char *logmsg);
+typedef int write_invalid_head_ref_fn(struct ref_store *ref_store,
+				      const char *dir);
 
 /*
  * Iterate over the references in `ref_store` whose names start with
@@ -655,6 +657,7 @@ struct ref_storage_be {
 	delete_refs_fn *delete_refs;
 	rename_ref_fn *rename_ref;
 	copy_ref_fn *copy_ref;
+	write_invalid_head_ref_fn *write_invalid_head_ref;
 
 	ref_iterator_begin_fn *iterator_begin;
 	read_raw_ref_fn *read_raw_ref;
-- 
2.26.2.266.ge870325ee8


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH] refs: introduce API function to write invalid null ref
  2021-02-22  0:41         ` [PATCH] refs: introduce API function to write invalid null ref Stefan Beller
@ 2021-02-22  1:20           ` Eric Sunshine
  2021-02-22  3:09             ` Eric Sunshine
  2021-02-22 18:38           ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Eric Sunshine @ 2021-02-22  1:20 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin via GitGitGadget, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Emily Shaffer,
	Felipe Contreras, Git List, Han-Wen Nienhuys, hanwenn,
	Jonathan Tan, Jonathan Nieder, Jeff King, Patrick Steinhardt,
	Ramsay Jones, Josh Steadmon

On Sun, Feb 21, 2021 at 7:43 PM Stefan Beller <stefanbeller@gmail.com> wrote:
> Different ref backends will have different ways to write out the invalid 00..00
> ref when starting a new worktree. Encapsulate this into a function and expose
> the function in the refs API.
>
> Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
> ---
> it's been a while since I looked at git source code, but today is the day!
> I was actually looking how the refs table work progresses and this patch
> caught my attention.  I think the changes in builtin/worktree.c (that
> if/else depending on the actual refs backend used)
> demonstrate that the refs API layer is leaking implementation details.

Welcome back. I have a few comments in response to this proposal. See below...

> diff --git a/builtin/worktree.c b/builtin/worktree.c
> @@ -330,9 +330,9 @@ static int add_worktree(const char *path, const char *refname,
>         strbuf_reset(&sb);
> -       strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
> -       write_file(sb.buf, "%s", oid_to_hex(&null_oid));
> -       strbuf_reset(&sb);
> +
> +       write_invalid_head(get_main_ref_store(the_repository), sb_repo.buf);
> +
>         strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
>         write_file(sb.buf, "../..");

Nit: This change ends up making the strbuf_reset() unnecessarily
distant from the location where it has most meaning (just before we
start using it again, which is the pattern in this function):

    strbuf_release(&sb);

    write_invalid_head(..., sb_repo.buf);

    strbuf_add(&sb, ...);

It would be clearer for the end result to be:

    write_invalid_head(..., sb_repo.buf);

    strbuf_release(&sb);
    strbuf_add(&sb, ...);

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> @@ -3169,6 +3169,18 @@ static int files_init_db(struct ref_store *ref_store, struct strbuf *err)
> +static int files_write_invalid_head_ref(struct ref_store *ref_store,
> +                                       const char *dir)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +
> +       strbuf_addf(&sb, "%s/HEAD", dir);
> +       write_file(sb.buf, "%s", oid_to_hex(&null_oid));
> +       strbuf_release(&sb);
> +
> +       return 0;
> +}

I'm not super enthused about the name write_invalid_head_ref(). Even
if given a different name, I'm not necessarily enthused about it
unconditionally writing a zero-OID. Do you foresee other cases in
which callers will need to perform this precise operation?

The reason I ask is that the bit of code in
builtin/worktree.c:add_worktree() which this patch targets is itself a
hack to work around the shortcoming in which is_git_directory() won't
consider the newly-created worktree as being legitimate if it doesn't
have a well-formed HEAD ref. It's not visible in the context of this
patch, but there is a comment just above the changes to
builtin/worktree.c made by this patch:

    /*
     * This is to keep resolve_ref() happy. We need a valid HEAD
     * or is_git_directory() will reject the directory. Any value which
     * looks like an object ID will do since it will be immediately
     * replaced by the symbolic-ref or update-ref invocation in the new
     * worktree.
     */

As mentioned in the comment, it doesn't matter what the value of HEAD
is; it just needs to be something that looks like a well-formed OID.

So, my lack of enthusiasm for write_invalid_head_ref() is twofold.
First, this function is being introduced to paper over a hack, which
leaves the new function smelling funny. Second, the "invalid head" in
the name and the unconditional zero-OID feel too special-purpose and
focus too much on the particular current implementation in
builtin/worktree.c which happens to assign a zero-OID to HEAD. (At an
earlier time, builtin/worktree.h assigned the value of HEAD from the
current worktree to HEAD in the newly-created worktree instead. But,
as noted, the value is arbitrary -- it doesn't matter what it is -- it
just needs to exist long enough to pacify is_git_directory() and is
then immediately overwritten.)

On the other hand, I could see this as acceptable if "invalid" is
removed from the function name and if it accepts an OID to write
rather than unconditionally writing a zero-ID. In that case, it would
become a generally useful function without the bad smells associated
with the too-special-purpose write_invalid_head_ref(). (But I haven't
been paying attention to the ref backends work, so perhaps such a
function already exists? I don't know.)

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH] refs: introduce API function to write invalid null ref
  2021-02-22  1:20           ` Eric Sunshine
@ 2021-02-22  3:09             ` Eric Sunshine
  0 siblings, 0 replies; 251+ messages in thread
From: Eric Sunshine @ 2021-02-22  3:09 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin via GitGitGadget, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Emily Shaffer,
	Felipe Contreras, Git List, Han-Wen Nienhuys, hanwenn,
	Jonathan Tan, Jonathan Nieder, Jeff King, Patrick Steinhardt,
	Ramsay Jones, Josh Steadmon

On Sun, Feb 21, 2021 at 8:20 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> The reason I ask is that the bit of code in
> builtin/worktree.c:add_worktree() which this patch targets is itself a
> hack to work around the shortcoming in which is_git_directory() won't
> consider the newly-created worktree as being legitimate if it doesn't
> have a well-formed HEAD ref. [...]

By the way, the only reason the hack of creating a temporary HEAD
(containing arbitrary OID) is needed is that add_worktree() shells out
to invoke one of git-update-ref or git-symbolic-ref. It is that
shell-out to invoke a Git command which triggers the necessity of
pacifying is_git_directory()...

> On the other hand, I could see this as acceptable if "invalid" is
> removed from the function name and if it accepts an OID to write
> rather than unconditionally writing a zero-ID. In that case, it would
> become a generally useful function without the bad smells associated
> with the too-special-purpose write_invalid_head_ref().

... so, a much cleaner fix would be to stop shelling out to set up
HEAD in the newly-created worktree, and instead call C API to perform
those functions (update-ref and symbolic-ref) directly. If that C API
does not yet exist, then it should be added.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH] refs: introduce API function to write invalid null ref
  2021-02-22  0:41         ` [PATCH] refs: introduce API function to write invalid null ref Stefan Beller
  2021-02-22  1:20           ` Eric Sunshine
@ 2021-02-22 18:38           ` Han-Wen Nienhuys
  1 sibling, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-02-22 18:38 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Han-Wen Nienhuys via GitGitGadget, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Emily Shaffer,
	Felipe Contreras, git, Han-Wen Nienhuys, Jonathan Tan,
	Jonathan Nieder, Jeff King, Patrick Steinhardt, Ramsay Jones,
	Josh Steadmon

On Mon, Feb 22, 2021 at 1:41 AM Stefan Beller <stefanbeller@gmail.com> wrote:
>
> Different ref backends will have different ways to write out the invalid 00..00
> ref when starting a new worktree. Encapsulate this into a function and expose
> the function in the refs API.
>
> Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
> ---
>
> Hi Han-Wen,
>
> it's been a while since I looked at git source code, but today is the day!
> I was actually looking how the refs table work progresses and this patch
> caught my attention.  I think the changes in builtin/worktree.c (that
> if/else depending on the actual refs backend used)
> demonstrate that the refs API layer is leaking implementation details.
>
> What do you think about rolling this patch first, and then implementing
> the following part inside the reftable as a function?

The "invalid HEAD" hack is there to avoid confusing historical git
implementations. It's not a part of the "modern" refs API layer, so I
think we shouldn't add it as a method on the ref backend API.


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v5 00/15] reftable library
  2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                         ` (14 preceding siblings ...)
  2020-12-09 14:00       ` [PATCH v4 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19       ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
                           ` (15 more replies)
  15 siblings, 16 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys

This splits the giant commit from
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

The final commit should also be split up, but I want to wait until we have
consensus that the bottom commits look good.

version 12 Mar 2021:

 * tagged unions for reftable_log_record
 * some fixups by jun

Based on github.com/hanwen/reftable at
c33a40ad23dd622d8c72139c41a2be1cfa063e5f Add random suffix to table names

Han-Wen Nienhuys (14):
  init-db: set the_repository->hash_algo early on
  reftable: add LICENSE
  reftable: add error related functionality
  reftable: utility functions
  reftable: add blocksource, an abstraction for random access reads
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: read reftable files
  reftable: reftable file level tests
  reftable: rest of library
  Reftable support for git-core
  Add "test-tool dump-reftable" command.

SZEDER Gábor (1):
  git-prompt: prepare for reftable refs backend

 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |   46 +-
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   76 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/CMakeLists.txt           |   14 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 contrib/completion/git-prompt.sh              |    7 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1446 +++++++++++++++++
 reftable/.gitattributes                       |    1 +
 reftable/LICENSE                              |   31 +
 reftable/VERSION                              |    1 +
 reftable/basics.c                             |  128 ++
 reftable/basics.h                             |   60 +
 reftable/basics_test.c                        |   98 ++
 reftable/block.c                              |  440 +++++
 reftable/block.h                              |  127 ++
 reftable/block_test.c                         |  121 ++
 reftable/blocksource.c                        |  148 ++
 reftable/blocksource.h                        |   22 +
 reftable/constants.h                          |   21 +
 reftable/dump.c                               |  213 +++
 reftable/error.c                              |   41 +
 reftable/iter.c                               |  248 +++
 reftable/iter.h                               |   75 +
 reftable/merged.c                             |  366 +++++
 reftable/merged.h                             |   35 +
 reftable/merged_test.c                        |  343 ++++
 reftable/pq.c                                 |  115 ++
 reftable/pq.h                                 |   32 +
 reftable/publicbasics.c                       |   58 +
 reftable/reader.c                             |  733 +++++++++
 reftable/reader.h                             |   75 +
 reftable/record.c                             | 1203 ++++++++++++++
 reftable/record.h                             |  139 ++
 reftable/record_test.c                        |  405 +++++
 reftable/refname.c                            |  209 +++
 reftable/refname.h                            |   29 +
 reftable/refname_test.c                       |  102 ++
 reftable/reftable-blocksource.h               |   49 +
 reftable/reftable-error.h                     |   62 +
 reftable/reftable-generic.h                   |   48 +
 reftable/reftable-iterator.h                  |   37 +
 reftable/reftable-malloc.h                    |   18 +
 reftable/reftable-merged.h                    |   68 +
 reftable/reftable-reader.h                    |   89 +
 reftable/reftable-record.h                    |  114 ++
 reftable/reftable-stack.h                     |  120 ++
 reftable/reftable-tests.h                     |   22 +
 reftable/reftable-writer.h                    |  147 ++
 reftable/reftable.c                           |   98 ++
 reftable/reftable_test.c                      |  583 +++++++
 reftable/stack.c                              | 1260 ++++++++++++++
 reftable/stack.h                              |   41 +
 reftable/stack_test.c                         |  809 +++++++++
 reftable/system.h                             |   32 +
 reftable/test_framework.c                     |   23 +
 reftable/test_framework.h                     |   53 +
 reftable/tree.c                               |   63 +
 reftable/tree.h                               |   34 +
 reftable/tree_test.c                          |   61 +
 reftable/writer.c                             |  691 ++++++++
 reftable/writer.h                             |   50 +
 reftable/zlib-compat.c                        |   92 ++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/helper/test-reftable.c                      |   20 +
 t/helper/test-tool.c                          |    4 +-
 t/helper/test-tool.h                          |    2 +
 t/t0031-reftable.sh                           |  203 +++
 t/t0032-reftable-unittest.sh                  |   15 +
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 82 files changed, 12213 insertions(+), 40 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/error.c
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-blocksource.h
 create mode 100644 reftable/reftable-error.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-reader.h
 create mode 100644 reftable/reftable-record.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0031-reftable.sh
 create mode 100755 t/t0032-reftable-unittest.sh


base-commit: 13d7ab6b5d7929825b626f050b62a11241ea4945
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v5
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v4:

  1:  40ac041d0efe !  1:  e1d1d9f49807 init-db: set the_repository->hash_algo early on
     @@ builtin/init-db.c: int init_db(const char *git_dir, const char *real_git_dir,
      +	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
      +
       	reinit = create_default_files(template_dir, original_git_dir,
     - 				      initial_branch, &repo_fmt);
     - 	if (reinit && initial_branch)
     + 				      initial_branch, &repo_fmt,
     + 				      flags & INIT_DB_QUIET);
  2:  cc0fe58bd333 =  2:  502a66befabe reftable: add LICENSE
  3:  798f680b800b =  3:  1bd136f6a6df reftable: add error related functionality
  4:  2f55ff6d2406 !  4:  5bbbafb428e9 reftable: utility functions
     @@ t/helper/test-reftable.c (new)
      
       ## t/helper/test-tool.c ##
      @@ t/helper/test-tool.c: static struct test_cmd cmds[] = {
     - 	{ "path-utils", cmd__path_utils },
     + 	{ "pcre2-config", cmd__pcre2_config },
       	{ "pkt-line", cmd__pkt_line },
       	{ "prio-queue", cmd__prio_queue },
      -	{ "proc-receive", cmd__proc_receive},
  5:  aaa817f8598b =  5:  e41fa780882d reftable: add blocksource, an abstraction for random access reads
  6:  e13afe315258 !  6:  390eaca4fc8c reftable: (de)serialization for the polymorphic record type.
     @@ reftable/record.c (new)
      +{
      +	char hex[SHA256_SIZE + 1] = { 0 };
      +
     -+	printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname,
     -+	       log->update_index, log->name, log->email, log->time,
     -+	       log->tz_offset);
     -+	hex_format(hex, log->old_hash, hash_size(hash_id));
     -+	printf("%s => ", hex);
     -+	hex_format(hex, log->new_hash, hash_size(hash_id));
     -+	printf("%s\n\n%s\n}\n", hex, log->message);
     ++	switch (log->value_type) {
     ++	case REFTABLE_LOG_DELETION:
     ++		printf("log{%s(%" PRIu64 ") delete", log->refname,
     ++		       log->update_index);
     ++		break;
     ++	case REFTABLE_LOG_UPDATE:
     ++		printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n",
     ++		       log->refname, log->update_index, log->update.name,
     ++		       log->update.email, log->update.time,
     ++		       log->update.tz_offset);
     ++		hex_format(hex, log->update.old_hash, hash_size(hash_id));
     ++		printf("%s => ", hex);
     ++		hex_format(hex, log->update.new_hash, hash_size(hash_id));
     ++		printf("%s\n\n%s\n}\n", hex, log->update.message);
     ++		break;
     ++	}
      +}
      +
      +static void reftable_log_record_key(const void *r, struct strbuf *dest)
     @@ reftable/record.c (new)
      +	if (dst->refname != NULL) {
      +		dst->refname = xstrdup(dst->refname);
      +	}
     -+	if (dst->email != NULL) {
     -+		dst->email = xstrdup(dst->email);
     -+	}
     -+	if (dst->name != NULL) {
     -+		dst->name = xstrdup(dst->name);
     -+	}
     -+	if (dst->message != NULL) {
     -+		dst->message = xstrdup(dst->message);
     -+	}
     ++	switch (dst->value_type) {
     ++	case REFTABLE_LOG_DELETION:
     ++		break;
     ++	case REFTABLE_LOG_UPDATE:
     ++		if (dst->update.email != NULL) {
     ++			dst->update.email = xstrdup(dst->update.email);
     ++		}
     ++		if (dst->update.name != NULL) {
     ++			dst->update.name = xstrdup(dst->update.name);
     ++		}
     ++		if (dst->update.message != NULL) {
     ++			dst->update.message = xstrdup(dst->update.message);
     ++		}
      +
     -+	if (dst->new_hash != NULL) {
     -+		dst->new_hash = reftable_malloc(hash_size);
     -+		memcpy(dst->new_hash, src->new_hash, hash_size);
     -+	}
     -+	if (dst->old_hash != NULL) {
     -+		dst->old_hash = reftable_malloc(hash_size);
     -+		memcpy(dst->old_hash, src->old_hash, hash_size);
     ++		if (dst->update.new_hash != NULL) {
     ++			dst->update.new_hash = reftable_malloc(hash_size);
     ++			memcpy(dst->update.new_hash, src->update.new_hash,
     ++			       hash_size);
     ++		}
     ++		if (dst->update.old_hash != NULL) {
     ++			dst->update.old_hash = reftable_malloc(hash_size);
     ++			memcpy(dst->update.old_hash, src->update.old_hash,
     ++			       hash_size);
     ++		}
     ++		break;
      +	}
      +}
      +
     @@ reftable/record.c (new)
      +void reftable_log_record_release(struct reftable_log_record *r)
      +{
      +	reftable_free(r->refname);
     -+	reftable_free(r->new_hash);
     -+	reftable_free(r->old_hash);
     -+	reftable_free(r->name);
     -+	reftable_free(r->email);
     -+	reftable_free(r->message);
     ++	switch (r->value_type) {
     ++	case REFTABLE_LOG_DELETION:
     ++		break;
     ++	case REFTABLE_LOG_UPDATE:
     ++		reftable_free(r->update.new_hash);
     ++		reftable_free(r->update.old_hash);
     ++		reftable_free(r->update.name);
     ++		reftable_free(r->update.email);
     ++		reftable_free(r->update.message);
     ++		break;
     ++	}
      +	memset(r, 0, sizeof(struct reftable_log_record));
      +}
      +
     @@ reftable/record.c (new)
      +	struct reftable_log_record *r = (struct reftable_log_record *)rec;
      +	struct string_view start = s;
      +	int n = 0;
     -+	uint8_t *oldh = r->old_hash;
     -+	uint8_t *newh = r->new_hash;
     ++	uint8_t *oldh = NULL;
     ++	uint8_t *newh = NULL;
      +	if (reftable_log_record_is_deletion(r))
      +		return 0;
      +
     ++	oldh = r->update.old_hash;
     ++	newh = r->update.new_hash;
      +	if (oldh == NULL) {
      +		oldh = zero;
      +	}
     @@ reftable/record.c (new)
      +	memcpy(s.buf + hash_size, newh, hash_size);
      +	string_view_consume(&s, 2 * hash_size);
      +
     -+	n = encode_string(r->name ? r->name : "", s);
     ++	n = encode_string(r->update.name ? r->update.name : "", s);
      +	if (n < 0)
      +		return -1;
      +	string_view_consume(&s, n);
      +
     -+	n = encode_string(r->email ? r->email : "", s);
     ++	n = encode_string(r->update.email ? r->update.email : "", s);
      +	if (n < 0)
      +		return -1;
      +	string_view_consume(&s, n);
      +
     -+	n = put_var_int(&s, r->time);
     ++	n = put_var_int(&s, r->update.time);
      +	if (n < 0)
      +		return -1;
      +	string_view_consume(&s, n);
     @@ reftable/record.c (new)
      +	if (s.len < 2)
      +		return -1;
      +
     -+	put_be16(s.buf, r->tz_offset);
     ++	put_be16(s.buf, r->update.tz_offset);
      +	string_view_consume(&s, 2);
      +
     -+	n = encode_string(r->message ? r->message : "", s);
     ++	n = encode_string(r->update.message ? r->update.message : "", s);
      +	if (n < 0)
      +		return -1;
      +	string_view_consume(&s, n);
     @@ reftable/record.c (new)
      +
      +	r->update_index = (~max) - ts;
      +
     -+	if (val_type == 0) {
     -+		FREE_AND_NULL(r->old_hash);
     -+		FREE_AND_NULL(r->new_hash);
     -+		FREE_AND_NULL(r->message);
     -+		FREE_AND_NULL(r->email);
     -+		FREE_AND_NULL(r->name);
     -+		return 0;
     ++	if (val_type != r->value_type) {
     ++		switch (r->value_type) {
     ++		case REFTABLE_LOG_UPDATE:
     ++			FREE_AND_NULL(r->update.old_hash);
     ++			FREE_AND_NULL(r->update.new_hash);
     ++			FREE_AND_NULL(r->update.message);
     ++			FREE_AND_NULL(r->update.email);
     ++			FREE_AND_NULL(r->update.name);
     ++			break;
     ++		case REFTABLE_LOG_DELETION:
     ++			break;
     ++		}
      +	}
      +
     ++	r->value_type = val_type;
     ++	if (val_type == REFTABLE_LOG_DELETION)
     ++		return 0;
     ++
      +	if (in.len < 2 * hash_size)
      +		return REFTABLE_FORMAT_ERROR;
      +
     -+	r->old_hash = reftable_realloc(r->old_hash, hash_size);
     -+	r->new_hash = reftable_realloc(r->new_hash, hash_size);
     ++	r->update.old_hash = reftable_realloc(r->update.old_hash, hash_size);
     ++	r->update.new_hash = reftable_realloc(r->update.new_hash, hash_size);
      +
     -+	memcpy(r->old_hash, in.buf, hash_size);
     -+	memcpy(r->new_hash, in.buf + hash_size, hash_size);
     ++	memcpy(r->update.old_hash, in.buf, hash_size);
     ++	memcpy(r->update.new_hash, in.buf + hash_size, hash_size);
      +
      +	string_view_consume(&in, 2 * hash_size);
      +
     @@ reftable/record.c (new)
      +		goto done;
      +	string_view_consume(&in, n);
      +
     -+	r->name = reftable_realloc(r->name, dest.len + 1);
     -+	memcpy(r->name, dest.buf, dest.len);
     -+	r->name[dest.len] = 0;
     ++	r->update.name = reftable_realloc(r->update.name, dest.len + 1);
     ++	memcpy(r->update.name, dest.buf, dest.len);
     ++	r->update.name[dest.len] = 0;
      +
      +	strbuf_reset(&dest);
      +	n = decode_string(&dest, in);
     @@ reftable/record.c (new)
      +		goto done;
      +	string_view_consume(&in, n);
      +
     -+	r->email = reftable_realloc(r->email, dest.len + 1);
     -+	memcpy(r->email, dest.buf, dest.len);
     -+	r->email[dest.len] = 0;
     ++	r->update.email = reftable_realloc(r->update.email, dest.len + 1);
     ++	memcpy(r->update.email, dest.buf, dest.len);
     ++	r->update.email[dest.len] = 0;
      +
      +	ts = 0;
      +	n = get_var_int(&ts, &in);
      +	if (n < 0)
      +		goto done;
      +	string_view_consume(&in, n);
     -+	r->time = ts;
     ++	r->update.time = ts;
      +	if (in.len < 2)
      +		goto done;
      +
     -+	r->tz_offset = get_be16(in.buf);
     ++	r->update.tz_offset = get_be16(in.buf);
      +	string_view_consume(&in, 2);
      +
      +	strbuf_reset(&dest);
     @@ reftable/record.c (new)
      +		goto done;
      +	string_view_consume(&in, n);
      +
     -+	r->message = reftable_realloc(r->message, dest.len + 1);
     -+	memcpy(r->message, dest.buf, dest.len);
     -+	r->message[dest.len] = 0;
     ++	r->update.message = reftable_realloc(r->update.message, dest.len + 1);
     ++	memcpy(r->update.message, dest.buf, dest.len);
     ++	r->update.message[dest.len] = 0;
      +
      +	strbuf_release(&dest);
      +	return start.len - in.len;
     @@ reftable/record.c (new)
      +int reftable_log_record_equal(struct reftable_log_record *a,
      +			      struct reftable_log_record *b, int hash_size)
      +{
     -+	return null_streq(a->name, b->name) && null_streq(a->email, b->email) &&
     -+	       null_streq(a->message, b->message) &&
     -+	       zero_hash_eq(a->old_hash, b->old_hash, hash_size) &&
     -+	       zero_hash_eq(a->new_hash, b->new_hash, hash_size) &&
     -+	       a->time == b->time && a->tz_offset == b->tz_offset &&
     -+	       a->update_index == b->update_index;
     ++	if (!(null_streq(a->refname, b->refname) &&
     ++	      a->update_index == b->update_index &&
     ++	      a->value_type == b->value_type))
     ++		return 0;
     ++
     ++	switch (a->value_type) {
     ++	case REFTABLE_LOG_DELETION:
     ++		return 1;
     ++	case REFTABLE_LOG_UPDATE:
     ++		return null_streq(a->update.name, b->update.name) &&
     ++		       a->update.time == b->update.time &&
     ++		       a->update.tz_offset == b->update.tz_offset &&
     ++		       null_streq(a->update.email, b->update.email) &&
     ++		       null_streq(a->update.message, b->update.message) &&
     ++		       zero_hash_eq(a->update.old_hash, b->update.old_hash,
     ++				    hash_size) &&
     ++		       zero_hash_eq(a->update.new_hash, b->update.new_hash,
     ++				    hash_size);
     ++	}
     ++
     ++	abort();
      +}
      +
      +static int reftable_log_record_is_deletion_void(const void *p)
     @@ reftable/record.c (new)
      +
      +int reftable_log_record_is_deletion(const struct reftable_log_record *log)
      +{
     -+	return (log->new_hash == NULL && log->old_hash == NULL &&
     -+		log->name == NULL && log->email == NULL &&
     -+		log->message == NULL && log->time == 0 && log->tz_offset == 0 &&
     -+		log->message == NULL);
     ++	return (log->value_type == REFTABLE_LOG_DELETION);
      +}
      +
      +void string_view_consume(struct string_view *s, int n)
     @@ reftable/record.h (new)
       ## reftable/record_test.c (new) ##
      @@
      +/*
     -+Copyright 2020 Google LLC
     ++  Copyright 2020 Google LLC
      +
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     ++  Use of this source code is governed by a BSD-style
     ++  license that can be found in the LICENSE file or at
     ++  https://developers.google.com/open-source/licenses/bsd
      +*/
      +
      +#include "record.h"
     @@ reftable/record_test.c (new)
      +	struct reftable_log_record in[2] = {
      +		{
      +			.refname = xstrdup("refs/heads/master"),
     -+			.old_hash = reftable_malloc(SHA1_SIZE),
     -+			.new_hash = reftable_malloc(SHA1_SIZE),
     -+			.name = xstrdup("han-wen"),
     -+			.email = xstrdup("hanwen@google.com"),
     -+			.message = xstrdup("test"),
      +			.update_index = 42,
     -+			.time = 1577123507,
     -+			.tz_offset = 100,
     ++			.value_type = REFTABLE_LOG_UPDATE,
     ++			.update = {
     ++				.old_hash = reftable_malloc(SHA1_SIZE),
     ++				.new_hash = reftable_malloc(SHA1_SIZE),
     ++				.name = xstrdup("han-wen"),
     ++				.email = xstrdup("hanwen@google.com"),
     ++				.message = xstrdup("test"),
     ++				.time = 1577123507,
     ++				.tz_offset = 100,
     ++			}
      +		},
      +		{
      +			.refname = xstrdup("refs/heads/master"),
      +			.update_index = 22,
     ++			.value_type = REFTABLE_LOG_DELETION,
      +		}
      +	};
     -+	set_test_hash(in[0].new_hash, 1);
     -+	set_test_hash(in[0].old_hash, 2);
     ++	set_test_hash(in[0].update.new_hash, 1);
     ++	set_test_hash(in[0].update.old_hash, 2);
      +	for (int i = 0; i < ARRAY_SIZE(in); i++) {
      +		struct reftable_record rec = { NULL };
      +		struct strbuf key = STRBUF_INIT;
     @@ reftable/record_test.c (new)
      +		/* populate out, to check for leaks. */
      +		struct reftable_log_record out = {
      +			.refname = xstrdup("old name"),
     -+			.new_hash = reftable_calloc(SHA1_SIZE),
     -+			.old_hash = reftable_calloc(SHA1_SIZE),
     -+			.name = xstrdup("old name"),
     -+			.email = xstrdup("old@email"),
     -+			.message = xstrdup("old message"),
     ++			.value_type = REFTABLE_LOG_UPDATE,
     ++			.update = {
     ++				.new_hash = reftable_calloc(SHA1_SIZE),
     ++				.old_hash = reftable_calloc(SHA1_SIZE),
     ++				.name = xstrdup("old name"),
     ++				.email = xstrdup("old@email"),
     ++				.message = xstrdup("old message"),
     ++			},
      +		};
      +		struct reftable_record rec_out = { NULL };
      +		int n, m, valtype;
     @@ reftable/reftable-record.h (new)
      +	char *refname;
      +	uint64_t update_index; /* logical timestamp of a transactional update.
      +				*/
     -+	uint8_t *new_hash;
     -+	uint8_t *old_hash;
     -+	char *name;
     -+	char *email;
     -+	uint64_t time;
     -+	int16_t tz_offset;
     -+	char *message;
     ++
     ++	enum {
     ++		/* tombstone to hide deletions from earlier tables */
     ++		REFTABLE_LOG_DELETION = 0x0,
     ++
     ++		/* a simple update */
     ++		REFTABLE_LOG_UPDATE = 0x1,
     ++#define REFTABLE_NR_LOG_VALUETYPES 2
     ++	} value_type;
     ++
     ++	union {
     ++		struct {
     ++			uint8_t *new_hash;
     ++			uint8_t *old_hash;
     ++			char *name;
     ++			char *email;
     ++			uint64_t time;
     ++			int16_t tz_offset;
     ++			char *message;
     ++		} update;
     ++	};
      +};
      +
      +/* returns whether 'ref' represents the deletion of a log record. */
  7:  a48b9937642e =  7:  b108525009d9 reftable: reading/writing blocks
  8:  6968dbc3828f =  8:  b6eed7283aac reftable: a generic binary tree implementation
  9:  ff5b424d12fb !  9:  f2f005afca19 reftable: write reftable files
     @@ reftable/writer.c (new)
      +	return err;
      +}
      +
     -+int reftable_writer_add_log(struct reftable_writer *w,
     -+			    struct reftable_log_record *log)
     ++static int reftable_writer_add_log_verbatim(struct reftable_writer *w,
     ++					    struct reftable_log_record *log)
      +{
      +	struct reftable_record rec = { NULL };
     -+	char *input_log_message = log->message;
     -+	struct strbuf cleaned_message = STRBUF_INIT;
     -+	int err;
     -+	if (log->refname == NULL)
     -+		return REFTABLE_API_ERROR;
     -+
      +	if (w->block_writer != NULL &&
      +	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
      +		int err = writer_finish_public_section(w);
     @@ reftable/writer.c (new)
      +			return err;
      +	}
      +
     -+	if (!w->opts.exact_log_message && log->message != NULL) {
     -+		strbuf_addstr(&cleaned_message, log->message);
     ++	w->next -= w->pending_padding;
     ++	w->pending_padding = 0;
     ++
     ++	reftable_record_from_log(&rec, log);
     ++	return writer_add_record(w, &rec);
     ++}
     ++
     ++int reftable_writer_add_log(struct reftable_writer *w,
     ++			    struct reftable_log_record *log)
     ++{
     ++	char *input_log_message = NULL;
     ++	struct strbuf cleaned_message = STRBUF_INIT;
     ++	int err = 0;
     ++
     ++	if (log->value_type == REFTABLE_LOG_DELETION)
     ++		return reftable_writer_add_log_verbatim(w, log);
     ++
     ++	if (log->refname == NULL)
     ++		return REFTABLE_API_ERROR;
     ++
     ++	input_log_message = log->update.message;
     ++	if (!w->opts.exact_log_message && log->update.message != NULL) {
     ++		strbuf_addstr(&cleaned_message, log->update.message);
      +		while (cleaned_message.len &&
      +		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
      +			strbuf_setlen(&cleaned_message,
     @@ reftable/writer.c (new)
      +			goto done;
      +		}
      +		strbuf_addstr(&cleaned_message, "\n");
     -+		log->message = cleaned_message.buf;
     ++		log->update.message = cleaned_message.buf;
      +	}
      +
     -+	w->next -= w->pending_padding;
     -+	w->pending_padding = 0;
     -+
     -+	reftable_record_from_log(&rec, log);
     -+	err = writer_add_record(w, &rec);
     -+
     ++	err = reftable_writer_add_log_verbatim(w, log);
     ++	log->update.message = input_log_message;
      +done:
     -+	log->message = input_log_message;
      +	strbuf_release(&cleaned_message);
      +	return err;
      +}
 10:  3c2de7dcc65c = 10:  1f17d5edab52 reftable: read reftable files
 11:  03681b820d13 ! 11:  c916629c562a reftable: reftable file level tests
     @@ reftable/reftable_test.c (new)
      +		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
      +
      +		log.refname = name;
     -+		log.new_hash = hash;
      +		log.update_index = update_index;
     -+		log.message = "message";
     ++		log.value_type = REFTABLE_LOG_UPDATE;
     ++		log.update.new_hash = hash;
     ++		log.update.message = "message";
      +
      +		n = reftable_writer_add_log(w, &log);
      +		EXPECT(n == 0);
     @@ reftable/reftable_test.c (new)
      +		.block_size = 4096,
      +	};
      +	int err;
     -+	struct reftable_log_record log = {
     -+		.refname = "refs/heads/master",
     -+		.name = "Han-Wen Nienhuys",
     -+		.email = "hanwen@google.com",
     -+		.tz_offset = 100,
     -+		.time = 0x5e430672,
     -+		.update_index = 0xa,
     -+		.message = "commit: 9\n",
     -+	};
     ++	struct reftable_log_record log = { .refname = "refs/heads/master",
     ++					   .update_index = 0xa,
     ++					   .value_type = REFTABLE_LOG_UPDATE,
     ++					   .update = {
     ++						   .name = "Han-Wen Nienhuys",
     ++						   .email = "hanwen@google.com",
     ++						   .tz_offset = 100,
     ++						   .time = 0x5e430672,
     ++						   .message = "commit: 9\n",
     ++					   } };
      +	struct reftable_writer *w =
      +		reftable_new_writer(&strbuf_add_void, &buf, &opts);
      +
     @@ reftable/reftable_test.c (new)
      +		hash1[i] = (uint8_t)(rand() % 256);
      +		hash2[i] = (uint8_t)(rand() % 256);
      +	}
     -+	log.old_hash = hash1;
     -+	log.new_hash = hash2;
     ++	log.update.old_hash = hash1;
     ++	log.update.new_hash = hash2;
      +	reftable_writer_set_limits(w, update_index, update_index);
      +	err = reftable_writer_add_log(w, &log);
      +	EXPECT_ERR(err);
     @@ reftable/reftable_test.c (new)
      +
      +		log.refname = names[i];
      +		log.update_index = i;
     -+		log.old_hash = hash1;
     -+		log.new_hash = hash2;
     ++		log.value_type = REFTABLE_LOG_UPDATE;
     ++		log.update.old_hash = hash1;
     ++		log.update.new_hash = hash2;
      +
      +		err = reftable_writer_add_log(w, &log);
      +		EXPECT_ERR(err);
 12:  557183d3e3e2 ! 12:  28aa69f7bbcb reftable: rest of library
     @@ Makefile: REFTABLE_OBJS += reftable/error.o
      
       ## reftable/VERSION (new) ##
      @@
     -+9b4a54059db9a05c270c0a0587f245bc6868d576 C: use rand() rather than cobbled together random generator.
     ++a337d48d4d42d513d6baa33addc172f0e0e36288 C: use tagged union for reftable_log_record
      
       ## reftable/dump.c (new) ##
      @@
     @@ reftable/stack.c (new)
      +			continue;
      +		}
      +
     -+		if (config != NULL && config->time > 0 &&
     -+		    log.time < config->time) {
     ++		if (config != NULL && config->min_update_index > 0 &&
     ++		    log.update_index < config->min_update_index) {
      +			continue;
      +		}
      +
     -+		if (config != NULL && config->min_update_index > 0 &&
     -+		    log.update_index < config->min_update_index) {
     ++		if (config != NULL && config->time > 0 &&
     ++		    log.update.time < config->time) {
      +			continue;
      +		}
      +
     @@ reftable/stack_test.c (new)
      +
      +		logs[i].refname = xstrdup(buf);
      +		logs[i].update_index = N + i + 1;
     -+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
     -+		logs[i].email = xstrdup("identity@invalid");
     -+		set_test_hash(logs[i].new_hash, i);
     ++		logs[i].value_type = REFTABLE_LOG_UPDATE;
     ++
     ++		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++		logs[i].update.email = xstrdup("identity@invalid");
     ++		set_test_hash(logs[i].update.new_hash, i);
      +	}
      +
      +	for (i = 0; i < N; i++) {
     @@ reftable/stack_test.c (new)
      +
      +	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
      +
     -+	struct reftable_log_record input = {
     -+		.refname = "branch",
     -+		.update_index = 1,
     -+		.new_hash = h1,
     -+		.old_hash = h2,
     -+	};
     ++	struct reftable_log_record input = { .refname = "branch",
     ++					     .update_index = 1,
     ++					     .value_type = REFTABLE_LOG_UPDATE,
     ++					     .update = {
     ++						     .new_hash = h1,
     ++						     .old_hash = h2,
     ++					     } };
      +	struct reftable_log_record dest = {
      +		.update_index = 0,
      +	};
     @@ reftable/stack_test.c (new)
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
      +
     -+	input.message = "one\ntwo";
     ++	input.update.message = "one\ntwo";
      +	err = reftable_stack_add(st, &write_test_log, &arg);
      +	EXPECT(err == REFTABLE_API_ERROR);
      +
     -+	input.message = "one";
     ++	input.update.message = "one";
      +	err = reftable_stack_add(st, &write_test_log, &arg);
      +	EXPECT_ERR(err);
      +
      +	err = reftable_stack_read_log(st, input.refname, &dest);
      +	EXPECT_ERR(err);
     -+	EXPECT(0 == strcmp(dest.message, "one\n"));
     ++	EXPECT(0 == strcmp(dest.update.message, "one\n"));
      +
     -+	input.message = "two\n";
     ++	input.update.message = "two\n";
      +	arg.update_index = 2;
      +	err = reftable_stack_add(st, &write_test_log, &arg);
      +	EXPECT_ERR(err);
      +	err = reftable_stack_read_log(st, input.refname, &dest);
      +	EXPECT_ERR(err);
     -+	EXPECT(0 == strcmp(dest.message, "two\n"));
     ++	EXPECT(0 == strcmp(dest.update.message, "two\n"));
      +
      +	/* cleanup */
      +	reftable_stack_destroy(st);
     @@ reftable/stack_test.c (new)
      +			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
      +			set_test_hash(refs[i].value.val1, i);
      +		}
     ++
      +		logs[i].refname = xstrdup(buf);
      +		/* update_index is part of the key. */
      +		logs[i].update_index = 42;
      +		if (i % 2 == 0) {
     -+			logs[i].new_hash = reftable_malloc(SHA1_SIZE);
     -+			set_test_hash(logs[i].new_hash, i);
     -+			logs[i].email = xstrdup("identity@invalid");
     ++			logs[i].value_type = REFTABLE_LOG_UPDATE;
     ++			logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++			set_test_hash(logs[i].update.new_hash, i);
     ++			logs[i].update.email = xstrdup("identity@invalid");
      +		}
      +	}
      +	for (i = 0; i < N; i++) {
     @@ reftable/stack_test.c (new)
      +
      +		logs[i].refname = xstrdup(buf);
      +		logs[i].update_index = i;
     -+		logs[i].time = i;
     -+		logs[i].new_hash = reftable_malloc(SHA1_SIZE);
     -+		logs[i].email = xstrdup("identity@invalid");
     -+		set_test_hash(logs[i].new_hash, i);
     ++		logs[i].value_type = REFTABLE_LOG_UPDATE;
     ++		logs[i].update.time = i;
     ++		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++		logs[i].update.email = xstrdup("identity@invalid");
     ++		set_test_hash(logs[i].update.new_hash, i);
      +	}
      +
      +	for (i = 1; i <= N; i++) {
 13:  d57023d9f13d ! 13:  bdb19af22cc7 Reftable support for git-core
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     +    Helped-by: Junio Hamano <gitster@pobox.com>
          Co-authored-by: Jeff King <peff@peff.net>
      
       ## Documentation/config/extensions.txt ##
     @@ builtin/init-db.c: int cmd_init_db(int argc, const char **argv, const char *pref
      
       ## builtin/worktree.c ##
      @@
     - #include "submodule.h"
       #include "utf8.h"
       #include "worktree.h"
     + #include "quote.h"
      +#include "../refs/refs-internal.h"
       
       static const char * const worktree_usage[] = {
     @@ refs/reftable-backend.c (new)
      +
      +static void clear_reftable_log_record(struct reftable_log_record *log)
      +{
     -+	log->old_hash = NULL;
     -+	log->new_hash = NULL;
     -+	log->message = NULL;
      +	log->refname = NULL;
     ++	switch (log->value_type) {
     ++	case REFTABLE_LOG_UPDATE:
     ++		log->update.old_hash = NULL;
     ++		log->update.new_hash = NULL;
     ++		log->update.message = NULL;
     ++		break;
     ++	case REFTABLE_LOG_DELETION:
     ++		break;
     ++	}
      +	reftable_log_record_release(log);
      +}
      +
     @@ refs/reftable-backend.c (new)
      +	assert(0 == result);
      +
      +	reftable_log_record_release(log);
     -+	log->name =
     ++	log->value_type = REFTABLE_LOG_UPDATE;
     ++	log->update.name =
      +		xstrndup(split.name_begin, split.name_end - split.name_begin);
     -+	log->email =
     ++	log->update.email =
      +		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
     -+	log->time = atol(split.date_begin);
     ++	log->update.time = atol(split.date_begin);
      +	if (*split.tz_begin == '-') {
      +		sign = -1;
      +		split.tz_begin++;
     @@ refs/reftable-backend.c (new)
      +		split.tz_begin++;
      +	}
      +
     -+	log->tz_offset = sign * atoi(split.tz_begin);
     ++	log->update.tz_offset = sign * atoi(split.tz_begin);
      +}
      +
      +static int has_suffix(struct strbuf *b, const char *suffix)
     @@ refs/reftable-backend.c (new)
      +		struct ref_update *u = sorted[i];
      +		struct reftable_log_record *log = &logs[i];
      +		fill_reftable_log_record(log);
     -+		log->refname = (char *)u->refname;
     -+		log->old_hash = u->old_oid.hash;
     -+		log->new_hash = u->new_oid.hash;
      +		log->update_index = ts;
     -+		log->message = u->msg;
     ++		log->value_type = REFTABLE_LOG_UPDATE;
     ++		log->refname = (char *)u->refname;
     ++		log->update.old_hash = u->old_oid.hash;
     ++		log->update.new_hash = u->new_oid.hash;
     ++		log->update.message = u->msg;
      +
      +		if (u->flags & REF_LOG_ONLY) {
      +			continue;
     @@ refs/reftable-backend.c (new)
      +		};
      +		struct reftable_ref_record current = { NULL };
      +		fill_reftable_log_record(&log);
     -+		log.message = xstrdup(arg->logmsg);
     -+		log.new_hash = NULL;
     -+		log.old_hash = NULL;
      +		log.update_index = ts;
      +		log.refname = (char *)arg->refnames->items[i].string;
      +
     ++		log.update.message = xstrdup(arg->logmsg);
     ++		log.update.new_hash = NULL;
     ++		log.update.old_hash = NULL;
      +		if (reftable_stack_read_ref(arg->stack, log.refname,
      +					    &current) == 0) {
     -+			log.old_hash = reftable_ref_record_val1(&current);
     ++			log.update.old_hash =
     ++				reftable_ref_record_val1(&current);
      +		}
      +		err = reftable_writer_add_log(writer, &log);
     -+		log.old_hash = NULL;
     ++		log.update.old_hash = NULL;
      +		reftable_ref_record_release(&current);
      +
      +		clear_reftable_log_record(&log);
     @@ refs/reftable-backend.c (new)
      +
      +		fill_reftable_log_record(&log);
      +		log.refname = (char *)create->refname;
     -+		log.message = (char *)create->logmsg;
      +		log.update_index = ts;
     ++		log.update.message = (char *)create->logmsg;
      +		if (refs_resolve_ref_unsafe(
      +			    (struct ref_store *)create->refs, create->refname,
      +			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
     -+			log.old_hash = old_oid.hash;
     ++			log.update.old_hash = old_oid.hash;
      +		}
      +
      +		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
      +					    create->target, RESOLVE_REF_READING,
      +					    &new_oid, NULL) != NULL) {
     -+			log.new_hash = new_oid.hash;
     ++			log.update.new_hash = new_oid.hash;
      +		}
      +
     -+		if (log.old_hash != NULL || log.new_hash != NULL) {
     ++		if (log.update.old_hash != NULL ||
     ++		    log.update.new_hash != NULL) {
      +			err = reftable_writer_add_log(writer, &log);
      +		}
      +		log.refname = NULL;
     -+		log.message = NULL;
     -+		log.old_hash = NULL;
     -+		log.new_hash = NULL;
     ++		log.update.message = NULL;
     ++		log.update.old_hash = NULL;
     ++		log.update.new_hash = NULL;
      +		clear_reftable_log_record(&log);
      +	}
      +	return err;
     @@ refs/reftable-backend.c (new)
      +
      +		todo[0].refname = (char *)arg->oldname;
      +		todo[0].update_index = ts;
     -+		todo[0].message = (char *)arg->logmsg;
     -+		todo[0].old_hash = val1;
     -+		todo[0].new_hash = NULL;
     ++		todo[0].update.message = (char *)arg->logmsg;
     ++		todo[0].update.old_hash = val1;
     ++		todo[0].update.new_hash = NULL;
      +
      +		todo[1].refname = (char *)arg->newname;
      +		todo[1].update_index = ts;
     -+		todo[1].old_hash = NULL;
     -+		todo[1].new_hash = val1;
     -+		todo[1].message = (char *)arg->logmsg;
     ++		todo[1].update.old_hash = NULL;
     ++		todo[1].update.new_hash = val1;
     ++		todo[1].update.message = (char *)arg->logmsg;
      +
      +		err = reftable_writer_add_logs(writer, todo, 2);
      +
     @@ refs/reftable-backend.c (new)
      +
      +		free(ri->last_name);
      +		ri->last_name = xstrdup(ri->log.refname);
     -+		hashcpy(ri->oid.hash, ri->log.new_hash);
     ++		hashcpy(ri->oid.hash, ri->log.update.new_hash);
      +		return ITER_OK;
      +	}
      +}
     @@ refs/reftable-backend.c (new)
      +static struct ref_iterator *
      +git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
      +{
     -+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(sizeof(*ri), 1);
     ++	struct git_reftable_reflog_ref_iterator *ri = xcalloc(1, sizeof(*ri));
      +	struct git_reftable_ref_store *refs =
      +		(struct git_reftable_ref_store *)ref_store;
      +
     @@ refs/reftable-backend.c (new)
      +			break;
      +		}
      +
     -+		hashcpy(old_oid.hash, log.old_hash);
     -+		hashcpy(new_oid.hash, log.new_hash);
     ++		hashcpy(old_oid.hash, log.update.old_hash);
     ++		hashcpy(new_oid.hash, log.update.new_hash);
      +
     -+		full_committer = fmt_ident(log.name, log.email,
     ++		full_committer = fmt_ident(log.update.name, log.update.email,
      +					   WANT_COMMITTER_IDENT,
      +					   /*date*/ NULL, IDENT_NO_DATE);
     -+		err = fn(&old_oid, &new_oid, full_committer, log.time,
     -+			 log.tz_offset, log.message, cb_data);
     ++		err = fn(&old_oid, &new_oid, full_committer, log.update.time,
     ++			 log.update.tz_offset, log.update.message, cb_data);
      +		if (err)
      +			break;
      +	}
     @@ refs/reftable-backend.c (new)
      +		struct object_id new_oid;
      +		const char *full_committer = "";
      +
     -+		hashcpy(old_oid.hash, log->old_hash);
     -+		hashcpy(new_oid.hash, log->new_hash);
     ++		hashcpy(old_oid.hash, log->update.old_hash);
     ++		hashcpy(new_oid.hash, log->update.new_hash);
      +
     -+		full_committer = fmt_ident(log->name, log->email,
     ++		full_committer = fmt_ident(log->update.name, log->update.email,
      +					   WANT_COMMITTER_IDENT, NULL,
      +					   IDENT_NO_DATE);
     -+		err = fn(&old_oid, &new_oid, full_committer, log->time,
     -+			 log->tz_offset, log->message, cb_data);
     ++		err = fn(&old_oid, &new_oid, full_committer, log->update.time,
     ++			 log->update.tz_offset, log->update.message, cb_data);
      +		if (err) {
      +			break;
      +		}
     @@ refs/reftable-backend.c (new)
      +		if (err > 0 || strcmp(log.refname, refname)) {
      +			break;
      +		}
     -+		hashcpy(ooid.hash, log.old_hash);
     -+		hashcpy(noid.hash, log.new_hash);
     ++		hashcpy(ooid.hash, log.update.old_hash);
     ++		hashcpy(noid.hash, log.update.new_hash);
      +
     -+		if (should_prune_fn(&ooid, &noid, log.email,
     -+				    (timestamp_t)log.time, log.tz_offset,
     -+				    log.message, policy_cb_data)) {
     ++		if (should_prune_fn(&ooid, &noid, log.update.email,
     ++				    (timestamp_t)log.update.time,
     ++				    log.update.tz_offset, log.update.message,
     ++				    policy_cb_data)) {
      +			add_log_tombstone(&arg, refname, log.update_index);
      +		}
      +	}
     @@ t/t0031-reftable.sh (new)
      +
      +INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
      +
     ++git_init () {
     ++	git init -b primary "$@"
     ++}
     ++
      +initialize ()  {
      +	rm -rf .git &&
     -+	git init --ref-storage=reftable &&
     ++	git_init --ref-storage=reftable &&
      +	mv .git/hooks .git/hooks-disabled
      +}
      +
      +test_expect_success 'SHA256 support, env' '
      +	rm -rf .git &&
      +	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
     -+	git init --ref-storage=reftable &&
     ++	git_init --ref-storage=reftable &&
      +	mv .git/hooks .git/hooks-disabled &&
      +	test_commit file
      +'
      +
      +test_expect_success 'SHA256 support, option' '
      +	rm -rf .git &&
     -+	git init --ref-storage=reftable --object-format=sha256 &&
     ++	git_init --ref-storage=reftable --object-format=sha256 &&
      +	mv .git/hooks .git/hooks-disabled &&
      +	test_commit file
      +'
     @@ t/t0031-reftable.sh (new)
      +	initialize &&
      +	test_commit file &&
      +	SHA=$(git show-ref -s --verify HEAD) &&
     -+	test_write_lines "$SHA refs/heads/master" "$SHA refs/tags/file" >expect &&
     -+	git show-ref > actual &&
     ++	test_write_lines "$SHA refs/heads/primary" "$SHA refs/tags/file" >expect &&
     ++	git show-ref >actual &&
      +	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
      +	test_cmp expect actual &&
      +	git update-ref -d refs/tags/file $SHA  &&
     -+	test_write_lines "$SHA refs/heads/master" >expect &&
     -+	git show-ref > actual &&
     ++	test_write_lines "$SHA refs/heads/primary" >expect &&
     ++	git show-ref >actual &&
      +	test_cmp expect actual
      +'
      +
     @@ t/t0031-reftable.sh (new)
      +test_expect_success 'basic operation of reftable storage: commit, show-ref' '
      +	initialize &&
      +	test_commit file &&
     -+	test_write_lines refs/heads/master refs/tags/file >expect &&
     ++	test_write_lines refs/heads/primary refs/tags/file >expect &&
      +	git show-ref &&
     -+	git show-ref | cut -f2 -d" " > actual &&
     ++	git show-ref | cut -f2 -d" " >actual &&
      +	test_cmp actual expect
      +'
      +
     @@ t/t0031-reftable.sh (new)
      +	git pack-refs &&
      +	ls -1 .git/reftable >table-files &&
      +	test_line_count = 2 table-files &&
     -+	git reflog refs/heads/master >output &&
     ++	git reflog refs/heads/primary >output &&
      +	test_line_count = 10 output &&
      +	grep "commit (initial): number 1" output &&
      +	grep "commit: number 10" output &&
      +	git gc &&
     -+	git reflog refs/heads/master >output &&
     ++	git reflog refs/heads/primary >output &&
      +	test_line_count = 0 output
      +'
      +
     @@ t/t0031-reftable.sh (new)
      +	test_commit file2 &&
      +	git checkout -b branch2 &&
      +	git switch - &&
     -+	git rev-parse --symbolic-full-name HEAD > actual &&
     -+	echo refs/heads/branch1 > expect &&
     ++	git rev-parse --symbolic-full-name HEAD >actual &&
     ++	echo refs/heads/branch1 >expect &&
      +	test_cmp actual expect
      +'
      +
     @@ t/t0031-reftable.sh (new)
      +	test_commit file &&
      +	git tag -m "annotated tag" test_tag HEAD &&
      +	{
     -+		print_ref "refs/heads/master" &&
     ++		print_ref "refs/heads/primary" &&
      +		print_ref "refs/tags/file" &&
      +		print_ref "refs/tags/test_tag" &&
      +		print_ref "refs/tags/test_tag^{}"
     @@ t/t0031-reftable.sh (new)
      +test_expect_success 'show-ref works on fresh repo' '
      +	initialize &&
      +	rm -rf .git &&
     -+	git init --ref-storage=reftable &&
     ++	git_init --ref-storage=reftable &&
      +	>expect &&
     -+	! git show-ref > actual &&
     ++	! git show-ref >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'checkout unborn branch' '
      +	initialize &&
     -+	git checkout -b master
     ++	git checkout -b primary
      +'
      +
      +
      +test_expect_success 'dir/file conflict' '
      +	initialize &&
      +	test_commit file &&
     -+	! git branch master/forbidden
     ++	! git branch primary/forbidden
      +'
      +
      +
      +test_expect_success 'do not clobber existing repo' '
      +	rm -rf .git &&
     -+	git init --ref-storage=files &&
     -+	cat .git/HEAD > expect &&
     ++	git_init --ref-storage=files &&
     ++	cat .git/HEAD >expect &&
      +	test_commit file &&
     -+	(git init --ref-storage=reftable || true) &&
     -+	cat .git/HEAD > actual &&
     ++	(git_init --ref-storage=reftable || true) &&
     ++	cat .git/HEAD >actual &&
      +	test_cmp expect actual
      +'
      +
     @@ t/t0031-reftable.sh (new)
      +'
      +
      +test_expect_success 'worktrees' '
     -+	git init --ref-storage=reftable start &&
     ++	git_init --ref-storage=reftable start &&
      +	(cd start && test_commit file1 && git checkout -b branch1 &&
      +	git checkout -b branch2 &&
      +	git worktree add  ../wt
     @@ t/t0031-reftable.sh (new)
      +	initialize &&
      +	test_commit file1 &&
      +	mkdir existing_empty &&
     -+	git worktree add --detach existing_empty master
     ++	git worktree add --detach existing_empty primary
      +'
      +
      +test_expect_success 'FETCH_HEAD' '
      +	initialize &&
      +	test_commit one &&
     -+	(git init sub && cd sub && test_commit two) &&
     ++	(git_init sub && cd sub && test_commit two) &&
      +	git --git-dir sub/.git rev-parse HEAD >expect &&
      +	git fetch sub &&
      +	git checkout FETCH_HEAD &&
     @@ t/t1450-fsck.sh: test_description='git fsck random collection of tests
       	git config i18n.commitencoding ISO-8859-1 &&
      
       ## t/t3210-pack-refs.sh ##
     -@@ t/t3210-pack-refs.sh: semantic is still the same.
     - '
     +@@ t/t3210-pack-refs.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     + 
       . ./test-lib.sh
       
      +if test_have_prereq REFTABLE
 14:  9df5bc69f971 = 14:  0d9477941678 git-prompt: prepare for reftable refs backend
 15:  4076ef5d20b4 = 15:  59f432522230 Add "test-tool dump-reftable" command.

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v5 01/15] init-db: set the_repository->hash_algo early on
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                           ` (14 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of hash value, but this is an exceptional circumstance, and
the reftable library does not cater to this exception. It insists that all
tables in the stack have a consistent format ID for the hash algorithm.

Call set_repo_hash_algo directly after calling validate_hash_algorithm() (which
reads $GIT_DEFAULT_HASH).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index dcc45bef5148..7766661a4755 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -438,6 +438,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt,
 				      flags & INIT_DB_QUIET);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 02/15] reftable: add LICENSE
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
                           ` (13 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 000000000000..402e0f9356ba
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 03/15] reftable: add error related functionality
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                           ` (12 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 000000000000..f6f16def9214
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 000000000000..6f89bedf1a58
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 * next_update_inde*x
+	 *  - on writing a log record with multiline message with
+	 *  exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 04/15] reftable: utility functions
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (2 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
                           ` (11 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                            |  24 +++++-
 contrib/buildsystems/CMakeLists.txt |  14 ++-
 reftable/basics.c                   | 128 ++++++++++++++++++++++++++++
 reftable/basics.h                   |  60 +++++++++++++
 reftable/basics_test.c              |  98 +++++++++++++++++++++
 reftable/publicbasics.c             |  58 +++++++++++++
 reftable/reftable-malloc.h          |  18 ++++
 reftable/reftable-tests.h           |  22 +++++
 reftable/system.h                   |  32 +++++++
 reftable/test_framework.c           |  23 +++++
 reftable/test_framework.h           |  53 ++++++++++++
 t/helper/test-reftable.c            |   9 ++
 t/helper/test-tool.c                |   3 +-
 t/helper/test-tool.h                |   1 +
 t/t0032-reftable-unittest.sh        |  15 ++++
 15 files changed, 552 insertions(+), 6 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index dfb0f1000fa3..43ed67fdb9db 100644
--- a/Makefile
+++ b/Makefile
@@ -728,6 +728,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -804,6 +805,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1172,7 +1175,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2379,10 +2382,19 @@ XDIFF_OBJS += xdiff/xpatience.o
 XDIFF_OBJS += xdiff/xprepare.o
 XDIFF_OBJS += xdiff/xutils.o
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/publicbasics.o
+
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/basics_test.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
 	$(XDIFF_OBJS) \
 	$(FUZZ_OBJS) \
+	$(REFTABLE_OBJS) \
+	$(REFTABLE_TEST_OBJS) \
 	common-main.o \
 	git.o
 ifndef NO_CURL
@@ -2534,6 +2546,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2812,7 +2830,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3142,7 +3160,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index ac3dbc079af8..986bfc04e0dc 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -601,6 +601,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
 list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
 add_library(xdiff STATIC ${libxdiff_SOURCES})
 
+#reftable
+parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
+
+list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+add_library(reftable STATIC ${reftable_SOURCES})
+
 if(WIN32)
 	if(NOT MSVC)#use windres when compiling with gcc and clang
 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
@@ -623,7 +629,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
 
-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()
@@ -855,11 +861,15 @@ if(BUILD_TESTING)
 add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
 target_link_libraries(test-fake-ssh common-main)
 
+#reftable-tests
+parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
+list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+
 #test-tool
 parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")
 
 list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
+add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
 target_link_libraries(test-tool common-main)
 
 set_target_properties(test-fake-ssh test-tool
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 000000000000..abd027b98882
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,128 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* Invariants:
+	 *
+	 *  (hi == sz) || f(hi) == true
+	 *  (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		if (f(mid, args))
+			hi = mid;
+		else
+			lo = mid;
+	}
+
+	if (lo)
+		return hi;
+
+	return f(0, args) ? 0 : 1;
+}
+
+void free_names(char **a)
+{
+	char **p;
+	if (a == NULL) {
+		return;
+	}
+	for (p = a; *p; p++) {
+		reftable_free(*p);
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	char **p = names;
+	for (; *p; p++) {
+		/* empty */
+	}
+	return p - names;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next && next < end) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(*names));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	names = reftable_realloc(names, (names_len + 1) * sizeof(*names));
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	int i = 0;
+	for (; a[i] && b[i]; i++) {
+		if (strcmp(a[i], b[i])) {
+			return 0;
+		}
+	}
+
+	return a[i] == b[i];
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	for (; p < a->len && p < b->len; p++) {
+		if (a->buf[p] != b->buf[p])
+			break;
+	}
+
+	return p;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 000000000000..096b36862b9f
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+/*
+ * miscellaneous utilities that are not provided by Git.
+ */
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+ * find smallest index i in [0, sz) at which f(i) is true, assuming
+ * that f is ascending. Return sz if f(i) is false for all indices.
+ *
+ * Contrary to bsearch(3), this returns something useful if the argument is not
+ * found.
+ */
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+ * Frees a NULL terminated array of malloced strings. The array itself is also
+ * freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. `size` is the length of the buffer,
+ * without terminating '\0'. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+struct strbuf;
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/basics_test.c b/reftable/basics_test.c
new file mode 100644
index 000000000000..6d52f0f9d5aa
--- /dev/null
+++ b/reftable/basics_test.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			EXPECT(args.key < arr[res]);
+			if (res > 0) {
+				EXPECT(args.key >= arr[res - 1]);
+			}
+		} else {
+			EXPECT(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_names_length(void)
+{
+	char *a[] = { "a", "b", NULL };
+	EXPECT(names_length(a) == 2);
+}
+
+static void test_parse_names_normal(void)
+{
+	char in[] = "a\nb\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(!strcmp(out[1], "b"));
+	EXPECT(out[2] == NULL);
+	free_names(out);
+}
+
+static void test_parse_names_drop_empty(void)
+{
+	char in[] = "a\n\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(out[1] == NULL);
+	free_names(out);
+}
+
+static void test_common_prefix(void)
+{
+	struct strbuf s1 = STRBUF_INIT;
+	struct strbuf s2 = STRBUF_INIT;
+	strbuf_addstr(&s1, "abcdef");
+	strbuf_addstr(&s2, "abc");
+	EXPECT(common_prefix_size(&s1, &s2) == 3);
+	strbuf_release(&s1);
+	strbuf_release(&s2);
+}
+
+int basics_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_parse_names_normal);
+	RUN_TEST(test_parse_names_drop_empty);
+	RUN_TEST(test_binsearch);
+	RUN_TEST(test_names_length);
+	return 0;
+}
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 000000000000..25639f61af61
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,58 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-malloc.h"
+
+#include "basics.h"
+#include "system.h"
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
new file mode 100644
index 000000000000..5f2185f1f343
--- /dev/null
+++ b/reftable/reftable-malloc.h
@@ -0,0 +1,18 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stddef.h>
+
+/* Overrides the functions to use for memory management. */
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+#endif
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 000000000000..5e7698ae654e
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int basics_test_main(int argc, const char **argv);
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 000000000000..07277ca06273
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#include <zlib.h>
+
+struct strbuf;
+/* In git, this is declared in dir.h */
+int remove_dir_recursively(struct strbuf *path, int flags);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 000000000000..a5ff4e2a2d2f
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,23 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "test_framework.h"
+
+#include "basics.h"
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 000000000000..5fdc9519a5a5
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,53 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+#include "reftable-error.h"
+
+#define EXPECT_ERR(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define EXPECT_STREQ(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define EXPECT(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+#define RUN_TEST(f)                          \
+	fprintf(stderr, "running %s\n", #f); \
+	fflush(stderr);                      \
+	f();
+
+void set_test_hash(uint8_t *p, int i);
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 000000000000..3b58e423e7b1
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,9 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	basics_test_main(argc, argv);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index f97cd9f48a69..dfa2bee9e8a2 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -49,13 +49,14 @@ static struct test_cmd cmds[] = {
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
 	{ "prio-queue", cmd__prio_queue },
-	{ "proc-receive", cmd__proc_receive},
+	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 28072c0ad5ab..cad482a78915 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -45,6 +45,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh
new file mode 100755
index 000000000000..0ed14971a580
--- /dev/null
+++ b/t/t0032-reftable-unittest.sh
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable unittests'
+
+. ./test-lib.sh
+
+test_expect_success 'unittests' '
+	TMPDIR=$(pwd) && export TMPDIR &&
+	test-tool reftable
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 05/15] reftable: add blocksource, an abstraction for random access reads
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (3 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                           ` (10 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:

* log blocks are zlib compressed, and handling them is simplified if we can
  discard byte segments from within the block layer.

* for unittests, it is useful to read and write in-memory. The blocksource
  allows us to abstract the data away from on-disk files.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                        |   1 +
 reftable/blocksource.c          | 148 ++++++++++++++++++++++++++++++++
 reftable/blocksource.h          |  22 +++++
 reftable/reftable-blocksource.h |  49 +++++++++++
 4 files changed, 220 insertions(+)
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/reftable-blocksource.h

diff --git a/Makefile b/Makefile
index 43ed67fdb9db..82db2ad5e2fd 100644
--- a/Makefile
+++ b/Makefile
@@ -2384,6 +2384,7 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 000000000000..25d4d95b52b8
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 000000000000..072e2727ad20
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "system.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
new file mode 100644
index 000000000000..5aa3990a5732
--- /dev/null
+++ b/reftable/reftable-blocksource.h
@@ -0,0 +1,49 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_BLOCKSOURCE_H
+#define REFTABLE_BLOCKSOURCE_H
+
+#include <stdint.h>
+
+/* block_source is a generic wrapper for a seekable readable file.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+ * so it can return itself into the pool. */
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 06/15] reftable: (de)serialization for the polymorphic record type.
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (4 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                           ` (9 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |    2 +
 reftable/constants.h       |   21 +
 reftable/record.c          | 1203 ++++++++++++++++++++++++++++++++++++
 reftable/record.h          |  139 +++++
 reftable/record_test.c     |  405 ++++++++++++
 reftable/reftable-record.h |  114 ++++
 t/helper/test-reftable.c   |    2 +-
 7 files changed, 1885 insertions(+), 1 deletion(-)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/reftable-record.h

diff --git a/Makefile b/Makefile
index 82db2ad5e2fd..e746c42f9e3b 100644
--- a/Makefile
+++ b/Makefile
@@ -2386,7 +2386,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 000000000000..5eee72c4c113
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 000000000000..d9e4c7bde411
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1203 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable-error.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL1:
+		return rec->value.val1;
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.value;
+	default:
+		return NULL;
+	}
+}
+
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.target_value;
+	default:
+		return NULL;
+	}
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	 * fields. */
+	reftable_ref_record_release(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+	ref->update_index = src->update_index;
+	ref->value_type = src->value_type;
+	switch (src->value_type) {
+	case REFTABLE_REF_DELETION:
+		break;
+	case REFTABLE_REF_VAL1:
+		ref->value.val1 = reftable_malloc(hash_size);
+		memcpy(ref->value.val1, src->value.val1, hash_size);
+		break;
+	case REFTABLE_REF_VAL2:
+		ref->value.val2.value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
+		ref->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.target_value,
+		       src->value.val2.target_value, hash_size);
+		break;
+	case REFTABLE_REF_SYMREF:
+		ref->value.symref = xstrdup(src->value.symref);
+		break;
+	}
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		printf("=> %s", ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		hex_format(hex, ref->value.val2.value, hash_size(hash_id));
+		printf("val 2 %s", hex);
+		hex_format(hex, ref->value.val2.target_value,
+			   hash_size(hash_id));
+		printf("(T %s)", hex);
+		break;
+	case REFTABLE_REF_VAL1:
+		hex_format(hex, ref->value.val1, hash_size(hash_id));
+		printf("val 1 %s", hex);
+		break;
+	case REFTABLE_REF_DELETION:
+		printf("delete");
+		break;
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_release_void(void *rec)
+{
+	reftable_ref_record_release((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_release(struct reftable_ref_record *ref)
+{
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		reftable_free(ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		reftable_free(ref->value.val2.target_value);
+		reftable_free(ref->value.val2.value);
+		break;
+	case REFTABLE_REF_VAL1:
+		reftable_free(ref->value.val1);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	reftable_free(ref->refname);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	return r->value_type;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	switch (r->value_type) {
+	case REFTABLE_REF_SYMREF:
+		n = encode_string(r->value.symref, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		break;
+	case REFTABLE_REF_VAL2:
+		if (s.len < 2 * hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val2.value, hash_size);
+		string_view_consume(&s, hash_size);
+		memcpy(s.buf, r->value.val2.target_value, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_VAL1:
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val1, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	uint64_t update_index = 0;
+	int n = get_var_int(&update_index, &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	reftable_ref_record_release(r);
+
+	assert(hash_size > 0);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->update_index = update_index;
+	r->refname[key.len] = 0;
+	r->value_type = val_type;
+	switch (val_type) {
+	case REFTABLE_REF_VAL1:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		r->value.val1 = reftable_malloc(hash_size);
+		memcpy(r->value.val1, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_VAL2:
+		if (in.len < 2 * hash_size) {
+			return -1;
+		}
+
+		r->value.val2.value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+
+		r->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_SYMREF: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		r->value.symref = dest.buf;
+	} break;
+
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.release = &reftable_ref_record_release_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_release(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_release(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.release = &reftable_obj_record_release,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	switch (log->value_type) {
+	case REFTABLE_LOG_DELETION:
+		printf("log{%s(%" PRIu64 ") delete", log->refname,
+		       log->update_index);
+		break;
+	case REFTABLE_LOG_UPDATE:
+		printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n",
+		       log->refname, log->update_index, log->update.name,
+		       log->update.email, log->update.time,
+		       log->update.tz_offset);
+		hex_format(hex, log->update.old_hash, hash_size(hash_id));
+		printf("%s => ", hex);
+		hex_format(hex, log->update.new_hash, hash_size(hash_id));
+		printf("%s\n\n%s\n}\n", hex, log->update.message);
+		break;
+	}
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_release(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	switch (dst->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		if (dst->update.email != NULL) {
+			dst->update.email = xstrdup(dst->update.email);
+		}
+		if (dst->update.name != NULL) {
+			dst->update.name = xstrdup(dst->update.name);
+		}
+		if (dst->update.message != NULL) {
+			dst->update.message = xstrdup(dst->update.message);
+		}
+
+		if (dst->update.new_hash != NULL) {
+			dst->update.new_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.new_hash, src->update.new_hash,
+			       hash_size);
+		}
+		if (dst->update.old_hash != NULL) {
+			dst->update.old_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.old_hash, src->update.old_hash,
+			       hash_size);
+		}
+		break;
+	}
+}
+
+static void reftable_log_record_release_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_release(r);
+}
+
+void reftable_log_record_release(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	switch (r->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		reftable_free(r->update.new_hash);
+		reftable_free(r->update.old_hash);
+		reftable_free(r->update.name);
+		reftable_free(r->update.email);
+		reftable_free(r->update.message);
+		break;
+	}
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = NULL;
+	uint8_t *newh = NULL;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	oldh = r->update.old_hash;
+	newh = r->update.new_hash;
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->update.name ? r->update.name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->update.email ? r->update.email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->update.time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->update.tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->update.message ? r->update.message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type != r->value_type) {
+		switch (r->value_type) {
+		case REFTABLE_LOG_UPDATE:
+			FREE_AND_NULL(r->update.old_hash);
+			FREE_AND_NULL(r->update.new_hash);
+			FREE_AND_NULL(r->update.message);
+			FREE_AND_NULL(r->update.email);
+			FREE_AND_NULL(r->update.name);
+			break;
+		case REFTABLE_LOG_DELETION:
+			break;
+		}
+	}
+
+	r->value_type = val_type;
+	if (val_type == REFTABLE_LOG_DELETION)
+		return 0;
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->update.old_hash = reftable_realloc(r->update.old_hash, hash_size);
+	r->update.new_hash = reftable_realloc(r->update.new_hash, hash_size);
+
+	memcpy(r->update.old_hash, in.buf, hash_size);
+	memcpy(r->update.new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.name = reftable_realloc(r->update.name, dest.len + 1);
+	memcpy(r->update.name, dest.buf, dest.len);
+	r->update.name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.email = reftable_realloc(r->update.email, dest.len + 1);
+	memcpy(r->update.email, dest.buf, dest.len);
+	r->update.email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->update.time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->update.tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.message = reftable_realloc(r->update.message, dest.len + 1);
+	memcpy(r->update.message, dest.buf, dest.len);
+	r->update.message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	if (!(null_streq(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_LOG_DELETION:
+		return 1;
+	case REFTABLE_LOG_UPDATE:
+		return null_streq(a->update.name, b->update.name) &&
+		       a->update.time == b->update.time &&
+		       a->update.tz_offset == b->update.tz_offset &&
+		       null_streq(a->update.email, b->update.email) &&
+		       null_streq(a->update.message, b->update.message) &&
+		       zero_hash_eq(a->update.old_hash, b->update.old_hash,
+				    hash_size) &&
+		       zero_hash_eq(a->update.new_hash, b->update.new_hash,
+				    hash_size);
+	}
+
+	abort();
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.release = &reftable_log_record_release_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_release(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_release(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.release = &reftable_index_record_release,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_release(struct reftable_record *rec)
+{
+	rec->ops->release(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	if (!(0 == strcmp(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_REF_SYMREF:
+		return !strcmp(a->value.symref, b->value.symref);
+	case REFTABLE_REF_VAL2:
+		return hash_equal(a->value.val2.value, b->value.val2.value,
+				  hash_size) &&
+		       hash_equal(a->value.val2.target_value,
+				  b->value.val2.target_value, hash_size);
+	case REFTABLE_REF_VAL1:
+		return hash_equal(a->value.val1, b->value.val1, hash_size);
+	case REFTABLE_REF_DELETION:
+		return 1;
+	default:
+		abort();
+	}
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value_type == REFTABLE_REF_DELETION;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->value_type == REFTABLE_LOG_DELETION);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 000000000000..498e8c50bf4f
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,139 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "system.h"
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+ * A substring of existing string data. This structure takes no responsibility
+ * for the lifetime of the data it points to.
+ */
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*release)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+ * number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_release(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 000000000000..da7951bd480b
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,405 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		EXPECT(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		EXPECT(n > 0);
+
+		EXPECT(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		in.value_type = i;
+		switch (i) {
+		case REFTABLE_REF_DELETION:
+			break;
+		case REFTABLE_REF_VAL1:
+			in.value.val1 = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val1, 1);
+			break;
+		case REFTABLE_REF_VAL2:
+			in.value.val2.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.value, 1);
+			in.value.val2.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.target_value, 2);
+			break;
+		case REFTABLE_REF_SYMREF:
+			in.value.symref = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		EXPECT(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE));
+		reftable_record_release(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_release(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_release(&in[0]);
+	reftable_log_record_release(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.old_hash = reftable_malloc(SHA1_SIZE),
+				.new_hash = reftable_malloc(SHA1_SIZE),
+				.name = xstrdup("han-wen"),
+				.email = xstrdup("hanwen@google.com"),
+				.message = xstrdup("test"),
+				.time = 1577123507,
+				.tz_offset = 100,
+			}
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+			.value_type = REFTABLE_LOG_DELETION,
+		}
+	};
+	set_test_hash(in[0].update.new_hash, 1);
+	set_test_hash(in[0].update.old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.new_hash = reftable_calloc(SHA1_SIZE),
+				.old_hash = reftable_calloc(SHA1_SIZE),
+				.name = xstrdup("old name"),
+				.email = xstrdup("old@email"),
+				.message = xstrdup("old message"),
+			},
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_release(&in[i]);
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	EXPECT(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	EXPECT(!restart);
+	EXPECT(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	EXPECT(n == m);
+	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
+	EXPECT(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
+		EXPECT(in.offset_len == out.offset_len);
+
+		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		EXPECT(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	EXPECT(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	EXPECT(m == n);
+
+	EXPECT(in.offset == out.offset);
+
+	reftable_record_release(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_log_record_equal);
+	RUN_TEST(test_reftable_log_record_roundtrip);
+	RUN_TEST(test_reftable_ref_record_roundtrip);
+	RUN_TEST(test_varint_roundtrip);
+	RUN_TEST(test_key_roundtrip);
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_reftable_obj_record_roundtrip);
+	RUN_TEST(test_reftable_index_record_roundtrip);
+	RUN_TEST(test_u24_roundtrip);
+	return 0;
+}
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
new file mode 100644
index 000000000000..7985b94ae2cc
--- /dev/null
+++ b/reftable/reftable-record.h
@@ -0,0 +1,114 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_RECORD_H
+#define REFTABLE_RECORD_H
+
+#include <stdint.h>
+
+/*
+ * Basic data types
+ *
+ * Reftables store the state of each ref in struct reftable_ref_record, and they
+ * store a sequence of reflog updates in struct reftable_log_record.
+ */
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				* written */
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_REF_DELETION = 0x0,
+
+		/* a simple ref */
+		REFTABLE_REF_VAL1 = 0x1,
+		/* a tag, plus its peeled hash */
+		REFTABLE_REF_VAL2 = 0x2,
+
+		/* a symbolic reference */
+		REFTABLE_REF_SYMREF = 0x3,
+#define REFTABLE_NR_REF_VALUETYPES 4
+	} value_type;
+	union {
+		uint8_t *val1; /* malloced hash. */
+		struct {
+			uint8_t *value; /* first value, malloced hash  */
+			uint8_t *target_value; /* second value, malloced hash */
+		} val2;
+		char *symref; /* referent, malloced 0-terminated string */
+	} value;
+};
+
+/* Returns the first hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec);
+
+/* Returns the second hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec);
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout. Useful for debugging. */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values inside `ref`. */
+void reftable_ref_record_release(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same. Useful for testing. */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_LOG_DELETION = 0x0,
+
+		/* a simple update */
+		REFTABLE_LOG_UPDATE = 0x1,
+#define REFTABLE_NR_LOG_VALUETYPES 2
+	} value_type;
+
+	union {
+		struct {
+			uint8_t *new_hash;
+			uint8_t *old_hash;
+			char *name;
+			char *email;
+			uint64_t time;
+			int16_t tz_offset;
+			char *message;
+		} update;
+	};
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_release(struct reftable_log_record *log);
+
+/* returns whether two records are equal. Useful for testing. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b58e423e7b1..09d4b83ef9b8 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,6 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
-
+	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 07/15] reftable: reading/writing blocks
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (5 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                           ` (8 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.

This commit provides the logic to read and write these blocks.

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 440 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 127 +++++++++++
 reftable/block_test.c    | 121 +++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 785 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index e746c42f9e3b..3499996aa49e 100644
--- a/Makefile
+++ b/Makefile
@@ -2384,10 +2384,13 @@ XDIFF_OBJS += xdiff/xutils.o
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 000000000000..f44451a37954
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 000000000000..5b4e49fb2be2
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,440 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_release(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 000000000000..e207706a6448
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,127 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Writes reftable blocks. The block_writer is reused across blocks to minimize
+ * allocation overhead.
+ */
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+ * initializes the blockwriter to write `typ` entries, using `buf` as temporary
+ * storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/* returns the block type (eg. 'r' for ref records. */
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_release(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 000000000000..75fe198e6773
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,121 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value_type = REFTABLE_REF_DELETION;
+		EXPECT(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	EXPECT(n > 0);
+
+	block_writer_release(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT_STREQ(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_release(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+
+		EXPECT_STREQ(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_release(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_block_read_write);
+	return 0;
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 000000000000..3e0b0f24f1c7
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 09d4b83ef9b8..c9deeaf08c7a 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,7 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 08/15] reftable: a generic binary tree implementation
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (6 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                           ` (7 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.

The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  4 ++-
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 61 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index 3499996aa49e..57a61b359ea7 100644
--- a/Makefile
+++ b/Makefile
@@ -2388,12 +2388,14 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
-REFTABLE_TEST_OBJS += reftable/basics_test.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 000000000000..0061d14e306c
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 000000000000..fbdd002e23af
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+ * is set, insert the key if it's not found. Else, return NULL.
+ */
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+ * deallocates the tree nodes recursively. Keys should be deallocated separately
+ * by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 000000000000..26d1e694252e
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,61 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_tree);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c9deeaf08c7a..050551fa6985 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 09/15] reftable: write reftable files
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (7 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
                           ` (6 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   1 +
 reftable/reftable-writer.h | 147 ++++++++
 reftable/writer.c          | 691 +++++++++++++++++++++++++++++++++++++
 reftable/writer.h          |  50 +++
 4 files changed, 889 insertions(+)
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 57a61b359ea7..d37dd79bfab7 100644
--- a/Makefile
+++ b/Makefile
@@ -2389,6 +2389,7 @@ REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
new file mode 100644
index 000000000000..9d2f8d605558
--- /dev/null
+++ b/reftable/reftable-writer.h
@@ -0,0 +1,147 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_WRITER_H
+#define REFTABLE_WRITER_H
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/* Writing single reftables */
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	unsigned unpadded : 1;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	unsigned skip_index_objects : 1;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	unsigned skip_name_check : 1;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	unsigned exact_log_message : 1;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* Set the range of update indices for the records we will add. When writing a
+   table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates to a stack, typically min==max, and the
+   update_index can be obtained by inspeciting the stack. When converting an
+   existing ref database into a single reftable, this would be a range of
+   update-index timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/*
+  Add a reftable_ref_record. The record should have names that come after
+  already added records.
+
+  The update_index must be within the limits set by
+  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
+  REFTABLE_API_ERROR error to write a ref record after a log record.
+*/
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/*
+  Convenience function to add multiple reftable_ref_records; the function sorts
+  the records before adding them, reordering the records array passed in.
+*/
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/*
+  adds reftable_log_records. Log records are keyed by (refname, decreasing
+  update_index). The key for the record added must come after the already added
+  log records.
+*/
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/*
+  Convenience function to add multiple reftable_log_records; the function sorts
+  the records before adding them, reordering records array passed in.
+*/
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+#endif
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 000000000000..d42ca8afac11
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,691 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "tree.h"
+#include "reftable-error.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val1(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val2(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, reftable_ref_record_val2(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+static int reftable_writer_add_log_verbatim(struct reftable_writer *w,
+					    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	return writer_add_record(w, &rec);
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	char *input_log_message = NULL;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err = 0;
+
+	if (log->value_type == REFTABLE_LOG_DELETION)
+		return reftable_writer_add_log_verbatim(w, log);
+
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	input_log_message = log->update.message;
+	if (!w->opts.exact_log_message && log->update.message != NULL) {
+		strbuf_addstr(&cleaned_message, log->update.message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->update.message = cleaned_message.buf;
+	}
+
+	err = reftable_writer_add_log_verbatim(w, log);
+	log->update.message = input_log_message;
+done:
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_release(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 000000000000..4921c249d06b
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "tree.h"
+#include "reftable-writer.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	 * tree for use with tsearch; used to populate the 'o' inverse OID
+	 * map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 10/15] reftable: read reftable files
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (8 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
                           ` (5 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This supports reading a single reftable file.

The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |   3 +
 reftable/iter.c              | 248 ++++++++++++
 reftable/iter.h              |  75 ++++
 reftable/reader.c            | 733 +++++++++++++++++++++++++++++++++++
 reftable/reader.h            |  75 ++++
 reftable/reftable-iterator.h |  37 ++
 reftable/reftable-reader.h   |  89 +++++
 7 files changed, 1260 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-reader.h

diff --git a/Makefile b/Makefile
index d37dd79bfab7..aba1adc10b75 100644
--- a/Makefile
+++ b/Makefile
@@ -2386,8 +2386,11 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 000000000000..b8c091ffd570
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,248 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable-error.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { NULL };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if (ref->value_type == REFTABLE_REF_VAL2 &&
+		    (!memcmp(fri->oid.buf, ref->value.val2.target_value,
+			     fri->oid.len) ||
+		     !memcmp(fri->oid.buf, ref->value.val2.value,
+			     fri->oid.len)))
+			return 0;
+
+		if (ref->value_type == REFTABLE_REF_VAL1 &&
+		    !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_release(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+		/* BUG */
+		if (!memcmp(it->oid.buf, ref->value.val2.target_value,
+			    it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 000000000000..2ba964c29a3f
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "system.h"
+#include "block.h"
+#include "record.h"
+
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+ * iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+ * but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 000000000000..a5b7b1d4fefa
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,733 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { NULL };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { NULL };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { NULL };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_release(&want_index_rec);
+	reftable_record_release(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_release(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 000000000000..6d4927e1c5d6
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,75 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+#endif
diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h
new file mode 100644
index 000000000000..d8251a45ba19
--- /dev/null
+++ b/reftable/reftable-iterator.h
@@ -0,0 +1,37 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ITERATOR_H
+#define REFTABLE_ITERATOR_H
+
+#include "reftable-record.h"
+
+/* iterator is the generic interface for walking over data stored in a
+ * reftable.
+ */
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+#endif
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
new file mode 100644
index 000000000000..804e28543b35
--- /dev/null
+++ b/reftable/reftable-reader.h
@@ -0,0 +1,89 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_READER_H
+#define REFTABLE_READER_H
+
+#include "reftable-iterator.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Reading single tables
+ *
+ * The follow routines are for reading single files. For an application-level
+ * interface, skip ahead to struct reftable_merged_table and struct
+ * reftable_stack.
+ */
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
+ * code and sets pp. The name is used for creating a stack. Typically, it is the
+ * basename of the file. The block source `src` is owned by the reader, and is
+ * closed on calling reftable_reader_destroy().
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+     err = reftable_iterator_next_ref(&it, &ref);
+     if (err > 0) {
+       break;
+     }
+     if (err < 0) {
+       ..error handling..
+     }
+     ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_release(&ref);
+ */
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+ */
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 11/15] reftable: reftable file level tests
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (9 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
                           ` (4 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.

Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 583 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 585 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index aba1adc10b75..b0152d9f5f84 100644
--- a/Makefile
+++ b/Makefile
@@ -2398,6 +2398,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 000000000000..45c84fa6ecfd
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,583 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-stack.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.update_index = update_index;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.update_index = update_index;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.new_hash = hash;
+		log.update.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = { .refname = "refs/heads/master",
+					   .update_index = 0xa,
+					   .value_type = REFTABLE_LOG_UPDATE,
+					   .update = {
+						   .name = "Han-Wen Nienhuys",
+						   .email = "hanwen@google.com",
+						   .tz_offset = 100,
+						   .time = 0x5e430672,
+						   .message = "commit: 9\n",
+					   } };
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.update.old_hash = hash1;
+	log.update.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.old_hash = hash1;
+		log.update.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
+		EXPECT(i == ref.value.val1[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value_type = REFTABLE_REF_VAL2;
+		ref.value.val2.value = hash1;
+		ref.value.val2.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_log_write_read);
+	RUN_TEST(test_table_read_write_seek_linear_sha256);
+	RUN_TEST(test_log_buffer_size);
+	RUN_TEST(test_table_write_small_table);
+	RUN_TEST(test_buffer);
+	RUN_TEST(test_table_read_api);
+	RUN_TEST(test_table_read_write_sequential);
+	RUN_TEST(test_table_read_write_seek_linear);
+	RUN_TEST(test_table_read_write_seek_index);
+	RUN_TEST(test_table_refs_for_no_index);
+	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_table_empty);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa6985..fdf925867375 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 12/15] reftable: rest of library
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (10 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
                           ` (3 subsequent siblings)
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                    |    4 +
 reftable/VERSION            |    1 +
 reftable/dump.c             |  212 ++++++
 reftable/merged.c           |  366 ++++++++++
 reftable/merged.h           |   35 +
 reftable/merged_test.c      |  343 ++++++++++
 reftable/pq.c               |  115 ++++
 reftable/pq.h               |   32 +
 reftable/refname.c          |  209 ++++++
 reftable/refname.h          |   29 +
 reftable/refname_test.c     |  102 +++
 reftable/reftable-generic.h |   48 ++
 reftable/reftable-merged.h  |   68 ++
 reftable/reftable-stack.h   |  120 ++++
 reftable/reftable.c         |   98 +++
 reftable/stack.c            | 1260 +++++++++++++++++++++++++++++++++++
 reftable/stack.h            |   41 ++
 reftable/stack_test.c       |  809 ++++++++++++++++++++++
 t/helper/test-reftable.c    |    3 +
 19 files changed, 3895 insertions(+)
 create mode 100644 reftable/VERSION
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c

diff --git a/Makefile b/Makefile
index b0152d9f5f84..a00cc17ec781 100644
--- a/Makefile
+++ b/Makefile
@@ -2387,10 +2387,14 @@ REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
+REFTABLE_OBJS += reftable/merged.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/VERSION b/reftable/VERSION
new file mode 100644
index 000000000000..b4d6be9f50f3
--- /dev/null
+++ b/reftable/VERSION
@@ -0,0 +1 @@
+a337d48d4d42d513d6baa33addc172f0e0e36288 C: use tagged union for reftable_log_record
diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 000000000000..00b444e8c9b8
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { 0 };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 000000000000..ed02625bb548
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,366 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "reftable-merged.h"
+#include "reftable-error.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_release(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_release(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_release(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int reftable_table_seek_record(struct reftable_table *tab,
+				      struct reftable_iterator *it,
+				      struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 000000000000..8c4d4d58d77a
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,35 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_release(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 000000000000..09a70aa0fda3
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,343 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-merged.h"
+#include "reftable-tests.h"
+#include "reftable-generic.h"
+#include "reftable-stack.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		EXPECT_ERR(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	EXPECT_ERR(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+	EXPECT(ref.update_index == 2);
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = {
+		{
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "c",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		}
+	};
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	EXPECT_ERR(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	EXPECT_ERR(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_merged_between);
+	RUN_TEST(test_pq);
+	RUN_TEST(test_merged);
+	RUN_TEST(test_default_write_opts);
+	return 0;
+}
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 000000000000..8918d158e2d4
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 000000000000..385d2fb139a6
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 000000000000..0f4eb3b292d3
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable-error.h"
+#include "basics.h"
+#include "refname.h"
+#include "reftable-iterator.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static void modification_release(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_release(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 000000000000..a24b40fcb428
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,29 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable-record.h"
+#include "reftable-generic.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 000000000000..5e005d6af31c
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,102 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-writer.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "destination", /* make sure it's not a symref.
+						*/
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		EXPECT(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_conflict);
+	return 0;
+}
diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h
new file mode 100644
index 000000000000..77eca3b4eb0f
--- /dev/null
+++ b/reftable/reftable-generic.h
@@ -0,0 +1,48 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_GENERIC_H
+#define REFTABLE_GENERIC_H
+
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-merged.h"
+
+/*
+ * Provides a unified API for reading tables, either merged tables, or single
+ * readers. */
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+#endif
diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h
new file mode 100644
index 000000000000..0e8ebe5a9956
--- /dev/null
+++ b/reftable/reftable-merged.h
@@ -0,0 +1,68 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_MERGED_H
+#define REFTABLE_MERGED_H
+
+#include "reftable-iterator.h"
+
+/*
+ * Merged tables
+ *
+ * A ref database kept in a sequence of table files. The merged_table presents a
+ * unified view to reading (seeking, iterating) a sequence of immutable tables.
+ *
+ * The merged tables are on purpose kept disconnected from their actual storage
+ * (eg. files on disk), because it is useful to merge tables aren't files. For
+ * example, the per-workspace and global ref namespace can be implemented as a
+ * merged table of two stacks of file-backed reftables.
+ */
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+#endif
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
new file mode 100644
index 000000000000..b7060f111e89
--- /dev/null
+++ b/reftable/reftable-stack.h
@@ -0,0 +1,120 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_STACK_H
+#define REFTABLE_STACK_H
+
+#include "reftable-writer.h"
+
+/*
+ * The stack presents an interface to a mutable sequence of reftables.
+
+ * A stack can be mutated by pushing a table to the top of the stack.
+
+ * The reftable_stack automatically compacts files on disk to ensure good
+ * amortized performance.
+ *
+ * For windows and other platforms that cannot have open files as rename
+ * destinations, concurrent access from multiple processes needs the rand()
+ * random seed to be randomized.
+ */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+ *  stored in 'dir'. Typically, this should be .git/reftables.
+ */
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+ * returns a new transaction to add reftables to the given stack. As a side
+ * effect, the ref database is locked.
+ */
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+ * transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+ * reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+ * next write or reload, and should not be closed or deleted.
+ */
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 000000000000..dc4fd03d5b25
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+#include "reader.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 000000000000..cff21c814dff
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1260 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-record.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+static int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		for (i = 0; i < st->readers_len; i++) {
+			reftable_reader_free(st->readers[i]);
+		}
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			strbuf_addstr(&table_path, st->reftable_dir);
+			strbuf_addstr(&table_path, "/");
+			strbuf_addstr(&table_path, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_release(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	uint32_t rnd = (uint32_t)rand();
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x",
+		 min, max, rnd);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_reset(&nm);
+		strbuf_addstr(&nm, add->stack->reftable_dir);
+		strbuf_addstr(&nm, "/");
+		strbuf_addstr(&nm, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	/* success, no more state to clean up. */
+	strbuf_release(&add->lock_file_name);
+	for (i = 0; i < add->new_tables_len; i++) {
+		reftable_free(add->new_tables[i]);
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	err = reftable_stack_reload(add->stack);
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&temp_tab_file_name, "/");
+	strbuf_addbuf(&temp_tab_file_name, &next_name);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
+	strbuf_addstr(&tab_file_name, "/");
+	strbuf_addbuf(&tab_file_name, &next_name);
+
+	/*
+	  On windows, this relies on rand() picking a unique destination name.
+	  Maybe we should do retry loop as well?
+	 */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[last]));
+
+	strbuf_reset(temp_tab);
+	strbuf_addstr(temp_tab, st->reftable_dir);
+	strbuf_addstr(temp_tab, "/");
+	strbuf_addbuf(temp_tab, &next_name);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.update.time < config->time) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_release(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_release(&ref);
+	reftable_log_record_release(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
+		strbuf_addstr(&subtab_file_name, "/");
+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	strbuf_reset(&new_table_path);
+	strbuf_addstr(&new_table_path, st->reftable_dir);
+	strbuf_addstr(&new_table_path, "/");
+	strbuf_addbuf(&new_table_path, &new_table_name);
+
+	if (!is_empty_table) {
+		/* retry? */
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_release(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 000000000000..f57005846e56
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "system.h"
+#include "reftable-writer.h"
+#include "reftable-stack.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 000000000000..c215b61e0a16
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,809 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+static char *get_tmp_template(const char *prefix)
+{
+	const char *tmp = getenv("TMPDIR");
+	static char template[1024];
+	snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX",
+		 tmp ? tmp : "/tmp", prefix);
+	return template;
+}
+
+static void test_read_file(void)
+{
+	char *fn = get_tmp_template(__FUNCTION__);
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	EXPECT(fd > 0);
+	n = write(fd, out, strlen(out));
+	EXPECT(n == strlen(out));
+	err = close(fd);
+	EXPECT(err >= 0);
+
+	err = read_lines(fn, &names);
+	EXPECT_ERR(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		EXPECT(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	EXPECT(NULL != names[0]);
+	EXPECT(0 == strcmp(names[0], "line"));
+	EXPECT(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	EXPECT(names_equal(a, a));
+	EXPECT(!names_equal(a, b));
+	EXPECT(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+
+	EXPECT(mkdtemp(dir));
+
+	/* simulate multi-process access to the same stack
+	   by creating two stacks for the same directory.
+	 */
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT_ERR(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_commit(add);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(REFTABLE_REF_SYMREF == dest.value_type);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		EXPECT(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		refs[i].value_type = REFTABLE_REF_VAL1;
+		refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+		set_test_hash(refs[i].value.val1, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+
+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_release(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_release(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = { .refname = "branch",
+					     .update_index = 1,
+					     .value_type = REFTABLE_LOG_UPDATE,
+					     .update = {
+						     .new_hash = h1,
+						     .old_hash = h2,
+					     } };
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	input.update.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	input.update.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "one\n"));
+
+	input.update.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_release(&dest);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value_type = REFTABLE_REF_VAL1;
+			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value.val1, i);
+		}
+
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].value_type = REFTABLE_LOG_UPDATE;
+			logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].update.new_hash, i);
+			logs[i].update.email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_log_record_release(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+	reftable_log_record_release(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	EXPECT_ERR(err);
+
+	EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE));
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	EXPECT(1 == fastlog2(3));
+	EXPECT(2 == fastlog2(4));
+	EXPECT(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(segs[2].log == 3);
+	EXPECT(segs[2].start == 5);
+	EXPECT(segs[2].end == 6);
+
+	EXPECT(segs[1].log == 2);
+	EXPECT(segs[1].start == 2);
+	EXPECT(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
+	EXPECT(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 1);
+	EXPECT(segs[0].start == 0);
+	EXPECT(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(min.start == 2);
+	EXPECT(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+		logs[i].update.time = i;
+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	EXPECT_ERR(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	EXPECT_ERR(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+	reftable_log_record_release(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_stack *st2 = NULL;
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+	clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err, i;
+	int N = 100;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+
+		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_stack_uptodate);
+	RUN_TEST(test_reftable_stack_transaction_api);
+	RUN_TEST(test_reftable_stack_hash_id);
+	RUN_TEST(test_sizes_to_segments_all_equal);
+	RUN_TEST(test_reftable_stack_auto_compaction);
+	RUN_TEST(test_reftable_stack_validate_refname);
+	RUN_TEST(test_reftable_stack_update_index_check);
+	RUN_TEST(test_reftable_stack_lock_failure);
+	RUN_TEST(test_reftable_stack_log_normalize);
+	RUN_TEST(test_reftable_stack_tombstone);
+	RUN_TEST(test_reftable_stack_add_one);
+	RUN_TEST(test_empty_add);
+	RUN_TEST(test_reflog_expire);
+	RUN_TEST(test_suggest_compaction_segment);
+	RUN_TEST(test_suggest_compaction_segment_nothing);
+	RUN_TEST(test_sizes_to_segments);
+	RUN_TEST(test_sizes_to_segments_empty);
+	RUN_TEST(test_log2);
+	RUN_TEST(test_parse_names);
+	RUN_TEST(test_read_file);
+	RUN_TEST(test_names_equal);
+	RUN_TEST(test_reftable_stack_add);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index fdf925867375..3b702f4855e6 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,8 +5,11 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 13/15] Reftable support for git-core
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (11 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-03-23 11:40           ` Derrick Stolee
  2021-03-12 20:19         ` [PATCH v5 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
                           ` (2 subsequent siblings)
  15 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a reftable-powered
ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or setting
GIT_TEST_REFTABLE in the environment.

Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Junio Hamano <gitster@pobox.com>
Co-authored-by: Jeff King <peff@peff.net>
---
 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    4 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1446 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  203 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 21 files changed, 1812 insertions(+), 33 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdcad..82c5940f1434 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,12 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1].  Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
++
+extensions.refStorage::
+	Specify the ref storage mechanism to use.  The acceptable values are `files` and
+	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
+	this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1].  Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ffde..725762358333 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index a00cc17ec781..90c204501739 100644
--- a/Makefile
+++ b/Makefile
@@ -967,6 +967,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
@@ -2401,8 +2402,11 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/builtin/clone.c b/builtin/clone.c
index 51e844a2de0a..6eee72d1ffe5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1139,7 +1139,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1277,7 +1277,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo, 1);
+		initialize_repository_version(hash_algo, 1,
+					      default_ref_storage());
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 7766661a4755..c4fb62056344 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -179,12 +179,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    !strcmp(ref_storage_format, "reftable"))
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -238,6 +240,7 @@ static int create_default_files(const char *template_path,
 	is_bare_repository_cfg = init_is_bare_repository;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -247,6 +250,24 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
+	/*
+	 * Check to see if .git/HEAD exists; this must happen before
+	 * initializing the ref db, because we want to see if there is an
+	 * existing HEAD.
+	 */
+	path = git_path_buf(&buf, "HEAD");
+	reinit = (!access(path, R_OK) ||
+		  readlink(path, junk, sizeof(junk) - 1) != -1);
+
+	/*
+	 * refs/heads is a file when using reftable. We can't reinitialize with
+	 * a reftable because it will overwrite HEAD
+	 */
+	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
+			      is_directory(git_path_buf(&buf, "refs/heads"))) {
+		die("cannot switch ref storage format.");
+	}
+
 	/*
 	 * We need to create a "refs" dir in any case so that older
 	 * versions of git can tell that this is a repository.
@@ -261,9 +282,6 @@ static int create_default_files(const char *template_path,
 	 * Point the HEAD symref to the initial branch with if HEAD does
 	 * not yet exist.
 	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
 	if (!reinit) {
 		char *ref;
 
@@ -280,7 +298,7 @@ static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
@@ -396,7 +414,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash, const char *initial_branch,
-	    unsigned int flags)
+	    const char *ref_storage_format, unsigned int flags)
 {
 	int reinit;
 	int exist_ok = flags & INIT_DB_EXIST_OK;
@@ -435,6 +453,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
 	check_repository_format(&repo_fmt);
+	repo_fmt.ref_storage = xstrdup(ref_storage_format);
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
@@ -489,6 +508,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
 		git_config_set("receive.denyNonFastforwards", "true");
 	}
 
+	if (!strcmp(ref_storage_format, "reftable"))
+		git_config_set("extensions.refStorage", ref_storage_format);
+
 	if (!(flags & INIT_DB_QUIET)) {
 		int len = strlen(git_dir);
 
@@ -562,6 +584,7 @@ static const char *const init_db_usage[] = {
 int cmd_init_db(int argc, const char **argv, const char *prefix)
 {
 	const char *git_dir;
+	const char *ref_storage_format = default_ref_storage();
 	const char *real_git_dir = NULL;
 	const char *work_tree;
 	const char *template_dir = NULL;
@@ -570,15 +593,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
 	const struct option init_db_options[] = {
-		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
-				N_("directory from which templates will be used")),
+		OPT_STRING(0, "template", &template_dir,
+			   N_("template-directory"),
+			   N_("directory from which templates will be used")),
 		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
-				N_("create a bare repository"), 1),
+			    N_("create a bare repository"), 1),
 		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
-			N_("permissions"),
-			N_("specify that the git repository is to be shared amongst several users"),
-			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
+		  N_("permissions"),
+		  N_("specify that the git repository is to be shared amongst several users"),
+		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
 		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
+		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
+			   N_("the ref storage format to use")),
 		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 			   N_("separate git dir from working tree")),
 		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
@@ -719,10 +745,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	}
 
 	UNLEAK(real_git_dir);
+	UNLEAK(ref_storage_format);
 	UNLEAK(git_dir);
 	UNLEAK(work_tree);
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, flags);
+		       initial_branch, ref_storage_format, flags);
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 1cd5c2016e3f..a5d044e116b1 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -13,6 +13,7 @@
 #include "utf8.h"
 #include "worktree.h"
 #include "quote.h"
+#include "../refs/refs-internal.h"
 
 static const char * const worktree_usage[] = {
 	N_("git worktree add [<options>] <path> [<commit-ish>]"),
@@ -330,9 +331,29 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+	} else {
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+		strbuf_reset(&sb);
+	}
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/cache.h b/cache.h
index 6fda8091f11d..be8fa1ddd6fc 100644
--- a/cache.h
+++ b/cache.h
@@ -629,9 +629,10 @@ int path_inside_repo(const char *prefix, const char *path);
 #define INIT_DB_EXIST_OK 0x0002
 
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash_algo,
-	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+	    const char *template_dir, int hash_algo, const char *initial_branch,
+	    const char *ref_storage_format, unsigned int flags);
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format);
 
 void sanitize_stdfds(void);
 int daemonize(void);
@@ -1045,6 +1046,7 @@ struct repository_format {
 	int is_bare;
 	int hash_algo;
 	char *work_tree;
+	char *ref_storage;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
 };
diff --git a/config.mak.uname b/config.mak.uname
index d204c20a6469..6cba285e8fa3 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -699,7 +699,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba17..1a25789d2851 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
diff --git a/refs.c b/refs.c
index a665ed5e10ac..bc27a5e1b282 100644
--- a/refs.c
+++ b/refs.c
@@ -19,10 +19,16 @@
 #include "repository.h"
 #include "sigchain.h"
 
+const char *default_ref_storage(void)
+{
+	const char *test = getenv("GIT_TEST_REFTABLE");
+	return test ? "reftable" : "files";
+}
+
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static struct ref_storage_be *refs_backends = &refs_be_reftable;
 
 static struct ref_storage_be *find_ref_storage_backend(const char *name)
 {
@@ -1875,13 +1881,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(be_name);
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
@@ -1897,7 +1903,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir,
+					 r->ref_storage_format ?
+						       r->ref_storage_format :
+						       default_ref_storage(),
+					 REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1953,7 +1963,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 		goto done;
 
 	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1967,6 +1977,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 
 struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 {
+	const char *format = default_ref_storage();
 	struct ref_store *refs;
 	const char *id;
 
@@ -1980,9 +1991,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 
 	if (wt->id)
 		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
-				      REF_STORE_ALL_CAPS);
+				      format, REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(get_git_common_dir(), format,
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs.h b/refs.h
index 48970dfc7e0f..5a6d4ca9fa8d 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+/* Returns the ref storage backend to use by default. */
+const char *default_ref_storage(void);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c936d..28166bf1f82b 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -669,6 +669,7 @@ struct ref_storage_be {
 };
 
 extern struct ref_storage_be refs_be_files;
+extern struct ref_storage_be refs_be_reftable;
 extern struct ref_storage_be refs_be_packed;
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
new file mode 100644
index 000000000000..f1f3e60a9c7c
--- /dev/null
+++ b/refs/reftable-backend.c
@@ -0,0 +1,1446 @@
+#include "../cache.h"
+#include "../chdir-notify.h"
+#include "../config.h"
+#include "../iterator.h"
+#include "../lockfile.h"
+#include "../refs.h"
+#include "../reftable/reftable-stack.h"
+#include "../reftable/reftable-record.h"
+#include "../reftable/reftable-error.h"
+#include "../reftable/reftable-blocksource.h"
+#include "../reftable/reftable-reader.h"
+#include "../reftable/reftable-iterator.h"
+#include "../reftable/reftable-merged.h"
+#include "../reftable/reftable-generic.h"
+#include "../worktree.h"
+#include "refs-internal.h"
+
+extern struct ref_storage_be refs_be_reftable;
+
+struct git_reftable_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	int err;
+	char *repo_dir;
+
+	char *reftable_dir;
+	char *worktree_reftable_dir;
+
+	struct reftable_stack *main_stack;
+	struct reftable_stack *worktree_stack;
+};
+
+static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
+					const char *refname)
+{
+	if (store->worktree_stack == NULL)
+		return store->main_stack;
+
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+		return store->worktree_stack;
+	default:
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return store->main_stack;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type);
+
+static void clear_reftable_log_record(struct reftable_log_record *log)
+{
+	log->refname = NULL;
+	switch (log->value_type) {
+	case REFTABLE_LOG_UPDATE:
+		log->update.old_hash = NULL;
+		log->update.new_hash = NULL;
+		log->update.message = NULL;
+		break;
+	case REFTABLE_LOG_DELETION:
+		break;
+	}
+	reftable_log_record_release(log);
+}
+
+static void fill_reftable_log_record(struct reftable_log_record *log)
+{
+	const char *info = git_committer_info(0);
+	struct ident_split split = { NULL };
+	int result = split_ident_line(&split, info, strlen(info));
+	int sign = 1;
+	assert(0 == result);
+
+	reftable_log_record_release(log);
+	log->value_type = REFTABLE_LOG_UPDATE;
+	log->update.name =
+		xstrndup(split.name_begin, split.name_end - split.name_begin);
+	log->update.email =
+		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
+	log->update.time = atol(split.date_begin);
+	if (*split.tz_begin == '-') {
+		sign = -1;
+		split.tz_begin++;
+	}
+	if (*split.tz_begin == '+') {
+		sign = 1;
+		split.tz_begin++;
+	}
+
+	log->update.tz_offset = sign * atoi(split.tz_begin);
+}
+
+static int has_suffix(struct strbuf *b, const char *suffix)
+{
+	size_t len = strlen(suffix);
+
+	if (len > b->len) {
+		return 0;
+	}
+
+	return 0 == strncmp(b->buf + b->len - len, suffix, len);
+}
+
+/* trims the last path component of b. Returns -1 if it is not
+ * present, or 0 on success
+ */
+static int trim_component(struct strbuf *b)
+{
+	char *last;
+	last = strrchr(b->buf, '/');
+	if (!last)
+		return -1;
+	strbuf_setlen(b, last - b->buf);
+	return 0;
+}
+
+/* Returns whether `b` is a worktree path, trimming it to the gitdir
+ */
+static int is_worktree(struct strbuf *b)
+{
+	if (trim_component(b) < 0) {
+		return 0;
+	}
+	if (!has_suffix(b, "/worktrees")) {
+		return 0;
+	}
+	trim_component(b);
+	return 1;
+}
+
+static struct ref_store *git_reftable_ref_store_create(const char *path,
+						       unsigned int store_flags)
+{
+	struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+	struct reftable_write_options cfg = {
+		.block_size = 4096,
+		.hash_id = the_hash_algo->format_id,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	const char *gitdir = path;
+	struct strbuf wt_buf = STRBUF_INIT;
+	int wt = 0;
+
+	strbuf_addstr(&wt_buf, path);
+
+	/* this is clumsy, but the official worktree functions (eg.
+	 * get_worktrees()) function will try to initialize a ref storage
+	 * backend, leading to infinite recursion.  */
+	wt = is_worktree(&wt_buf);
+	if (wt) {
+		gitdir = wt_buf.buf;
+	}
+
+	base_ref_store_init(ref_store, &refs_be_reftable);
+	ref_store->gitdir = xstrdup(gitdir);
+	refs->store_flags = store_flags;
+	strbuf_addf(&sb, "%s/reftable", gitdir);
+	refs->reftable_dir = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	refs->err =
+		reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg);
+	assert(refs->err != REFTABLE_API_ERROR);
+
+	if (refs->err == 0 && wt) {
+		strbuf_addf(&sb, "%s/reftable", path);
+		refs->worktree_reftable_dir = xstrdup(sb.buf);
+
+		refs->err = reftable_new_stack(&refs->worktree_stack,
+					       refs->worktree_reftable_dir,
+					       cfg);
+		assert(refs->err != REFTABLE_API_ERROR);
+	}
+
+	strbuf_release(&sb);
+	strbuf_release(&wt_buf);
+	return ref_store;
+}
+
+static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct strbuf sb = STRBUF_INIT;
+
+	safe_create_dir(refs->reftable_dir, 1);
+	assert(refs->worktree_reftable_dir == NULL);
+
+	strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+	write_file(sb.buf, "ref: refs/heads/.invalid");
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+	safe_create_dir(sb.buf, 1);
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+	write_file(sb.buf, "this repository uses the reftable format");
+
+	return 0;
+}
+
+struct git_reftable_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_ref_record ref;
+	struct object_id oid;
+	struct ref_store *ref_store;
+
+	/* In case we must iterate over 2 stacks, this is non-null. */
+	struct reftable_merged_table *merged;
+	unsigned int flags;
+	int err;
+	const char *prefix;
+};
+
+static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	while (ri->err == 0) {
+		ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref);
+		if (ri->err) {
+			break;
+		}
+
+		if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) {
+			/*
+			  pseudorefs, eg. HEAD, FETCH_HEAD should not be
+			  produced, by default.
+			 */
+			continue;
+		}
+		ri->base.refname = ri->ref.refname;
+		if (ri->prefix != NULL &&
+		    strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) {
+			ri->err = 1;
+			break;
+		}
+		if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+		    ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE)
+			continue;
+
+		ri->base.flags = 0;
+		switch (ri->ref.value_type) {
+		case REFTABLE_REF_VAL1:
+			hashcpy(ri->oid.hash, ri->ref.value.val1);
+			break;
+		case REFTABLE_REF_VAL2:
+			hashcpy(ri->oid.hash, ri->ref.value.val2.value);
+			break;
+		case REFTABLE_REF_SYMREF: {
+			int out_flags = 0;
+			const char *resolved = refs_resolve_ref_unsafe(
+				ri->ref_store, ri->ref.refname,
+				RESOLVE_REF_READING, &ri->oid, &out_flags);
+			ri->base.flags = out_flags;
+			if (resolved == NULL &&
+			    !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+			    (ri->base.flags & REF_ISBROKEN)) {
+				continue;
+			}
+			break;
+		}
+		default:
+			abort();
+		}
+
+		ri->base.oid = &ri->oid;
+		if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		    !ref_resolves_to_object(ri->base.refname, ri->base.oid,
+					    ri->base.flags)) {
+			continue;
+		}
+
+		break;
+	}
+
+	if (ri->err > 0) {
+		return ITER_DONE;
+	}
+	if (ri->err < 0) {
+		return ITER_ERROR;
+	}
+
+	return ITER_OK;
+}
+
+static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
+		hashcpy(peeled->hash, ri->ref.value.val2.target_value);
+		return 0;
+	}
+
+	return -1;
+}
+
+static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	reftable_ref_record_release(&ri->ref);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged) {
+		reftable_merged_table_free(ri->merged);
+	}
+	return 0;
+}
+
+static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
+	reftable_ref_iterator_advance, reftable_ref_iterator_peel,
+	reftable_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
+				unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri));
+
+	if (refs->err < 0) {
+		ri->err = refs->err;
+	} else if (refs->worktree_stack == NULL) {
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(refs->main_stack);
+		ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix);
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		ri->err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						    the_hash_algo->format_id);
+		if (ri->err == 0)
+			ri->err = reftable_merged_table_seek_ref(
+				ri->merged, &ri->iter, prefix);
+	}
+
+	base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1);
+	ri->prefix = prefix;
+	ri->base.oid = &ri->oid;
+	ri->flags = flags;
+	ri->ref_store = ref_store;
+	return &ri->base;
+}
+
+static int fixup_symrefs(struct ref_store *ref_store,
+			 struct ref_transaction *transaction)
+{
+	struct strbuf referent = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct object_id old_oid;
+
+		err = git_reftable_read_raw_ref(ref_store, update->refname,
+						&old_oid, &referent,
+						/* mutate input, like
+						   files-backend.c */
+						&update->type);
+		if (err < 0 && errno == ENOENT &&
+		    is_null_oid(&update->old_oid)) {
+			err = 0;
+		}
+		if (err < 0)
+			goto done;
+
+		if (!(update->type & REF_ISSYMREF))
+			continue;
+
+		if (update->flags & REF_NO_DEREF) {
+			/* what should happen here? See files-backend.c
+			 * lock_ref_for_update. */
+		} else {
+			/*
+			  If we are updating a symref (eg. HEAD), we should also
+			  update the branch that the symref points to.
+
+			  This is generic functionality, and would be better
+			  done in refs.c, but the current implementation is
+			  intertwined with the locking in files-backend.c.
+			*/
+			int new_flags = update->flags;
+			struct ref_update *new_update = NULL;
+
+			/* if this is an update for HEAD, should also record a
+			   log entry for HEAD? See files-backend.c,
+			   split_head_update()
+			*/
+			new_update = ref_transaction_add_update(
+				transaction, referent.buf, new_flags,
+				&update->new_oid, &update->old_oid,
+				update->msg);
+			new_update->parent_update = update;
+
+			/* files-backend sets REF_LOG_ONLY here. */
+			update->flags |= REF_NO_DEREF | REF_LOG_ONLY;
+			update->flags &= ~REF_HAVE_OLD;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	strbuf_release(&referent);
+	return err;
+}
+
+static int git_reftable_transaction_prepare(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *errbuf)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_addition *add = NULL;
+	struct reftable_stack *stack =
+		transaction->nr ?
+			      stack_for(refs, transaction->updates[0]->refname) :
+			      refs->main_stack;
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+
+	err = fixup_symrefs(ref_store, transaction);
+	if (err) {
+		goto done;
+	}
+
+	transaction->backend_data = add;
+	transaction->state = REF_TRANSACTION_PREPARED;
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	if (err < 0) {
+		transaction->state = REF_TRANSACTION_CLOSED;
+		strbuf_addf(errbuf, "reftable: transaction prepare: %s",
+			    reftable_error_str(err));
+	}
+
+	return err;
+}
+
+static int git_reftable_transaction_abort(struct ref_store *ref_store,
+					  struct ref_transaction *transaction,
+					  struct strbuf *err)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	reftable_addition_destroy(add);
+	transaction->backend_data = NULL;
+	return 0;
+}
+
+static int reftable_check_old_oid(struct ref_store *refs, const char *refname,
+				  struct object_id *want_oid)
+{
+	struct object_id out_oid;
+	int out_flags = 0;
+	const char *resolved = refs_resolve_ref_unsafe(
+		refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags);
+	if (is_null_oid(want_oid) != (resolved == NULL)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	if (resolved != NULL && !oideq(&out_oid, want_oid)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	return 0;
+}
+
+static int ref_update_cmp(const void *a, const void *b)
+{
+	return strcmp((*(struct ref_update **)a)->refname,
+		      (*(struct ref_update **)b)->refname);
+}
+
+static int write_transaction_table(struct reftable_writer *writer, void *arg)
+{
+	struct ref_transaction *transaction = (struct ref_transaction *)arg;
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)transaction->ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, transaction->updates[0]->refname);
+	uint64_t ts = reftable_stack_next_update_index(stack);
+	int err = 0;
+	int i = 0;
+	struct reftable_log_record *logs =
+		calloc(transaction->nr, sizeof(*logs));
+	struct ref_update **sorted =
+		malloc(transaction->nr * sizeof(struct ref_update *));
+	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
+	QSORT(sorted, transaction->nr, ref_update_cmp);
+	reftable_writer_set_limits(writer, ts, ts);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = sorted[i];
+		struct reftable_log_record *log = &logs[i];
+		fill_reftable_log_record(log);
+		log->update_index = ts;
+		log->value_type = REFTABLE_LOG_UPDATE;
+		log->refname = (char *)u->refname;
+		log->update.old_hash = u->old_oid.hash;
+		log->update.new_hash = u->new_oid.hash;
+		log->update.message = u->msg;
+
+		if (u->flags & REF_LOG_ONLY) {
+			continue;
+		}
+
+		if (u->flags & REF_HAVE_NEW) {
+			struct reftable_ref_record ref = { NULL };
+			struct object_id peeled;
+
+			int peel_error = peel_object(&u->new_oid, &peeled);
+			ref.refname = (char *)u->refname;
+			ref.update_index = ts;
+
+			if (!peel_error) {
+				ref.value_type = REFTABLE_REF_VAL2;
+				ref.value.val2.target_value = peeled.hash;
+				ref.value.val2.value = u->new_oid.hash;
+			} else if (!is_null_oid(&u->new_oid)) {
+				ref.value_type = REFTABLE_REF_VAL1;
+				ref.value.val1 = u->new_oid.hash;
+			}
+
+			err = reftable_writer_add_ref(writer, &ref);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+
+	for (i = 0; i < transaction->nr; i++) {
+		err = reftable_writer_add_log(writer, &logs[i]);
+		clear_reftable_log_record(&logs[i]);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	free(logs);
+	free(sorted);
+	return err;
+}
+
+static int git_reftable_transaction_finish(struct ref_store *ref_store,
+					   struct ref_transaction *transaction,
+					   struct strbuf *errmsg)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	int err = 0;
+	int i;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = transaction->updates[i];
+		if (u->flags & REF_HAVE_OLD) {
+			err = reftable_check_old_oid(transaction->ref_store,
+						     u->refname, &u->old_oid);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+	if (transaction->nr) {
+		err = reftable_addition_add(add, &write_transaction_table,
+					    transaction);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	err = reftable_addition_commit(add);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	transaction->state = REF_TRANSACTION_CLOSED;
+	transaction->backend_data = NULL;
+	if (err) {
+		strbuf_addf(errmsg, "reftable: transaction failure: %s",
+			    reftable_error_str(err));
+		return -1;
+	}
+	return err;
+}
+
+static int
+git_reftable_transaction_initial_commit(struct ref_store *ref_store,
+					struct ref_transaction *transaction,
+					struct strbuf *errmsg)
+{
+	int err = git_reftable_transaction_prepare(ref_store, transaction,
+						   errmsg);
+	if (err)
+		return err;
+
+	return git_reftable_transaction_finish(ref_store, transaction, errmsg);
+}
+
+struct write_delete_refs_arg {
+	struct reftable_stack *stack;
+	struct string_list *refnames;
+	const char *logmsg;
+	unsigned int flags;
+};
+
+static int write_delete_refs_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_delete_refs_arg *arg =
+		(struct write_delete_refs_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = 0;
+	int i = 0;
+
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_ref_record ref = {
+			.refname = (char *)arg->refnames->items[i].string,
+			.value_type = REFTABLE_REF_DELETION,
+			.update_index = ts,
+		};
+		err = reftable_writer_add_ref(writer, &ref);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_log_record log = {
+			.update_index = ts,
+		};
+		struct reftable_ref_record current = { NULL };
+		fill_reftable_log_record(&log);
+		log.update_index = ts;
+		log.refname = (char *)arg->refnames->items[i].string;
+
+		log.update.message = xstrdup(arg->logmsg);
+		log.update.new_hash = NULL;
+		log.update.old_hash = NULL;
+		if (reftable_stack_read_ref(arg->stack, log.refname,
+					    &current) == 0) {
+			log.update.old_hash =
+				reftable_ref_record_val1(&current);
+		}
+		err = reftable_writer_add_log(writer, &log);
+		log.update.old_hash = NULL;
+		reftable_ref_record_release(&current);
+
+		clear_reftable_log_record(&log);
+		if (err < 0) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int git_reftable_delete_refs(struct ref_store *ref_store,
+				    const char *msg,
+				    struct string_list *refnames,
+				    unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, refnames->items[0].string);
+	struct write_delete_refs_arg arg = {
+		.stack = stack,
+		.refnames = refnames,
+		.logmsg = msg,
+		.flags = flags,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	string_list_sort(refnames);
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_delete_refs_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_pack_refs(struct ref_store *ref_store,
+				  unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	int err = refs->err;
+	if (err < 0) {
+		return err;
+	}
+	err = reftable_stack_compact_all(refs->main_stack, NULL);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
+	return err;
+}
+
+struct write_create_symref_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	const char *refname;
+	const char *target;
+	const char *logmsg;
+};
+
+static int write_create_symref_table(struct reftable_writer *writer, void *arg)
+{
+	struct write_create_symref_arg *create =
+		(struct write_create_symref_arg *)arg;
+	uint64_t ts = reftable_stack_next_update_index(create->stack);
+	int err = 0;
+
+	struct reftable_ref_record ref = {
+		.refname = (char *)create->refname,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = (char *)create->target,
+		.update_index = ts,
+	};
+	reftable_writer_set_limits(writer, ts, ts);
+	err = reftable_writer_add_ref(writer, &ref);
+	if (err == 0) {
+		struct reftable_log_record log = { NULL };
+		struct object_id new_oid;
+		struct object_id old_oid;
+
+		fill_reftable_log_record(&log);
+		log.refname = (char *)create->refname;
+		log.update_index = ts;
+		log.update.message = (char *)create->logmsg;
+		if (refs_resolve_ref_unsafe(
+			    (struct ref_store *)create->refs, create->refname,
+			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
+			log.update.old_hash = old_oid.hash;
+		}
+
+		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
+					    create->target, RESOLVE_REF_READING,
+					    &new_oid, NULL) != NULL) {
+			log.update.new_hash = new_oid.hash;
+		}
+
+		if (log.update.old_hash != NULL ||
+		    log.update.new_hash != NULL) {
+			err = reftable_writer_add_log(writer, &log);
+		}
+		log.refname = NULL;
+		log.update.message = NULL;
+		log.update.old_hash = NULL;
+		log.update.new_hash = NULL;
+		clear_reftable_log_record(&log);
+	}
+	return err;
+}
+
+static int git_reftable_create_symref(struct ref_store *ref_store,
+				      const char *refname, const char *target,
+				      const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_create_symref_arg arg = { .refs = refs,
+					       .stack = stack,
+					       .refname = refname,
+					       .target = target,
+					       .logmsg = logmsg };
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_create_symref_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct write_rename_arg {
+	struct reftable_stack *stack;
+	const char *oldname;
+	const char *newname;
+	const char *logmsg;
+};
+
+static int write_rename_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record ref = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref);
+
+	if (err) {
+		goto done;
+	}
+
+	/* XXX do ref renames overwrite the target? */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) {
+		goto done;
+	}
+
+	reftable_writer_set_limits(writer, ts, ts);
+
+	{
+		struct reftable_ref_record todo[2] = {
+			{
+				.refname = (char *)arg->oldname,
+				.update_index = ts,
+				.value_type = REFTABLE_REF_DELETION,
+			},
+			ref,
+		};
+		todo[1].update_index = ts;
+		free(todo[1].refname);
+		todo[1].refname = strdup(arg->newname);
+
+		err = reftable_writer_add_refs(writer, todo, 2);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	if (reftable_ref_record_val1(&ref)) {
+		uint8_t *val1 = reftable_ref_record_val1(&ref);
+		struct reftable_log_record todo[2] = { { NULL } };
+		fill_reftable_log_record(&todo[0]);
+		fill_reftable_log_record(&todo[1]);
+
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		todo[0].update.message = (char *)arg->logmsg;
+		todo[0].update.old_hash = val1;
+		todo[0].update.new_hash = NULL;
+
+		todo[1].refname = (char *)arg->newname;
+		todo[1].update_index = ts;
+		todo[1].update.old_hash = NULL;
+		todo[1].update.new_hash = val1;
+		todo[1].update.message = (char *)arg->logmsg;
+
+		err = reftable_writer_add_logs(writer, todo, 2);
+
+		clear_reftable_log_record(&todo[0]);
+		clear_reftable_log_record(&todo[1]);
+
+		if (err < 0) {
+			goto done;
+		}
+
+	} else {
+		/* XXX symrefs? */
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static int git_reftable_rename_ref(struct ref_store *ref_store,
+				   const char *oldrefname,
+				   const char *newrefname, const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_rename_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_copy_ref(struct ref_store *ref_store,
+				 const char *oldrefname, const char *newrefname,
+				 const char *logmsg)
+{
+	BUG("reftable reference store does not support copying references");
+}
+
+struct git_reftable_reflog_ref_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_log_record log;
+	struct object_id oid;
+
+	/* Used when iterating over worktree & main */
+	struct reftable_merged_table *merged;
+	char *last_name;
+};
+
+static int
+git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+
+	while (1) {
+		int err = reftable_iterator_next_log(&ri->iter, &ri->log);
+		if (err > 0) {
+			return ITER_DONE;
+		}
+		if (err < 0) {
+			return ITER_ERROR;
+		}
+
+		ri->base.refname = ri->log.refname;
+		if (ri->last_name != NULL &&
+		    !strcmp(ri->log.refname, ri->last_name)) {
+			/* we want the refnames that we have reflogs for, so we
+			 * skip if we've already produced this name. This could
+			 * be faster by seeking directly to
+			 * reflog@update_index==0.
+			 */
+			continue;
+		}
+
+		free(ri->last_name);
+		ri->last_name = xstrdup(ri->log.refname);
+		hashcpy(ri->oid.hash, ri->log.update.new_hash);
+		return ITER_OK;
+	}
+}
+
+static int
+git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	BUG("not supported.");
+	return -1;
+}
+
+static int
+git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+	reftable_log_record_release(&ri->log);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged)
+		reftable_merged_table_free(ri->merged);
+	return 0;
+}
+
+static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = {
+	git_reftable_reflog_ref_iterator_advance,
+	git_reftable_reflog_ref_iterator_peel,
+	git_reftable_reflog_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(1, sizeof(*ri));
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+
+	if (refs->worktree_stack == NULL) {
+		struct reftable_stack *stack = refs->main_stack;
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(stack);
+		int err = reftable_merged_table_seek_log(mt, &ri->iter, "");
+		if (err < 0) {
+			free(ri);
+			/* XXX is this allowed? */
+			return NULL;
+		}
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		int err = 0;
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						the_hash_algo->format_id);
+		if (err < 0) {
+			free(tabs);
+			/* XXX see above */
+			return NULL;
+		}
+		err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, "");
+		if (err < 0) {
+			return NULL;
+		}
+	}
+	base_ref_iterator_init(&ri->base,
+			       &git_reftable_reflog_ref_iterator_vtable, 1);
+	ri->base.oid = &ri->oid;
+
+	return (struct ref_iterator *)ri;
+}
+
+static int git_reftable_for_each_reflog_ent_newest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_log_record log = { NULL };
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	while (err == 0) {
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		hashcpy(old_oid.hash, log.update.old_hash);
+		hashcpy(new_oid.hash, log.update.new_hash);
+
+		full_committer = fmt_ident(log.update.name, log.update.email,
+					   WANT_COMMITTER_IDENT,
+					   /*date*/ NULL, IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log.update.time,
+			 log.update.tz_offset, log.update.message, cb_data);
+		if (err)
+			break;
+	}
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_for_each_reflog_ent_oldest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reftable_log_record *logs = NULL;
+	int cap = 0;
+	int len = 0;
+	int err = 0;
+	int i = 0;
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+
+	while (err == 0) {
+		struct reftable_log_record log = { NULL };
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			logs = realloc(logs, cap * sizeof(*logs));
+		}
+
+		logs[len++] = log;
+	}
+
+	for (i = len; i--;) {
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		hashcpy(old_oid.hash, log->update.old_hash);
+		hashcpy(new_oid.hash, log->update.new_hash);
+
+		full_committer = fmt_ident(log->update.name, log->update.email,
+					   WANT_COMMITTER_IDENT, NULL,
+					   IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log->update.time,
+			 log->update.tz_offset, log->update.message, cb_data);
+		if (err) {
+			break;
+		}
+	}
+
+	for (i = 0; i < len; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	free(logs);
+
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_reflog_exists(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_log_record log = { NULL };
+	int err = refs->err;
+
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err) {
+		goto done;
+	}
+	err = reftable_iterator_next_log(&it, &log);
+	if (err) {
+		goto done;
+	}
+
+	if (strcmp(log.refname, refname)) {
+		err = 1;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return !err;
+}
+
+static int git_reftable_create_reflog(struct ref_store *ref_store,
+				      const char *refname, int force_create,
+				      struct strbuf *err)
+{
+	return 0;
+}
+
+static int git_reftable_delete_reflog(struct ref_store *ref_store,
+				      const char *refname)
+{
+	return 0;
+}
+
+struct reflog_expiry_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	struct reftable_log_record *tombstones;
+	int len;
+	int cap;
+};
+
+static void clear_log_tombstones(struct reflog_expiry_arg *arg)
+{
+	int i = 0;
+	for (; i < arg->len; i++) {
+		reftable_log_record_release(&arg->tombstones[i]);
+	}
+
+	FREE_AND_NULL(arg->tombstones);
+}
+
+static void add_log_tombstone(struct reflog_expiry_arg *arg,
+			      const char *refname, uint64_t ts)
+{
+	struct reftable_log_record tombstone = {
+		.refname = xstrdup(refname),
+		.update_index = ts,
+	};
+	if (arg->len == arg->cap) {
+		arg->cap = 2 * arg->cap + 1;
+		arg->tombstones =
+			realloc(arg->tombstones, arg->cap * sizeof(tombstone));
+	}
+	arg->tombstones[arg->len++] = tombstone;
+}
+
+static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
+{
+	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int i = 0;
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->len; i++) {
+		int err = reftable_writer_add_log(writer, &arg->tombstones[i]);
+		if (err) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname,
+			   const struct object_id *oid, unsigned int flags,
+			   reflog_expiry_prepare_fn prepare_fn,
+			   reflog_expiry_should_prune_fn should_prune_fn,
+			   reflog_expiry_cleanup_fn cleanup_fn,
+			   void *policy_cb_data)
+{
+	/*
+	  For log expiry, we write tombstones in place of the expired entries,
+	  This means that the entries are still retrievable by delving into the
+	  stack, and expiring entries paradoxically takes extra memory.
+
+	  This memory is only reclaimed when some operation issues a
+	  git_reftable_pack_refs(), which will compact the entire stack and get
+	  rid of deletion entries.
+
+	  It would be better if the refs backend supported an API that sets a
+	  criterion for all refs, passing the criterion to pack_refs().
+	*/
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reflog_expiry_arg arg = {
+		.stack = stack,
+		.refs = refs,
+	};
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err < 0) {
+		goto done;
+	}
+
+	while (1) {
+		struct object_id ooid;
+		struct object_id noid;
+
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, refname)) {
+			break;
+		}
+		hashcpy(ooid.hash, log.update.old_hash);
+		hashcpy(noid.hash, log.update.new_hash);
+
+		if (should_prune_fn(&ooid, &noid, log.update.email,
+				    (timestamp_t)log.update.time,
+				    log.update.tz_offset, log.update.message,
+				    policy_cb_data)) {
+			add_log_tombstone(&arg, refname, log.update_index);
+		}
+	}
+	err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	clear_log_tombstones(&arg);
+	return err;
+}
+
+static int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	/* This is usually not needed, but Git doesn't signal to ref backend if
+	   a subprocess updated the ref DB.  So we always check.
+	*/
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_read_ref(stack, refname, &ref);
+	if (err > 0) {
+		errno = ENOENT;
+		err = -1;
+		goto done;
+	}
+	if (err < 0) {
+		errno = reftable_error_to_errno(err);
+		err = -1;
+		goto done;
+	}
+
+	if (ref.value_type == REFTABLE_REF_SYMREF) {
+		strbuf_reset(referent);
+		strbuf_addstr(referent, ref.value.symref);
+		*type |= REF_ISSYMREF;
+	} else if (reftable_ref_record_val1(&ref) != NULL) {
+		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
+	} else {
+		*type |= REF_ISBROKEN;
+		errno = EINVAL;
+		err = -1;
+	}
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+struct ref_storage_be refs_be_reftable = {
+	&refs_be_files,
+	"reftable",
+	git_reftable_ref_store_create,
+	git_reftable_init_db,
+	git_reftable_transaction_prepare,
+	git_reftable_transaction_finish,
+	git_reftable_transaction_abort,
+	git_reftable_transaction_initial_commit,
+
+	git_reftable_pack_refs,
+	git_reftable_create_symref,
+	git_reftable_delete_refs,
+	git_reftable_rename_ref,
+	git_reftable_copy_ref,
+
+	git_reftable_ref_iterator_begin,
+	git_reftable_read_raw_ref,
+
+	git_reftable_reflog_iterator_begin,
+	git_reftable_for_each_reflog_ent_oldest_first,
+	git_reftable_for_each_reflog_ent_newest_first,
+	git_reftable_reflog_exists,
+	git_reftable_create_reflog,
+	git_reftable_delete_reflog,
+	git_reftable_reflog_expire,
+};
diff --git a/repository.c b/repository.c
index c98298acd017..4d8f572fc9cd 100644
--- a/repository.c
+++ b/repository.c
@@ -174,6 +174,8 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	repo->ref_storage_format = xstrdup_or_null(format.ref_storage);
+
 	clear_repository_format(&format);
 	return 0;
 
diff --git a/repository.h b/repository.h
index b385ca3c94b6..8019a7d0a1fc 100644
--- a/repository.h
+++ b/repository.h
@@ -78,6 +78,9 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/* The format to use for the ref database. */
+	char *ref_storage_format;
+
 	/*
 	 * Contains path to often used file names.
 	 */
diff --git a/setup.c b/setup.c
index c04cd25a30df..c6b57efd0318 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		data->ref_storage = xstrdup(value);
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format)
 	string_list_clear(&format->v1_only_extensions, 0);
 	free(format->work_tree);
 	free(format->partial_clone);
+	free(format->ref_storage);
 	init_repository_format(format);
 }
 
@@ -1308,8 +1312,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			the_repository->ref_storage_format =
+				xstrdup_or_null(repo_fmt.ref_storage);
+		}
 	}
 
 	strbuf_release(&dir);
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
new file mode 100755
index 000000000000..ba0738e51561
--- /dev/null
+++ b/t/t0031-reftable.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable basics'
+
+. ./test-lib.sh
+
+INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+git_init () {
+	git init -b primary "$@"
+}
+
+initialize ()  {
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled
+}
+
+test_expect_success 'SHA256 support, env' '
+	rm -rf .git &&
+	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'SHA256 support, option' '
+	rm -rf .git &&
+	git_init --ref-storage=reftable --object-format=sha256 &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'delete ref' '
+	initialize &&
+	test_commit file &&
+	SHA=$(git show-ref -s --verify HEAD) &&
+	test_write_lines "$SHA refs/heads/primary" "$SHA refs/tags/file" >expect &&
+	git show-ref >actual &&
+	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
+	test_cmp expect actual &&
+	git update-ref -d refs/tags/file $SHA  &&
+	test_write_lines "$SHA refs/heads/primary" >expect &&
+	git show-ref >actual &&
+	test_cmp expect actual
+'
+
+
+test_expect_success 'clone calls transaction_initial_commit' '
+	test_commit message1 file1 &&
+	git clone . cloned &&
+	(test  -f cloned/file1 || echo "Fixme.")
+'
+
+test_expect_success 'basic operation of reftable storage: commit, show-ref' '
+	initialize &&
+	test_commit file &&
+	test_write_lines refs/heads/primary refs/tags/file >expect &&
+	git show-ref &&
+	git show-ref | cut -f2 -d" " >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'reflog, repack' '
+	initialize &&
+	for count in $(test_seq 1 10)
+	do
+		test_commit "number $count" file.t $count number-$count ||
+		return 1
+	done &&
+	git pack-refs &&
+	ls -1 .git/reftable >table-files &&
+	test_line_count = 2 table-files &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 10 output &&
+	grep "commit (initial): number 1" output &&
+	grep "commit: number 10" output &&
+	git gc &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 0 output
+'
+
+test_expect_success 'branch switch in reflog output' '
+	initialize &&
+	test_commit file1 &&
+	git checkout -b branch1 &&
+	test_commit file2 &&
+	git checkout -b branch2 &&
+	git switch - &&
+	git rev-parse --symbolic-full-name HEAD >actual &&
+	echo refs/heads/branch1 >expect &&
+	test_cmp actual expect
+'
+
+
+# This matches show-ref's output
+print_ref() {
+	echo "$(git rev-parse "$1") $1"
+}
+
+test_expect_success 'peeled tags are stored' '
+	initialize &&
+	test_commit file &&
+	git tag -m "annotated tag" test_tag HEAD &&
+	{
+		print_ref "refs/heads/primary" &&
+		print_ref "refs/tags/file" &&
+		print_ref "refs/tags/test_tag" &&
+		print_ref "refs/tags/test_tag^{}"
+	} >expect &&
+	git show-ref -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show-ref works on fresh repo' '
+	initialize &&
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	>expect &&
+	! git show-ref >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'checkout unborn branch' '
+	initialize &&
+	git checkout -b primary
+'
+
+
+test_expect_success 'dir/file conflict' '
+	initialize &&
+	test_commit file &&
+	! git branch primary/forbidden
+'
+
+
+test_expect_success 'do not clobber existing repo' '
+	rm -rf .git &&
+	git_init --ref-storage=files &&
+	cat .git/HEAD >expect &&
+	test_commit file &&
+	(git_init --ref-storage=reftable || true) &&
+	cat .git/HEAD >actual &&
+	test_cmp expect actual
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'pseudo refs' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git cherry-pick source &&
+	test -f file2
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'rebase' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git rebase source &&
+	test -f file2
+'
+
+test_expect_success 'worktrees' '
+	git_init --ref-storage=reftable start &&
+	(cd start && test_commit file1 && git checkout -b branch1 &&
+	git checkout -b branch2 &&
+	git worktree add  ../wt
+	) &&
+	cd wt &&
+	git checkout branch1 &&
+	git branch
+'
+
+test_expect_success 'worktrees 2' '
+	initialize &&
+	test_commit file1 &&
+	mkdir existing_empty &&
+	git worktree add --detach existing_empty primary
+'
+
+test_expect_success 'FETCH_HEAD' '
+	initialize &&
+	test_commit one &&
+	(git_init sub && cd sub && test_commit two) &&
+	git --git-dir sub/.git rev-parse HEAD >expect &&
+	git fetch sub &&
+	git checkout FETCH_HEAD &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh
index be12fb63506e..c6f783255633 100755
--- a/t/t1409-avoid-packing-refs.sh
+++ b/t/t1409-avoid-packing-refs.sh
@@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily'
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 # Add an identifying mark to the packed-refs file header line. This
 # shouldn't upset readers, and it should be omitted if the file is
 # ever rewritten.
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 5071ac63a5b5..16440d58d6bb 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success setup '
 	git config gc.auto 0 &&
 	git config i18n.commitencoding ISO-8859-1 &&
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 3b7cdc56ec84..667a23affe36 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -14,6 +14,12 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success 'enable reflogs' '
 	git config core.logallrefupdates true
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d3f6af6a6545..943aa9713961 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1481,6 +1481,11 @@ parisc* | hppa*)
 	;;
 esac
 
+if test -n "$GIT_TEST_REFTABLE"
+then
+  test_set_prereq REFTABLE
+fi
+
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 14/15] git-prompt: prepare for reftable refs backend
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (12 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-03-12 20:19         ` SZEDER Gábor via GitGitGadget
  2021-03-12 20:19         ` [PATCH v5 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
  15 siblings, 0 replies; 251+ messages in thread
From: SZEDER Gábor via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, SZEDER Gábor

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

In our git-prompt script we strive to use Bash builtins wherever
possible, because fork()-ing subshells for command substitutions and
fork()+exec()-ing Git commands are expensive on some platforms.  We
even read and parse '.git/HEAD' using Bash builtins to get the name of
the current branch [1].  However, the upcoming reftable refs backend
won't use '.git/HEAD' at all, but will write an invalid refname as
placeholder for backwards compatibility instead, which will break our
git-prompt script.

Update the git-prompt script to recognize the placeholder '.git/HEAD'
written by the reftable backend (its content is specified in the
reftable specs), and then fall back to use 'git symbolic-ref' to get
the name of the current branch.

[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current
    branch, 2011-03-31)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/completion/git-prompt.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 4640a1535d15..2e5a5d80271d 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -478,10 +478,15 @@ __git_ps1 ()
 			if ! __git_eread "$g/HEAD" head; then
 				return $exit
 			fi
-			# is it a symbolic ref?
 			b="${head#ref: }"
 			if [ "$head" = "$b" ]; then
 				detached=yes
+			elif [ "$b" = "refs/heads/.invalid" ]; then
+				# Reftable
+				b="$(git symbolic-ref HEAD 2>/dev/null)" ||
+				detached=yes
+			fi
+			if [ "$detached" = yes ]; then
 				b="$(
 				case "${GIT_PS1_DESCRIBE_STYLE-}" in
 				(contains)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v5 15/15] Add "test-tool dump-reftable" command.
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (13 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
@ 2021-03-12 20:19         ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
  15 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-03-12 20:19 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  1 +
 reftable/dump.c          | 67 ++++++++++++++++++++--------------------
 t/helper/test-reftable.c |  5 +++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 5 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/Makefile b/Makefile
index 90c204501739..224719a6c435 100644
--- a/Makefile
+++ b/Makefile
@@ -2402,6 +2402,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
diff --git a/reftable/dump.c b/reftable/dump.c
index 00b444e8c9b8..7d620a3cf0fe 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -12,18 +12,25 @@ license that can be found in the LICENSE file or at
 #include <unistd.h>
 #include <string.h>
 
-#include "reftable.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+#include "reftable-merged.h"
+#include "reftable-record.h"
 #include "reftable-tests.h"
+#include "reftable-writer.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-stack.h"
 
 static uint32_t hash_id;
 
 static int dump_table(const char *tablename)
 {
-	struct reftable_block_source src = { 0 };
+	struct reftable_block_source src = { NULL };
 	int err = reftable_block_source_from_file(&src, tablename);
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_reader *r = NULL;
 
 	if (err < 0)
@@ -49,7 +56,7 @@ static int dump_table(const char *tablename)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_reader_seek_log(r, &it, "");
 	if (err < 0) {
@@ -66,7 +73,7 @@ static int dump_table(const char *tablename)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_reader_free(r);
 	return 0;
@@ -75,7 +82,7 @@ static int dump_table(const char *tablename)
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
+	struct reftable_write_options cfg = { 0 };
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
 	if (err < 0)
@@ -94,10 +101,10 @@ static int compact_stack(const char *stackdir)
 static int dump_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_merged_table *merged = NULL;
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
@@ -122,7 +129,7 @@ static int dump_stack(const char *stackdir)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_merged_table_seek_log(merged, &it, "");
 	if (err < 0) {
@@ -139,7 +146,7 @@ static int dump_stack(const char *stackdir)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_stack_destroy(stack);
 	return 0;
@@ -160,40 +167,34 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
-	int opt;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
-	const char *arg = NULL;
-	while ((opt = getopt(argc, argv, "2chts")) != -1) {
-		switch (opt) {
-		case '2':
-			hash_id = 0x73323536;
+	const char *arg = NULL, *argv0 = argv[0];
+
+	for (; argc > 1; argv++, argc--)
+		if (*argv[1] != '-')
 			break;
-		case 't':
+		else if (!strcmp("-2", argv[1]))
+			hash_id = 0x73323536;
+		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
-			break;
-		case 's':
+		else if (!strcmp("-s", argv[1]))
 			opt_dump_stack = 1;
-			break;
-		case 'c':
+		else if (!strcmp("-c", argv[1]))
 			opt_compact = 1;
-			break;
-		case '?':
-		case 'h':
+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
 			print_help();
 			return 2;
-			break;
 		}
-	}
 
-	if (argv[optind] == NULL) {
+	if (argc != 2) {
 		fprintf(stderr, "need argument\n");
 		print_help();
 		return 2;
 	}
 
-	arg = argv[optind];
+	arg = argv[1];
 
 	if (opt_dump_table) {
 		err = dump_table(arg);
@@ -204,7 +205,7 @@ int reftable_dump_main(int argc, char *const *argv)
 	}
 
 	if (err < 0) {
-		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
 			reftable_error_str(err));
 		return 1;
 	}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b702f4855e6..c1ba131c3a48 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index dfa2bee9e8a2..867446efafa8 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -57,6 +57,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index cad482a78915..256634eebaeb 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -18,6 +18,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v5 13/15] Reftable support for git-core
  2021-03-12 20:19         ` [PATCH v5 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-03-23 11:40           ` Derrick Stolee
  2021-03-23 12:20             ` Ævar Arnfjörð Bjarmason
  2021-03-23 20:12             ` Junio C Hamano
  0 siblings, 2 replies; 251+ messages in thread
From: Derrick Stolee @ 2021-03-23 11:40 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget, git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Han-Wen Nienhuys

On 3/12/2021 3:19 PM, Han-Wen Nienhuys via GitGitGadget wrote:
> diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
> index 4e23d73cdcad..82c5940f1434 100644
> --- a/Documentation/config/extensions.txt
> +++ b/Documentation/config/extensions.txt
> @@ -6,3 +6,12 @@ extensions.objectFormat::
>  Note that this setting should only be set by linkgit:git-init[1] or
>  linkgit:git-clone[1].  Trying to change it after initialization will not
>  work and will produce hard-to-diagnose issues.
> ++

I noticed while resolving conflicts with my series, which also edits this
file, that the "+" line above should be removed. That likely munges the
fact that the config entry below should be its own list item, not a
continuation of the previous one.

> +extensions.refStorage::
> +	Specify the ref storage mechanism to use.  The acceptable values are `files` and
> +	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
> +	this key unless `core.repositoryFormatVersion` is 1.
> ++
> +Note that this setting should only be set by linkgit:git-init[1] or
> +linkgit:git-clone[1].  Trying to change it after initialization will not
> +work and will produce hard-to-diagnose issues.

The only other conflict is that we are trying to write at the end of
the file. extensions.sparseIndex should go after this option.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v5 13/15] Reftable support for git-core
  2021-03-23 11:40           ` Derrick Stolee
@ 2021-03-23 12:20             ` Ævar Arnfjörð Bjarmason
  2021-03-23 20:14               ` Junio C Hamano
  2021-03-23 20:12             ` Junio C Hamano
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-03-23 12:20 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys


On Tue, Mar 23 2021, Derrick Stolee wrote:

> On 3/12/2021 3:19 PM, Han-Wen Nienhuys via GitGitGadget wrote:
>> diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
>> index 4e23d73cdcad..82c5940f1434 100644
>> --- a/Documentation/config/extensions.txt
>> +++ b/Documentation/config/extensions.txt
>> @@ -6,3 +6,12 @@ extensions.objectFormat::
>>  Note that this setting should only be set by linkgit:git-init[1] or
>>  linkgit:git-clone[1].  Trying to change it after initialization will not
>>  work and will produce hard-to-diagnose issues.
>> ++
>
> I noticed while resolving conflicts with my series, which also edits this
> file, that the "+" line above should be removed. That likely munges the
> fact that the config entry below should be its own list item, not a
> continuation of the previous one.

I haven't tested this patch, but just a plug for the very useful
Documentation/doc-diff for discovering any such formatting errors when
making non-trivial doc changes.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v5 13/15] Reftable support for git-core
  2021-03-23 11:40           ` Derrick Stolee
  2021-03-23 12:20             ` Ævar Arnfjörð Bjarmason
@ 2021-03-23 20:12             ` Junio C Hamano
  1 sibling, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-03-23 20:12 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Han-Wen Nienhuys

Derrick Stolee <stolee@gmail.com> writes:

> On 3/12/2021 3:19 PM, Han-Wen Nienhuys via GitGitGadget wrote:
>> diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
>> index 4e23d73cdcad..82c5940f1434 100644
>> --- a/Documentation/config/extensions.txt
>> +++ b/Documentation/config/extensions.txt
>> @@ -6,3 +6,12 @@ extensions.objectFormat::
>>  Note that this setting should only be set by linkgit:git-init[1] or
>>  linkgit:git-clone[1].  Trying to change it after initialization will not
>>  work and will produce hard-to-diagnose issues.
>> ++
>
> I noticed while resolving conflicts with my series, which also edits this
> file, that the "+" line above should be removed. That likely munges the
> fact that the config entry below should be its own list item, not a
> continuation of the previous one.

Yup, I noticed them, too, while merging.  If I recall correctly,
even though using a blank line to separate the list items is
semantically more correct, using the plus-alone line produced either
identical, or visually indistinguishable, rendering when I tested.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v5 13/15] Reftable support for git-core
  2021-03-23 12:20             ` Ævar Arnfjörð Bjarmason
@ 2021-03-23 20:14               ` Junio C Hamano
  0 siblings, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-03-23 20:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Han-Wen Nienhuys via GitGitGadget, git,
	Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Han-Wen Nienhuys

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Mar 23 2021, Derrick Stolee wrote:
>
>> On 3/12/2021 3:19 PM, Han-Wen Nienhuys via GitGitGadget wrote:
>>> diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
>>> index 4e23d73cdcad..82c5940f1434 100644
>>> --- a/Documentation/config/extensions.txt
>>> +++ b/Documentation/config/extensions.txt
>>> @@ -6,3 +6,12 @@ extensions.objectFormat::
>>>  Note that this setting should only be set by linkgit:git-init[1] or
>>>  linkgit:git-clone[1].  Trying to change it after initialization will not
>>>  work and will produce hard-to-diagnose issues.
>>> ++
>>
>> I noticed while resolving conflicts with my series, which also edits this
>> file, that the "+" line above should be removed. That likely munges the
>> fact that the config entry below should be its own list item, not a
>> continuation of the previous one.
>
> I haven't tested this patch, but just a plug for the very useful
> Documentation/doc-diff for discovering any such formatting errors when
> making non-trivial doc changes.

Yup, doc-diff is a useful tool to try.  I recall however in this
particular case there was no difference in the rendered output, even
though a blank line that separates two enumerated items is
semantically more correct than a plus-alone line that adds another
paragraph to the single item.


^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v6 00/20] reftable library
  2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
                           ` (14 preceding siblings ...)
  2021-03-12 20:19         ` [PATCH v5 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25         ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 01/20] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
                             ` (20 more replies)
  15 siblings, 21 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

This splits the giant commit from
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

version 12 Apr 2021:

 * handle cleanup for concurrent access on Windows
 * split up commits further

Based on github.com/hanwen/reftable at
c33a40ad23dd622d8c72139c41a2be1cfa063e5f Add random suffix to table names

Han-Wen Nienhuys (19):
  init-db: set the_repository->hash_algo early on
  reftable: add LICENSE
  reftable: add error related functionality
  reftable: utility functions
  reftable: add blocksource, an abstraction for random access reads
  reftable: (de)serialization for the polymorphic record type.
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: generic interface to tables
  reftable: read reftable files
  reftable: reftable file level tests
  reftable: add a heap-based priority queue for reftable records
  reftable: add merged table view
  reftable: implement refname validation
  reftable: implement stack, a mutable database of reftable files.
  reftable: add dump utility
  Reftable support for git-core
  Add "test-tool dump-reftable" command.

SZEDER Gábor (1):
  git-prompt: prepare for reftable refs backend

 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |   48 +-
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   76 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/CMakeLists.txt           |   14 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 contrib/completion/git-prompt.sh              |    7 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1451 +++++++++++++++++
 reftable/.gitattributes                       |    1 +
 reftable/LICENSE                              |   31 +
 reftable/basics.c                             |  128 ++
 reftable/basics.h                             |   60 +
 reftable/basics_test.c                        |   98 ++
 reftable/block.c                              |  440 +++++
 reftable/block.h                              |  127 ++
 reftable/block_test.c                         |  121 ++
 reftable/blocksource.c                        |  148 ++
 reftable/blocksource.h                        |   22 +
 reftable/constants.h                          |   21 +
 reftable/dump.c                               |  213 +++
 reftable/error.c                              |   41 +
 reftable/generic.h                            |   32 +
 reftable/iter.c                               |  197 +++
 reftable/iter.h                               |   67 +
 reftable/merged.c                             |  367 +++++
 reftable/merged.h                             |   35 +
 reftable/merged_test.c                        |  291 ++++
 reftable/pq.c                                 |  115 ++
 reftable/pq.h                                 |   32 +
 reftable/pq_test.c                            |   72 +
 reftable/publicbasics.c                       |   58 +
 reftable/reader.c                             |  773 +++++++++
 reftable/reader.h                             |   66 +
 reftable/record.c                             | 1203 ++++++++++++++
 reftable/record.h                             |  139 ++
 reftable/record_test.c                        |  405 +++++
 reftable/refname.c                            |  209 +++
 reftable/refname.h                            |   29 +
 reftable/refname_test.c                       |  102 ++
 reftable/reftable-blocksource.h               |   49 +
 reftable/reftable-error.h                     |   62 +
 reftable/reftable-generic.h                   |   41 +
 reftable/reftable-iterator.h                  |   39 +
 reftable/reftable-malloc.h                    |   18 +
 reftable/reftable-merged.h                    |   72 +
 reftable/reftable-reader.h                    |   98 ++
 reftable/reftable-record.h                    |  114 ++
 reftable/reftable-stack.h                     |  123 ++
 reftable/reftable-tests.h                     |   23 +
 reftable/reftable-writer.h                    |  147 ++
 reftable/reftable.c                           |  115 ++
 reftable/reftable_test.c                      |  583 +++++++
 reftable/stack.c                              | 1376 ++++++++++++++++
 reftable/stack.h                              |   41 +
 reftable/stack_test.c                         |  928 +++++++++++
 reftable/system.h                             |   32 +
 reftable/test_framework.c                     |   23 +
 reftable/test_framework.h                     |   53 +
 reftable/tree.c                               |   63 +
 reftable/tree.h                               |   34 +
 reftable/tree_test.c                          |   61 +
 reftable/writer.c                             |  691 ++++++++
 reftable/writer.h                             |   50 +
 reftable/zlib-compat.c                        |   92 ++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/helper/test-reftable.c                      |   21 +
 t/helper/test-tool.c                          |    4 +-
 t/helper/test-tool.h                          |    2 +
 t/t0031-reftable.sh                           |  203 +++
 t/t0032-reftable-unittest.sh                  |   15 +
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 83 files changed, 12509 insertions(+), 40 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/error.c
 create mode 100644 reftable/generic.h
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/pq_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-blocksource.h
 create mode 100644 reftable/reftable-error.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-reader.h
 create mode 100644 reftable/reftable-record.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/reftable_test.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 reftable/zlib-compat.c
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0031-reftable.sh
 create mode 100755 t/t0032-reftable-unittest.sh


base-commit: 89b43f80a514aee58b662ad606e6352e03eaeee4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v6
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v5:

  1:  e1d1d9f49807 =  1:  15b8306fbe4b init-db: set the_repository->hash_algo early on
  2:  502a66befabe =  2:  02bd1f40fa4e reftable: add LICENSE
  3:  1bd136f6a6df =  3:  bffa33c012ad reftable: add error related functionality
  4:  5bbbafb428e9 !  4:  df8003cb9a7d reftable: utility functions
     @@ Makefile: THIRD_PARTY_SOURCES += compat/regex/%
       EXTLIBS =
       
       GIT_USER_AGENT = git/$(GIT_VERSION)
     -@@ Makefile: XDIFF_OBJS += xdiff/xpatience.o
     - XDIFF_OBJS += xdiff/xprepare.o
     - XDIFF_OBJS += xdiff/xutils.o
     +@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
     + .PHONY: xdiff-objs
     + xdiff-objs: $(XDIFF_OBJS)
       
      +REFTABLE_OBJS += reftable/basics.o
      +REFTABLE_OBJS += reftable/error.o
     @@ Makefile: XDIFF_OBJS += xdiff/xpatience.o
      +REFTABLE_TEST_OBJS += reftable/basics_test.o
      +
       TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
     - OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
     - 	$(XDIFF_OBJS) \
     - 	$(FUZZ_OBJS) \
     -+	$(REFTABLE_OBJS) \
     -+	$(REFTABLE_TEST_OBJS) \
     - 	common-main.o \
     - 	git.o
     ++
     + .PHONY: test-objs
     + test-objs: $(TEST_OBJS)
     + 
     +@@ Makefile: OBJECTS += $(PROGRAM_OBJS)
     + OBJECTS += $(TEST_OBJS)
     + OBJECTS += $(XDIFF_OBJS)
     + OBJECTS += $(FUZZ_OBJS)
     ++OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
     ++
       ifndef NO_CURL
     + 	OBJECTS += http.o http-walker.o remote-curl.o
     + endif
      @@ Makefile: $(LIB_FILE): $(LIB_OBJS)
       $(XDIFF_LIB): $(XDIFF_OBJS)
       	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
  5:  e41fa780882d !  5:  6807af53e9a0 reftable: add blocksource, an abstraction for random access reads
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
     +@@ Makefile: xdiff-objs: $(XDIFF_OBJS)
       
       REFTABLE_OBJS += reftable/basics.o
       REFTABLE_OBJS += reftable/error.o
  6:  390eaca4fc8c =  6:  78a352c96f82 reftable: (de)serialization for the polymorphic record type.
  7:  b108525009d9 !  7:  9297b9c363f6 reftable: reading/writing blocks
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: XDIFF_OBJS += xdiff/xutils.o
     +@@ Makefile: xdiff-objs: $(XDIFF_OBJS)
       
       REFTABLE_OBJS += reftable/basics.o
       REFTABLE_OBJS += reftable/error.o
  8:  b6eed7283aac !  8:  ceddefadd48c reftable: a generic binary tree implementation
     @@ Makefile: REFTABLE_OBJS += reftable/block.o
      +REFTABLE_TEST_OBJS += reftable/tree_test.o
       
       TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
     - OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \
     + 
      
       ## reftable/tree.c (new) ##
      @@
  9:  f2f005afca19 =  9:  1f635e6c86c5 reftable: write reftable files
  -:  ------------ > 10:  9c3fd77fbbac reftable: generic interface to tables
 10:  1f17d5edab52 ! 11:  8ba486f44a72 reftable: read reftable files
     @@ Makefile: REFTABLE_OBJS += reftable/basics.o
       REFTABLE_OBJS += reftable/publicbasics.o
      +REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
     -+REFTABLE_OBJS += reftable/reftable.o
     + REFTABLE_OBJS += reftable/reftable.o
       REFTABLE_OBJS += reftable/tree.o
     - REFTABLE_OBJS += reftable/writer.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
      
       ## reftable/iter.c (new) ##
      @@
     @@ reftable/iter.c (new)
      +#include "system.h"
      +
      +#include "block.h"
     ++#include "generic.h"
      +#include "constants.h"
      +#include "reader.h"
      +#include "reftable-error.h"
     @@ reftable/iter.c (new)
      +	return it->ops == NULL;
      +}
      +
     -+static int empty_iterator_next(void *arg, struct reftable_record *rec)
     -+{
     -+	return 1;
     -+}
     -+
     -+static void empty_iterator_close(void *arg)
     -+{
     -+}
     -+
     -+static struct reftable_iterator_vtable empty_vtable = {
     -+	.next = &empty_iterator_next,
     -+	.close = &empty_iterator_close,
     -+};
     -+
     -+void iterator_set_empty(struct reftable_iterator *it)
     -+{
     -+	assert(it->ops == NULL);
     -+	it->iter_arg = NULL;
     -+	it->ops = &empty_vtable;
     -+}
     -+
     -+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
     -+{
     -+	return it->ops->next(it->iter_arg, rec);
     -+}
     -+
     -+void reftable_iterator_destroy(struct reftable_iterator *it)
     -+{
     -+	if (it->ops == NULL) {
     -+		return;
     -+	}
     -+	it->ops->close(it->iter_arg);
     -+	it->ops = NULL;
     -+	FREE_AND_NULL(it->iter_arg);
     -+}
     -+
     -+int reftable_iterator_next_ref(struct reftable_iterator *it,
     -+			       struct reftable_ref_record *ref)
     -+{
     -+	struct reftable_record rec = { NULL };
     -+	reftable_record_from_ref(&rec, ref);
     -+	return iterator_next(it, &rec);
     -+}
     -+
     -+int reftable_iterator_next_log(struct reftable_iterator *it,
     -+			       struct reftable_log_record *log)
     -+{
     -+	struct reftable_record rec = { NULL };
     -+	reftable_record_from_log(&rec, log);
     -+	return iterator_next(it, &rec);
     -+}
     -+
      +static void filtering_ref_iterator_close(void *iter_arg)
      +{
      +	struct filtering_ref_iterator *fri =
     @@ reftable/iter.h (new)
      +#include "reftable-iterator.h"
      +#include "reftable-generic.h"
      +
     -+struct reftable_iterator_vtable {
     -+	int (*next)(void *iter_arg, struct reftable_record *rec);
     -+	void (*close)(void *iter_arg);
     -+};
     -+
     -+void iterator_set_empty(struct reftable_iterator *it);
     -+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
     -+
      +/* Returns true for a zeroed out iterator, such as the one returned from
      + * iterator_destroy. */
      +int iterator_is_null(struct reftable_iterator *it);
     @@ reftable/reader.c (new)
      +#include "system.h"
      +#include "block.h"
      +#include "constants.h"
     ++#include "generic.h"
      +#include "iter.h"
      +#include "record.h"
      +#include "reftable-error.h"
     ++#include "reftable-generic.h"
      +#include "tree.h"
      +
      +uint64_t block_source_size(struct reftable_block_source *source)
     @@ reftable/reader.c (new)
      +uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
      +{
      +	return r->min_update_index;
     ++}
     ++
     ++/* generic table interface. */
     ++
     ++static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
     ++				     struct reftable_record *rec)
     ++{
     ++	return reader_seek((struct reftable_reader *)tab, it, rec);
     ++}
     ++
     ++static uint32_t reftable_reader_hash_id_void(void *tab)
     ++{
     ++	return reftable_reader_hash_id((struct reftable_reader *)tab);
     ++}
     ++
     ++static uint64_t reftable_reader_min_update_index_void(void *tab)
     ++{
     ++	return reftable_reader_min_update_index((struct reftable_reader *)tab);
     ++}
     ++
     ++static uint64_t reftable_reader_max_update_index_void(void *tab)
     ++{
     ++	return reftable_reader_max_update_index((struct reftable_reader *)tab);
     ++}
     ++
     ++static struct reftable_table_vtable reader_vtable = {
     ++	.seek_record = reftable_reader_seek_void,
     ++	.hash_id = reftable_reader_hash_id_void,
     ++	.min_update_index = reftable_reader_min_update_index_void,
     ++	.max_update_index = reftable_reader_max_update_index_void,
     ++};
     ++
     ++void reftable_table_from_reader(struct reftable_table *tab,
     ++				struct reftable_reader *reader)
     ++{
     ++	assert(tab->ops == NULL);
     ++	tab->ops = &reader_vtable;
     ++	tab->table_arg = reader;
      +}
      
       ## reftable/reader.h (new) ##
     @@ reftable/reader.h (new)
      +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
      +			     uint64_t next_off, uint8_t want_typ);
      +
     -+/* generic interface to reftables */
     -+struct reftable_table_vtable {
     -+	int (*seek_record)(void *tab, struct reftable_iterator *it,
     -+			   struct reftable_record *);
     -+	uint32_t (*hash_id)(void *tab);
     -+	uint64_t (*min_update_index)(void *tab);
     -+	uint64_t (*max_update_index)(void *tab);
     -+};
     -+
     -+#endif
     -
     - ## reftable/reftable-iterator.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef REFTABLE_ITERATOR_H
     -+#define REFTABLE_ITERATOR_H
     -+
     -+#include "reftable-record.h"
     -+
     -+/* iterator is the generic interface for walking over data stored in a
     -+ * reftable.
     -+ */
     -+struct reftable_iterator {
     -+	struct reftable_iterator_vtable *ops;
     -+	void *iter_arg;
     -+};
     -+
     -+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
     -+ * end of iteration.
     -+ */
     -+int reftable_iterator_next_ref(struct reftable_iterator *it,
     -+			       struct reftable_ref_record *ref);
     -+
     -+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
     -+ * end of iteration.
     -+ */
     -+int reftable_iterator_next_log(struct reftable_iterator *it,
     -+			       struct reftable_log_record *log);
     -+
     -+/* releases resources associated with an iterator. */
     -+void reftable_iterator_destroy(struct reftable_iterator *it);
     -+
      +#endif
      
       ## reftable/reftable-reader.h (new) ##
      @@
      +/*
     -+Copyright 2020 Google LLC
     ++  Copyright 2020 Google LLC
      +
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     ++  Use of this source code is governed by a BSD-style
     ++  license that can be found in the LICENSE file or at
     ++  https://developers.google.com/open-source/licenses/bsd
      +*/
      +
      +#ifndef REFTABLE_READER_H
     @@ reftable/reftable-reader.h (new)
      +/*
      + * Reading single tables
      + *
     -+ * The follow routines are for reading single files. For an application-level
     -+ * interface, skip ahead to struct reftable_merged_table and struct
     -+ * reftable_stack.
     ++ * The follow routines are for reading single files. For an
     ++ * application-level interface, skip ahead to struct
     ++ * reftable_merged_table and struct reftable_stack.
      + */
      +
      +/* The reader struct is a handle to an open reftable file. */
      +struct reftable_reader;
      +
     -+/* reftable_new_reader opens a reftable for reading. If successful, returns 0
     -+ * code and sets pp. The name is used for creating a stack. Typically, it is the
     -+ * basename of the file. The block source `src` is owned by the reader, and is
     -+ * closed on calling reftable_reader_destroy().
     ++/* Generic table. */
     ++struct reftable_table;
     ++
     ++/* reftable_new_reader opens a reftable for reading. If successful,
     ++ * returns 0 code and sets pp. The name is used for creating a
     ++ * stack. Typically, it is the basename of the file. The block source
     ++ * `src` is owned by the reader, and is closed on calling
     ++ * reftable_reader_destroy(). On error, the block source `src` is
     ++ * closed as well.
      + */
      +int reftable_new_reader(struct reftable_reader **pp,
      +			struct reftable_block_source *src, const char *name);
     @@ reftable/reftable-reader.h (new)
      +   if (err < 0) { ... }
      +   struct reftable_ref_record ref  = {0};
      +   while (1) {
     -+     err = reftable_iterator_next_ref(&it, &ref);
     -+     if (err > 0) {
     -+       break;
     -+     }
     -+     if (err < 0) {
     -+       ..error handling..
     -+     }
     -+     ..found..
     ++   err = reftable_iterator_next_ref(&it, &ref);
     ++   if (err > 0) {
     ++   break;
     ++   }
     ++   if (err < 0) {
     ++   ..error handling..
     ++   }
     ++   ..found..
      +   }
      +   reftable_iterator_destroy(&it);
      +   reftable_ref_record_release(&ref);
     -+ */
     ++*/
      +int reftable_reader_seek_ref(struct reftable_reader *r,
      +			     struct reftable_iterator *it, const char *name);
      +
     @@ reftable/reftable-reader.h (new)
      +
      +/* seek to logs for the given name, older than update_index. To seek to the
      +   start of the table, use name = "".
     -+ */
     ++*/
      +int reftable_reader_seek_log_at(struct reftable_reader *r,
      +				struct reftable_iterator *it, const char *name,
      +				uint64_t update_index);
     @@ reftable/reftable-reader.h (new)
      +/* return the min_update_index for a table */
      +uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
      +
     ++/* creates a generic table from a file reader. */
     ++void reftable_table_from_reader(struct reftable_table *tab,
     ++				struct reftable_reader *reader);
     ++
      +#endif
 11:  c916629c562a ! 12:  9ade9303f08f reftable: reftable file level tests
     @@ reftable/reftable_test.c (new)
      +#include "record.h"
      +#include "test_framework.h"
      +#include "reftable-tests.h"
     -+#include "reftable-stack.h"
     ++#include "reftable-writer.h"
      +
      +static const int update_index = 5;
      +
  -:  ------------ > 13:  0a6119d910a3 reftable: add a heap-based priority queue for reftable records
  -:  ------------ > 14:  393fcecdae1e reftable: add merged table view
  -:  ------------ > 15:  3186b2b70f73 reftable: implement refname validation
 12:  28aa69f7bbcb ! 16:  b5492d5a13d7 reftable: rest of library
     @@ Metadata
      Author: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Commit message ##
     -    reftable: rest of library
     +    reftable: implement stack, a mutable database of reftable files.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/error.o
     - REFTABLE_OBJS += reftable/block.o
     - REFTABLE_OBJS += reftable/blocksource.o
     - REFTABLE_OBJS += reftable/iter.o
     -+REFTABLE_OBJS += reftable/merged.o
     -+REFTABLE_OBJS += reftable/pq.o
     - REFTABLE_OBJS += reftable/publicbasics.o
     - REFTABLE_OBJS += reftable/reader.o
     +@@ Makefile: REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
     -+REFTABLE_OBJS += reftable/refname.o
     + REFTABLE_OBJS += reftable/refname.o
       REFTABLE_OBJS += reftable/reftable.o
      +REFTABLE_OBJS += reftable/stack.o
       REFTABLE_OBJS += reftable/tree.o
       REFTABLE_OBJS += reftable/writer.o
       REFTABLE_OBJS += reftable/zlib-compat.o
     -
     - ## reftable/VERSION (new) ##
     -@@
     -+a337d48d4d42d513d6baa33addc172f0e0e36288 C: use tagged union for reftable_log_record
     -
     - ## reftable/dump.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include <stddef.h>
     -+#include <stdio.h>
     -+#include <stdlib.h>
     -+#include <unistd.h>
     -+#include <string.h>
     -+
     -+#include "reftable.h"
     -+#include "reftable-tests.h"
     -+
     -+static uint32_t hash_id;
     -+
     -+static int dump_table(const char *tablename)
     -+{
     -+	struct reftable_block_source src = { 0 };
     -+	int err = reftable_block_source_from_file(&src, tablename);
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     -+	struct reftable_reader *r = NULL;
     -+
     -+	if (err < 0)
     -+		return err;
     -+
     -+	err = reftable_new_reader(&r, &src, tablename);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	err = reftable_reader_seek_ref(r, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+
     -+	while (1) {
     -+		err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_ref_record_print(&ref, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_ref_record_clear(&ref);
     -+
     -+	err = reftable_reader_seek_log(r, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+	while (1) {
     -+		err = reftable_iterator_next_log(&it, &log);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_log_record_print(&log, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_log_record_clear(&log);
     -+
     -+	reftable_reader_free(r);
     -+	return 0;
     -+}
     -+
     -+static int compact_stack(const char *stackdir)
     -+{
     -+	struct reftable_stack *stack = NULL;
     -+	struct reftable_write_options cfg = {};
     -+
     -+	int err = reftable_new_stack(&stack, stackdir, cfg);
     -+	if (err < 0)
     -+		goto done;
     -+
     -+	err = reftable_stack_compact_all(stack, NULL);
     -+	if (err < 0)
     -+		goto done;
     -+done:
     -+	if (stack != NULL) {
     -+		reftable_stack_destroy(stack);
     -+	}
     -+	return err;
     -+}
     -+
     -+static int dump_stack(const char *stackdir)
     -+{
     -+	struct reftable_stack *stack = NULL;
     -+	struct reftable_write_options cfg = {};
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     -+	struct reftable_merged_table *merged = NULL;
     -+
     -+	int err = reftable_new_stack(&stack, stackdir, cfg);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	merged = reftable_stack_merged_table(stack);
     -+
     -+	err = reftable_merged_table_seek_ref(merged, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+
     -+	while (1) {
     -+		err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_ref_record_print(&ref, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_ref_record_clear(&ref);
     -+
     -+	err = reftable_merged_table_seek_log(merged, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+	while (1) {
     -+		err = reftable_iterator_next_log(&it, &log);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_log_record_print(&log, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_log_record_clear(&log);
     -+
     -+	reftable_stack_destroy(stack);
     -+	return 0;
     -+}
     -+
     -+static void print_help(void)
     -+{
     -+	printf("usage: dump [-cst] arg\n\n"
     -+	       "options: \n"
     -+	       "  -c compact\n"
     -+	       "  -t dump table\n"
     -+	       "  -s dump stack\n"
     -+	       "  -h this help\n"
     -+	       "  -2 use SHA256\n"
     -+	       "\n");
     -+}
     -+
     -+int reftable_dump_main(int argc, char *const *argv)
     -+{
     -+	int err = 0;
     -+	int opt;
     -+	int opt_dump_table = 0;
     -+	int opt_dump_stack = 0;
     -+	int opt_compact = 0;
     -+	const char *arg = NULL;
     -+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
     -+		switch (opt) {
     -+		case '2':
     -+			hash_id = 0x73323536;
     -+			break;
     -+		case 't':
     -+			opt_dump_table = 1;
     -+			break;
     -+		case 's':
     -+			opt_dump_stack = 1;
     -+			break;
     -+		case 'c':
     -+			opt_compact = 1;
     -+			break;
     -+		case '?':
     -+		case 'h':
     -+			print_help();
     -+			return 2;
     -+			break;
     -+		}
     -+	}
     -+
     -+	if (argv[optind] == NULL) {
     -+		fprintf(stderr, "need argument\n");
     -+		print_help();
     -+		return 2;
     -+	}
     -+
     -+	arg = argv[optind];
     -+
     -+	if (opt_dump_table) {
     -+		err = dump_table(arg);
     -+	} else if (opt_dump_stack) {
     -+		err = dump_stack(arg);
     -+	} else if (opt_compact) {
     -+		err = compact_stack(arg);
     -+	}
     -+
     -+	if (err < 0) {
     -+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
     -+			reftable_error_str(err));
     -+		return 1;
     -+	}
     -+	return 0;
     -+}
     -
     - ## reftable/merged.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "merged.h"
     -+
     -+#include "constants.h"
     -+#include "iter.h"
     -+#include "pq.h"
     -+#include "reader.h"
     -+#include "record.h"
     -+#include "reftable-merged.h"
     -+#include "reftable-error.h"
     -+#include "system.h"
     -+
     -+static int merged_iter_init(struct merged_iter *mi)
     -+{
     -+	int i = 0;
     -+	for (i = 0; i < mi->stack_len; i++) {
     -+		struct reftable_record rec = reftable_new_record(mi->typ);
     -+		int err = iterator_next(&mi->stack[i], &rec);
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+
     -+		if (err > 0) {
     -+			reftable_iterator_destroy(&mi->stack[i]);
     -+			reftable_record_destroy(&rec);
     -+		} else {
     -+			struct pq_entry e = {
     -+				.rec = rec,
     -+				.index = i,
     -+			};
     -+			merged_iter_pqueue_add(&mi->pq, e);
     -+		}
     -+	}
     -+
     -+	return 0;
     -+}
     -+
     -+static void merged_iter_close(void *p)
     -+{
     -+	struct merged_iter *mi = (struct merged_iter *)p;
     -+	int i = 0;
     -+	merged_iter_pqueue_release(&mi->pq);
     -+	for (i = 0; i < mi->stack_len; i++) {
     -+		reftable_iterator_destroy(&mi->stack[i]);
     -+	}
     -+	reftable_free(mi->stack);
     -+}
     -+
     -+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
     -+					       size_t idx)
     -+{
     -+	struct reftable_record rec = reftable_new_record(mi->typ);
     -+	struct pq_entry e = {
     -+		.rec = rec,
     -+		.index = idx,
     -+	};
     -+	int err = iterator_next(&mi->stack[idx], &rec);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	if (err > 0) {
     -+		reftable_iterator_destroy(&mi->stack[idx]);
     -+		reftable_record_destroy(&rec);
     -+		return 0;
     -+	}
     -+
     -+	merged_iter_pqueue_add(&mi->pq, e);
     -+	return 0;
     -+}
     -+
     -+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
     -+{
     -+	if (iterator_is_null(&mi->stack[idx]))
     -+		return 0;
     -+	return merged_iter_advance_nonnull_subiter(mi, idx);
     -+}
     -+
     -+static int merged_iter_next_entry(struct merged_iter *mi,
     -+				  struct reftable_record *rec)
     -+{
     -+	struct strbuf entry_key = STRBUF_INIT;
     -+	struct pq_entry entry = { 0 };
     -+	int err = 0;
     -+
     -+	if (merged_iter_pqueue_is_empty(mi->pq))
     -+		return 1;
     -+
     -+	entry = merged_iter_pqueue_remove(&mi->pq);
     -+	err = merged_iter_advance_subiter(mi, entry.index);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	/*
     -+	  One can also use reftable as datacenter-local storage, where the ref
     -+	  database is maintained in globally consistent database (eg.
     -+	  CockroachDB or Spanner). In this scenario, replication delays together
     -+	  with compaction may cause newer tables to contain older entries. In
     -+	  such a deployment, the loop below must be changed to collect all
     -+	  entries for the same key, and return new the newest one.
     -+	*/
     -+	reftable_record_key(&entry.rec, &entry_key);
     -+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
     -+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
     -+		struct strbuf k = STRBUF_INIT;
     -+		int err = 0, cmp = 0;
     -+
     -+		reftable_record_key(&top.rec, &k);
     -+
     -+		cmp = strbuf_cmp(&k, &entry_key);
     -+		strbuf_release(&k);
     -+
     -+		if (cmp > 0) {
     -+			break;
     -+		}
     -+
     -+		merged_iter_pqueue_remove(&mi->pq);
     -+		err = merged_iter_advance_subiter(mi, top.index);
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_record_destroy(&top.rec);
     -+	}
     -+
     -+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
     -+	reftable_record_destroy(&entry.rec);
     -+	strbuf_release(&entry_key);
     -+	return 0;
     -+}
     -+
     -+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
     -+{
     -+	while (1) {
     -+		int err = merged_iter_next_entry(mi, rec);
     -+		if (err == 0 && mi->suppress_deletions &&
     -+		    reftable_record_is_deletion(rec)) {
     -+			continue;
     -+		}
     -+
     -+		return err;
     -+	}
     -+}
     -+
     -+static int merged_iter_next_void(void *p, struct reftable_record *rec)
     -+{
     -+	struct merged_iter *mi = (struct merged_iter *)p;
     -+	if (merged_iter_pqueue_is_empty(mi->pq))
     -+		return 1;
     -+
     -+	return merged_iter_next(mi, rec);
     -+}
     -+
     -+static struct reftable_iterator_vtable merged_iter_vtable = {
     -+	.next = &merged_iter_next_void,
     -+	.close = &merged_iter_close,
     -+};
     -+
     -+static void iterator_from_merged_iter(struct reftable_iterator *it,
     -+				      struct merged_iter *mi)
     -+{
     -+	assert(it->ops == NULL);
     -+	it->iter_arg = mi;
     -+	it->ops = &merged_iter_vtable;
     -+}
     -+
     -+int reftable_new_merged_table(struct reftable_merged_table **dest,
     -+			      struct reftable_table *stack, int n,
     -+			      uint32_t hash_id)
     -+{
     -+	struct reftable_merged_table *m = NULL;
     -+	uint64_t last_max = 0;
     -+	uint64_t first_min = 0;
     -+	int i = 0;
     -+	for (i = 0; i < n; i++) {
     -+		uint64_t min = reftable_table_min_update_index(&stack[i]);
     -+		uint64_t max = reftable_table_max_update_index(&stack[i]);
     -+
     -+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
     -+			return REFTABLE_FORMAT_ERROR;
     -+		}
     -+		if (i == 0 || min < first_min) {
     -+			first_min = min;
     -+		}
     -+		if (i == 0 || max > last_max) {
     -+			last_max = max;
     -+		}
     -+	}
     -+
     -+	m = (struct reftable_merged_table *)reftable_calloc(
     -+		sizeof(struct reftable_merged_table));
     -+	m->stack = stack;
     -+	m->stack_len = n;
     -+	m->min = first_min;
     -+	m->max = last_max;
     -+	m->hash_id = hash_id;
     -+	*dest = m;
     -+	return 0;
     -+}
     -+
     -+/* clears the list of subtable, without affecting the readers themselves. */
     -+void merged_table_release(struct reftable_merged_table *mt)
     -+{
     -+	FREE_AND_NULL(mt->stack);
     -+	mt->stack_len = 0;
     -+}
     -+
     -+void reftable_merged_table_free(struct reftable_merged_table *mt)
     -+{
     -+	if (mt == NULL) {
     -+		return;
     -+	}
     -+	merged_table_release(mt);
     -+	reftable_free(mt);
     -+}
     -+
     -+uint64_t
     -+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
     -+{
     -+	return mt->max;
     -+}
     -+
     -+uint64_t
     -+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
     -+{
     -+	return mt->min;
     -+}
     -+
     -+static int reftable_table_seek_record(struct reftable_table *tab,
     -+				      struct reftable_iterator *it,
     -+				      struct reftable_record *rec)
     -+{
     -+	return tab->ops->seek_record(tab->table_arg, it, rec);
     -+}
     -+
     -+static int merged_table_seek_record(struct reftable_merged_table *mt,
     -+				    struct reftable_iterator *it,
     -+				    struct reftable_record *rec)
     -+{
     -+	struct reftable_iterator *iters = reftable_calloc(
     -+		sizeof(struct reftable_iterator) * mt->stack_len);
     -+	struct merged_iter merged = {
     -+		.stack = iters,
     -+		.typ = reftable_record_type(rec),
     -+		.hash_id = mt->hash_id,
     -+		.suppress_deletions = mt->suppress_deletions,
     -+	};
     -+	int n = 0;
     -+	int err = 0;
     -+	int i = 0;
     -+	for (i = 0; i < mt->stack_len && err == 0; i++) {
     -+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
     -+						   rec);
     -+		if (e < 0) {
     -+			err = e;
     -+		}
     -+		if (e == 0) {
     -+			n++;
     -+		}
     -+	}
     -+	if (err < 0) {
     -+		int i = 0;
     -+		for (i = 0; i < n; i++) {
     -+			reftable_iterator_destroy(&iters[i]);
     -+		}
     -+		reftable_free(iters);
     -+		return err;
     -+	}
     -+
     -+	merged.stack_len = n;
     -+	err = merged_iter_init(&merged);
     -+	if (err < 0) {
     -+		merged_iter_close(&merged);
     -+		return err;
     -+	} else {
     -+		struct merged_iter *p =
     -+			reftable_malloc(sizeof(struct merged_iter));
     -+		*p = merged;
     -+		iterator_from_merged_iter(it, p);
     -+	}
     -+	return 0;
     -+}
     -+
     -+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name)
     -+{
     -+	struct reftable_ref_record ref = {
     -+		.refname = (char *)name,
     -+	};
     -+	struct reftable_record rec = { NULL };
     -+	reftable_record_from_ref(&rec, &ref);
     -+	return merged_table_seek_record(mt, it, &rec);
     -+}
     -+
     -+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
     -+				      struct reftable_iterator *it,
     -+				      const char *name, uint64_t update_index)
     -+{
     -+	struct reftable_log_record log = {
     -+		.refname = (char *)name,
     -+		.update_index = update_index,
     -+	};
     -+	struct reftable_record rec = { NULL };
     -+	reftable_record_from_log(&rec, &log);
     -+	return merged_table_seek_record(mt, it, &rec);
     -+}
     -+
     -+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name)
     -+{
     -+	uint64_t max = ~((uint64_t)0);
     -+	return reftable_merged_table_seek_log_at(mt, it, name, max);
     -+}
     -+
     -+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
     -+{
     -+	return mt->hash_id;
     -+}
     -+
     -+static int reftable_merged_table_seek_void(void *tab,
     -+					   struct reftable_iterator *it,
     -+					   struct reftable_record *rec)
     -+{
     -+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
     -+					rec);
     -+}
     -+
     -+static uint32_t reftable_merged_table_hash_id_void(void *tab)
     -+{
     -+	return reftable_merged_table_hash_id(
     -+		(struct reftable_merged_table *)tab);
     -+}
     -+
     -+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
     -+{
     -+	return reftable_merged_table_min_update_index(
     -+		(struct reftable_merged_table *)tab);
     -+}
     -+
     -+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
     -+{
     -+	return reftable_merged_table_max_update_index(
     -+		(struct reftable_merged_table *)tab);
     -+}
     -+
     -+static struct reftable_table_vtable merged_table_vtable = {
     -+	.seek_record = reftable_merged_table_seek_void,
     -+	.hash_id = reftable_merged_table_hash_id_void,
     -+	.min_update_index = reftable_merged_table_min_update_index_void,
     -+	.max_update_index = reftable_merged_table_max_update_index_void,
     -+};
     -+
     -+void reftable_table_from_merged_table(struct reftable_table *tab,
     -+				      struct reftable_merged_table *merged)
     -+{
     -+	assert(tab->ops == NULL);
     -+	tab->ops = &merged_table_vtable;
     -+	tab->table_arg = merged;
     -+}
     -
     - ## reftable/merged.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef MERGED_H
     -+#define MERGED_H
     -+
     -+#include "pq.h"
     -+
     -+struct reftable_merged_table {
     -+	struct reftable_table *stack;
     -+	size_t stack_len;
     -+	uint32_t hash_id;
     -+	int suppress_deletions;
     -+
     -+	uint64_t min;
     -+	uint64_t max;
     -+};
     -+
     -+struct merged_iter {
     -+	struct reftable_iterator *stack;
     -+	uint32_t hash_id;
     -+	size_t stack_len;
     -+	uint8_t typ;
     -+	int suppress_deletions;
     -+	struct merged_iter_pqueue pq;
     -+};
     -+
     -+void merged_table_release(struct reftable_merged_table *mt);
     -+
     -+#endif
     -
     - ## reftable/merged_test.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "merged.h"
     -+
     -+#include "system.h"
     -+
     -+#include "basics.h"
     -+#include "blocksource.h"
     -+#include "constants.h"
     -+#include "pq.h"
     -+#include "reader.h"
     -+#include "record.h"
     -+#include "test_framework.h"
     -+#include "reftable-merged.h"
     -+#include "reftable-tests.h"
     -+#include "reftable-generic.h"
     -+#include "reftable-stack.h"
     -+
     -+static void test_pq(void)
     -+{
     -+	char *names[54] = { NULL };
     -+	int N = ARRAY_SIZE(names) - 1;
     -+
     -+	struct merged_iter_pqueue pq = { NULL };
     -+	const char *last = NULL;
     -+
     -+	int i = 0;
     -+	for (i = 0; i < N; i++) {
     -+		char name[100];
     -+		snprintf(name, sizeof(name), "%02d", i);
     -+		names[i] = xstrdup(name);
     -+	}
     -+
     -+	i = 1;
     -+	do {
     -+		struct reftable_record rec =
     -+			reftable_new_record(BLOCK_TYPE_REF);
     -+		struct pq_entry e = { 0 };
     -+
     -+		reftable_record_as_ref(&rec)->refname = names[i];
     -+		e.rec = rec;
     -+		merged_iter_pqueue_add(&pq, e);
     -+		merged_iter_pqueue_check(pq);
     -+		i = (i * 7) % N;
     -+	} while (i != 1);
     -+
     -+	while (!merged_iter_pqueue_is_empty(pq)) {
     -+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
     -+		struct reftable_ref_record *ref =
     -+			reftable_record_as_ref(&e.rec);
     -+
     -+		merged_iter_pqueue_check(pq);
     -+
     -+		if (last != NULL) {
     -+			assert(strcmp(last, ref->refname) < 0);
     -+		}
     -+		last = ref->refname;
     -+		ref->refname = NULL;
     -+		reftable_free(ref);
     -+	}
     -+
     -+	for (i = 0; i < N; i++) {
     -+		reftable_free(names[i]);
     -+	}
     -+
     -+	merged_iter_pqueue_release(&pq);
     -+}
     -+
     -+static void write_test_table(struct strbuf *buf,
     -+			     struct reftable_ref_record refs[], int n)
     -+{
     -+	int min = 0xffffffff;
     -+	int max = 0;
     -+	int i = 0;
     -+	int err;
     -+
     -+	struct reftable_write_options opts = {
     -+		.block_size = 256,
     -+	};
     -+	struct reftable_writer *w = NULL;
     -+	for (i = 0; i < n; i++) {
     -+		uint64_t ui = refs[i].update_index;
     -+		if (ui > max) {
     -+			max = ui;
     -+		}
     -+		if (ui < min) {
     -+			min = ui;
     -+		}
     -+	}
     -+
     -+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
     -+	reftable_writer_set_limits(w, min, max);
     -+
     -+	for (i = 0; i < n; i++) {
     -+		uint64_t before = refs[i].update_index;
     -+		int n = reftable_writer_add_ref(w, &refs[i]);
     -+		assert(n == 0);
     -+		assert(before == refs[i].update_index);
     -+	}
     -+
     -+	err = reftable_writer_close(w);
     -+	EXPECT_ERR(err);
     -+
     -+	reftable_writer_free(w);
     -+}
     -+
     -+static struct reftable_merged_table *
     -+merged_table_from_records(struct reftable_ref_record **refs,
     -+			  struct reftable_block_source **source,
     -+			  struct reftable_reader ***readers, int *sizes,
     -+			  struct strbuf *buf, int n)
     -+{
     -+	int i = 0;
     -+	struct reftable_merged_table *mt = NULL;
     -+	int err;
     -+	struct reftable_table *tabs =
     -+		reftable_calloc(n * sizeof(struct reftable_table));
     -+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
     -+	*source = reftable_calloc(n * sizeof(**source));
     -+	for (i = 0; i < n; i++) {
     -+		write_test_table(&buf[i], refs[i], sizes[i]);
     -+		block_source_from_strbuf(&(*source)[i], &buf[i]);
     -+
     -+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
     -+					  "name");
     -+		EXPECT_ERR(err);
     -+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
     -+	}
     -+
     -+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
     -+	EXPECT_ERR(err);
     -+	return mt;
     -+}
     -+
     -+static void readers_destroy(struct reftable_reader **readers, size_t n)
     -+{
     -+	int i = 0;
     -+	for (; i < n; i++)
     -+		reftable_reader_free(readers[i]);
     -+	reftable_free(readers);
     -+}
     -+
     -+static void test_merged_between(void)
     -+{
     -+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
     -+
     -+	struct reftable_ref_record r1[] = { {
     -+		.refname = "b",
     -+		.update_index = 1,
     -+		.value_type = REFTABLE_REF_VAL1,
     -+		.value.val1 = hash1,
     -+	} };
     -+	struct reftable_ref_record r2[] = { {
     -+		.refname = "a",
     -+		.update_index = 2,
     -+		.value_type = REFTABLE_REF_DELETION,
     -+	} };
     -+
     -+	struct reftable_ref_record *refs[] = { r1, r2 };
     -+	int sizes[] = { 1, 1 };
     -+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
     -+	struct reftable_block_source *bs = NULL;
     -+	struct reftable_reader **readers = NULL;
     -+	struct reftable_merged_table *mt =
     -+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
     -+	int i;
     -+	struct reftable_ref_record ref = { NULL };
     -+	struct reftable_iterator it = { NULL };
     -+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
     -+	EXPECT_ERR(err);
     -+
     -+	err = reftable_iterator_next_ref(&it, &ref);
     -+	EXPECT_ERR(err);
     -+	EXPECT(ref.update_index == 2);
     -+	reftable_ref_record_release(&ref);
     -+	reftable_iterator_destroy(&it);
     -+	readers_destroy(readers, 2);
     -+	reftable_merged_table_free(mt);
     -+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
     -+		strbuf_release(&bufs[i]);
     -+	}
     -+	reftable_free(bs);
     -+}
     -+
     -+static void test_merged(void)
     -+{
     -+	uint8_t hash1[SHA1_SIZE] = { 1 };
     -+	uint8_t hash2[SHA1_SIZE] = { 2 };
     -+	struct reftable_ref_record r1[] = {
     -+		{
     -+			.refname = "a",
     -+			.update_index = 1,
     -+			.value_type = REFTABLE_REF_VAL1,
     -+			.value.val1 = hash1,
     -+		},
     -+		{
     -+			.refname = "b",
     -+			.update_index = 1,
     -+			.value_type = REFTABLE_REF_VAL1,
     -+			.value.val1 = hash1,
     -+		},
     -+		{
     -+			.refname = "c",
     -+			.update_index = 1,
     -+			.value_type = REFTABLE_REF_VAL1,
     -+			.value.val1 = hash1,
     -+		}
     -+	};
     -+	struct reftable_ref_record r2[] = { {
     -+		.refname = "a",
     -+		.update_index = 2,
     -+		.value_type = REFTABLE_REF_DELETION,
     -+	} };
     -+	struct reftable_ref_record r3[] = {
     -+		{
     -+			.refname = "c",
     -+			.update_index = 3,
     -+			.value_type = REFTABLE_REF_VAL1,
     -+			.value.val1 = hash2,
     -+		},
     -+		{
     -+			.refname = "d",
     -+			.update_index = 3,
     -+			.value_type = REFTABLE_REF_VAL1,
     -+			.value.val1 = hash1,
     -+		},
     -+	};
     -+
     -+	struct reftable_ref_record want[] = {
     -+		r2[0],
     -+		r1[1],
     -+		r3[0],
     -+		r3[1],
     -+	};
     -+
     -+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
     -+	int sizes[3] = { 3, 1, 2 };
     -+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
     -+	struct reftable_block_source *bs = NULL;
     -+	struct reftable_reader **readers = NULL;
     -+	struct reftable_merged_table *mt =
     -+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
     -+
     -+	struct reftable_iterator it = { NULL };
     -+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
     -+	struct reftable_ref_record *out = NULL;
     -+	size_t len = 0;
     -+	size_t cap = 0;
     -+	int i = 0;
     -+
     -+	EXPECT_ERR(err);
     -+	while (len < 100) { /* cap loops/recursion. */
     -+		struct reftable_ref_record ref = { NULL };
     -+		int err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (len == cap) {
     -+			cap = 2 * cap + 1;
     -+			out = reftable_realloc(
     -+				out, sizeof(struct reftable_ref_record) * cap);
     -+		}
     -+		out[len++] = ref;
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+
     -+	assert(ARRAY_SIZE(want) == len);
     -+	for (i = 0; i < len; i++) {
     -+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
     -+	}
     -+	for (i = 0; i < len; i++) {
     -+		reftable_ref_record_release(&out[i]);
     -+	}
     -+	reftable_free(out);
     -+
     -+	for (i = 0; i < 3; i++) {
     -+		strbuf_release(&bufs[i]);
     -+	}
     -+	readers_destroy(readers, 3);
     -+	reftable_merged_table_free(mt);
     -+	reftable_free(bs);
     -+}
     -+
     -+static void test_default_write_opts(void)
     -+{
     -+	struct reftable_write_options opts = { 0 };
     -+	struct strbuf buf = STRBUF_INIT;
     -+	struct reftable_writer *w =
     -+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
     -+
     -+	struct reftable_ref_record rec = {
     -+		.refname = "master",
     -+		.update_index = 1,
     -+	};
     -+	int err;
     -+	struct reftable_block_source source = { NULL };
     -+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
     -+	uint32_t hash_id;
     -+	struct reftable_reader *rd = NULL;
     -+	struct reftable_merged_table *merged = NULL;
     -+
     -+	reftable_writer_set_limits(w, 1, 1);
     -+
     -+	err = reftable_writer_add_ref(w, &rec);
     -+	EXPECT_ERR(err);
     -+
     -+	err = reftable_writer_close(w);
     -+	EXPECT_ERR(err);
     -+	reftable_writer_free(w);
     -+
     -+	block_source_from_strbuf(&source, &buf);
     -+
     -+	err = reftable_new_reader(&rd, &source, "filename");
     -+	EXPECT_ERR(err);
     -+
     -+	hash_id = reftable_reader_hash_id(rd);
     -+	assert(hash_id == SHA1_ID);
     -+
     -+	reftable_table_from_reader(&tab[0], rd);
     -+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
     -+	EXPECT_ERR(err);
     -+
     -+	reftable_reader_free(rd);
     -+	reftable_merged_table_free(merged);
     -+	strbuf_release(&buf);
     -+}
     -+
     -+/* XXX test refs_for(oid) */
     -+
     -+int merged_test_main(int argc, const char *argv[])
     -+{
     -+	RUN_TEST(test_merged_between);
     -+	RUN_TEST(test_pq);
     -+	RUN_TEST(test_merged);
     -+	RUN_TEST(test_default_write_opts);
     -+	return 0;
     -+}
     -
     - ## reftable/pq.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "pq.h"
     -+
     -+#include "reftable-record.h"
     -+#include "system.h"
     -+#include "basics.h"
     -+
     -+static int pq_less(struct pq_entry a, struct pq_entry b)
     -+{
     -+	struct strbuf ak = STRBUF_INIT;
     -+	struct strbuf bk = STRBUF_INIT;
     -+	int cmp = 0;
     -+	reftable_record_key(&a.rec, &ak);
     -+	reftable_record_key(&b.rec, &bk);
     -+
     -+	cmp = strbuf_cmp(&ak, &bk);
     -+
     -+	strbuf_release(&ak);
     -+	strbuf_release(&bk);
     -+
     -+	if (cmp == 0)
     -+		return a.index > b.index;
     -+
     -+	return cmp < 0;
     -+}
     -+
     -+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
     -+{
     -+	return pq.heap[0];
     -+}
     -+
     -+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
     -+{
     -+	return pq.len == 0;
     -+}
     -+
     -+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
     -+{
     -+	int i = 0;
     -+	for (i = 1; i < pq.len; i++) {
     -+		int parent = (i - 1) / 2;
     -+
     -+		assert(pq_less(pq.heap[parent], pq.heap[i]));
     -+	}
     -+}
     -+
     -+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
     -+{
     -+	int i = 0;
     -+	struct pq_entry e = pq->heap[0];
     -+	pq->heap[0] = pq->heap[pq->len - 1];
     -+	pq->len--;
     -+
     -+	i = 0;
     -+	while (i < pq->len) {
     -+		int min = i;
     -+		int j = 2 * i + 1;
     -+		int k = 2 * i + 2;
     -+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
     -+			min = j;
     -+		}
     -+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
     -+			min = k;
     -+		}
     -+
     -+		if (min == i) {
     -+			break;
     -+		}
     -+
     -+		SWAP(pq->heap[i], pq->heap[min]);
     -+		i = min;
     -+	}
     -+
     -+	return e;
     -+}
     -+
     -+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
     -+{
     -+	int i = 0;
     -+	if (pq->len == pq->cap) {
     -+		pq->cap = 2 * pq->cap + 1;
     -+		pq->heap = reftable_realloc(pq->heap,
     -+					    pq->cap * sizeof(struct pq_entry));
     -+	}
     -+
     -+	pq->heap[pq->len++] = e;
     -+	i = pq->len - 1;
     -+	while (i > 0) {
     -+		int j = (i - 1) / 2;
     -+		if (pq_less(pq->heap[j], pq->heap[i])) {
     -+			break;
     -+		}
     -+
     -+		SWAP(pq->heap[j], pq->heap[i]);
     -+
     -+		i = j;
     -+	}
     -+}
     -+
     -+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
     -+{
     -+	int i = 0;
     -+	for (i = 0; i < pq->len; i++) {
     -+		reftable_record_destroy(&pq->heap[i].rec);
     -+	}
     -+	FREE_AND_NULL(pq->heap);
     -+	pq->len = pq->cap = 0;
     -+}
     -
     - ## reftable/pq.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef PQ_H
     -+#define PQ_H
     -+
     -+#include "record.h"
     -+
     -+struct pq_entry {
     -+	int index;
     -+	struct reftable_record rec;
     -+};
     -+
     -+struct merged_iter_pqueue {
     -+	struct pq_entry *heap;
     -+	size_t len;
     -+	size_t cap;
     -+};
     -+
     -+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
     -+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
     -+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
     -+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
     -+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
     -+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
     -+
     -+#endif
     -
     - ## reftable/refname.c (new) ##
     -@@
     -+/*
     -+  Copyright 2020 Google LLC
     -+
     -+  Use of this source code is governed by a BSD-style
     -+  license that can be found in the LICENSE file or at
     -+  https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "system.h"
     -+#include "reftable-error.h"
     -+#include "basics.h"
     -+#include "refname.h"
     -+#include "reftable-iterator.h"
     -+
     -+struct find_arg {
     -+	char **names;
     -+	const char *want;
     -+};
     -+
     -+static int find_name(size_t k, void *arg)
     -+{
     -+	struct find_arg *f_arg = (struct find_arg *)arg;
     -+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
     -+}
     -+
     -+static int modification_has_ref(struct modification *mod, const char *name)
     -+{
     -+	struct reftable_ref_record ref = { NULL };
     -+	int err = 0;
     -+
     -+	if (mod->add_len > 0) {
     -+		struct find_arg arg = {
     -+			.names = mod->add,
     -+			.want = name,
     -+		};
     -+		int idx = binsearch(mod->add_len, find_name, &arg);
     -+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
     -+			return 0;
     -+		}
     -+	}
     -+
     -+	if (mod->del_len > 0) {
     -+		struct find_arg arg = {
     -+			.names = mod->del,
     -+			.want = name,
     -+		};
     -+		int idx = binsearch(mod->del_len, find_name, &arg);
     -+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
     -+			return 1;
     -+		}
     -+	}
     -+
     -+	err = reftable_table_read_ref(&mod->tab, name, &ref);
     -+	reftable_ref_record_release(&ref);
     -+	return err;
     -+}
     -+
     -+static void modification_release(struct modification *mod)
     -+{
     -+	/* don't delete the strings themselves; they're owned by ref records.
     -+	 */
     -+	FREE_AND_NULL(mod->add);
     -+	FREE_AND_NULL(mod->del);
     -+	mod->add_len = 0;
     -+	mod->del_len = 0;
     -+}
     -+
     -+static int modification_has_ref_with_prefix(struct modification *mod,
     -+					    const char *prefix)
     -+{
     -+	struct reftable_iterator it = { NULL };
     -+	struct reftable_ref_record ref = { NULL };
     -+	int err = 0;
     -+
     -+	if (mod->add_len > 0) {
     -+		struct find_arg arg = {
     -+			.names = mod->add,
     -+			.want = prefix,
     -+		};
     -+		int idx = binsearch(mod->add_len, find_name, &arg);
     -+		if (idx < mod->add_len &&
     -+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
     -+			goto done;
     -+	}
     -+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
     -+	if (err)
     -+		goto done;
     -+
     -+	while (1) {
     -+		err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err)
     -+			goto done;
     -+
     -+		if (mod->del_len > 0) {
     -+			struct find_arg arg = {
     -+				.names = mod->del,
     -+				.want = ref.refname,
     -+			};
     -+			int idx = binsearch(mod->del_len, find_name, &arg);
     -+			if (idx < mod->del_len &&
     -+			    !strcmp(ref.refname, mod->del[idx])) {
     -+				continue;
     -+			}
     -+		}
     -+
     -+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
     -+			err = 1;
     -+			goto done;
     -+		}
     -+		err = 0;
     -+		goto done;
     -+	}
     -+
     -+done:
     -+	reftable_ref_record_release(&ref);
     -+	reftable_iterator_destroy(&it);
     -+	return err;
     -+}
     -+
     -+static int validate_refname(const char *name)
     -+{
     -+	while (1) {
     -+		char *next = strchr(name, '/');
     -+		if (!*name) {
     -+			return REFTABLE_REFNAME_ERROR;
     -+		}
     -+		if (!next) {
     -+			return 0;
     -+		}
     -+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
     -+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
     -+			return REFTABLE_REFNAME_ERROR;
     -+		name = next + 1;
     -+	}
     -+	return 0;
     -+}
     -+
     -+int validate_ref_record_addition(struct reftable_table tab,
     -+				 struct reftable_ref_record *recs, size_t sz)
     -+{
     -+	struct modification mod = {
     -+		.tab = tab,
     -+		.add = reftable_calloc(sizeof(char *) * sz),
     -+		.del = reftable_calloc(sizeof(char *) * sz),
     -+	};
     -+	int i = 0;
     -+	int err = 0;
     -+	for (; i < sz; i++) {
     -+		if (reftable_ref_record_is_deletion(&recs[i])) {
     -+			mod.del[mod.del_len++] = recs[i].refname;
     -+		} else {
     -+			mod.add[mod.add_len++] = recs[i].refname;
     -+		}
     -+	}
     -+
     -+	err = modification_validate(&mod);
     -+	modification_release(&mod);
     -+	return err;
     -+}
     -+
     -+static void strbuf_trim_component(struct strbuf *sl)
     -+{
     -+	while (sl->len > 0) {
     -+		int is_slash = (sl->buf[sl->len - 1] == '/');
     -+		strbuf_setlen(sl, sl->len - 1);
     -+		if (is_slash)
     -+			break;
     -+	}
     -+}
     -+
     -+int modification_validate(struct modification *mod)
     -+{
     -+	struct strbuf slashed = STRBUF_INIT;
     -+	int err = 0;
     -+	int i = 0;
     -+	for (; i < mod->add_len; i++) {
     -+		err = validate_refname(mod->add[i]);
     -+		if (err)
     -+			goto done;
     -+		strbuf_reset(&slashed);
     -+		strbuf_addstr(&slashed, mod->add[i]);
     -+		strbuf_addstr(&slashed, "/");
     -+
     -+		err = modification_has_ref_with_prefix(mod, slashed.buf);
     -+		if (err == 0) {
     -+			err = REFTABLE_NAME_CONFLICT;
     -+			goto done;
     -+		}
     -+		if (err < 0)
     -+			goto done;
     -+
     -+		strbuf_reset(&slashed);
     -+		strbuf_addstr(&slashed, mod->add[i]);
     -+		while (slashed.len) {
     -+			strbuf_trim_component(&slashed);
     -+			err = modification_has_ref(mod, slashed.buf);
     -+			if (err == 0) {
     -+				err = REFTABLE_NAME_CONFLICT;
     -+				goto done;
     -+			}
     -+			if (err < 0)
     -+				goto done;
     -+		}
     -+	}
     -+	err = 0;
     -+done:
     -+	strbuf_release(&slashed);
     -+	return err;
     -+}
     -
     - ## reftable/refname.h (new) ##
     -@@
     -+/*
     -+  Copyright 2020 Google LLC
     -+
     -+  Use of this source code is governed by a BSD-style
     -+  license that can be found in the LICENSE file or at
     -+  https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+#ifndef REFNAME_H
     -+#define REFNAME_H
     -+
     -+#include "reftable-record.h"
     -+#include "reftable-generic.h"
     -+
     -+struct modification {
     -+	struct reftable_table tab;
     -+
     -+	char **add;
     -+	size_t add_len;
     -+
     -+	char **del;
     -+	size_t del_len;
     -+};
     -+
     -+int validate_ref_record_addition(struct reftable_table tab,
     -+				 struct reftable_ref_record *recs, size_t sz);
     -+
     -+int modification_validate(struct modification *mod);
     -+
     -+#endif
     -
     - ## reftable/refname_test.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "basics.h"
     -+#include "block.h"
     -+#include "blocksource.h"
     -+#include "constants.h"
     -+#include "reader.h"
     -+#include "record.h"
     -+#include "refname.h"
     -+#include "reftable-error.h"
     -+#include "reftable-writer.h"
     -+#include "system.h"
     -+
     -+#include "test_framework.h"
     -+#include "reftable-tests.h"
     -+
     -+struct testcase {
     -+	char *add;
     -+	char *del;
     -+	int error_code;
     -+};
     -+
     -+static void test_conflict(void)
     -+{
     -+	struct reftable_write_options opts = { 0 };
     -+	struct strbuf buf = STRBUF_INIT;
     -+	struct reftable_writer *w =
     -+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
     -+	struct reftable_ref_record rec = {
     -+		.refname = "a/b",
     -+		.value_type = REFTABLE_REF_SYMREF,
     -+		.value.symref = "destination", /* make sure it's not a symref.
     -+						*/
     -+		.update_index = 1,
     -+	};
     -+	int err;
     -+	int i;
     -+	struct reftable_block_source source = { NULL };
     -+	struct reftable_reader *rd = NULL;
     -+	struct reftable_table tab = { NULL };
     -+	struct testcase cases[] = {
     -+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
     -+		{ "b", NULL, 0 },
     -+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
     -+		{ "a", "a/b", 0 },
     -+
     -+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
     -+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
     -+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
     -+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
     -+
     -+		{ "a/b/c", "a/b", 0 },
     -+		{ NULL, "a//b", 0 },
     -+	};
     -+	reftable_writer_set_limits(w, 1, 1);
     -+
     -+	err = reftable_writer_add_ref(w, &rec);
     -+	EXPECT_ERR(err);
     -+
     -+	err = reftable_writer_close(w);
     -+	EXPECT_ERR(err);
     -+	reftable_writer_free(w);
     -+
     -+	block_source_from_strbuf(&source, &buf);
     -+	err = reftable_new_reader(&rd, &source, "filename");
     -+	EXPECT_ERR(err);
     -+
     -+	reftable_table_from_reader(&tab, rd);
     -+
     -+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
     -+		struct modification mod = {
     -+			.tab = tab,
     -+		};
     -+
     -+		if (cases[i].add != NULL) {
     -+			mod.add = &cases[i].add;
     -+			mod.add_len = 1;
     -+		}
     -+		if (cases[i].del != NULL) {
     -+			mod.del = &cases[i].del;
     -+			mod.del_len = 1;
     -+		}
     -+
     -+		err = modification_validate(&mod);
     -+		EXPECT(err == cases[i].error_code);
     -+	}
     -+
     -+	reftable_reader_free(rd);
     -+	strbuf_release(&buf);
     -+}
     -+
     -+int refname_test_main(int argc, const char *argv[])
     -+{
     -+	RUN_TEST(test_conflict);
     -+	return 0;
     -+}
     -
     - ## reftable/reftable-generic.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef REFTABLE_GENERIC_H
     -+#define REFTABLE_GENERIC_H
     -+
     -+#include "reftable-iterator.h"
     -+#include "reftable-reader.h"
     -+#include "reftable-merged.h"
     -+
     -+/*
     -+ * Provides a unified API for reading tables, either merged tables, or single
     -+ * readers. */
     -+struct reftable_table {
     -+	struct reftable_table_vtable *ops;
     -+	void *table_arg;
     -+};
     -+
     -+int reftable_table_seek_ref(struct reftable_table *tab,
     -+			    struct reftable_iterator *it, const char *name);
     -+
     -+void reftable_table_from_reader(struct reftable_table *tab,
     -+				struct reftable_reader *reader);
     -+
     -+/* returns the hash ID from a generic reftable_table */
     -+uint32_t reftable_table_hash_id(struct reftable_table *tab);
     -+
     -+/* create a generic table from reftable_merged_table */
     -+void reftable_table_from_merged_table(struct reftable_table *tab,
     -+				      struct reftable_merged_table *table);
     -+
     -+/* returns the max update_index covered by this table. */
     -+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
     -+
     -+/* returns the min update_index covered by this table. */
     -+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
     -+
     -+/* convenience function to read a single ref. Returns < 0 for error, 0
     -+   for success, and 1 if ref not found. */
     -+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     -+			    struct reftable_ref_record *ref);
     -+
     -+#endif
     -
     - ## reftable/reftable-merged.h (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#ifndef REFTABLE_MERGED_H
     -+#define REFTABLE_MERGED_H
     -+
     -+#include "reftable-iterator.h"
     -+
     -+/*
     -+ * Merged tables
     -+ *
     -+ * A ref database kept in a sequence of table files. The merged_table presents a
     -+ * unified view to reading (seeking, iterating) a sequence of immutable tables.
     -+ *
     -+ * The merged tables are on purpose kept disconnected from their actual storage
     -+ * (eg. files on disk), because it is useful to merge tables aren't files. For
     -+ * example, the per-workspace and global ref namespace can be implemented as a
     -+ * merged table of two stacks of file-backed reftables.
     -+ */
     -+
     -+/* A merged table is implements seeking/iterating over a stack of tables. */
     -+struct reftable_merged_table;
     -+
     -+/* A generic reftable; see below. */
     -+struct reftable_table;
     -+
     -+/* reftable_new_merged_table creates a new merged table. It takes ownership of
     -+   the stack array.
     -+*/
     -+int reftable_new_merged_table(struct reftable_merged_table **dest,
     -+			      struct reftable_table *stack, int n,
     -+			      uint32_t hash_id);
     -+
     -+/* returns an iterator positioned just before 'name' */
     -+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name);
     -+
     -+/* returns an iterator for log entry, at given update_index */
     -+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
     -+				      struct reftable_iterator *it,
     -+				      const char *name, uint64_t update_index);
     -+
     -+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
     -+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
     -+				   struct reftable_iterator *it,
     -+				   const char *name);
     -+
     -+/* returns the max update_index covered by this merged table. */
     -+uint64_t
     -+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
     -+
     -+/* returns the min update_index covered by this merged table. */
     -+uint64_t
     -+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
     -+
     -+/* releases memory for the merged_table */
     -+void reftable_merged_table_free(struct reftable_merged_table *m);
     -+
     -+/* return the hash ID of the merged table. */
     -+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
     -+
     -+#endif
     +@@ Makefile: REFTABLE_TEST_OBJS += reftable/pq_test.o
     + REFTABLE_TEST_OBJS += reftable/record_test.o
     + REFTABLE_TEST_OBJS += reftable/refname_test.o
     + REFTABLE_TEST_OBJS += reftable/reftable_test.o
     ++REFTABLE_TEST_OBJS += reftable/stack_test.o
     + REFTABLE_TEST_OBJS += reftable/test_framework.o
     + REFTABLE_TEST_OBJS += reftable/tree_test.o
     + 
      
       ## reftable/reftable-stack.h (new) ##
      @@
     @@ reftable/reftable-stack.h (new)
      +/* heuristically compact unbalanced table stack. */
      +int reftable_stack_auto_compact(struct reftable_stack *st);
      +
     ++/* delete stale .ref tables. */
     ++int reftable_stack_clean(struct reftable_stack *st);
     ++
      +/* convenience function to read a single ref. Returns < 0 for error, 0 for
      + * success, and 1 if ref not found. */
      +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
     @@ reftable/reftable-stack.h (new)
      +
      +#endif
      
     - ## reftable/reftable.c (new) ##
     -@@
     -+/*
     -+Copyright 2020 Google LLC
     -+
     -+Use of this source code is governed by a BSD-style
     -+license that can be found in the LICENSE file or at
     -+https://developers.google.com/open-source/licenses/bsd
     -+*/
     -+
     -+#include "record.h"
     -+#include "reader.h"
     -+#include "reftable-iterator.h"
     -+#include "reftable-generic.h"
     -+
     -+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
     -+				     struct reftable_record *rec)
     -+{
     -+	return reader_seek((struct reftable_reader *)tab, it, rec);
     -+}
     -+
     -+static uint32_t reftable_reader_hash_id_void(void *tab)
     -+{
     -+	return reftable_reader_hash_id((struct reftable_reader *)tab);
     -+}
     -+
     -+static uint64_t reftable_reader_min_update_index_void(void *tab)
     -+{
     -+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
     -+}
     -+
     -+static uint64_t reftable_reader_max_update_index_void(void *tab)
     -+{
     -+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
     -+}
     -+
     -+static struct reftable_table_vtable reader_vtable = {
     -+	.seek_record = reftable_reader_seek_void,
     -+	.hash_id = reftable_reader_hash_id_void,
     -+	.min_update_index = reftable_reader_min_update_index_void,
     -+	.max_update_index = reftable_reader_max_update_index_void,
     -+};
     -+
     -+int reftable_table_seek_ref(struct reftable_table *tab,
     -+			    struct reftable_iterator *it, const char *name)
     -+{
     -+	struct reftable_ref_record ref = {
     -+		.refname = (char *)name,
     -+	};
     -+	struct reftable_record rec = { NULL };
     -+	reftable_record_from_ref(&rec, &ref);
     -+	return tab->ops->seek_record(tab->table_arg, it, &rec);
     -+}
     -+
     -+void reftable_table_from_reader(struct reftable_table *tab,
     -+				struct reftable_reader *reader)
     -+{
     -+	assert(tab->ops == NULL);
     -+	tab->ops = &reader_vtable;
     -+	tab->table_arg = reader;
     -+}
     -+
     -+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     -+			    struct reftable_ref_record *ref)
     -+{
     -+	struct reftable_iterator it = { NULL };
     -+	int err = reftable_table_seek_ref(tab, &it, name);
     -+	if (err)
     -+		goto done;
     -+
     -+	err = reftable_iterator_next_ref(&it, ref);
     -+	if (err)
     -+		goto done;
     -+
     -+	if (strcmp(ref->refname, name) ||
     -+	    reftable_ref_record_is_deletion(ref)) {
     -+		reftable_ref_record_release(ref);
     -+		err = 1;
     -+		goto done;
     -+	}
     -+
     -+done:
     -+	reftable_iterator_destroy(&it);
     -+	return err;
     -+}
     -+
     -+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
     -+{
     -+	return tab->ops->max_update_index(tab->table_arg);
     -+}
     -+
     -+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
     -+{
     -+	return tab->ops->min_update_index(tab->table_arg);
     -+}
     -+
     -+uint32_t reftable_table_hash_id(struct reftable_table *tab)
     -+{
     -+	return tab->ops->hash_id(tab->table_arg);
     -+}
     -
       ## reftable/stack.c (new) ##
      @@
      +/*
     @@ reftable/stack.c (new)
      +#include "refname.h"
      +#include "reftable-error.h"
      +#include "reftable-record.h"
     ++#include "reftable-merged.h"
      +#include "writer.h"
      +
      +static int stack_try_add(struct reftable_stack *st,
     @@ reftable/stack.c (new)
      +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
      +					     int reuse_open);
      +
     ++static void stack_filename(struct strbuf *dest, struct reftable_stack *st,
     ++			   const char *name)
     ++{
     ++	strbuf_reset(dest);
     ++	strbuf_addstr(dest, st->reftable_dir);
     ++	strbuf_addstr(dest, "/");
     ++	strbuf_addstr(dest, name);
     ++}
     ++
      +static int reftable_fd_write(void *arg, const void *data, size_t sz)
      +{
      +	int *fdp = (int *)arg;
     @@ reftable/stack.c (new)
      +	return st->merged;
      +}
      +
     ++static int has_name(char **names, const char *name)
     ++{
     ++	while (*names) {
     ++		if (!strcmp(*names, name))
     ++			return 1;
     ++		names++;
     ++	}
     ++	return 0;
     ++}
     ++
      +/* Close and free the stack */
      +void reftable_stack_destroy(struct reftable_stack *st)
      +{
     ++	char **names = NULL;
     ++	int err = 0;
      +	if (st->merged != NULL) {
      +		reftable_merged_table_free(st->merged);
      +		st->merged = NULL;
      +	}
      +
     ++	err = read_lines(st->list_file, &names);
     ++	if (err < 0) {
     ++		FREE_AND_NULL(names);
     ++	}
     ++
      +	if (st->readers != NULL) {
      +		int i = 0;
     ++		struct strbuf filename = STRBUF_INIT;
      +		for (i = 0; i < st->readers_len; i++) {
     ++			const char *name = reader_name(st->readers[i]);
     ++			strbuf_reset(&filename);
     ++			if (names && !has_name(names, name)) {
     ++				stack_filename(&filename, st, name);
     ++			}
      +			reftable_reader_free(st->readers[i]);
     ++
     ++			if (filename.len) {
     ++				// On Windows, can only unlink after closing.
     ++				unlink(filename.buf);
     ++			}
      +		}
     ++		strbuf_release(&filename);
      +		st->readers_len = 0;
      +		FREE_AND_NULL(st->readers);
      +	}
      +	FREE_AND_NULL(st->list_file);
      +	FREE_AND_NULL(st->reftable_dir);
      +	reftable_free(st);
     ++	free_names(names);
      +}
      +
      +static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
     @@ reftable/stack.c (new)
      +		if (rd == NULL) {
      +			struct reftable_block_source src = { NULL };
      +			struct strbuf table_path = STRBUF_INIT;
     -+			strbuf_addstr(&table_path, st->reftable_dir);
     -+			strbuf_addstr(&table_path, "/");
     -+			strbuf_addstr(&table_path, name);
     ++			stack_filename(&table_path, st, name);
      +
      +			err = reftable_block_source_from_file(&src,
      +							      table_path.buf);
     @@ reftable/stack.c (new)
      +	st->merged = new_merged;
      +	for (i = 0; i < cur_len; i++) {
      +		if (cur[i] != NULL) {
     ++			const char *name = reader_name(cur[i]);
     ++			struct strbuf filename = STRBUF_INIT;
     ++			stack_filename(&filename, st, name);
     ++
      +			reader_close(cur[i]);
      +			reftable_reader_free(cur[i]);
     ++
     ++			// On Windows, can only unlink after closing.
     ++			unlink(filename.buf);
     ++
     ++			strbuf_release(&filename);
      +		}
      +	}
      +
     @@ reftable/stack.c (new)
      +	int i = 0;
      +	struct strbuf nm = STRBUF_INIT;
      +	for (i = 0; i < add->new_tables_len; i++) {
     -+		strbuf_reset(&nm);
     -+		strbuf_addstr(&nm, add->stack->reftable_dir);
     -+		strbuf_addstr(&nm, "/");
     -+		strbuf_addstr(&nm, add->new_tables[i]);
     ++		stack_filename(&nm, add->stack, add->new_tables[i]);
      +		unlink(nm.buf);
      +		reftable_free(add->new_tables[i]);
      +		add->new_tables[i] = NULL;
     @@ reftable/stack.c (new)
      +	strbuf_reset(&next_name);
      +	format_name(&next_name, add->next_update_index, add->next_update_index);
      +
     -+	strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir);
     -+	strbuf_addstr(&temp_tab_file_name, "/");
     -+	strbuf_addbuf(&temp_tab_file_name, &next_name);
     ++	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
      +	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
      +
      +	tab_fd = mkstemp(temp_tab_file_name.buf);
     @@ reftable/stack.c (new)
      +	format_name(&next_name, wr->min_update_index, wr->max_update_index);
      +	strbuf_addstr(&next_name, ".ref");
      +
     -+	strbuf_addstr(&tab_file_name, add->stack->reftable_dir);
     -+	strbuf_addstr(&tab_file_name, "/");
     -+	strbuf_addbuf(&tab_file_name, &next_name);
     ++	stack_filename(&tab_file_name, add->stack, next_name.buf);
      +
      +	/*
      +	  On windows, this relies on rand() picking a unique destination name.
     @@ reftable/stack.c (new)
      +		    reftable_reader_min_update_index(st->readers[first]),
      +		    reftable_reader_max_update_index(st->readers[last]));
      +
     -+	strbuf_reset(temp_tab);
     -+	strbuf_addstr(temp_tab, st->reftable_dir);
     -+	strbuf_addstr(temp_tab, "/");
     -+	strbuf_addbuf(temp_tab, &next_name);
     ++	stack_filename(temp_tab, st, next_name.buf);
      +	strbuf_addstr(temp_tab, ".temp.XXXXXX");
      +
      +	tab_fd = mkstemp(temp_tab->buf);
     @@ reftable/stack.c (new)
      +		struct strbuf subtab_lock = STRBUF_INIT;
      +		int sublock_file_fd = -1;
      +
     -+		strbuf_addstr(&subtab_file_name, st->reftable_dir);
     -+		strbuf_addstr(&subtab_file_name, "/");
     -+		strbuf_addstr(&subtab_file_name, reader_name(st->readers[i]));
     ++		stack_filename(&subtab_file_name, st,
     ++			       reader_name(st->readers[i]));
      +
      +		strbuf_reset(&subtab_lock);
      +		strbuf_addbuf(&subtab_lock, &subtab_file_name);
     @@ reftable/stack.c (new)
      +		    st->readers[last]->max_update_index);
      +	strbuf_addstr(&new_table_name, ".ref");
      +
     -+	strbuf_reset(&new_table_path);
     -+	strbuf_addstr(&new_table_path, st->reftable_dir);
     -+	strbuf_addstr(&new_table_path, "/");
     -+	strbuf_addbuf(&new_table_path, &new_table_name);
     ++	stack_filename(&new_table_path, st, new_table_name.buf);
      +
      +	if (!is_empty_table) {
      +		/* retry? */
     @@ reftable/stack.c (new)
      +	reftable_iterator_destroy(&it);
      +	reftable_reader_free(rd);
      +	return err;
     ++}
     ++
     ++static int is_table_name(const char *s)
     ++{
     ++	const char *dot = strrchr(s, '.');
     ++	return dot != NULL && !strcmp(dot, ".ref");
     ++}
     ++
     ++static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max,
     ++				     const char *name)
     ++{
     ++	int err = 0;
     ++	uint64_t update_idx = 0;
     ++	struct reftable_block_source src = { NULL };
     ++	struct reftable_reader *rd = NULL;
     ++	struct strbuf table_path = STRBUF_INIT;
     ++	stack_filename(&table_path, st, name);
     ++
     ++	err = reftable_block_source_from_file(&src, table_path.buf);
     ++	if (err < 0)
     ++		goto done;
     ++
     ++	err = reftable_new_reader(&rd, &src, name);
     ++	if (err < 0)
     ++		goto done;
     ++
     ++	update_idx = reftable_reader_max_update_index(rd);
     ++	reftable_reader_free(rd);
     ++
     ++	if (update_idx <= max) {
     ++		unlink(table_path.buf);
     ++	}
     ++done:
     ++	strbuf_release(&table_path);
     ++}
     ++
     ++static int reftable_stack_clean_locked(struct reftable_stack *st)
     ++{
     ++	uint64_t max = reftable_merged_table_max_update_index(
     ++		reftable_stack_merged_table(st));
     ++	DIR *dir = opendir(st->reftable_dir);
     ++	struct dirent *d = NULL;
     ++	if (dir == NULL) {
     ++		return REFTABLE_IO_ERROR;
     ++	}
     ++
     ++	while ((d = readdir(dir)) != NULL) {
     ++		int i = 0;
     ++		int found = 0;
     ++		if (!is_table_name(d->d_name))
     ++			continue;
     ++
     ++		for (i = 0; !found && i < st->readers_len; i++) {
     ++			found = !strcmp(reader_name(st->readers[i]), d->d_name);
     ++		}
     ++		if (found)
     ++			continue;
     ++
     ++		remove_maybe_stale_table(st, max, d->d_name);
     ++	}
     ++
     ++	closedir(dir);
     ++	return 0;
     ++}
     ++
     ++int reftable_stack_clean(struct reftable_stack *st)
     ++{
     ++	struct reftable_addition *add = NULL;
     ++	int err = reftable_stack_new_addition(&add, st);
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++
     ++	err = reftable_stack_reload(st);
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++
     ++	err = reftable_stack_clean_locked(st);
     ++
     ++done:
     ++	reftable_addition_destroy(add);
     ++	return err;
      +}
      
       ## reftable/stack.h (new) ##
     @@ reftable/stack_test.c (new)
      +
      +#include "system.h"
      +
     ++#include "reftable-reader.h"
      +#include "merged.h"
      +#include "basics.h"
      +#include "constants.h"
     @@ reftable/stack_test.c (new)
      +	strbuf_release(&path);
      +}
      +
     ++static int count_dir_entries(const char *dirname)
     ++{
     ++	DIR *dir = opendir(dirname);
     ++	int len = 0;
     ++	struct dirent *d;
     ++	if (dir == NULL)
     ++		return 0;
     ++
     ++	while ((d = readdir(dir)) != NULL) {
     ++		if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, "."))
     ++			continue;
     ++		len++;
     ++	}
     ++	closedir(dir);
     ++	return len;
     ++}
     ++
      +static char *get_tmp_template(const char *prefix)
      +{
      +	const char *tmp = getenv("TMPDIR");
     @@ reftable/stack_test.c (new)
      +	clear_dir(dir);
      +}
      +
     ++static void test_reftable_stack_compaction_concurrent(void)
     ++{
     ++	struct reftable_write_options cfg = { 0 };
     ++	struct reftable_stack *st1 = NULL, *st2 = NULL;
     ++	char *dir = get_tmp_template(__FUNCTION__);
     ++	int err, i;
     ++	int N = 3;
     ++	EXPECT(mkdtemp(dir));
     ++
     ++	err = reftable_new_stack(&st1, dir, cfg);
     ++	EXPECT_ERR(err);
     ++
     ++	for (i = 0; i < N; i++) {
     ++		char name[100];
     ++		struct reftable_ref_record ref = {
     ++			.refname = name,
     ++			.update_index = reftable_stack_next_update_index(st1),
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value.symref = "master",
     ++		};
     ++		snprintf(name, sizeof(name), "branch%04d", i);
     ++
     ++		err = reftable_stack_add(st1, &write_test_ref, &ref);
     ++		EXPECT_ERR(err);
     ++	}
     ++
     ++	err = reftable_new_stack(&st2, dir, cfg);
     ++	EXPECT_ERR(err);
     ++
     ++	err = reftable_stack_compact_all(st1, NULL);
     ++	EXPECT_ERR(err);
     ++
     ++	reftable_stack_destroy(st1);
     ++	reftable_stack_destroy(st2);
     ++
     ++	EXPECT(count_dir_entries(dir) == 2);
     ++	clear_dir(dir);
     ++}
     ++
     ++static void unclean_stack_close(struct reftable_stack *st)
     ++{
     ++	// break abstraction boundary to simulate unclean shutdown.
     ++	int i = 0;
     ++	for (; i < st->readers_len; i++) {
     ++		reftable_reader_free(st->readers[i]);
     ++	}
     ++	st->readers_len = 0;
     ++	FREE_AND_NULL(st->readers);
     ++}
     ++
     ++static void test_reftable_stack_compaction_concurrent_clean(void)
     ++{
     ++	struct reftable_write_options cfg = { 0 };
     ++	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
     ++	char *dir = get_tmp_template(__FUNCTION__);
     ++	int err, i;
     ++	int N = 3;
     ++	EXPECT(mkdtemp(dir));
     ++
     ++	err = reftable_new_stack(&st1, dir, cfg);
     ++	EXPECT_ERR(err);
     ++
     ++	for (i = 0; i < N; i++) {
     ++		char name[100];
     ++		struct reftable_ref_record ref = {
     ++			.refname = name,
     ++			.update_index = reftable_stack_next_update_index(st1),
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value.symref = "master",
     ++		};
     ++		snprintf(name, sizeof(name), "branch%04d", i);
     ++
     ++		err = reftable_stack_add(st1, &write_test_ref, &ref);
     ++		EXPECT_ERR(err);
     ++	}
     ++
     ++	err = reftable_new_stack(&st2, dir, cfg);
     ++	EXPECT_ERR(err);
     ++
     ++	err = reftable_stack_compact_all(st1, NULL);
     ++	EXPECT_ERR(err);
     ++
     ++	unclean_stack_close(st1);
     ++	unclean_stack_close(st2);
     ++
     ++	err = reftable_new_stack(&st3, dir, cfg);
     ++	EXPECT_ERR(err);
     ++
     ++	err = reftable_stack_clean(st3);
     ++	EXPECT_ERR(err);
     ++	EXPECT(count_dir_entries(dir) == 2);
     ++
     ++	reftable_stack_destroy(st1);
     ++	reftable_stack_destroy(st2);
     ++	reftable_stack_destroy(st3);
     ++
     ++	clear_dir(dir);
     ++}
     ++
      +int stack_test_main(int argc, const char *argv[])
      +{
     ++	RUN_TEST(test_reftable_stack_compaction_concurrent_clean);
     ++	RUN_TEST(test_reftable_stack_compaction_concurrent);
      +	RUN_TEST(test_reftable_stack_uptodate);
      +	RUN_TEST(test_reftable_stack_transaction_api);
      +	RUN_TEST(test_reftable_stack_hash_id);
     @@ reftable/stack_test.c (new)
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
     - {
     - 	basics_test_main(argc, argv);
     - 	block_test_main(argc, argv);
     -+	merged_test_main(argc, argv);
       	record_test_main(argc, argv);
     -+	refname_test_main(argc, argv);
     + 	refname_test_main(argc, argv);
       	reftable_test_main(argc, argv);
      +	stack_test_main(argc, argv);
       	tree_test_main(argc, argv);
  -:  ------------ > 17:  5db7c8ab7f23 reftable: add dump utility
 13:  bdb19af22cc7 ! 18:  2dc73bf2ec96 Reftable support for git-core
     @@ Makefile: LIB_OBJS += reflog-walk.o
       LIB_OBJS += refs/iterator.o
       LIB_OBJS += refs/packed-backend.o
       LIB_OBJS += refs/ref-cache.o
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     - 
     - REFTABLE_TEST_OBJS += reftable/basics_test.o
     - REFTABLE_TEST_OBJS += reftable/block_test.o
     -+REFTABLE_TEST_OBJS += reftable/merged_test.o
     - REFTABLE_TEST_OBJS += reftable/record_test.o
     -+REFTABLE_TEST_OBJS += reftable/refname_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
     -+REFTABLE_TEST_OBJS += reftable/stack_test.o
     - REFTABLE_TEST_OBJS += reftable/test_framework.o
     - REFTABLE_TEST_OBJS += reftable/tree_test.o
     - 
      
       ## builtin/clone.c ##
      @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     @@ builtin/init-db.c: static int needs_work_tree_config(const char *git_dir, const
       
       	/* This forces creation of new config file */
      @@ builtin/init-db.c: static int create_default_files(const char *template_path,
     - 	is_bare_repository_cfg = init_is_bare_repository;
     + 	is_bare_repository_cfg = init_is_bare_repository || !work_tree;
       	if (init_shared_repository != -1)
       		set_shared_repository(init_shared_repository);
      +	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
     @@ refs/reftable-backend.c (new)
      +	err = reftable_stack_compact_all(refs->main_stack, NULL);
      +	if (err == 0 && refs->worktree_stack != NULL)
      +		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
     ++	if (err == 0)
     ++		err = reftable_stack_clean(refs->main_stack);
     ++	if (err == 0 && refs->worktree_stack != NULL)
     ++		err = reftable_stack_clean(refs->worktree_stack);
     ++
      +	return err;
      +}
      +
 14:  0d9477941678 = 19:  b2fa6ea62c16 git-prompt: prepare for reftable refs backend
 15:  59f432522230 ! 20:  39e644b9436b Add "test-tool dump-reftable" command.
     @@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
      +REFTABLE_TEST_OBJS += reftable/dump.o
       REFTABLE_TEST_OBJS += reftable/merged_test.o
     + REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/refname_test.o
      
       ## reftable/dump.c ##
      @@ reftable/dump.c: license that can be found in the LICENSE file or at

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v6 01/20] init-db: set the_repository->hash_algo early on
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 02/20] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                             ` (19 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of hash value, but this is an exceptional circumstance, and
the reftable library does not cater to this exception. It insists that all
tables in the stack have a consistent format ID for the hash algorithm.

Call set_repo_hash_algo directly after calling validate_hash_algorithm() (which
reads $GIT_DEFAULT_HASH).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index c19b35f1e691..ff06fe4b9a49 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -424,6 +424,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt,
 				      flags & INIT_DB_QUIET);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 02/20] reftable: add LICENSE
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 01/20] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-13  7:28             ` Ævar Arnfjörð Bjarmason
  2021-04-12 19:25           ` [PATCH v6 03/20] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
                             ` (18 subsequent siblings)
  20 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 000000000000..402e0f9356ba
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 03/20] reftable: add error related functionality
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 01/20] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 02/20] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 04/20] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                             ` (17 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 000000000000..f6f16def9214
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 000000000000..6f89bedf1a58
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 * next_update_inde*x
+	 *  - on writing a log record with multiline message with
+	 *  exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 04/20] reftable: utility functions
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (2 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 03/20] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-13  8:02             ` Ævar Arnfjörð Bjarmason
  2021-04-13 13:14             ` Ævar Arnfjörð Bjarmason
  2021-04-12 19:25           ` [PATCH v6 05/20] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
                             ` (16 subsequent siblings)
  20 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                            |  25 +++++-
 contrib/buildsystems/CMakeLists.txt |  14 ++-
 reftable/basics.c                   | 128 ++++++++++++++++++++++++++++
 reftable/basics.h                   |  60 +++++++++++++
 reftable/basics_test.c              |  98 +++++++++++++++++++++
 reftable/publicbasics.c             |  58 +++++++++++++
 reftable/reftable-malloc.h          |  18 ++++
 reftable/reftable-tests.h           |  22 +++++
 reftable/system.h                   |  32 +++++++
 reftable/test_framework.c           |  23 +++++
 reftable/test_framework.h           |  53 ++++++++++++
 t/helper/test-reftable.c            |   9 ++
 t/helper/test-tool.c                |   3 +-
 t/helper/test-tool.h                |   1 +
 t/t0032-reftable-unittest.sh        |  15 ++++
 15 files changed, 553 insertions(+), 6 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index ffc2ddfd930d..1a6528a40b51 100644
--- a/Makefile
+++ b/Makefile
@@ -735,6 +735,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -812,6 +813,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1180,7 +1183,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2397,7 +2400,15 @@ XDIFF_OBJS += xdiff/xutils.o
 .PHONY: xdiff-objs
 xdiff-objs: $(XDIFF_OBJS)
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/publicbasics.o
+
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/basics_test.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
+
 .PHONY: test-objs
 test-objs: $(TEST_OBJS)
 
@@ -2413,6 +2424,8 @@ OBJECTS += $(PROGRAM_OBJS)
 OBJECTS += $(TEST_OBJS)
 OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
+OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
+
 ifndef NO_CURL
 	OBJECTS += http.o http-walker.o remote-curl.o
 endif
@@ -2564,6 +2577,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2844,7 +2863,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3174,7 +3193,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 75ed198a6a36..f41d3dab9015 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -611,6 +611,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
 list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
 add_library(xdiff STATIC ${libxdiff_SOURCES})
 
+#reftable
+parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
+
+list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+add_library(reftable STATIC ${reftable_SOURCES})
+
 if(WIN32)
 	if(NOT MSVC)#use windres when compiling with gcc and clang
 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
@@ -633,7 +639,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
 
-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()
@@ -873,11 +879,15 @@ if(BUILD_TESTING)
 add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
 target_link_libraries(test-fake-ssh common-main)
 
+#reftable-tests
+parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
+list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+
 #test-tool
 parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")
 
 list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
+add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
 target_link_libraries(test-tool common-main)
 
 set_target_properties(test-fake-ssh test-tool
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 000000000000..abd027b98882
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,128 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* Invariants:
+	 *
+	 *  (hi == sz) || f(hi) == true
+	 *  (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		if (f(mid, args))
+			hi = mid;
+		else
+			lo = mid;
+	}
+
+	if (lo)
+		return hi;
+
+	return f(0, args) ? 0 : 1;
+}
+
+void free_names(char **a)
+{
+	char **p;
+	if (a == NULL) {
+		return;
+	}
+	for (p = a; *p; p++) {
+		reftable_free(*p);
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	char **p = names;
+	for (; *p; p++) {
+		/* empty */
+	}
+	return p - names;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next && next < end) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(*names));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	names = reftable_realloc(names, (names_len + 1) * sizeof(*names));
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	int i = 0;
+	for (; a[i] && b[i]; i++) {
+		if (strcmp(a[i], b[i])) {
+			return 0;
+		}
+	}
+
+	return a[i] == b[i];
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	for (; p < a->len && p < b->len; p++) {
+		if (a->buf[p] != b->buf[p])
+			break;
+	}
+
+	return p;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 000000000000..096b36862b9f
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+/*
+ * miscellaneous utilities that are not provided by Git.
+ */
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+ * find smallest index i in [0, sz) at which f(i) is true, assuming
+ * that f is ascending. Return sz if f(i) is false for all indices.
+ *
+ * Contrary to bsearch(3), this returns something useful if the argument is not
+ * found.
+ */
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+ * Frees a NULL terminated array of malloced strings. The array itself is also
+ * freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. `size` is the length of the buffer,
+ * without terminating '\0'. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+struct strbuf;
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/basics_test.c b/reftable/basics_test.c
new file mode 100644
index 000000000000..6d52f0f9d5aa
--- /dev/null
+++ b/reftable/basics_test.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = (struct binsearch_args *)void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			EXPECT(args.key < arr[res]);
+			if (res > 0) {
+				EXPECT(args.key >= arr[res - 1]);
+			}
+		} else {
+			EXPECT(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_names_length(void)
+{
+	char *a[] = { "a", "b", NULL };
+	EXPECT(names_length(a) == 2);
+}
+
+static void test_parse_names_normal(void)
+{
+	char in[] = "a\nb\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(!strcmp(out[1], "b"));
+	EXPECT(out[2] == NULL);
+	free_names(out);
+}
+
+static void test_parse_names_drop_empty(void)
+{
+	char in[] = "a\n\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(out[1] == NULL);
+	free_names(out);
+}
+
+static void test_common_prefix(void)
+{
+	struct strbuf s1 = STRBUF_INIT;
+	struct strbuf s2 = STRBUF_INIT;
+	strbuf_addstr(&s1, "abcdef");
+	strbuf_addstr(&s2, "abc");
+	EXPECT(common_prefix_size(&s1, &s2) == 3);
+	strbuf_release(&s1);
+	strbuf_release(&s2);
+}
+
+int basics_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_parse_names_normal);
+	RUN_TEST(test_parse_names_drop_empty);
+	RUN_TEST(test_binsearch);
+	RUN_TEST(test_names_length);
+	return 0;
+}
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 000000000000..25639f61af61
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,58 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-malloc.h"
+
+#include "basics.h"
+#include "system.h"
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case SHA1_ID:
+		return SHA1_SIZE;
+	case SHA256_ID:
+		return SHA256_SIZE;
+	}
+	abort();
+}
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
new file mode 100644
index 000000000000..5f2185f1f343
--- /dev/null
+++ b/reftable/reftable-malloc.h
@@ -0,0 +1,18 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stddef.h>
+
+/* Overrides the functions to use for memory management. */
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+#endif
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 000000000000..5e7698ae654e
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int basics_test_main(int argc, const char **argv);
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 000000000000..07277ca06273
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#include <zlib.h>
+
+struct strbuf;
+/* In git, this is declared in dir.h */
+int remove_dir_recursively(struct strbuf *path, int flags);
+
+#define SHA1_ID 0x73686131
+#define SHA256_ID 0x73323536
+#define SHA1_SIZE 20
+#define SHA256_SIZE 32
+
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
+			       const Bytef *source, uLong *sourceLen);
+int hash_size(uint32_t id);
+
+#endif
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 000000000000..a5ff4e2a2d2f
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,23 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "test_framework.h"
+
+#include "basics.h"
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
+}
+
+int strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add((struct strbuf *)b, data, sz);
+	return sz;
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 000000000000..5fdc9519a5a5
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,53 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+#include "reftable-error.h"
+
+#define EXPECT_ERR(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define EXPECT_STREQ(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define EXPECT(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+#define RUN_TEST(f)                          \
+	fprintf(stderr, "running %s\n", #f); \
+	fflush(stderr);                      \
+	f();
+
+void set_test_hash(uint8_t *p, int i);
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+int strbuf_add_void(void *b, const void *data, size_t sz);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 000000000000..3b58e423e7b1
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,9 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	basics_test_main(argc, argv);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 287aa6002307..814d2606f8ca 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -49,13 +49,14 @@ static struct test_cmd cmds[] = {
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
 	{ "prio-queue", cmd__prio_queue },
-	{ "proc-receive", cmd__proc_receive},
+	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 9ea4b31011dd..789c26f302dd 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -45,6 +45,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh
new file mode 100755
index 000000000000..0ed14971a580
--- /dev/null
+++ b/t/t0032-reftable-unittest.sh
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable unittests'
+
+. ./test-lib.sh
+
+test_expect_success 'unittests' '
+	TMPDIR=$(pwd) && export TMPDIR &&
+	test-tool reftable
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 05/20] reftable: add blocksource, an abstraction for random access reads
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (3 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 04/20] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 06/20] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                             ` (15 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:

* log blocks are zlib compressed, and handling them is simplified if we can
  discard byte segments from within the block layer.

* for unittests, it is useful to read and write in-memory. The blocksource
  allows us to abstract the data away from on-disk files.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                        |   1 +
 reftable/blocksource.c          | 148 ++++++++++++++++++++++++++++++++
 reftable/blocksource.h          |  22 +++++
 reftable/reftable-blocksource.h |  49 +++++++++++
 4 files changed, 220 insertions(+)
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/reftable-blocksource.h

diff --git a/Makefile b/Makefile
index 1a6528a40b51..bb57e0318cdb 100644
--- a/Makefile
+++ b/Makefile
@@ -2402,6 +2402,7 @@ xdiff-objs: $(XDIFF_OBJS)
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 000000000000..25d4d95b52b8
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = (struct strbuf *)v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(bs->ops == NULL);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = (struct file_block_source *)v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(bs->ops == NULL);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 000000000000..072e2727ad20
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "system.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
new file mode 100644
index 000000000000..5aa3990a5732
--- /dev/null
+++ b/reftable/reftable-blocksource.h
@@ -0,0 +1,49 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_BLOCKSOURCE_H
+#define REFTABLE_BLOCKSOURCE_H
+
+#include <stdint.h>
+
+/* block_source is a generic wrapper for a seekable readable file.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+ * so it can return itself into the pool. */
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 06/20] reftable: (de)serialization for the polymorphic record type.
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (4 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 05/20] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 07/20] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                             ` (14 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |    2 +
 reftable/constants.h       |   21 +
 reftable/record.c          | 1203 ++++++++++++++++++++++++++++++++++++
 reftable/record.h          |  139 +++++
 reftable/record_test.c     |  405 ++++++++++++
 reftable/reftable-record.h |  114 ++++
 t/helper/test-reftable.c   |    2 +-
 7 files changed, 1885 insertions(+), 1 deletion(-)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/reftable-record.h

diff --git a/Makefile b/Makefile
index bb57e0318cdb..313421d16012 100644
--- a/Makefile
+++ b/Makefile
@@ -2404,7 +2404,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 000000000000..5eee72c4c113
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 000000000000..d9e4c7bde411
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1203 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable-error.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL1:
+		return rec->value.val1;
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.value;
+	default:
+		return NULL;
+	}
+}
+
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.target_value;
+	default:
+		return NULL;
+	}
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	 * fields. */
+	reftable_ref_record_release(ref);
+	if (src->refname != NULL) {
+		ref->refname = xstrdup(src->refname);
+	}
+	ref->update_index = src->update_index;
+	ref->value_type = src->value_type;
+	switch (src->value_type) {
+	case REFTABLE_REF_DELETION:
+		break;
+	case REFTABLE_REF_VAL1:
+		ref->value.val1 = reftable_malloc(hash_size);
+		memcpy(ref->value.val1, src->value.val1, hash_size);
+		break;
+	case REFTABLE_REF_VAL2:
+		ref->value.val2.value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
+		ref->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.target_value,
+		       src->value.val2.target_value, hash_size);
+		break;
+	case REFTABLE_REF_SYMREF:
+		ref->value.symref = xstrdup(src->value.symref);
+		break;
+	}
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src != NULL) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		printf("=> %s", ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		hex_format(hex, ref->value.val2.value, hash_size(hash_id));
+		printf("val 2 %s", hex);
+		hex_format(hex, ref->value.val2.target_value,
+			   hash_size(hash_id));
+		printf("(T %s)", hex);
+		break;
+	case REFTABLE_REF_VAL1:
+		hex_format(hex, ref->value.val1, hash_size(hash_id));
+		printf("val 1 %s", hex);
+		break;
+	case REFTABLE_REF_DELETION:
+		printf("delete");
+		break;
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_release_void(void *rec)
+{
+	reftable_ref_record_release((struct reftable_ref_record *)rec);
+}
+
+void reftable_ref_record_release(struct reftable_ref_record *ref)
+{
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		reftable_free(ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		reftable_free(ref->value.val2.target_value);
+		reftable_free(ref->value.val2.value);
+		break;
+	case REFTABLE_REF_VAL1:
+		reftable_free(ref->value.val1);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	reftable_free(ref->refname);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	return r->value_type;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	switch (r->value_type) {
+	case REFTABLE_REF_SYMREF:
+		n = encode_string(r->value.symref, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		break;
+	case REFTABLE_REF_VAL2:
+		if (s.len < 2 * hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val2.value, hash_size);
+		string_view_consume(&s, hash_size);
+		memcpy(s.buf, r->value.val2.target_value, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_VAL1:
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val1, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
+	struct string_view start = in;
+	uint64_t update_index = 0;
+	int n = get_var_int(&update_index, &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	reftable_ref_record_release(r);
+
+	assert(hash_size > 0);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->update_index = update_index;
+	r->refname[key.len] = 0;
+	r->value_type = val_type;
+	switch (val_type) {
+	case REFTABLE_REF_VAL1:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		r->value.val1 = reftable_malloc(hash_size);
+		memcpy(r->value.val1, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_VAL2:
+		if (in.len < 2 * hash_size) {
+			return -1;
+		}
+
+		r->value.val2.value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+
+		r->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_SYMREF: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		r->value.symref = dest.buf;
+	} break;
+
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.release = &reftable_ref_record_release_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_release(void *rec)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_release(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.release = &reftable_obj_record_release,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[SHA256_SIZE + 1] = { 0 };
+
+	switch (log->value_type) {
+	case REFTABLE_LOG_DELETION:
+		printf("log{%s(%" PRIu64 ") delete", log->refname,
+		       log->update_index);
+		break;
+	case REFTABLE_LOG_UPDATE:
+		printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n",
+		       log->refname, log->update_index, log->update.name,
+		       log->update.email, log->update.time,
+		       log->update.tz_offset);
+		hex_format(hex, log->update.old_hash, hash_size(hash_id));
+		printf("%s => ", hex);
+		hex_format(hex, log->update.new_hash, hash_size(hash_id));
+		printf("%s\n\n%s\n}\n", hex, log->update.message);
+		break;
+	}
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_release(dst);
+	*dst = *src;
+	if (dst->refname != NULL) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	switch (dst->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		if (dst->update.email != NULL) {
+			dst->update.email = xstrdup(dst->update.email);
+		}
+		if (dst->update.name != NULL) {
+			dst->update.name = xstrdup(dst->update.name);
+		}
+		if (dst->update.message != NULL) {
+			dst->update.message = xstrdup(dst->update.message);
+		}
+
+		if (dst->update.new_hash != NULL) {
+			dst->update.new_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.new_hash, src->update.new_hash,
+			       hash_size);
+		}
+		if (dst->update.old_hash != NULL) {
+			dst->update.old_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.old_hash, src->update.old_hash,
+			       hash_size);
+		}
+		break;
+	}
+}
+
+static void reftable_log_record_release_void(void *rec)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	reftable_log_record_release(r);
+}
+
+void reftable_log_record_release(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	switch (r->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		reftable_free(r->update.new_hash);
+		reftable_free(r->update.old_hash);
+		reftable_free(r->update.name);
+		reftable_free(r->update.email);
+		reftable_free(r->update.message);
+		break;
+	}
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[SHA256_SIZE] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = NULL;
+	uint8_t *newh = NULL;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	oldh = r->update.old_hash;
+	newh = r->update.new_hash;
+	if (oldh == NULL) {
+		oldh = zero;
+	}
+	if (newh == NULL) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->update.name ? r->update.name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->update.email ? r->update.email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->update.time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->update.tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->update.message ? r->update.message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type != r->value_type) {
+		switch (r->value_type) {
+		case REFTABLE_LOG_UPDATE:
+			FREE_AND_NULL(r->update.old_hash);
+			FREE_AND_NULL(r->update.new_hash);
+			FREE_AND_NULL(r->update.message);
+			FREE_AND_NULL(r->update.email);
+			FREE_AND_NULL(r->update.name);
+			break;
+		case REFTABLE_LOG_DELETION:
+			break;
+		}
+	}
+
+	r->value_type = val_type;
+	if (val_type == REFTABLE_LOG_DELETION)
+		return 0;
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->update.old_hash = reftable_realloc(r->update.old_hash, hash_size);
+	r->update.new_hash = reftable_realloc(r->update.new_hash, hash_size);
+
+	memcpy(r->update.old_hash, in.buf, hash_size);
+	memcpy(r->update.new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.name = reftable_realloc(r->update.name, dest.len + 1);
+	memcpy(r->update.name, dest.buf, dest.len);
+	r->update.name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.email = reftable_realloc(r->update.email, dest.len + 1);
+	memcpy(r->update.email, dest.buf, dest.len);
+	r->update.email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->update.time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->update.tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.message = reftable_realloc(r->update.message, dest.len + 1);
+	memcpy(r->update.message, dest.buf, dest.len);
+	r->update.message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (a == NULL)
+		a = empty;
+
+	if (b == NULL)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (a == NULL)
+		a = zero;
+
+	if (b == NULL)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	if (!(null_streq(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_LOG_DELETION:
+		return 1;
+	case REFTABLE_LOG_UPDATE:
+		return null_streq(a->update.name, b->update.name) &&
+		       a->update.time == b->update.time &&
+		       a->update.tz_offset == b->update.tz_offset &&
+		       null_streq(a->update.email, b->update.email) &&
+		       null_streq(a->update.message, b->update.message) &&
+		       zero_hash_eq(a->update.old_hash, b->update.old_hash,
+				    hash_size) &&
+		       zero_hash_eq(a->update.new_hash, b->update.new_hash,
+				    hash_size);
+	}
+
+	abort();
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.release = &reftable_log_record_release_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_release(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
+	struct reftable_index_record *src =
+		(struct reftable_index_record *)src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_release(void *rec)
+{
+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.release = &reftable_index_record_release,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_release(struct reftable_record *rec)
+{
+	rec->ops->release(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(rec->ops == NULL);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return (struct reftable_ref_record *)rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return (struct reftable_log_record *)rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a != NULL && b != NULL)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	if (!(0 == strcmp(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_REF_SYMREF:
+		return !strcmp(a->value.symref, b->value.symref);
+	case REFTABLE_REF_VAL2:
+		return hash_equal(a->value.val2.value, b->value.val2.value,
+				  hash_size) &&
+		       hash_equal(a->value.val2.target_value,
+				  b->value.val2.target_value, hash_size);
+	case REFTABLE_REF_VAL1:
+		return hash_equal(a->value.val1, b->value.val1, hash_size);
+	case REFTABLE_REF_DELETION:
+		return 1;
+	default:
+		abort();
+	}
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value_type == REFTABLE_REF_DELETION;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	struct reftable_log_record *la = (struct reftable_log_record *)a;
+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->value_type == REFTABLE_LOG_DELETION);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 000000000000..498e8c50bf4f
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,139 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "system.h"
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+ * A substring of existing string data. This structure takes no responsibility
+ * for the lifetime of the data it points to.
+ */
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*release)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+ * number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_release(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 000000000000..da7951bd480b
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,405 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 SHA1_SIZE));
+		break;
+	case BLOCK_TYPE_LOG:
+		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 SHA1_SIZE));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		EXPECT(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		EXPECT(n > 0);
+
+		EXPECT(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(SHA1_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		in.value_type = i;
+		switch (i) {
+		case REFTABLE_REF_DELETION:
+			break;
+		case REFTABLE_REF_VAL1:
+			in.value.val1 = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val1, 1);
+			break;
+		case REFTABLE_REF_VAL2:
+			in.value.val2.value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.value, 1);
+			in.value.val2.target_value = reftable_malloc(SHA1_SIZE);
+			set_hash(in.value.val2.target_value, 2);
+			break;
+		case REFTABLE_REF_SYMREF:
+			in.value.symref = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		EXPECT(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE));
+		reftable_record_release(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_release(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	in[1].update_index = in[0].update_index;
+	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
+	reftable_log_record_release(&in[0]);
+	reftable_log_record_release(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.old_hash = reftable_malloc(SHA1_SIZE),
+				.new_hash = reftable_malloc(SHA1_SIZE),
+				.name = xstrdup("han-wen"),
+				.email = xstrdup("hanwen@google.com"),
+				.message = xstrdup("test"),
+				.time = 1577123507,
+				.tz_offset = 100,
+			}
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+			.value_type = REFTABLE_LOG_DELETION,
+		}
+	};
+	set_test_hash(in[0].update.new_hash, 1);
+	set_test_hash(in[0].update.old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.new_hash = reftable_calloc(SHA1_SIZE),
+				.old_hash = reftable_calloc(SHA1_SIZE),
+				.name = xstrdup("old name"),
+				.email = xstrdup("old@email"),
+				.message = xstrdup("old message"),
+			},
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
+		reftable_log_record_release(&in[i]);
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	EXPECT(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	EXPECT(!restart);
+	EXPECT(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	EXPECT(n == m);
+	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
+	EXPECT(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+		EXPECT(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   SHA1_SIZE);
+		EXPECT(n == m);
+
+		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
+		EXPECT(in.offset_len == out.offset_len);
+
+		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		EXPECT(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
+	EXPECT(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
+	EXPECT(m == n);
+
+	EXPECT(in.offset == out.offset);
+
+	reftable_record_release(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_log_record_equal);
+	RUN_TEST(test_reftable_log_record_roundtrip);
+	RUN_TEST(test_reftable_ref_record_roundtrip);
+	RUN_TEST(test_varint_roundtrip);
+	RUN_TEST(test_key_roundtrip);
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_reftable_obj_record_roundtrip);
+	RUN_TEST(test_reftable_index_record_roundtrip);
+	RUN_TEST(test_u24_roundtrip);
+	return 0;
+}
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
new file mode 100644
index 000000000000..7985b94ae2cc
--- /dev/null
+++ b/reftable/reftable-record.h
@@ -0,0 +1,114 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_RECORD_H
+#define REFTABLE_RECORD_H
+
+#include <stdint.h>
+
+/*
+ * Basic data types
+ *
+ * Reftables store the state of each ref in struct reftable_ref_record, and they
+ * store a sequence of reflog updates in struct reftable_log_record.
+ */
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				* written */
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_REF_DELETION = 0x0,
+
+		/* a simple ref */
+		REFTABLE_REF_VAL1 = 0x1,
+		/* a tag, plus its peeled hash */
+		REFTABLE_REF_VAL2 = 0x2,
+
+		/* a symbolic reference */
+		REFTABLE_REF_SYMREF = 0x3,
+#define REFTABLE_NR_REF_VALUETYPES 4
+	} value_type;
+	union {
+		uint8_t *val1; /* malloced hash. */
+		struct {
+			uint8_t *value; /* first value, malloced hash  */
+			uint8_t *target_value; /* second value, malloced hash */
+		} val2;
+		char *symref; /* referent, malloced 0-terminated string */
+	} value;
+};
+
+/* Returns the first hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec);
+
+/* Returns the second hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec);
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout. Useful for debugging. */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values inside `ref`. */
+void reftable_ref_record_release(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same. Useful for testing. */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_LOG_DELETION = 0x0,
+
+		/* a simple update */
+		REFTABLE_LOG_UPDATE = 0x1,
+#define REFTABLE_NR_LOG_VALUETYPES 2
+	} value_type;
+
+	union {
+		struct {
+			uint8_t *new_hash;
+			uint8_t *old_hash;
+			char *name;
+			char *email;
+			uint64_t time;
+			int16_t tz_offset;
+			char *message;
+		} update;
+	};
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_release(struct reftable_log_record *log);
+
+/* returns whether two records are equal. Useful for testing. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b58e423e7b1..09d4b83ef9b8 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,6 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
-
+	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 07/20] reftable: reading/writing blocks
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (5 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 06/20] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 21:40             ` Junio C Hamano
  2021-04-13  8:19             ` Ævar Arnfjörð Bjarmason
  2021-04-12 19:25           ` [PATCH v6 08/20] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                             ` (13 subsequent siblings)
  20 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.

This commit provides the logic to read and write these blocks.

Includes a code snippet copied from zlib

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   3 +
 reftable/.gitattributes  |   1 +
 reftable/block.c         | 440 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 127 +++++++++++
 reftable/block_test.c    | 121 +++++++++++
 reftable/zlib-compat.c   |  92 ++++++++
 t/helper/test-reftable.c |   1 +
 7 files changed, 785 insertions(+)
 create mode 100644 reftable/.gitattributes
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/zlib-compat.c

diff --git a/Makefile b/Makefile
index 313421d16012..89f998a9a5f2 100644
--- a/Makefile
+++ b/Makefile
@@ -2402,10 +2402,13 @@ xdiff-objs: $(XDIFF_OBJS)
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/.gitattributes b/reftable/.gitattributes
new file mode 100644
index 000000000000..f44451a37954
--- /dev/null
+++ b/reftable/.gitattributes
@@ -0,0 +1 @@
+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 000000000000..5b4e49fb2be2
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,440 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "system.h"
+#include "zlib.h"
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK != uncompress_return_consumed(
+				    uncompressed + block_header_skip, &dst_len,
+				    block->data + block_header_skip,
+				    &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = (struct restart_find_args *)args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_release(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp != NULL && source.ops != NULL)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 000000000000..e207706a6448
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,127 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Writes reftable blocks. The block_writer is reused across blocks to minimize
+ * allocation overhead.
+ */
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+ * initializes the blockwriter to write `typ` entries, using `buf` as temporary
+ * storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/* returns the block type (eg. 'r' for ref records. */
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_release(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 000000000000..75fe198e6773
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,121 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(SHA1_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[SHA1_SIZE];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value_type = REFTABLE_REF_DELETION;
+		EXPECT(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	EXPECT(n > 0);
+
+	block_writer_release(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT_STREQ(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_release(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+
+		EXPECT_STREQ(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_release(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_block_read_write);
+	return 0;
+}
diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c
new file mode 100644
index 000000000000..3e0b0f24f1c7
--- /dev/null
+++ b/reftable/zlib-compat.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress_return_consumed (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 09d4b83ef9b8..c9deeaf08c7a 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,7 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 08/20] reftable: a generic binary tree implementation
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (6 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 07/20] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 09/20] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                             ` (12 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.

The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  4 ++-
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 61 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index 89f998a9a5f2..5ac204eb211c 100644
--- a/Makefile
+++ b/Makefile
@@ -2406,12 +2406,14 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
+REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
-REFTABLE_TEST_OBJS += reftable/basics_test.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 000000000000..0061d14e306c
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left != NULL) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right != NULL) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left != NULL) {
+		tree_free(t->left);
+	}
+	if (t->right != NULL) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 000000000000..fbdd002e23af
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+ * is set, insert the key if it's not found. Else, return NULL.
+ */
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+ * deallocates the tree nodes recursively. Keys should be deallocated separately
+ * by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 000000000000..26d1e694252e
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,61 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = (struct curry *)arg;
+	if (c->last != NULL) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_tree);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c9deeaf08c7a..050551fa6985 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 09/20] reftable: write reftable files
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (7 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 08/20] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 10/20] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
                             ` (11 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   1 +
 reftable/reftable-writer.h | 147 ++++++++
 reftable/writer.c          | 691 +++++++++++++++++++++++++++++++++++++
 reftable/writer.h          |  50 +++
 4 files changed, 889 insertions(+)
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 5ac204eb211c..2653894d0308 100644
--- a/Makefile
+++ b/Makefile
@@ -2407,6 +2407,7 @@ REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
new file mode 100644
index 000000000000..9d2f8d605558
--- /dev/null
+++ b/reftable/reftable-writer.h
@@ -0,0 +1,147 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_WRITER_H
+#define REFTABLE_WRITER_H
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/* Writing single reftables */
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	unsigned unpadded : 1;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	unsigned skip_index_objects : 1;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	unsigned skip_name_check : 1;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	unsigned exact_log_message : 1;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* Set the range of update indices for the records we will add. When writing a
+   table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates to a stack, typically min==max, and the
+   update_index can be obtained by inspeciting the stack. When converting an
+   existing ref database into a single reftable, this would be a range of
+   update-index timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/*
+  Add a reftable_ref_record. The record should have names that come after
+  already added records.
+
+  The update_index must be within the limits set by
+  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
+  REFTABLE_API_ERROR error to write a ref record after a log record.
+*/
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/*
+  Convenience function to add multiple reftable_ref_records; the function sorts
+  the records before adding them, reordering the records array passed in.
+*/
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/*
+  adds reftable_log_records. Log records are keyed by (refname, decreasing
+  update_index). The key for the record added must come after the already added
+  log records.
+*/
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/*
+  Convenience function to add multiple reftable_log_records; the function sorts
+  the records before adding them, reordering records array passed in.
+*/
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+#endif
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 000000000000..d42ca8afac11
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,691 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "tree.h"
+#include "reftable-error.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = SHA1_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
+
+struct reftable_writer *
+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	int result = -1;
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0)
+		goto done;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		result = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		result = err;
+		goto done;
+	}
+
+	result = 0;
+done:
+	strbuf_release(&key);
+	return result;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val1(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects &&
+	    reftable_ref_record_val2(ref) != NULL) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, reftable_ref_record_val2(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+static int reftable_writer_add_log_verbatim(struct reftable_writer *w,
+					    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	if (w->block_writer != NULL &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	return writer_add_record(w, &rec);
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	char *input_log_message = NULL;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err = 0;
+
+	if (log->value_type == REFTABLE_LOG_DELETION)
+		return reftable_writer_add_log_verbatim(w, log);
+
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	input_log_message = log->update.message;
+	if (!w->opts.exact_log_message && log->update.message != NULL) {
+		strbuf_addstr(&cleaned_message, log->update.message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->update.message = cleaned_message.buf;
+	}
+
+	err = reftable_writer_add_log_verbatim(w, log);
+	log->update.message = input_log_message;
+done:
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	if (arg->last != NULL) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree != NULL) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_release(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 000000000000..4921c249d06b
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "tree.h"
+#include "reftable-writer.h"
+
+struct reftable_writer {
+	int (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	 * tree for use with tsearch; used to populate the 'o' inverse OID
+	 * map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 10/20] reftable: generic interface to tables
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (8 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 09/20] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 11/20] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
                             ` (10 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |   1 +
 reftable/generic.h           |  32 ++++++++++
 reftable/reftable-generic.h  |  41 +++++++++++++
 reftable/reftable-iterator.h |  39 ++++++++++++
 reftable/reftable.c          | 115 +++++++++++++++++++++++++++++++++++
 5 files changed, 228 insertions(+)
 create mode 100644 reftable/generic.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable.c

diff --git a/Makefile b/Makefile
index 2653894d0308..1e94c2e0ac2c 100644
--- a/Makefile
+++ b/Makefile
@@ -2406,6 +2406,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
diff --git a/reftable/generic.h b/reftable/generic.h
new file mode 100644
index 000000000000..98886a06402d
--- /dev/null
+++ b/reftable/generic.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef GENERIC_H
+#define GENERIC_H
+
+#include "record.h"
+#include "reftable-generic.h"
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+#endif
diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h
new file mode 100644
index 000000000000..d019dbbc19f1
--- /dev/null
+++ b/reftable/reftable-generic.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_GENERIC_H
+#define REFTABLE_GENERIC_H
+
+#include "reftable-iterator.h"
+
+struct reftable_table_vtable;
+
+/*
+ * Provides a unified API for reading tables, either merged tables, or single
+ * readers. */
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+#endif
diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h
new file mode 100644
index 000000000000..d3eee7af3571
--- /dev/null
+++ b/reftable/reftable-iterator.h
@@ -0,0 +1,39 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ITERATOR_H
+#define REFTABLE_ITERATOR_H
+
+#include "reftable-record.h"
+
+struct reftable_iterator_vtable;
+
+/* iterator is the generic interface for walking over data stored in a
+ * reftable.
+ */
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 000000000000..85ab8ba9f963
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "record.h"
+#include "generic.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (it->ops == NULL) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 11/20] reftable: read reftable files
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (9 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 10/20] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 12/20] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
                             ` (9 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This supports reading a single reftable file.

The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   2 +
 reftable/iter.c            | 197 ++++++++++
 reftable/iter.h            |  67 ++++
 reftable/reader.c          | 773 +++++++++++++++++++++++++++++++++++++
 reftable/reader.h          |  66 ++++
 reftable/reftable-reader.h |  98 +++++
 6 files changed, 1203 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable-reader.h

diff --git a/Makefile b/Makefile
index 1e94c2e0ac2c..f8167084b80a 100644
--- a/Makefile
+++ b/Makefile
@@ -2404,7 +2404,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 000000000000..de1b3b898f61
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,197 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "generic.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable-error.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return it->ops == NULL;
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri =
+		(struct filtering_ref_iterator *)iter_arg;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { NULL };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if (ref->value_type == REFTABLE_REF_VAL2 &&
+		    (!memcmp(fri->oid.buf, ref->value.val2.target_value,
+			     fri->oid.len) ||
+		     !memcmp(fri->oid.buf, ref->value.val2.value,
+			     fri->oid.len)))
+			return 0;
+
+		if (ref->value_type == REFTABLE_REF_VAL1 &&
+		    !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_release(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
+	struct reftable_ref_record *ref =
+		(struct reftable_ref_record *)rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+		/* BUG */
+		if (!memcmp(it->oid.buf, ref->value.val2.target_value,
+			    it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 000000000000..656ba3053e9f
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,67 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "system.h"
+#include "block.h"
+#include "record.h"
+
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+ * iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+ * but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 000000000000..36d7be01a3b3
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,773 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "generic.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "reftable-generic.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (source->ops == NULL) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = SHA1_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case SHA1_ID:
+			break;
+		case SHA256_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (ti->bi.br == NULL) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { NULL };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next((struct table_iter *)ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = (struct table_iter *)p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { NULL };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { NULL };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_release(&want_index_rec);
+	reftable_record_release(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_release(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
+
+/* generic table interface. */
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek((struct reftable_reader *)tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 000000000000..39583e5dbcd4
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,66 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+#endif
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
new file mode 100644
index 000000000000..201aab3c2af7
--- /dev/null
+++ b/reftable/reftable-reader.h
@@ -0,0 +1,98 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_READER_H
+#define REFTABLE_READER_H
+
+#include "reftable-iterator.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Reading single tables
+ *
+ * The follow routines are for reading single files. For an
+ * application-level interface, skip ahead to struct
+ * reftable_merged_table and struct reftable_stack.
+ */
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* Generic table. */
+struct reftable_table;
+
+/* reftable_new_reader opens a reftable for reading. If successful,
+ * returns 0 code and sets pp. The name is used for creating a
+ * stack. Typically, it is the basename of the file. The block source
+ * `src` is owned by the reader, and is closed on calling
+ * reftable_reader_destroy(). On error, the block source `src` is
+ * closed as well.
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+   err = reftable_iterator_next_ref(&it, &ref);
+   if (err > 0) {
+   break;
+   }
+   if (err < 0) {
+   ..error handling..
+   }
+   ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_release(&ref);
+*/
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+*/
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+/* creates a generic table from a file reader. */
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 12/20] reftable: reftable file level tests
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (10 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 11/20] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 13/20] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
                             ` (8 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.

Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/reftable_test.c | 583 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 3 files changed, 585 insertions(+)
 create mode 100644 reftable/reftable_test.c

diff --git a/Makefile b/Makefile
index f8167084b80a..b23e4a82063a 100644
--- a/Makefile
+++ b/Makefile
@@ -2416,6 +2416,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
new file mode 100644
index 000000000000..69dbfb09fff9
--- /dev/null
+++ b/reftable/reftable_test.c
@@ -0,0 +1,583 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-writer.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.update_index = update_index;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA256_SIZE] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.update_index = update_index;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.new_hash = hash;
+		log.update.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = { .refname = "refs/heads/master",
+					   .update_index = 0xa,
+					   .value_type = REFTABLE_LOG_UPDATE,
+					   .update = {
+						   .name = "Han-Wen Nienhuys",
+						   .email = "hanwen@google.com",
+						   .tz_offset = 100,
+						   .time = 0x5e430672,
+						   .message = "commit: 9\n",
+					   } };
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+	for (int i = 0; i < SHA1_SIZE; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.update.old_hash = hash1;
+	log.update.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.old_hash = hash1;
+		log.update.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, SHA1_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, SHA1_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
+		EXPECT(i == ref.value.val1[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, SHA1_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, SHA256_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, SHA1_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[SHA1_SIZE];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[SHA1_SIZE];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[SHA1_SIZE];
+		uint8_t hash2[SHA1_SIZE];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value_type = REFTABLE_REF_VAL2;
+		ref.value.val2.value = hash1;
+		ref.value.val2.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int reftable_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_log_write_read);
+	RUN_TEST(test_table_read_write_seek_linear_sha256);
+	RUN_TEST(test_log_buffer_size);
+	RUN_TEST(test_table_write_small_table);
+	RUN_TEST(test_buffer);
+	RUN_TEST(test_table_read_api);
+	RUN_TEST(test_table_read_write_sequential);
+	RUN_TEST(test_table_read_write_seek_linear);
+	RUN_TEST(test_table_read_write_seek_index);
+	RUN_TEST(test_table_refs_for_no_index);
+	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_table_empty);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa6985..fdf925867375 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 13/20] reftable: add a heap-based priority queue for reftable records
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (11 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 12/20] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 14/20] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
                             ` (7 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This is needed to create a merged view multiple reftables

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |   2 +
 reftable/pq.c             | 115 ++++++++++++++++++++++++++++++++++++++
 reftable/pq.h             |  32 +++++++++++
 reftable/pq_test.c        |  72 ++++++++++++++++++++++++
 reftable/reftable-tests.h |   1 +
 t/helper/test-reftable.c  |   1 +
 6 files changed, 223 insertions(+)
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/pq_test.c

diff --git a/Makefile b/Makefile
index b23e4a82063a..74560a9605bb 100644
--- a/Makefile
+++ b/Makefile
@@ -2406,6 +2406,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/reftable.o
@@ -2415,6 +2416,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 000000000000..8918d158e2d4
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 000000000000..385d2fb139a6
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/pq_test.c b/reftable/pq_test.c
new file mode 100644
index 000000000000..5178980b37ed
--- /dev/null
+++ b/reftable/pq_test.c
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "constants.h"
+#include "pq.h"
+#include "record.h"
+#include "reftable-tests.h"
+#include "test_framework.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last != NULL) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+int pq_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_pq);
+	return 0;
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
index 5e7698ae654e..495556572e20 100644
--- a/reftable/reftable-tests.h
+++ b/reftable/reftable-tests.h
@@ -12,6 +12,7 @@ license that can be found in the LICENSE file or at
 int basics_test_main(int argc, const char **argv);
 int block_test_main(int argc, const char **argv);
 int merged_test_main(int argc, const char **argv);
+int pq_test_main(int argc, const char **argv);
 int record_test_main(int argc, const char **argv);
 int refname_test_main(int argc, const char **argv);
 int reftable_test_main(int argc, const char **argv);
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index fdf925867375..368eb7339eb7 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
 	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 14/20] reftable: add merged table view
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (12 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 13/20] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 15/20] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
                             ` (6 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This adds an abstract, read-only interface to the ref database.

This primitive is used to construct the read view of the ref database
(the read view is constructed by merging several *.ref files). It also
provides the mechanism to provide a unified view of the refs in the main
repository and the per-worktree refs.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   2 +
 reftable/merged.c          | 367 +++++++++++++++++++++++++++++++++++++
 reftable/merged.h          |  35 ++++
 reftable/merged_test.c     | 291 +++++++++++++++++++++++++++++
 reftable/reftable-merged.h |  72 ++++++++
 t/helper/test-reftable.c   |   1 +
 6 files changed, 768 insertions(+)
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/reftable-merged.h

diff --git a/Makefile b/Makefile
index 74560a9605bb..129f73d118ac 100644
--- a/Makefile
+++ b/Makefile
@@ -2406,6 +2406,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/merged.o
 REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
@@ -2416,6 +2417,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 000000000000..02940a79b0a3
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,367 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "generic.h"
+#include "reftable-merged.h"
+#include "reftable-error.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	int i = 0;
+	merged_iter_pqueue_release(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = (struct merged_iter *)p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(it->ops == NULL);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = (struct reftable_merged_table *)reftable_calloc(
+		sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_release(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (mt == NULL) {
+		return;
+	}
+	merged_table_release(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int reftable_table_seek_record(struct reftable_table *tab,
+				      struct reftable_iterator *it,
+				      struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
+					rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(
+		(struct reftable_merged_table *)tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(tab->ops == NULL);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 000000000000..8c4d4d58d77a
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,35 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_release(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 000000000000..0c301ceccedd
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,291 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-merged.h"
+#include "reftable-tests.h"
+#include "reftable-generic.h"
+#include "reftable-writer.h"
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		EXPECT_ERR(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
+	EXPECT_ERR(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+	EXPECT(ref.update_index == 2);
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[SHA1_SIZE] = { 1 };
+	uint8_t hash2[SHA1_SIZE] = { 2 };
+	struct reftable_ref_record r1[] = {
+		{
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "c",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		}
+	};
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	EXPECT_ERR(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == SHA1_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
+	EXPECT_ERR(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_merged_between);
+	RUN_TEST(test_merged);
+	RUN_TEST(test_default_write_opts);
+	return 0;
+}
diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h
new file mode 100644
index 000000000000..1a6d16915ab4
--- /dev/null
+++ b/reftable/reftable-merged.h
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_MERGED_H
+#define REFTABLE_MERGED_H
+
+#include "reftable-iterator.h"
+
+/*
+ * Merged tables
+ *
+ * A ref database kept in a sequence of table files. The merged_table presents a
+ * unified view to reading (seeking, iterating) a sequence of immutable tables.
+ *
+ * The merged tables are on purpose kept disconnected from their actual storage
+ * (eg. files on disk), because it is useful to merge tables aren't files. For
+ * example, the per-workspace and global ref namespace can be implemented as a
+ * merged table of two stacks of file-backed reftables.
+ */
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 368eb7339eb7..cb277d911692 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
 	reftable_test_main(argc, argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 15/20] reftable: implement refname validation
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (13 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 14/20] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 16/20] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
                             ` (5 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The packed/loose format has restrictions on refnames: a and a/b cannot
coexist. This limitation does not apply to reftable per se, but must be
maintained for interoperability. This code adds validation routines to
abort transactions that are trying to add invalid names.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   2 +
 reftable/refname.c       | 209 +++++++++++++++++++++++++++++++++++++++
 reftable/refname.h       |  29 ++++++
 reftable/refname_test.c  | 102 +++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 5 files changed, 343 insertions(+)
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c

diff --git a/Makefile b/Makefile
index 129f73d118ac..9b941b69eb4b 100644
--- a/Makefile
+++ b/Makefile
@@ -2410,6 +2410,7 @@ REFTABLE_OBJS += reftable/merged.o
 REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
@@ -2420,6 +2421,7 @@ REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 000000000000..0f4eb3b292d3
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable-error.h"
+#include "basics.h"
+#include "refname.h"
+#include "reftable-iterator.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = (struct find_arg *)arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static void modification_release(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_release(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 000000000000..a24b40fcb428
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,29 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable-record.h"
+#include "reftable-generic.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 000000000000..5e005d6af31c
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,102 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-writer.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "destination", /* make sure it's not a symref.
+						*/
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add != NULL) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del != NULL) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		EXPECT(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_conflict);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index cb277d911692..23e050eb5d23 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -8,6 +8,7 @@ int cmd__reftable(int argc, const char **argv)
 	merged_test_main(argc, argv);
 	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 16/20] reftable: implement stack, a mutable database of reftable files.
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (14 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 15/20] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 17/20] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
                             ` (4 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |    2 +
 reftable/reftable-stack.h |  123 ++++
 reftable/stack.c          | 1376 +++++++++++++++++++++++++++++++++++++
 reftable/stack.h          |   41 ++
 reftable/stack_test.c     |  928 +++++++++++++++++++++++++
 t/helper/test-reftable.c  |    1 +
 6 files changed, 2471 insertions(+)
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c

diff --git a/Makefile b/Makefile
index 9b941b69eb4b..e5ea8c6627ae 100644
--- a/Makefile
+++ b/Makefile
@@ -2412,6 +2412,7 @@ REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/reftable.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 REFTABLE_OBJS += reftable/zlib-compat.o
@@ -2423,6 +2424,7 @@ REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
new file mode 100644
index 000000000000..2b363c908345
--- /dev/null
+++ b/reftable/reftable-stack.h
@@ -0,0 +1,123 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_STACK_H
+#define REFTABLE_STACK_H
+
+#include "reftable-writer.h"
+
+/*
+ * The stack presents an interface to a mutable sequence of reftables.
+
+ * A stack can be mutated by pushing a table to the top of the stack.
+
+ * The reftable_stack automatically compacts files on disk to ensure good
+ * amortized performance.
+ *
+ * For windows and other platforms that cannot have open files as rename
+ * destinations, concurrent access from multiple processes needs the rand()
+ * random seed to be randomized.
+ */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+ *  stored in 'dir'. Typically, this should be .git/reftables.
+ */
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+ * returns a new transaction to add reftables to the given stack. As a side
+ * effect, the ref database is locked.
+ */
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+ * transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+ * reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+ * next write or reload, and should not be closed or deleted.
+ */
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* delete stale .ref tables. */
+int reftable_stack_clean(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+#endif
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 000000000000..3cdb6e8ed33b
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1376 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-record.h"
+#include "reftable-merged.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+static void stack_filename(struct strbuf *dest, struct reftable_stack *st,
+			   const char *name)
+{
+	strbuf_reset(dest);
+	strbuf_addstr(dest, st->reftable_dir);
+	strbuf_addstr(dest, "/");
+	strbuf_addstr(dest, name);
+}
+
+static int reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = SHA1_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+static int has_name(char **names, const char *name)
+{
+	while (*names) {
+		if (!strcmp(*names, name))
+			return 1;
+		names++;
+	}
+	return 0;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = 0;
+	if (st->merged != NULL) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	err = read_lines(st->list_file, &names);
+	if (err < 0) {
+		FREE_AND_NULL(names);
+	}
+
+	if (st->readers != NULL) {
+		int i = 0;
+		struct strbuf filename = STRBUF_INIT;
+		for (i = 0; i < st->readers_len; i++) {
+			const char *name = reader_name(st->readers[i]);
+			strbuf_reset(&filename);
+			if (names && !has_name(names, name)) {
+				stack_filename(&filename, st, name);
+			}
+			reftable_reader_free(st->readers[i]);
+
+			if (filename.len) {
+				// On Windows, can only unlink after closing.
+				unlink(filename.buf);
+			}
+		}
+		strbuf_release(&filename);
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+	free_names(names);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (rd == NULL) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			stack_filename(&table_path, st, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged != NULL) {
+		merged_table_release(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers != NULL) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i] != NULL) {
+			const char *name = reader_name(cur[i]);
+			struct strbuf filename = STRBUF_INIT;
+			stack_filename(&filename, st, name);
+
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+
+			// On Windows, can only unlink after closing.
+			unlink(filename.buf);
+
+			strbuf_release(&filename);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (names[i] == NULL) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len] != NULL) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	uint32_t rnd = (uint32_t)rand();
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x",
+		 min, max, rnd);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		stack_filename(&nm, add->stack, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (add == NULL) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	/* success, no more state to clean up. */
+	strbuf_release(&add->lock_file_name);
+	for (i = 0; i < add->new_tables_len; i++) {
+		reftable_free(add->new_tables[i]);
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	err = reftable_stack_reload(add->stack);
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	stack_filename(&tab_file_name, add->stack, next_name.buf);
+
+	/*
+	  On windows, this relies on rand() picking a unique destination name.
+	  Maybe we should do retry loop as well?
+	 */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[last]));
+
+	stack_filename(temp_tab, st, next_name.buf);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config != NULL && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		if (config != NULL && config->time > 0 &&
+		    log.update.time < config->time) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt != NULL) {
+		merged_table_release(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_release(&ref);
+	reftable_log_record_release(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (expiry == NULL && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		stack_filename(&subtab_file_name, st,
+			       reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	stack_filename(&new_table_path, st, new_table_name.buf);
+
+	if (!is_empty_table) {
+		/* retry? */
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_release(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
+
+static int is_table_name(const char *s)
+{
+	const char *dot = strrchr(s, '.');
+	return dot != NULL && !strcmp(dot, ".ref");
+}
+
+static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max,
+				     const char *name)
+{
+	int err = 0;
+	uint64_t update_idx = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct strbuf table_path = STRBUF_INIT;
+	stack_filename(&table_path, st, name);
+
+	err = reftable_block_source_from_file(&src, table_path.buf);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, name);
+	if (err < 0)
+		goto done;
+
+	update_idx = reftable_reader_max_update_index(rd);
+	reftable_reader_free(rd);
+
+	if (update_idx <= max) {
+		unlink(table_path.buf);
+	}
+done:
+	strbuf_release(&table_path);
+}
+
+static int reftable_stack_clean_locked(struct reftable_stack *st)
+{
+	uint64_t max = reftable_merged_table_max_update_index(
+		reftable_stack_merged_table(st));
+	DIR *dir = opendir(st->reftable_dir);
+	struct dirent *d = NULL;
+	if (dir == NULL) {
+		return REFTABLE_IO_ERROR;
+	}
+
+	while ((d = readdir(dir)) != NULL) {
+		int i = 0;
+		int found = 0;
+		if (!is_table_name(d->d_name))
+			continue;
+
+		for (i = 0; !found && i < st->readers_len; i++) {
+			found = !strcmp(reader_name(st->readers[i]), d->d_name);
+		}
+		if (found)
+			continue;
+
+		remove_maybe_stale_table(st, max, d->d_name);
+	}
+
+	closedir(dir);
+	return 0;
+}
+
+int reftable_stack_clean(struct reftable_stack *st)
+{
+	struct reftable_addition *add = NULL;
+	int err = reftable_stack_new_addition(&add, st);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(st);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_clean_locked(st);
+
+done:
+	reftable_addition_destroy(add);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 000000000000..f57005846e56
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "system.h"
+#include "reftable-writer.h"
+#include "reftable-stack.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 000000000000..5e1d5f941524
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,928 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "reftable-reader.h"
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+static int count_dir_entries(const char *dirname)
+{
+	DIR *dir = opendir(dirname);
+	int len = 0;
+	struct dirent *d;
+	if (dir == NULL)
+		return 0;
+
+	while ((d = readdir(dir)) != NULL) {
+		if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, "."))
+			continue;
+		len++;
+	}
+	closedir(dir);
+	return len;
+}
+
+static char *get_tmp_template(const char *prefix)
+{
+	const char *tmp = getenv("TMPDIR");
+	static char template[1024];
+	snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX",
+		 tmp ? tmp : "/tmp", prefix);
+	return template;
+}
+
+static void test_read_file(void)
+{
+	char *fn = get_tmp_template(__FUNCTION__);
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	EXPECT(fd > 0);
+	n = write(fd, out, strlen(out));
+	EXPECT(n == strlen(out));
+	err = close(fd);
+	EXPECT(err >= 0);
+
+	err = read_lines(fn, &names);
+	EXPECT_ERR(err);
+
+	for (i = 0; names[i] != NULL; i++) {
+		EXPECT(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	EXPECT(NULL != names[0]);
+	EXPECT(0 == strcmp(names[0], "line"));
+	EXPECT(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	EXPECT(names_equal(a, a));
+	EXPECT(!names_equal(a, b));
+	EXPECT(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+
+	EXPECT(mkdtemp(dir));
+
+	/* simulate multi-process access to the same stack
+	   by creating two stacks for the same directory.
+	 */
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT_ERR(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_commit(add);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(REFTABLE_REF_SYMREF == dest.value_type);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		EXPECT(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		refs[i].value_type = REFTABLE_REF_VAL1;
+		refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+		set_test_hash(refs[i].value.val1, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+
+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
+		reftable_ref_record_release(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
+		reftable_log_record_release(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+
+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
+
+	struct reftable_log_record input = { .refname = "branch",
+					     .update_index = 1,
+					     .value_type = REFTABLE_LOG_UPDATE,
+					     .update = {
+						     .new_hash = h1,
+						     .old_hash = h2,
+					     } };
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	input.update.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	input.update.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "one\n"));
+
+	input.update.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_release(&dest);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value_type = REFTABLE_REF_VAL1;
+			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
+			set_test_hash(refs[i].value.val1, i);
+		}
+
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].value_type = REFTABLE_LOG_UPDATE;
+			logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+			set_test_hash(logs[i].update.new_hash, i);
+			logs[i].update.email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_log_record_release(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+	reftable_log_record_release(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	EXPECT(mkdtemp(dir));
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	EXPECT_ERR(err);
+
+	EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE));
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	EXPECT(1 == fastlog2(3));
+	EXPECT(2 == fastlog2(4));
+	EXPECT(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(segs[2].log == 3);
+	EXPECT(segs[2].start == 5);
+	EXPECT(segs[2].end == 6);
+
+	EXPECT(segs[1].log == 2);
+	EXPECT(segs[1].start == 2);
+	EXPECT(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
+	EXPECT(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 1);
+	EXPECT(segs[0].start == 0);
+	EXPECT(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(min.start == 2);
+	EXPECT(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+		logs[i].update.time = i;
+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	EXPECT_ERR(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	EXPECT_ERR(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+	reftable_log_record_release(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_template(__FUNCTION__);
+	struct reftable_stack *st2 = NULL;
+
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+	clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err, i;
+	int N = 100;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+
+		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_compaction_concurrent(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL, *st2 = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err, i;
+	int N = 3;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st1),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st1, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st1, NULL);
+	EXPECT_ERR(err);
+
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+
+	EXPECT(count_dir_entries(dir) == 2);
+	clear_dir(dir);
+}
+
+static void unclean_stack_close(struct reftable_stack *st)
+{
+	// break abstraction boundary to simulate unclean shutdown.
+	int i = 0;
+	for (; i < st->readers_len; i++) {
+		reftable_reader_free(st->readers[i]);
+	}
+	st->readers_len = 0;
+	FREE_AND_NULL(st->readers);
+}
+
+static void test_reftable_stack_compaction_concurrent_clean(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
+	char *dir = get_tmp_template(__FUNCTION__);
+	int err, i;
+	int N = 3;
+	EXPECT(mkdtemp(dir));
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st1),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st1, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st1, NULL);
+	EXPECT_ERR(err);
+
+	unclean_stack_close(st1);
+	unclean_stack_close(st2);
+
+	err = reftable_new_stack(&st3, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_clean(st3);
+	EXPECT_ERR(err);
+	EXPECT(count_dir_entries(dir) == 2);
+
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	reftable_stack_destroy(st3);
+
+	clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_stack_compaction_concurrent_clean);
+	RUN_TEST(test_reftable_stack_compaction_concurrent);
+	RUN_TEST(test_reftable_stack_uptodate);
+	RUN_TEST(test_reftable_stack_transaction_api);
+	RUN_TEST(test_reftable_stack_hash_id);
+	RUN_TEST(test_sizes_to_segments_all_equal);
+	RUN_TEST(test_reftable_stack_auto_compaction);
+	RUN_TEST(test_reftable_stack_validate_refname);
+	RUN_TEST(test_reftable_stack_update_index_check);
+	RUN_TEST(test_reftable_stack_lock_failure);
+	RUN_TEST(test_reftable_stack_log_normalize);
+	RUN_TEST(test_reftable_stack_tombstone);
+	RUN_TEST(test_reftable_stack_add_one);
+	RUN_TEST(test_empty_add);
+	RUN_TEST(test_reflog_expire);
+	RUN_TEST(test_suggest_compaction_segment);
+	RUN_TEST(test_suggest_compaction_segment_nothing);
+	RUN_TEST(test_sizes_to_segments);
+	RUN_TEST(test_sizes_to_segments_empty);
+	RUN_TEST(test_log2);
+	RUN_TEST(test_parse_names);
+	RUN_TEST(test_read_file);
+	RUN_TEST(test_names_equal);
+	RUN_TEST(test_reftable_stack_add);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 23e050eb5d23..2efe48b23bbb 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -10,6 +10,7 @@ int cmd__reftable(int argc, const char **argv)
 	record_test_main(argc, argv);
 	refname_test_main(argc, argv);
 	reftable_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 17/20] reftable: add dump utility
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (15 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 16/20] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 18/20] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
                             ` (3 subsequent siblings)
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

provide a command-line utility for inspecting individual tables, and
inspecting a complete ref database

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/dump.c | 212 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 212 insertions(+)
 create mode 100644 reftable/dump.c

diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 000000000000..00b444e8c9b8
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,212 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable.h"
+#include "reftable-tests.h"
+
+static uint32_t hash_id;
+
+static int dump_table(const char *tablename)
+{
+	struct reftable_block_source src = { 0 };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_reader *r = NULL;
+
+	if (err < 0)
+		return err;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		return err;
+
+	err = reftable_reader_seek_ref(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_reader_seek_log(r, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_reader_free(r);
+	return 0;
+}
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack != NULL) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static int dump_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = {};
+	struct reftable_iterator it = { 0 };
+	struct reftable_ref_record ref = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_merged_table *merged = NULL;
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		return err;
+
+	merged = reftable_stack_merged_table(stack);
+
+	err = reftable_merged_table_seek_ref(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_clear(&ref);
+
+	err = reftable_merged_table_seek_log(merged, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_clear(&log);
+
+	reftable_stack_destroy(stack);
+	return 0;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "  -2 use SHA256\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL;
+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
+		switch (opt) {
+		case '2':
+			hash_id = 0x73323536;
+			break;
+		case 't':
+			opt_dump_table = 1;
+			break;
+		case 's':
+			opt_dump_stack = 1;
+			break;
+		case 'c':
+			opt_compact = 1;
+			break;
+		case '?':
+		case 'h':
+			print_help();
+			return 2;
+			break;
+		}
+	}
+
+	if (argv[optind] == NULL) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[optind];
+
+	if (opt_dump_table) {
+		err = dump_table(arg);
+	} else if (opt_dump_stack) {
+		err = dump_stack(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 18/20] Reftable support for git-core
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (16 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 17/20] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-13  7:18             ` Ævar Arnfjörð Bjarmason
  2021-04-12 19:25           ` [PATCH v6 19/20] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
                             ` (2 subsequent siblings)
  20 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a reftable-powered
ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or setting
GIT_TEST_REFTABLE in the environment.

Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Junio Hamano <gitster@pobox.com>
Co-authored-by: Jeff King <peff@peff.net>
---
 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    1 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1451 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  203 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 21 files changed, 1814 insertions(+), 33 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdcad..82c5940f1434 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,12 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1].  Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
++
+extensions.refStorage::
+	Specify the ref storage mechanism to use.  The acceptable values are `files` and
+	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
+	this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1].  Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ffde..725762358333 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index e5ea8c6627ae..5890e7d1cbb2 100644
--- a/Makefile
+++ b/Makefile
@@ -975,6 +975,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
diff --git a/builtin/clone.c b/builtin/clone.c
index f6b0c48beda5..bb2509fc1268 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1148,7 +1148,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1299,7 +1299,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo, 1);
+		initialize_repository_version(hash_algo, 1,
+					      default_ref_storage());
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index ff06fe4b9a49..31b718259a64 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -167,12 +167,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    !strcmp(ref_storage_format, "reftable"))
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -225,6 +227,7 @@ static int create_default_files(const char *template_path,
 	is_bare_repository_cfg = init_is_bare_repository || !work_tree;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -234,6 +237,24 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
+	/*
+	 * Check to see if .git/HEAD exists; this must happen before
+	 * initializing the ref db, because we want to see if there is an
+	 * existing HEAD.
+	 */
+	path = git_path_buf(&buf, "HEAD");
+	reinit = (!access(path, R_OK) ||
+		  readlink(path, junk, sizeof(junk) - 1) != -1);
+
+	/*
+	 * refs/heads is a file when using reftable. We can't reinitialize with
+	 * a reftable because it will overwrite HEAD
+	 */
+	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
+			      is_directory(git_path_buf(&buf, "refs/heads"))) {
+		die("cannot switch ref storage format.");
+	}
+
 	/*
 	 * We need to create a "refs" dir in any case so that older
 	 * versions of git can tell that this is a repository.
@@ -248,9 +269,6 @@ static int create_default_files(const char *template_path,
 	 * Point the HEAD symref to the initial branch with if HEAD does
 	 * not yet exist.
 	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
 	if (!reinit) {
 		char *ref;
 
@@ -267,7 +285,7 @@ static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
@@ -382,7 +400,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash, const char *initial_branch,
-	    unsigned int flags)
+	    const char *ref_storage_format, unsigned int flags)
 {
 	int reinit;
 	int exist_ok = flags & INIT_DB_EXIST_OK;
@@ -421,6 +439,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
 	check_repository_format(&repo_fmt);
+	repo_fmt.ref_storage = xstrdup(ref_storage_format);
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
@@ -475,6 +494,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
 		git_config_set("receive.denyNonFastforwards", "true");
 	}
 
+	if (!strcmp(ref_storage_format, "reftable"))
+		git_config_set("extensions.refStorage", ref_storage_format);
+
 	if (!(flags & INIT_DB_QUIET)) {
 		int len = strlen(git_dir);
 
@@ -548,6 +570,7 @@ static const char *const init_db_usage[] = {
 int cmd_init_db(int argc, const char **argv, const char *prefix)
 {
 	const char *git_dir;
+	const char *ref_storage_format = default_ref_storage();
 	const char *real_git_dir = NULL;
 	const char *work_tree;
 	const char *template_dir = NULL;
@@ -556,15 +579,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
 	const struct option init_db_options[] = {
-		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
-				N_("directory from which templates will be used")),
+		OPT_STRING(0, "template", &template_dir,
+			   N_("template-directory"),
+			   N_("directory from which templates will be used")),
 		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
-				N_("create a bare repository"), 1),
+			    N_("create a bare repository"), 1),
 		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
-			N_("permissions"),
-			N_("specify that the git repository is to be shared amongst several users"),
-			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
+		  N_("permissions"),
+		  N_("specify that the git repository is to be shared amongst several users"),
+		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
 		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
+		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
+			   N_("the ref storage format to use")),
 		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 			   N_("separate git dir from working tree")),
 		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
@@ -707,10 +733,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	}
 
 	UNLEAK(real_git_dir);
+	UNLEAK(ref_storage_format);
 	UNLEAK(git_dir);
 	UNLEAK(work_tree);
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, flags);
+		       initial_branch, ref_storage_format, flags);
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 877145349381..fd389c86f6fa 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -13,6 +13,7 @@
 #include "utf8.h"
 #include "worktree.h"
 #include "quote.h"
+#include "../refs/refs-internal.h"
 
 static const char * const worktree_usage[] = {
 	N_("git worktree add [<options>] <path> [<commit-ish>]"),
@@ -330,9 +331,29 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+	} else {
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+		strbuf_reset(&sb);
+	}
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/cache.h b/cache.h
index 148d9ab5f185..f299535b0b03 100644
--- a/cache.h
+++ b/cache.h
@@ -629,9 +629,10 @@ int path_inside_repo(const char *prefix, const char *path);
 #define INIT_DB_EXIST_OK 0x0002
 
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash_algo,
-	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+	    const char *template_dir, int hash_algo, const char *initial_branch,
+	    const char *ref_storage_format, unsigned int flags);
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format);
 
 void sanitize_stdfds(void);
 int daemonize(void);
@@ -1045,6 +1046,7 @@ struct repository_format {
 	int is_bare;
 	int hash_algo;
 	char *work_tree;
+	char *ref_storage;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
 };
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e023a..96c1268693f1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -701,7 +701,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba17..1a25789d2851 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
diff --git a/refs.c b/refs.c
index 261fd82beb98..b7c2decd30fc 100644
--- a/refs.c
+++ b/refs.c
@@ -19,10 +19,16 @@
 #include "repository.h"
 #include "sigchain.h"
 
+const char *default_ref_storage(void)
+{
+	const char *test = getenv("GIT_TEST_REFTABLE");
+	return test ? "reftable" : "files";
+}
+
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static struct ref_storage_be *refs_backends = &refs_be_reftable;
 
 static struct ref_storage_be *find_ref_storage_backend(const char *name)
 {
@@ -1875,13 +1881,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(be_name);
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
@@ -1897,7 +1903,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir,
+					 r->ref_storage_format ?
+						       r->ref_storage_format :
+						       default_ref_storage(),
+					 REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1953,7 +1963,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 		goto done;
 
 	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1967,6 +1977,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 
 struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 {
+	const char *format = default_ref_storage();
 	struct ref_store *refs;
 	const char *id;
 
@@ -1980,9 +1991,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 
 	if (wt->id)
 		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
-				      REF_STORE_ALL_CAPS);
+				      format, REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(get_git_common_dir(), format,
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs.h b/refs.h
index 48970dfc7e0f..5a6d4ca9fa8d 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+/* Returns the ref storage backend to use by default. */
+const char *default_ref_storage(void);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c936d..28166bf1f82b 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -669,6 +669,7 @@ struct ref_storage_be {
 };
 
 extern struct ref_storage_be refs_be_files;
+extern struct ref_storage_be refs_be_reftable;
 extern struct ref_storage_be refs_be_packed;
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
new file mode 100644
index 000000000000..130fd90e45b8
--- /dev/null
+++ b/refs/reftable-backend.c
@@ -0,0 +1,1451 @@
+#include "../cache.h"
+#include "../chdir-notify.h"
+#include "../config.h"
+#include "../iterator.h"
+#include "../lockfile.h"
+#include "../refs.h"
+#include "../reftable/reftable-stack.h"
+#include "../reftable/reftable-record.h"
+#include "../reftable/reftable-error.h"
+#include "../reftable/reftable-blocksource.h"
+#include "../reftable/reftable-reader.h"
+#include "../reftable/reftable-iterator.h"
+#include "../reftable/reftable-merged.h"
+#include "../reftable/reftable-generic.h"
+#include "../worktree.h"
+#include "refs-internal.h"
+
+extern struct ref_storage_be refs_be_reftable;
+
+struct git_reftable_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	int err;
+	char *repo_dir;
+
+	char *reftable_dir;
+	char *worktree_reftable_dir;
+
+	struct reftable_stack *main_stack;
+	struct reftable_stack *worktree_stack;
+};
+
+static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
+					const char *refname)
+{
+	if (store->worktree_stack == NULL)
+		return store->main_stack;
+
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+		return store->worktree_stack;
+	default:
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return store->main_stack;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type);
+
+static void clear_reftable_log_record(struct reftable_log_record *log)
+{
+	log->refname = NULL;
+	switch (log->value_type) {
+	case REFTABLE_LOG_UPDATE:
+		log->update.old_hash = NULL;
+		log->update.new_hash = NULL;
+		log->update.message = NULL;
+		break;
+	case REFTABLE_LOG_DELETION:
+		break;
+	}
+	reftable_log_record_release(log);
+}
+
+static void fill_reftable_log_record(struct reftable_log_record *log)
+{
+	const char *info = git_committer_info(0);
+	struct ident_split split = { NULL };
+	int result = split_ident_line(&split, info, strlen(info));
+	int sign = 1;
+	assert(0 == result);
+
+	reftable_log_record_release(log);
+	log->value_type = REFTABLE_LOG_UPDATE;
+	log->update.name =
+		xstrndup(split.name_begin, split.name_end - split.name_begin);
+	log->update.email =
+		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
+	log->update.time = atol(split.date_begin);
+	if (*split.tz_begin == '-') {
+		sign = -1;
+		split.tz_begin++;
+	}
+	if (*split.tz_begin == '+') {
+		sign = 1;
+		split.tz_begin++;
+	}
+
+	log->update.tz_offset = sign * atoi(split.tz_begin);
+}
+
+static int has_suffix(struct strbuf *b, const char *suffix)
+{
+	size_t len = strlen(suffix);
+
+	if (len > b->len) {
+		return 0;
+	}
+
+	return 0 == strncmp(b->buf + b->len - len, suffix, len);
+}
+
+/* trims the last path component of b. Returns -1 if it is not
+ * present, or 0 on success
+ */
+static int trim_component(struct strbuf *b)
+{
+	char *last;
+	last = strrchr(b->buf, '/');
+	if (!last)
+		return -1;
+	strbuf_setlen(b, last - b->buf);
+	return 0;
+}
+
+/* Returns whether `b` is a worktree path, trimming it to the gitdir
+ */
+static int is_worktree(struct strbuf *b)
+{
+	if (trim_component(b) < 0) {
+		return 0;
+	}
+	if (!has_suffix(b, "/worktrees")) {
+		return 0;
+	}
+	trim_component(b);
+	return 1;
+}
+
+static struct ref_store *git_reftable_ref_store_create(const char *path,
+						       unsigned int store_flags)
+{
+	struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+	struct reftable_write_options cfg = {
+		.block_size = 4096,
+		.hash_id = the_hash_algo->format_id,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	const char *gitdir = path;
+	struct strbuf wt_buf = STRBUF_INIT;
+	int wt = 0;
+
+	strbuf_addstr(&wt_buf, path);
+
+	/* this is clumsy, but the official worktree functions (eg.
+	 * get_worktrees()) function will try to initialize a ref storage
+	 * backend, leading to infinite recursion.  */
+	wt = is_worktree(&wt_buf);
+	if (wt) {
+		gitdir = wt_buf.buf;
+	}
+
+	base_ref_store_init(ref_store, &refs_be_reftable);
+	ref_store->gitdir = xstrdup(gitdir);
+	refs->store_flags = store_flags;
+	strbuf_addf(&sb, "%s/reftable", gitdir);
+	refs->reftable_dir = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	refs->err =
+		reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg);
+	assert(refs->err != REFTABLE_API_ERROR);
+
+	if (refs->err == 0 && wt) {
+		strbuf_addf(&sb, "%s/reftable", path);
+		refs->worktree_reftable_dir = xstrdup(sb.buf);
+
+		refs->err = reftable_new_stack(&refs->worktree_stack,
+					       refs->worktree_reftable_dir,
+					       cfg);
+		assert(refs->err != REFTABLE_API_ERROR);
+	}
+
+	strbuf_release(&sb);
+	strbuf_release(&wt_buf);
+	return ref_store;
+}
+
+static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct strbuf sb = STRBUF_INIT;
+
+	safe_create_dir(refs->reftable_dir, 1);
+	assert(refs->worktree_reftable_dir == NULL);
+
+	strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+	write_file(sb.buf, "ref: refs/heads/.invalid");
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+	safe_create_dir(sb.buf, 1);
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+	write_file(sb.buf, "this repository uses the reftable format");
+
+	return 0;
+}
+
+struct git_reftable_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_ref_record ref;
+	struct object_id oid;
+	struct ref_store *ref_store;
+
+	/* In case we must iterate over 2 stacks, this is non-null. */
+	struct reftable_merged_table *merged;
+	unsigned int flags;
+	int err;
+	const char *prefix;
+};
+
+static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	while (ri->err == 0) {
+		ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref);
+		if (ri->err) {
+			break;
+		}
+
+		if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) {
+			/*
+			  pseudorefs, eg. HEAD, FETCH_HEAD should not be
+			  produced, by default.
+			 */
+			continue;
+		}
+		ri->base.refname = ri->ref.refname;
+		if (ri->prefix != NULL &&
+		    strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) {
+			ri->err = 1;
+			break;
+		}
+		if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+		    ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE)
+			continue;
+
+		ri->base.flags = 0;
+		switch (ri->ref.value_type) {
+		case REFTABLE_REF_VAL1:
+			hashcpy(ri->oid.hash, ri->ref.value.val1);
+			break;
+		case REFTABLE_REF_VAL2:
+			hashcpy(ri->oid.hash, ri->ref.value.val2.value);
+			break;
+		case REFTABLE_REF_SYMREF: {
+			int out_flags = 0;
+			const char *resolved = refs_resolve_ref_unsafe(
+				ri->ref_store, ri->ref.refname,
+				RESOLVE_REF_READING, &ri->oid, &out_flags);
+			ri->base.flags = out_flags;
+			if (resolved == NULL &&
+			    !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+			    (ri->base.flags & REF_ISBROKEN)) {
+				continue;
+			}
+			break;
+		}
+		default:
+			abort();
+		}
+
+		ri->base.oid = &ri->oid;
+		if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		    !ref_resolves_to_object(ri->base.refname, ri->base.oid,
+					    ri->base.flags)) {
+			continue;
+		}
+
+		break;
+	}
+
+	if (ri->err > 0) {
+		return ITER_DONE;
+	}
+	if (ri->err < 0) {
+		return ITER_ERROR;
+	}
+
+	return ITER_OK;
+}
+
+static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
+		hashcpy(peeled->hash, ri->ref.value.val2.target_value);
+		return 0;
+	}
+
+	return -1;
+}
+
+static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	reftable_ref_record_release(&ri->ref);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged) {
+		reftable_merged_table_free(ri->merged);
+	}
+	return 0;
+}
+
+static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
+	reftable_ref_iterator_advance, reftable_ref_iterator_peel,
+	reftable_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
+				unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri));
+
+	if (refs->err < 0) {
+		ri->err = refs->err;
+	} else if (refs->worktree_stack == NULL) {
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(refs->main_stack);
+		ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix);
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		ri->err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						    the_hash_algo->format_id);
+		if (ri->err == 0)
+			ri->err = reftable_merged_table_seek_ref(
+				ri->merged, &ri->iter, prefix);
+	}
+
+	base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1);
+	ri->prefix = prefix;
+	ri->base.oid = &ri->oid;
+	ri->flags = flags;
+	ri->ref_store = ref_store;
+	return &ri->base;
+}
+
+static int fixup_symrefs(struct ref_store *ref_store,
+			 struct ref_transaction *transaction)
+{
+	struct strbuf referent = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct object_id old_oid;
+
+		err = git_reftable_read_raw_ref(ref_store, update->refname,
+						&old_oid, &referent,
+						/* mutate input, like
+						   files-backend.c */
+						&update->type);
+		if (err < 0 && errno == ENOENT &&
+		    is_null_oid(&update->old_oid)) {
+			err = 0;
+		}
+		if (err < 0)
+			goto done;
+
+		if (!(update->type & REF_ISSYMREF))
+			continue;
+
+		if (update->flags & REF_NO_DEREF) {
+			/* what should happen here? See files-backend.c
+			 * lock_ref_for_update. */
+		} else {
+			/*
+			  If we are updating a symref (eg. HEAD), we should also
+			  update the branch that the symref points to.
+
+			  This is generic functionality, and would be better
+			  done in refs.c, but the current implementation is
+			  intertwined with the locking in files-backend.c.
+			*/
+			int new_flags = update->flags;
+			struct ref_update *new_update = NULL;
+
+			/* if this is an update for HEAD, should also record a
+			   log entry for HEAD? See files-backend.c,
+			   split_head_update()
+			*/
+			new_update = ref_transaction_add_update(
+				transaction, referent.buf, new_flags,
+				&update->new_oid, &update->old_oid,
+				update->msg);
+			new_update->parent_update = update;
+
+			/* files-backend sets REF_LOG_ONLY here. */
+			update->flags |= REF_NO_DEREF | REF_LOG_ONLY;
+			update->flags &= ~REF_HAVE_OLD;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	strbuf_release(&referent);
+	return err;
+}
+
+static int git_reftable_transaction_prepare(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *errbuf)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_addition *add = NULL;
+	struct reftable_stack *stack =
+		transaction->nr ?
+			      stack_for(refs, transaction->updates[0]->refname) :
+			      refs->main_stack;
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+
+	err = fixup_symrefs(ref_store, transaction);
+	if (err) {
+		goto done;
+	}
+
+	transaction->backend_data = add;
+	transaction->state = REF_TRANSACTION_PREPARED;
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	if (err < 0) {
+		transaction->state = REF_TRANSACTION_CLOSED;
+		strbuf_addf(errbuf, "reftable: transaction prepare: %s",
+			    reftable_error_str(err));
+	}
+
+	return err;
+}
+
+static int git_reftable_transaction_abort(struct ref_store *ref_store,
+					  struct ref_transaction *transaction,
+					  struct strbuf *err)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	reftable_addition_destroy(add);
+	transaction->backend_data = NULL;
+	return 0;
+}
+
+static int reftable_check_old_oid(struct ref_store *refs, const char *refname,
+				  struct object_id *want_oid)
+{
+	struct object_id out_oid;
+	int out_flags = 0;
+	const char *resolved = refs_resolve_ref_unsafe(
+		refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags);
+	if (is_null_oid(want_oid) != (resolved == NULL)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	if (resolved != NULL && !oideq(&out_oid, want_oid)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	return 0;
+}
+
+static int ref_update_cmp(const void *a, const void *b)
+{
+	return strcmp((*(struct ref_update **)a)->refname,
+		      (*(struct ref_update **)b)->refname);
+}
+
+static int write_transaction_table(struct reftable_writer *writer, void *arg)
+{
+	struct ref_transaction *transaction = (struct ref_transaction *)arg;
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)transaction->ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, transaction->updates[0]->refname);
+	uint64_t ts = reftable_stack_next_update_index(stack);
+	int err = 0;
+	int i = 0;
+	struct reftable_log_record *logs =
+		calloc(transaction->nr, sizeof(*logs));
+	struct ref_update **sorted =
+		malloc(transaction->nr * sizeof(struct ref_update *));
+	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
+	QSORT(sorted, transaction->nr, ref_update_cmp);
+	reftable_writer_set_limits(writer, ts, ts);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = sorted[i];
+		struct reftable_log_record *log = &logs[i];
+		fill_reftable_log_record(log);
+		log->update_index = ts;
+		log->value_type = REFTABLE_LOG_UPDATE;
+		log->refname = (char *)u->refname;
+		log->update.old_hash = u->old_oid.hash;
+		log->update.new_hash = u->new_oid.hash;
+		log->update.message = u->msg;
+
+		if (u->flags & REF_LOG_ONLY) {
+			continue;
+		}
+
+		if (u->flags & REF_HAVE_NEW) {
+			struct reftable_ref_record ref = { NULL };
+			struct object_id peeled;
+
+			int peel_error = peel_object(&u->new_oid, &peeled);
+			ref.refname = (char *)u->refname;
+			ref.update_index = ts;
+
+			if (!peel_error) {
+				ref.value_type = REFTABLE_REF_VAL2;
+				ref.value.val2.target_value = peeled.hash;
+				ref.value.val2.value = u->new_oid.hash;
+			} else if (!is_null_oid(&u->new_oid)) {
+				ref.value_type = REFTABLE_REF_VAL1;
+				ref.value.val1 = u->new_oid.hash;
+			}
+
+			err = reftable_writer_add_ref(writer, &ref);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+
+	for (i = 0; i < transaction->nr; i++) {
+		err = reftable_writer_add_log(writer, &logs[i]);
+		clear_reftable_log_record(&logs[i]);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	free(logs);
+	free(sorted);
+	return err;
+}
+
+static int git_reftable_transaction_finish(struct ref_store *ref_store,
+					   struct ref_transaction *transaction,
+					   struct strbuf *errmsg)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	int err = 0;
+	int i;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = transaction->updates[i];
+		if (u->flags & REF_HAVE_OLD) {
+			err = reftable_check_old_oid(transaction->ref_store,
+						     u->refname, &u->old_oid);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+	if (transaction->nr) {
+		err = reftable_addition_add(add, &write_transaction_table,
+					    transaction);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	err = reftable_addition_commit(add);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	transaction->state = REF_TRANSACTION_CLOSED;
+	transaction->backend_data = NULL;
+	if (err) {
+		strbuf_addf(errmsg, "reftable: transaction failure: %s",
+			    reftable_error_str(err));
+		return -1;
+	}
+	return err;
+}
+
+static int
+git_reftable_transaction_initial_commit(struct ref_store *ref_store,
+					struct ref_transaction *transaction,
+					struct strbuf *errmsg)
+{
+	int err = git_reftable_transaction_prepare(ref_store, transaction,
+						   errmsg);
+	if (err)
+		return err;
+
+	return git_reftable_transaction_finish(ref_store, transaction, errmsg);
+}
+
+struct write_delete_refs_arg {
+	struct reftable_stack *stack;
+	struct string_list *refnames;
+	const char *logmsg;
+	unsigned int flags;
+};
+
+static int write_delete_refs_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_delete_refs_arg *arg =
+		(struct write_delete_refs_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = 0;
+	int i = 0;
+
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_ref_record ref = {
+			.refname = (char *)arg->refnames->items[i].string,
+			.value_type = REFTABLE_REF_DELETION,
+			.update_index = ts,
+		};
+		err = reftable_writer_add_ref(writer, &ref);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_log_record log = {
+			.update_index = ts,
+		};
+		struct reftable_ref_record current = { NULL };
+		fill_reftable_log_record(&log);
+		log.update_index = ts;
+		log.refname = (char *)arg->refnames->items[i].string;
+
+		log.update.message = xstrdup(arg->logmsg);
+		log.update.new_hash = NULL;
+		log.update.old_hash = NULL;
+		if (reftable_stack_read_ref(arg->stack, log.refname,
+					    &current) == 0) {
+			log.update.old_hash =
+				reftable_ref_record_val1(&current);
+		}
+		err = reftable_writer_add_log(writer, &log);
+		log.update.old_hash = NULL;
+		reftable_ref_record_release(&current);
+
+		clear_reftable_log_record(&log);
+		if (err < 0) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int git_reftable_delete_refs(struct ref_store *ref_store,
+				    const char *msg,
+				    struct string_list *refnames,
+				    unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, refnames->items[0].string);
+	struct write_delete_refs_arg arg = {
+		.stack = stack,
+		.refnames = refnames,
+		.logmsg = msg,
+		.flags = flags,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	string_list_sort(refnames);
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_delete_refs_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_pack_refs(struct ref_store *ref_store,
+				  unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	int err = refs->err;
+	if (err < 0) {
+		return err;
+	}
+	err = reftable_stack_compact_all(refs->main_stack, NULL);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
+	if (err == 0)
+		err = reftable_stack_clean(refs->main_stack);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_clean(refs->worktree_stack);
+
+	return err;
+}
+
+struct write_create_symref_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	const char *refname;
+	const char *target;
+	const char *logmsg;
+};
+
+static int write_create_symref_table(struct reftable_writer *writer, void *arg)
+{
+	struct write_create_symref_arg *create =
+		(struct write_create_symref_arg *)arg;
+	uint64_t ts = reftable_stack_next_update_index(create->stack);
+	int err = 0;
+
+	struct reftable_ref_record ref = {
+		.refname = (char *)create->refname,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = (char *)create->target,
+		.update_index = ts,
+	};
+	reftable_writer_set_limits(writer, ts, ts);
+	err = reftable_writer_add_ref(writer, &ref);
+	if (err == 0) {
+		struct reftable_log_record log = { NULL };
+		struct object_id new_oid;
+		struct object_id old_oid;
+
+		fill_reftable_log_record(&log);
+		log.refname = (char *)create->refname;
+		log.update_index = ts;
+		log.update.message = (char *)create->logmsg;
+		if (refs_resolve_ref_unsafe(
+			    (struct ref_store *)create->refs, create->refname,
+			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
+			log.update.old_hash = old_oid.hash;
+		}
+
+		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
+					    create->target, RESOLVE_REF_READING,
+					    &new_oid, NULL) != NULL) {
+			log.update.new_hash = new_oid.hash;
+		}
+
+		if (log.update.old_hash != NULL ||
+		    log.update.new_hash != NULL) {
+			err = reftable_writer_add_log(writer, &log);
+		}
+		log.refname = NULL;
+		log.update.message = NULL;
+		log.update.old_hash = NULL;
+		log.update.new_hash = NULL;
+		clear_reftable_log_record(&log);
+	}
+	return err;
+}
+
+static int git_reftable_create_symref(struct ref_store *ref_store,
+				      const char *refname, const char *target,
+				      const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_create_symref_arg arg = { .refs = refs,
+					       .stack = stack,
+					       .refname = refname,
+					       .target = target,
+					       .logmsg = logmsg };
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_create_symref_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct write_rename_arg {
+	struct reftable_stack *stack;
+	const char *oldname;
+	const char *newname;
+	const char *logmsg;
+};
+
+static int write_rename_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record ref = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref);
+
+	if (err) {
+		goto done;
+	}
+
+	/* XXX do ref renames overwrite the target? */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) {
+		goto done;
+	}
+
+	reftable_writer_set_limits(writer, ts, ts);
+
+	{
+		struct reftable_ref_record todo[2] = {
+			{
+				.refname = (char *)arg->oldname,
+				.update_index = ts,
+				.value_type = REFTABLE_REF_DELETION,
+			},
+			ref,
+		};
+		todo[1].update_index = ts;
+		free(todo[1].refname);
+		todo[1].refname = strdup(arg->newname);
+
+		err = reftable_writer_add_refs(writer, todo, 2);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	if (reftable_ref_record_val1(&ref)) {
+		uint8_t *val1 = reftable_ref_record_val1(&ref);
+		struct reftable_log_record todo[2] = { { NULL } };
+		fill_reftable_log_record(&todo[0]);
+		fill_reftable_log_record(&todo[1]);
+
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		todo[0].update.message = (char *)arg->logmsg;
+		todo[0].update.old_hash = val1;
+		todo[0].update.new_hash = NULL;
+
+		todo[1].refname = (char *)arg->newname;
+		todo[1].update_index = ts;
+		todo[1].update.old_hash = NULL;
+		todo[1].update.new_hash = val1;
+		todo[1].update.message = (char *)arg->logmsg;
+
+		err = reftable_writer_add_logs(writer, todo, 2);
+
+		clear_reftable_log_record(&todo[0]);
+		clear_reftable_log_record(&todo[1]);
+
+		if (err < 0) {
+			goto done;
+		}
+
+	} else {
+		/* XXX symrefs? */
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static int git_reftable_rename_ref(struct ref_store *ref_store,
+				   const char *oldrefname,
+				   const char *newrefname, const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_rename_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_copy_ref(struct ref_store *ref_store,
+				 const char *oldrefname, const char *newrefname,
+				 const char *logmsg)
+{
+	BUG("reftable reference store does not support copying references");
+}
+
+struct git_reftable_reflog_ref_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_log_record log;
+	struct object_id oid;
+
+	/* Used when iterating over worktree & main */
+	struct reftable_merged_table *merged;
+	char *last_name;
+};
+
+static int
+git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+
+	while (1) {
+		int err = reftable_iterator_next_log(&ri->iter, &ri->log);
+		if (err > 0) {
+			return ITER_DONE;
+		}
+		if (err < 0) {
+			return ITER_ERROR;
+		}
+
+		ri->base.refname = ri->log.refname;
+		if (ri->last_name != NULL &&
+		    !strcmp(ri->log.refname, ri->last_name)) {
+			/* we want the refnames that we have reflogs for, so we
+			 * skip if we've already produced this name. This could
+			 * be faster by seeking directly to
+			 * reflog@update_index==0.
+			 */
+			continue;
+		}
+
+		free(ri->last_name);
+		ri->last_name = xstrdup(ri->log.refname);
+		hashcpy(ri->oid.hash, ri->log.update.new_hash);
+		return ITER_OK;
+	}
+}
+
+static int
+git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	BUG("not supported.");
+	return -1;
+}
+
+static int
+git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+	reftable_log_record_release(&ri->log);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged)
+		reftable_merged_table_free(ri->merged);
+	return 0;
+}
+
+static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = {
+	git_reftable_reflog_ref_iterator_advance,
+	git_reftable_reflog_ref_iterator_peel,
+	git_reftable_reflog_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(1, sizeof(*ri));
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+
+	if (refs->worktree_stack == NULL) {
+		struct reftable_stack *stack = refs->main_stack;
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(stack);
+		int err = reftable_merged_table_seek_log(mt, &ri->iter, "");
+		if (err < 0) {
+			free(ri);
+			/* XXX is this allowed? */
+			return NULL;
+		}
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		int err = 0;
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						the_hash_algo->format_id);
+		if (err < 0) {
+			free(tabs);
+			/* XXX see above */
+			return NULL;
+		}
+		err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, "");
+		if (err < 0) {
+			return NULL;
+		}
+	}
+	base_ref_iterator_init(&ri->base,
+			       &git_reftable_reflog_ref_iterator_vtable, 1);
+	ri->base.oid = &ri->oid;
+
+	return (struct ref_iterator *)ri;
+}
+
+static int git_reftable_for_each_reflog_ent_newest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_log_record log = { NULL };
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	while (err == 0) {
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		hashcpy(old_oid.hash, log.update.old_hash);
+		hashcpy(new_oid.hash, log.update.new_hash);
+
+		full_committer = fmt_ident(log.update.name, log.update.email,
+					   WANT_COMMITTER_IDENT,
+					   /*date*/ NULL, IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log.update.time,
+			 log.update.tz_offset, log.update.message, cb_data);
+		if (err)
+			break;
+	}
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_for_each_reflog_ent_oldest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reftable_log_record *logs = NULL;
+	int cap = 0;
+	int len = 0;
+	int err = 0;
+	int i = 0;
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+
+	while (err == 0) {
+		struct reftable_log_record log = { NULL };
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			logs = realloc(logs, cap * sizeof(*logs));
+		}
+
+		logs[len++] = log;
+	}
+
+	for (i = len; i--;) {
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		hashcpy(old_oid.hash, log->update.old_hash);
+		hashcpy(new_oid.hash, log->update.new_hash);
+
+		full_committer = fmt_ident(log->update.name, log->update.email,
+					   WANT_COMMITTER_IDENT, NULL,
+					   IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log->update.time,
+			 log->update.tz_offset, log->update.message, cb_data);
+		if (err) {
+			break;
+		}
+	}
+
+	for (i = 0; i < len; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	free(logs);
+
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_reflog_exists(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_log_record log = { NULL };
+	int err = refs->err;
+
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err) {
+		goto done;
+	}
+	err = reftable_iterator_next_log(&it, &log);
+	if (err) {
+		goto done;
+	}
+
+	if (strcmp(log.refname, refname)) {
+		err = 1;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return !err;
+}
+
+static int git_reftable_create_reflog(struct ref_store *ref_store,
+				      const char *refname, int force_create,
+				      struct strbuf *err)
+{
+	return 0;
+}
+
+static int git_reftable_delete_reflog(struct ref_store *ref_store,
+				      const char *refname)
+{
+	return 0;
+}
+
+struct reflog_expiry_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	struct reftable_log_record *tombstones;
+	int len;
+	int cap;
+};
+
+static void clear_log_tombstones(struct reflog_expiry_arg *arg)
+{
+	int i = 0;
+	for (; i < arg->len; i++) {
+		reftable_log_record_release(&arg->tombstones[i]);
+	}
+
+	FREE_AND_NULL(arg->tombstones);
+}
+
+static void add_log_tombstone(struct reflog_expiry_arg *arg,
+			      const char *refname, uint64_t ts)
+{
+	struct reftable_log_record tombstone = {
+		.refname = xstrdup(refname),
+		.update_index = ts,
+	};
+	if (arg->len == arg->cap) {
+		arg->cap = 2 * arg->cap + 1;
+		arg->tombstones =
+			realloc(arg->tombstones, arg->cap * sizeof(tombstone));
+	}
+	arg->tombstones[arg->len++] = tombstone;
+}
+
+static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
+{
+	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int i = 0;
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->len; i++) {
+		int err = reftable_writer_add_log(writer, &arg->tombstones[i]);
+		if (err) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname,
+			   const struct object_id *oid, unsigned int flags,
+			   reflog_expiry_prepare_fn prepare_fn,
+			   reflog_expiry_should_prune_fn should_prune_fn,
+			   reflog_expiry_cleanup_fn cleanup_fn,
+			   void *policy_cb_data)
+{
+	/*
+	  For log expiry, we write tombstones in place of the expired entries,
+	  This means that the entries are still retrievable by delving into the
+	  stack, and expiring entries paradoxically takes extra memory.
+
+	  This memory is only reclaimed when some operation issues a
+	  git_reftable_pack_refs(), which will compact the entire stack and get
+	  rid of deletion entries.
+
+	  It would be better if the refs backend supported an API that sets a
+	  criterion for all refs, passing the criterion to pack_refs().
+	*/
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reflog_expiry_arg arg = {
+		.stack = stack,
+		.refs = refs,
+	};
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err < 0) {
+		goto done;
+	}
+
+	while (1) {
+		struct object_id ooid;
+		struct object_id noid;
+
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, refname)) {
+			break;
+		}
+		hashcpy(ooid.hash, log.update.old_hash);
+		hashcpy(noid.hash, log.update.new_hash);
+
+		if (should_prune_fn(&ooid, &noid, log.update.email,
+				    (timestamp_t)log.update.time,
+				    log.update.tz_offset, log.update.message,
+				    policy_cb_data)) {
+			add_log_tombstone(&arg, refname, log.update_index);
+		}
+	}
+	err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	clear_log_tombstones(&arg);
+	return err;
+}
+
+static int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	/* This is usually not needed, but Git doesn't signal to ref backend if
+	   a subprocess updated the ref DB.  So we always check.
+	*/
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_read_ref(stack, refname, &ref);
+	if (err > 0) {
+		errno = ENOENT;
+		err = -1;
+		goto done;
+	}
+	if (err < 0) {
+		errno = reftable_error_to_errno(err);
+		err = -1;
+		goto done;
+	}
+
+	if (ref.value_type == REFTABLE_REF_SYMREF) {
+		strbuf_reset(referent);
+		strbuf_addstr(referent, ref.value.symref);
+		*type |= REF_ISSYMREF;
+	} else if (reftable_ref_record_val1(&ref) != NULL) {
+		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
+	} else {
+		*type |= REF_ISBROKEN;
+		errno = EINVAL;
+		err = -1;
+	}
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+struct ref_storage_be refs_be_reftable = {
+	&refs_be_files,
+	"reftable",
+	git_reftable_ref_store_create,
+	git_reftable_init_db,
+	git_reftable_transaction_prepare,
+	git_reftable_transaction_finish,
+	git_reftable_transaction_abort,
+	git_reftable_transaction_initial_commit,
+
+	git_reftable_pack_refs,
+	git_reftable_create_symref,
+	git_reftable_delete_refs,
+	git_reftable_rename_ref,
+	git_reftable_copy_ref,
+
+	git_reftable_ref_iterator_begin,
+	git_reftable_read_raw_ref,
+
+	git_reftable_reflog_iterator_begin,
+	git_reftable_for_each_reflog_ent_oldest_first,
+	git_reftable_for_each_reflog_ent_newest_first,
+	git_reftable_reflog_exists,
+	git_reftable_create_reflog,
+	git_reftable_delete_reflog,
+	git_reftable_reflog_expire,
+};
diff --git a/repository.c b/repository.c
index 87b355e7a659..e9f78496d338 100644
--- a/repository.c
+++ b/repository.c
@@ -174,6 +174,8 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	repo->ref_storage_format = xstrdup_or_null(format.ref_storage);
+
 	clear_repository_format(&format);
 	return 0;
 
diff --git a/repository.h b/repository.h
index b385ca3c94b6..8019a7d0a1fc 100644
--- a/repository.h
+++ b/repository.h
@@ -78,6 +78,9 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/* The format to use for the ref database. */
+	char *ref_storage_format;
+
 	/*
 	 * Contains path to often used file names.
 	 */
diff --git a/setup.c b/setup.c
index c04cd25a30df..c6b57efd0318 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		data->ref_storage = xstrdup(value);
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format)
 	string_list_clear(&format->v1_only_extensions, 0);
 	free(format->work_tree);
 	free(format->partial_clone);
+	free(format->ref_storage);
 	init_repository_format(format);
 }
 
@@ -1308,8 +1312,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			the_repository->ref_storage_format =
+				xstrdup_or_null(repo_fmt.ref_storage);
+		}
 	}
 
 	strbuf_release(&dir);
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
new file mode 100755
index 000000000000..ba0738e51561
--- /dev/null
+++ b/t/t0031-reftable.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable basics'
+
+. ./test-lib.sh
+
+INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+git_init () {
+	git init -b primary "$@"
+}
+
+initialize ()  {
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled
+}
+
+test_expect_success 'SHA256 support, env' '
+	rm -rf .git &&
+	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'SHA256 support, option' '
+	rm -rf .git &&
+	git_init --ref-storage=reftable --object-format=sha256 &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'delete ref' '
+	initialize &&
+	test_commit file &&
+	SHA=$(git show-ref -s --verify HEAD) &&
+	test_write_lines "$SHA refs/heads/primary" "$SHA refs/tags/file" >expect &&
+	git show-ref >actual &&
+	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
+	test_cmp expect actual &&
+	git update-ref -d refs/tags/file $SHA  &&
+	test_write_lines "$SHA refs/heads/primary" >expect &&
+	git show-ref >actual &&
+	test_cmp expect actual
+'
+
+
+test_expect_success 'clone calls transaction_initial_commit' '
+	test_commit message1 file1 &&
+	git clone . cloned &&
+	(test  -f cloned/file1 || echo "Fixme.")
+'
+
+test_expect_success 'basic operation of reftable storage: commit, show-ref' '
+	initialize &&
+	test_commit file &&
+	test_write_lines refs/heads/primary refs/tags/file >expect &&
+	git show-ref &&
+	git show-ref | cut -f2 -d" " >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'reflog, repack' '
+	initialize &&
+	for count in $(test_seq 1 10)
+	do
+		test_commit "number $count" file.t $count number-$count ||
+		return 1
+	done &&
+	git pack-refs &&
+	ls -1 .git/reftable >table-files &&
+	test_line_count = 2 table-files &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 10 output &&
+	grep "commit (initial): number 1" output &&
+	grep "commit: number 10" output &&
+	git gc &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 0 output
+'
+
+test_expect_success 'branch switch in reflog output' '
+	initialize &&
+	test_commit file1 &&
+	git checkout -b branch1 &&
+	test_commit file2 &&
+	git checkout -b branch2 &&
+	git switch - &&
+	git rev-parse --symbolic-full-name HEAD >actual &&
+	echo refs/heads/branch1 >expect &&
+	test_cmp actual expect
+'
+
+
+# This matches show-ref's output
+print_ref() {
+	echo "$(git rev-parse "$1") $1"
+}
+
+test_expect_success 'peeled tags are stored' '
+	initialize &&
+	test_commit file &&
+	git tag -m "annotated tag" test_tag HEAD &&
+	{
+		print_ref "refs/heads/primary" &&
+		print_ref "refs/tags/file" &&
+		print_ref "refs/tags/test_tag" &&
+		print_ref "refs/tags/test_tag^{}"
+	} >expect &&
+	git show-ref -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show-ref works on fresh repo' '
+	initialize &&
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	>expect &&
+	! git show-ref >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'checkout unborn branch' '
+	initialize &&
+	git checkout -b primary
+'
+
+
+test_expect_success 'dir/file conflict' '
+	initialize &&
+	test_commit file &&
+	! git branch primary/forbidden
+'
+
+
+test_expect_success 'do not clobber existing repo' '
+	rm -rf .git &&
+	git_init --ref-storage=files &&
+	cat .git/HEAD >expect &&
+	test_commit file &&
+	(git_init --ref-storage=reftable || true) &&
+	cat .git/HEAD >actual &&
+	test_cmp expect actual
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'pseudo refs' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git cherry-pick source &&
+	test -f file2
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'rebase' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git rebase source &&
+	test -f file2
+'
+
+test_expect_success 'worktrees' '
+	git_init --ref-storage=reftable start &&
+	(cd start && test_commit file1 && git checkout -b branch1 &&
+	git checkout -b branch2 &&
+	git worktree add  ../wt
+	) &&
+	cd wt &&
+	git checkout branch1 &&
+	git branch
+'
+
+test_expect_success 'worktrees 2' '
+	initialize &&
+	test_commit file1 &&
+	mkdir existing_empty &&
+	git worktree add --detach existing_empty primary
+'
+
+test_expect_success 'FETCH_HEAD' '
+	initialize &&
+	test_commit one &&
+	(git_init sub && cd sub && test_commit two) &&
+	git --git-dir sub/.git rev-parse HEAD >expect &&
+	git fetch sub &&
+	git checkout FETCH_HEAD &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh
index be12fb63506e..c6f783255633 100755
--- a/t/t1409-avoid-packing-refs.sh
+++ b/t/t1409-avoid-packing-refs.sh
@@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily'
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 # Add an identifying mark to the packed-refs file header line. This
 # shouldn't upset readers, and it should be omitted if the file is
 # ever rewritten.
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 5071ac63a5b5..16440d58d6bb 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success setup '
 	git config gc.auto 0 &&
 	git config i18n.commitencoding ISO-8859-1 &&
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 3b7cdc56ec84..667a23affe36 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -14,6 +14,12 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success 'enable reflogs' '
 	git config core.logallrefupdates true
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d3f6af6a6545..943aa9713961 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1481,6 +1481,11 @@ parisc* | hppa*)
 	;;
 esac
 
+if test -n "$GIT_TEST_REFTABLE"
+then
+  test_set_prereq REFTABLE
+fi
+
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 19/20] git-prompt: prepare for reftable refs backend
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (17 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 18/20] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 19:25           ` SZEDER Gábor via GitGitGadget
  2021-04-12 19:25           ` [PATCH v6 20/20] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
  20 siblings, 0 replies; 251+ messages in thread
From: SZEDER Gábor via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	SZEDER Gábor

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

In our git-prompt script we strive to use Bash builtins wherever
possible, because fork()-ing subshells for command substitutions and
fork()+exec()-ing Git commands are expensive on some platforms.  We
even read and parse '.git/HEAD' using Bash builtins to get the name of
the current branch [1].  However, the upcoming reftable refs backend
won't use '.git/HEAD' at all, but will write an invalid refname as
placeholder for backwards compatibility instead, which will break our
git-prompt script.

Update the git-prompt script to recognize the placeholder '.git/HEAD'
written by the reftable backend (its content is specified in the
reftable specs), and then fall back to use 'git symbolic-ref' to get
the name of the current branch.

[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current
    branch, 2011-03-31)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/completion/git-prompt.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 4640a1535d15..2e5a5d80271d 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -478,10 +478,15 @@ __git_ps1 ()
 			if ! __git_eread "$g/HEAD" head; then
 				return $exit
 			fi
-			# is it a symbolic ref?
 			b="${head#ref: }"
 			if [ "$head" = "$b" ]; then
 				detached=yes
+			elif [ "$b" = "refs/heads/.invalid" ]; then
+				# Reftable
+				b="$(git symbolic-ref HEAD 2>/dev/null)" ||
+				detached=yes
+			fi
+			if [ "$detached" = yes ]; then
 				b="$(
 				case "${GIT_PS1_DESCRIBE_STYLE-}" in
 				(contains)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v6 20/20] Add "test-tool dump-reftable" command.
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (18 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 19/20] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
@ 2021-04-12 19:25           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
  20 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-12 19:25 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys,
	Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  1 +
 reftable/dump.c          | 67 ++++++++++++++++++++--------------------
 t/helper/test-reftable.c |  5 +++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 5 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/Makefile b/Makefile
index 5890e7d1cbb2..870155d48b58 100644
--- a/Makefile
+++ b/Makefile
@@ -2420,6 +2420,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
diff --git a/reftable/dump.c b/reftable/dump.c
index 00b444e8c9b8..7d620a3cf0fe 100644
--- a/reftable/dump.c
+++ b/reftable/dump.c
@@ -12,18 +12,25 @@ license that can be found in the LICENSE file or at
 #include <unistd.h>
 #include <string.h>
 
-#include "reftable.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+#include "reftable-merged.h"
+#include "reftable-record.h"
 #include "reftable-tests.h"
+#include "reftable-writer.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-stack.h"
 
 static uint32_t hash_id;
 
 static int dump_table(const char *tablename)
 {
-	struct reftable_block_source src = { 0 };
+	struct reftable_block_source src = { NULL };
 	int err = reftable_block_source_from_file(&src, tablename);
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_reader *r = NULL;
 
 	if (err < 0)
@@ -49,7 +56,7 @@ static int dump_table(const char *tablename)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_reader_seek_log(r, &it, "");
 	if (err < 0) {
@@ -66,7 +73,7 @@ static int dump_table(const char *tablename)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_reader_free(r);
 	return 0;
@@ -75,7 +82,7 @@ static int dump_table(const char *tablename)
 static int compact_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
+	struct reftable_write_options cfg = { 0 };
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
 	if (err < 0)
@@ -94,10 +101,10 @@ static int compact_stack(const char *stackdir)
 static int dump_stack(const char *stackdir)
 {
 	struct reftable_stack *stack = NULL;
-	struct reftable_write_options cfg = {};
-	struct reftable_iterator it = { 0 };
-	struct reftable_ref_record ref = { 0 };
-	struct reftable_log_record log = { 0 };
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
 	struct reftable_merged_table *merged = NULL;
 
 	int err = reftable_new_stack(&stack, stackdir, cfg);
@@ -122,7 +129,7 @@ static int dump_stack(const char *stackdir)
 		reftable_ref_record_print(&ref, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_ref_record_clear(&ref);
+	reftable_ref_record_release(&ref);
 
 	err = reftable_merged_table_seek_log(merged, &it, "");
 	if (err < 0) {
@@ -139,7 +146,7 @@ static int dump_stack(const char *stackdir)
 		reftable_log_record_print(&log, hash_id);
 	}
 	reftable_iterator_destroy(&it);
-	reftable_log_record_clear(&log);
+	reftable_log_record_release(&log);
 
 	reftable_stack_destroy(stack);
 	return 0;
@@ -160,40 +167,34 @@ static void print_help(void)
 int reftable_dump_main(int argc, char *const *argv)
 {
 	int err = 0;
-	int opt;
 	int opt_dump_table = 0;
 	int opt_dump_stack = 0;
 	int opt_compact = 0;
-	const char *arg = NULL;
-	while ((opt = getopt(argc, argv, "2chts")) != -1) {
-		switch (opt) {
-		case '2':
-			hash_id = 0x73323536;
+	const char *arg = NULL, *argv0 = argv[0];
+
+	for (; argc > 1; argv++, argc--)
+		if (*argv[1] != '-')
 			break;
-		case 't':
+		else if (!strcmp("-2", argv[1]))
+			hash_id = 0x73323536;
+		else if (!strcmp("-t", argv[1]))
 			opt_dump_table = 1;
-			break;
-		case 's':
+		else if (!strcmp("-s", argv[1]))
 			opt_dump_stack = 1;
-			break;
-		case 'c':
+		else if (!strcmp("-c", argv[1]))
 			opt_compact = 1;
-			break;
-		case '?':
-		case 'h':
+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
 			print_help();
 			return 2;
-			break;
 		}
-	}
 
-	if (argv[optind] == NULL) {
+	if (argc != 2) {
 		fprintf(stderr, "need argument\n");
 		print_help();
 		return 2;
 	}
 
-	arg = argv[optind];
+	arg = argv[1];
 
 	if (opt_dump_table) {
 		err = dump_table(arg);
@@ -204,7 +205,7 @@ int reftable_dump_main(int argc, char *const *argv)
 	}
 
 	if (err < 0) {
-		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
 			reftable_error_str(err));
 		return 1;
 	}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 2efe48b23bbb..09e727721fe9 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -14,3 +14,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 814d2606f8ca..bf4958492386 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -57,6 +57,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 789c26f302dd..f195a7539cd2 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -18,6 +18,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 07/20] reftable: reading/writing blocks
  2021-04-12 19:25           ` [PATCH v6 07/20] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2021-04-12 21:40             ` Junio C Hamano
  2021-04-13  8:19             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-04-12 21:40 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

I am seeing this on the receiving end.  Thanks.

<stdin>:832: indent with spaces.
        left = *destLen;
<stdin>:833: indent with spaces.
        *destLen = 0;
<stdin>:836: indent with spaces.
        left = 1;
<stdin>:837: indent with spaces.
        dest = buf;
<stdin>:853: indent with spaces.
        if (stream.avail_out == 0) {
warning: squelched 13 whitespace errors
warning: 18 lines add whitespace errors.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 18/20] Reftable support for git-core
  2021-04-12 19:25           ` [PATCH v6 18/20] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-04-13  7:18             ` Ævar Arnfjörð Bjarmason
  2021-04-14 16:44               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13  7:18 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> It can be activated by passing --ref-storage=reftable to "git init", or setting
> GIT_TEST_REFTABLE in the environment.
> [...]
> +const char *default_ref_storage(void)
> +{
> +	const char *test = getenv("GIT_TEST_REFTABLE");
> +	return test ? "reftable" : "files";
> +}

Nit: use git_env_bool() here. So GIT_TEST_REFTABLE=true is true, but
GIT_TEST_REFTABLE= is not.

Not a nit: are these tests supposed to work / do they work for you? For
me there's a *lot* of failures, e.g. t6120-describe.sh fails for reasons
you might expect, manually tweaking things in the FS-refstore, but
e.g. t5510-fetch.sh segfaults.

t0001-init.sh fails with:

    git: refs/reftable-backend.c:194: git_reftable_init_db: Assertion `refs->worktree_reftable_dir == NULL' failed.

This is on an amd64 Debian w/Linux 4.19, so nothing exotic.

> [...re-arranged a bit...]
>
> +if test -n "$GIT_TEST_REFTABLE"
> +then
> +  test_set_prereq REFTABLE
> +fi
> +
>  ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
>  test -z "$NO_PERL" && test_set_prereq PERL
>  test -z "$NO_PTHREADS" && test_set_prereq PTHREADS

Ah, so here's the (seemingly underused) prereq, but:

> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
>  
>  . ./test-lib.sh
>  
> +if test_have_prereq REFTABLE
> +then
> +  skip_all='skipping tests; incompatible with reftable'
> +  test_done
> +fi
> +

I understand that getting this series to land has been a pain, but this
sort of thing doesn't inspire confidence.

Manually running it under reftable anyway I have 17 out of 81 tests
failing. A lot of tests won't run under reftable because they're of the
form:

	test_when_finished "remove_object $tag" &&
	echo $tag >.git/refs/tags/wrong &&
	test_when_finished "git update-ref -d refs/tags/wrong" &&

I.e. we have some hack to get around update-ref guarding things for us.

I would think that a better approach here would be to start with some
(per-se unrelated) series to teach update-ref some mode like
hash-object's --literally, i.e. "YOLO this ref update".

Then adjust our various tests that move, clobber, or intentionally
corrupt things in the refstore to use that helper/option.

At that point we can actually run things like t1450-fsck.sh, maybe we
simply have to skip some of them because we think the sort of corruption
we're testing/worried about is impossible with reftable, but that in
itself is *very* interesting.

I.e. do we not have to handle certain edge cases at all in fsck and
friends because of inherent properties of the reftable backend (e.g. I'd
assume, but don't know, that we're not likely to get one corrupt ref
entry with it), but of course we can have a ref pointing to a loose
object that then gets removed, and is thus corrupt.

Anyway. I'm speculating, but that's also my point, that I'm having to
speculate. That comes down to us having a new GIT_TEST_* mode that
doesn't pass the test suite._

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 02/20] reftable: add LICENSE
  2021-04-12 19:25           ` [PATCH v6 02/20] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2021-04-13  7:28             ` Ævar Arnfjörð Bjarmason
  2021-04-13 10:50               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13  7:28 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> TODO: relicense?

So to your[1]:

    > However, it is fundamentally blocked on someone willing to spend
    > some time reviewing the series; this seems to be out of my
    > control.

I for one find it hard to follow re-rolls of this when past discussion
isn't clarified/updated in commit messages/CLs.

So in particular in v3[2] there was a whole discussion about what what
this licence & external hosting of this codebase meant in practical
terms.

I vaguely recall that that was clarified in some way by you, but didn't
find the relevant E-Mail. Something like Google's lawyers said something
to the effect that this could just be added to git.git, no? Maybe that's
incorrect, I don't remember.

Anyway, we're now in v6, and we still have nothing but a "TODO:
relicense?" blurb here.

As noted in the past discussion I'm not interested in the
academic/theoretical side that license discussion, but the very
practical question of how this library is expected to be
integrated/maintained.

I.e. if it's meant to be 100% externally maintained and "code-dumped"
into git.git like we do with sha1collisiondetection/ that raises one set
of concerns, but if it's meant to be "eaten" by git.git that poses
another set of questions for the code review here. I.e. much of what
you're doing in later patches in this series is introducing things that
are redundant/odd if viewed if we're supposing that this code is meant
to live in git.git.

1. https://lore.kernel.org/git/CAFQ2z_P=MqT81gLjov6A471Z9sd69qQTep8KG8M8=LO9pJtkpQ@mail.gmail.com/
2. https://lore.kernel.org/git/CAFQ2z_OJQf3+b0TT6BmAV9q9G9c2icbLK0EqrEjpYmpi8g9Fsg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 04/20] reftable: utility functions
  2021-04-12 19:25           ` [PATCH v6 04/20] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2021-04-13  8:02             ` Ævar Arnfjörð Bjarmason
  2021-04-13 10:58               ` Han-Wen Nienhuys
  2021-04-13 13:14             ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13  8:02 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> +int strbuf_add_void(void *b, const void *data, size_t sz)
> +{
> +	strbuf_add((struct strbuf *)b, data, sz);
> +	return sz;
> +}

Is that cast needed on your compiler? This compiles without warnings for
me without that.

Also, maybe this is the sort of thing that makes sense to split into
general "APIs needed for reftable" patches. E.g. something like the
below (just the strbuf.h change):

diff --git a/reftable/merged_test.c b/reftable/merged_test.c
index 0c301cecced..e49029eed34 100644
--- a/reftable/merged_test.c
+++ b/reftable/merged_test.c
@@ -43,7 +43,7 @@ static void write_test_table(struct strbuf *buf,
 		}
 	}
 
-	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	w = reftable_new_writer(&strbuf_add_write, buf, &opts);
 	reftable_writer_set_limits(w, min, max);
 
 	for (i = 0; i < n; i++) {
@@ -241,7 +241,7 @@ static void test_default_write_opts(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 
 	struct reftable_ref_record rec = {
 		.refname = "master",
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
index 5e005d6af31..e8ecba1fad9 100644
--- a/reftable/refname_test.c
+++ b/reftable/refname_test.c
@@ -31,7 +31,7 @@ static void test_conflict(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 	struct reftable_ref_record rec = {
 		.refname = "a/b",
 		.value_type = REFTABLE_REF_SYMREF,
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 9d2f8d60555..bdef4813f4c 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -83,7 +83,7 @@ struct reftable_stats {
 
 /* reftable_new_writer creates a new writer */
 struct reftable_writer *
-reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    void *writer_arg, struct reftable_write_options *opts);
 
 /* Set the range of update indices for the records we will add. When writing a
diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c
index 69dbfb09fff..1685b9a07bc 100644
--- a/reftable/reftable_test.c
+++ b/reftable/reftable_test.c
@@ -52,7 +52,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 		.hash_id = hash_id,
 	};
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, buf, &opts);
+		reftable_new_writer(&strbuf_add_write, buf, &opts);
 	struct reftable_ref_record ref = { NULL };
 	int i = 0, n;
 	struct reftable_log_record log = { NULL };
@@ -131,7 +131,7 @@ static void test_log_buffer_size(void)
 						   .message = "commit: 9\n",
 					   } };
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 
 	/* This tests buffer extension for log compression. Must use a random
 	   hash, to ensure that the compressed part is larger than the original.
@@ -169,7 +169,7 @@ static void test_log_write_read(void)
 	struct reftable_block_source source = { NULL };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 	const struct reftable_stats *stats = NULL;
 	reftable_writer_set_limits(w, 0, N);
 	for (i = 0; i < N; i++) {
@@ -437,7 +437,7 @@ static void test_table_refs_for(int indexed)
 
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 
 	struct reftable_iterator it = { NULL };
 	int j;
@@ -534,7 +534,7 @@ static void test_table_empty(void)
 	struct reftable_write_options opts = { 0 };
 	struct strbuf buf = STRBUF_INIT;
 	struct reftable_writer *w =
-		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+		reftable_new_writer(&strbuf_add_write, &buf, &opts);
 	struct reftable_block_source source = { NULL };
 	struct reftable_reader *rd = NULL;
 	struct reftable_ref_record rec = { NULL };
diff --git a/reftable/stack.c b/reftable/stack.c
index 3cdb6e8ed33..2e3fd8db1dd 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -39,7 +39,7 @@ static void stack_filename(struct strbuf *dest, struct reftable_stack *st,
 	strbuf_addstr(dest, name);
 }
 
-static int reftable_fd_write(void *arg, const void *data, size_t sz)
+static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz)
 {
 	int *fdp = (int *)arg;
 	return write(*fdp, data, sz);
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
index a5ff4e2a2d2..e5e39fe8b13 100644
--- a/reftable/test_framework.c
+++ b/reftable/test_framework.c
@@ -15,9 +15,3 @@ void set_test_hash(uint8_t *p, int i)
 {
 	memset(p, (uint8_t)i, hash_size(SHA1_ID));
 }
-
-int strbuf_add_void(void *b, const void *data, size_t sz)
-{
-	strbuf_add((struct strbuf *)b, data, sz);
-	return sz;
-}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
index 5fdc9519a5a..c04925ea11d 100644
--- a/reftable/test_framework.h
+++ b/reftable/test_framework.h
@@ -46,8 +46,4 @@ license that can be found in the LICENSE file or at
 
 void set_test_hash(uint8_t *p, int i);
 
-/* Like strbuf_add, but suitable for passing to reftable_new_writer
- */
-int strbuf_add_void(void *b, const void *data, size_t sz);
-
 #endif
diff --git a/reftable/writer.c b/reftable/writer.c
index d42ca8afac1..ca3b127f83d 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -118,7 +118,7 @@ static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
 static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
 
 struct reftable_writer *
-reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
+reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
 		    void *writer_arg, struct reftable_write_options *opts)
 {
 	struct reftable_writer *wp =
diff --git a/reftable/writer.h b/reftable/writer.h
index 4921c249d06..09b88673d97 100644
--- a/reftable/writer.h
+++ b/reftable/writer.h
@@ -15,7 +15,7 @@ license that can be found in the LICENSE file or at
 #include "reftable-writer.h"
 
 struct reftable_writer {
-	int (*write)(void *, const void *, size_t);
+	ssize_t (*write)(void *, const void *, size_t);
 	void *write_arg;
 	int pending_padding;
 	struct strbuf last_key;
diff --git a/strbuf.h b/strbuf.h
index 223ee2094af..e3ae5b8a9d3 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -290,6 +290,18 @@ void strbuf_add_commented_lines(struct strbuf *out,
  */
 void strbuf_add(struct strbuf *sb, const void *data, size_t len);
 
+/**
+ * Like strbuf_add() but emulates write() for APIs that need
+ * it. Returns the passed-in `len` as-is. The `void *` is really a
+ * `struct strbuf *`.
+ */
+static inline ssize_t strbuf_add_write(void *sb, const void *data,
+				       size_t len)
+{
+	strbuf_add(sb, data, len);
+	return len;
+}
+
 /**
  * Add a NUL-terminated string to the buffer.
  *

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 07/20] reftable: reading/writing blocks
  2021-04-12 19:25           ` [PATCH v6 07/20] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
  2021-04-12 21:40             ` Junio C Hamano
@ 2021-04-13  8:19             ` Ævar Arnfjörð Bjarmason
  2021-04-15  8:57               ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13  8:19 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> Includes a code snippet copied from zlib
> [...]
> +int ZEXPORT uncompress_return_consumed (

Pending the "how is this integrated?" question I had in
http://lore.kernel.org/git/87fszuej8y.fsf@evledraar.gmail.com it's a bit
odd to have a "compat" we unconditionally compile.

Since this is for post-2017 zlib doesn't putting it in top-level
compat/* and having a flag to enable it make more sense, and then not
renaming the function.

Thus those who need it will have a symbol conflict when they upgrade
their zlib to not need the compat wrapper, but that's a feature.

> +    Bytef *dest,
> +    uLongf *destLen,
> +    const Bytef *source,

I see this is modified from upstream's which would trip our
DEVELOPER=*-only -Wold-style-definition under -Werror. We have other
disabling of flags for compat/* already, seems better to use the
upstream source file as-is with that.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 02/20] reftable: add LICENSE
  2021-04-13  7:28             ` Ævar Arnfjörð Bjarmason
@ 2021-04-13 10:50               ` Han-Wen Nienhuys
  2021-04-13 13:41                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-13 10:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 13, 2021 at 9:28 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>     > However, it is fundamentally blocked on someone willing to spend
>     > some time reviewing the series; this seems to be out of my
>     > control.
>
> I for one find it hard to follow re-rolls of this when past discussion
> isn't clarified/updated in commit messages/CLs.
>
> So in particular in v3[2] there was a whole discussion about what what
> this licence & external hosting of this codebase meant in practical
> terms.
>
> I vaguely recall that that was clarified in some way by you, but didn't
> find the relevant E-Mail. Something like Google's lawyers said something
> to the effect that this could just be added to git.git, no? Maybe that's
> incorrect, I don't remember.

You're right, I didn't update the thread. The outcome is that the
library for now lives in https://github.com/hanwen/reftable while it's
not integrated in git.git. In this repo I can accept contributions
(including fixup changes posted to the git list) without requiring
CLAs.

> I.e. if it's meant to be 100% externally maintained and "code-dumped"
> into git.git like we do with sha1collisiondetection/ that raises one set
> of concerns, but if it's meant to be "eaten" by git.git that poses
> another set of questions for the code review here. I.e. much of what
> you're doing in later patches in this series is introducing things that
> are redundant/odd if viewed if we're supposing that this code is meant
> to live in git.git.

The code is meant to be eaten by git.git, but be amenable to being
used by libgit2 with minimal modifications. To that end, it tries to
use just functionality offered by git-compat-util.

If you could point out specific oddities, that would be helpful.

As said earlier, the BSD license here is the most liberal I could
find, but I'm open to something else, if that makes integration in
both git and libgit2 possible. This was not discussed further, and
hence the "TODO: relicense?" comment.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 04/20] reftable: utility functions
  2021-04-13  8:02             ` Ævar Arnfjörð Bjarmason
@ 2021-04-13 10:58               ` Han-Wen Nienhuys
  2021-04-13 12:56                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-13 10:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 13, 2021 at 10:02 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:
>
> > +int strbuf_add_void(void *b, const void *data, size_t sz)
> > +{
> > +     strbuf_add((struct strbuf *)b, data, sz);
> > +     return sz;
> > +}
>
> Is that cast needed on your compiler? This compiles without warnings for
> me without that.

No! thanks.

> Also, maybe this is the sort of thing that makes sense to split into
> general "APIs needed for reftable" patches. E.g. something like the
> below (just the strbuf.h change):

SGTM.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 04/20] reftable: utility functions
  2021-04-13 10:58               ` Han-Wen Nienhuys
@ 2021-04-13 12:56                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13 12:56 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys


On Tue, Apr 13 2021, Han-Wen Nienhuys wrote:

> On Tue, Apr 13, 2021 at 10:02 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:
>>
>> > +int strbuf_add_void(void *b, const void *data, size_t sz)
>> > +{
>> > +     strbuf_add((struct strbuf *)b, data, sz);
>> > +     return sz;
>> > +}
>>
>> Is that cast needed on your compiler? This compiles without warnings for
>> me without that.
>
> No! thanks.

Some more, a combination of redundant casting and casting early to
another variable being easier to read. The latter being a matter of
style/opinion. Some more like that in the code:

    git grep -W '\(\(struct .*\*'

diff:

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 25d4d95b52b..91eb0b495f5 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -26,7 +26,7 @@ static void strbuf_close(void *b)
 static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
 			     uint32_t size)
 {
-	struct strbuf *b = (struct strbuf *)v;
+	struct strbuf *b = v;
 	assert(off + size <= b->len);
 	dest->data = reftable_calloc(size);
 	memcpy(dest->data, b->buf + off, size);
@@ -36,7 +36,8 @@ static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
 
 static uint64_t strbuf_size(void *b)
 {
-	return ((struct strbuf *)b)->len;
+	struct strbuf *buf = b;
+	return buf->len;
 }
 
 static struct reftable_block_source_vtable strbuf_vtable = {
@@ -91,10 +92,11 @@ static void file_return_block(void *b, struct reftable_block *dest)
 
 static void file_close(void *b)
 {
-	int fd = ((struct file_block_source *)b)->fd;
+	struct file_block_source *bs =  b;
+	int fd = bs->fd;
 	if (fd > 0) {
 		close(fd);
-		((struct file_block_source *)b)->fd = 0;
+		bs->fd = 0;
 	}
 
 	reftable_free(b);
@@ -103,7 +105,7 @@ static void file_close(void *b)
 static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
 			   uint32_t size)
 {
-	struct file_block_source *b = (struct file_block_source *)v;
+	struct file_block_source *b = v;
 	assert(off + size <= b->size);
 	dest->data = reftable_malloc(size);
 	if (pread(b->fd, dest->data, size, off) != size)

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 04/20] reftable: utility functions
  2021-04-12 19:25           ` [PATCH v6 04/20] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
  2021-04-13  8:02             ` Ævar Arnfjörð Bjarmason
@ 2021-04-13 13:14             ` Ævar Arnfjörð Bjarmason
  2021-04-15 15:00               ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13 13:14 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 12 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> diff --git a/reftable/system.h b/reftable/system.h
> new file mode 100644
> index 000000000000..07277ca06273
> --- /dev/null
> +++ b/reftable/system.h
> @@ -0,0 +1,32 @@
> +/*
> +Copyright 2020 Google LLC
> +
> +Use of this source code is governed by a BSD-style
> +license that can be found in the LICENSE file or at
> +https://developers.google.com/open-source/licenses/bsd
> +*/
> +
> +#ifndef SYSTEM_H
> +#define SYSTEM_H
> +
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +
> +#include <zlib.h>
> +
> +struct strbuf;
> +/* In git, this is declared in dir.h */
> +int remove_dir_recursively(struct strbuf *path, int flags);
> +
> +#define SHA1_ID 0x73686131
> +#define SHA256_ID 0x73323536
> +#define SHA1_SIZE 20
> +#define SHA256_SIZE 32
> +
> +/* This is uncompress2, which is only available in zlib as of 2017.
> + */
> +int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
> +			       const Bytef *source, uLong *sourceLen);
> +int hash_size(uint32_t id);

Related to the comment in
http://lore.kernel.org/git/87a6q2egvy.fsf@evledraar.gmail.com

I'd think that rather than duplicating magic constants & one thing in
dir.h we'd be better off having some leading patches splitting off the
relevant parts of object-file.c & dir.h, maybe THAT_NAME_minimal.h?

Or: why not simply include dir.h and object.h etc? The compiler/linker
will discard the unused functions, and if the worry is overuse of
git.git functions creeping in I'd think that would be better done via
some test/CI job that checks what objects were used.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 02/20] reftable: add LICENSE
  2021-04-13 10:50               ` Han-Wen Nienhuys
@ 2021-04-13 13:41                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-13 13:41 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys


On Tue, Apr 13 2021, Han-Wen Nienhuys wrote:

> On Tue, Apr 13, 2021 at 9:28 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>     > However, it is fundamentally blocked on someone willing to spend
>>     > some time reviewing the series; this seems to be out of my
>>     > control.
>>
>> I for one find it hard to follow re-rolls of this when past discussion
>> isn't clarified/updated in commit messages/CLs.
>>
>> So in particular in v3[2] there was a whole discussion about what what
>> this licence & external hosting of this codebase meant in practical
>> terms.
>>
>> I vaguely recall that that was clarified in some way by you, but didn't
>> find the relevant E-Mail. Something like Google's lawyers said something
>> to the effect that this could just be added to git.git, no? Maybe that's
>> incorrect, I don't remember.
>
> You're right, I didn't update the thread. The outcome is that the
> library for now lives in https://github.com/hanwen/reftable while it's
> not integrated in git.git. In this repo I can accept contributions
> (including fixup changes posted to the git list) without requiring
> CLAs.

And once this lands if there's patches to reftable/* on this ML?

Knowing would be good, a change to Documentation/SubmittingPatches et al
in the next iteration even better :)

>> I.e. if it's meant to be 100% externally maintained and "code-dumped"
>> into git.git like we do with sha1collisiondetection/ that raises one set
>> of concerns, but if it's meant to be "eaten" by git.git that poses
>> another set of questions for the code review here. I.e. much of what
>> you're doing in later patches in this series is introducing things that
>> are redundant/odd if viewed if we're supposing that this code is meant
>> to live in git.git.
>
> The code is meant to be eaten by git.git, but be amenable to being
> used by libgit2 with minimal modifications. To that end, it tries to
> use just functionality offered by git-compat-util.
>
> If you could point out specific oddities, that would be helpful.

Things like different coding style, e.g. $(git grep '[!=]= NULL' --
reftable/) & comment style.

Having Go-ish (or Google-ish?) *_test.c files instead of that living in
t/helper/*.

reftable/dump.c not using parse_options().

This having reftable_free() etc., which at least for in-tree use is
redundant, see 8d128513429 (grep/pcre2: actually make pcre2 use custom
allocator, 2021-02-18).

Wondering if internal libraries (e.g. in basic.c) are there to avoid
various kitchen-sink things existing in git.git itself, or because no
existing thing fit the purpose.

I'm not saying those things aren't OK or the "sorta-external" trade-off
needs to change, just that reviewing it not knowing what the expected
status is left me hanging.

> As said earlier, the BSD license here is the most liberal I could
> find, but I'm open to something else, if that makes integration in
> both git and libgit2 possible. This was not discussed further, and
> hence the "TODO: relicense?" comment.

I don't per-se have issues with the license, just questions about what
it means for the development flow.

Let's say Junio merges this to master, and someone discovers a bug in
it. Surely one of these happen:

 1. We just take the patch, if it's a tree-wide patch someone at
    reftable.git would need to either look at GPLv2 patch or clean-room
    duplicate it somehow.

 2. Junio refuses any suggested changes to the reftable/* directory, we
    tell people to submit an upstream PR, once that lands we update. So
    like sha1collisiondetection/ (well, sans the "Junio refusing", I
    think I un-forked that at least a couple of times).

One of those means that e.g. if the change is to a strbuf API reftable/*
needs we'd need to version that API or otherwise treat it as stable.

As an aside, to your [1] again: At least to me reading a series on-list
that has TODO comments in commit messages very much sounds like an RFC
but not calling itself that, maybe I'm alone in that, but if not
clearing this sort of thing up / documenting it might go far in getting
more reviews.

1. https://lore.kernel.org/git/CAFQ2z_P=MqT81gLjov6A471Z9sd69qQTep8KG8M8=LO9pJtkpQ@mail.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 18/20] Reftable support for git-core
  2021-04-13  7:18             ` Ævar Arnfjörð Bjarmason
@ 2021-04-14 16:44               ` Han-Wen Nienhuys
  2021-04-16 14:55                 ` Ævar Arnfjörð Bjarmason
  2021-04-16 18:47                 ` Junio C Hamano
  0 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-14 16:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 13, 2021 at 9:18 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> > It can be activated by passing --ref-storage=reftable to "git init", or setting
> > GIT_TEST_REFTABLE in the environment.
> > [...]
> > +const char *default_ref_storage(void)
> > +{
> > +     const char *test = getenv("GIT_TEST_REFTABLE");
> > +     return test ? "reftable" : "files";
> > +}
>
> Nit: use git_env_bool() here. So GIT_TEST_REFTABLE=true is true, but
> GIT_TEST_REFTABLE= is not.

thanks, adapted this.

> Not a nit: are these tests supposed to work / do they work for you? For
> me there's a *lot* of failures, e.g. t6120-describe.sh fails for reasons
> you might expect, manually tweaking things in the FS-refstore, but
> e.g. t5510-fetch.sh segfaults.

I'll have a look; segfaults should not happen, of course.

> > +if test_have_prereq REFTABLE
> > +then
> > +  skip_all='skipping tests; incompatible with reftable'
> > +  test_done
> > +fi
> > +
>
> I understand that getting this series to land has been a pain, but this
> sort of thing doesn't inspire confidence.
>
> Manually running it under reftable anyway I have 17 out of 81 tests
> failing. A lot of tests won't run under reftable because they're of the
> form:
>
>         test_when_finished "remove_object $tag" &&
>         echo $tag >.git/refs/tags/wrong &&
>         test_when_finished "git update-ref -d refs/tags/wrong" &&
>
> I.e. we have some hack to get around update-ref guarding things for us.
>
> I would think that a better approach here would be to start with some
> (per-se unrelated) series to teach update-ref some mode like
> hash-object's --literally, i.e. "YOLO this ref update".

I disagree.  I think this would be a job better suited to a
test-helper. Then we don't put tools into users' hands that
potentially corrupt the repository. I don't understand why hash-object
--literally is not a test helper either.

> Then adjust our various tests that move, clobber, or intentionally
> corrupt things in the refstore to use that helper/option.
>
> At that point we can actually run things like t1450-fsck.sh, maybe we
> simply have to skip some of them because we think the sort of corruption
> we're testing/worried about is impossible with reftable, but that in
> itself is *very* interesting.

One reason for skipping some files wholesale, is that quite a few
tests are not isolated in their test functions, so skipping an
individual test functions affects follow-on tests. I can't remember if
t1450 is one of them, though.

> I.e. do we not have to handle certain edge cases at all in fsck and
> friends because of inherent properties of the reftable backend (e.g. I'd
> assume, but don't know, that we're not likely to get one corrupt ref
> entry with it), but of course we can have a ref pointing to a loose
> object that then gets removed, and is thus corrupt.
>
> Anyway. I'm speculating, but that's also my point, that I'm having to
> speculate. That comes down to us having a new GIT_TEST_* mode that
> doesn't pass the test suite._

The speculation is warranted, but there are currently about 80 files
that have some sort of test failure, and some of these are for quite
common workflows. I would rather focus on those before looking at
tests like fsck. Functional failures may have far reaching
implications on the code structure (example: worktrees considerably
complicated both the reftable library and its git-core glue), while
corruption issues if they occur, will be due to localized bugs.

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 07/20] reftable: reading/writing blocks
  2021-04-13  8:19             ` Ævar Arnfjörð Bjarmason
@ 2021-04-15  8:57               ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-15  8:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 13, 2021 at 10:19 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> > Includes a code snippet copied from zlib
> > [...]
> > +int ZEXPORT uncompress_return_consumed (
>
> Pending the "how is this integrated?" question I had in
> http://lore.kernel.org/git/87fszuej8y.fsf@evledraar.gmail.com it's a bit
> odd to have a "compat" we unconditionally compile.
>
> Since this is for post-2017 zlib doesn't putting it in top-level
> compat/* and having a flag to enable it make more sense, and then not
> renaming the function.

Thanks. I've moved this into compat/ for the next version of the series

(how should I track this info. Is the email here fine, or do I somehow
have to work this into the commit message?)

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 04/20] reftable: utility functions
  2021-04-13 13:14             ` Ævar Arnfjörð Bjarmason
@ 2021-04-15 15:00               ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-15 15:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 13, 2021 at 3:14 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I'd think that rather than duplicating magic constants & one thing in
> dir.h we'd be better off having some leading patches splitting off the
> relevant parts of object-file.c & dir.h, maybe THAT_NAME_minimal.h?
>
> Or: why not simply include dir.h and object.h etc? The compiler/linker
> will discard the unused functions, and if the worry is overuse of
> git.git functions creeping in I'd think that would be better done via
> some test/CI job that checks what objects were used.

It was useful to see how much (or how little) of Git I needed to make
it work, but it has served its purpose, so I followed your suggestion
to use dir.h and object.h.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 18/20] Reftable support for git-core
  2021-04-14 16:44               ` Han-Wen Nienhuys
@ 2021-04-16 14:55                 ` Ævar Arnfjörð Bjarmason
  2021-04-16 18:47                 ` Junio C Hamano
  1 sibling, 0 replies; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-16 14:55 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys


On Wed, Apr 14 2021, Han-Wen Nienhuys wrote:

> On Tue, Apr 13, 2021 at 9:18 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> [...]
>> I understand that getting this series to land has been a pain, but this
>> sort of thing doesn't inspire confidence.
>>
>> Manually running it under reftable anyway I have 17 out of 81 tests
>> failing. A lot of tests won't run under reftable because they're of the
>> form:
>>
>>         test_when_finished "remove_object $tag" &&
>>         echo $tag >.git/refs/tags/wrong &&
>>         test_when_finished "git update-ref -d refs/tags/wrong" &&
>>
>> I.e. we have some hack to get around update-ref guarding things for us.
>>
>> I would think that a better approach here would be to start with some
>> (per-se unrelated) series to teach update-ref some mode like
>> hash-object's --literally, i.e. "YOLO this ref update".
>
> I disagree.  I think this would be a job better suited to a
> test-helper. Then we don't put tools into users' hands that
> potentially corrupt the repository. I don't understand why hash-object
> --literally is not a test helper either.

I don't feel too strongly either way, and in any case that's a
discussion for any such mass test update series, and separate from the
reftable series per-se.

I do lean on the side of shipping a thing like that as part of our
plumbing, of course with very prominent warnings, just as we do
--literally.

It is useful for not only our test suite, but also for others to
reproduce issues locally with the ref and object store going out of
sync.

Until now there hasn't been much of a need in practice, the file backend
format is trivially understood and you can just echo a bad value to a
file.

But if I or anyone else encounters some Heisenbug in the wild where we
suspect the ref and object store went out of syncy, and we're using
reftable, it would be very handy to test that with an "update-ref
--literally", instead of having to locally recompile git, or otherwise
munge the opaque reftable format.

>> Then adjust our various tests that move, clobber, or intentionally
>> corrupt things in the refstore to use that helper/option.
>>
>> At that point we can actually run things like t1450-fsck.sh, maybe we
>> simply have to skip some of them because we think the sort of corruption
>> we're testing/worried about is impossible with reftable, but that in
>> itself is *very* interesting.
>
> One reason for skipping some files wholesale, is that quite a few
> tests are not isolated in their test functions, so skipping an
> individual test functions affects follow-on tests. I can't remember if
> t1450 is one of them, though.

[...]

>> I.e. do we not have to handle certain edge cases at all in fsck and
>> friends because of inherent properties of the reftable backend (e.g. I'd
>> assume, but don't know, that we're not likely to get one corrupt ref
>> entry with it), but of course we can have a ref pointing to a loose
>> object that then gets removed, and is thus corrupt.
>>
>> Anyway. I'm speculating, but that's also my point, that I'm having to
>> speculate. That comes down to us having a new GIT_TEST_* mode that
>> doesn't pass the test suite._
>
> The speculation is warranted, but there are currently about 80 files
> that have some sort of test failure, and some of these are for quite
> common workflows. I would rather focus on those before looking at
> tests like fsck. Functional failures may have far reaching
> implications on the code structure (example: worktrees considerably
> complicated both the reftable library and its git-core glue), while
> corruption issues if they occur, will be due to localized bugs.

I'm not sure from reading this if/where we disagree, so let's enumerate
and see where we land point-by-point:

 1. The test suite must pass 100% for any new GIT_TEST_X mode as it
    lands.

 2. It must do so in a meaningful way (i.e. not skipping entire files,
    if some/many of the tests are relevant / stress the codepath
    involved)

 3. #2 should be there as the series lands.

My comments on that:

 1: I feel strongly that #1 is an absolute prerequisite for integrating
    this or any other major core surgery.

    E.g. now we can't do any meaningful cross-platform tests for this to
    see how it breaks on our various supported platforms.

    We're also in V6 of this and apparently finding an easily
    discoverable segfault now, if v7 of this had a GIT_TEST_REFTABLE in
    the relevant ci/* code ...

    I realize that that's hard, e.g. we had to do large refactoring for
    all of the commit-graph, protocol v2, sha256 etc. modes, but it
    really pays off. E.g. for protocol v2 upload-pack not paying
    attention to hideRefs was discovered that way.

 2-3: I do think we should have this too, but would be willing to be
      convinced otherwise. E.g. we have a "gcov" target, once we have #1
      passing do we se that under GIT_TEST_REFTABLE=1 we have the same
      (or trivially explainable) test coverage, or are we simply
      skipping a large part of the fsck etc. code (some/much of which
      will have to do with refs).

      Even more than test coverage I think this is interesting to see
      what we *don't* cover, as I noted upthread. I.e. can we skip large
      parts of some sanity checking "if (reftable)"? I suspect so, but
      without the coverage we won't be able to tell.

I realize the above would be a lot of work, probably a >50 patch series
to t/* if previous major GIT_TEST_* refactorings are any indication.

But if we agree that's a viable way forward that's not all work that
you'd have to do.


^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v6 18/20] Reftable support for git-core
  2021-04-14 16:44               ` Han-Wen Nienhuys
  2021-04-16 14:55                 ` Ævar Arnfjörð Bjarmason
@ 2021-04-16 18:47                 ` Junio C Hamano
  1 sibling, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-04-16 18:47 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Ævar Arnfjörð Bjarmason,
	Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

>> I would think that a better approach here would be to start with some
>> (per-se unrelated) series to teach update-ref some mode like
>> hash-object's --literally, i.e. "YOLO this ref update".
>
> I disagree.  I think this would be a job better suited to a
> test-helper. Then we don't put tools into users' hands that
> potentially corrupt the repository. I don't understand why hash-object
> --literally is not a test helper either.

As the person who invented the "--literally" option, I'd have to
agree with this assessment.  It does make life a little bit easier
for those who hack on Git codebase and reimplementations of it, but
little practical value for those who use Git every day [*].

If it were a tool to _dump_ the contents of a possibly corrupt
object that the "--literally" option would have produced to make it
easier to see by humans, I might be pursuaded to say that such a
feature may be better in end-user accessible subcommands.  But the
reason why we invented "--literally" was specifically to create
corrupt objects in test repository to see end-user accessible tools
behave, and in hindsight, such a use case shouldn't have been a good
justification to add the option.

I wasn't following the discussion of this particular patch closely,
so I do not know what is being discussed is a tool on the diagnosis
side (for which I may say it is OK to give to end-usres) or on the
currupting side (for which I would regret to add to end-user tools),
but hopefully the above would help guiding the decision between the
two.

Thanks.


[Footnote]

* ... for any purpose other than creating a corrupt repository,
  asking somebody who is or claims to be a Git expert to figure out
  what is wrong in his or her repository, and either waste expert's
  time or have fun by watching the expert fail and gets embarrased,
  that is ;-)

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v7 00/28] reftable library
  2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
                             ` (19 preceding siblings ...)
  2021-04-12 19:25           ` [PATCH v6 20/20] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37           ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status Han-Wen Nienhuys via GitGitGadget
                               ` (28 more replies)
  20 siblings, 29 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys

This splits the giant commit from
https://github.com/gitgitgadget/git/pull/539 into a series of smaller
commits, which build and have unittests.

Changes relative to last series: version 12 Apr 2021:

 * AEvar's comments: zlib into compat/, void* casts, use implicit pointer
   <-> boolean conversions.
 * Plug memory leak in indexed reverse lookup.
 * Take SHA1/SHA256 constants from git headers.
 * Fix branch rename (branch -M)
 * Fix stash (support reflog deletion.)
 * Move debug printing logic out of dump.c
 * Fix reflog --dry-run expire.
 * Implement branch copy (branch -c)
 * Fix deletion of 0 branches
 * Fix transactions without REF_HAVE_OLD
 * Add requirement REFFILES for test cases that are specific to the files
   backend.
 * Fixes for various test files.

The commits up to "hash.h: provide constants for the hash IDs" should be
good to merge to 'next'.

There are several test fixups, but I've put them in another series because
GGG enforces max 30 commits.

Due to using uncompress2, this build fails on the Linux32 builder; what is
the magic incantation to run autoconf? There are other unexplained failures
in t0031-reftable.sh on CI that I haven't been able to look into. The test
passes for me, locally.

Han-Wen Nienhuys (27):
  refs: ref_iterator_peel returns boolean, rather than peel_status
  refs: document reflog_expire_fn's flag argument
  refs/debug: trace into reflog expiry too
  hash.h: provide constants for the hash IDs
  init-db: set the_repository->hash_algo early on
  reftable: add LICENSE
  reftable: add error related functionality
  reftable: utility functions
  reftable: add blocksource, an abstraction for random access reads
  reftable: (de)serialization for the polymorphic record type.
  Provide zlib's uncompress2 from compat/zlib-compat.c
  reftable: reading/writing blocks
  reftable: a generic binary tree implementation
  reftable: write reftable files
  reftable: generic interface to tables
  reftable: read reftable files
  reftable: reftable file level tests
  reftable: add a heap-based priority queue for reftable records
  reftable: add merged table view
  reftable: implement refname validation
  reftable: implement stack, a mutable database of reftable files.
  reftable: add dump utility
  Reftable support for git-core
  Add "test-tool dump-reftable" command.
  t1301: document what needs to be done for REFTABLE
  t1401,t2011: parameterize HEAD.lock for REFTABLE
  t1404: annotate test cases with REFFILES

SZEDER Gábor (1):
  git-prompt: prepare for reftable refs backend

 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |   54 +-
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   76 +-
 builtin/stash.c                               |    8 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 compat/.gitattributes                         |    1 +
 compat/zlib-uncompress2.c                     |   92 +
 config.mak.uname                              |    2 +-
 configure.ac                                  |   13 +
 contrib/buildsystems/CMakeLists.txt           |   14 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 contrib/completion/git-prompt.sh              |    7 +-
 git-compat-util.h                             |    4 +
 hash.h                                        |    6 +
 object-file.c                                 |    6 +-
 refs.c                                        |   28 +-
 refs.h                                        |    3 +
 refs/debug.c                                  |   47 +-
 refs/ref-cache.c                              |    2 +-
 refs/refs-internal.h                          |    8 +
 refs/reftable-backend.c                       | 1616 +++++++++++++++++
 reftable/LICENSE                              |   31 +
 reftable/basics.c                             |  128 ++
 reftable/basics.h                             |   60 +
 reftable/basics_test.c                        |   98 +
 reftable/block.c                              |  446 +++++
 reftable/block.h                              |  127 ++
 reftable/block_test.c                         |  121 ++
 reftable/blocksource.c                        |  148 ++
 reftable/blocksource.h                        |   22 +
 reftable/constants.h                          |   21 +
 reftable/dump.c                               |  100 +
 reftable/error.c                              |   41 +
 reftable/generic.c                            |  169 ++
 reftable/generic.h                            |   32 +
 reftable/iter.c                               |  194 ++
 reftable/iter.h                               |   69 +
 reftable/merged.c                             |  362 ++++
 reftable/merged.h                             |   35 +
 reftable/merged_test.c                        |  292 +++
 reftable/pq.c                                 |  115 ++
 reftable/pq.h                                 |   32 +
 reftable/pq_test.c                            |   72 +
 reftable/publicbasics.c                       |   58 +
 reftable/reader.c                             |  794 ++++++++
 reftable/reader.h                             |   66 +
 reftable/readwrite_test.c                     |  621 +++++++
 reftable/record.c                             | 1202 ++++++++++++
 reftable/record.h                             |  139 ++
 reftable/record_test.c                        |  407 +++++
 reftable/refname.c                            |  209 +++
 reftable/refname.h                            |   29 +
 reftable/refname_test.c                       |  102 ++
 reftable/reftable-blocksource.h               |   49 +
 reftable/reftable-error.h                     |   62 +
 reftable/reftable-generic.h                   |   47 +
 reftable/reftable-iterator.h                  |   39 +
 reftable/reftable-malloc.h                    |   18 +
 reftable/reftable-merged.h                    |   72 +
 reftable/reftable-reader.h                    |  101 ++
 reftable/reftable-record.h                    |  114 ++
 reftable/reftable-stack.h                     |  128 ++
 reftable/reftable-tests.h                     |   23 +
 reftable/reftable-writer.h                    |  147 ++
 reftable/reftable.c                           |  115 ++
 reftable/stack.c                              | 1395 ++++++++++++++
 reftable/stack.h                              |   41 +
 reftable/stack_test.c                         |  940 ++++++++++
 reftable/system.h                             |   24 +
 reftable/test_framework.c                     |   23 +
 reftable/test_framework.h                     |   53 +
 reftable/tree.c                               |   63 +
 reftable/tree.h                               |   34 +
 reftable/tree_test.c                          |   61 +
 reftable/writer.c                             |  690 +++++++
 reftable/writer.h                             |   50 +
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/helper/test-reftable.c                      |   21 +
 t/helper/test-tool.c                          |    4 +-
 t/helper/test-tool.h                          |    2 +
 t/t0031-reftable.sh                           |  271 +++
 t/t0032-reftable-unittest.sh                  |   15 +
 t/t1301-shared-repo.sh                        |    8 +-
 t/t1401-symbolic-ref.sh                       |   11 +-
 t/t1404-update-ref-errors.sh                  |   86 +-
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t2011-checkout-invalid-head.sh              |   11 +-
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 95 files changed, 13065 insertions(+), 86 deletions(-)
 create mode 100644 compat/.gitattributes
 create mode 100644 compat/zlib-uncompress2.c
 create mode 100644 refs/reftable-backend.c
 create mode 100644 reftable/LICENSE
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/dump.c
 create mode 100644 reftable/error.c
 create mode 100644 reftable/generic.c
 create mode 100644 reftable/generic.h
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/pq_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/readwrite_test.c
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c
 create mode 100644 reftable/reftable-blocksource.h
 create mode 100644 reftable/reftable-error.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-merged.h
 create mode 100644 reftable/reftable-reader.h
 create mode 100644 reftable/reftable-record.h
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/reftable.c
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0031-reftable.sh
 create mode 100755 t/t0032-reftable-unittest.sh


base-commit: 54a391711554ed41b4b0792cfef004abc74893bd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-847%2Fhanwen%2Flibreftable-v7
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-847/hanwen/libreftable-v7
Pull-Request: https://github.com/git/git/pull/847

Range-diff vs v6:

  -:  ------------ >  1:  8103a80394ae refs: ref_iterator_peel returns boolean, rather than peel_status
  -:  ------------ >  2:  554bb1ac3aed refs: document reflog_expire_fn's flag argument
  -:  ------------ >  3:  9ae5ddff6aed refs/debug: trace into reflog expiry too
  -:  ------------ >  4:  f0216ae20b69 hash.h: provide constants for the hash IDs
  1:  15b8306fbe4b =  5:  fa64aa9ce7c1 init-db: set the_repository->hash_algo early on
  2:  02bd1f40fa4e =  6:  8a960fb77fa9 reftable: add LICENSE
  3:  bffa33c012ad =  7:  5177919a3fd2 reftable: add error related functionality
  4:  df8003cb9a7d !  8:  a24e9e661616 reftable: utility functions
     @@ reftable/basics.c (new)
      +void free_names(char **a)
      +{
      +	char **p;
     -+	if (a == NULL) {
     ++	if (!a) {
      +		return;
      +	}
      +	for (p = a; *p; p++) {
     @@ reftable/basics_test.c (new)
      +
      +static int binsearch_func(size_t i, void *void_args)
      +{
     -+	struct binsearch_args *args = (struct binsearch_args *)void_args;
     ++	struct binsearch_args *args = void_args;
      +
      +	return args->key < args->arr[i];
      +}
     @@ reftable/basics_test.c (new)
      +	parse_names(in, strlen(in), &out);
      +	EXPECT(!strcmp(out[0], "a"));
      +	EXPECT(!strcmp(out[1], "b"));
     -+	EXPECT(out[2] == NULL);
     ++	EXPECT(!out[2]);
      +	free_names(out);
      +}
      +
     @@ reftable/basics_test.c (new)
      +	char **out = NULL;
      +	parse_names(in, strlen(in), &out);
      +	EXPECT(!strcmp(out[0], "a"));
     -+	EXPECT(out[1] == NULL);
     ++	EXPECT(!out[1]);
      +	free_names(out);
      +}
      +
     @@ reftable/publicbasics.c (new)
      +{
      +	switch (id) {
      +	case 0:
     -+	case SHA1_ID:
     -+		return SHA1_SIZE;
     -+	case SHA256_ID:
     -+		return SHA256_SIZE;
     ++	case GIT_SHA1_HASH_ID:
     ++		return GIT_SHA1_RAWSZ;
     ++	case GIT_SHA256_HASH_ID:
     ++		return GIT_SHA256_RAWSZ;
      +	}
      +	abort();
      +}
     @@ reftable/system.h (new)
      +#ifndef SYSTEM_H
      +#define SYSTEM_H
      +
     ++// This header glues the reftable library to the rest of Git
     ++
      +#include "git-compat-util.h"
      +#include "strbuf.h"
     ++#include "hash.h" /* hash ID, sizes.*/
     ++#include "dir.h" /* remove_dir_recursively, for tests.*/
      +
      +#include <zlib.h>
      +
      +struct strbuf;
     -+/* In git, this is declared in dir.h */
     -+int remove_dir_recursively(struct strbuf *path, int flags);
     -+
     -+#define SHA1_ID 0x73686131
     -+#define SHA256_ID 0x73323536
     -+#define SHA1_SIZE 20
     -+#define SHA256_SIZE 32
     -+
     -+/* This is uncompress2, which is only available in zlib as of 2017.
     -+ */
     -+int uncompress_return_consumed(Bytef *dest, uLongf *destLen,
     -+			       const Bytef *source, uLong *sourceLen);
      +int hash_size(uint32_t id);
      +
      +#endif
     @@ reftable/test_framework.c (new)
      +
      +void set_test_hash(uint8_t *p, int i)
      +{
     -+	memset(p, (uint8_t)i, hash_size(SHA1_ID));
     ++	memset(p, (uint8_t)i, hash_size(GIT_SHA1_HASH_ID));
      +}
      +
     -+int strbuf_add_void(void *b, const void *data, size_t sz)
     ++ssize_t strbuf_add_void(void *b, const void *data, size_t sz)
      +{
     -+	strbuf_add((struct strbuf *)b, data, sz);
     ++	strbuf_add(b, data, sz);
      +	return sz;
      +}
      
     @@ reftable/test_framework.h (new)
      +
      +/* Like strbuf_add, but suitable for passing to reftable_new_writer
      + */
     -+int strbuf_add_void(void *b, const void *data, size_t sz);
     ++ssize_t strbuf_add_void(void *b, const void *data, size_t sz);
      +
      +#endif
      
  5:  6807af53e9a0 !  9:  5d54e6fac99b reftable: add blocksource, an abstraction for random access reads
     @@ reftable/blocksource.c (new)
      +static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
      +			     uint32_t size)
      +{
     -+	struct strbuf *b = (struct strbuf *)v;
     ++	struct strbuf *b = v;
      +	assert(off + size <= b->len);
      +	dest->data = reftable_calloc(size);
      +	memcpy(dest->data, b->buf + off, size);
     @@ reftable/blocksource.c (new)
      +void block_source_from_strbuf(struct reftable_block_source *bs,
      +			      struct strbuf *buf)
      +{
     -+	assert(bs->ops == NULL);
     ++	assert(!bs->ops);
      +	bs->ops = &strbuf_vtable;
      +	bs->arg = buf;
      +}
     @@ reftable/blocksource.c (new)
      +static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
      +			   uint32_t size)
      +{
     -+	struct file_block_source *b = (struct file_block_source *)v;
     ++	struct file_block_source *b = v;
      +	assert(off + size <= b->size);
      +	dest->data = reftable_malloc(size);
      +	if (pread(b->fd, dest->data, size, off) != size)
     @@ reftable/blocksource.c (new)
      +	p->size = st.st_size;
      +	p->fd = fd;
      +
     -+	assert(bs->ops == NULL);
     ++	assert(!bs->ops);
      +	bs->ops = &file_vtable;
      +	bs->arg = p;
      +	return 0;
  6:  78a352c96f82 ! 10:  028dd13e701b reftable: (de)serialization for the polymorphic record type.
     @@ reftable/record.c (new)
      +static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
      +					  int hash_size)
      +{
     -+	struct reftable_ref_record *ref = (struct reftable_ref_record *)rec;
     -+	struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec;
     ++	struct reftable_ref_record *ref = rec;
     ++	const struct reftable_ref_record *src = src_rec;
      +	assert(hash_size > 0);
      +
      +	/* This is simple and correct, but we could probably reuse the hash
      +	 * fields. */
      +	reftable_ref_record_release(ref);
     -+	if (src->refname != NULL) {
     ++	if (src->refname) {
      +		ref->refname = xstrdup(src->refname);
      +	}
      +	ref->update_index = src->update_index;
     @@ reftable/record.c (new)
      +static void hex_format(char *dest, uint8_t *src, int hash_size)
      +{
      +	assert(hash_size > 0);
     -+	if (src != NULL) {
     ++	if (src) {
      +		int i = 0;
      +		for (i = 0; i < hash_size; i++) {
      +			dest[2 * i] = hexdigit(src[i] >> 4);
     @@ reftable/record.c (new)
      +void reftable_ref_record_print(struct reftable_ref_record *ref,
      +			       uint32_t hash_id)
      +{
     -+	char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */
     ++	char hex[2 * GIT_SHA256_RAWSZ + 1] = { 0 }; /* BUG */
      +	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
      +	switch (ref->value_type) {
      +	case REFTABLE_REF_SYMREF:
     @@ reftable/record.c (new)
      +
      +static void reftable_ref_record_release_void(void *rec)
      +{
     -+	reftable_ref_record_release((struct reftable_ref_record *)rec);
     ++	reftable_ref_record_release(rec);
      +}
      +
      +void reftable_ref_record_release(struct reftable_ref_record *ref)
     @@ reftable/record.c (new)
      +				      uint8_t val_type, struct string_view in,
      +				      int hash_size)
      +{
     -+	struct reftable_ref_record *r = (struct reftable_ref_record *)rec;
     ++	struct reftable_ref_record *r = rec;
      +	struct string_view start = in;
      +	uint64_t update_index = 0;
      +	int n = get_var_int(&update_index, &in);
     @@ reftable/record.c (new)
      +
      +static void reftable_obj_record_release(void *rec)
      +{
     -+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
     ++	struct reftable_obj_record *obj = rec;
      +	FREE_AND_NULL(obj->hash_prefix);
      +	FREE_AND_NULL(obj->offsets);
      +	memset(obj, 0, sizeof(struct reftable_obj_record));
     @@ reftable/record.c (new)
      +static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
      +					  int hash_size)
      +{
     -+	struct reftable_obj_record *obj = (struct reftable_obj_record *)rec;
     ++	struct reftable_obj_record *obj = rec;
      +	const struct reftable_obj_record *src =
      +		(const struct reftable_obj_record *)src_rec;
      +	int olen;
     @@ reftable/record.c (new)
      +
      +static uint8_t reftable_obj_record_val_type(const void *rec)
      +{
     -+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
     ++	const struct reftable_obj_record *r = rec;
      +	if (r->offset_len > 0 && r->offset_len < 8)
      +		return r->offset_len;
      +	return 0;
     @@ reftable/record.c (new)
      +static int reftable_obj_record_encode(const void *rec, struct string_view s,
      +				      int hash_size)
      +{
     -+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
     ++	const struct reftable_obj_record *r = rec;
      +	struct string_view start = s;
      +	int i = 0;
      +	int n = 0;
     @@ reftable/record.c (new)
      +				      int hash_size)
      +{
      +	struct string_view start = in;
     -+	struct reftable_obj_record *r = (struct reftable_obj_record *)rec;
     ++	struct reftable_obj_record *r = rec;
      +	uint64_t count = val_type;
      +	int n = 0;
      +	uint64_t last;
     @@ reftable/record.c (new)
      +void reftable_log_record_print(struct reftable_log_record *log,
      +			       uint32_t hash_id)
      +{
     -+	char hex[SHA256_SIZE + 1] = { 0 };
     ++	char hex[GIT_SHA256_RAWSZ + 1] = { 0 };
      +
      +	switch (log->value_type) {
      +	case REFTABLE_LOG_DELETION:
     @@ reftable/record.c (new)
      +static void reftable_log_record_copy_from(void *rec, const void *src_rec,
      +					  int hash_size)
      +{
     -+	struct reftable_log_record *dst = (struct reftable_log_record *)rec;
     ++	struct reftable_log_record *dst = rec;
      +	const struct reftable_log_record *src =
      +		(const struct reftable_log_record *)src_rec;
      +
      +	reftable_log_record_release(dst);
      +	*dst = *src;
     -+	if (dst->refname != NULL) {
     ++	if (dst->refname) {
      +		dst->refname = xstrdup(dst->refname);
      +	}
      +	switch (dst->value_type) {
      +	case REFTABLE_LOG_DELETION:
      +		break;
      +	case REFTABLE_LOG_UPDATE:
     -+		if (dst->update.email != NULL) {
     ++		if (dst->update.email) {
      +			dst->update.email = xstrdup(dst->update.email);
      +		}
     -+		if (dst->update.name != NULL) {
     ++		if (dst->update.name) {
      +			dst->update.name = xstrdup(dst->update.name);
      +		}
     -+		if (dst->update.message != NULL) {
     ++		if (dst->update.message) {
      +			dst->update.message = xstrdup(dst->update.message);
      +		}
      +
     -+		if (dst->update.new_hash != NULL) {
     ++		if (dst->update.new_hash) {
      +			dst->update.new_hash = reftable_malloc(hash_size);
      +			memcpy(dst->update.new_hash, src->update.new_hash,
      +			       hash_size);
      +		}
     -+		if (dst->update.old_hash != NULL) {
     ++		if (dst->update.old_hash) {
      +			dst->update.old_hash = reftable_malloc(hash_size);
      +			memcpy(dst->update.old_hash, src->update.old_hash,
      +			       hash_size);
     @@ reftable/record.c (new)
      +
      +static void reftable_log_record_release_void(void *rec)
      +{
     -+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
     ++	struct reftable_log_record *r = rec;
      +	reftable_log_record_release(r);
      +}
      +
     @@ reftable/record.c (new)
      +	return reftable_log_record_is_deletion(log) ? 0 : 1;
      +}
      +
     -+static uint8_t zero[SHA256_SIZE] = { 0 };
     ++static uint8_t zero[GIT_SHA256_RAWSZ] = { 0 };
      +
      +static int reftable_log_record_encode(const void *rec, struct string_view s,
      +				      int hash_size)
      +{
     -+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
     ++	const struct reftable_log_record *r = rec;
      +	struct string_view start = s;
      +	int n = 0;
      +	uint8_t *oldh = NULL;
     @@ reftable/record.c (new)
      +
      +	oldh = r->update.old_hash;
      +	newh = r->update.new_hash;
     -+	if (oldh == NULL) {
     ++	if (!oldh) {
      +		oldh = zero;
      +	}
     -+	if (newh == NULL) {
     ++	if (!newh) {
      +		newh = zero;
      +	}
      +
     @@ reftable/record.c (new)
      +				      int hash_size)
      +{
      +	struct string_view start = in;
     -+	struct reftable_log_record *r = (struct reftable_log_record *)rec;
     ++	struct reftable_log_record *r = rec;
      +	uint64_t max = 0;
      +	uint64_t ts = 0;
      +	struct strbuf dest = STRBUF_INIT;
     @@ reftable/record.c (new)
      +static int null_streq(char *a, char *b)
      +{
      +	char *empty = "";
     -+	if (a == NULL)
     ++	if (!a)
      +		a = empty;
      +
     -+	if (b == NULL)
     ++	if (!b)
      +		b = empty;
      +
      +	return 0 == strcmp(a, b);
     @@ reftable/record.c (new)
      +
      +static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
      +{
     -+	if (a == NULL)
     ++	if (!a)
      +		a = zero;
      +
     -+	if (b == NULL)
     ++	if (!b)
      +		b = zero;
      +
      +	return !memcmp(a, b, sz);
     @@ reftable/record.c (new)
      +
      +static void reftable_index_record_key(const void *r, struct strbuf *dest)
      +{
     -+	struct reftable_index_record *rec = (struct reftable_index_record *)r;
     ++	const struct reftable_index_record *rec = r;
      +	strbuf_reset(dest);
      +	strbuf_addbuf(dest, &rec->last_key);
      +}
     @@ reftable/record.c (new)
      +static void reftable_index_record_copy_from(void *rec, const void *src_rec,
      +					    int hash_size)
      +{
     -+	struct reftable_index_record *dst = (struct reftable_index_record *)rec;
     -+	struct reftable_index_record *src =
     -+		(struct reftable_index_record *)src_rec;
     ++	struct reftable_index_record *dst = rec;
     ++	const struct reftable_index_record *src = src_rec;
      +
      +	strbuf_reset(&dst->last_key);
      +	strbuf_addbuf(&dst->last_key, &src->last_key);
     @@ reftable/record.c (new)
      +
      +static void reftable_index_record_release(void *rec)
      +{
     -+	struct reftable_index_record *idx = (struct reftable_index_record *)rec;
     ++	struct reftable_index_record *idx = rec;
      +	strbuf_release(&idx->last_key);
      +}
      +
     @@ reftable/record.c (new)
      +					int hash_size)
      +{
      +	struct string_view start = in;
     -+	struct reftable_index_record *r = (struct reftable_index_record *)rec;
     ++	struct reftable_index_record *r = rec;
      +	int n = 0;
      +
      +	strbuf_reset(&r->last_key);
     @@ reftable/record.c (new)
      +void reftable_record_from_ref(struct reftable_record *rec,
      +			      struct reftable_ref_record *ref_rec)
      +{
     -+	assert(rec->ops == NULL);
     ++	assert(!rec->ops);
      +	rec->data = ref_rec;
      +	rec->ops = &reftable_ref_record_vtable;
      +}
     @@ reftable/record.c (new)
      +void reftable_record_from_obj(struct reftable_record *rec,
      +			      struct reftable_obj_record *obj_rec)
      +{
     -+	assert(rec->ops == NULL);
     ++	assert(!rec->ops);
      +	rec->data = obj_rec;
      +	rec->ops = &reftable_obj_record_vtable;
      +}
     @@ reftable/record.c (new)
      +void reftable_record_from_index(struct reftable_record *rec,
      +				struct reftable_index_record *index_rec)
      +{
     -+	assert(rec->ops == NULL);
     ++	assert(!rec->ops);
      +	rec->data = index_rec;
      +	rec->ops = &reftable_index_record_vtable;
      +}
     @@ reftable/record.c (new)
      +void reftable_record_from_log(struct reftable_record *rec,
      +			      struct reftable_log_record *log_rec)
      +{
     -+	assert(rec->ops == NULL);
     ++	assert(!rec->ops);
      +	rec->data = log_rec;
      +	rec->ops = &reftable_log_record_vtable;
      +}
     @@ reftable/record.c (new)
      +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
      +{
      +	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
     -+	return (struct reftable_ref_record *)rec->data;
     ++	return rec->data;
      +}
      +
      +struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
      +{
      +	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
     -+	return (struct reftable_log_record *)rec->data;
     ++	return rec->data;
      +}
      +
      +static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
      +{
     -+	if (a != NULL && b != NULL)
     ++	if (a && b)
      +		return !memcmp(a, b, hash_size);
      +
      +	return a == b;
     @@ reftable/record.c (new)
      +
      +int reftable_log_record_compare_key(const void *a, const void *b)
      +{
     -+	struct reftable_log_record *la = (struct reftable_log_record *)a;
     -+	struct reftable_log_record *lb = (struct reftable_log_record *)b;
     ++	const struct reftable_log_record *la = a;
     ++	const struct reftable_log_record *lb = b;
      +
      +	int cmp = strcmp(la->refname, lb->refname);
      +	if (cmp)
     @@ reftable/record_test.c (new)
      +{
      +	struct reftable_record copy =
      +		reftable_new_record(reftable_record_type(rec));
     -+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
     ++	reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ);
      +	/* do it twice to catch memory leaks */
     -+	reftable_record_copy_from(&copy, rec, SHA1_SIZE);
     ++	reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ);
      +	switch (reftable_record_type(&copy)) {
      +	case BLOCK_TYPE_REF:
      +		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
      +						 reftable_record_as_ref(rec),
     -+						 SHA1_SIZE));
     ++						 GIT_SHA1_RAWSZ));
      +		break;
      +	case BLOCK_TYPE_LOG:
      +		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
      +						 reftable_record_as_log(rec),
     -+						 SHA1_SIZE));
     ++						 GIT_SHA1_RAWSZ));
      +		break;
      +	}
      +	reftable_record_destroy(&copy);
     @@ reftable/record_test.c (new)
      +static void set_hash(uint8_t *h, int j)
      +{
      +	int i = 0;
     -+	for (i = 0; i < hash_size(SHA1_ID); i++) {
     ++	for (i = 0; i < hash_size(GIT_SHA1_HASH_ID); i++) {
      +		h[i] = (j >> i) & 0xff;
      +	}
      +}
     @@ reftable/record_test.c (new)
      +		case REFTABLE_REF_DELETION:
      +			break;
      +		case REFTABLE_REF_VAL1:
     -+			in.value.val1 = reftable_malloc(SHA1_SIZE);
     ++			in.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
      +			set_hash(in.value.val1, 1);
      +			break;
      +		case REFTABLE_REF_VAL2:
     -+			in.value.val2.value = reftable_malloc(SHA1_SIZE);
     ++			in.value.val2.value = reftable_malloc(GIT_SHA1_RAWSZ);
      +			set_hash(in.value.val2.value, 1);
     -+			in.value.val2.target_value = reftable_malloc(SHA1_SIZE);
     ++			in.value.val2.target_value =
     ++				reftable_malloc(GIT_SHA1_RAWSZ);
      +			set_hash(in.value.val2.target_value, 2);
      +			break;
      +		case REFTABLE_REF_SYMREF:
     @@ reftable/record_test.c (new)
      +		EXPECT(reftable_record_val_type(&rec) == i);
      +
      +		reftable_record_key(&rec, &key);
     -+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     ++		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
      +		EXPECT(n > 0);
      +
      +		/* decode into a non-zero reftable_record to test for leaks. */
      +
      +		reftable_record_from_ref(&rec_out, &out);
     -+		m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE);
     ++		m = reftable_record_decode(&rec_out, key, i, dest,
     ++					   GIT_SHA1_RAWSZ);
      +		EXPECT(n == m);
      +
     -+		EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE));
     ++		EXPECT(reftable_ref_record_equal(&in, &out, GIT_SHA1_RAWSZ));
      +		reftable_record_release(&rec_out);
      +
      +		strbuf_release(&key);
     @@ reftable/record_test.c (new)
      +		}
      +	};
      +
     -+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
     ++	EXPECT(!reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ));
      +	in[1].update_index = in[0].update_index;
     -+	EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE));
     ++	EXPECT(reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ));
      +	reftable_log_record_release(&in[0]);
      +	reftable_log_record_release(&in[1]);
      +}
     @@ reftable/record_test.c (new)
      +			.update_index = 42,
      +			.value_type = REFTABLE_LOG_UPDATE,
      +			.update = {
     -+				.old_hash = reftable_malloc(SHA1_SIZE),
     -+				.new_hash = reftable_malloc(SHA1_SIZE),
     ++				.old_hash = reftable_malloc(GIT_SHA1_RAWSZ),
     ++				.new_hash = reftable_malloc(GIT_SHA1_RAWSZ),
      +				.name = xstrdup("han-wen"),
      +				.email = xstrdup("hanwen@google.com"),
      +				.message = xstrdup("test"),
     @@ reftable/record_test.c (new)
      +			.refname = xstrdup("old name"),
      +			.value_type = REFTABLE_LOG_UPDATE,
      +			.update = {
     -+				.new_hash = reftable_calloc(SHA1_SIZE),
     -+				.old_hash = reftable_calloc(SHA1_SIZE),
     ++				.new_hash = reftable_calloc(GIT_SHA1_RAWSZ),
     ++				.old_hash = reftable_calloc(GIT_SHA1_RAWSZ),
      +				.name = xstrdup("old name"),
      +				.email = xstrdup("old@email"),
      +				.message = xstrdup("old message"),
     @@ reftable/record_test.c (new)
      +
      +		reftable_record_key(&rec, &key);
      +
     -+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     ++		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
      +		EXPECT(n >= 0);
      +		reftable_record_from_log(&rec_out, &out);
      +		valtype = reftable_record_val_type(&rec);
      +		m = reftable_record_decode(&rec_out, key, valtype, dest,
     -+					   SHA1_SIZE);
     ++					   GIT_SHA1_RAWSZ);
      +		EXPECT(n == m);
      +
     -+		EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE));
     ++		EXPECT(reftable_log_record_equal(&in[i], &out, GIT_SHA1_RAWSZ));
      +		reftable_log_record_release(&in[i]);
      +		strbuf_release(&key);
      +		reftable_record_release(&rec_out);
     @@ reftable/record_test.c (new)
      +
      +static void test_reftable_obj_record_roundtrip(void)
      +{
     -+	uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 };
     ++	uint8_t testHash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 4, 0 };
      +	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
      +	struct reftable_obj_record recs[3] = { {
      +						       .hash_prefix = testHash1,
     @@ reftable/record_test.c (new)
      +		reftable_record_from_obj(&rec, &in);
      +		test_copy(&rec);
      +		reftable_record_key(&rec, &key);
     -+		n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     ++		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
      +		EXPECT(n > 0);
      +		extra = reftable_record_val_type(&rec);
      +		reftable_record_from_obj(&rec_out, &out);
      +		m = reftable_record_decode(&rec_out, key, extra, dest,
     -+					   SHA1_SIZE);
     ++					   GIT_SHA1_RAWSZ);
      +		EXPECT(n == m);
      +
      +		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
     @@ reftable/record_test.c (new)
      +	test_copy(&rec);
      +
      +	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
     -+	n = reftable_record_encode(&rec, dest, SHA1_SIZE);
     ++	n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
      +	EXPECT(n > 0);
      +
      +	extra = reftable_record_val_type(&rec);
      +	reftable_record_from_index(&out_rec, &out);
     -+	m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE);
     ++	m = reftable_record_decode(&out_rec, key, extra, dest, GIT_SHA1_RAWSZ);
      +	EXPECT(m == n);
      +
      +	EXPECT(in.offset == out.offset);
  -:  ------------ > 11:  0c2604d7e7d4 Provide zlib's uncompress2 from compat/zlib-compat.c
  7:  9297b9c363f6 ! 12:  425cd0185aab reftable: reading/writing blocks
     @@ Commit message
      
          This commit provides the logic to read and write these blocks.
      
     -    Includes a code snippet copied from zlib
     -
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     @@ Makefile: xdiff-objs: $(XDIFF_OBJS)
       REFTABLE_OBJS += reftable/blocksource.o
       REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/record.o
     -+REFTABLE_OBJS += reftable/zlib-compat.o
       
      +REFTABLE_TEST_OBJS += reftable/block_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
       REFTABLE_TEST_OBJS += reftable/basics_test.o
      
     - ## reftable/.gitattributes (new) ##
     -@@
     -+/zlib-compat.c	whitespace=-indent-with-non-tab,-trailing-space
     -
       ## reftable/block.c (new) ##
      @@
      +/*
     @@ reftable/block.c (new)
      +#include "record.h"
      +#include "reftable-error.h"
      +#include "system.h"
     -+#include "zlib.h"
     ++#include <zlib.h>
     ++
     ++#ifdef NO_UNCOMPRESS2
     ++/* This is uncompress2, which is only available in zlib as of 2017.
     ++ */
     ++int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source,
     ++		uLong *sourceLen);
     ++#endif
      +
      +int header_size(int version)
      +{
     @@ reftable/block.c (new)
      +		memcpy(uncompressed, block->data, block_header_skip);
      +
      +		/* Uncompress */
     -+		if (Z_OK != uncompress_return_consumed(
     -+				    uncompressed + block_header_skip, &dst_len,
     -+				    block->data + block_header_skip,
     -+				    &src_len)) {
     ++		if (Z_OK !=
     ++		    uncompress2(uncompressed + block_header_skip, &dst_len,
     ++				block->data + block_header_skip, &src_len)) {
      +			reftable_free(uncompressed);
      +			return REFTABLE_ZLIB_ERROR;
      +		}
     @@ reftable/block.c (new)
      +
      +static int restart_key_less(size_t idx, void *args)
      +{
     -+	struct restart_find_args *a = (struct restart_find_args *)args;
     ++	struct restart_find_args *a = args;
      +	uint32_t off = block_reader_restart_offset(a->r, idx);
      +	struct string_view in = {
      +		.buf = a->r->block.data + off,
     @@ reftable/block.c (new)
      +void reftable_block_done(struct reftable_block *blockp)
      +{
      +	struct reftable_block_source source = blockp->source;
     -+	if (blockp != NULL && source.ops != NULL)
     ++	if (blockp && source.ops)
      +		source.ops->return_block(source.arg, blockp);
      +	blockp->data = NULL;
      +	blockp->len = 0;
     @@ reftable/block_test.c (new)
      +	block.len = block_size;
      +	block.source = malloc_block_source();
      +	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
     -+			  header_off, hash_size(SHA1_ID));
     ++			  header_off, hash_size(GIT_SHA1_HASH_ID));
      +	reftable_record_from_ref(&rec, &ref);
      +
      +	for (i = 0; i < N; i++) {
      +		char name[100];
     -+		uint8_t hash[SHA1_SIZE];
     ++		uint8_t hash[GIT_SHA1_RAWSZ];
      +		snprintf(name, sizeof(name), "branch%02d", i);
      +		memset(hash, i, sizeof(hash));
      +
     @@ reftable/block_test.c (new)
      +
      +	block_writer_release(&bw);
      +
     -+	block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE);
     ++	block_reader_init(&br, &block, header_off, block_size, GIT_SHA1_RAWSZ);
      +
      +	block_reader_start(&br, &it);
      +
     @@ reftable/block_test.c (new)
      +{
      +	RUN_TEST(test_block_read_write);
      +	return 0;
     -+}
     -
     - ## reftable/zlib-compat.c (new) ##
     -@@
     -+/* taken from zlib's uncompr.c
     -+
     -+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
     -+   Author: Mark Adler <madler@alumni.caltech.edu>
     -+   Date:   Sun Jan 15 09:18:46 2017 -0800
     -+
     -+       zlib 1.2.11
     -+
     -+*/
     -+
     -+/*
     -+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
     -+ * For conditions of distribution and use, see copyright notice in zlib.h
     -+ */
     -+
     -+#include "system.h"
     -+
     -+/* clang-format off */
     -+
     -+/* ===========================================================================
     -+     Decompresses the source buffer into the destination buffer.  *sourceLen is
     -+   the byte length of the source buffer. Upon entry, *destLen is the total size
     -+   of the destination buffer, which must be large enough to hold the entire
     -+   uncompressed data. (The size of the uncompressed data must have been saved
     -+   previously by the compressor and transmitted to the decompressor by some
     -+   mechanism outside the scope of this compression library.) Upon exit,
     -+   *destLen is the size of the decompressed data and *sourceLen is the number
     -+   of source bytes consumed. Upon return, source + *sourceLen points to the
     -+   first unused input byte.
     -+
     -+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
     -+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
     -+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
     -+   an incomplete zlib stream.
     -+*/
     -+int ZEXPORT uncompress_return_consumed (
     -+    Bytef *dest,
     -+    uLongf *destLen,
     -+    const Bytef *source,
     -+    uLong *sourceLen) {
     -+    z_stream stream;
     -+    int err;
     -+    const uInt max = (uInt)-1;
     -+    uLong len, left;
     -+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
     -+
     -+    len = *sourceLen;
     -+    if (*destLen) {
     -+        left = *destLen;
     -+        *destLen = 0;
     -+    }
     -+    else {
     -+        left = 1;
     -+        dest = buf;
     -+    }
     -+
     -+    stream.next_in = (z_const Bytef *)source;
     -+    stream.avail_in = 0;
     -+    stream.zalloc = (alloc_func)0;
     -+    stream.zfree = (free_func)0;
     -+    stream.opaque = (voidpf)0;
     -+
     -+    err = inflateInit(&stream);
     -+    if (err != Z_OK) return err;
     -+
     -+    stream.next_out = dest;
     -+    stream.avail_out = 0;
     -+
     -+    do {
     -+        if (stream.avail_out == 0) {
     -+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
     -+            left -= stream.avail_out;
     -+        }
     -+        if (stream.avail_in == 0) {
     -+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
     -+            len -= stream.avail_in;
     -+        }
     -+        err = inflate(&stream, Z_NO_FLUSH);
     -+    } while (err == Z_OK);
     -+
     -+    *sourceLen -= len + stream.avail_in;
     -+    if (dest != buf)
     -+        *destLen = stream.total_out;
     -+    else if (stream.total_out && err == Z_BUF_ERROR)
     -+        left = 1;
     -+
     -+    inflateEnd(&stream);
     -+    return err == Z_STREAM_END ? Z_OK :
     -+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
     -+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
     -+           err;
      +}
      
       ## t/helper/test-reftable.c ##
  8:  ceddefadd48c ! 13:  c03957db45c0 reftable: a generic binary tree implementation
     @@ Makefile: REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/record.o
      +REFTABLE_OBJS += reftable/tree.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
       
      +REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
     @@ reftable/tree.c (new)
      +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
      +		void *arg)
      +{
     -+	if (t->left != NULL) {
     ++	if (t->left) {
      +		infix_walk(t->left, action, arg);
      +	}
      +	action(arg, t->key);
     -+	if (t->right != NULL) {
     ++	if (t->right) {
      +		infix_walk(t->right, action, arg);
      +	}
      +}
     @@ reftable/tree.c (new)
      +	if (t == NULL) {
      +		return;
      +	}
     -+	if (t->left != NULL) {
     ++	if (t->left) {
      +		tree_free(t->left);
      +	}
     -+	if (t->right != NULL) {
     ++	if (t->right) {
      +		tree_free(t->right);
      +	}
      +	reftable_free(t);
     @@ reftable/tree_test.c (new)
      +
      +static void check_increasing(void *arg, void *key)
      +{
     -+	struct curry *c = (struct curry *)arg;
     -+	if (c->last != NULL) {
     ++	struct curry *c = arg;
     ++	if (c->last) {
      +		assert(test_compare(c->last, key) < 0);
      +	}
      +	c->last = key;
  9:  1f635e6c86c5 ! 14:  922deb272fff reftable: write reftable files
     @@ Makefile: REFTABLE_OBJS += reftable/blocksource.o
       REFTABLE_OBJS += reftable/record.o
       REFTABLE_OBJS += reftable/tree.o
      +REFTABLE_OBJS += reftable/writer.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
       
       REFTABLE_TEST_OBJS += reftable/basics_test.o
     + REFTABLE_TEST_OBJS += reftable/block_test.o
      
       ## reftable/reftable-writer.h (new) ##
      @@
     @@ reftable/reftable-writer.h (new)
      +
      +/* reftable_new_writer creates a new writer */
      +struct reftable_writer *
     -+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
     ++reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
      +		    void *writer_arg, struct reftable_write_options *opts);
      +
      +/* Set the range of update indices for the records we will add. When writing a
     @@ reftable/writer.c (new)
      +	}
      +
      +	if (opts->hash_id == 0) {
     -+		opts->hash_id = SHA1_ID;
     ++		opts->hash_id = GIT_SHA1_HASH_ID;
      +	}
      +	if (opts->block_size == 0) {
      +		opts->block_size = DEFAULT_BLOCK_SIZE;
     @@ reftable/writer.c (new)
      +
      +static int writer_version(struct reftable_writer *w)
      +{
     -+	return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2;
     ++	return (w->opts.hash_id == 0 || w->opts.hash_id == GIT_SHA1_HASH_ID) ?
     ++			     1 :
     ++			     2;
      +}
      +
      +static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
     @@ reftable/writer.c (new)
      +static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
      +
      +struct reftable_writer *
     -+reftable_new_writer(int (*writer_func)(void *, const void *, size_t),
     ++reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
      +		    void *writer_arg, struct reftable_write_options *opts)
      +{
      +	struct reftable_writer *wp =
     @@ reftable/writer.c (new)
      +static int writer_add_record(struct reftable_writer *w,
      +			     struct reftable_record *rec)
      +{
     -+	int result = -1;
      +	struct strbuf key = STRBUF_INIT;
     -+	int err = 0;
     ++	int err = -1;
      +	reftable_record_key(rec, &key);
     -+	if (strbuf_cmp(&w->last_key, &key) >= 0)
     ++	if (strbuf_cmp(&w->last_key, &key) >= 0) {
     ++		err = REFTABLE_API_ERROR;
      +		goto done;
     ++	}
      +
      +	strbuf_reset(&w->last_key);
      +	strbuf_addbuf(&w->last_key, &key);
     @@ reftable/writer.c (new)
      +	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
      +
      +	if (block_writer_add(w->block_writer, rec) == 0) {
     -+		result = 0;
     ++		err = 0;
      +		goto done;
      +	}
      +
      +	err = writer_flush_block(w);
      +	if (err < 0) {
     -+		result = err;
      +		goto done;
      +	}
      +
      +	writer_reinit_block_writer(w, reftable_record_type(rec));
      +	err = block_writer_add(w->block_writer, rec);
      +	if (err < 0) {
     -+		result = err;
      +		goto done;
      +	}
      +
     -+	result = 0;
     ++	err = 0;
      +done:
      +	strbuf_release(&key);
     -+	return result;
     ++	return err;
      +}
      +
      +int reftable_writer_add_ref(struct reftable_writer *w,
     @@ reftable/writer.c (new)
      +	if (err < 0)
      +		return err;
      +
     -+	if (!w->opts.skip_index_objects &&
     -+	    reftable_ref_record_val1(ref) != NULL) {
     ++	if (!w->opts.skip_index_objects && reftable_ref_record_val1(ref)) {
      +		struct strbuf h = STRBUF_INIT;
      +		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
      +			   hash_size(w->opts.hash_id));
     @@ reftable/writer.c (new)
      +		strbuf_release(&h);
      +	}
      +
     -+	if (!w->opts.skip_index_objects &&
     -+	    reftable_ref_record_val2(ref) != NULL) {
     ++	if (!w->opts.skip_index_objects && reftable_ref_record_val2(ref)) {
      +		struct strbuf h = STRBUF_INIT;
      +		strbuf_add(&h, reftable_ref_record_val2(ref),
      +			   hash_size(w->opts.hash_id));
     @@ reftable/writer.c (new)
      +					    struct reftable_log_record *log)
      +{
      +	struct reftable_record rec = { NULL };
     -+	if (w->block_writer != NULL &&
     ++	if (w->block_writer &&
      +	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
      +		int err = writer_finish_public_section(w);
      +		if (err < 0)
     @@ reftable/writer.c (new)
      +		return REFTABLE_API_ERROR;
      +
      +	input_log_message = log->update.message;
     -+	if (!w->opts.exact_log_message && log->update.message != NULL) {
     ++	if (!w->opts.exact_log_message && log->update.message) {
      +		strbuf_addstr(&cleaned_message, log->update.message);
      +		while (cleaned_message.len &&
      +		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
     @@ reftable/writer.c (new)
      +
      +static void update_common(void *void_arg, void *key)
      +{
     -+	struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg;
     -+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
     -+	if (arg->last != NULL) {
     ++	struct common_prefix_arg *arg = void_arg;
     ++	struct obj_index_tree_node *entry = key;
     ++	if (arg->last) {
      +		int n = common_prefix_size(&entry->hash, arg->last);
      +		if (n > arg->max) {
      +			arg->max = n;
     @@ reftable/writer.c (new)
      +
      +static void write_object_record(void *void_arg, void *key)
      +{
     -+	struct write_record_arg *arg = (struct write_record_arg *)void_arg;
     -+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
     ++	struct write_record_arg *arg = void_arg;
     ++	struct obj_index_tree_node *entry = key;
      +	struct reftable_obj_record obj_rec = {
      +		.hash_prefix = (uint8_t *)entry->hash.buf,
      +		.hash_prefix_len = arg->w->stats.object_id_len,
     @@ reftable/writer.c (new)
      +
      +static void object_record_free(void *void_arg, void *key)
      +{
     -+	struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key;
     ++	struct obj_index_tree_node *entry = key;
      +
      +	FREE_AND_NULL(entry->offsets);
      +	strbuf_release(&entry->hash);
     @@ reftable/writer.c (new)
      +{
      +	struct write_record_arg closure = { .w = w };
      +	struct common_prefix_arg common = { NULL };
     -+	if (w->obj_index_tree != NULL) {
     ++	if (w->obj_index_tree) {
      +		infix_walk(w->obj_index_tree, &update_common, &common);
      +	}
      +	w->stats.object_id_len = common.max + 1;
      +
      +	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
      +
     -+	if (w->obj_index_tree != NULL) {
     ++	if (w->obj_index_tree) {
      +		infix_walk(w->obj_index_tree, &write_object_record, &closure);
      +	}
      +
     @@ reftable/writer.c (new)
      +			return err;
      +	}
      +
     -+	if (w->obj_index_tree != NULL) {
     ++	if (w->obj_index_tree) {
      +		infix_walk(w->obj_index_tree, &object_record_free, NULL);
      +		tree_free(w->obj_index_tree);
      +		w->obj_index_tree = NULL;
     @@ reftable/writer.h (new)
      +#include "reftable-writer.h"
      +
      +struct reftable_writer {
     -+	int (*write)(void *, const void *, size_t);
     ++	ssize_t (*write)(void *, const void *, size_t);
      +	void *write_arg;
      +	int pending_padding;
      +	struct strbuf last_key;
 10:  9c3fd77fbbac ! 15:  afca8f8f6a29 reftable: generic interface to tables
     @@ Makefile: REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/blocksource.o
       REFTABLE_OBJS += reftable/publicbasics.o
       REFTABLE_OBJS += reftable/record.o
     -+REFTABLE_OBJS += reftable/reftable.o
     ++REFTABLE_OBJS += reftable/refname.o
     ++REFTABLE_OBJS += reftable/generic.o
     ++REFTABLE_OBJS += reftable/stack.o
       REFTABLE_OBJS += reftable/tree.o
       REFTABLE_OBJS += reftable/writer.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
     + 
     +
     + ## reftable/generic.c (new) ##
     +@@
     ++/*
     ++Copyright 2020 Google LLC
     ++
     ++Use of this source code is governed by a BSD-style
     ++license that can be found in the LICENSE file or at
     ++https://developers.google.com/open-source/licenses/bsd
     ++*/
     ++
     ++#include "basics.h"
     ++#include "record.h"
     ++#include "generic.h"
     ++#include "reftable-iterator.h"
     ++#include "reftable-generic.h"
     ++
     ++int reftable_table_seek_ref(struct reftable_table *tab,
     ++			    struct reftable_iterator *it, const char *name)
     ++{
     ++	struct reftable_ref_record ref = {
     ++		.refname = (char *)name,
     ++	};
     ++	struct reftable_record rec = { NULL };
     ++	reftable_record_from_ref(&rec, &ref);
     ++	return tab->ops->seek_record(tab->table_arg, it, &rec);
     ++}
     ++
     ++int reftable_table_seek_log(struct reftable_table *tab,
     ++			    struct reftable_iterator *it, const char *name)
     ++{
     ++	struct reftable_log_record log = {
     ++		.refname = (char *)name,
     ++		.update_index = ~((uint64_t)0),
     ++	};
     ++	struct reftable_record rec = { NULL };
     ++	reftable_record_from_log(&rec, &log);
     ++	return tab->ops->seek_record(tab->table_arg, it, &rec);
     ++}
     ++
     ++int reftable_table_read_ref(struct reftable_table *tab, const char *name,
     ++			    struct reftable_ref_record *ref)
     ++{
     ++	struct reftable_iterator it = { NULL };
     ++	int err = reftable_table_seek_ref(tab, &it, name);
     ++	if (err)
     ++		goto done;
     ++
     ++	err = reftable_iterator_next_ref(&it, ref);
     ++	if (err)
     ++		goto done;
     ++
     ++	if (strcmp(ref->refname, name) ||
     ++	    reftable_ref_record_is_deletion(ref)) {
     ++		reftable_ref_record_release(ref);
     ++		err = 1;
     ++		goto done;
     ++	}
     ++
     ++done:
     ++	reftable_iterator_destroy(&it);
     ++	return err;
     ++}
     ++
     ++int reftable_table_print(struct reftable_table *tab) {
     ++	struct reftable_iterator it = { NULL };
     ++	struct reftable_ref_record ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
     ++	uint32_t hash_id = reftable_table_hash_id(tab);
     ++	int err = reftable_table_seek_ref(tab, &it, "");
     ++	if (err < 0) {
     ++		return err;
     ++	}
     ++
     ++	while (1) {
     ++		err = reftable_iterator_next_ref(&it, &ref);
     ++		if (err > 0) {
     ++			break;
     ++		}
     ++		if (err < 0) {
     ++			return err;
     ++		}
     ++		reftable_ref_record_print(&ref, hash_id);
     ++	}
     ++	reftable_iterator_destroy(&it);
     ++	reftable_ref_record_release(&ref);
     ++
     ++	err = reftable_table_seek_log(tab, &it, "");
     ++	if (err < 0) {
     ++		return err;
     ++	}
     ++	while (1) {
     ++		err = reftable_iterator_next_log(&it, &log);
     ++		if (err > 0) {
     ++			break;
     ++		}
     ++		if (err < 0) {
     ++			return err;
     ++		}
     ++		reftable_log_record_print(&log, hash_id);
     ++	}
     ++	reftable_iterator_destroy(&it);
     ++	reftable_log_record_release(&log);
     ++	return 0;
     ++}
     ++
     ++uint64_t reftable_table_max_update_index(struct reftable_table *tab)
     ++{
     ++	return tab->ops->max_update_index(tab->table_arg);
     ++}
     ++
     ++uint64_t reftable_table_min_update_index(struct reftable_table *tab)
     ++{
     ++	return tab->ops->min_update_index(tab->table_arg);
     ++}
     ++
     ++uint32_t reftable_table_hash_id(struct reftable_table *tab)
     ++{
     ++	return tab->ops->hash_id(tab->table_arg);
     ++}
     ++
     ++void reftable_iterator_destroy(struct reftable_iterator *it)
     ++{
     ++	if (!it->ops) {
     ++		return;
     ++	}
     ++	it->ops->close(it->iter_arg);
     ++	it->ops = NULL;
     ++	FREE_AND_NULL(it->iter_arg);
     ++}
     ++
     ++int reftable_iterator_next_ref(struct reftable_iterator *it,
     ++			       struct reftable_ref_record *ref)
     ++{
     ++	struct reftable_record rec = { NULL };
     ++	reftable_record_from_ref(&rec, ref);
     ++	return iterator_next(it, &rec);
     ++}
     ++
     ++int reftable_iterator_next_log(struct reftable_iterator *it,
     ++			       struct reftable_log_record *log)
     ++{
     ++	struct reftable_record rec = { NULL };
     ++	reftable_record_from_log(&rec, log);
     ++	return iterator_next(it, &rec);
     ++}
     ++
     ++int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
     ++{
     ++	return it->ops->next(it->iter_arg, rec);
     ++}
     ++
     ++static int empty_iterator_next(void *arg, struct reftable_record *rec)
     ++{
     ++	return 1;
     ++}
     ++
     ++static void empty_iterator_close(void *arg)
     ++{
     ++}
     ++
     ++static struct reftable_iterator_vtable empty_vtable = {
     ++	.next = &empty_iterator_next,
     ++	.close = &empty_iterator_close,
     ++};
     ++
     ++void iterator_set_empty(struct reftable_iterator *it)
     ++{
     ++	assert(!it->ops);
     ++	it->iter_arg = NULL;
     ++	it->ops = &empty_vtable;
     ++}
      
       ## reftable/generic.h (new) ##
      @@
     @@ reftable/reftable-generic.h (new)
      +	void *table_arg;
      +};
      +
     ++int reftable_table_seek_log(struct reftable_table *tab,
     ++			    struct reftable_iterator *it, const char *name);
     ++
      +int reftable_table_seek_ref(struct reftable_table *tab,
      +			    struct reftable_iterator *it, const char *name);
      +
     @@ reftable/reftable-generic.h (new)
      +int reftable_table_read_ref(struct reftable_table *tab, const char *name,
      +			    struct reftable_ref_record *ref);
      +
     ++/* dump table contents onto stdout for debugging */
     ++int reftable_table_print(struct reftable_table *tab);
     ++
      +#endif
      
       ## reftable/reftable-iterator.h (new) ##
     @@ reftable/reftable.c (new)
      +
      +void reftable_iterator_destroy(struct reftable_iterator *it)
      +{
     -+	if (it->ops == NULL) {
     ++	if (!it->ops) {
      +		return;
      +	}
      +	it->ops->close(it->iter_arg);
     @@ reftable/reftable.c (new)
      +
      +void iterator_set_empty(struct reftable_iterator *it)
      +{
     -+	assert(it->ops == NULL);
     ++	assert(!it->ops);
      +	it->iter_arg = NULL;
      +	it->ops = &empty_vtable;
      +}
 11:  8ba486f44a72 ! 16:  501cf0677ce4 reftable: read reftable files
     @@ Makefile: REFTABLE_OBJS += reftable/basics.o
       REFTABLE_OBJS += reftable/publicbasics.o
      +REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/reftable.o
     - REFTABLE_OBJS += reftable/tree.o
     + REFTABLE_OBJS += reftable/refname.o
     + REFTABLE_OBJS += reftable/generic.o
      
       ## reftable/iter.c (new) ##
      @@
     @@ reftable/iter.c (new)
      +
      +int iterator_is_null(struct reftable_iterator *it)
      +{
     -+	return it->ops == NULL;
     ++	return !it->ops;
      +}
      +
      +static void filtering_ref_iterator_close(void *iter_arg)
      +{
     -+	struct filtering_ref_iterator *fri =
     -+		(struct filtering_ref_iterator *)iter_arg;
     ++	struct filtering_ref_iterator *fri = iter_arg;
      +	strbuf_release(&fri->oid);
      +	reftable_iterator_destroy(&fri->it);
      +}
     @@ reftable/iter.c (new)
      +static int filtering_ref_iterator_next(void *iter_arg,
      +				       struct reftable_record *rec)
      +{
     -+	struct filtering_ref_iterator *fri =
     -+		(struct filtering_ref_iterator *)iter_arg;
     -+	struct reftable_ref_record *ref =
     -+		(struct reftable_ref_record *)rec->data;
     ++	struct filtering_ref_iterator *fri = iter_arg;
     ++	struct reftable_ref_record *ref = rec->data;
      +	int err = 0;
      +	while (1) {
      +		err = reftable_iterator_next_ref(&fri->it, ref);
     @@ reftable/iter.c (new)
      +void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
      +					  struct filtering_ref_iterator *fri)
      +{
     -+	assert(it->ops == NULL);
     ++	assert(!it->ops);
      +	it->iter_arg = fri;
      +	it->ops = &filtering_ref_iterator_vtable;
      +}
      +
      +static void indexed_table_ref_iter_close(void *p)
      +{
     -+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
     ++	struct indexed_table_ref_iter *it = p;
      +	block_iter_close(&it->cur);
      +	reftable_block_done(&it->block_reader.block);
     ++	reftable_free(it->offsets);
      +	strbuf_release(&it->oid);
      +}
      +
     @@ reftable/iter.c (new)
      +
      +static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
      +{
     -+	struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p;
     -+	struct reftable_ref_record *ref =
     -+		(struct reftable_ref_record *)rec->data;
     ++	struct indexed_table_ref_iter *it = p;
     ++	struct reftable_ref_record *ref = rec->data;
      +
      +	while (1) {
      +		int err = block_iter_next(&it->cur, rec);
     @@ reftable/iter.c (new)
      +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
      +					  struct indexed_table_ref_iter *itr)
      +{
     -+	assert(it->ops == NULL);
     ++	assert(!it->ops);
      +	it->iter_arg = itr;
      +	it->ops = &indexed_table_ref_iter_vtable;
      +}
     @@ reftable/iter.h (new)
      +
      +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
      +					  struct indexed_table_ref_iter *itr);
     ++
     ++/* Takes ownership of `offsets` */
      +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
      +			       struct reftable_reader *r, uint8_t *oid,
      +			       int oid_len, uint64_t *offsets, int offset_len);
     @@ reftable/reader.c (new)
      +
      +void block_source_close(struct reftable_block_source *source)
      +{
     -+	if (source->ops == NULL) {
     ++	if (!source->ops) {
      +		return;
      +	}
      +
     @@ reftable/reader.c (new)
      +	f += 8;
      +
      +	if (r->version == 1) {
     -+		r->hash_id = SHA1_ID;
     ++		r->hash_id = GIT_SHA1_HASH_ID;
      +	} else {
      +		r->hash_id = get_be32(f);
      +		switch (r->hash_id) {
     -+		case SHA1_ID:
     ++		case GIT_SHA1_HASH_ID:
      +			break;
     -+		case SHA256_ID:
     ++		case GIT_SHA256_HASH_ID:
      +			break;
      +		default:
      +			err = REFTABLE_FORMAT_ERROR;
     @@ reftable/reader.c (new)
      +
      +static void table_iter_block_done(struct table_iter *ti)
      +{
     -+	if (ti->bi.br == NULL) {
     ++	if (!ti->bi.br) {
      +		return;
      +	}
      +	reftable_block_done(&ti->bi.br->block);
     @@ reftable/reader.c (new)
      +
      +static int table_iter_next_void(void *ti, struct reftable_record *rec)
      +{
     -+	return table_iter_next((struct table_iter *)ti, rec);
     ++	return table_iter_next(ti, rec);
      +}
      +
      +static void table_iter_close(void *p)
      +{
     -+	struct table_iter *ti = (struct table_iter *)p;
     ++	struct table_iter *ti = p;
      +	table_iter_block_done(ti);
      +	block_iter_close(&ti->bi);
      +}
     @@ reftable/reader.c (new)
      +static void iterator_from_table_iter(struct reftable_iterator *it,
      +				     struct table_iter *ti)
      +{
     -+	assert(it->ops == NULL);
     ++	assert(!it->ops);
      +	it->iter_arg = ti;
      +	it->ops = &table_iter_vtable;
      +}
     @@ reftable/reader.c (new)
      +static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
      +				     struct reftable_record *rec)
      +{
     -+	return reader_seek((struct reftable_reader *)tab, it, rec);
     ++	return reader_seek(tab, it, rec);
      +}
      +
      +static uint32_t reftable_reader_hash_id_void(void *tab)
      +{
     -+	return reftable_reader_hash_id((struct reftable_reader *)tab);
     ++	return reftable_reader_hash_id(tab);
      +}
      +
      +static uint64_t reftable_reader_min_update_index_void(void *tab)
      +{
     -+	return reftable_reader_min_update_index((struct reftable_reader *)tab);
     ++	return reftable_reader_min_update_index(tab);
      +}
      +
      +static uint64_t reftable_reader_max_update_index_void(void *tab)
      +{
     -+	return reftable_reader_max_update_index((struct reftable_reader *)tab);
     ++	return reftable_reader_max_update_index(tab);
      +}
      +
      +static struct reftable_table_vtable reader_vtable = {
     @@ reftable/reader.c (new)
      +void reftable_table_from_reader(struct reftable_table *tab,
      +				struct reftable_reader *reader)
      +{
     -+	assert(tab->ops == NULL);
     ++	assert(!tab->ops);
      +	tab->ops = &reader_vtable;
      +	tab->table_arg = reader;
     ++}
     ++
     ++
     ++int reftable_reader_print_file(const char *tablename)
     ++{
     ++	struct reftable_block_source src = { NULL };
     ++	int err = reftable_block_source_from_file(&src, tablename);
     ++	struct reftable_reader *r = NULL;
     ++	struct reftable_table tab = { NULL };
     ++	if (err < 0)
     ++		goto done;
     ++
     ++	err = reftable_new_reader(&r, &src, tablename);
     ++	if (err < 0)
     ++		goto done;
     ++
     ++	reftable_table_from_reader(&tab, r);
     ++	err = reftable_table_print(&tab);
     ++done:
     ++	reftable_reader_free(r);
     ++	return err;
      +}
      
       ## reftable/reader.h (new) ##
     @@ reftable/reftable-reader.h (new)
      +void reftable_table_from_reader(struct reftable_table *tab,
      +				struct reftable_reader *reader);
      +
     ++/* print table onto stdout for debugging. */
     ++int reftable_reader_print_file(const char *tablename);
     ++
      +#endif
 12:  9ade9303f08f ! 17:  dc04ec88cf53 reftable: reftable file level tests
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/writer.o
       REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     -+REFTABLE_TEST_OBJS += reftable/reftable_test.o
     ++REFTABLE_TEST_OBJS += reftable/readwrite_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
       REFTABLE_TEST_OBJS += reftable/tree_test.o
       
      
     - ## reftable/reftable_test.c (new) ##
     + ## reftable/readwrite_test.c (new) ##
      @@
      +/*
      +Copyright 2020 Google LLC
     @@ reftable/reftable_test.c (new)
      +	*names = reftable_calloc(sizeof(char *) * (N + 1));
      +	reftable_writer_set_limits(w, update_index, update_index);
      +	for (i = 0; i < N; i++) {
     -+		uint8_t hash[SHA256_SIZE] = { 0 };
     ++		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
      +		char name[100];
      +		int n;
      +
     @@ reftable/reftable_test.c (new)
      +	}
      +
      +	for (i = 0; i < N; i++) {
     -+		uint8_t hash[SHA256_SIZE] = { 0 };
     ++		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
      +		char name[100];
      +		int n;
      +
     @@ reftable/reftable_test.c (new)
      +	for (i = 0; i < stats->ref_stats.blocks; i++) {
      +		int off = i * opts.block_size;
      +		if (off == 0) {
     -+			off = header_size((hash_id == SHA256_ID) ? 2 : 1);
     ++			off = header_size((hash_id == GIT_SHA256_HASH_ID) ? 2 :
     ++										  1);
      +		}
      +		EXPECT(buf->buf[off] == 'r');
      +	}
     @@ reftable/reftable_test.c (new)
      +	/* This tests buffer extension for log compression. Must use a random
      +	   hash, to ensure that the compressed part is larger than the original.
      +	*/
     -+	uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
     -+	for (int i = 0; i < SHA1_SIZE; i++) {
     ++	uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
     ++	for (int i = 0; i < GIT_SHA1_RAWSZ; i++) {
      +		hash1[i] = (uint8_t)(rand() % 256);
      +		hash2[i] = (uint8_t)(rand() % 256);
      +	}
     @@ reftable/reftable_test.c (new)
      +		EXPECT_ERR(err);
      +	}
      +	for (i = 0; i < N; i++) {
     -+		uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE];
     ++		uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
      +		struct reftable_log_record log = { NULL };
      +		set_test_hash(hash1, i);
      +		set_test_hash(hash2, i + 1);
     @@ reftable/reftable_test.c (new)
      +	int err = 0;
      +	int j = 0;
      +
     -+	write_table(&names, &buf, N, 256, SHA1_ID);
     ++	write_table(&names, &buf, N, 256, GIT_SHA1_HASH_ID);
      +
      +	block_source_from_strbuf(&source, &buf);
      +
     @@ reftable/reftable_test.c (new)
      +	char **names;
      +	struct strbuf buf = STRBUF_INIT;
      +	int N = 1;
     -+	write_table(&names, &buf, N, 4096, SHA1_ID);
     ++	write_table(&names, &buf, N, 4096, GIT_SHA1_HASH_ID);
      +	EXPECT(buf.len < 200);
      +	strbuf_release(&buf);
      +	free_names(names);
     @@ reftable/reftable_test.c (new)
      +	struct reftable_log_record log = { NULL };
      +	struct reftable_iterator it = { NULL };
      +
     -+	write_table(&names, &buf, N, 256, SHA1_ID);
     ++	write_table(&names, &buf, N, 256, GIT_SHA1_HASH_ID);
      +
      +	block_source_from_strbuf(&source, &buf);
      +
     @@ reftable/reftable_test.c (new)
      +
      +static void test_table_read_write_seek_linear(void)
      +{
     -+	test_table_read_write_seek(0, SHA1_ID);
     ++	test_table_read_write_seek(0, GIT_SHA1_HASH_ID);
      +}
      +
      +static void test_table_read_write_seek_linear_sha256(void)
      +{
     -+	test_table_read_write_seek(0, SHA256_ID);
     ++	test_table_read_write_seek(0, GIT_SHA256_HASH_ID);
      +}
      +
      +static void test_table_read_write_seek_index(void)
      +{
     -+	test_table_read_write_seek(1, SHA1_ID);
     ++	test_table_read_write_seek(1, GIT_SHA1_HASH_ID);
      +}
      +
      +static void test_table_refs_for(int indexed)
     @@ reftable/reftable_test.c (new)
      +	int N = 50;
      +	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
      +	int want_names_len = 0;
     -+	uint8_t want_hash[SHA1_SIZE];
     ++	uint8_t want_hash[GIT_SHA1_RAWSZ];
      +
      +	struct reftable_write_options opts = {
      +		.block_size = 256,
     @@ reftable/reftable_test.c (new)
      +	set_test_hash(want_hash, 4);
      +
      +	for (i = 0; i < N; i++) {
     -+		uint8_t hash[SHA1_SIZE];
     ++		uint8_t hash[GIT_SHA1_RAWSZ];
      +		char fill[51] = { 0 };
      +		char name[100];
     -+		uint8_t hash1[SHA1_SIZE];
     -+		uint8_t hash2[SHA1_SIZE];
     ++		uint8_t hash1[GIT_SHA1_RAWSZ];
     ++		uint8_t hash2[GIT_SHA1_RAWSZ];
      +		struct reftable_ref_record ref = { NULL };
      +
      +		memset(hash, i, sizeof(hash));
     @@ reftable/reftable_test.c (new)
      +		n = reftable_writer_add_ref(w, &ref);
      +		EXPECT(n == 0);
      +
     -+		if (!memcmp(hash1, want_hash, SHA1_SIZE) ||
     -+		    !memcmp(hash2, want_hash, SHA1_SIZE)) {
     ++		if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) ||
     ++		    !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) {
      +			want_names[want_names_len++] = xstrdup(name);
      +		}
      +	}
     @@ reftable/reftable_test.c (new)
      +	strbuf_release(&buf);
      +}
      +
     -+int reftable_test_main(int argc, const char *argv[])
     ++static void test_write_key_order(void)
     ++{
     ++	struct reftable_write_options opts = { 0 };
     ++	struct strbuf buf = STRBUF_INIT;
     ++	struct reftable_writer *w =
     ++		reftable_new_writer(&strbuf_add_void, &buf, &opts);
     ++	struct reftable_ref_record refs[2] = {
     ++		{
     ++			.refname = "b",
     ++			.update_index = 1,
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value = {
     ++				.symref = "target",
     ++			},
     ++		}, {
     ++			.refname = "a",
     ++			.update_index = 1,
     ++			.value_type = REFTABLE_REF_SYMREF,
     ++			.value = {
     ++				.symref = "target",
     ++			},
     ++		}
     ++	};
     ++	int err;
     ++
     ++	reftable_writer_set_limits(w, 1, 1);
     ++	err = reftable_writer_add_ref(w, &refs[0]);
     ++	EXPECT_ERR(err);
     ++	err = reftable_writer_add_ref(w, &refs[1]);
     ++	printf("%d\n", err);
     ++	EXPECT(err == REFTABLE_API_ERROR);
     ++	reftable_writer_close(w);
     ++	reftable_writer_free(w);
     ++	strbuf_release(&buf);
     ++}
     ++
     ++int readwrite_test_main(int argc, const char *argv[])
      +{
      +	RUN_TEST(test_log_write_read);
     ++	RUN_TEST(test_write_key_order);
      +	RUN_TEST(test_table_read_write_seek_linear_sha256);
      +	RUN_TEST(test_log_buffer_size);
      +	RUN_TEST(test_table_write_small_table);
     @@ reftable/reftable_test.c (new)
      +	return 0;
      +}
      
     + ## reftable/reftable-tests.h ##
     +@@ reftable/reftable-tests.h: int block_test_main(int argc, const char **argv);
     + int merged_test_main(int argc, const char **argv);
     + int record_test_main(int argc, const char **argv);
     + int refname_test_main(int argc, const char **argv);
     +-int reftable_test_main(int argc, const char **argv);
     ++int readwrite_test_main(int argc, const char **argv);
     + int stack_test_main(int argc, const char **argv);
     + int tree_test_main(int argc, const char **argv);
     + int reftable_dump_main(int argc, char *const *argv);
     +
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	basics_test_main(argc, argv);
       	block_test_main(argc, argv);
       	record_test_main(argc, argv);
     -+	reftable_test_main(argc, argv);
     ++	readwrite_test_main(argc, argv);
       	tree_test_main(argc, argv);
       	return 0;
       }
 13:  0a6119d910a3 ! 18:  84578b666a40 reftable: add a heap-based priority queue for reftable records
     @@ Makefile: REFTABLE_OBJS += reftable/block.o
      +REFTABLE_OBJS += reftable/pq.o
       REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/reftable.o
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     + REFTABLE_OBJS += reftable/refname.o
     +@@ Makefile: REFTABLE_OBJS += reftable/writer.o
       
       REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
      +REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
     + REFTABLE_TEST_OBJS += reftable/readwrite_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
      
       ## reftable/pq.c (new) ##
     @@ reftable/pq_test.c (new)
      +
      +		merged_iter_pqueue_check(pq);
      +
     -+		if (last != NULL) {
     ++		if (last) {
      +			assert(strcmp(last, ref->refname) < 0);
      +		}
      +		last = ref->refname;
     @@ reftable/reftable-tests.h: license that can be found in the LICENSE file or at
      +int pq_test_main(int argc, const char **argv);
       int record_test_main(int argc, const char **argv);
       int refname_test_main(int argc, const char **argv);
     - int reftable_test_main(int argc, const char **argv);
     + int readwrite_test_main(int argc, const char **argv);
      
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
     @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	block_test_main(argc, argv);
      +	pq_test_main(argc, argv);
       	record_test_main(argc, argv);
     - 	reftable_test_main(argc, argv);
     + 	readwrite_test_main(argc, argv);
       	tree_test_main(argc, argv);
 14:  393fcecdae1e ! 19:  e3a776f2076c reftable: add merged table view
     @@ Makefile: REFTABLE_OBJS += reftable/block.o
       REFTABLE_OBJS += reftable/pq.o
       REFTABLE_OBJS += reftable/reader.o
       REFTABLE_OBJS += reftable/record.o
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/writer.o
       
       REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
      +REFTABLE_TEST_OBJS += reftable/merged_test.o
       REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
     + REFTABLE_TEST_OBJS += reftable/readwrite_test.o
      
       ## reftable/merged.c (new) ##
      @@
     @@ reftable/merged.c (new)
      +
      +static void merged_iter_close(void *p)
      +{
     -+	struct merged_iter *mi = (struct merged_iter *)p;
     ++	struct merged_iter *mi = p;
      +	int i = 0;
      +	merged_iter_pqueue_release(&mi->pq);
      +	for (i = 0; i < mi->stack_len; i++) {
     @@ reftable/merged.c (new)
      +
      +static int merged_iter_next_void(void *p, struct reftable_record *rec)
      +{
     -+	struct merged_iter *mi = (struct merged_iter *)p;
     ++	struct merged_iter *mi = p;
      +	if (merged_iter_pqueue_is_empty(mi->pq))
      +		return 1;
      +
     @@ reftable/merged.c (new)
      +static void iterator_from_merged_iter(struct reftable_iterator *it,
      +				      struct merged_iter *mi)
      +{
     -+	assert(it->ops == NULL);
     ++	assert(!it->ops);
      +	it->iter_arg = mi;
      +	it->ops = &merged_iter_vtable;
      +}
     @@ reftable/merged.c (new)
      +		}
      +	}
      +
     -+	m = (struct reftable_merged_table *)reftable_calloc(
     -+		sizeof(struct reftable_merged_table));
     ++	m = reftable_calloc(sizeof(struct reftable_merged_table));
      +	m->stack = stack;
      +	m->stack_len = n;
      +	m->min = first_min;
     @@ reftable/merged.c (new)
      +
      +void reftable_merged_table_free(struct reftable_merged_table *mt)
      +{
     -+	if (mt == NULL) {
     ++	if (!mt) {
      +		return;
      +	}
      +	merged_table_release(mt);
     @@ reftable/merged.c (new)
      +					   struct reftable_iterator *it,
      +					   struct reftable_record *rec)
      +{
     -+	return merged_table_seek_record((struct reftable_merged_table *)tab, it,
     -+					rec);
     ++	return merged_table_seek_record(tab, it, rec);
      +}
      +
      +static uint32_t reftable_merged_table_hash_id_void(void *tab)
      +{
     -+	return reftable_merged_table_hash_id(
     -+		(struct reftable_merged_table *)tab);
     ++	return reftable_merged_table_hash_id(tab);
      +}
      +
      +static uint64_t reftable_merged_table_min_update_index_void(void *tab)
      +{
     -+	return reftable_merged_table_min_update_index(
     -+		(struct reftable_merged_table *)tab);
     ++	return reftable_merged_table_min_update_index(tab);
      +}
      +
      +static uint64_t reftable_merged_table_max_update_index_void(void *tab)
      +{
     -+	return reftable_merged_table_max_update_index(
     -+		(struct reftable_merged_table *)tab);
     ++	return reftable_merged_table_max_update_index(tab);
      +}
      +
      +static struct reftable_table_vtable merged_table_vtable = {
     @@ reftable/merged.c (new)
      +void reftable_table_from_merged_table(struct reftable_table *tab,
      +				      struct reftable_merged_table *merged)
      +{
     -+	assert(tab->ops == NULL);
     ++	assert(!tab->ops);
      +	tab->ops = &merged_table_vtable;
      +	tab->table_arg = merged;
      +}
     @@ reftable/merged_test.c (new)
      +		reftable_table_from_reader(&tabs[i], (*readers)[i]);
      +	}
      +
     -+	err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID);
     ++	err = reftable_new_merged_table(&mt, tabs, n, GIT_SHA1_HASH_ID);
      +	EXPECT_ERR(err);
      +	return mt;
      +}
     @@ reftable/merged_test.c (new)
      +
      +static void test_merged_between(void)
      +{
     -+	uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 };
     ++	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 };
      +
      +	struct reftable_ref_record r1[] = { {
      +		.refname = "b",
     @@ reftable/merged_test.c (new)
      +
      +static void test_merged(void)
      +{
     -+	uint8_t hash1[SHA1_SIZE] = { 1 };
     -+	uint8_t hash2[SHA1_SIZE] = { 2 };
     ++	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 };
     ++	uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 };
      +	struct reftable_ref_record r1[] = {
      +		{
      +			.refname = "a",
     @@ reftable/merged_test.c (new)
      +
      +	assert(ARRAY_SIZE(want) == len);
      +	for (i = 0; i < len; i++) {
     -+		assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE));
     ++		assert(reftable_ref_record_equal(&want[i], &out[i],
     ++						 GIT_SHA1_RAWSZ));
      +	}
      +	for (i = 0; i < len; i++) {
      +		reftable_ref_record_release(&out[i]);
     @@ reftable/merged_test.c (new)
      +	EXPECT_ERR(err);
      +
      +	hash_id = reftable_reader_hash_id(rd);
     -+	assert(hash_id == SHA1_ID);
     ++	assert(hash_id == GIT_SHA1_HASH_ID);
      +
      +	reftable_table_from_reader(&tab[0], rd);
     -+	err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID);
     ++	err = reftable_new_merged_table(&merged, tab, 1, GIT_SHA1_HASH_ID);
      +	EXPECT_ERR(err);
      +
      +	reftable_reader_free(rd);
     @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
      +	merged_test_main(argc, argv);
       	pq_test_main(argc, argv);
       	record_test_main(argc, argv);
     - 	reftable_test_main(argc, argv);
     + 	readwrite_test_main(argc, argv);
 15:  3186b2b70f73 ! 20:  7913027063a2 reftable: implement refname validation
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/merged.o
     - REFTABLE_OBJS += reftable/pq.o
     - REFTABLE_OBJS += reftable/reader.o
     - REFTABLE_OBJS += reftable/record.o
     -+REFTABLE_OBJS += reftable/refname.o
     - REFTABLE_OBJS += reftable/reftable.o
     - REFTABLE_OBJS += reftable/tree.o
     - REFTABLE_OBJS += reftable/writer.o
     -@@ Makefile: REFTABLE_TEST_OBJS += reftable/block_test.o
     - REFTABLE_TEST_OBJS += reftable/merged_test.o
     +@@ Makefile: REFTABLE_TEST_OBJS += reftable/merged_test.o
       REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     + REFTABLE_TEST_OBJS += reftable/readwrite_test.o
      +REFTABLE_TEST_OBJS += reftable/refname_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
       REFTABLE_TEST_OBJS += reftable/tree_test.o
     + 
      
       ## reftable/refname.c (new) ##
      @@
     @@ reftable/refname.c (new)
      +
      +static int find_name(size_t k, void *arg)
      +{
     -+	struct find_arg *f_arg = (struct find_arg *)arg;
     ++	struct find_arg *f_arg = arg;
      +	return strcmp(f_arg->names[k], f_arg->want) >= 0;
      +}
      +
     @@ reftable/refname_test.c (new)
      +			.tab = tab,
      +		};
      +
     -+		if (cases[i].add != NULL) {
     ++		if (cases[i].add) {
      +			mod.add = &cases[i].add;
      +			mod.add_len = 1;
      +		}
     -+		if (cases[i].del != NULL) {
     ++		if (cases[i].del) {
      +			mod.del = &cases[i].del;
      +			mod.del_len = 1;
      +		}
     @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	pq_test_main(argc, argv);
       	record_test_main(argc, argv);
      +	refname_test_main(argc, argv);
     - 	reftable_test_main(argc, argv);
     + 	readwrite_test_main(argc, argv);
       	tree_test_main(argc, argv);
       	return 0;
 16:  b5492d5a13d7 ! 21:  3f1e5aedf4a8 reftable: implement stack, a mutable database of reftable files.
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/reader.o
     - REFTABLE_OBJS += reftable/record.o
     - REFTABLE_OBJS += reftable/refname.o
     - REFTABLE_OBJS += reftable/reftable.o
     -+REFTABLE_OBJS += reftable/stack.o
     - REFTABLE_OBJS += reftable/tree.o
     - REFTABLE_OBJS += reftable/writer.o
     - REFTABLE_OBJS += reftable/zlib-compat.o
      @@ Makefile: REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
     + REFTABLE_TEST_OBJS += reftable/readwrite_test.o
       REFTABLE_TEST_OBJS += reftable/refname_test.o
     - REFTABLE_TEST_OBJS += reftable/reftable_test.o
      +REFTABLE_TEST_OBJS += reftable/stack_test.o
       REFTABLE_TEST_OBJS += reftable/test_framework.o
       REFTABLE_TEST_OBJS += reftable/tree_test.o
     @@ reftable/reftable-stack.h (new)
      +					     void *arg),
      +			  void *arg);
      +
     -+/* Commits the transaction, releasing the lock. */
     ++/* Commits the transaction, releasing the lock. After calling this,
     ++ * reftable_addition_destroy should still be called.
     ++ */
      +int reftable_addition_commit(struct reftable_addition *add);
      +
      +/* Release all non-committed data from the transaction, and deallocate the
     @@ reftable/reftable-stack.h (new)
      +struct reftable_compaction_stats *
      +reftable_stack_compaction_stats(struct reftable_stack *st);
      +
     ++/* print the entire stack represented by the directory */
     ++int reftable_stack_print_directory(const char *stackdir);
     ++
      +#endif
      
       ## reftable/stack.c (new) ##
     @@ reftable/stack.c (new)
      +	strbuf_addstr(dest, name);
      +}
      +
     -+static int reftable_fd_write(void *arg, const void *data, size_t sz)
     ++static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz)
      +{
      +	int *fdp = (int *)arg;
      +	return write(*fdp, data, sz);
     @@ reftable/stack.c (new)
      +	int err = 0;
      +
      +	if (config.hash_id == 0) {
     -+		config.hash_id = SHA1_ID;
     ++		config.hash_id = GIT_SHA1_HASH_ID;
      +	}
      +
      +	*dest = NULL;
     @@ reftable/stack.c (new)
      +{
      +	char **names = NULL;
      +	int err = 0;
     -+	if (st->merged != NULL) {
     ++	if (st->merged) {
      +		reftable_merged_table_free(st->merged);
      +		st->merged = NULL;
      +	}
     @@ reftable/stack.c (new)
      +		FREE_AND_NULL(names);
      +	}
      +
     -+	if (st->readers != NULL) {
     ++	if (st->readers) {
      +		int i = 0;
      +		struct strbuf filename = STRBUF_INIT;
      +		for (i = 0; i < st->readers_len; i++) {
     @@ reftable/stack.c (new)
      +static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
      +				      int reuse_open)
      +{
     -+	int cur_len = st->merged == NULL ? 0 : st->merged->stack_len;
     ++	int cur_len = !st->merged ? 0 : st->merged->stack_len;
      +	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
      +	int err = 0;
      +	int names_len = names_length(names);
     @@ reftable/stack.c (new)
      +		   tables under control so this is not quadratic. */
      +		int j = 0;
      +		for (j = 0; reuse_open && j < cur_len; j++) {
     -+			if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) {
     ++			if (cur[j] && 0 == strcmp(cur[j]->name, name)) {
      +				rd = cur[j];
      +				cur[j] = NULL;
      +				break;
      +			}
      +		}
      +
     -+		if (rd == NULL) {
     ++		if (!rd) {
      +			struct reftable_block_source src = { NULL };
      +			struct strbuf table_path = STRBUF_INIT;
      +			stack_filename(&table_path, st, name);
     @@ reftable/stack.c (new)
      +
      +	new_tables = NULL;
      +	st->readers_len = new_readers_len;
     -+	if (st->merged != NULL) {
     ++	if (st->merged) {
      +		merged_table_release(st->merged);
      +		reftable_merged_table_free(st->merged);
      +	}
     -+	if (st->readers != NULL) {
     ++	if (st->readers) {
      +		reftable_free(st->readers);
      +	}
      +	st->readers = new_readers;
     @@ reftable/stack.c (new)
      +	new_merged->suppress_deletions = 1;
      +	st->merged = new_merged;
      +	for (i = 0; i < cur_len; i++) {
     -+		if (cur[i] != NULL) {
     ++		if (cur[i]) {
      +			const char *name = reader_name(cur[i]);
      +			struct strbuf filename = STRBUF_INIT;
      +			stack_filename(&filename, st, name);
     @@ reftable/stack.c (new)
      +		return err;
      +
      +	for (i = 0; i < st->readers_len; i++) {
     -+		if (names[i] == NULL) {
     ++		if (!names[i]) {
      +			err = 1;
      +			goto done;
      +		}
     @@ reftable/stack.c (new)
      +		}
      +	}
      +
     -+	if (names[st->merged->stack_len] != NULL) {
     ++	if (names[st->merged->stack_len]) {
      +		err = 1;
      +		goto done;
      +	}
     @@ reftable/stack.c (new)
      +
      +void reftable_addition_destroy(struct reftable_addition *add)
      +{
     -+	if (add == NULL) {
     ++	if (!add) {
      +		return;
      +	}
      +	reftable_addition_close(add);
     @@ reftable/stack.c (new)
      +			continue;
      +		}
      +
     -+		if (config != NULL && config->min_update_index > 0 &&
     ++		if (config && config->min_update_index > 0 &&
      +		    log.update_index < config->min_update_index) {
      +			continue;
      +		}
      +
     -+		if (config != NULL && config->time > 0 &&
     ++		if (config && config->time > 0 &&
      +		    log.update.time < config->time) {
      +			continue;
      +		}
     @@ reftable/stack.c (new)
      +
      +done:
      +	reftable_iterator_destroy(&it);
     -+	if (mt != NULL) {
     ++	if (mt) {
      +		merged_table_release(mt);
      +		reftable_merged_table_free(mt);
      +	}
     @@ reftable/stack.c (new)
      +	int j = 0;
      +	int is_empty_table = 0;
      +
     -+	if (first > last || (expiry == NULL && first == last)) {
     ++	if (first > last || (!expiry && first == last)) {
      +		err = 0;
      +		goto done;
      +	}
     @@ reftable/stack.c (new)
      +{
      +	uint64_t *sizes =
      +		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
     -+	int version = (st->config.hash_id == SHA1_ID) ? 1 : 2;
     ++	int version = (st->config.hash_id == GIT_SHA1_HASH_ID) ? 1 : 2;
      +	int overhead = header_size(version) - 1;
      +	int i = 0;
      +	for (i = 0; i < st->merged->stack_len; i++) {
     @@ reftable/stack.c (new)
      +static int is_table_name(const char *s)
      +{
      +	const char *dot = strrchr(s, '.');
     -+	return dot != NULL && !strcmp(dot, ".ref");
     ++	return dot && !strcmp(dot, ".ref");
      +}
      +
      +static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max,
     @@ reftable/stack.c (new)
      +		reftable_stack_merged_table(st));
      +	DIR *dir = opendir(st->reftable_dir);
      +	struct dirent *d = NULL;
     -+	if (dir == NULL) {
     ++	if (!dir) {
      +		return REFTABLE_IO_ERROR;
      +	}
      +
     -+	while ((d = readdir(dir)) != NULL) {
     ++	while ((d = readdir(dir))) {
      +		int i = 0;
      +		int found = 0;
      +		if (!is_table_name(d->d_name))
     @@ reftable/stack.c (new)
      +done:
      +	reftable_addition_destroy(add);
      +	return err;
     ++}
     ++
     ++int reftable_stack_print_directory(const char *stackdir)
     ++{
     ++	struct reftable_stack *stack = NULL;
     ++	struct reftable_write_options cfg = { 0 };
     ++	struct reftable_merged_table *merged = NULL;
     ++	struct reftable_table table = { NULL };
     ++
     ++	int err = reftable_new_stack(&stack, stackdir, cfg);
     ++	if (err < 0)
     ++		goto done;
     ++
     ++	merged = reftable_stack_merged_table(stack);
     ++	reftable_table_from_merged_table(&table, merged);
     ++	err = reftable_table_print(&table);
     ++done:
     ++	reftable_stack_destroy(stack);
     ++	return err;
      +}
      
       ## reftable/stack.h (new) ##
     @@ reftable/stack_test.c (new)
      +	if (dir == NULL)
      +		return 0;
      +
     -+	while ((d = readdir(dir)) != NULL) {
     ++	while ((d = readdir(dir))) {
      +		if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, "."))
      +			continue;
      +		len++;
     @@ reftable/stack_test.c (new)
      +	return template;
      +}
      +
     ++static char *get_tmp_dir(const char *prefix)
     ++{
     ++	char *dir = get_tmp_template(prefix);
     ++	EXPECT(mkdtemp(dir));
     ++	return dir;
     ++}
     ++
      +static void test_read_file(void)
      +{
      +	char *fn = get_tmp_template(__FUNCTION__);
     @@ reftable/stack_test.c (new)
      +	err = read_lines(fn, &names);
      +	EXPECT_ERR(err);
      +
     -+	for (i = 0; names[i] != NULL; i++) {
     ++	for (i = 0; names[i]; i++) {
      +		EXPECT(0 == strcmp(want[i], names[i]));
      +	}
      +	free_names(names);
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_add_one(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st1 = NULL;
      +	struct reftable_stack *st2 = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	int err;
      +	struct reftable_ref_record ref1 = {
      +		.refname = "HEAD",
     @@ reftable/stack_test.c (new)
      +		.value.symref = "master",
      +	};
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	/* simulate multi-process access to the same stack
      +	   by creating two stacks for the same directory.
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_transaction_api(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	int i;
      +	struct reftable_ref_record ref = {
      +		.refname = "a/b",
     @@ reftable/stack_test.c (new)
      +	};
      +	char *additions[] = { "a", "a/b/c" };
      +
     -+	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
      +
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_update_index_check(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +		.value_type = REFTABLE_REF_SYMREF,
      +		.value.symref = "master",
      +	};
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_lock_failure(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err, i;
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +		.exact_log_message = 1,
      +	};
      +	struct reftable_stack *st = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_ref_record refs[2] = { { NULL } };
      +	struct reftable_log_record logs[2] = { { NULL } };
      +	int N = ARRAY_SIZE(refs);
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +		refs[i].refname = xstrdup(buf);
      +		refs[i].update_index = i + 1;
      +		refs[i].value_type = REFTABLE_REF_VAL1;
     -+		refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
     ++		refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
      +		set_test_hash(refs[i].value.val1, i);
      +
      +		logs[i].refname = xstrdup(buf);
      +		logs[i].update_index = N + i + 1;
      +		logs[i].value_type = REFTABLE_LOG_UPDATE;
      +
     -+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++		logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ);
      +		logs[i].update.email = xstrdup("identity@invalid");
      +		set_test_hash(logs[i].update.new_hash, i);
      +	}
     @@ reftable/stack_test.c (new)
      +
      +		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
      +		EXPECT_ERR(err);
     -+		EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE));
     ++		EXPECT(reftable_ref_record_equal(&dest, refs + i,
     ++						 GIT_SHA1_RAWSZ));
      +		reftable_ref_record_release(&dest);
      +	}
      +
     @@ reftable/stack_test.c (new)
      +		struct reftable_log_record dest = { NULL };
      +		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
      +		EXPECT_ERR(err);
     -+		EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE));
     ++		EXPECT(reftable_log_record_equal(&dest, logs + i,
     ++						 GIT_SHA1_RAWSZ));
      +		reftable_log_record_release(&dest);
      +	}
      +
     @@ reftable/stack_test.c (new)
      +		0,
      +	};
      +	struct reftable_stack *st = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +
     -+	uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 };
     ++	uint8_t h1[GIT_SHA1_RAWSZ] = { 0x01 }, h2[GIT_SHA1_RAWSZ] = { 0x02 };
      +
      +	struct reftable_log_record input = { .refname = "branch",
      +					     .update_index = 1,
     @@ reftable/stack_test.c (new)
      +		.update_index = 1,
      +	};
      +
     -+	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
      +
     @@ reftable/stack_test.c (new)
      +static void test_reftable_stack_tombstone(void)
      +{
      +	int i = 0;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +	struct reftable_ref_record dest = { NULL };
      +	struct reftable_log_record log_dest = { NULL };
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
      +
     ++	/* even entries add the refs, odd entries delete them. */
      +	for (i = 0; i < N; i++) {
      +		const char *buf = "branch";
      +		refs[i].refname = xstrdup(buf);
      +		refs[i].update_index = i + 1;
      +		if (i % 2 == 0) {
      +			refs[i].value_type = REFTABLE_REF_VAL1;
     -+			refs[i].value.val1 = reftable_malloc(SHA1_SIZE);
     ++			refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
      +			set_test_hash(refs[i].value.val1, i);
      +		}
      +
     @@ reftable/stack_test.c (new)
      +		logs[i].update_index = 42;
      +		if (i % 2 == 0) {
      +			logs[i].value_type = REFTABLE_LOG_UPDATE;
     -+			logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++			logs[i].update.new_hash =
     ++				reftable_malloc(GIT_SHA1_RAWSZ);
      +			set_test_hash(logs[i].update.new_hash, i);
      +			logs[i].update.email = xstrdup("identity@invalid");
      +		}
     @@ reftable/stack_test.c (new)
      +		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
      +		EXPECT_ERR(err);
      +	}
     ++
      +	for (i = 0; i < N; i++) {
      +		struct write_log_arg arg = {
      +			.log = &logs[i],
     @@ reftable/stack_test.c (new)
      +
      +static void test_reftable_stack_hash_id(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     @@ reftable/stack_test.c (new)
      +		.value.symref = "target",
      +		.update_index = 1,
      +	};
     -+	struct reftable_write_options cfg32 = { .hash_id = SHA256_ID };
     ++	struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_HASH_ID };
      +	struct reftable_stack *st32 = NULL;
      +	struct reftable_write_options cfg_default = { 0 };
      +	struct reftable_stack *st_default = NULL;
      +	struct reftable_ref_record dest = { NULL };
      +
     -+	EXPECT(mkdtemp(dir));
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
      +
     @@ reftable/stack_test.c (new)
      +	err = reftable_stack_read_ref(st_default, "master", &dest);
      +	EXPECT_ERR(err);
      +
     -+	EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE));
     ++	EXPECT(reftable_ref_record_equal(&ref, &dest, GIT_SHA1_RAWSZ));
      +	reftable_ref_record_release(&dest);
      +	reftable_stack_destroy(st);
      +	reftable_stack_destroy(st_default);
     @@ reftable/stack_test.c (new)
      +
      +static void test_reflog_expire(void)
      +{
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	struct reftable_log_record logs[20] = { { NULL } };
     @@ reftable/stack_test.c (new)
      +	};
      +	struct reftable_log_record log = { NULL };
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +		logs[i].update_index = i;
      +		logs[i].value_type = REFTABLE_LOG_UPDATE;
      +		logs[i].update.time = i;
     -+		logs[i].update.new_hash = reftable_malloc(SHA1_SIZE);
     ++		logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ);
      +		logs[i].update.email = xstrdup("identity@invalid");
      +		set_test_hash(logs[i].update.new_hash, i);
      +	}
     @@ reftable/stack_test.c (new)
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
      +	int err;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	struct reftable_stack *st2 = NULL;
      +
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +{
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	int err, i;
      +	int N = 100;
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +{
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st1 = NULL, *st2 = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	int err, i;
      +	int N = 3;
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st1, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +{
      +	struct reftable_write_options cfg = { 0 };
      +	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
     -+	char *dir = get_tmp_template(__FUNCTION__);
     ++	char *dir = get_tmp_dir(__FUNCTION__);
     ++
      +	int err, i;
      +	int N = 3;
     -+	EXPECT(mkdtemp(dir));
      +
      +	err = reftable_new_stack(&st1, dir, cfg);
      +	EXPECT_ERR(err);
     @@ reftable/stack_test.c (new)
      +
      +int stack_test_main(int argc, const char *argv[])
      +{
     -+	RUN_TEST(test_reftable_stack_compaction_concurrent_clean);
     ++	RUN_TEST(test_empty_add);
     ++	RUN_TEST(test_log2);
     ++	RUN_TEST(test_names_equal);
     ++	RUN_TEST(test_parse_names);
     ++	RUN_TEST(test_read_file);
     ++	RUN_TEST(test_reflog_expire);
     ++	RUN_TEST(test_reftable_stack_add);
     ++	RUN_TEST(test_reftable_stack_add_one);
     ++	RUN_TEST(test_reftable_stack_auto_compaction);
      +	RUN_TEST(test_reftable_stack_compaction_concurrent);
     -+	RUN_TEST(test_reftable_stack_uptodate);
     -+	RUN_TEST(test_reftable_stack_transaction_api);
     ++	RUN_TEST(test_reftable_stack_compaction_concurrent_clean);
      +	RUN_TEST(test_reftable_stack_hash_id);
     -+	RUN_TEST(test_sizes_to_segments_all_equal);
     -+	RUN_TEST(test_reftable_stack_auto_compaction);
     -+	RUN_TEST(test_reftable_stack_validate_refname);
     -+	RUN_TEST(test_reftable_stack_update_index_check);
      +	RUN_TEST(test_reftable_stack_lock_failure);
      +	RUN_TEST(test_reftable_stack_log_normalize);
      +	RUN_TEST(test_reftable_stack_tombstone);
     -+	RUN_TEST(test_reftable_stack_add_one);
     -+	RUN_TEST(test_empty_add);
     -+	RUN_TEST(test_reflog_expire);
     -+	RUN_TEST(test_suggest_compaction_segment);
     -+	RUN_TEST(test_suggest_compaction_segment_nothing);
     ++	RUN_TEST(test_reftable_stack_transaction_api);
     ++	RUN_TEST(test_reftable_stack_update_index_check);
     ++	RUN_TEST(test_reftable_stack_uptodate);
     ++	RUN_TEST(test_reftable_stack_validate_refname);
      +	RUN_TEST(test_sizes_to_segments);
     ++	RUN_TEST(test_sizes_to_segments_all_equal);
      +	RUN_TEST(test_sizes_to_segments_empty);
     -+	RUN_TEST(test_log2);
     -+	RUN_TEST(test_parse_names);
     -+	RUN_TEST(test_read_file);
     -+	RUN_TEST(test_names_equal);
     -+	RUN_TEST(test_reftable_stack_add);
     ++	RUN_TEST(test_suggest_compaction_segment);
     ++	RUN_TEST(test_suggest_compaction_segment_nothing);
      +	return 0;
      +}
      
     @@ t/helper/test-reftable.c
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	record_test_main(argc, argv);
       	refname_test_main(argc, argv);
     - 	reftable_test_main(argc, argv);
     + 	readwrite_test_main(argc, argv);
      +	stack_test_main(argc, argv);
       	tree_test_main(argc, argv);
       	return 0;
 17:  5db7c8ab7f23 ! 22:  b2ac95a25af8 reftable: add dump utility
     @@ reftable/dump.c (new)
      +#include <unistd.h>
      +#include <string.h>
      +
     -+#include "reftable.h"
     ++#include "reftable-blocksource.h"
     ++#include "reftable-error.h"
     ++#include "reftable-merged.h"
     ++#include "reftable-record.h"
      +#include "reftable-tests.h"
     -+
     -+static uint32_t hash_id;
     -+
     -+static int dump_table(const char *tablename)
     -+{
     -+	struct reftable_block_source src = { 0 };
     -+	int err = reftable_block_source_from_file(&src, tablename);
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     -+	struct reftable_reader *r = NULL;
     -+
     -+	if (err < 0)
     -+		return err;
     -+
     -+	err = reftable_new_reader(&r, &src, tablename);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	err = reftable_reader_seek_ref(r, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+
     -+	while (1) {
     -+		err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_ref_record_print(&ref, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_ref_record_clear(&ref);
     -+
     -+	err = reftable_reader_seek_log(r, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+	while (1) {
     -+		err = reftable_iterator_next_log(&it, &log);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_log_record_print(&log, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_log_record_clear(&log);
     -+
     -+	reftable_reader_free(r);
     -+	return 0;
     -+}
     ++#include "reftable-writer.h"
     ++#include "reftable-iterator.h"
     ++#include "reftable-reader.h"
     ++#include "reftable-stack.h"
     ++#include "reftable-generic.h"
      +
      +static int compact_stack(const char *stackdir)
      +{
      +	struct reftable_stack *stack = NULL;
     -+	struct reftable_write_options cfg = {};
     ++	struct reftable_write_options cfg = { 0 };
      +
      +	int err = reftable_new_stack(&stack, stackdir, cfg);
      +	if (err < 0)
     @@ reftable/dump.c (new)
      +	if (err < 0)
      +		goto done;
      +done:
     -+	if (stack != NULL) {
     ++	if (stack) {
      +		reftable_stack_destroy(stack);
      +	}
      +	return err;
      +}
      +
     -+static int dump_stack(const char *stackdir)
     -+{
     -+	struct reftable_stack *stack = NULL;
     -+	struct reftable_write_options cfg = {};
     -+	struct reftable_iterator it = { 0 };
     -+	struct reftable_ref_record ref = { 0 };
     -+	struct reftable_log_record log = { 0 };
     -+	struct reftable_merged_table *merged = NULL;
     -+
     -+	int err = reftable_new_stack(&stack, stackdir, cfg);
     -+	if (err < 0)
     -+		return err;
     -+
     -+	merged = reftable_stack_merged_table(stack);
     -+
     -+	err = reftable_merged_table_seek_ref(merged, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+
     -+	while (1) {
     -+		err = reftable_iterator_next_ref(&it, &ref);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_ref_record_print(&ref, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_ref_record_clear(&ref);
     -+
     -+	err = reftable_merged_table_seek_log(merged, &it, "");
     -+	if (err < 0) {
     -+		return err;
     -+	}
     -+	while (1) {
     -+		err = reftable_iterator_next_log(&it, &log);
     -+		if (err > 0) {
     -+			break;
     -+		}
     -+		if (err < 0) {
     -+			return err;
     -+		}
     -+		reftable_log_record_print(&log, hash_id);
     -+	}
     -+	reftable_iterator_destroy(&it);
     -+	reftable_log_record_clear(&log);
     -+
     -+	reftable_stack_destroy(stack);
     -+	return 0;
     -+}
     -+
      +static void print_help(void)
      +{
      +	printf("usage: dump [-cst] arg\n\n"
     @@ reftable/dump.c (new)
      +	       "  -t dump table\n"
      +	       "  -s dump stack\n"
      +	       "  -h this help\n"
     -+	       "  -2 use SHA256\n"
      +	       "\n");
      +}
      +
      +int reftable_dump_main(int argc, char *const *argv)
      +{
      +	int err = 0;
     -+	int opt;
      +	int opt_dump_table = 0;
      +	int opt_dump_stack = 0;
      +	int opt_compact = 0;
     -+	const char *arg = NULL;
     -+	while ((opt = getopt(argc, argv, "2chts")) != -1) {
     -+		switch (opt) {
     -+		case '2':
     -+			hash_id = 0x73323536;
     ++	const char *arg = NULL, *argv0 = argv[0];
     ++
     ++	for (; argc > 1; argv++, argc--)
     ++		if (*argv[1] != '-')
      +			break;
     -+		case 't':
     ++		else if (!strcmp("-t", argv[1]))
      +			opt_dump_table = 1;
     -+			break;
     -+		case 's':
     ++		else if (!strcmp("-s", argv[1]))
      +			opt_dump_stack = 1;
     -+			break;
     -+		case 'c':
     ++		else if (!strcmp("-c", argv[1]))
      +			opt_compact = 1;
     -+			break;
     -+		case '?':
     -+		case 'h':
     ++		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
      +			print_help();
      +			return 2;
     -+			break;
      +		}
     -+	}
      +
     -+	if (argv[optind] == NULL) {
     ++	if (argc != 2) {
      +		fprintf(stderr, "need argument\n");
      +		print_help();
      +		return 2;
      +	}
      +
     -+	arg = argv[optind];
     ++	arg = argv[1];
      +
      +	if (opt_dump_table) {
     -+		err = dump_table(arg);
     ++		err = reftable_reader_print_file(arg);
      +	} else if (opt_dump_stack) {
     -+		err = dump_stack(arg);
     ++		err = reftable_stack_print_directory(arg);
      +	} else if (opt_compact) {
      +		err = compact_stack(arg);
      +	}
      +
      +	if (err < 0) {
     -+		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
     ++		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
      +			reftable_error_str(err));
      +		return 1;
      +	}
 18:  2dc73bf2ec96 ! 23:  2fd7cb8c0983 Reftable support for git-core
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Helped-by: Junio Hamano <gitster@pobox.com>
     +    Helped-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
          Co-authored-by: Jeff King <peff@peff.net>
      
       ## Documentation/config/extensions.txt ##
     @@ builtin/init-db.c: int cmd_init_db(int argc, const char **argv, const char *pref
      +		       initial_branch, ref_storage_format, flags);
       }
      
     + ## builtin/stash.c ##
     +@@ builtin/stash.c: static int get_stash_info(struct stash_info *info, int argc, const char **argv)
     + static int do_clear_stash(void)
     + {
     + 	struct object_id obj;
     ++	int result;
     + 	if (get_oid(ref_stash, &obj))
     + 		return 0;
     + 
     +-	return delete_ref(NULL, ref_stash, &obj, 0);
     ++	result = delete_ref(NULL, ref_stash, &obj, 0);
     ++
     ++	/* Ignore error; this is necessary for reftable, which keeps reflogs
     ++	 * even when refs are deleted. */
     ++	delete_reflog(ref_stash);
     ++	return result;
     + }
     + 
     + static int clear_stash(int argc, const char **argv, const char *prefix)
     +
       ## builtin/worktree.c ##
      @@
       #include "utf8.h"
     @@ refs.c
       
      +const char *default_ref_storage(void)
      +{
     -+	const char *test = getenv("GIT_TEST_REFTABLE");
     -+	return test ? "reftable" : "files";
     ++	return git_env_bool("GIT_TEST_REFTABLE", 0) ? "reftable" : "files";
      +}
      +
       /*
     @@ refs/reftable-backend.c (new)
      +static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
      +					const char *refname)
      +{
     -+	if (store->worktree_stack == NULL)
     ++	if (store->worktree_stack == NULL || refname == NULL)
      +		return store->main_stack;
      +
      +	switch (ref_type(refname)) {
     @@ refs/reftable-backend.c (new)
      +		ri->base.flags = 0;
      +		switch (ri->ref.value_type) {
      +		case REFTABLE_REF_VAL1:
     -+			hashcpy(ri->oid.hash, ri->ref.value.val1);
     ++			oidread(&ri->oid, ri->ref.value.val1);
      +			break;
      +		case REFTABLE_REF_VAL2:
     -+			hashcpy(ri->oid.hash, ri->ref.value.val2.value);
     ++			oidread(&ri->oid, ri->ref.value.val2.value);
      +			break;
      +		case REFTABLE_REF_SYMREF: {
      +			int out_flags = 0;
     @@ refs/reftable-backend.c (new)
      +	struct git_reftable_iterator *ri =
      +		(struct git_reftable_iterator *)ref_iterator;
      +	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
     -+		hashcpy(peeled->hash, ri->ref.value.val2.target_value);
     ++		oidread(peeled, ri->ref.value.val2.target_value);
      +		return 0;
      +	}
      +
     -+	return -1;
     ++	return 1;
      +}
      +
      +static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
     @@ refs/reftable-backend.c (new)
      +		(struct git_reftable_ref_store *)ref_store;
      +	struct reftable_addition *add = NULL;
      +	struct reftable_stack *stack =
     -+		transaction->nr ?
     -+			      stack_for(refs, transaction->updates[0]->refname) :
     -+			      refs->main_stack;
     ++		stack_for(refs,
     ++			  transaction->nr ? transaction->updates[0]->refname : NULL);
     ++
      +	int err = refs->err;
      +	if (err < 0) {
      +		goto done;
     @@ refs/reftable-backend.c (new)
      +		calloc(transaction->nr, sizeof(*logs));
      +	struct ref_update **sorted =
      +		malloc(transaction->nr * sizeof(struct ref_update *));
     ++	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
     ++	struct reftable_table tab = {NULL};
     ++	struct reftable_ref_record ref = {NULL};
     ++	reftable_table_from_merged_table(&tab, mt);
      +	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
      +	QSORT(sorted, transaction->nr, ref_update_cmp);
      +	reftable_writer_set_limits(writer, ts, ts);
     @@ refs/reftable-backend.c (new)
      +	for (i = 0; i < transaction->nr; i++) {
      +		struct ref_update *u = sorted[i];
      +		struct reftable_log_record *log = &logs[i];
     ++		struct object_id old_id;
      +		fill_reftable_log_record(log);
      +		log->update_index = ts;
      +		log->value_type = REFTABLE_LOG_UPDATE;
      +		log->refname = (char *)u->refname;
     -+		log->update.old_hash = u->old_oid.hash;
      +		log->update.new_hash = u->new_oid.hash;
      +		log->update.message = u->msg;
      +
     ++		err = reftable_table_read_ref(&tab, u->refname, &ref);
     ++		if (err < 0)
     ++			goto done;
     ++		else if (err > 0) {
     ++			old_id = null_oid;
     ++		} else {
     ++			oidread(&old_id, ref.value.val1);
     ++		}
     ++
     ++		/* XXX fold together with the old_id check below? */
     ++
     ++		log->update.old_hash = old_id.hash;
      +		if (u->flags & REF_LOG_ONLY) {
      +			continue;
      +		}
     @@ refs/reftable-backend.c (new)
      +
      +done:
      +	assert(err != REFTABLE_API_ERROR);
     ++	reftable_ref_record_release(&ref);
      +	free(logs);
      +	free(sorted);
      +	return err;
     @@ refs/reftable-backend.c (new)
      +	struct git_reftable_ref_store *refs =
      +		(struct git_reftable_ref_store *)ref_store;
      +	struct reftable_stack *stack =
     -+		stack_for(refs, refnames->items[0].string);
     ++		stack_for(refs, refnames->nr ? refnames->items[0].string : NULL);
      +	struct write_delete_refs_arg arg = {
      +		.stack = stack,
      +		.refnames = refnames,
     @@ refs/reftable-backend.c (new)
      +{
      +	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
      +	uint64_t ts = reftable_stack_next_update_index(arg->stack);
     -+	struct reftable_ref_record ref = { NULL };
     -+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref);
     ++	struct reftable_ref_record old_ref = { NULL };
     ++	struct reftable_ref_record new_ref = { NULL };
     ++	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref);
      +
      +	if (err) {
      +		goto done;
      +	}
      +
     -+	/* XXX do ref renames overwrite the target? */
     -+	if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) {
     ++	/* git-branch supports a --force, but the check is not atomic. */
     ++	if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) {
      +		goto done;
      +	}
      +
     @@ refs/reftable-backend.c (new)
      +				.update_index = ts,
      +				.value_type = REFTABLE_REF_DELETION,
      +			},
     -+			ref,
     ++			old_ref,
      +		};
      +		todo[1].update_index = ts;
      +		free(todo[1].refname);
     @@ refs/reftable-backend.c (new)
      +		}
      +	}
      +
     -+	if (reftable_ref_record_val1(&ref)) {
     -+		uint8_t *val1 = reftable_ref_record_val1(&ref);
     ++	if (reftable_ref_record_val1(&old_ref)) {
     ++		uint8_t *val1 = reftable_ref_record_val1(&old_ref);
      +		struct reftable_log_record todo[2] = { { NULL } };
      +		fill_reftable_log_record(&todo[0]);
      +		fill_reftable_log_record(&todo[1]);
     @@ refs/reftable-backend.c (new)
      +
      +done:
      +	assert(err != REFTABLE_API_ERROR);
     -+	reftable_ref_record_release(&ref);
     ++	reftable_ref_record_release(&new_ref);
     ++	reftable_ref_record_release(&old_ref);
     ++	return err;
     ++}
     ++
     ++static int write_copy_table(struct reftable_writer *writer, void *argv)
     ++{
     ++	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
     ++	uint64_t ts = reftable_stack_next_update_index(arg->stack);
     ++	struct reftable_ref_record old_ref = { NULL };
     ++	struct reftable_ref_record new_ref = { NULL };
     ++	struct reftable_log_record log = { NULL };
     ++	struct reftable_iterator it = { NULL };
     ++	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref);
     ++	if (err) {
     ++		goto done;
     ++	}
     ++
     ++	/* git-branch supports a --force, but the check is not atomic. */
     ++	if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) {
     ++		goto done;
     ++	}
     ++
     ++	reftable_writer_set_limits(writer, ts, ts);
     ++
     ++	FREE_AND_NULL(old_ref.refname);
     ++	old_ref.refname = xstrdup(arg->newname);
     ++	old_ref.update_index = ts;
     ++	err = reftable_writer_add_ref(writer, &old_ref);
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++
     ++	/* this copies the entire reflog history. Is this the right semantics? */
     ++	/* XXX should clear out existing reflog entries for oldname? */
     ++	err = reftable_merged_table_seek_log(reftable_stack_merged_table(arg->stack), &it, arg->oldname);
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++	while (1) {
     ++		int err = reftable_iterator_next_log(&it, &log);
     ++		if (err < 0) {
     ++			goto done;
     ++		}
     ++
     ++		if (err > 0 || strcmp(log.refname, arg->oldname)) {
     ++			break;
     ++		}
     ++		FREE_AND_NULL(log.refname);
     ++		log.refname = xstrdup(arg->newname);
     ++		reftable_writer_add_log(writer, &log);
     ++		reftable_log_record_release(&log);
     ++	}
     ++
     ++done:
     ++	assert(err != REFTABLE_API_ERROR);
     ++	reftable_ref_record_release(&new_ref);
     ++	reftable_ref_record_release(&old_ref);
     ++	reftable_log_record_release(&log);
     ++	reftable_iterator_destroy(&it);
      +	return err;
      +}
      +
     @@ refs/reftable-backend.c (new)
      +				 const char *oldrefname, const char *newrefname,
      +				 const char *logmsg)
      +{
     -+	BUG("reftable reference store does not support copying references");
     ++	struct git_reftable_ref_store *refs =
     ++		(struct git_reftable_ref_store *)ref_store;
     ++	struct reftable_stack *stack = stack_for(refs, newrefname);
     ++	struct write_rename_arg arg = {
     ++		.stack = stack,
     ++		.oldname = oldrefname,
     ++		.newname = newrefname,
     ++		.logmsg = logmsg,
     ++	};
     ++	int err = refs->err;
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++	err = reftable_stack_reload(stack);
     ++	if (err) {
     ++		goto done;
     ++	}
     ++
     ++	err = reftable_stack_add(stack, &write_copy_table, &arg);
     ++done:
     ++	assert(err != REFTABLE_API_ERROR);
     ++	return err;
      +}
      +
      +struct git_reftable_reflog_ref_iterator {
     @@ refs/reftable-backend.c (new)
      +
      +		free(ri->last_name);
      +		ri->last_name = xstrdup(ri->log.refname);
     -+		hashcpy(ri->oid.hash, ri->log.update.new_hash);
     ++		oidread(&ri->oid, ri->log.update.new_hash);
      +		return ITER_OK;
      +	}
      +}
     @@ refs/reftable-backend.c (new)
      +			break;
      +		}
      +
     -+		hashcpy(old_oid.hash, log.update.old_hash);
     -+		hashcpy(new_oid.hash, log.update.new_hash);
     ++		oidread(&old_oid, log.update.old_hash);
     ++		oidread(&new_oid, log.update.new_hash);
      +
      +		full_committer = fmt_ident(log.update.name, log.update.email,
      +					   WANT_COMMITTER_IDENT,
     @@ refs/reftable-backend.c (new)
      +		struct object_id new_oid;
      +		const char *full_committer = "";
      +
     -+		hashcpy(old_oid.hash, log->update.old_hash);
     -+		hashcpy(new_oid.hash, log->update.new_hash);
     ++		oidread(&old_oid, log->update.old_hash);
     ++		oidread(&new_oid, log->update.new_hash);
      +
      +		full_committer = fmt_ident(log->update.name, log->update.email,
      +					   WANT_COMMITTER_IDENT, NULL,
     @@ refs/reftable-backend.c (new)
      +	return 0;
      +}
      +
     -+static int git_reftable_delete_reflog(struct ref_store *ref_store,
     -+				      const char *refname)
     -+{
     -+	return 0;
     -+}
     -+
     -+struct reflog_expiry_arg {
     -+	struct git_reftable_ref_store *refs;
     ++struct write_reflog_delete_arg {
      +	struct reftable_stack *stack;
     -+	struct reftable_log_record *tombstones;
     -+	int len;
     -+	int cap;
     ++	const char *refname;
      +};
      +
     -+static void clear_log_tombstones(struct reflog_expiry_arg *arg)
     ++static int write_reflog_delete_table(struct reftable_writer *writer, void *argv)
      +{
     -+	int i = 0;
     -+	for (; i < arg->len; i++) {
     -+		reftable_log_record_release(&arg->tombstones[i]);
     ++	struct write_reflog_delete_arg *arg = argv;
     ++	struct reftable_merged_table *mt = reftable_stack_merged_table(arg->stack);
     ++	struct reftable_log_record log = { NULL };
     ++	struct reftable_iterator it = { NULL };
     ++	uint64_t ts = reftable_stack_next_update_index(arg->stack);
     ++	int err = reftable_merged_table_seek_log(mt, &it, arg->refname);
     ++
     ++	reftable_writer_set_limits(writer, ts, ts);
     ++	while (err == 0) {
     ++		struct reftable_log_record tombstone = {
     ++			.refname = (char*)arg->refname,
     ++			.update_index = REFTABLE_LOG_DELETION,
     ++		};
     ++		err = reftable_iterator_next_log(&it, &log);
     ++		if (err > 0) {
     ++			err = 0;
     ++			break;
     ++		}
     ++
     ++		if (err < 0 || strcmp(log.refname, arg->refname)) {
     ++			break;
     ++		}
     ++		tombstone.update_index = log.update_index;
     ++		err = reftable_writer_add_log(writer, &tombstone);
      +	}
      +
     -+	FREE_AND_NULL(arg->tombstones);
     ++	return err;
      +}
      +
     -+static void add_log_tombstone(struct reflog_expiry_arg *arg,
     -+			      const char *refname, uint64_t ts)
     ++
     ++static int git_reftable_delete_reflog(struct ref_store *ref_store,
     ++				      const char *refname)
      +{
     -+	struct reftable_log_record tombstone = {
     -+		.refname = xstrdup(refname),
     -+		.update_index = ts,
     ++	struct git_reftable_ref_store *refs =
     ++		(struct git_reftable_ref_store *)ref_store;
     ++	struct reftable_stack *stack = stack_for(refs, refname);
     ++	struct write_reflog_delete_arg arg = {
     ++		.stack = stack,
     ++		.refname = refname,
      +	};
     -+	if (arg->len == arg->cap) {
     -+		arg->cap = 2 * arg->cap + 1;
     -+		arg->tombstones =
     -+			realloc(arg->tombstones, arg->cap * sizeof(tombstone));
     -+	}
     -+	arg->tombstones[arg->len++] = tombstone;
     ++	int err = reftable_stack_add(stack, &write_reflog_delete_table, &arg);
     ++	assert(err != REFTABLE_API_ERROR);
     ++	return err;
      +}
      +
     ++struct reflog_expiry_arg {
     ++	struct reftable_stack *stack;
     ++	struct reftable_log_record *records;
     ++	int len;
     ++};
     ++
      +static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
      +{
      +	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
     @@ refs/reftable-backend.c (new)
      +	int i = 0;
      +	reftable_writer_set_limits(writer, ts, ts);
      +	for (i = 0; i < arg->len; i++) {
     -+		int err = reftable_writer_add_log(writer, &arg->tombstones[i]);
     ++		int err = reftable_writer_add_log(writer, &arg->records[i]);
      +		if (err) {
      +			return err;
      +		}
     @@ refs/reftable-backend.c (new)
      +	struct reftable_merged_table *mt = NULL;
      +	struct reflog_expiry_arg arg = {
      +		.stack = stack,
     -+		.refs = refs,
      +	};
     -+	struct reftable_log_record log = { NULL };
     ++	struct reftable_log_record *logs = NULL;
     ++	struct reftable_log_record *rewritten = NULL;
     ++	int logs_len = 0;
     ++	int logs_cap = 0;
     ++	int i = 0;
     ++	uint8_t *last_hash = NULL;
      +	struct reftable_iterator it = { NULL };
     ++	struct reftable_addition *add = NULL;
      +	int err = 0;
      +	if (refs->err < 0) {
      +		return refs->err;
     @@ refs/reftable-backend.c (new)
      +		goto done;
      +	}
      +
     ++	err = reftable_stack_new_addition(&add, stack);
     ++	if (err) {
     ++		goto done;
     ++	}
     ++	prepare_fn(refname, oid, policy_cb_data);
      +	while (1) {
     -+		struct object_id ooid;
     -+		struct object_id noid;
     -+
     ++		struct reftable_log_record log = {NULL};
      +		int err = reftable_iterator_next_log(&it, &log);
      +		if (err < 0) {
      +			goto done;
     @@ refs/reftable-backend.c (new)
      +		if (err > 0 || strcmp(log.refname, refname)) {
      +			break;
      +		}
     -+		hashcpy(ooid.hash, log.update.old_hash);
     -+		hashcpy(noid.hash, log.update.new_hash);
      +
     -+		if (should_prune_fn(&ooid, &noid, log.update.email,
     -+				    (timestamp_t)log.update.time,
     -+				    log.update.tz_offset, log.update.message,
     ++		if (logs_len >= logs_cap) {
     ++			int new_cap = logs_cap * 2 + 1;
     ++			logs = realloc(logs, new_cap * sizeof(*logs));
     ++			logs_cap = new_cap;
     ++		}
     ++		logs[logs_len++] = log;
     ++	}
     ++
     ++	rewritten = calloc(logs_len, sizeof(*rewritten));
     ++	for (i = logs_len-1; i >= 0; i--) {
     ++		struct object_id ooid;
     ++		struct object_id noid;
     ++		struct reftable_log_record *dest = &rewritten[i];
     ++
     ++		*dest = logs[i];
     ++		oidread(&ooid, logs[i].update.old_hash);
     ++		oidread(&noid, logs[i].update.new_hash);
     ++
     ++		if (should_prune_fn(&ooid, &noid, logs[i].update.email,
     ++				    (timestamp_t)logs[i].update.time,
     ++				    logs[i].update.tz_offset, logs[i].update.message,
      +				    policy_cb_data)) {
     -+			add_log_tombstone(&arg, refname, log.update_index);
     ++			dest->value_type = REFTABLE_LOG_DELETION;
     ++		} else {
     ++			if ((flags & EXPIRE_REFLOGS_REWRITE) && last_hash != NULL) {
     ++				dest->update.old_hash = last_hash;
     ++			}
     ++			last_hash = logs[i].update.new_hash;
      +		}
      +	}
     -+	err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg);
     ++
     ++	arg.records = rewritten;
     ++	arg.len = logs_len;
     ++	err = reftable_addition_add(add, &write_reflog_expiry_table, &arg);
     ++	if (err < 0) {
     ++		goto done;
     ++	}
     ++
     ++	if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
     ++		/* XXX - skip writing records that were not changed. */
     ++		err = reftable_addition_commit(add);
     ++	} else {
     ++		/* XXX - print something */
     ++	}
      +
      +done:
     ++	if (add) {
     ++		cleanup_fn(policy_cb_data);
     ++	}
      +	assert(err != REFTABLE_API_ERROR);
     -+	reftable_log_record_release(&log);
     ++	reftable_addition_destroy(add);
     ++	for (i = 0; i < logs_len; i++)
     ++		reftable_log_record_release(&logs[i]);
     ++	free(logs);
     ++	free(rewritten);
      +	reftable_iterator_destroy(&it);
     -+	clear_log_tombstones(&arg);
      +	return err;
      +}
      +
     @@ refs/reftable-backend.c (new)
      +		strbuf_addstr(referent, ref.value.symref);
      +		*type |= REF_ISSYMREF;
      +	} else if (reftable_ref_record_val1(&ref) != NULL) {
     -+		hashcpy(oid->hash, reftable_ref_record_val1(&ref));
     ++		oidread(oid, reftable_ref_record_val1(&ref));
      +	} else {
      +		*type |= REF_ISBROKEN;
      +		errno = EINVAL;
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
      +				xstrdup_or_null(repo_fmt.ref_storage);
      +		}
       	}
     - 
     - 	strbuf_release(&dir);
     + 	/*
     + 	 * Since precompose_string_if_needed() needs to look at
      
       ## t/t0031-reftable.sh (new) ##
      @@
     @@ t/t0031-reftable.sh (new)
      +	mv .git/hooks .git/hooks-disabled
      +}
      +
     ++write_script fake_editor <<\EOF
     ++echo "$MSG" >"$1"
     ++echo "$MSG" >&2
     ++EOF
     ++GIT_EDITOR=./fake_editor
     ++export GIT_EDITOR
     ++
     ++test_expect_success 'read existing old OID if REF_HAVE_OLD is not set' '
     ++	initialize &&
     ++	test_commit 1st &&
     ++	test_commit 2nd &&
     ++	MSG=b4 git notes add &&
     ++	MSG=b3 git notes edit  &&
     ++	echo b4 >expect &&
     ++	git notes --ref commits@{1} show >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'git reflog delete' '
     ++	initialize &&
     ++	test_commit file &&
     ++	test_commit file2 &&
     ++	test_commit file3 &&
     ++	test_commit file4 &&
     ++	git reflog delete HEAD@{1} &&
     ++	git reflog > output &&
     ++	! grep file3 output
     ++'
     ++
     ++test_expect_success 'branch -D delete nonexistent branch' '
     ++	initialize &&
     ++	test_commit file &&
     ++	test_must_fail git branch -D ../../my-private-file
     ++'
     ++
     ++test_expect_success 'branch copy' '
     ++	initialize &&
     ++	test_commit file1 &&
     ++	test_commit file2 &&
     ++	git branch src &&
     ++	git reflog src > expect &&
     ++	git branch -c src dst &&
     ++	git reflog dst | sed "s/dst/src/g" > actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'git stash' '
     ++	initialize &&
     ++	test_commit file &&
     ++	touch actual expected &&
     ++	git -c status.showStash=true status >expected &&
     ++	echo hoi >> file.t &&
     ++	git stash push -m stashed &&
     ++	git stash clear &&
     ++	git -c status.showStash=true status >actual &&
     ++	test_cmp expected actual
     ++'
     ++
     ++test_expect_success 'rename branch' '
     ++	initialize &&
     ++	git symbolic-ref HEAD refs/heads/before &&
     ++	test_commit file &&
     ++	git show-ref | sed s/before/after/g > expected &&
     ++	git branch -M after &&
     ++	git show-ref > actual &&
     ++	test_cmp expected actual
     ++'
     ++
      +test_expect_success 'SHA256 support, env' '
      +	rm -rf .git &&
      +	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
 19:  b2fa6ea62c16 = 24:  742d716e4510 git-prompt: prepare for reftable refs backend
 20:  39e644b9436b ! 25:  aca82763e72c Add "test-tool dump-reftable" command.
     @@ Commit message
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Makefile ##
     -@@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
     +@@ Makefile: REFTABLE_OBJS += reftable/writer.o
       
       REFTABLE_TEST_OBJS += reftable/basics_test.o
       REFTABLE_TEST_OBJS += reftable/block_test.o
     @@ Makefile: REFTABLE_OBJS += reftable/zlib-compat.o
       REFTABLE_TEST_OBJS += reftable/pq_test.o
       REFTABLE_TEST_OBJS += reftable/record_test.o
      
     - ## reftable/dump.c ##
     -@@ reftable/dump.c: license that can be found in the LICENSE file or at
     - #include <unistd.h>
     - #include <string.h>
     - 
     --#include "reftable.h"
     -+#include "reftable-blocksource.h"
     -+#include "reftable-error.h"
     -+#include "reftable-merged.h"
     -+#include "reftable-record.h"
     - #include "reftable-tests.h"
     -+#include "reftable-writer.h"
     -+#include "reftable-iterator.h"
     -+#include "reftable-reader.h"
     -+#include "reftable-stack.h"
     - 
     - static uint32_t hash_id;
     - 
     - static int dump_table(const char *tablename)
     - {
     --	struct reftable_block_source src = { 0 };
     -+	struct reftable_block_source src = { NULL };
     - 	int err = reftable_block_source_from_file(&src, tablename);
     --	struct reftable_iterator it = { 0 };
     --	struct reftable_ref_record ref = { 0 };
     --	struct reftable_log_record log = { 0 };
     -+	struct reftable_iterator it = { NULL };
     -+	struct reftable_ref_record ref = { NULL };
     -+	struct reftable_log_record log = { NULL };
     - 	struct reftable_reader *r = NULL;
     - 
     - 	if (err < 0)
     -@@ reftable/dump.c: static int dump_table(const char *tablename)
     - 		reftable_ref_record_print(&ref, hash_id);
     - 	}
     - 	reftable_iterator_destroy(&it);
     --	reftable_ref_record_clear(&ref);
     -+	reftable_ref_record_release(&ref);
     - 
     - 	err = reftable_reader_seek_log(r, &it, "");
     - 	if (err < 0) {
     -@@ reftable/dump.c: static int dump_table(const char *tablename)
     - 		reftable_log_record_print(&log, hash_id);
     - 	}
     - 	reftable_iterator_destroy(&it);
     --	reftable_log_record_clear(&log);
     -+	reftable_log_record_release(&log);
     - 
     - 	reftable_reader_free(r);
     - 	return 0;
     -@@ reftable/dump.c: static int dump_table(const char *tablename)
     - static int compact_stack(const char *stackdir)
     - {
     - 	struct reftable_stack *stack = NULL;
     --	struct reftable_write_options cfg = {};
     -+	struct reftable_write_options cfg = { 0 };
     - 
     - 	int err = reftable_new_stack(&stack, stackdir, cfg);
     - 	if (err < 0)
     -@@ reftable/dump.c: static int compact_stack(const char *stackdir)
     - static int dump_stack(const char *stackdir)
     - {
     - 	struct reftable_stack *stack = NULL;
     --	struct reftable_write_options cfg = {};
     --	struct reftable_iterator it = { 0 };
     --	struct reftable_ref_record ref = { 0 };
     --	struct reftable_log_record log = { 0 };
     -+	struct reftable_write_options cfg = { 0 };
     -+	struct reftable_iterator it = { NULL };
     -+	struct reftable_ref_record ref = { NULL };
     -+	struct reftable_log_record log = { NULL };
     - 	struct reftable_merged_table *merged = NULL;
     - 
     - 	int err = reftable_new_stack(&stack, stackdir, cfg);
     -@@ reftable/dump.c: static int dump_stack(const char *stackdir)
     - 		reftable_ref_record_print(&ref, hash_id);
     - 	}
     - 	reftable_iterator_destroy(&it);
     --	reftable_ref_record_clear(&ref);
     -+	reftable_ref_record_release(&ref);
     - 
     - 	err = reftable_merged_table_seek_log(merged, &it, "");
     - 	if (err < 0) {
     -@@ reftable/dump.c: static int dump_stack(const char *stackdir)
     - 		reftable_log_record_print(&log, hash_id);
     - 	}
     - 	reftable_iterator_destroy(&it);
     --	reftable_log_record_clear(&log);
     -+	reftable_log_record_release(&log);
     - 
     - 	reftable_stack_destroy(stack);
     - 	return 0;
     -@@ reftable/dump.c: static void print_help(void)
     - int reftable_dump_main(int argc, char *const *argv)
     - {
     - 	int err = 0;
     --	int opt;
     - 	int opt_dump_table = 0;
     - 	int opt_dump_stack = 0;
     - 	int opt_compact = 0;
     --	const char *arg = NULL;
     --	while ((opt = getopt(argc, argv, "2chts")) != -1) {
     --		switch (opt) {
     --		case '2':
     --			hash_id = 0x73323536;
     -+	const char *arg = NULL, *argv0 = argv[0];
     -+
     -+	for (; argc > 1; argv++, argc--)
     -+		if (*argv[1] != '-')
     - 			break;
     --		case 't':
     -+		else if (!strcmp("-2", argv[1]))
     -+			hash_id = 0x73323536;
     -+		else if (!strcmp("-t", argv[1]))
     - 			opt_dump_table = 1;
     --			break;
     --		case 's':
     -+		else if (!strcmp("-s", argv[1]))
     - 			opt_dump_stack = 1;
     --			break;
     --		case 'c':
     -+		else if (!strcmp("-c", argv[1]))
     - 			opt_compact = 1;
     --			break;
     --		case '?':
     --		case 'h':
     -+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
     - 			print_help();
     - 			return 2;
     --			break;
     - 		}
     --	}
     - 
     --	if (argv[optind] == NULL) {
     -+	if (argc != 2) {
     - 		fprintf(stderr, "need argument\n");
     - 		print_help();
     - 		return 2;
     - 	}
     - 
     --	arg = argv[optind];
     -+	arg = argv[1];
     - 
     - 	if (opt_dump_table) {
     - 		err = dump_table(arg);
     -@@ reftable/dump.c: int reftable_dump_main(int argc, char *const *argv)
     - 	}
     - 
     - 	if (err < 0) {
     --		fprintf(stderr, "%s: %s: %s\n", argv[0], arg,
     -+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
     - 			reftable_error_str(err));
     - 		return 1;
     - 	}
     -
       ## t/helper/test-reftable.c ##
      @@ t/helper/test-reftable.c: int cmd__reftable(int argc, const char **argv)
       	tree_test_main(argc, argv);
  -:  ------------ > 26:  ae88f41de9a9 t1301: document what needs to be done for REFTABLE
  -:  ------------ > 27:  0c1eb69900a7 t1401,t2011: parameterize HEAD.lock for REFTABLE
  -:  ------------ > 28:  dd8ffbf53157 t1404: annotate test cases with REFFILES

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 251+ messages in thread

* [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-20 18:47               ` Junio C Hamano
  2021-04-19 11:37             ` [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument Han-Wen Nienhuys via GitGitGadget
                               ` (27 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Before, the cached ref_iterator would return peel_object() output directly. This
led to spurious differences in the GIT_TRACE_REFS output, depending on the ref
storage backend active.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 refs.c               | 2 +-
 refs/ref-cache.c     | 2 +-
 refs/refs-internal.h | 3 +++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 261fd82beb98..8873854a44fb 100644
--- a/refs.c
+++ b/refs.c
@@ -2010,7 +2010,7 @@ int peel_iterated_oid(const struct object_id *base, struct object_id *peeled)
 	     oideq(current_ref_iter->oid, base)))
 		return ref_iterator_peel(current_ref_iter, peeled);
 
-	return peel_object(base, peeled);
+	return !!peel_object(base, peeled);
 }
 
 int refs_create_symref(struct ref_store *refs,
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index 46f1e5428433..703a12959e1f 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -491,7 +491,7 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
 static int cache_ref_iterator_peel(struct ref_iterator *ref_iterator,
 				   struct object_id *peeled)
 {
-	return peel_object(ref_iterator->oid, peeled);
+	return !!peel_object(ref_iterator->oid, peeled);
 }
 
 static int cache_ref_iterator_abort(struct ref_iterator *ref_iterator)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 467f4b3c936d..546a6b965dcc 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -453,6 +453,9 @@ void base_ref_iterator_free(struct ref_iterator *iter);
  */
 typedef int ref_iterator_advance_fn(struct ref_iterator *ref_iterator);
 
+/*
+ * Peels the current ref, returning 0 for success.
+ */
 typedef int ref_iterator_peel_fn(struct ref_iterator *ref_iterator,
 				 struct object_id *peeled);
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-20 19:34               ` Junio C Hamano
  2021-04-19 11:37             ` [PATCH v7 03/28] refs/debug: trace into reflog expiry too Han-Wen Nienhuys via GitGitGadget
                               ` (26 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 refs/refs-internal.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 546a6b965dcc..a31c1f465beb 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -592,6 +592,10 @@ typedef int reflog_exists_fn(struct ref_store *ref_store, const char *refname);
 typedef int create_reflog_fn(struct ref_store *ref_store, const char *refname,
 			     int force_create, struct strbuf *err);
 typedef int delete_reflog_fn(struct ref_store *ref_store, const char *refname);
+
+/*
+ * `flags` accepts a bitmask of `expire_reflog_flags`.
+ */
 typedef int reflog_expire_fn(struct ref_store *ref_store,
 			     const char *refname, const struct object_id *oid,
 			     unsigned int flags,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 03/28] refs/debug: trace into reflog expiry too
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-20 19:41               ` Junio C Hamano
  2021-04-19 11:37             ` [PATCH v7 04/28] hash.h: provide constants for the hash IDs Han-Wen Nienhuys via GitGitGadget
                               ` (25 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 refs/debug.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/refs/debug.c b/refs/debug.c
index 922e64fa6ad9..3b25e3aeb1ba 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -353,6 +353,40 @@ static int debug_delete_reflog(struct ref_store *ref_store, const char *refname)
 	return res;
 }
 
+struct debug_reflog_expiry_should_prune {
+	reflog_expiry_prepare_fn *prepare;
+	reflog_expiry_should_prune_fn *should_prune;
+	reflog_expiry_cleanup_fn *cleanup;
+	void *cb_data;
+};
+
+static void debug_reflog_expiry_prepare(const char *refname,
+				    const struct object_id *oid,
+				    void *cb_data)
+{
+	struct debug_reflog_expiry_should_prune *prune = cb_data;
+	trace_printf_key(&trace_refs, "reflog_expire_prepare: %s\n", refname);
+	prune->prepare(refname, oid, prune->cb_data);
+}
+
+static int debug_reflog_expiry_should_prune_fn(struct object_id *ooid,
+					       struct object_id *noid,
+					       const char *email,
+					       timestamp_t timestamp, int tz,
+					       const char *message, void *cb_data) {
+	struct debug_reflog_expiry_should_prune *prune = cb_data;
+
+	int result = prune->should_prune(ooid, noid, email, timestamp, tz, message, prune->cb_data);
+	trace_printf_key(&trace_refs, "reflog_expire_should_prune: %s %ld: %d\n", message, (long int) timestamp, result);
+	return result;
+}
+
+static void debug_reflog_expiry_cleanup(void *cb_data)
+{
+	struct debug_reflog_expiry_should_prune *prune = cb_data;
+	prune->cleanup(prune->cb_data);
+}
+
 static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
 			       const struct object_id *oid, unsigned int flags,
 			       reflog_expiry_prepare_fn prepare_fn,
@@ -361,10 +395,17 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
 			       void *policy_cb_data)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
+	struct debug_reflog_expiry_should_prune prune = {
+		.prepare = prepare_fn,
+		.cleanup = cleanup_fn,
+		.should_prune = should_prune_fn,
+		.cb_data = policy_cb_data,
+	};
 	int res = drefs->refs->be->reflog_expire(drefs->refs, refname, oid,
-						 flags, prepare_fn,
-						 should_prune_fn, cleanup_fn,
-						 policy_cb_data);
+						 flags, &debug_reflog_expiry_prepare,
+						 &debug_reflog_expiry_should_prune_fn,
+						 &debug_reflog_expiry_cleanup,
+						 &prune);
 	trace_printf_key(&trace_refs, "reflog_expire: %s: %d\n", refname, res);
 	return res;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 04/28] hash.h: provide constants for the hash IDs
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (2 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 03/28] refs/debug: trace into reflog expiry too Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-20 19:49               ` Junio C Hamano
  2021-04-19 11:37             ` [PATCH v7 05/28] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
                               ` (24 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This will simplify referencing them from code that is not deeply integrated with
Git, in particular, the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 hash.h        | 6 ++++++
 object-file.c | 6 ++----
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hash.h b/hash.h
index 3fb0c3d4005a..b17fb2927711 100644
--- a/hash.h
+++ b/hash.h
@@ -161,12 +161,18 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
 	return p - hash_algos;
 }
 
+/* "sha1", big-endian */
+#define GIT_SHA1_HASH_ID 0x73686131
+
 /* The length in bytes and in hex digits of an object name (SHA-1 value). */
 #define GIT_SHA1_RAWSZ 20
 #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
 /* The block size of SHA-1. */
 #define GIT_SHA1_BLKSZ 64
 
+/* "s256", big-endian */
+#define GIT_SHA256_HASH_ID 0x73323536
+
 /* The length in bytes and in hex digits of an object name (SHA-256 value). */
 #define GIT_SHA256_RAWSZ 32
 #define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
diff --git a/object-file.c b/object-file.c
index 624af408cdcd..5af384a8cfc4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -146,8 +146,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
 	},
 	{
 		"sha1",
-		/* "sha1", big-endian */
-		0x73686131,
+		GIT_SHA1_HASH_ID,
 		GIT_SHA1_RAWSZ,
 		GIT_SHA1_HEXSZ,
 		GIT_SHA1_BLKSZ,
@@ -160,8 +159,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
 	},
 	{
 		"sha256",
-		/* "s256", big-endian */
-		0x73323536,
+		GIT_SHA256_HASH_ID,
 		GIT_SHA256_RAWSZ,
 		GIT_SHA256_HEXSZ,
 		GIT_SHA256_BLKSZ,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 05/28] init-db: set the_repository->hash_algo early on
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (3 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 04/28] hash.h: provide constants for the hash IDs Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 06/28] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
                               ` (23 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of hash value, but this is an exceptional circumstance, and
the reftable library does not cater to this exception. It insists that all
tables in the stack have a consistent format ID for the hash algorithm.

Call set_repo_hash_algo directly after calling validate_hash_algorithm() (which
reads $GIT_DEFAULT_HASH).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index c19b35f1e691..ff06fe4b9a49 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -424,6 +424,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt,
 				      flags & INIT_DB_QUIET);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 06/28] reftable: add LICENSE
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (4 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 05/28] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-21  7:48               ` Ævar Arnfjörð Bjarmason
  2021-04-19 11:37             ` [PATCH v7 07/28] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
                               ` (22 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 000000000000..402e0f9356ba
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 07/28] reftable: add error related functionality
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (5 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 06/28] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 08/28] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
                               ` (21 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 000000000000..f6f16def9214
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 000000000000..6f89bedf1a58
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 * next_update_inde*x
+	 *  - on writing a log record with multiline message with
+	 *  exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 08/28] reftable: utility functions
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (6 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 07/28] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 09/28] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
                               ` (20 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This commit provides basic utility classes for the reftable library.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                            |  25 +++++-
 contrib/buildsystems/CMakeLists.txt |  14 ++-
 reftable/basics.c                   | 128 ++++++++++++++++++++++++++++
 reftable/basics.h                   |  60 +++++++++++++
 reftable/basics_test.c              |  98 +++++++++++++++++++++
 reftable/publicbasics.c             |  58 +++++++++++++
 reftable/reftable-malloc.h          |  18 ++++
 reftable/reftable-tests.h           |  22 +++++
 reftable/system.h                   |  24 ++++++
 reftable/test_framework.c           |  23 +++++
 reftable/test_framework.h           |  53 ++++++++++++
 t/helper/test-reftable.c            |   9 ++
 t/helper/test-tool.c                |   3 +-
 t/helper/test-tool.h                |   1 +
 t/t0032-reftable-unittest.sh        |  15 ++++
 15 files changed, 545 insertions(+), 6 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index 21c0bf16672b..6f6143672333 100644
--- a/Makefile
+++ b/Makefile
@@ -736,6 +736,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -813,6 +814,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1181,7 +1184,7 @@ THIRD_PARTY_SOURCES += compat/regex/%
 THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%
 
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
+GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2398,7 +2401,15 @@ XDIFF_OBJS += xdiff/xutils.o
 .PHONY: xdiff-objs
 xdiff-objs: $(XDIFF_OBJS)
 
+REFTABLE_OBJS += reftable/basics.o
+REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/publicbasics.o
+
+REFTABLE_TEST_OBJS += reftable/test_framework.o
+REFTABLE_TEST_OBJS += reftable/basics_test.o
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
+
 .PHONY: test-objs
 test-objs: $(TEST_OBJS)
 
@@ -2414,6 +2425,8 @@ OBJECTS += $(PROGRAM_OBJS)
 OBJECTS += $(TEST_OBJS)
 OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
+OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
+
 ifndef NO_CURL
 	OBJECTS += http.o http-walker.o remote-curl.o
 endif
@@ -2565,6 +2578,12 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
 
+$(REFTABLE_LIB): $(REFTABLE_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
 export DEFAULT_EDITOR DEFAULT_PAGER
 
 Documentation/GIT-EXCLUDED-PROGRAMS: FORCE
@@ -2845,7 +2864,7 @@ perf: all
 
 t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
-t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS)
+t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS)
 
 check-sha1:: t/helper/test-tool$X
@@ -3175,7 +3194,7 @@ cocciclean:
 clean: profile-clean coverage-clean cocciclean
 	$(RM) *.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 75ed198a6a36..f41d3dab9015 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -611,6 +611,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS")
 list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
 add_library(xdiff STATIC ${libxdiff_SOURCES})
 
+#reftable
+parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS")
+
+list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+add_library(reftable STATIC ${reftable_SOURCES})
+
 if(WIN32)
 	if(NOT MSVC)#use windres when compiling with gcc and clang
 		add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res
@@ -633,7 +639,7 @@ endif()
 #link all required libraries to common-main
 add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c)
 
-target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES})
+target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES})
 if(Intl_FOUND)
 	target_link_libraries(common-main ${Intl_LIBRARIES})
 endif()
@@ -873,11 +879,15 @@ if(BUILD_TESTING)
 add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c)
 target_link_libraries(test-fake-ssh common-main)
 
+#reftable-tests
+parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS")
+list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/")
+
 #test-tool
 parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS")
 
 list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/")
-add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES})
+add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES})
 target_link_libraries(test-tool common-main)
 
 set_target_properties(test-fake-ssh test-tool
diff --git a/reftable/basics.c b/reftable/basics.c
new file mode 100644
index 000000000000..f761e48028c4
--- /dev/null
+++ b/reftable/basics.c
@@ -0,0 +1,128 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* Invariants:
+	 *
+	 *  (hi == sz) || f(hi) == true
+	 *  (lo == 0 && f(0) == true) || fi(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		if (f(mid, args))
+			hi = mid;
+		else
+			lo = mid;
+	}
+
+	if (lo)
+		return hi;
+
+	return f(0, args) ? 0 : 1;
+}
+
+void free_names(char **a)
+{
+	char **p;
+	if (!a) {
+		return;
+	}
+	for (p = a; *p; p++) {
+		reftable_free(*p);
+	}
+	reftable_free(a);
+}
+
+int names_length(char **names)
+{
+	char **p = names;
+	for (; *p; p++) {
+		/* empty */
+	}
+	return p - names;
+}
+
+void parse_names(char *buf, int size, char ***namesp)
+{
+	char **names = NULL;
+	size_t names_cap = 0;
+	size_t names_len = 0;
+
+	char *p = buf;
+	char *end = buf + size;
+	while (p < end) {
+		char *next = strchr(p, '\n');
+		if (next && next < end) {
+			*next = 0;
+		} else {
+			next = end;
+		}
+		if (p < next) {
+			if (names_len == names_cap) {
+				names_cap = 2 * names_cap + 1;
+				names = reftable_realloc(
+					names, names_cap * sizeof(*names));
+			}
+			names[names_len++] = xstrdup(p);
+		}
+		p = next + 1;
+	}
+
+	names = reftable_realloc(names, (names_len + 1) * sizeof(*names));
+	names[names_len] = NULL;
+	*namesp = names;
+}
+
+int names_equal(char **a, char **b)
+{
+	int i = 0;
+	for (; a[i] && b[i]; i++) {
+		if (strcmp(a[i], b[i])) {
+			return 0;
+		}
+	}
+
+	return a[i] == b[i];
+}
+
+int common_prefix_size(struct strbuf *a, struct strbuf *b)
+{
+	int p = 0;
+	for (; p < a->len && p < b->len; p++) {
+		if (a->buf[p] != b->buf[p])
+			break;
+	}
+
+	return p;
+}
diff --git a/reftable/basics.h b/reftable/basics.h
new file mode 100644
index 000000000000..096b36862b9f
--- /dev/null
+++ b/reftable/basics.h
@@ -0,0 +1,60 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BASICS_H
+#define BASICS_H
+
+/*
+ * miscellaneous utilities that are not provided by Git.
+ */
+
+#include "system.h"
+
+/* Bigendian en/decoding of integers */
+
+void put_be24(uint8_t *out, uint32_t i);
+uint32_t get_be24(uint8_t *in);
+void put_be16(uint8_t *out, uint16_t i);
+
+/*
+ * find smallest index i in [0, sz) at which f(i) is true, assuming
+ * that f is ascending. Return sz if f(i) is false for all indices.
+ *
+ * Contrary to bsearch(3), this returns something useful if the argument is not
+ * found.
+ */
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args);
+
+/*
+ * Frees a NULL terminated array of malloced strings. The array itself is also
+ * freed.
+ */
+void free_names(char **a);
+
+/* parse a newline separated list of names. `size` is the length of the buffer,
+ * without terminating '\0'. Empty names are discarded. */
+void parse_names(char *buf, int size, char ***namesp);
+
+/* compares two NULL-terminated arrays of strings. */
+int names_equal(char **a, char **b);
+
+/* returns the array size of a NULL-terminated array of strings. */
+int names_length(char **names);
+
+/* Allocation routines; they invoke the functions set through
+ * reftable_set_alloc() */
+void *reftable_malloc(size_t sz);
+void *reftable_realloc(void *p, size_t sz);
+void reftable_free(void *p);
+void *reftable_calloc(size_t sz);
+
+/* Find the longest shared prefix size of `a` and `b` */
+struct strbuf;
+int common_prefix_size(struct strbuf *a, struct strbuf *b);
+
+#endif
diff --git a/reftable/basics_test.c b/reftable/basics_test.c
new file mode 100644
index 000000000000..1fcd22972567
--- /dev/null
+++ b/reftable/basics_test.c
@@ -0,0 +1,98 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct binsearch_args {
+	int key;
+	int *arr;
+};
+
+static int binsearch_func(size_t i, void *void_args)
+{
+	struct binsearch_args *args = void_args;
+
+	return args->key < args->arr[i];
+}
+
+static void test_binsearch(void)
+{
+	int arr[] = { 2, 4, 6, 8, 10 };
+	size_t sz = ARRAY_SIZE(arr);
+	struct binsearch_args args = {
+		.arr = arr,
+	};
+
+	int i = 0;
+	for (i = 1; i < 11; i++) {
+		int res;
+		args.key = i;
+		res = binsearch(sz, &binsearch_func, &args);
+
+		if (res < sz) {
+			EXPECT(args.key < arr[res]);
+			if (res > 0) {
+				EXPECT(args.key >= arr[res - 1]);
+			}
+		} else {
+			EXPECT(args.key == 10 || args.key == 11);
+		}
+	}
+}
+
+static void test_names_length(void)
+{
+	char *a[] = { "a", "b", NULL };
+	EXPECT(names_length(a) == 2);
+}
+
+static void test_parse_names_normal(void)
+{
+	char in[] = "a\nb\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(!strcmp(out[1], "b"));
+	EXPECT(!out[2]);
+	free_names(out);
+}
+
+static void test_parse_names_drop_empty(void)
+{
+	char in[] = "a\n\n";
+	char **out = NULL;
+	parse_names(in, strlen(in), &out);
+	EXPECT(!strcmp(out[0], "a"));
+	EXPECT(!out[1]);
+	free_names(out);
+}
+
+static void test_common_prefix(void)
+{
+	struct strbuf s1 = STRBUF_INIT;
+	struct strbuf s2 = STRBUF_INIT;
+	strbuf_addstr(&s1, "abcdef");
+	strbuf_addstr(&s2, "abc");
+	EXPECT(common_prefix_size(&s1, &s2) == 3);
+	strbuf_release(&s1);
+	strbuf_release(&s2);
+}
+
+int basics_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_parse_names_normal);
+	RUN_TEST(test_parse_names_drop_empty);
+	RUN_TEST(test_binsearch);
+	RUN_TEST(test_names_length);
+	return 0;
+}
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
new file mode 100644
index 000000000000..fd418cfaefd1
--- /dev/null
+++ b/reftable/publicbasics.c
@@ -0,0 +1,58 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-malloc.h"
+
+#include "basics.h"
+#include "system.h"
+
+static void *(*reftable_malloc_ptr)(size_t sz) = &malloc;
+static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc;
+static void (*reftable_free_ptr)(void *) = &free;
+
+void *reftable_malloc(size_t sz)
+{
+	return (*reftable_malloc_ptr)(sz);
+}
+
+void *reftable_realloc(void *p, size_t sz)
+{
+	return (*reftable_realloc_ptr)(p, sz);
+}
+
+void reftable_free(void *p)
+{
+	reftable_free_ptr(p);
+}
+
+void *reftable_calloc(size_t sz)
+{
+	void *p = reftable_malloc(sz);
+	memset(p, 0, sz);
+	return p;
+}
+
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *))
+{
+	reftable_malloc_ptr = malloc;
+	reftable_realloc_ptr = realloc;
+	reftable_free_ptr = free;
+}
+
+int hash_size(uint32_t id)
+{
+	switch (id) {
+	case 0:
+	case GIT_SHA1_HASH_ID:
+		return GIT_SHA1_RAWSZ;
+	case GIT_SHA256_HASH_ID:
+		return GIT_SHA256_RAWSZ;
+	}
+	abort();
+}
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
new file mode 100644
index 000000000000..5f2185f1f343
--- /dev/null
+++ b/reftable/reftable-malloc.h
@@ -0,0 +1,18 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_H
+#define REFTABLE_H
+
+#include <stddef.h>
+
+/* Overrides the functions to use for memory management. */
+void reftable_set_alloc(void *(*malloc)(size_t),
+			void *(*realloc)(void *, size_t), void (*free)(void *));
+
+#endif
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
new file mode 100644
index 000000000000..5e7698ae654e
--- /dev/null
+++ b/reftable/reftable-tests.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_TESTS_H
+#define REFTABLE_TESTS_H
+
+int basics_test_main(int argc, const char **argv);
+int block_test_main(int argc, const char **argv);
+int merged_test_main(int argc, const char **argv);
+int record_test_main(int argc, const char **argv);
+int refname_test_main(int argc, const char **argv);
+int reftable_test_main(int argc, const char **argv);
+int stack_test_main(int argc, const char **argv);
+int tree_test_main(int argc, const char **argv);
+int reftable_dump_main(int argc, char *const *argv);
+
+#endif
diff --git a/reftable/system.h b/reftable/system.h
new file mode 100644
index 000000000000..bf963ee458e4
--- /dev/null
+++ b/reftable/system.h
@@ -0,0 +1,24 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef SYSTEM_H
+#define SYSTEM_H
+
+// This header glues the reftable library to the rest of Git
+
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "hash.h" /* hash ID, sizes.*/
+#include "dir.h" /* remove_dir_recursively, for tests.*/
+
+#include <zlib.h>
+
+struct strbuf;
+int hash_size(uint32_t id);
+
+#endif
diff --git a/reftable/test_framework.c b/reftable/test_framework.c
new file mode 100644
index 000000000000..0932166f2d11
--- /dev/null
+++ b/reftable/test_framework.c
@@ -0,0 +1,23 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "test_framework.h"
+
+#include "basics.h"
+
+void set_test_hash(uint8_t *p, int i)
+{
+	memset(p, (uint8_t)i, hash_size(GIT_SHA1_HASH_ID));
+}
+
+ssize_t strbuf_add_void(void *b, const void *data, size_t sz)
+{
+	strbuf_add(b, data, sz);
+	return sz;
+}
diff --git a/reftable/test_framework.h b/reftable/test_framework.h
new file mode 100644
index 000000000000..774cb275bf67
--- /dev/null
+++ b/reftable/test_framework.h
@@ -0,0 +1,53 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TEST_FRAMEWORK_H
+#define TEST_FRAMEWORK_H
+
+#include "system.h"
+#include "reftable-error.h"
+
+#define EXPECT_ERR(c)                                                  \
+	if (c != 0) {                                                  \
+		fflush(stderr);                                        \
+		fflush(stdout);                                        \
+		fprintf(stderr, "%s: %d: error == %d (%s), want 0\n",  \
+			__FILE__, __LINE__, c, reftable_error_str(c)); \
+		abort();                                               \
+	}
+
+#define EXPECT_STREQ(a, b)                                               \
+	if (strcmp(a, b)) {                                              \
+		fflush(stderr);                                          \
+		fflush(stdout);                                          \
+		fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \
+			__LINE__, #a, a, #b, b);                         \
+		abort();                                                 \
+	}
+
+#define EXPECT(c)                                                          \
+	if (!(c)) {                                                        \
+		fflush(stderr);                                            \
+		fflush(stdout);                                            \
+		fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \
+			__LINE__, #c);                                     \
+		abort();                                                   \
+	}
+
+#define RUN_TEST(f)                          \
+	fprintf(stderr, "running %s\n", #f); \
+	fflush(stderr);                      \
+	f();
+
+void set_test_hash(uint8_t *p, int i);
+
+/* Like strbuf_add, but suitable for passing to reftable_new_writer
+ */
+ssize_t strbuf_add_void(void *b, const void *data, size_t sz);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
new file mode 100644
index 000000000000..3b58e423e7b1
--- /dev/null
+++ b/t/helper/test-reftable.c
@@ -0,0 +1,9 @@
+#include "reftable/reftable-tests.h"
+#include "test-tool.h"
+
+int cmd__reftable(int argc, const char **argv)
+{
+	basics_test_main(argc, argv);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 25c6a37e93e5..bea3285eed9f 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -50,13 +50,14 @@ static struct test_cmd cmds[] = {
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
 	{ "prio-queue", cmd__prio_queue },
-	{ "proc-receive", cmd__proc_receive},
+	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
+	{ "reftable", cmd__reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index f03c5988b20c..90f935310788 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -46,6 +46,7 @@ int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
+int cmd__reftable(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
 int cmd__revision_walking(int argc, const char **argv);
diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh
new file mode 100755
index 000000000000..0ed14971a580
--- /dev/null
+++ b/t/t0032-reftable-unittest.sh
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable unittests'
+
+. ./test-lib.sh
+
+test_expect_success 'unittests' '
+	TMPDIR=$(pwd) && export TMPDIR &&
+	test-tool reftable
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 09/28] reftable: add blocksource, an abstraction for random access reads
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (7 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 08/28] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
                               ` (19 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:

* log blocks are zlib compressed, and handling them is simplified if we can
  discard byte segments from within the block layer.

* for unittests, it is useful to read and write in-memory. The blocksource
  allows us to abstract the data away from on-disk files.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                        |   1 +
 reftable/blocksource.c          | 148 ++++++++++++++++++++++++++++++++
 reftable/blocksource.h          |  22 +++++
 reftable/reftable-blocksource.h |  49 +++++++++++
 4 files changed, 220 insertions(+)
 create mode 100644 reftable/blocksource.c
 create mode 100644 reftable/blocksource.h
 create mode 100644 reftable/reftable-blocksource.h

diff --git a/Makefile b/Makefile
index 6f6143672333..189bf572661b 100644
--- a/Makefile
+++ b/Makefile
@@ -2403,6 +2403,7 @@ xdiff-objs: $(XDIFF_OBJS)
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
new file mode 100644
index 000000000000..0044eecd9aa3
--- /dev/null
+++ b/reftable/blocksource.c
@@ -0,0 +1,148 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+
+static void strbuf_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void strbuf_close(void *b)
+{
+}
+
+static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			     uint32_t size)
+{
+	struct strbuf *b = v;
+	assert(off + size <= b->len);
+	dest->data = reftable_calloc(size);
+	memcpy(dest->data, b->buf + off, size);
+	dest->len = size;
+	return size;
+}
+
+static uint64_t strbuf_size(void *b)
+{
+	return ((struct strbuf *)b)->len;
+}
+
+static struct reftable_block_source_vtable strbuf_vtable = {
+	.size = &strbuf_size,
+	.read_block = &strbuf_read_block,
+	.return_block = &strbuf_return_block,
+	.close = &strbuf_close,
+};
+
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf)
+{
+	assert(!bs->ops);
+	bs->ops = &strbuf_vtable;
+	bs->arg = buf;
+}
+
+static void malloc_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static struct reftable_block_source_vtable malloc_vtable = {
+	.return_block = &malloc_return_block,
+};
+
+static struct reftable_block_source malloc_block_source_instance = {
+	.ops = &malloc_vtable,
+};
+
+struct reftable_block_source malloc_block_source(void)
+{
+	return malloc_block_source_instance;
+}
+
+struct file_block_source {
+	int fd;
+	uint64_t size;
+};
+
+static uint64_t file_size(void *b)
+{
+	return ((struct file_block_source *)b)->size;
+}
+
+static void file_return_block(void *b, struct reftable_block *dest)
+{
+	memset(dest->data, 0xff, dest->len);
+	reftable_free(dest->data);
+}
+
+static void file_close(void *b)
+{
+	int fd = ((struct file_block_source *)b)->fd;
+	if (fd > 0) {
+		close(fd);
+		((struct file_block_source *)b)->fd = 0;
+	}
+
+	reftable_free(b);
+}
+
+static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
+			   uint32_t size)
+{
+	struct file_block_source *b = v;
+	assert(off + size <= b->size);
+	dest->data = reftable_malloc(size);
+	if (pread(b->fd, dest->data, size, off) != size)
+		return -1;
+	dest->len = size;
+	return size;
+}
+
+static struct reftable_block_source_vtable file_vtable = {
+	.size = &file_size,
+	.read_block = &file_read_block,
+	.return_block = &file_return_block,
+	.close = &file_close,
+};
+
+int reftable_block_source_from_file(struct reftable_block_source *bs,
+				    const char *name)
+{
+	struct stat st = { 0 };
+	int err = 0;
+	int fd = open(name, O_RDONLY);
+	struct file_block_source *p = NULL;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			return REFTABLE_NOT_EXIST_ERROR;
+		}
+		return -1;
+	}
+
+	err = fstat(fd, &st);
+	if (err < 0)
+		return -1;
+
+	p = reftable_calloc(sizeof(struct file_block_source));
+	p->size = st.st_size;
+	p->fd = fd;
+
+	assert(!bs->ops);
+	bs->ops = &file_vtable;
+	bs->arg = p;
+	return 0;
+}
diff --git a/reftable/blocksource.h b/reftable/blocksource.h
new file mode 100644
index 000000000000..072e2727ad20
--- /dev/null
+++ b/reftable/blocksource.h
@@ -0,0 +1,22 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCKSOURCE_H
+#define BLOCKSOURCE_H
+
+#include "system.h"
+
+struct reftable_block_source;
+
+/* Create an in-memory block source for reading reftables */
+void block_source_from_strbuf(struct reftable_block_source *bs,
+			      struct strbuf *buf);
+
+struct reftable_block_source malloc_block_source(void);
+
+#endif
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
new file mode 100644
index 000000000000..5aa3990a5732
--- /dev/null
+++ b/reftable/reftable-blocksource.h
@@ -0,0 +1,49 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_BLOCKSOURCE_H
+#define REFTABLE_BLOCKSOURCE_H
+
+#include <stdint.h>
+
+/* block_source is a generic wrapper for a seekable readable file.
+ */
+struct reftable_block_source {
+	struct reftable_block_source_vtable *ops;
+	void *arg;
+};
+
+/* a contiguous segment of bytes. It keeps track of its generating block_source
+ * so it can return itself into the pool. */
+struct reftable_block {
+	uint8_t *data;
+	int len;
+	struct reftable_block_source source;
+};
+
+/* block_source_vtable are the operations that make up block_source */
+struct reftable_block_source_vtable {
+	/* returns the size of a block source */
+	uint64_t (*size)(void *source);
+
+	/* reads a segment from the block source. It is an error to read
+	   beyond the end of the block */
+	int (*read_block)(void *source, struct reftable_block *dest,
+			  uint64_t off, uint32_t size);
+	/* mark the block as read; may return the data back to malloc */
+	void (*return_block)(void *source, struct reftable_block *blockp);
+
+	/* release all resources associated with the block source */
+	void (*close)(void *source);
+};
+
+/* opens a file on the file system as a block_source */
+int reftable_block_source_from_file(struct reftable_block_source *block_src,
+				    const char *name);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type.
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (8 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 09/28] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-05-04 17:23               ` Andrzej Hunt
  2021-04-19 11:37             ` [PATCH v7 11/28] Provide zlib's uncompress2 from compat/zlib-compat.c Han-Wen Nienhuys via GitGitGadget
                               ` (18 subsequent siblings)
  28 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |    2 +
 reftable/constants.h       |   21 +
 reftable/record.c          | 1202 ++++++++++++++++++++++++++++++++++++
 reftable/record.h          |  139 +++++
 reftable/record_test.c     |  407 ++++++++++++
 reftable/reftable-record.h |  114 ++++
 t/helper/test-reftable.c   |    2 +-
 7 files changed, 1886 insertions(+), 1 deletion(-)
 create mode 100644 reftable/constants.h
 create mode 100644 reftable/record.c
 create mode 100644 reftable/record.h
 create mode 100644 reftable/record_test.c
 create mode 100644 reftable/reftable-record.h

diff --git a/Makefile b/Makefile
index 189bf572661b..d6bc336c5565 100644
--- a/Makefile
+++ b/Makefile
@@ -2405,7 +2405,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 
diff --git a/reftable/constants.h b/reftable/constants.h
new file mode 100644
index 000000000000..5eee72c4c113
--- /dev/null
+++ b/reftable/constants.h
@@ -0,0 +1,21 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef CONSTANTS_H
+#define CONSTANTS_H
+
+#define BLOCK_TYPE_LOG 'g'
+#define BLOCK_TYPE_INDEX 'i'
+#define BLOCK_TYPE_REF 'r'
+#define BLOCK_TYPE_OBJ 'o'
+#define BLOCK_TYPE_ANY 0
+
+#define MAX_RESTARTS ((1 << 16) - 1)
+#define DEFAULT_BLOCK_SIZE 4096
+
+#endif
diff --git a/reftable/record.c b/reftable/record.c
new file mode 100644
index 000000000000..b111852b667e
--- /dev/null
+++ b/reftable/record.c
@@ -0,0 +1,1202 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+/* record.c - methods for different types of records. */
+
+#include "record.h"
+
+#include "system.h"
+#include "constants.h"
+#include "reftable-error.h"
+#include "basics.h"
+
+int get_var_int(uint64_t *dest, struct string_view *in)
+{
+	int ptr = 0;
+	uint64_t val;
+
+	if (in->len == 0)
+		return -1;
+	val = in->buf[ptr] & 0x7f;
+
+	while (in->buf[ptr] & 0x80) {
+		ptr++;
+		if (ptr > in->len) {
+			return -1;
+		}
+		val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f);
+	}
+
+	*dest = val;
+	return ptr + 1;
+}
+
+int put_var_int(struct string_view *dest, uint64_t val)
+{
+	uint8_t buf[10] = { 0 };
+	int i = 9;
+	int n = 0;
+	buf[i] = (uint8_t)(val & 0x7f);
+	i--;
+	while (1) {
+		val >>= 7;
+		if (!val) {
+			break;
+		}
+		val--;
+		buf[i] = 0x80 | (uint8_t)(val & 0x7f);
+		i--;
+	}
+
+	n = sizeof(buf) - i - 1;
+	if (dest->len < n)
+		return -1;
+	memcpy(dest->buf, &buf[i + 1], n);
+	return n;
+}
+
+int reftable_is_block_type(uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+	case BLOCK_TYPE_LOG:
+	case BLOCK_TYPE_OBJ:
+	case BLOCK_TYPE_INDEX:
+		return 1;
+	}
+	return 0;
+}
+
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL1:
+		return rec->value.val1;
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.value;
+	default:
+		return NULL;
+	}
+}
+
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec)
+{
+	switch (rec->value_type) {
+	case REFTABLE_REF_VAL2:
+		return rec->value.val2.target_value;
+	default:
+		return NULL;
+	}
+}
+
+static int decode_string(struct strbuf *dest, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t tsize = 0;
+	int n = get_var_int(&tsize, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+	if (in.len < tsize)
+		return -1;
+
+	strbuf_reset(dest);
+	strbuf_add(dest, in.buf, tsize);
+	string_view_consume(&in, tsize);
+
+	return start_len - in.len;
+}
+
+static int encode_string(char *str, struct string_view s)
+{
+	struct string_view start = s;
+	int l = strlen(str);
+	int n = put_var_int(&s, l);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+	if (s.len < l)
+		return -1;
+	memcpy(s.buf, str, l);
+	string_view_consume(&s, l);
+
+	return start.len - s.len;
+}
+
+int reftable_encode_key(int *restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra)
+{
+	struct string_view start = dest;
+	int prefix_len = common_prefix_size(&prev_key, &key);
+	uint64_t suffix_len = key.len - prefix_len;
+	int n = put_var_int(&dest, (uint64_t)prefix_len);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	*restart = (prefix_len == 0);
+
+	n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra);
+	if (n < 0)
+		return -1;
+	string_view_consume(&dest, n);
+
+	if (dest.len < suffix_len)
+		return -1;
+	memcpy(dest.buf, key.buf + prefix_len, suffix_len);
+	string_view_consume(&dest, suffix_len);
+
+	return start.len - dest.len;
+}
+
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in)
+{
+	int start_len = in.len;
+	uint64_t prefix_len = 0;
+	uint64_t suffix_len = 0;
+	int n = get_var_int(&prefix_len, &in);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	if (prefix_len > last_key.len)
+		return -1;
+
+	n = get_var_int(&suffix_len, &in);
+	if (n <= 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	*extra = (uint8_t)(suffix_len & 0x7);
+	suffix_len >>= 3;
+
+	if (in.len < suffix_len)
+		return -1;
+
+	strbuf_reset(key);
+	strbuf_add(key, last_key.buf, prefix_len);
+	strbuf_add(key, in.buf, suffix_len);
+	string_view_consume(&in, suffix_len);
+
+	return start_len - in.len;
+}
+
+static void reftable_ref_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_ref_record *rec =
+		(const struct reftable_ref_record *)r;
+	strbuf_reset(dest);
+	strbuf_addstr(dest, rec->refname);
+}
+
+static void reftable_ref_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_ref_record *ref = rec;
+	const struct reftable_ref_record *src = src_rec;
+	assert(hash_size > 0);
+
+	/* This is simple and correct, but we could probably reuse the hash
+	 * fields. */
+	reftable_ref_record_release(ref);
+	if (src->refname) {
+		ref->refname = xstrdup(src->refname);
+	}
+	ref->update_index = src->update_index;
+	ref->value_type = src->value_type;
+	switch (src->value_type) {
+	case REFTABLE_REF_DELETION:
+		break;
+	case REFTABLE_REF_VAL1:
+		ref->value.val1 = reftable_malloc(hash_size);
+		memcpy(ref->value.val1, src->value.val1, hash_size);
+		break;
+	case REFTABLE_REF_VAL2:
+		ref->value.val2.value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.value, src->value.val2.value, hash_size);
+		ref->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(ref->value.val2.target_value,
+		       src->value.val2.target_value, hash_size);
+		break;
+	case REFTABLE_REF_SYMREF:
+		ref->value.symref = xstrdup(src->value.symref);
+		break;
+	}
+}
+
+static char hexdigit(int c)
+{
+	if (c <= 9)
+		return '0' + c;
+	return 'a' + (c - 10);
+}
+
+static void hex_format(char *dest, uint8_t *src, int hash_size)
+{
+	assert(hash_size > 0);
+	if (src) {
+		int i = 0;
+		for (i = 0; i < hash_size; i++) {
+			dest[2 * i] = hexdigit(src[i] >> 4);
+			dest[2 * i + 1] = hexdigit(src[i] & 0xf);
+		}
+		dest[2 * hash_size] = 0;
+	}
+}
+
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id)
+{
+	char hex[2 * GIT_SHA256_RAWSZ + 1] = { 0 }; /* BUG */
+	printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index);
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		printf("=> %s", ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		hex_format(hex, ref->value.val2.value, hash_size(hash_id));
+		printf("val 2 %s", hex);
+		hex_format(hex, ref->value.val2.target_value,
+			   hash_size(hash_id));
+		printf("(T %s)", hex);
+		break;
+	case REFTABLE_REF_VAL1:
+		hex_format(hex, ref->value.val1, hash_size(hash_id));
+		printf("val 1 %s", hex);
+		break;
+	case REFTABLE_REF_DELETION:
+		printf("delete");
+		break;
+	}
+	printf("}\n");
+}
+
+static void reftable_ref_record_release_void(void *rec)
+{
+	reftable_ref_record_release(rec);
+}
+
+void reftable_ref_record_release(struct reftable_ref_record *ref)
+{
+	switch (ref->value_type) {
+	case REFTABLE_REF_SYMREF:
+		reftable_free(ref->value.symref);
+		break;
+	case REFTABLE_REF_VAL2:
+		reftable_free(ref->value.val2.target_value);
+		reftable_free(ref->value.val2.value);
+		break;
+	case REFTABLE_REF_VAL1:
+		reftable_free(ref->value.val1);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	reftable_free(ref->refname);
+	memset(ref, 0, sizeof(struct reftable_ref_record));
+}
+
+static uint8_t reftable_ref_record_val_type(const void *rec)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	return r->value_type;
+}
+
+static int reftable_ref_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_ref_record *r =
+		(const struct reftable_ref_record *)rec;
+	struct string_view start = s;
+	int n = put_var_int(&s, r->update_index);
+	assert(hash_size > 0);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	switch (r->value_type) {
+	case REFTABLE_REF_SYMREF:
+		n = encode_string(r->value.symref, s);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		break;
+	case REFTABLE_REF_VAL2:
+		if (s.len < 2 * hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val2.value, hash_size);
+		string_view_consume(&s, hash_size);
+		memcpy(s.buf, r->value.val2.target_value, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_VAL1:
+		if (s.len < hash_size) {
+			return -1;
+		}
+		memcpy(s.buf, r->value.val1, hash_size);
+		string_view_consume(&s, hash_size);
+		break;
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+	}
+
+	return start.len - s.len;
+}
+
+static int reftable_ref_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct reftable_ref_record *r = rec;
+	struct string_view start = in;
+	uint64_t update_index = 0;
+	int n = get_var_int(&update_index, &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	reftable_ref_record_release(r);
+
+	assert(hash_size > 0);
+
+	r->refname = reftable_realloc(r->refname, key.len + 1);
+	memcpy(r->refname, key.buf, key.len);
+	r->update_index = update_index;
+	r->refname[key.len] = 0;
+	r->value_type = val_type;
+	switch (val_type) {
+	case REFTABLE_REF_VAL1:
+		if (in.len < hash_size) {
+			return -1;
+		}
+
+		r->value.val1 = reftable_malloc(hash_size);
+		memcpy(r->value.val1, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_VAL2:
+		if (in.len < 2 * hash_size) {
+			return -1;
+		}
+
+		r->value.val2.value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+
+		r->value.val2.target_value = reftable_malloc(hash_size);
+		memcpy(r->value.val2.target_value, in.buf, hash_size);
+		string_view_consume(&in, hash_size);
+		break;
+
+	case REFTABLE_REF_SYMREF: {
+		struct strbuf dest = STRBUF_INIT;
+		int n = decode_string(&dest, in);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&in, n);
+		r->value.symref = dest.buf;
+	} break;
+
+	case REFTABLE_REF_DELETION:
+		break;
+	default:
+		abort();
+		break;
+	}
+
+	return start.len - in.len;
+}
+
+static int reftable_ref_record_is_deletion_void(const void *p)
+{
+	return reftable_ref_record_is_deletion(
+		(const struct reftable_ref_record *)p);
+}
+
+static struct reftable_record_vtable reftable_ref_record_vtable = {
+	.key = &reftable_ref_record_key,
+	.type = BLOCK_TYPE_REF,
+	.copy_from = &reftable_ref_record_copy_from,
+	.val_type = &reftable_ref_record_val_type,
+	.encode = &reftable_ref_record_encode,
+	.decode = &reftable_ref_record_decode,
+	.release = &reftable_ref_record_release_void,
+	.is_deletion = &reftable_ref_record_is_deletion_void,
+};
+
+static void reftable_obj_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_obj_record *rec =
+		(const struct reftable_obj_record *)r;
+	strbuf_reset(dest);
+	strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len);
+}
+
+static void reftable_obj_record_release(void *rec)
+{
+	struct reftable_obj_record *obj = rec;
+	FREE_AND_NULL(obj->hash_prefix);
+	FREE_AND_NULL(obj->offsets);
+	memset(obj, 0, sizeof(struct reftable_obj_record));
+}
+
+static void reftable_obj_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_obj_record *obj = rec;
+	const struct reftable_obj_record *src =
+		(const struct reftable_obj_record *)src_rec;
+	int olen;
+
+	reftable_obj_record_release(obj);
+	*obj = *src;
+	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
+	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
+
+	olen = obj->offset_len * sizeof(uint64_t);
+	obj->offsets = reftable_malloc(olen);
+	memcpy(obj->offsets, src->offsets, olen);
+}
+
+static uint8_t reftable_obj_record_val_type(const void *rec)
+{
+	const struct reftable_obj_record *r = rec;
+	if (r->offset_len > 0 && r->offset_len < 8)
+		return r->offset_len;
+	return 0;
+}
+
+static int reftable_obj_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_obj_record *r = rec;
+	struct string_view start = s;
+	int i = 0;
+	int n = 0;
+	uint64_t last = 0;
+	if (r->offset_len == 0 || r->offset_len >= 8) {
+		n = put_var_int(&s, r->offset_len);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+	}
+	if (r->offset_len == 0)
+		return start.len - s.len;
+	n = put_var_int(&s, r->offsets[0]);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	last = r->offsets[0];
+	for (i = 1; i < r->offset_len; i++) {
+		int n = put_var_int(&s, r->offsets[i] - last);
+		if (n < 0) {
+			return -1;
+		}
+		string_view_consume(&s, n);
+		last = r->offsets[i];
+	}
+	return start.len - s.len;
+}
+
+static int reftable_obj_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_obj_record *r = rec;
+	uint64_t count = val_type;
+	int n = 0;
+	uint64_t last;
+	int j;
+	r->hash_prefix = reftable_malloc(key.len);
+	memcpy(r->hash_prefix, key.buf, key.len);
+	r->hash_prefix_len = key.len;
+
+	if (val_type == 0) {
+		n = get_var_int(&count, &in);
+		if (n < 0) {
+			return n;
+		}
+
+		string_view_consume(&in, n);
+	}
+
+	r->offsets = NULL;
+	r->offset_len = 0;
+	if (count == 0)
+		return start.len - in.len;
+
+	r->offsets = reftable_malloc(count * sizeof(uint64_t));
+	r->offset_len = count;
+
+	n = get_var_int(&r->offsets[0], &in);
+	if (n < 0)
+		return n;
+	string_view_consume(&in, n);
+
+	last = r->offsets[0];
+	j = 1;
+	while (j < count) {
+		uint64_t delta = 0;
+		int n = get_var_int(&delta, &in);
+		if (n < 0) {
+			return n;
+		}
+		string_view_consume(&in, n);
+
+		last = r->offsets[j] = (delta + last);
+		j++;
+	}
+	return start.len - in.len;
+}
+
+static int not_a_deletion(const void *p)
+{
+	return 0;
+}
+
+static struct reftable_record_vtable reftable_obj_record_vtable = {
+	.key = &reftable_obj_record_key,
+	.type = BLOCK_TYPE_OBJ,
+	.copy_from = &reftable_obj_record_copy_from,
+	.val_type = &reftable_obj_record_val_type,
+	.encode = &reftable_obj_record_encode,
+	.decode = &reftable_obj_record_decode,
+	.release = &reftable_obj_record_release,
+	.is_deletion = not_a_deletion,
+};
+
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id)
+{
+	char hex[GIT_SHA256_RAWSZ + 1] = { 0 };
+
+	switch (log->value_type) {
+	case REFTABLE_LOG_DELETION:
+		printf("log{%s(%" PRIu64 ") delete", log->refname,
+		       log->update_index);
+		break;
+	case REFTABLE_LOG_UPDATE:
+		printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n",
+		       log->refname, log->update_index, log->update.name,
+		       log->update.email, log->update.time,
+		       log->update.tz_offset);
+		hex_format(hex, log->update.old_hash, hash_size(hash_id));
+		printf("%s => ", hex);
+		hex_format(hex, log->update.new_hash, hash_size(hash_id));
+		printf("%s\n\n%s\n}\n", hex, log->update.message);
+		break;
+	}
+}
+
+static void reftable_log_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_log_record *rec =
+		(const struct reftable_log_record *)r;
+	int len = strlen(rec->refname);
+	uint8_t i64[8];
+	uint64_t ts = 0;
+	strbuf_reset(dest);
+	strbuf_add(dest, (uint8_t *)rec->refname, len + 1);
+
+	ts = (~ts) - rec->update_index;
+	put_be64(&i64[0], ts);
+	strbuf_add(dest, i64, sizeof(i64));
+}
+
+static void reftable_log_record_copy_from(void *rec, const void *src_rec,
+					  int hash_size)
+{
+	struct reftable_log_record *dst = rec;
+	const struct reftable_log_record *src =
+		(const struct reftable_log_record *)src_rec;
+
+	reftable_log_record_release(dst);
+	*dst = *src;
+	if (dst->refname) {
+		dst->refname = xstrdup(dst->refname);
+	}
+	switch (dst->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		if (dst->update.email) {
+			dst->update.email = xstrdup(dst->update.email);
+		}
+		if (dst->update.name) {
+			dst->update.name = xstrdup(dst->update.name);
+		}
+		if (dst->update.message) {
+			dst->update.message = xstrdup(dst->update.message);
+		}
+
+		if (dst->update.new_hash) {
+			dst->update.new_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.new_hash, src->update.new_hash,
+			       hash_size);
+		}
+		if (dst->update.old_hash) {
+			dst->update.old_hash = reftable_malloc(hash_size);
+			memcpy(dst->update.old_hash, src->update.old_hash,
+			       hash_size);
+		}
+		break;
+	}
+}
+
+static void reftable_log_record_release_void(void *rec)
+{
+	struct reftable_log_record *r = rec;
+	reftable_log_record_release(r);
+}
+
+void reftable_log_record_release(struct reftable_log_record *r)
+{
+	reftable_free(r->refname);
+	switch (r->value_type) {
+	case REFTABLE_LOG_DELETION:
+		break;
+	case REFTABLE_LOG_UPDATE:
+		reftable_free(r->update.new_hash);
+		reftable_free(r->update.old_hash);
+		reftable_free(r->update.name);
+		reftable_free(r->update.email);
+		reftable_free(r->update.message);
+		break;
+	}
+	memset(r, 0, sizeof(struct reftable_log_record));
+}
+
+static uint8_t reftable_log_record_val_type(const void *rec)
+{
+	const struct reftable_log_record *log =
+		(const struct reftable_log_record *)rec;
+
+	return reftable_log_record_is_deletion(log) ? 0 : 1;
+}
+
+static uint8_t zero[GIT_SHA256_RAWSZ] = { 0 };
+
+static int reftable_log_record_encode(const void *rec, struct string_view s,
+				      int hash_size)
+{
+	const struct reftable_log_record *r = rec;
+	struct string_view start = s;
+	int n = 0;
+	uint8_t *oldh = NULL;
+	uint8_t *newh = NULL;
+	if (reftable_log_record_is_deletion(r))
+		return 0;
+
+	oldh = r->update.old_hash;
+	newh = r->update.new_hash;
+	if (!oldh) {
+		oldh = zero;
+	}
+	if (!newh) {
+		newh = zero;
+	}
+
+	if (s.len < 2 * hash_size)
+		return -1;
+
+	memcpy(s.buf, oldh, hash_size);
+	memcpy(s.buf + hash_size, newh, hash_size);
+	string_view_consume(&s, 2 * hash_size);
+
+	n = encode_string(r->update.name ? r->update.name : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = encode_string(r->update.email ? r->update.email : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	n = put_var_int(&s, r->update.time);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	if (s.len < 2)
+		return -1;
+
+	put_be16(s.buf, r->update.tz_offset);
+	string_view_consume(&s, 2);
+
+	n = encode_string(r->update.message ? r->update.message : "", s);
+	if (n < 0)
+		return -1;
+	string_view_consume(&s, n);
+
+	return start.len - s.len;
+}
+
+static int reftable_log_record_decode(void *rec, struct strbuf key,
+				      uint8_t val_type, struct string_view in,
+				      int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_log_record *r = rec;
+	uint64_t max = 0;
+	uint64_t ts = 0;
+	struct strbuf dest = STRBUF_INIT;
+	int n;
+
+	if (key.len <= 9 || key.buf[key.len - 9] != 0)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->refname = reftable_realloc(r->refname, key.len - 8);
+	memcpy(r->refname, key.buf, key.len - 8);
+	ts = get_be64(key.buf + key.len - 8);
+
+	r->update_index = (~max) - ts;
+
+	if (val_type != r->value_type) {
+		switch (r->value_type) {
+		case REFTABLE_LOG_UPDATE:
+			FREE_AND_NULL(r->update.old_hash);
+			FREE_AND_NULL(r->update.new_hash);
+			FREE_AND_NULL(r->update.message);
+			FREE_AND_NULL(r->update.email);
+			FREE_AND_NULL(r->update.name);
+			break;
+		case REFTABLE_LOG_DELETION:
+			break;
+		}
+	}
+
+	r->value_type = val_type;
+	if (val_type == REFTABLE_LOG_DELETION)
+		return 0;
+
+	if (in.len < 2 * hash_size)
+		return REFTABLE_FORMAT_ERROR;
+
+	r->update.old_hash = reftable_realloc(r->update.old_hash, hash_size);
+	r->update.new_hash = reftable_realloc(r->update.new_hash, hash_size);
+
+	memcpy(r->update.old_hash, in.buf, hash_size);
+	memcpy(r->update.new_hash, in.buf + hash_size, hash_size);
+
+	string_view_consume(&in, 2 * hash_size);
+
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.name = reftable_realloc(r->update.name, dest.len + 1);
+	memcpy(r->update.name, dest.buf, dest.len);
+	r->update.name[dest.len] = 0;
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.email = reftable_realloc(r->update.email, dest.len + 1);
+	memcpy(r->update.email, dest.buf, dest.len);
+	r->update.email[dest.len] = 0;
+
+	ts = 0;
+	n = get_var_int(&ts, &in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+	r->update.time = ts;
+	if (in.len < 2)
+		goto done;
+
+	r->update.tz_offset = get_be16(in.buf);
+	string_view_consume(&in, 2);
+
+	strbuf_reset(&dest);
+	n = decode_string(&dest, in);
+	if (n < 0)
+		goto done;
+	string_view_consume(&in, n);
+
+	r->update.message = reftable_realloc(r->update.message, dest.len + 1);
+	memcpy(r->update.message, dest.buf, dest.len);
+	r->update.message[dest.len] = 0;
+
+	strbuf_release(&dest);
+	return start.len - in.len;
+
+done:
+	strbuf_release(&dest);
+	return REFTABLE_FORMAT_ERROR;
+}
+
+static int null_streq(char *a, char *b)
+{
+	char *empty = "";
+	if (!a)
+		a = empty;
+
+	if (!b)
+		b = empty;
+
+	return 0 == strcmp(a, b);
+}
+
+static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz)
+{
+	if (!a)
+		a = zero;
+
+	if (!b)
+		b = zero;
+
+	return !memcmp(a, b, sz);
+}
+
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size)
+{
+	if (!(null_streq(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_LOG_DELETION:
+		return 1;
+	case REFTABLE_LOG_UPDATE:
+		return null_streq(a->update.name, b->update.name) &&
+		       a->update.time == b->update.time &&
+		       a->update.tz_offset == b->update.tz_offset &&
+		       null_streq(a->update.email, b->update.email) &&
+		       null_streq(a->update.message, b->update.message) &&
+		       zero_hash_eq(a->update.old_hash, b->update.old_hash,
+				    hash_size) &&
+		       zero_hash_eq(a->update.new_hash, b->update.new_hash,
+				    hash_size);
+	}
+
+	abort();
+}
+
+static int reftable_log_record_is_deletion_void(const void *p)
+{
+	return reftable_log_record_is_deletion(
+		(const struct reftable_log_record *)p);
+}
+
+static struct reftable_record_vtable reftable_log_record_vtable = {
+	.key = &reftable_log_record_key,
+	.type = BLOCK_TYPE_LOG,
+	.copy_from = &reftable_log_record_copy_from,
+	.val_type = &reftable_log_record_val_type,
+	.encode = &reftable_log_record_encode,
+	.decode = &reftable_log_record_decode,
+	.release = &reftable_log_record_release_void,
+	.is_deletion = &reftable_log_record_is_deletion_void,
+};
+
+struct reftable_record reftable_new_record(uint8_t typ)
+{
+	struct reftable_record rec = { NULL };
+	switch (typ) {
+	case BLOCK_TYPE_REF: {
+		struct reftable_ref_record *r =
+			reftable_calloc(sizeof(struct reftable_ref_record));
+		reftable_record_from_ref(&rec, r);
+		return rec;
+	}
+
+	case BLOCK_TYPE_OBJ: {
+		struct reftable_obj_record *r =
+			reftable_calloc(sizeof(struct reftable_obj_record));
+		reftable_record_from_obj(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_LOG: {
+		struct reftable_log_record *r =
+			reftable_calloc(sizeof(struct reftable_log_record));
+		reftable_record_from_log(&rec, r);
+		return rec;
+	}
+	case BLOCK_TYPE_INDEX: {
+		struct reftable_index_record empty = { .last_key =
+							       STRBUF_INIT };
+		struct reftable_index_record *r =
+			reftable_calloc(sizeof(struct reftable_index_record));
+		*r = empty;
+		reftable_record_from_index(&rec, r);
+		return rec;
+	}
+	}
+	abort();
+	return rec;
+}
+
+/* clear out the record, yielding the reftable_record data that was
+ * encapsulated. */
+static void *reftable_record_yield(struct reftable_record *rec)
+{
+	void *p = rec->data;
+	rec->data = NULL;
+	return p;
+}
+
+void reftable_record_destroy(struct reftable_record *rec)
+{
+	reftable_record_release(rec);
+	reftable_free(reftable_record_yield(rec));
+}
+
+static void reftable_index_record_key(const void *r, struct strbuf *dest)
+{
+	const struct reftable_index_record *rec = r;
+	strbuf_reset(dest);
+	strbuf_addbuf(dest, &rec->last_key);
+}
+
+static void reftable_index_record_copy_from(void *rec, const void *src_rec,
+					    int hash_size)
+{
+	struct reftable_index_record *dst = rec;
+	const struct reftable_index_record *src = src_rec;
+
+	strbuf_reset(&dst->last_key);
+	strbuf_addbuf(&dst->last_key, &src->last_key);
+	dst->offset = src->offset;
+}
+
+static void reftable_index_record_release(void *rec)
+{
+	struct reftable_index_record *idx = rec;
+	strbuf_release(&idx->last_key);
+}
+
+static uint8_t reftable_index_record_val_type(const void *rec)
+{
+	return 0;
+}
+
+static int reftable_index_record_encode(const void *rec, struct string_view out,
+					int hash_size)
+{
+	const struct reftable_index_record *r =
+		(const struct reftable_index_record *)rec;
+	struct string_view start = out;
+
+	int n = put_var_int(&out, r->offset);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&out, n);
+
+	return start.len - out.len;
+}
+
+static int reftable_index_record_decode(void *rec, struct strbuf key,
+					uint8_t val_type, struct string_view in,
+					int hash_size)
+{
+	struct string_view start = in;
+	struct reftable_index_record *r = rec;
+	int n = 0;
+
+	strbuf_reset(&r->last_key);
+	strbuf_addbuf(&r->last_key, &key);
+
+	n = get_var_int(&r->offset, &in);
+	if (n < 0)
+		return n;
+
+	string_view_consume(&in, n);
+	return start.len - in.len;
+}
+
+static struct reftable_record_vtable reftable_index_record_vtable = {
+	.key = &reftable_index_record_key,
+	.type = BLOCK_TYPE_INDEX,
+	.copy_from = &reftable_index_record_copy_from,
+	.val_type = &reftable_index_record_val_type,
+	.encode = &reftable_index_record_encode,
+	.decode = &reftable_index_record_decode,
+	.release = &reftable_index_record_release,
+	.is_deletion = &not_a_deletion,
+};
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest)
+{
+	rec->ops->key(rec->data, dest);
+}
+
+uint8_t reftable_record_type(struct reftable_record *rec)
+{
+	return rec->ops->type;
+}
+
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size)
+{
+	return rec->ops->encode(rec->data, dest, hash_size);
+}
+
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size)
+{
+	assert(src->ops->type == rec->ops->type);
+
+	rec->ops->copy_from(rec->data, src->data, hash_size);
+}
+
+uint8_t reftable_record_val_type(struct reftable_record *rec)
+{
+	return rec->ops->val_type(rec->data);
+}
+
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src, int hash_size)
+{
+	return rec->ops->decode(rec->data, key, extra, src, hash_size);
+}
+
+void reftable_record_release(struct reftable_record *rec)
+{
+	rec->ops->release(rec->data);
+}
+
+int reftable_record_is_deletion(struct reftable_record *rec)
+{
+	return rec->ops->is_deletion(rec->data);
+}
+
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *ref_rec)
+{
+	assert(!rec->ops);
+	rec->data = ref_rec;
+	rec->ops = &reftable_ref_record_vtable;
+}
+
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *obj_rec)
+{
+	assert(!rec->ops);
+	rec->data = obj_rec;
+	rec->ops = &reftable_obj_record_vtable;
+}
+
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *index_rec)
+{
+	assert(!rec->ops);
+	rec->data = index_rec;
+	rec->ops = &reftable_index_record_vtable;
+}
+
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *log_rec)
+{
+	assert(!rec->ops);
+	rec->data = log_rec;
+	rec->ops = &reftable_log_record_vtable;
+}
+
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_REF);
+	return rec->data;
+}
+
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec)
+{
+	assert(reftable_record_type(rec) == BLOCK_TYPE_LOG);
+	return rec->data;
+}
+
+static int hash_equal(uint8_t *a, uint8_t *b, int hash_size)
+{
+	if (a && b)
+		return !memcmp(a, b, hash_size);
+
+	return a == b;
+}
+
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size)
+{
+	assert(hash_size > 0);
+	if (!(0 == strcmp(a->refname, b->refname) &&
+	      a->update_index == b->update_index &&
+	      a->value_type == b->value_type))
+		return 0;
+
+	switch (a->value_type) {
+	case REFTABLE_REF_SYMREF:
+		return !strcmp(a->value.symref, b->value.symref);
+	case REFTABLE_REF_VAL2:
+		return hash_equal(a->value.val2.value, b->value.val2.value,
+				  hash_size) &&
+		       hash_equal(a->value.val2.target_value,
+				  b->value.val2.target_value, hash_size);
+	case REFTABLE_REF_VAL1:
+		return hash_equal(a->value.val1, b->value.val1, hash_size);
+	case REFTABLE_REF_DELETION:
+		return 1;
+	default:
+		abort();
+	}
+}
+
+int reftable_ref_record_compare_name(const void *a, const void *b)
+{
+	return strcmp(((struct reftable_ref_record *)a)->refname,
+		      ((struct reftable_ref_record *)b)->refname);
+}
+
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref)
+{
+	return ref->value_type == REFTABLE_REF_DELETION;
+}
+
+int reftable_log_record_compare_key(const void *a, const void *b)
+{
+	const struct reftable_log_record *la = a;
+	const struct reftable_log_record *lb = b;
+
+	int cmp = strcmp(la->refname, lb->refname);
+	if (cmp)
+		return cmp;
+	if (la->update_index > lb->update_index)
+		return -1;
+	return (la->update_index < lb->update_index) ? 1 : 0;
+}
+
+int reftable_log_record_is_deletion(const struct reftable_log_record *log)
+{
+	return (log->value_type == REFTABLE_LOG_DELETION);
+}
+
+void string_view_consume(struct string_view *s, int n)
+{
+	s->buf += n;
+	s->len -= n;
+}
diff --git a/reftable/record.h b/reftable/record.h
new file mode 100644
index 000000000000..498e8c50bf4f
--- /dev/null
+++ b/reftable/record.h
@@ -0,0 +1,139 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef RECORD_H
+#define RECORD_H
+
+#include "system.h"
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/*
+ * A substring of existing string data. This structure takes no responsibility
+ * for the lifetime of the data it points to.
+ */
+struct string_view {
+	uint8_t *buf;
+	size_t len;
+};
+
+/* Advance `s.buf` by `n`, and decrease length. */
+void string_view_consume(struct string_view *s, int n);
+
+/* utilities for de/encoding varints */
+
+int get_var_int(uint64_t *dest, struct string_view *in);
+int put_var_int(struct string_view *dest, uint64_t val);
+
+/* Methods for records. */
+struct reftable_record_vtable {
+	/* encode the key of to a uint8_t strbuf. */
+	void (*key)(const void *rec, struct strbuf *dest);
+
+	/* The record type of ('r' for ref). */
+	uint8_t type;
+
+	void (*copy_from)(void *dest, const void *src, int hash_size);
+
+	/* a value of [0..7], indicating record subvariants (eg. ref vs. symref
+	 * vs ref deletion) */
+	uint8_t (*val_type)(const void *rec);
+
+	/* encodes rec into dest, returning how much space was used. */
+	int (*encode)(const void *rec, struct string_view dest, int hash_size);
+
+	/* decode data from `src` into the record. */
+	int (*decode)(void *rec, struct strbuf key, uint8_t extra,
+		      struct string_view src, int hash_size);
+
+	/* deallocate and null the record. */
+	void (*release)(void *rec);
+
+	/* is this a tombstone? */
+	int (*is_deletion)(const void *rec);
+};
+
+/* record is a generic wrapper for different types of records. */
+struct reftable_record {
+	void *data;
+	struct reftable_record_vtable *ops;
+};
+
+/* returns true for recognized block types. Block start with the block type. */
+int reftable_is_block_type(uint8_t typ);
+
+/* creates a malloced record of the given type. Dispose with record_destroy */
+struct reftable_record reftable_new_record(uint8_t typ);
+
+/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns
+ * number of bytes written. */
+int reftable_encode_key(int *is_restart, struct string_view dest,
+			struct strbuf prev_key, struct strbuf key,
+			uint8_t extra);
+
+/* Decode into `key` and `extra` from `in` */
+int reftable_decode_key(struct strbuf *key, uint8_t *extra,
+			struct strbuf last_key, struct string_view in);
+
+/* reftable_index_record are used internally to speed up lookups. */
+struct reftable_index_record {
+	uint64_t offset; /* Offset of block */
+	struct strbuf last_key; /* Last key of the block. */
+};
+
+/* reftable_obj_record stores an object ID => ref mapping. */
+struct reftable_obj_record {
+	uint8_t *hash_prefix; /* leading bytes of the object ID */
+	int hash_prefix_len; /* number of leading bytes. Constant
+			      * across a single table. */
+	uint64_t *offsets; /* a vector of file offsets. */
+	int offset_len;
+};
+
+/* see struct record_vtable */
+
+void reftable_record_key(struct reftable_record *rec, struct strbuf *dest);
+uint8_t reftable_record_type(struct reftable_record *rec);
+void reftable_record_copy_from(struct reftable_record *rec,
+			       struct reftable_record *src, int hash_size);
+uint8_t reftable_record_val_type(struct reftable_record *rec);
+int reftable_record_encode(struct reftable_record *rec, struct string_view dest,
+			   int hash_size);
+int reftable_record_decode(struct reftable_record *rec, struct strbuf key,
+			   uint8_t extra, struct string_view src,
+			   int hash_size);
+int reftable_record_is_deletion(struct reftable_record *rec);
+
+/* zeroes out the embedded record */
+void reftable_record_release(struct reftable_record *rec);
+
+/* clear and deallocate embedded record, and zero `rec`. */
+void reftable_record_destroy(struct reftable_record *rec);
+
+/* initialize generic records from concrete records. The generic record should
+ * be zeroed out. */
+void reftable_record_from_obj(struct reftable_record *rec,
+			      struct reftable_obj_record *objrec);
+void reftable_record_from_index(struct reftable_record *rec,
+				struct reftable_index_record *idxrec);
+void reftable_record_from_ref(struct reftable_record *rec,
+			      struct reftable_ref_record *refrec);
+void reftable_record_from_log(struct reftable_record *rec,
+			      struct reftable_log_record *logrec);
+struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref);
+struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref);
+
+/* for qsort. */
+int reftable_ref_record_compare_name(const void *a, const void *b);
+
+/* for qsort. */
+int reftable_log_record_compare_key(const void *a, const void *b);
+
+#endif
diff --git a/reftable/record_test.c b/reftable/record_test.c
new file mode 100644
index 000000000000..0a1901897641
--- /dev/null
+++ b/reftable/record_test.c
@@ -0,0 +1,407 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "record.h"
+
+#include "system.h"
+#include "basics.h"
+#include "constants.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_copy(struct reftable_record *rec)
+{
+	struct reftable_record copy =
+		reftable_new_record(reftable_record_type(rec));
+	reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ);
+	/* do it twice to catch memory leaks */
+	reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ);
+	switch (reftable_record_type(&copy)) {
+	case BLOCK_TYPE_REF:
+		EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy),
+						 reftable_record_as_ref(rec),
+						 GIT_SHA1_RAWSZ));
+		break;
+	case BLOCK_TYPE_LOG:
+		EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy),
+						 reftable_record_as_log(rec),
+						 GIT_SHA1_RAWSZ));
+		break;
+	}
+	reftable_record_destroy(&copy);
+}
+
+static void test_varint_roundtrip(void)
+{
+	uint64_t inputs[] = { 0,
+			      1,
+			      27,
+			      127,
+			      128,
+			      257,
+			      4096,
+			      ((uint64_t)1 << 63),
+			      ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(inputs); i++) {
+		uint8_t dest[10];
+
+		struct string_view out = {
+			.buf = dest,
+			.len = sizeof(dest),
+		};
+		uint64_t in = inputs[i];
+		int n = put_var_int(&out, in);
+		uint64_t got = 0;
+
+		EXPECT(n > 0);
+		out.len = n;
+		n = get_var_int(&got, &out);
+		EXPECT(n > 0);
+
+		EXPECT(got == in);
+	}
+}
+
+static void test_common_prefix(void)
+{
+	struct {
+		const char *a, *b;
+		int want;
+	} cases[] = {
+		{ "abc", "ab", 2 },
+		{ "", "abc", 0 },
+		{ "abc", "abd", 2 },
+		{ "abc", "pqr", 0 },
+	};
+
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct strbuf a = STRBUF_INIT;
+		struct strbuf b = STRBUF_INIT;
+		strbuf_addstr(&a, cases[i].a);
+		strbuf_addstr(&b, cases[i].b);
+		EXPECT(common_prefix_size(&a, &b) == cases[i].want);
+
+		strbuf_release(&a);
+		strbuf_release(&b);
+	}
+}
+
+static void set_hash(uint8_t *h, int j)
+{
+	int i = 0;
+	for (i = 0; i < hash_size(GIT_SHA1_HASH_ID); i++) {
+		h[i] = (j >> i) & 0xff;
+	}
+}
+
+static void test_reftable_ref_record_roundtrip(void)
+{
+	int i = 0;
+
+	for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) {
+		struct reftable_ref_record in = { NULL };
+		struct reftable_ref_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_record rec = { NULL };
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+
+		int n, m;
+
+		in.value_type = i;
+		switch (i) {
+		case REFTABLE_REF_DELETION:
+			break;
+		case REFTABLE_REF_VAL1:
+			in.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
+			set_hash(in.value.val1, 1);
+			break;
+		case REFTABLE_REF_VAL2:
+			in.value.val2.value = reftable_malloc(GIT_SHA1_RAWSZ);
+			set_hash(in.value.val2.value, 1);
+			in.value.val2.target_value =
+				reftable_malloc(GIT_SHA1_RAWSZ);
+			set_hash(in.value.val2.target_value, 2);
+			break;
+		case REFTABLE_REF_SYMREF:
+			in.value.symref = xstrdup("target");
+			break;
+		}
+		in.refname = xstrdup("refs/heads/master");
+
+		reftable_record_from_ref(&rec, &in);
+		test_copy(&rec);
+
+		EXPECT(reftable_record_val_type(&rec) == i);
+
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
+		EXPECT(n > 0);
+
+		/* decode into a non-zero reftable_record to test for leaks. */
+
+		reftable_record_from_ref(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, i, dest,
+					   GIT_SHA1_RAWSZ);
+		EXPECT(n == m);
+
+		EXPECT(reftable_ref_record_equal(&in, &out, GIT_SHA1_RAWSZ));
+		reftable_record_release(&rec_out);
+
+		strbuf_release(&key);
+		reftable_ref_record_release(&in);
+	}
+}
+
+static void test_reftable_log_record_equal(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+		}
+	};
+
+	EXPECT(!reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ));
+	in[1].update_index = in[0].update_index;
+	EXPECT(reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ));
+	reftable_log_record_release(&in[0]);
+	reftable_log_record_release(&in[1]);
+}
+
+static void test_reftable_log_record_roundtrip(void)
+{
+	struct reftable_log_record in[2] = {
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 42,
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.old_hash = reftable_malloc(GIT_SHA1_RAWSZ),
+				.new_hash = reftable_malloc(GIT_SHA1_RAWSZ),
+				.name = xstrdup("han-wen"),
+				.email = xstrdup("hanwen@google.com"),
+				.message = xstrdup("test"),
+				.time = 1577123507,
+				.tz_offset = 100,
+			}
+		},
+		{
+			.refname = xstrdup("refs/heads/master"),
+			.update_index = 22,
+			.value_type = REFTABLE_LOG_DELETION,
+		}
+	};
+	set_test_hash(in[0].update.new_hash, 1);
+	set_test_hash(in[0].update.old_hash, 2);
+	for (int i = 0; i < ARRAY_SIZE(in); i++) {
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		/* populate out, to check for leaks. */
+		struct reftable_log_record out = {
+			.refname = xstrdup("old name"),
+			.value_type = REFTABLE_LOG_UPDATE,
+			.update = {
+				.new_hash = reftable_calloc(GIT_SHA1_RAWSZ),
+				.old_hash = reftable_calloc(GIT_SHA1_RAWSZ),
+				.name = xstrdup("old name"),
+				.email = xstrdup("old@email"),
+				.message = xstrdup("old message"),
+			},
+		};
+		struct reftable_record rec_out = { NULL };
+		int n, m, valtype;
+
+		reftable_record_from_log(&rec, &in[i]);
+
+		test_copy(&rec);
+
+		reftable_record_key(&rec, &key);
+
+		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
+		EXPECT(n >= 0);
+		reftable_record_from_log(&rec_out, &out);
+		valtype = reftable_record_val_type(&rec);
+		m = reftable_record_decode(&rec_out, key, valtype, dest,
+					   GIT_SHA1_RAWSZ);
+		EXPECT(n == m);
+
+		EXPECT(reftable_log_record_equal(&in[i], &out, GIT_SHA1_RAWSZ));
+		reftable_log_record_release(&in[i]);
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_u24_roundtrip(void)
+{
+	uint32_t in = 0x112233;
+	uint8_t dest[3];
+	uint32_t out;
+	put_be24(dest, in);
+	out = get_be24(dest);
+	EXPECT(in == out);
+}
+
+static void test_key_roundtrip(void)
+{
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf last_key = STRBUF_INIT;
+	struct strbuf key = STRBUF_INIT;
+	struct strbuf roundtrip = STRBUF_INIT;
+	int restart;
+	uint8_t extra;
+	int n, m;
+	uint8_t rt_extra;
+
+	strbuf_addstr(&last_key, "refs/heads/master");
+	strbuf_addstr(&key, "refs/tags/bla");
+	extra = 6;
+	n = reftable_encode_key(&restart, dest, last_key, key, extra);
+	EXPECT(!restart);
+	EXPECT(n > 0);
+
+	m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest);
+	EXPECT(n == m);
+	EXPECT(0 == strbuf_cmp(&key, &roundtrip));
+	EXPECT(rt_extra == extra);
+
+	strbuf_release(&last_key);
+	strbuf_release(&key);
+	strbuf_release(&roundtrip);
+}
+
+static void test_reftable_obj_record_roundtrip(void)
+{
+	uint8_t testHash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 4, 0 };
+	uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 };
+	struct reftable_obj_record recs[3] = { {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 3,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+						       .offsets = till9,
+						       .offset_len = 9,
+					       },
+					       {
+						       .hash_prefix = testHash1,
+						       .hash_prefix_len = 5,
+					       } };
+	int i = 0;
+	for (i = 0; i < ARRAY_SIZE(recs); i++) {
+		struct reftable_obj_record in = recs[i];
+		uint8_t buffer[1024] = { 0 };
+		struct string_view dest = {
+			.buf = buffer,
+			.len = sizeof(buffer),
+		};
+		struct reftable_record rec = { NULL };
+		struct strbuf key = STRBUF_INIT;
+		struct reftable_obj_record out = { NULL };
+		struct reftable_record rec_out = { NULL };
+		int n, m;
+		uint8_t extra;
+
+		reftable_record_from_obj(&rec, &in);
+		test_copy(&rec);
+		reftable_record_key(&rec, &key);
+		n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
+		EXPECT(n > 0);
+		extra = reftable_record_val_type(&rec);
+		reftable_record_from_obj(&rec_out, &out);
+		m = reftable_record_decode(&rec_out, key, extra, dest,
+					   GIT_SHA1_RAWSZ);
+		EXPECT(n == m);
+
+		EXPECT(in.hash_prefix_len == out.hash_prefix_len);
+		EXPECT(in.offset_len == out.offset_len);
+
+		EXPECT(!memcmp(in.hash_prefix, out.hash_prefix,
+			       in.hash_prefix_len));
+		EXPECT(0 == memcmp(in.offsets, out.offsets,
+				   sizeof(uint64_t) * in.offset_len));
+		strbuf_release(&key);
+		reftable_record_release(&rec_out);
+	}
+}
+
+static void test_reftable_index_record_roundtrip(void)
+{
+	struct reftable_index_record in = {
+		.offset = 42,
+		.last_key = STRBUF_INIT,
+	};
+	uint8_t buffer[1024] = { 0 };
+	struct string_view dest = {
+		.buf = buffer,
+		.len = sizeof(buffer),
+	};
+	struct strbuf key = STRBUF_INIT;
+	struct reftable_record rec = { NULL };
+	struct reftable_index_record out = { .last_key = STRBUF_INIT };
+	struct reftable_record out_rec = { NULL };
+	int n, m;
+	uint8_t extra;
+
+	strbuf_addstr(&in.last_key, "refs/heads/master");
+	reftable_record_from_index(&rec, &in);
+	reftable_record_key(&rec, &key);
+	test_copy(&rec);
+
+	EXPECT(0 == strbuf_cmp(&key, &in.last_key));
+	n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ);
+	EXPECT(n > 0);
+
+	extra = reftable_record_val_type(&rec);
+	reftable_record_from_index(&out_rec, &out);
+	m = reftable_record_decode(&out_rec, key, extra, dest, GIT_SHA1_RAWSZ);
+	EXPECT(m == n);
+
+	EXPECT(in.offset == out.offset);
+
+	reftable_record_release(&out_rec);
+	strbuf_release(&key);
+	strbuf_release(&in.last_key);
+}
+
+int record_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_reftable_log_record_equal);
+	RUN_TEST(test_reftable_log_record_roundtrip);
+	RUN_TEST(test_reftable_ref_record_roundtrip);
+	RUN_TEST(test_varint_roundtrip);
+	RUN_TEST(test_key_roundtrip);
+	RUN_TEST(test_common_prefix);
+	RUN_TEST(test_reftable_obj_record_roundtrip);
+	RUN_TEST(test_reftable_index_record_roundtrip);
+	RUN_TEST(test_u24_roundtrip);
+	return 0;
+}
diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h
new file mode 100644
index 000000000000..7985b94ae2cc
--- /dev/null
+++ b/reftable/reftable-record.h
@@ -0,0 +1,114 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_RECORD_H
+#define REFTABLE_RECORD_H
+
+#include <stdint.h>
+
+/*
+ * Basic data types
+ *
+ * Reftables store the state of each ref in struct reftable_ref_record, and they
+ * store a sequence of reflog updates in struct reftable_log_record.
+ */
+
+/* reftable_ref_record holds a ref database entry target_value */
+struct reftable_ref_record {
+	char *refname; /* Name of the ref, malloced. */
+	uint64_t update_index; /* Logical timestamp at which this value is
+				* written */
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_REF_DELETION = 0x0,
+
+		/* a simple ref */
+		REFTABLE_REF_VAL1 = 0x1,
+		/* a tag, plus its peeled hash */
+		REFTABLE_REF_VAL2 = 0x2,
+
+		/* a symbolic reference */
+		REFTABLE_REF_SYMREF = 0x3,
+#define REFTABLE_NR_REF_VALUETYPES 4
+	} value_type;
+	union {
+		uint8_t *val1; /* malloced hash. */
+		struct {
+			uint8_t *value; /* first value, malloced hash  */
+			uint8_t *target_value; /* second value, malloced hash */
+		} val2;
+		char *symref; /* referent, malloced 0-terminated string */
+	} value;
+};
+
+/* Returns the first hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec);
+
+/* Returns the second hash, or NULL if `rec` is not of type
+ * REFTABLE_REF_VAL2. */
+uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec);
+
+/* returns whether 'ref' represents a deletion */
+int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref);
+
+/* prints a reftable_ref_record onto stdout. Useful for debugging. */
+void reftable_ref_record_print(struct reftable_ref_record *ref,
+			       uint32_t hash_id);
+
+/* frees and nulls all pointer values inside `ref`. */
+void reftable_ref_record_release(struct reftable_ref_record *ref);
+
+/* returns whether two reftable_ref_records are the same. Useful for testing. */
+int reftable_ref_record_equal(struct reftable_ref_record *a,
+			      struct reftable_ref_record *b, int hash_size);
+
+/* reftable_log_record holds a reflog entry */
+struct reftable_log_record {
+	char *refname;
+	uint64_t update_index; /* logical timestamp of a transactional update.
+				*/
+
+	enum {
+		/* tombstone to hide deletions from earlier tables */
+		REFTABLE_LOG_DELETION = 0x0,
+
+		/* a simple update */
+		REFTABLE_LOG_UPDATE = 0x1,
+#define REFTABLE_NR_LOG_VALUETYPES 2
+	} value_type;
+
+	union {
+		struct {
+			uint8_t *new_hash;
+			uint8_t *old_hash;
+			char *name;
+			char *email;
+			uint64_t time;
+			int16_t tz_offset;
+			char *message;
+		} update;
+	};
+};
+
+/* returns whether 'ref' represents the deletion of a log record. */
+int reftable_log_record_is_deletion(const struct reftable_log_record *log);
+
+/* frees and nulls all pointer values. */
+void reftable_log_record_release(struct reftable_log_record *log);
+
+/* returns whether two records are equal. Useful for testing. */
+int reftable_log_record_equal(struct reftable_log_record *a,
+			      struct reftable_log_record *b, int hash_size);
+
+/* dumps a reftable_log_record on stdout, for debugging/testing. */
+void reftable_log_record_print(struct reftable_log_record *log,
+			       uint32_t hash_id);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 3b58e423e7b1..09d4b83ef9b8 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,6 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
-
+	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 11/28] Provide zlib's uncompress2 from compat/zlib-compat.c
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (9 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 12/28] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
                               ` (17 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This will be needed for reading reflog blocks in reftable.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |  7 +++
 compat/.gitattributes     |  1 +
 compat/zlib-uncompress2.c | 92 +++++++++++++++++++++++++++++++++++++++
 configure.ac              | 13 ++++++
 git-compat-util.h         |  4 ++
 5 files changed, 117 insertions(+)
 create mode 100644 compat/.gitattributes
 create mode 100644 compat/zlib-uncompress2.c

diff --git a/Makefile b/Makefile
index d6bc336c5565..e6df765d3361 100644
--- a/Makefile
+++ b/Makefile
@@ -256,6 +256,8 @@ all::
 #
 # Define NO_DEFLATE_BOUND if your zlib does not have deflateBound.
 #
+# Define NO_UNCOMPRESS2 if your zlib does not have uncompress2.
+#
 # Define NO_NORETURN if using buggy versions of gcc 4.6+ and profile feedback,
 # as the compiler can crash (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49299)
 #
@@ -1706,6 +1708,11 @@ ifdef NO_DEFLATE_BOUND
 	BASIC_CFLAGS += -DNO_DEFLATE_BOUND
 endif
 
+ifdef NO_UNCOMPRESS2
+	BASIC_CFLAGS += -DNO_UNCOMPRESS2
+	LIB_OBJS += compat/zlib-uncompress2.o
+endif
+
 ifdef NO_POSIX_GOODIES
 	BASIC_CFLAGS += -DNO_POSIX_GOODIES
 endif
diff --git a/compat/.gitattributes b/compat/.gitattributes
new file mode 100644
index 000000000000..40dbfb170dab
--- /dev/null
+++ b/compat/.gitattributes
@@ -0,0 +1 @@
+/zlib-uncompress2.c	whitespace=-indent-with-non-tab,-trailing-space
diff --git a/compat/zlib-uncompress2.c b/compat/zlib-uncompress2.c
new file mode 100644
index 000000000000..0adc5ee7eeb6
--- /dev/null
+++ b/compat/zlib-uncompress2.c
@@ -0,0 +1,92 @@
+/* taken from zlib's uncompr.c
+
+   commit cacf7f1d4e3d44d871b605da3b647f07d718623f
+   Author: Mark Adler <madler@alumni.caltech.edu>
+   Date:   Sun Jan 15 09:18:46 2017 -0800
+
+       zlib 1.2.11
+
+*/
+
+/*
+ * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+#include "system.h"
+
+/* clang-format off */
+
+/* ===========================================================================
+     Decompresses the source buffer into the destination buffer.  *sourceLen is
+   the byte length of the source buffer. Upon entry, *destLen is the total size
+   of the destination buffer, which must be large enough to hold the entire
+   uncompressed data. (The size of the uncompressed data must have been saved
+   previously by the compressor and transmitted to the decompressor by some
+   mechanism outside the scope of this compression library.) Upon exit,
+   *destLen is the size of the decompressed data and *sourceLen is the number
+   of source bytes consumed. Upon return, source + *sourceLen points to the
+   first unused input byte.
+
+     uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough
+   memory, Z_BUF_ERROR if there was not enough room in the output buffer, or
+   Z_DATA_ERROR if the input data was corrupted, including if the input data is
+   an incomplete zlib stream.
+*/
+int ZEXPORT uncompress2 (
+    Bytef *dest,
+    uLongf *destLen,
+    const Bytef *source,
+    uLong *sourceLen) {
+    z_stream stream;
+    int err;
+    const uInt max = (uInt)-1;
+    uLong len, left;
+    Byte buf[1];    /* for detection of incomplete stream when *destLen == 0 */
+
+    len = *sourceLen;
+    if (*destLen) {
+        left = *destLen;
+        *destLen = 0;
+    }
+    else {
+        left = 1;
+        dest = buf;
+    }
+
+    stream.next_in = (z_const Bytef *)source;
+    stream.avail_in = 0;
+    stream.zalloc = (alloc_func)0;
+    stream.zfree = (free_func)0;
+    stream.opaque = (voidpf)0;
+
+    err = inflateInit(&stream);
+    if (err != Z_OK) return err;
+
+    stream.next_out = dest;
+    stream.avail_out = 0;
+
+    do {
+        if (stream.avail_out == 0) {
+            stream.avail_out = left > (uLong)max ? max : (uInt)left;
+            left -= stream.avail_out;
+        }
+        if (stream.avail_in == 0) {
+            stream.avail_in = len > (uLong)max ? max : (uInt)len;
+            len -= stream.avail_in;
+        }
+        err = inflate(&stream, Z_NO_FLUSH);
+    } while (err == Z_OK);
+
+    *sourceLen -= len + stream.avail_in;
+    if (dest != buf)
+        *destLen = stream.total_out;
+    else if (stream.total_out && err == Z_BUF_ERROR)
+        left = 1;
+
+    inflateEnd(&stream);
+    return err == Z_STREAM_END ? Z_OK :
+           err == Z_NEED_DICT ? Z_DATA_ERROR  :
+           err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR :
+           err;
+}
diff --git a/configure.ac b/configure.ac
index 031e8d3fee8f..c3a913103d00 100644
--- a/configure.ac
+++ b/configure.ac
@@ -672,9 +672,22 @@ AC_LINK_IFELSE([ZLIBTEST_SRC],
 	NO_DEFLATE_BOUND=yes])
 LIBS="$old_LIBS"
 
+AC_DEFUN([ZLIBTEST_UNCOMPRESS2_SRC], [
+AC_LANG_PROGRAM([#include <zlib.h>],
+ [uncompress2(NULL,NULL,NULL,NULL);])])
+AC_MSG_CHECKING([for uncompress2 in -lz])
+old_LIBS="$LIBS"
+LIBS="$LIBS -lz"
+AC_LINK_IFELSE([ZLIBTEST_UNCOMPRESS2_SRC],
+	[AC_MSG_RESULT([yes])],
+	[AC_MSG_RESULT([no])
+	NO_UNCOMPRESS2=yes])
+LIBS="$old_LIBS"
+
 GIT_UNSTASH_FLAGS($ZLIB_PATH)
 
 GIT_CONF_SUBST([NO_DEFLATE_BOUND])
+GIT_CONF_SUBST([NO_UNCOMPRESS2])
 
 #
 # Define NEEDS_SOCKET if linking with libc is not enough (SunOS,
diff --git a/git-compat-util.h b/git-compat-util.h
index a508dbe5a351..d405d0bd7a79 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1382,4 +1382,8 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset)
 
 void sleep_millisec(int millisec);
 
+#ifdef NO_UNCOMPRESS2
+
+#endif
+
 #endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 12/28] reftable: reading/writing blocks
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (10 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 11/28] Provide zlib's uncompress2 from compat/zlib-compat.c Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 13/28] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
                               ` (16 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.

This commit provides the logic to read and write these blocks.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   2 +
 reftable/block.c         | 446 +++++++++++++++++++++++++++++++++++++++
 reftable/block.h         | 127 +++++++++++
 reftable/block_test.c    | 121 +++++++++++
 t/helper/test-reftable.c |   1 +
 5 files changed, 697 insertions(+)
 create mode 100644 reftable/block.c
 create mode 100644 reftable/block.h
 create mode 100644 reftable/block_test.c

diff --git a/Makefile b/Makefile
index e6df765d3361..cd8a9de2519e 100644
--- a/Makefile
+++ b/Makefile
@@ -2410,10 +2410,12 @@ xdiff-objs: $(XDIFF_OBJS)
 
 REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
+REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 
+REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
diff --git a/reftable/block.c b/reftable/block.c
new file mode 100644
index 000000000000..92f8e5abfad7
--- /dev/null
+++ b/reftable/block.c
@@ -0,0 +1,446 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "blocksource.h"
+#include "constants.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "system.h"
+#include <zlib.h>
+
+#ifdef NO_UNCOMPRESS2
+/* This is uncompress2, which is only available in zlib as of 2017.
+ */
+int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source,
+		uLong *sourceLen);
+#endif
+
+int header_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 24;
+	case 2:
+		return 28;
+	}
+	abort();
+}
+
+int footer_size(int version)
+{
+	switch (version) {
+	case 1:
+		return 68;
+	case 2:
+		return 72;
+	}
+	abort();
+}
+
+static int block_writer_register_restart(struct block_writer *w, int n,
+					 int is_restart, struct strbuf *key)
+{
+	int rlen = w->restart_len;
+	if (rlen >= MAX_RESTARTS) {
+		is_restart = 0;
+	}
+
+	if (is_restart) {
+		rlen++;
+	}
+	if (2 + 3 * rlen + n > w->block_size - w->next)
+		return -1;
+	if (is_restart) {
+		if (w->restart_len == w->restart_cap) {
+			w->restart_cap = w->restart_cap * 2 + 1;
+			w->restarts = reftable_realloc(
+				w->restarts, sizeof(uint32_t) * w->restart_cap);
+		}
+
+		w->restarts[w->restart_len++] = w->next;
+	}
+
+	w->next += n;
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, key);
+	w->entries++;
+	return 0;
+}
+
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size)
+{
+	bw->buf = buf;
+	bw->hash_size = hash_size;
+	bw->block_size = block_size;
+	bw->header_off = header_off;
+	bw->buf[header_off] = typ;
+	bw->next = header_off + 4;
+	bw->restart_interval = 16;
+	bw->entries = 0;
+	bw->restart_len = 0;
+	bw->last_key.len = 0;
+}
+
+uint8_t block_writer_type(struct block_writer *bw)
+{
+	return bw->buf[bw->header_off];
+}
+
+/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec)
+{
+	struct strbuf empty = STRBUF_INIT;
+	struct strbuf last =
+		w->entries % w->restart_interval == 0 ? empty : w->last_key;
+	struct string_view out = {
+		.buf = w->buf + w->next,
+		.len = w->block_size - w->next,
+	};
+
+	struct string_view start = out;
+
+	int is_restart = 0;
+	struct strbuf key = STRBUF_INIT;
+	int n = 0;
+
+	reftable_record_key(rec, &key);
+	n = reftable_encode_key(&is_restart, out, last, key,
+				reftable_record_val_type(rec));
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	n = reftable_record_encode(rec, out, w->hash_size);
+	if (n < 0)
+		goto done;
+	string_view_consume(&out, n);
+
+	if (block_writer_register_restart(w, start.len - out.len, is_restart,
+					  &key) < 0)
+		goto done;
+
+	strbuf_release(&key);
+	return 0;
+
+done:
+	strbuf_release(&key);
+	return -1;
+}
+
+int block_writer_finish(struct block_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->restart_len; i++) {
+		put_be24(w->buf + w->next, w->restarts[i]);
+		w->next += 3;
+	}
+
+	put_be16(w->buf + w->next, w->restart_len);
+	w->next += 2;
+	put_be24(w->buf + 1 + w->header_off, w->next);
+
+	if (block_writer_type(w) == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + w->header_off;
+		uint8_t *compressed = NULL;
+		int zresult = 0;
+		uLongf src_len = w->next - block_header_skip;
+		size_t dest_cap = src_len;
+
+		compressed = reftable_malloc(dest_cap);
+		while (1) {
+			uLongf out_dest_len = dest_cap;
+
+			zresult = compress2(compressed, &out_dest_len,
+					    w->buf + block_header_skip, src_len,
+					    9);
+			if (zresult == Z_BUF_ERROR) {
+				dest_cap *= 2;
+				compressed =
+					reftable_realloc(compressed, dest_cap);
+				continue;
+			}
+
+			if (Z_OK != zresult) {
+				reftable_free(compressed);
+				return REFTABLE_ZLIB_ERROR;
+			}
+
+			memcpy(w->buf + block_header_skip, compressed,
+			       out_dest_len);
+			w->next = out_dest_len + block_header_skip;
+			reftable_free(compressed);
+			break;
+		}
+	}
+	return w->next;
+}
+
+uint8_t block_reader_type(struct block_reader *r)
+{
+	return r->block.data[r->header_off];
+}
+
+int block_reader_init(struct block_reader *br, struct reftable_block *block,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size)
+{
+	uint32_t full_block_size = table_block_size;
+	uint8_t typ = block->data[header_off];
+	uint32_t sz = get_be24(block->data + header_off + 1);
+
+	uint16_t restart_count = 0;
+	uint32_t restart_start = 0;
+	uint8_t *restart_bytes = NULL;
+
+	if (!reftable_is_block_type(typ))
+		return REFTABLE_FORMAT_ERROR;
+
+	if (typ == BLOCK_TYPE_LOG) {
+		int block_header_skip = 4 + header_off;
+		uLongf dst_len = sz - block_header_skip; /* total size of dest
+							    buffer. */
+		uLongf src_len = block->len - block_header_skip;
+		/* Log blocks specify the *uncompressed* size in their header.
+		 */
+		uint8_t *uncompressed = reftable_malloc(sz);
+
+		/* Copy over the block header verbatim. It's not compressed. */
+		memcpy(uncompressed, block->data, block_header_skip);
+
+		/* Uncompress */
+		if (Z_OK !=
+		    uncompress2(uncompressed + block_header_skip, &dst_len,
+				block->data + block_header_skip, &src_len)) {
+			reftable_free(uncompressed);
+			return REFTABLE_ZLIB_ERROR;
+		}
+
+		if (dst_len + block_header_skip != sz)
+			return REFTABLE_FORMAT_ERROR;
+
+		/* We're done with the input data. */
+		reftable_block_done(block);
+		block->data = uncompressed;
+		block->len = sz;
+		block->source = malloc_block_source();
+		full_block_size = src_len + block_header_skip;
+	} else if (full_block_size == 0) {
+		full_block_size = sz;
+	} else if (sz < full_block_size && sz < block->len &&
+		   block->data[sz] != 0) {
+		/* If the block is smaller than the full block size, it is
+		   padded (data followed by '\0') or the next block is
+		   unaligned. */
+		full_block_size = sz;
+	}
+
+	restart_count = get_be16(block->data + sz - 2);
+	restart_start = sz - 2 - 3 * restart_count;
+	restart_bytes = block->data + restart_start;
+
+	/* transfer ownership. */
+	br->block = *block;
+	block->data = NULL;
+	block->len = 0;
+
+	br->hash_size = hash_size;
+	br->block_len = restart_start;
+	br->full_block_size = full_block_size;
+	br->header_off = header_off;
+	br->restart_count = restart_count;
+	br->restart_bytes = restart_bytes;
+
+	return 0;
+}
+
+static uint32_t block_reader_restart_offset(struct block_reader *br, int i)
+{
+	return get_be24(br->restart_bytes + 3 * i);
+}
+
+void block_reader_start(struct block_reader *br, struct block_iter *it)
+{
+	it->br = br;
+	strbuf_reset(&it->last_key);
+	it->next_off = br->header_off + 4;
+}
+
+struct restart_find_args {
+	int error;
+	struct strbuf key;
+	struct block_reader *r;
+};
+
+static int restart_key_less(size_t idx, void *args)
+{
+	struct restart_find_args *a = args;
+	uint32_t off = block_reader_restart_offset(a->r, idx);
+	struct string_view in = {
+		.buf = a->r->block.data + off,
+		.len = a->r->block_len - off,
+	};
+
+	/* the restart key is verbatim in the block, so this could avoid the
+	   alloc for decoding the key */
+	struct strbuf rkey = STRBUF_INIT;
+	struct strbuf last_key = STRBUF_INIT;
+	uint8_t unused_extra;
+	int n = reftable_decode_key(&rkey, &unused_extra, last_key, in);
+	int result;
+	if (n < 0) {
+		a->error = 1;
+		return -1;
+	}
+
+	result = strbuf_cmp(&a->key, &rkey);
+	strbuf_release(&rkey);
+	return result;
+}
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
+{
+	dest->br = src->br;
+	dest->next_off = src->next_off;
+	strbuf_reset(&dest->last_key);
+	strbuf_addbuf(&dest->last_key, &src->last_key);
+}
+
+int block_iter_next(struct block_iter *it, struct reftable_record *rec)
+{
+	struct string_view in = {
+		.buf = it->br->block.data + it->next_off,
+		.len = it->br->block_len - it->next_off,
+	};
+	struct string_view start = in;
+	struct strbuf key = STRBUF_INIT;
+	uint8_t extra = 0;
+	int n = 0;
+
+	if (it->next_off >= it->br->block_len)
+		return 1;
+
+	n = reftable_decode_key(&key, &extra, it->last_key, in);
+	if (n < 0)
+		return -1;
+
+	string_view_consume(&in, n);
+	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
+	if (n < 0)
+		return -1;
+	string_view_consume(&in, n);
+
+	strbuf_reset(&it->last_key);
+	strbuf_addbuf(&it->last_key, &key);
+	it->next_off += start.len - in.len;
+	strbuf_release(&key);
+	return 0;
+}
+
+int block_reader_first_key(struct block_reader *br, struct strbuf *key)
+{
+	struct strbuf empty = STRBUF_INIT;
+	int off = br->header_off + 4;
+	struct string_view in = {
+		.buf = br->block.data + off,
+		.len = br->block_len - off,
+	};
+
+	uint8_t extra = 0;
+	int n = reftable_decode_key(key, &extra, empty, in);
+	if (n < 0)
+		return n;
+
+	return 0;
+}
+
+int block_iter_seek(struct block_iter *it, struct strbuf *want)
+{
+	return block_reader_seek(it->br, it, want);
+}
+
+void block_iter_close(struct block_iter *it)
+{
+	strbuf_release(&it->last_key);
+}
+
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want)
+{
+	struct restart_find_args args = {
+		.key = *want,
+		.r = br,
+	};
+	struct reftable_record rec = reftable_new_record(block_reader_type(br));
+	struct strbuf key = STRBUF_INIT;
+	int err = 0;
+	struct block_iter next = {
+		.last_key = STRBUF_INIT,
+	};
+
+	int i = binsearch(br->restart_count, &restart_key_less, &args);
+	if (args.error) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	it->br = br;
+	if (i > 0) {
+		i--;
+		it->next_off = block_reader_restart_offset(br, i);
+	} else {
+		it->next_off = br->header_off + 4;
+	}
+
+	/* We're looking for the last entry less/equal than the wanted key, so
+	   we have to go one entry too far and then back up.
+	*/
+	while (1) {
+		block_iter_copy_from(&next, it);
+		err = block_iter_next(&next, &rec);
+		if (err < 0)
+			goto done;
+
+		reftable_record_key(&rec, &key);
+		if (err > 0 || strbuf_cmp(&key, want) >= 0) {
+			err = 0;
+			goto done;
+		}
+
+		block_iter_copy_from(it, &next);
+	}
+
+done:
+	strbuf_release(&key);
+	strbuf_release(&next.last_key);
+	reftable_record_destroy(&rec);
+
+	return err;
+}
+
+void block_writer_release(struct block_writer *bw)
+{
+	FREE_AND_NULL(bw->restarts);
+	strbuf_release(&bw->last_key);
+	/* the block is not owned. */
+}
+
+void reftable_block_done(struct reftable_block *blockp)
+{
+	struct reftable_block_source source = blockp->source;
+	if (blockp && source.ops)
+		source.ops->return_block(source.arg, blockp);
+	blockp->data = NULL;
+	blockp->len = 0;
+	blockp->source.ops = NULL;
+	blockp->source.arg = NULL;
+}
diff --git a/reftable/block.h b/reftable/block.h
new file mode 100644
index 000000000000..e207706a6448
--- /dev/null
+++ b/reftable/block.h
@@ -0,0 +1,127 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef BLOCK_H
+#define BLOCK_H
+
+#include "basics.h"
+#include "record.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Writes reftable blocks. The block_writer is reused across blocks to minimize
+ * allocation overhead.
+ */
+struct block_writer {
+	uint8_t *buf;
+	uint32_t block_size;
+
+	/* Offset ofof the global header. Nonzero in the first block only. */
+	uint32_t header_off;
+
+	/* How often to restart keys. */
+	int restart_interval;
+	int hash_size;
+
+	/* Offset of next uint8_t to write. */
+	uint32_t next;
+	uint32_t *restarts;
+	uint32_t restart_len;
+	uint32_t restart_cap;
+
+	struct strbuf last_key;
+	int entries;
+};
+
+/*
+ * initializes the blockwriter to write `typ` entries, using `buf` as temporary
+ * storage. `buf` is not owned by the block_writer. */
+void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf,
+		       uint32_t block_size, uint32_t header_off, int hash_size);
+
+/* returns the block type (eg. 'r' for ref records. */
+uint8_t block_writer_type(struct block_writer *bw);
+
+/* appends the record, or -1 if it doesn't fit. */
+int block_writer_add(struct block_writer *w, struct reftable_record *rec);
+
+/* appends the key restarts, and compress the block if necessary. */
+int block_writer_finish(struct block_writer *w);
+
+/* clears out internally allocated block_writer members. */
+void block_writer_release(struct block_writer *bw);
+
+/* Read a block. */
+struct block_reader {
+	/* offset of the block header; nonzero for the first block in a
+	 * reftable. */
+	uint32_t header_off;
+
+	/* the memory block */
+	struct reftable_block block;
+	int hash_size;
+
+	/* size of the data, excluding restart data. */
+	uint32_t block_len;
+	uint8_t *restart_bytes;
+	uint16_t restart_count;
+
+	/* size of the data in the file. For log blocks, this is the compressed
+	 * size. */
+	uint32_t full_block_size;
+};
+
+/* Iterate over entries in a block */
+struct block_iter {
+	/* offset within the block of the next entry to read. */
+	uint32_t next_off;
+	struct block_reader *br;
+
+	/* key for last entry we read. */
+	struct strbuf last_key;
+};
+
+/* initializes a block reader. */
+int block_reader_init(struct block_reader *br, struct reftable_block *bl,
+		      uint32_t header_off, uint32_t table_block_size,
+		      int hash_size);
+
+/* Position `it` at start of the block */
+void block_reader_start(struct block_reader *br, struct block_iter *it);
+
+/* Position `it` to the `want` key in the block */
+int block_reader_seek(struct block_reader *br, struct block_iter *it,
+		      struct strbuf *want);
+
+/* Returns the block type (eg. 'r' for refs) */
+uint8_t block_reader_type(struct block_reader *r);
+
+/* Decodes the first key in the block */
+int block_reader_first_key(struct block_reader *br, struct strbuf *key);
+
+void block_iter_copy_from(struct block_iter *dest, struct block_iter *src);
+
+/* return < 0 for error, 0 for OK, > 0 for EOF. */
+int block_iter_next(struct block_iter *it, struct reftable_record *rec);
+
+/* Seek to `want` with in the block pointed to by `it` */
+int block_iter_seek(struct block_iter *it, struct strbuf *want);
+
+/* deallocate memory for `it`. The block reader and its block is left intact. */
+void block_iter_close(struct block_iter *it);
+
+/* size of file header, depending on format version */
+int header_size(int version);
+
+/* size of file footer, depending on format version */
+int footer_size(int version);
+
+/* returns a block to its source. */
+void reftable_block_done(struct reftable_block *ret);
+
+#endif
diff --git a/reftable/block_test.c b/reftable/block_test.c
new file mode 100644
index 000000000000..85e53186ef74
--- /dev/null
+++ b/reftable/block_test.c
@@ -0,0 +1,121 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "block.h"
+
+#include "system.h"
+
+#include "blocksource.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static void test_block_read_write(void)
+{
+	const int header_off = 21; /* random */
+	char *names[30];
+	const int N = ARRAY_SIZE(names);
+	const int block_size = 1024;
+	struct reftable_block block = { NULL };
+	struct block_writer bw = {
+		.last_key = STRBUF_INIT,
+	};
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_record rec = { NULL };
+	int i = 0;
+	int n;
+	struct block_reader br = { 0 };
+	struct block_iter it = { .last_key = STRBUF_INIT };
+	int j = 0;
+	struct strbuf want = STRBUF_INIT;
+
+	block.data = reftable_calloc(block_size);
+	block.len = block_size;
+	block.source = malloc_block_source();
+	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
+			  header_off, hash_size(GIT_SHA1_HASH_ID));
+	reftable_record_from_ref(&rec, &ref);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		uint8_t hash[GIT_SHA1_RAWSZ];
+		snprintf(name, sizeof(name), "branch%02d", i);
+		memset(hash, i, sizeof(hash));
+
+		ref.refname = name;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+
+		names[i] = xstrdup(name);
+		n = block_writer_add(&bw, &rec);
+		ref.refname = NULL;
+		ref.value_type = REFTABLE_REF_DELETION;
+		EXPECT(n == 0);
+	}
+
+	n = block_writer_finish(&bw);
+	EXPECT(n > 0);
+
+	block_writer_release(&bw);
+
+	block_reader_init(&br, &block, header_off, block_size, GIT_SHA1_RAWSZ);
+
+	block_reader_start(&br, &it);
+
+	while (1) {
+		int r = block_iter_next(&it, &rec);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT_STREQ(names[j], ref.refname);
+		j++;
+	}
+
+	reftable_record_release(&rec);
+	block_iter_close(&it);
+
+	for (i = 0; i < N; i++) {
+		struct block_iter it = { .last_key = STRBUF_INIT };
+		strbuf_reset(&want);
+		strbuf_addstr(&want, names[i]);
+
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+
+		EXPECT_STREQ(names[i], ref.refname);
+
+		want.len--;
+		n = block_reader_seek(&br, &it, &want);
+		EXPECT(n == 0);
+
+		n = block_iter_next(&it, &rec);
+		EXPECT(n == 0);
+		EXPECT_STREQ(names[10 * (i / 10)], ref.refname);
+
+		block_iter_close(&it);
+	}
+
+	reftable_record_release(&rec);
+	reftable_block_done(&br.block);
+	strbuf_release(&want);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+}
+
+int block_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_block_read_write);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 09d4b83ef9b8..c9deeaf08c7a 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -4,6 +4,7 @@
 int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
+	block_test_main(argc, argv);
 	record_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 13/28] reftable: a generic binary tree implementation
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (11 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 12/28] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 14/28] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
                               ` (15 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.

The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |  4 ++-
 reftable/tree.c          | 63 ++++++++++++++++++++++++++++++++++++++++
 reftable/tree.h          | 34 ++++++++++++++++++++++
 reftable/tree_test.c     | 61 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-reftable.c |  1 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 reftable/tree.c
 create mode 100644 reftable/tree.h
 create mode 100644 reftable/tree_test.c

diff --git a/Makefile b/Makefile
index cd8a9de2519e..01916cefda0f 100644
--- a/Makefile
+++ b/Makefile
@@ -2414,11 +2414,13 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/tree.o
 
+REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
-REFTABLE_TEST_OBJS += reftable/basics_test.o
+REFTABLE_TEST_OBJS += reftable/tree_test.o
 
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
diff --git a/reftable/tree.c b/reftable/tree.c
new file mode 100644
index 000000000000..82db7995dd68
--- /dev/null
+++ b/reftable/tree.c
@@ -0,0 +1,63 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "system.h"
+
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert)
+{
+	int res;
+	if (*rootp == NULL) {
+		if (!insert) {
+			return NULL;
+		} else {
+			struct tree_node *n =
+				reftable_calloc(sizeof(struct tree_node));
+			n->key = key;
+			*rootp = n;
+			return *rootp;
+		}
+	}
+
+	res = compare(key, (*rootp)->key);
+	if (res < 0)
+		return tree_search(key, &(*rootp)->left, compare, insert);
+	else if (res > 0)
+		return tree_search(key, &(*rootp)->right, compare, insert);
+	return *rootp;
+}
+
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg)
+{
+	if (t->left) {
+		infix_walk(t->left, action, arg);
+	}
+	action(arg, t->key);
+	if (t->right) {
+		infix_walk(t->right, action, arg);
+	}
+}
+
+void tree_free(struct tree_node *t)
+{
+	if (t == NULL) {
+		return;
+	}
+	if (t->left) {
+		tree_free(t->left);
+	}
+	if (t->right) {
+		tree_free(t->right);
+	}
+	reftable_free(t);
+}
diff --git a/reftable/tree.h b/reftable/tree.h
new file mode 100644
index 000000000000..fbdd002e23af
--- /dev/null
+++ b/reftable/tree.h
@@ -0,0 +1,34 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef TREE_H
+#define TREE_H
+
+/* tree_node is a generic binary search tree. */
+struct tree_node {
+	void *key;
+	struct tree_node *left, *right;
+};
+
+/* looks for `key` in `rootp` using `compare` as comparison function. If insert
+ * is set, insert the key if it's not found. Else, return NULL.
+ */
+struct tree_node *tree_search(void *key, struct tree_node **rootp,
+			      int (*compare)(const void *, const void *),
+			      int insert);
+
+/* performs an infix walk of the tree. */
+void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key),
+		void *arg);
+
+/*
+ * deallocates the tree nodes recursively. Keys should be deallocated separately
+ * by walking over the tree. */
+void tree_free(struct tree_node *t);
+
+#endif
diff --git a/reftable/tree_test.c b/reftable/tree_test.c
new file mode 100644
index 000000000000..09a970e17b9c
--- /dev/null
+++ b/reftable/tree_test.c
@@ -0,0 +1,61 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "tree.h"
+
+#include "basics.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+static int test_compare(const void *a, const void *b)
+{
+	return (char *)a - (char *)b;
+}
+
+struct curry {
+	void *last;
+};
+
+static void check_increasing(void *arg, void *key)
+{
+	struct curry *c = arg;
+	if (c->last) {
+		assert(test_compare(c->last, key) < 0);
+	}
+	c->last = key;
+}
+
+static void test_tree(void)
+{
+	struct tree_node *root = NULL;
+
+	void *values[11] = { NULL };
+	struct tree_node *nodes[11] = { NULL };
+	int i = 1;
+	struct curry c = { NULL };
+	do {
+		nodes[i] = tree_search(values + i, &root, &test_compare, 1);
+		i = (i * 7) % 11;
+	} while (i != 1);
+
+	for (i = 1; i < ARRAY_SIZE(nodes); i++) {
+		assert(values + i == nodes[i]->key);
+		assert(nodes[i] ==
+		       tree_search(values + i, &root, &test_compare, 0));
+	}
+
+	infix_walk(root, check_increasing, &c);
+	tree_free(root);
+}
+
+int tree_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_tree);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c9deeaf08c7a..050551fa6985 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 14/28] reftable: write reftable files
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (12 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 13/28] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 15/28] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
                               ` (14 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   1 +
 reftable/reftable-writer.h | 147 ++++++++
 reftable/writer.c          | 690 +++++++++++++++++++++++++++++++++++++
 reftable/writer.h          |  50 +++
 4 files changed, 888 insertions(+)
 create mode 100644 reftable/reftable-writer.h
 create mode 100644 reftable/writer.c
 create mode 100644 reftable/writer.h

diff --git a/Makefile b/Makefile
index 01916cefda0f..f7954b362aeb 100644
--- a/Makefile
+++ b/Makefile
@@ -2415,6 +2415,7 @@ REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/tree.o
+REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
new file mode 100644
index 000000000000..bdef4813f4c7
--- /dev/null
+++ b/reftable/reftable-writer.h
@@ -0,0 +1,147 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_WRITER_H
+#define REFTABLE_WRITER_H
+
+#include <stdint.h>
+
+#include "reftable-record.h"
+
+/* Writing single reftables */
+
+/* reftable_write_options sets options for writing a single reftable. */
+struct reftable_write_options {
+	/* boolean: do not pad out blocks to block size. */
+	unsigned unpadded : 1;
+
+	/* the blocksize. Should be less than 2^24. */
+	uint32_t block_size;
+
+	/* boolean: do not generate a SHA1 => ref index. */
+	unsigned skip_index_objects : 1;
+
+	/* how often to write complete keys in each block. */
+	int restart_interval;
+
+	/* 4-byte identifier ("sha1", "s256") of the hash.
+	 * Defaults to SHA1 if unset
+	 */
+	uint32_t hash_id;
+
+	/* boolean: do not check ref names for validity or dir/file conflicts.
+	 */
+	unsigned skip_name_check : 1;
+
+	/* boolean: copy log messages exactly. If unset, check that the message
+	 *   is a single line, and add '\n' if missing.
+	 */
+	unsigned exact_log_message : 1;
+};
+
+/* reftable_block_stats holds statistics for a single block type */
+struct reftable_block_stats {
+	/* total number of entries written */
+	int entries;
+	/* total number of key restarts */
+	int restarts;
+	/* total number of blocks */
+	int blocks;
+	/* total number of index blocks */
+	int index_blocks;
+	/* depth of the index */
+	int max_index_level;
+
+	/* offset of the first block for this type */
+	uint64_t offset;
+	/* offset of the top level index block for this type, or 0 if not
+	 * present */
+	uint64_t index_offset;
+};
+
+/* stats holds overall statistics for a single reftable */
+struct reftable_stats {
+	/* total number of blocks written. */
+	int blocks;
+	/* stats for ref data */
+	struct reftable_block_stats ref_stats;
+	/* stats for the SHA1 to ref map. */
+	struct reftable_block_stats obj_stats;
+	/* stats for index blocks */
+	struct reftable_block_stats idx_stats;
+	/* stats for log blocks */
+	struct reftable_block_stats log_stats;
+
+	/* disambiguation length of shortened object IDs. */
+	int object_id_len;
+};
+
+/* reftable_new_writer creates a new writer */
+struct reftable_writer *
+reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts);
+
+/* Set the range of update indices for the records we will add. When writing a
+   table into a stack, the min should be at least
+   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
+
+   For transactional updates to a stack, typically min==max, and the
+   update_index can be obtained by inspeciting the stack. When converting an
+   existing ref database into a single reftable, this would be a range of
+   update-index timestamps.
+ */
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max);
+
+/*
+  Add a reftable_ref_record. The record should have names that come after
+  already added records.
+
+  The update_index must be within the limits set by
+  reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an
+  REFTABLE_API_ERROR error to write a ref record after a log record.
+*/
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref);
+
+/*
+  Convenience function to add multiple reftable_ref_records; the function sorts
+  the records before adding them, reordering the records array passed in.
+*/
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n);
+
+/*
+  adds reftable_log_records. Log records are keyed by (refname, decreasing
+  update_index). The key for the record added must come after the already added
+  log records.
+*/
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log);
+
+/*
+  Convenience function to add multiple reftable_log_records; the function sorts
+  the records before adding them, reordering records array passed in.
+*/
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n);
+
+/* reftable_writer_close finalizes the reftable. The writer is retained so
+ * statistics can be inspected. */
+int reftable_writer_close(struct reftable_writer *w);
+
+/* writer_stats returns the statistics on the reftable being written.
+
+   This struct becomes invalid when the writer is freed.
+ */
+const struct reftable_stats *writer_stats(struct reftable_writer *w);
+
+/* reftable_writer_free deallocates memory for the writer */
+void reftable_writer_free(struct reftable_writer *w);
+
+#endif
diff --git a/reftable/writer.c b/reftable/writer.c
new file mode 100644
index 000000000000..81e9515c5358
--- /dev/null
+++ b/reftable/writer.c
@@ -0,0 +1,690 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "writer.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "constants.h"
+#include "record.h"
+#include "tree.h"
+#include "reftable-error.h"
+
+/* finishes a block, and writes it to storage */
+static int writer_flush_block(struct reftable_writer *w);
+
+/* deallocates memory related to the index */
+static void writer_clear_index(struct reftable_writer *w);
+
+/* finishes writing a 'r' (refs) or 'g' (reflogs) section */
+static int writer_finish_public_section(struct reftable_writer *w);
+
+static struct reftable_block_stats *
+writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ)
+{
+	switch (typ) {
+	case 'r':
+		return &w->stats.ref_stats;
+	case 'o':
+		return &w->stats.obj_stats;
+	case 'i':
+		return &w->stats.idx_stats;
+	case 'g':
+		return &w->stats.log_stats;
+	}
+	abort();
+	return NULL;
+}
+
+/* write data, queuing the padding for the next write. Returns negative for
+ * error. */
+static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len,
+			int padding)
+{
+	int n = 0;
+	if (w->pending_padding > 0) {
+		uint8_t *zeroed = reftable_calloc(w->pending_padding);
+		int n = w->write(w->write_arg, zeroed, w->pending_padding);
+		if (n < 0)
+			return n;
+
+		w->pending_padding = 0;
+		reftable_free(zeroed);
+	}
+
+	w->pending_padding = padding;
+	n = w->write(w->write_arg, data, len);
+	if (n < 0)
+		return n;
+	n += padding;
+	return 0;
+}
+
+static void options_set_defaults(struct reftable_write_options *opts)
+{
+	if (opts->restart_interval == 0) {
+		opts->restart_interval = 16;
+	}
+
+	if (opts->hash_id == 0) {
+		opts->hash_id = GIT_SHA1_HASH_ID;
+	}
+	if (opts->block_size == 0) {
+		opts->block_size = DEFAULT_BLOCK_SIZE;
+	}
+}
+
+static int writer_version(struct reftable_writer *w)
+{
+	return (w->opts.hash_id == 0 || w->opts.hash_id == GIT_SHA1_HASH_ID) ?
+			     1 :
+			     2;
+}
+
+static int writer_write_header(struct reftable_writer *w, uint8_t *dest)
+{
+	memcpy((char *)dest, "REFT", 4);
+
+	dest[4] = writer_version(w);
+
+	put_be24(dest + 5, w->opts.block_size);
+	put_be64(dest + 8, w->min_update_index);
+	put_be64(dest + 16, w->max_update_index);
+	if (writer_version(w) == 2) {
+		put_be32(dest + 24, w->opts.hash_id);
+	}
+	return header_size(writer_version(w));
+}
+
+static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ)
+{
+	int block_start = 0;
+	if (w->next == 0) {
+		block_start = header_size(writer_version(w));
+	}
+
+	strbuf_release(&w->last_key);
+	block_writer_init(&w->block_writer_data, typ, w->block,
+			  w->opts.block_size, block_start,
+			  hash_size(w->opts.hash_id));
+	w->block_writer = &w->block_writer_data;
+	w->block_writer->restart_interval = w->opts.restart_interval;
+}
+
+static struct strbuf reftable_empty_strbuf = STRBUF_INIT;
+
+struct reftable_writer *
+reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t),
+		    void *writer_arg, struct reftable_write_options *opts)
+{
+	struct reftable_writer *wp =
+		reftable_calloc(sizeof(struct reftable_writer));
+	strbuf_init(&wp->block_writer_data.last_key, 0);
+	options_set_defaults(opts);
+	if (opts->block_size >= (1 << 24)) {
+		/* TODO - error return? */
+		abort();
+	}
+	wp->last_key = reftable_empty_strbuf;
+	wp->block = reftable_calloc(opts->block_size);
+	wp->write = writer_func;
+	wp->write_arg = writer_arg;
+	wp->opts = *opts;
+	writer_reinit_block_writer(wp, BLOCK_TYPE_REF);
+
+	return wp;
+}
+
+void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
+				uint64_t max)
+{
+	w->min_update_index = min;
+	w->max_update_index = max;
+}
+
+void reftable_writer_free(struct reftable_writer *w)
+{
+	reftable_free(w->block);
+	reftable_free(w);
+}
+
+struct obj_index_tree_node {
+	struct strbuf hash;
+	uint64_t *offsets;
+	size_t offset_len;
+	size_t offset_cap;
+};
+
+#define OBJ_INDEX_TREE_NODE_INIT    \
+	{                           \
+		.hash = STRBUF_INIT \
+	}
+
+static int obj_index_tree_node_compare(const void *a, const void *b)
+{
+	return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash,
+			  &((const struct obj_index_tree_node *)b)->hash);
+}
+
+static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash)
+{
+	uint64_t off = w->next;
+
+	struct obj_index_tree_node want = { .hash = *hash };
+
+	struct tree_node *node = tree_search(&want, &w->obj_index_tree,
+					     &obj_index_tree_node_compare, 0);
+	struct obj_index_tree_node *key = NULL;
+	if (node == NULL) {
+		struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT;
+		key = reftable_malloc(sizeof(struct obj_index_tree_node));
+		*key = empty;
+
+		strbuf_reset(&key->hash);
+		strbuf_addbuf(&key->hash, hash);
+		tree_search((void *)key, &w->obj_index_tree,
+			    &obj_index_tree_node_compare, 1);
+	} else {
+		key = node->key;
+	}
+
+	if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) {
+		return;
+	}
+
+	if (key->offset_len == key->offset_cap) {
+		key->offset_cap = 2 * key->offset_cap + 1;
+		key->offsets = reftable_realloc(
+			key->offsets, sizeof(uint64_t) * key->offset_cap);
+	}
+
+	key->offsets[key->offset_len++] = off;
+}
+
+static int writer_add_record(struct reftable_writer *w,
+			     struct reftable_record *rec)
+{
+	struct strbuf key = STRBUF_INIT;
+	int err = -1;
+	reftable_record_key(rec, &key);
+	if (strbuf_cmp(&w->last_key, &key) >= 0) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	strbuf_reset(&w->last_key);
+	strbuf_addbuf(&w->last_key, &key);
+	if (w->block_writer == NULL) {
+		writer_reinit_block_writer(w, reftable_record_type(rec));
+	}
+
+	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+
+	if (block_writer_add(w->block_writer, rec) == 0) {
+		err = 0;
+		goto done;
+	}
+
+	err = writer_flush_block(w);
+	if (err < 0) {
+		goto done;
+	}
+
+	writer_reinit_block_writer(w, reftable_record_type(rec));
+	err = block_writer_add(w->block_writer, rec);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = 0;
+done:
+	strbuf_release(&key);
+	return err;
+}
+
+int reftable_writer_add_ref(struct reftable_writer *w,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	struct reftable_ref_record copy = *ref;
+	int err = 0;
+
+	if (ref->refname == NULL)
+		return REFTABLE_API_ERROR;
+	if (ref->update_index < w->min_update_index ||
+	    ref->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
+	reftable_record_from_ref(&rec, &copy);
+	copy.update_index -= w->min_update_index;
+
+	err = writer_add_record(w, &rec);
+	if (err < 0)
+		return err;
+
+	if (!w->opts.skip_index_objects && reftable_ref_record_val1(ref)) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, (char *)reftable_ref_record_val1(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+
+	if (!w->opts.skip_index_objects && reftable_ref_record_val2(ref)) {
+		struct strbuf h = STRBUF_INIT;
+		strbuf_add(&h, reftable_ref_record_val2(ref),
+			   hash_size(w->opts.hash_id));
+		writer_index_hash(w, &h);
+		strbuf_release(&h);
+	}
+	return 0;
+}
+
+int reftable_writer_add_refs(struct reftable_writer *w,
+			     struct reftable_ref_record *refs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(refs, n, reftable_ref_record_compare_name);
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_ref(w, &refs[i]);
+	}
+	return err;
+}
+
+static int reftable_writer_add_log_verbatim(struct reftable_writer *w,
+					    struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	if (w->block_writer &&
+	    block_writer_type(w->block_writer) == BLOCK_TYPE_REF) {
+		int err = writer_finish_public_section(w);
+		if (err < 0)
+			return err;
+	}
+
+	w->next -= w->pending_padding;
+	w->pending_padding = 0;
+
+	reftable_record_from_log(&rec, log);
+	return writer_add_record(w, &rec);
+}
+
+int reftable_writer_add_log(struct reftable_writer *w,
+			    struct reftable_log_record *log)
+{
+	char *input_log_message = NULL;
+	struct strbuf cleaned_message = STRBUF_INIT;
+	int err = 0;
+
+	if (log->value_type == REFTABLE_LOG_DELETION)
+		return reftable_writer_add_log_verbatim(w, log);
+
+	if (log->refname == NULL)
+		return REFTABLE_API_ERROR;
+
+	input_log_message = log->update.message;
+	if (!w->opts.exact_log_message && log->update.message) {
+		strbuf_addstr(&cleaned_message, log->update.message);
+		while (cleaned_message.len &&
+		       cleaned_message.buf[cleaned_message.len - 1] == '\n')
+			strbuf_setlen(&cleaned_message,
+				      cleaned_message.len - 1);
+		if (strchr(cleaned_message.buf, '\n')) {
+			// multiple lines not allowed.
+			err = REFTABLE_API_ERROR;
+			goto done;
+		}
+		strbuf_addstr(&cleaned_message, "\n");
+		log->update.message = cleaned_message.buf;
+	}
+
+	err = reftable_writer_add_log_verbatim(w, log);
+	log->update.message = input_log_message;
+done:
+	strbuf_release(&cleaned_message);
+	return err;
+}
+
+int reftable_writer_add_logs(struct reftable_writer *w,
+			     struct reftable_log_record *logs, int n)
+{
+	int err = 0;
+	int i = 0;
+	QSORT(logs, n, reftable_log_record_compare_key);
+
+	for (i = 0; err == 0 && i < n; i++) {
+		err = reftable_writer_add_log(w, &logs[i]);
+	}
+	return err;
+}
+
+static int writer_finish_section(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	uint64_t index_start = 0;
+	int max_level = 0;
+	int threshold = w->opts.unpadded ? 1 : 3;
+	int before_blocks = w->stats.idx_stats.blocks;
+	int err = writer_flush_block(w);
+	int i = 0;
+	struct reftable_block_stats *bstats = NULL;
+	if (err < 0)
+		return err;
+
+	while (w->index_len > threshold) {
+		struct reftable_index_record *idx = NULL;
+		int idx_len = 0;
+
+		max_level++;
+		index_start = w->next;
+		writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+		idx = w->index;
+		idx_len = w->index_len;
+
+		w->index = NULL;
+		w->index_len = 0;
+		w->index_cap = 0;
+		for (i = 0; i < idx_len; i++) {
+			struct reftable_record rec = { NULL };
+			reftable_record_from_index(&rec, idx + i);
+			if (block_writer_add(w->block_writer, &rec) == 0) {
+				continue;
+			}
+
+			err = writer_flush_block(w);
+			if (err < 0)
+				return err;
+
+			writer_reinit_block_writer(w, BLOCK_TYPE_INDEX);
+
+			err = block_writer_add(w->block_writer, &rec);
+			if (err != 0) {
+				/* write into fresh block should always succeed
+				 */
+				abort();
+			}
+		}
+		for (i = 0; i < idx_len; i++) {
+			strbuf_release(&idx[i].last_key);
+		}
+		reftable_free(idx);
+	}
+
+	writer_clear_index(w);
+
+	err = writer_flush_block(w);
+	if (err < 0)
+		return err;
+
+	bstats = writer_reftable_block_stats(w, typ);
+	bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks;
+	bstats->index_offset = index_start;
+	bstats->max_index_level = max_level;
+
+	/* Reinit lastKey, as the next section can start with any key. */
+	w->last_key.len = 0;
+
+	return 0;
+}
+
+struct common_prefix_arg {
+	struct strbuf *last;
+	int max;
+};
+
+static void update_common(void *void_arg, void *key)
+{
+	struct common_prefix_arg *arg = void_arg;
+	struct obj_index_tree_node *entry = key;
+	if (arg->last) {
+		int n = common_prefix_size(&entry->hash, arg->last);
+		if (n > arg->max) {
+			arg->max = n;
+		}
+	}
+	arg->last = &entry->hash;
+}
+
+struct write_record_arg {
+	struct reftable_writer *w;
+	int err;
+};
+
+static void write_object_record(void *void_arg, void *key)
+{
+	struct write_record_arg *arg = void_arg;
+	struct obj_index_tree_node *entry = key;
+	struct reftable_obj_record obj_rec = {
+		.hash_prefix = (uint8_t *)entry->hash.buf,
+		.hash_prefix_len = arg->w->stats.object_id_len,
+		.offsets = entry->offsets,
+		.offset_len = entry->offset_len,
+	};
+	struct reftable_record rec = { NULL };
+	if (arg->err < 0)
+		goto done;
+
+	reftable_record_from_obj(&rec, &obj_rec);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+
+	arg->err = writer_flush_block(arg->w);
+	if (arg->err < 0)
+		goto done;
+
+	writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ);
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+	if (arg->err == 0)
+		goto done;
+	obj_rec.offset_len = 0;
+	arg->err = block_writer_add(arg->w->block_writer, &rec);
+
+	/* Should be able to write into a fresh block. */
+	assert(arg->err == 0);
+
+done:;
+}
+
+static void object_record_free(void *void_arg, void *key)
+{
+	struct obj_index_tree_node *entry = key;
+
+	FREE_AND_NULL(entry->offsets);
+	strbuf_release(&entry->hash);
+	reftable_free(entry);
+}
+
+static int writer_dump_object_index(struct reftable_writer *w)
+{
+	struct write_record_arg closure = { .w = w };
+	struct common_prefix_arg common = { NULL };
+	if (w->obj_index_tree) {
+		infix_walk(w->obj_index_tree, &update_common, &common);
+	}
+	w->stats.object_id_len = common.max + 1;
+
+	writer_reinit_block_writer(w, BLOCK_TYPE_OBJ);
+
+	if (w->obj_index_tree) {
+		infix_walk(w->obj_index_tree, &write_object_record, &closure);
+	}
+
+	if (closure.err < 0)
+		return closure.err;
+	return writer_finish_section(w);
+}
+
+static int writer_finish_public_section(struct reftable_writer *w)
+{
+	uint8_t typ = 0;
+	int err = 0;
+
+	if (w->block_writer == NULL)
+		return 0;
+
+	typ = block_writer_type(w->block_writer);
+	err = writer_finish_section(w);
+	if (err < 0)
+		return err;
+	if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects &&
+	    w->stats.ref_stats.index_blocks > 0) {
+		err = writer_dump_object_index(w);
+		if (err < 0)
+			return err;
+	}
+
+	if (w->obj_index_tree) {
+		infix_walk(w->obj_index_tree, &object_record_free, NULL);
+		tree_free(w->obj_index_tree);
+		w->obj_index_tree = NULL;
+	}
+
+	w->block_writer = NULL;
+	return 0;
+}
+
+int reftable_writer_close(struct reftable_writer *w)
+{
+	uint8_t footer[72];
+	uint8_t *p = footer;
+	int err = writer_finish_public_section(w);
+	int empty_table = w->next == 0;
+	if (err != 0)
+		goto done;
+	w->pending_padding = 0;
+	if (empty_table) {
+		/* Empty tables need a header anyway. */
+		uint8_t header[28];
+		int n = writer_write_header(w, header);
+		err = padded_write(w, header, n, 0);
+		if (err < 0)
+			goto done;
+	}
+
+	p += writer_write_header(w, footer);
+	put_be64(p, w->stats.ref_stats.index_offset);
+	p += 8;
+	put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len);
+	p += 8;
+	put_be64(p, w->stats.obj_stats.index_offset);
+	p += 8;
+
+	put_be64(p, w->stats.log_stats.offset);
+	p += 8;
+	put_be64(p, w->stats.log_stats.index_offset);
+	p += 8;
+
+	put_be32(p, crc32(0, footer, p - footer));
+	p += 4;
+
+	err = padded_write(w, footer, footer_size(writer_version(w)), 0);
+	if (err < 0)
+		goto done;
+
+	if (empty_table) {
+		err = REFTABLE_EMPTY_TABLE_ERROR;
+		goto done;
+	}
+
+done:
+	/* free up memory. */
+	block_writer_release(&w->block_writer_data);
+	writer_clear_index(w);
+	strbuf_release(&w->last_key);
+	return err;
+}
+
+static void writer_clear_index(struct reftable_writer *w)
+{
+	int i = 0;
+	for (i = 0; i < w->index_len; i++) {
+		strbuf_release(&w->index[i].last_key);
+	}
+
+	FREE_AND_NULL(w->index);
+	w->index_len = 0;
+	w->index_cap = 0;
+}
+
+static const int debug = 0;
+
+static int writer_flush_nonempty_block(struct reftable_writer *w)
+{
+	uint8_t typ = block_writer_type(w->block_writer);
+	struct reftable_block_stats *bstats =
+		writer_reftable_block_stats(w, typ);
+	uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0;
+	int raw_bytes = block_writer_finish(w->block_writer);
+	int padding = 0;
+	int err = 0;
+	struct reftable_index_record ir = { .last_key = STRBUF_INIT };
+	if (raw_bytes < 0)
+		return raw_bytes;
+
+	if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) {
+		padding = w->opts.block_size - raw_bytes;
+	}
+
+	if (block_typ_off > 0) {
+		bstats->offset = block_typ_off;
+	}
+
+	bstats->entries += w->block_writer->entries;
+	bstats->restarts += w->block_writer->restart_len;
+	bstats->blocks++;
+	w->stats.blocks++;
+
+	if (debug) {
+		fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ,
+			w->next, raw_bytes,
+			get_be24(w->block + w->block_writer->header_off + 1));
+	}
+
+	if (w->next == 0) {
+		writer_write_header(w, w->block);
+	}
+
+	err = padded_write(w, w->block, raw_bytes, padding);
+	if (err < 0)
+		return err;
+
+	if (w->index_cap == w->index_len) {
+		w->index_cap = 2 * w->index_cap + 1;
+		w->index = reftable_realloc(
+			w->index,
+			sizeof(struct reftable_index_record) * w->index_cap);
+	}
+
+	ir.offset = w->next;
+	strbuf_reset(&ir.last_key);
+	strbuf_addbuf(&ir.last_key, &w->block_writer->last_key);
+	w->index[w->index_len] = ir;
+
+	w->index_len++;
+	w->next += padding + raw_bytes;
+	w->block_writer = NULL;
+	return 0;
+}
+
+static int writer_flush_block(struct reftable_writer *w)
+{
+	if (w->block_writer == NULL)
+		return 0;
+	if (w->block_writer->entries == 0)
+		return 0;
+	return writer_flush_nonempty_block(w);
+}
+
+const struct reftable_stats *writer_stats(struct reftable_writer *w)
+{
+	return &w->stats;
+}
diff --git a/reftable/writer.h b/reftable/writer.h
new file mode 100644
index 000000000000..09b88673d975
--- /dev/null
+++ b/reftable/writer.h
@@ -0,0 +1,50 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef WRITER_H
+#define WRITER_H
+
+#include "basics.h"
+#include "block.h"
+#include "tree.h"
+#include "reftable-writer.h"
+
+struct reftable_writer {
+	ssize_t (*write)(void *, const void *, size_t);
+	void *write_arg;
+	int pending_padding;
+	struct strbuf last_key;
+
+	/* offset of next block to write. */
+	uint64_t next;
+	uint64_t min_update_index, max_update_index;
+	struct reftable_write_options opts;
+
+	/* memory buffer for writing */
+	uint8_t *block;
+
+	/* writer for the current section. NULL or points to
+	 * block_writer_data */
+	struct block_writer *block_writer;
+
+	struct block_writer block_writer_data;
+
+	/* pending index records for the current section */
+	struct reftable_index_record *index;
+	size_t index_len;
+	size_t index_cap;
+
+	/*
+	 * tree for use with tsearch; used to populate the 'o' inverse OID
+	 * map */
+	struct tree_node *obj_index_tree;
+
+	struct reftable_stats stats;
+};
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 15/28] reftable: generic interface to tables
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (13 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 14/28] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 16/28] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
                               ` (13 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                     |   3 +
 reftable/generic.c           | 169 +++++++++++++++++++++++++++++++++++
 reftable/generic.h           |  32 +++++++
 reftable/reftable-generic.h  |  47 ++++++++++
 reftable/reftable-iterator.h |  39 ++++++++
 reftable/reftable.c          | 115 ++++++++++++++++++++++++
 6 files changed, 405 insertions(+)
 create mode 100644 reftable/generic.c
 create mode 100644 reftable/generic.h
 create mode 100644 reftable/reftable-generic.h
 create mode 100644 reftable/reftable-iterator.h
 create mode 100644 reftable/reftable.c

diff --git a/Makefile b/Makefile
index f7954b362aeb..97b69ce8f080 100644
--- a/Makefile
+++ b/Makefile
@@ -2414,6 +2414,9 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/publicbasics.o
 REFTABLE_OBJS += reftable/record.o
+REFTABLE_OBJS += reftable/refname.o
+REFTABLE_OBJS += reftable/generic.o
+REFTABLE_OBJS += reftable/stack.o
 REFTABLE_OBJS += reftable/tree.o
 REFTABLE_OBJS += reftable/writer.o
 
diff --git a/reftable/generic.c b/reftable/generic.c
new file mode 100644
index 000000000000..7a8a738d8609
--- /dev/null
+++ b/reftable/generic.c
@@ -0,0 +1,169 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "record.h"
+#include "generic.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+int reftable_table_seek_log(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = ~((uint64_t)0),
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+int reftable_table_print(struct reftable_table *tab) {
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+	uint32_t hash_id = reftable_table_hash_id(tab);
+	int err = reftable_table_seek_ref(tab, &it, "");
+	if (err < 0) {
+		return err;
+	}
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_ref_record_print(&ref, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_table_seek_log(tab, &it, "");
+	if (err < 0) {
+		return err;
+	}
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0) {
+			return err;
+		}
+		reftable_log_record_print(&log, hash_id);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return 0;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (!it->ops) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(!it->ops);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
diff --git a/reftable/generic.h b/reftable/generic.h
new file mode 100644
index 000000000000..98886a06402d
--- /dev/null
+++ b/reftable/generic.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef GENERIC_H
+#define GENERIC_H
+
+#include "record.h"
+#include "reftable-generic.h"
+
+/* generic interface to reftables */
+struct reftable_table_vtable {
+	int (*seek_record)(void *tab, struct reftable_iterator *it,
+			   struct reftable_record *);
+	uint32_t (*hash_id)(void *tab);
+	uint64_t (*min_update_index)(void *tab);
+	uint64_t (*max_update_index)(void *tab);
+};
+
+struct reftable_iterator_vtable {
+	int (*next)(void *iter_arg, struct reftable_record *rec);
+	void (*close)(void *iter_arg);
+};
+
+void iterator_set_empty(struct reftable_iterator *it);
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec);
+
+#endif
diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h
new file mode 100644
index 000000000000..d239751a7783
--- /dev/null
+++ b/reftable/reftable-generic.h
@@ -0,0 +1,47 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_GENERIC_H
+#define REFTABLE_GENERIC_H
+
+#include "reftable-iterator.h"
+
+struct reftable_table_vtable;
+
+/*
+ * Provides a unified API for reading tables, either merged tables, or single
+ * readers. */
+struct reftable_table {
+	struct reftable_table_vtable *ops;
+	void *table_arg;
+};
+
+int reftable_table_seek_log(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID from a generic reftable_table */
+uint32_t reftable_table_hash_id(struct reftable_table *tab);
+
+/* returns the max update_index covered by this table. */
+uint64_t reftable_table_max_update_index(struct reftable_table *tab);
+
+/* returns the min update_index covered by this table. */
+uint64_t reftable_table_min_update_index(struct reftable_table *tab);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0
+   for success, and 1 if ref not found. */
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref);
+
+/* dump table contents onto stdout for debugging */
+int reftable_table_print(struct reftable_table *tab);
+
+#endif
diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h
new file mode 100644
index 000000000000..d3eee7af3571
--- /dev/null
+++ b/reftable/reftable-iterator.h
@@ -0,0 +1,39 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ITERATOR_H
+#define REFTABLE_ITERATOR_H
+
+#include "reftable-record.h"
+
+struct reftable_iterator_vtable;
+
+/* iterator is the generic interface for walking over data stored in a
+ * reftable.
+ */
+struct reftable_iterator {
+	struct reftable_iterator_vtable *ops;
+	void *iter_arg;
+};
+
+/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref);
+
+/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0:
+ * end of iteration.
+ */
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log);
+
+/* releases resources associated with an iterator. */
+void reftable_iterator_destroy(struct reftable_iterator *it);
+
+#endif
diff --git a/reftable/reftable.c b/reftable/reftable.c
new file mode 100644
index 000000000000..0e4607a7cd6c
--- /dev/null
+++ b/reftable/reftable.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "record.h"
+#include "generic.h"
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+int reftable_table_seek_ref(struct reftable_table *tab,
+			    struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return tab->ops->seek_record(tab->table_arg, it, &rec);
+}
+
+int reftable_table_read_ref(struct reftable_table *tab, const char *name,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_iterator it = { NULL };
+	int err = reftable_table_seek_ref(tab, &it, name);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_ref(&it, ref);
+	if (err)
+		goto done;
+
+	if (strcmp(ref->refname, name) ||
+	    reftable_ref_record_is_deletion(ref)) {
+		reftable_ref_record_release(ref);
+		err = 1;
+		goto done;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+uint64_t reftable_table_max_update_index(struct reftable_table *tab)
+{
+	return tab->ops->max_update_index(tab->table_arg);
+}
+
+uint64_t reftable_table_min_update_index(struct reftable_table *tab)
+{
+	return tab->ops->min_update_index(tab->table_arg);
+}
+
+uint32_t reftable_table_hash_id(struct reftable_table *tab)
+{
+	return tab->ops->hash_id(tab->table_arg);
+}
+
+void reftable_iterator_destroy(struct reftable_iterator *it)
+{
+	if (!it->ops) {
+		return;
+	}
+	it->ops->close(it->iter_arg);
+	it->ops = NULL;
+	FREE_AND_NULL(it->iter_arg);
+}
+
+int reftable_iterator_next_ref(struct reftable_iterator *it,
+			       struct reftable_ref_record *ref)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, ref);
+	return iterator_next(it, &rec);
+}
+
+int reftable_iterator_next_log(struct reftable_iterator *it,
+			       struct reftable_log_record *log)
+{
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, log);
+	return iterator_next(it, &rec);
+}
+
+int iterator_next(struct reftable_iterator *it, struct reftable_record *rec)
+{
+	return it->ops->next(it->iter_arg, rec);
+}
+
+static int empty_iterator_next(void *arg, struct reftable_record *rec)
+{
+	return 1;
+}
+
+static void empty_iterator_close(void *arg)
+{
+}
+
+static struct reftable_iterator_vtable empty_vtable = {
+	.next = &empty_iterator_next,
+	.close = &empty_iterator_close,
+};
+
+void iterator_set_empty(struct reftable_iterator *it)
+{
+	assert(!it->ops);
+	it->iter_arg = NULL;
+	it->ops = &empty_vtable;
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 16/28] reftable: read reftable files
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (14 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 15/28] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 17/28] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
                               ` (12 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This supports reading a single reftable file.

The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   2 +
 reftable/iter.c            | 194 +++++++++
 reftable/iter.h            |  69 ++++
 reftable/reader.c          | 794 +++++++++++++++++++++++++++++++++++++
 reftable/reader.h          |  66 +++
 reftable/reftable-reader.h | 101 +++++
 6 files changed, 1226 insertions(+)
 create mode 100644 reftable/iter.c
 create mode 100644 reftable/iter.h
 create mode 100644 reftable/reader.c
 create mode 100644 reftable/reader.h
 create mode 100644 reftable/reftable-reader.h

diff --git a/Makefile b/Makefile
index 97b69ce8f080..6593e632ac08 100644
--- a/Makefile
+++ b/Makefile
@@ -2412,7 +2412,9 @@ REFTABLE_OBJS += reftable/basics.o
 REFTABLE_OBJS += reftable/error.o
 REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
+REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/refname.o
 REFTABLE_OBJS += reftable/generic.o
diff --git a/reftable/iter.c b/reftable/iter.c
new file mode 100644
index 000000000000..93d04f735b85
--- /dev/null
+++ b/reftable/iter.c
@@ -0,0 +1,194 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "iter.h"
+
+#include "system.h"
+
+#include "block.h"
+#include "generic.h"
+#include "constants.h"
+#include "reader.h"
+#include "reftable-error.h"
+
+int iterator_is_null(struct reftable_iterator *it)
+{
+	return !it->ops;
+}
+
+static void filtering_ref_iterator_close(void *iter_arg)
+{
+	struct filtering_ref_iterator *fri = iter_arg;
+	strbuf_release(&fri->oid);
+	reftable_iterator_destroy(&fri->it);
+}
+
+static int filtering_ref_iterator_next(void *iter_arg,
+				       struct reftable_record *rec)
+{
+	struct filtering_ref_iterator *fri = iter_arg;
+	struct reftable_ref_record *ref = rec->data;
+	int err = 0;
+	while (1) {
+		err = reftable_iterator_next_ref(&fri->it, ref);
+		if (err != 0) {
+			break;
+		}
+
+		if (fri->double_check) {
+			struct reftable_iterator it = { NULL };
+
+			err = reftable_table_seek_ref(&fri->tab, &it,
+						      ref->refname);
+			if (err == 0) {
+				err = reftable_iterator_next_ref(&it, ref);
+			}
+
+			reftable_iterator_destroy(&it);
+
+			if (err < 0) {
+				break;
+			}
+
+			if (err > 0) {
+				continue;
+			}
+		}
+
+		if (ref->value_type == REFTABLE_REF_VAL2 &&
+		    (!memcmp(fri->oid.buf, ref->value.val2.target_value,
+			     fri->oid.len) ||
+		     !memcmp(fri->oid.buf, ref->value.val2.value,
+			     fri->oid.len)))
+			return 0;
+
+		if (ref->value_type == REFTABLE_REF_VAL1 &&
+		    !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) {
+			return 0;
+		}
+	}
+
+	reftable_ref_record_release(ref);
+	return err;
+}
+
+static struct reftable_iterator_vtable filtering_ref_iterator_vtable = {
+	.next = &filtering_ref_iterator_next,
+	.close = &filtering_ref_iterator_close,
+};
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *it,
+					  struct filtering_ref_iterator *fri)
+{
+	assert(!it->ops);
+	it->iter_arg = fri;
+	it->ops = &filtering_ref_iterator_vtable;
+}
+
+static void indexed_table_ref_iter_close(void *p)
+{
+	struct indexed_table_ref_iter *it = p;
+	block_iter_close(&it->cur);
+	reftable_block_done(&it->block_reader.block);
+	reftable_free(it->offsets);
+	strbuf_release(&it->oid);
+}
+
+static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it)
+{
+	uint64_t off;
+	int err = 0;
+	if (it->offset_idx == it->offset_len) {
+		it->is_finished = 1;
+		return 1;
+	}
+
+	reftable_block_done(&it->block_reader.block);
+
+	off = it->offsets[it->offset_idx++];
+	err = reader_init_block_reader(it->r, &it->block_reader, off,
+				       BLOCK_TYPE_REF);
+	if (err < 0) {
+		return err;
+	}
+	if (err > 0) {
+		/* indexed block does not exist. */
+		return REFTABLE_FORMAT_ERROR;
+	}
+	block_reader_start(&it->block_reader, &it->cur);
+	return 0;
+}
+
+static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec)
+{
+	struct indexed_table_ref_iter *it = p;
+	struct reftable_ref_record *ref = rec->data;
+
+	while (1) {
+		int err = block_iter_next(&it->cur, rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			err = indexed_table_ref_iter_next_block(it);
+			if (err < 0) {
+				return err;
+			}
+
+			if (it->is_finished) {
+				return 1;
+			}
+			continue;
+		}
+		/* BUG */
+		if (!memcmp(it->oid.buf, ref->value.val2.target_value,
+			    it->oid.len) ||
+		    !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) {
+			return 0;
+		}
+	}
+}
+
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len)
+{
+	struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT;
+	struct indexed_table_ref_iter *itr =
+		reftable_calloc(sizeof(struct indexed_table_ref_iter));
+	int err = 0;
+
+	*itr = empty;
+	itr->r = r;
+	strbuf_add(&itr->oid, oid, oid_len);
+
+	itr->offsets = offsets;
+	itr->offset_len = offset_len;
+
+	err = indexed_table_ref_iter_next_block(itr);
+	if (err < 0) {
+		reftable_free(itr);
+	} else {
+		*dest = itr;
+	}
+	return err;
+}
+
+static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = {
+	.next = &indexed_table_ref_iter_next,
+	.close = &indexed_table_ref_iter_close,
+};
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr)
+{
+	assert(!it->ops);
+	it->iter_arg = itr;
+	it->ops = &indexed_table_ref_iter_vtable;
+}
diff --git a/reftable/iter.h b/reftable/iter.h
new file mode 100644
index 000000000000..09eb0cbfa599
--- /dev/null
+++ b/reftable/iter.h
@@ -0,0 +1,69 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef ITER_H
+#define ITER_H
+
+#include "system.h"
+#include "block.h"
+#include "record.h"
+
+#include "reftable-iterator.h"
+#include "reftable-generic.h"
+
+/* Returns true for a zeroed out iterator, such as the one returned from
+ * iterator_destroy. */
+int iterator_is_null(struct reftable_iterator *it);
+
+/* iterator that produces only ref records that point to `oid` */
+struct filtering_ref_iterator {
+	int double_check;
+	struct reftable_table tab;
+	struct strbuf oid;
+	struct reftable_iterator it;
+};
+#define FILTERING_REF_ITERATOR_INIT \
+	{                           \
+		.oid = STRBUF_INIT  \
+	}
+
+void iterator_from_filtering_ref_iterator(struct reftable_iterator *,
+					  struct filtering_ref_iterator *);
+
+/* iterator that produces only ref records that point to `oid`,
+ * but using the object index.
+ */
+struct indexed_table_ref_iter {
+	struct reftable_reader *r;
+	struct strbuf oid;
+
+	/* mutable */
+	uint64_t *offsets;
+
+	/* Points to the next offset to read. */
+	int offset_idx;
+	int offset_len;
+	struct block_reader block_reader;
+	struct block_iter cur;
+	int is_finished;
+};
+
+#define INDEXED_TABLE_REF_ITER_INIT                                     \
+	{                                                               \
+		.cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \
+	}
+
+void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it,
+					  struct indexed_table_ref_iter *itr);
+
+/* Takes ownership of `offsets` */
+int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest,
+			       struct reftable_reader *r, uint8_t *oid,
+			       int oid_len, uint64_t *offsets, int offset_len);
+
+#endif
diff --git a/reftable/reader.c b/reftable/reader.c
new file mode 100644
index 000000000000..5c9e1d6b1da7
--- /dev/null
+++ b/reftable/reader.c
@@ -0,0 +1,794 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reader.h"
+
+#include "system.h"
+#include "block.h"
+#include "constants.h"
+#include "generic.h"
+#include "iter.h"
+#include "record.h"
+#include "reftable-error.h"
+#include "reftable-generic.h"
+#include "tree.h"
+
+uint64_t block_source_size(struct reftable_block_source *source)
+{
+	return source->ops->size(source->arg);
+}
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size)
+{
+	int result = source->ops->read_block(source->arg, dest, off, size);
+	dest->source = *source;
+	return result;
+}
+
+void block_source_close(struct reftable_block_source *source)
+{
+	if (!source->ops) {
+		return;
+	}
+
+	source->ops->close(source->arg);
+	source->ops = NULL;
+}
+
+static struct reftable_reader_offsets *
+reader_offsets_for(struct reftable_reader *r, uint8_t typ)
+{
+	switch (typ) {
+	case BLOCK_TYPE_REF:
+		return &r->ref_offsets;
+	case BLOCK_TYPE_LOG:
+		return &r->log_offsets;
+	case BLOCK_TYPE_OBJ:
+		return &r->obj_offsets;
+	}
+	abort();
+}
+
+static int reader_get_block(struct reftable_reader *r,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t sz)
+{
+	if (off >= r->size)
+		return 0;
+
+	if (off + sz > r->size) {
+		sz = r->size - off;
+	}
+
+	return block_source_read_block(&r->source, dest, off, sz);
+}
+
+uint32_t reftable_reader_hash_id(struct reftable_reader *r)
+{
+	return r->hash_id;
+}
+
+const char *reader_name(struct reftable_reader *r)
+{
+	return r->name;
+}
+
+static int parse_footer(struct reftable_reader *r, uint8_t *footer,
+			uint8_t *header)
+{
+	uint8_t *f = footer;
+	uint8_t first_block_typ;
+	int err = 0;
+	uint32_t computed_crc;
+	uint32_t file_crc;
+
+	if (memcmp(f, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	f += 4;
+
+	if (memcmp(footer, header, header_size(r->version))) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	f++;
+	r->block_size = get_be24(f);
+
+	f += 3;
+	r->min_update_index = get_be64(f);
+	f += 8;
+	r->max_update_index = get_be64(f);
+	f += 8;
+
+	if (r->version == 1) {
+		r->hash_id = GIT_SHA1_HASH_ID;
+	} else {
+		r->hash_id = get_be32(f);
+		switch (r->hash_id) {
+		case GIT_SHA1_HASH_ID:
+			break;
+		case GIT_SHA256_HASH_ID:
+			break;
+		default:
+			err = REFTABLE_FORMAT_ERROR;
+			goto done;
+		}
+		f += 4;
+	}
+
+	r->ref_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	r->obj_offsets.offset = get_be64(f);
+	f += 8;
+
+	r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1);
+	r->obj_offsets.offset >>= 5;
+
+	r->obj_offsets.index_offset = get_be64(f);
+	f += 8;
+	r->log_offsets.offset = get_be64(f);
+	f += 8;
+	r->log_offsets.index_offset = get_be64(f);
+	f += 8;
+
+	computed_crc = crc32(0, footer, f - footer);
+	file_crc = get_be32(f);
+	f += 4;
+	if (computed_crc != file_crc) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	first_block_typ = header[header_size(r->version)];
+	r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF);
+	r->ref_offsets.offset = 0;
+	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
+				     r->log_offsets.offset > 0);
+	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	err = 0;
+done:
+	return err;
+}
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name)
+{
+	struct reftable_block footer = { NULL };
+	struct reftable_block header = { NULL };
+	int err = 0;
+
+	memset(r, 0, sizeof(struct reftable_reader));
+
+	/* Need +1 to read type of first block. */
+	err = block_source_read_block(source, &header, 0, header_size(2) + 1);
+	if (err != header_size(2) + 1) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	if (memcmp(header.data, "REFT", 4)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+	r->version = header.data[4];
+	if (r->version != 1 && r->version != 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
+	r->size = block_source_size(source) - footer_size(r->version);
+	r->source = *source;
+	r->name = xstrdup(name);
+	r->hash_id = 0;
+
+	err = block_source_read_block(source, &footer, r->size,
+				      footer_size(r->version));
+	if (err != footer_size(r->version)) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = parse_footer(r, footer.data, header.data);
+done:
+	reftable_block_done(&footer);
+	reftable_block_done(&header);
+	return err;
+}
+
+struct table_iter {
+	struct reftable_reader *r;
+	uint8_t typ;
+	uint64_t block_off;
+	struct block_iter bi;
+	int is_finished;
+};
+#define TABLE_ITER_INIT                          \
+	{                                        \
+		.bi = {.last_key = STRBUF_INIT } \
+	}
+
+static void table_iter_copy_from(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = src->block_off;
+	dest->is_finished = src->is_finished;
+	block_iter_copy_from(&dest->bi, &src->bi);
+}
+
+static int table_iter_next_in_block(struct table_iter *ti,
+				    struct reftable_record *rec)
+{
+	int res = block_iter_next(&ti->bi, rec);
+	if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) {
+		((struct reftable_ref_record *)rec->data)->update_index +=
+			ti->r->min_update_index;
+	}
+
+	return res;
+}
+
+static void table_iter_block_done(struct table_iter *ti)
+{
+	if (!ti->bi.br) {
+		return;
+	}
+	reftable_block_done(&ti->bi.br->block);
+	FREE_AND_NULL(ti->bi.br);
+
+	ti->bi.last_key.len = 0;
+	ti->bi.next_off = 0;
+}
+
+static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
+				  int version)
+{
+	int32_t result = 0;
+
+	if (off == 0) {
+		data += header_size(version);
+	}
+
+	*typ = data[0];
+	if (reftable_is_block_type(*typ)) {
+		result = get_be24(data + 1);
+	}
+	return result;
+}
+
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ)
+{
+	int32_t guess_block_size = r->block_size ? r->block_size :
+							 DEFAULT_BLOCK_SIZE;
+	struct reftable_block block = { NULL };
+	uint8_t block_typ = 0;
+	int err = 0;
+	uint32_t header_off = next_off ? 0 : header_size(r->version);
+	int32_t block_size = 0;
+
+	if (next_off >= r->size)
+		return 1;
+
+	err = reader_get_block(r, &block, next_off, guess_block_size);
+	if (err < 0)
+		return err;
+
+	block_size = extract_block_size(block.data, &block_typ, next_off,
+					r->version);
+	if (block_size < 0)
+		return block_size;
+
+	if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) {
+		reftable_block_done(&block);
+		return 1;
+	}
+
+	if (block_size > guess_block_size) {
+		reftable_block_done(&block);
+		err = reader_get_block(r, &block, next_off, block_size);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	return block_reader_init(br, &block, header_off, r->block_size,
+				 hash_size(r->hash_id));
+}
+
+static int table_iter_next_block(struct table_iter *dest,
+				 struct table_iter *src)
+{
+	uint64_t next_block_off = src->block_off + src->bi.br->full_block_size;
+	struct block_reader br = { 0 };
+	int err = 0;
+
+	dest->r = src->r;
+	dest->typ = src->typ;
+	dest->block_off = next_block_off;
+
+	err = reader_init_block_reader(src->r, &br, next_block_off, src->typ);
+	if (err > 0) {
+		dest->is_finished = 1;
+		return 1;
+	}
+	if (err != 0)
+		return err;
+	else {
+		struct block_reader *brp =
+			reftable_malloc(sizeof(struct block_reader));
+		*brp = br;
+
+		dest->is_finished = 0;
+		block_reader_start(brp, &dest->bi);
+	}
+	return 0;
+}
+
+static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
+{
+	if (reftable_record_type(rec) != ti->typ)
+		return REFTABLE_API_ERROR;
+
+	while (1) {
+		struct table_iter next = TABLE_ITER_INIT;
+		int err = 0;
+		if (ti->is_finished) {
+			return 1;
+		}
+
+		err = table_iter_next_in_block(ti, rec);
+		if (err <= 0) {
+			return err;
+		}
+
+		err = table_iter_next_block(&next, ti);
+		if (err != 0) {
+			ti->is_finished = 1;
+		}
+		table_iter_block_done(ti);
+		if (err != 0) {
+			return err;
+		}
+		table_iter_copy_from(ti, &next);
+		block_iter_close(&next.bi);
+	}
+}
+
+static int table_iter_next_void(void *ti, struct reftable_record *rec)
+{
+	return table_iter_next(ti, rec);
+}
+
+static void table_iter_close(void *p)
+{
+	struct table_iter *ti = p;
+	table_iter_block_done(ti);
+	block_iter_close(&ti->bi);
+}
+
+static struct reftable_iterator_vtable table_iter_vtable = {
+	.next = &table_iter_next_void,
+	.close = &table_iter_close,
+};
+
+static void iterator_from_table_iter(struct reftable_iterator *it,
+				     struct table_iter *ti)
+{
+	assert(!it->ops);
+	it->iter_arg = ti;
+	it->ops = &table_iter_vtable;
+}
+
+static int reader_table_iter_at(struct reftable_reader *r,
+				struct table_iter *ti, uint64_t off,
+				uint8_t typ)
+{
+	struct block_reader br = { 0 };
+	struct block_reader *brp = NULL;
+
+	int err = reader_init_block_reader(r, &br, off, typ);
+	if (err != 0)
+		return err;
+
+	brp = reftable_malloc(sizeof(struct block_reader));
+	*brp = br;
+	ti->r = r;
+	ti->typ = block_reader_type(brp);
+	ti->block_off = off;
+	block_reader_start(brp, &ti->bi);
+	return 0;
+}
+
+static int reader_start(struct reftable_reader *r, struct table_iter *ti,
+			uint8_t typ, int index)
+{
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	uint64_t off = offs->offset;
+	if (index) {
+		off = offs->index_offset;
+		if (off == 0) {
+			return 1;
+		}
+		typ = BLOCK_TYPE_INDEX;
+	}
+
+	return reader_table_iter_at(r, ti, off, typ);
+}
+
+static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti,
+			      struct reftable_record *want)
+{
+	struct reftable_record rec =
+		reftable_new_record(reftable_record_type(want));
+	struct strbuf want_key = STRBUF_INIT;
+	struct strbuf got_key = STRBUF_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = -1;
+
+	reftable_record_key(want, &want_key);
+
+	while (1) {
+		err = table_iter_next_block(&next, ti);
+		if (err < 0)
+			goto done;
+
+		if (err > 0) {
+			break;
+		}
+
+		err = block_reader_first_key(next.bi.br, &got_key);
+		if (err < 0)
+			goto done;
+
+		if (strbuf_cmp(&got_key, &want_key) > 0) {
+			table_iter_block_done(&next);
+			break;
+		}
+
+		table_iter_block_done(ti);
+		table_iter_copy_from(ti, &next);
+	}
+
+	err = block_iter_seek(&ti->bi, &want_key);
+	if (err < 0)
+		goto done;
+	err = 0;
+
+done:
+	block_iter_close(&next.bi);
+	reftable_record_destroy(&rec);
+	strbuf_release(&want_key);
+	strbuf_release(&got_key);
+	return err;
+}
+
+static int reader_seek_indexed(struct reftable_reader *r,
+			       struct reftable_iterator *it,
+			       struct reftable_record *rec)
+{
+	struct reftable_index_record want_index = { .last_key = STRBUF_INIT };
+	struct reftable_record want_index_rec = { NULL };
+	struct reftable_index_record index_result = { .last_key = STRBUF_INIT };
+	struct reftable_record index_result_rec = { NULL };
+	struct table_iter index_iter = TABLE_ITER_INIT;
+	struct table_iter next = TABLE_ITER_INIT;
+	int err = 0;
+
+	reftable_record_key(rec, &want_index.last_key);
+	reftable_record_from_index(&want_index_rec, &want_index);
+	reftable_record_from_index(&index_result_rec, &index_result);
+
+	err = reader_start(r, &index_iter, reftable_record_type(rec), 1);
+	if (err < 0)
+		goto done;
+
+	err = reader_seek_linear(r, &index_iter, &want_index_rec);
+	while (1) {
+		err = table_iter_next(&index_iter, &index_result_rec);
+		table_iter_block_done(&index_iter);
+		if (err != 0)
+			goto done;
+
+		err = reader_table_iter_at(r, &next, index_result.offset, 0);
+		if (err != 0)
+			goto done;
+
+		err = block_iter_seek(&next.bi, &want_index.last_key);
+		if (err < 0)
+			goto done;
+
+		if (next.typ == reftable_record_type(rec)) {
+			err = 0;
+			break;
+		}
+
+		if (next.typ != BLOCK_TYPE_INDEX) {
+			err = REFTABLE_FORMAT_ERROR;
+			break;
+		}
+
+		table_iter_copy_from(&index_iter, &next);
+	}
+
+	if (err == 0) {
+		struct table_iter empty = TABLE_ITER_INIT;
+		struct table_iter *malloced =
+			reftable_calloc(sizeof(struct table_iter));
+		*malloced = empty;
+		table_iter_copy_from(malloced, &next);
+		iterator_from_table_iter(it, malloced);
+	}
+done:
+	block_iter_close(&next.bi);
+	table_iter_close(&index_iter);
+	reftable_record_release(&want_index_rec);
+	reftable_record_release(&index_result_rec);
+	return err;
+}
+
+static int reader_seek_internal(struct reftable_reader *r,
+				struct reftable_iterator *it,
+				struct reftable_record *rec)
+{
+	struct reftable_reader_offsets *offs =
+		reader_offsets_for(r, reftable_record_type(rec));
+	uint64_t idx = offs->index_offset;
+	struct table_iter ti = TABLE_ITER_INIT;
+	int err = 0;
+	if (idx > 0)
+		return reader_seek_indexed(r, it, rec);
+
+	err = reader_start(r, &ti, reftable_record_type(rec), 0);
+	if (err < 0)
+		return err;
+	err = reader_seek_linear(r, &ti, rec);
+	if (err < 0)
+		return err;
+	else {
+		struct table_iter *p =
+			reftable_malloc(sizeof(struct table_iter));
+		*p = ti;
+		iterator_from_table_iter(it, p);
+	}
+
+	return 0;
+}
+
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec)
+{
+	uint8_t typ = reftable_record_type(rec);
+
+	struct reftable_reader_offsets *offs = reader_offsets_for(r, typ);
+	if (!offs->is_present) {
+		iterator_set_empty(it);
+		return 0;
+	}
+
+	return reader_seek_internal(r, it, rec);
+}
+
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return reader_seek(r, it, &rec);
+}
+
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_reader_seek_log_at(r, it, name, max);
+}
+
+void reader_close(struct reftable_reader *r)
+{
+	block_source_close(&r->source);
+	FREE_AND_NULL(r->name);
+}
+
+int reftable_new_reader(struct reftable_reader **p,
+			struct reftable_block_source *src, char const *name)
+{
+	struct reftable_reader *rd =
+		reftable_calloc(sizeof(struct reftable_reader));
+	int err = init_reader(rd, src, name);
+	if (err == 0) {
+		*p = rd;
+	} else {
+		block_source_close(src);
+		reftable_free(rd);
+	}
+	return err;
+}
+
+void reftable_reader_free(struct reftable_reader *r)
+{
+	reader_close(r);
+	reftable_free(r);
+}
+
+static int reftable_reader_refs_for_indexed(struct reftable_reader *r,
+					    struct reftable_iterator *it,
+					    uint8_t *oid)
+{
+	struct reftable_obj_record want = {
+		.hash_prefix = oid,
+		.hash_prefix_len = r->object_id_len,
+	};
+	struct reftable_record want_rec = { NULL };
+	struct reftable_iterator oit = { NULL };
+	struct reftable_obj_record got = { NULL };
+	struct reftable_record got_rec = { NULL };
+	int err = 0;
+	struct indexed_table_ref_iter *itr = NULL;
+
+	/* Look through the reverse index. */
+	reftable_record_from_obj(&want_rec, &want);
+	err = reader_seek(r, &oit, &want_rec);
+	if (err != 0)
+		goto done;
+
+	/* read out the reftable_obj_record */
+	reftable_record_from_obj(&got_rec, &got);
+	err = iterator_next(&oit, &got_rec);
+	if (err < 0)
+		goto done;
+
+	if (err > 0 ||
+	    memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) {
+		/* didn't find it; return empty iterator */
+		iterator_set_empty(it);
+		err = 0;
+		goto done;
+	}
+
+	err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id),
+					 got.offsets, got.offset_len);
+	if (err < 0)
+		goto done;
+	got.offsets = NULL;
+	iterator_from_indexed_table_ref_iter(it, itr);
+
+done:
+	reftable_iterator_destroy(&oit);
+	reftable_record_release(&got_rec);
+	return err;
+}
+
+static int reftable_reader_refs_for_unindexed(struct reftable_reader *r,
+					      struct reftable_iterator *it,
+					      uint8_t *oid)
+{
+	struct table_iter ti_empty = TABLE_ITER_INIT;
+	struct table_iter *ti = reftable_calloc(sizeof(struct table_iter));
+	struct filtering_ref_iterator *filter = NULL;
+	struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT;
+	int oid_len = hash_size(r->hash_id);
+	int err;
+
+	*ti = ti_empty;
+	err = reader_start(r, ti, BLOCK_TYPE_REF, 0);
+	if (err < 0) {
+		reftable_free(ti);
+		return err;
+	}
+
+	filter = reftable_malloc(sizeof(struct filtering_ref_iterator));
+	*filter = empty;
+
+	strbuf_add(&filter->oid, oid, oid_len);
+	reftable_table_from_reader(&filter->tab, r);
+	filter->double_check = 0;
+	iterator_from_table_iter(&filter->it, ti);
+
+	iterator_from_filtering_ref_iterator(it, filter);
+	return 0;
+}
+
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid)
+{
+	if (r->obj_offsets.is_present)
+		return reftable_reader_refs_for_indexed(r, it, oid);
+	return reftable_reader_refs_for_unindexed(r, it, oid);
+}
+
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r)
+{
+	return r->max_update_index;
+}
+
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r)
+{
+	return r->min_update_index;
+}
+
+/* generic table interface. */
+
+static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it,
+				     struct reftable_record *rec)
+{
+	return reader_seek(tab, it, rec);
+}
+
+static uint32_t reftable_reader_hash_id_void(void *tab)
+{
+	return reftable_reader_hash_id(tab);
+}
+
+static uint64_t reftable_reader_min_update_index_void(void *tab)
+{
+	return reftable_reader_min_update_index(tab);
+}
+
+static uint64_t reftable_reader_max_update_index_void(void *tab)
+{
+	return reftable_reader_max_update_index(tab);
+}
+
+static struct reftable_table_vtable reader_vtable = {
+	.seek_record = reftable_reader_seek_void,
+	.hash_id = reftable_reader_hash_id_void,
+	.min_update_index = reftable_reader_min_update_index_void,
+	.max_update_index = reftable_reader_max_update_index_void,
+};
+
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader)
+{
+	assert(!tab->ops);
+	tab->ops = &reader_vtable;
+	tab->table_arg = reader;
+}
+
+
+int reftable_reader_print_file(const char *tablename)
+{
+	struct reftable_block_source src = { NULL };
+	int err = reftable_block_source_from_file(&src, tablename);
+	struct reftable_reader *r = NULL;
+	struct reftable_table tab = { NULL };
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&r, &src, tablename);
+	if (err < 0)
+		goto done;
+
+	reftable_table_from_reader(&tab, r);
+	err = reftable_table_print(&tab);
+done:
+	reftable_reader_free(r);
+	return err;
+}
diff --git a/reftable/reader.h b/reftable/reader.h
new file mode 100644
index 000000000000..39583e5dbcd4
--- /dev/null
+++ b/reftable/reader.h
@@ -0,0 +1,66 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef READER_H
+#define READER_H
+
+#include "block.h"
+#include "record.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+
+uint64_t block_source_size(struct reftable_block_source *source);
+
+int block_source_read_block(struct reftable_block_source *source,
+			    struct reftable_block *dest, uint64_t off,
+			    uint32_t size);
+void block_source_close(struct reftable_block_source *source);
+
+/* metadata for a block type */
+struct reftable_reader_offsets {
+	int is_present;
+	uint64_t offset;
+	uint64_t index_offset;
+};
+
+/* The state for reading a reftable file. */
+struct reftable_reader {
+	/* for convience, associate a name with the instance. */
+	char *name;
+	struct reftable_block_source source;
+
+	/* Size of the file, excluding the footer. */
+	uint64_t size;
+
+	/* 'sha1' for SHA1, 's256' for SHA-256 */
+	uint32_t hash_id;
+
+	uint32_t block_size;
+	uint64_t min_update_index;
+	uint64_t max_update_index;
+	/* Length of the OID keys in the 'o' section */
+	int object_id_len;
+	int version;
+
+	struct reftable_reader_offsets ref_offsets;
+	struct reftable_reader_offsets obj_offsets;
+	struct reftable_reader_offsets log_offsets;
+};
+
+int init_reader(struct reftable_reader *r, struct reftable_block_source *source,
+		const char *name);
+int reader_seek(struct reftable_reader *r, struct reftable_iterator *it,
+		struct reftable_record *rec);
+void reader_close(struct reftable_reader *r);
+const char *reader_name(struct reftable_reader *r);
+
+/* initialize a block reader to read from `r` */
+int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
+			     uint64_t next_off, uint8_t want_typ);
+
+#endif
diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h
new file mode 100644
index 000000000000..4a4bc2fdf85a
--- /dev/null
+++ b/reftable/reftable-reader.h
@@ -0,0 +1,101 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_READER_H
+#define REFTABLE_READER_H
+
+#include "reftable-iterator.h"
+#include "reftable-blocksource.h"
+
+/*
+ * Reading single tables
+ *
+ * The follow routines are for reading single files. For an
+ * application-level interface, skip ahead to struct
+ * reftable_merged_table and struct reftable_stack.
+ */
+
+/* The reader struct is a handle to an open reftable file. */
+struct reftable_reader;
+
+/* Generic table. */
+struct reftable_table;
+
+/* reftable_new_reader opens a reftable for reading. If successful,
+ * returns 0 code and sets pp. The name is used for creating a
+ * stack. Typically, it is the basename of the file. The block source
+ * `src` is owned by the reader, and is closed on calling
+ * reftable_reader_destroy(). On error, the block source `src` is
+ * closed as well.
+ */
+int reftable_new_reader(struct reftable_reader **pp,
+			struct reftable_block_source *src, const char *name);
+
+/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted
+   in the table.  To seek to the start of the table, use name = "".
+
+   example:
+
+   struct reftable_reader *r = NULL;
+   int err = reftable_new_reader(&r, &src, "filename");
+   if (err < 0) { ... }
+   struct reftable_iterator it  = {0};
+   err = reftable_reader_seek_ref(r, &it, "refs/heads/master");
+   if (err < 0) { ... }
+   struct reftable_ref_record ref  = {0};
+   while (1) {
+   err = reftable_iterator_next_ref(&it, &ref);
+   if (err > 0) {
+   break;
+   }
+   if (err < 0) {
+   ..error handling..
+   }
+   ..found..
+   }
+   reftable_iterator_destroy(&it);
+   reftable_ref_record_release(&ref);
+*/
+int reftable_reader_seek_ref(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* returns the hash ID used in this table. */
+uint32_t reftable_reader_hash_id(struct reftable_reader *r);
+
+/* seek to logs for the given name, older than update_index. To seek to the
+   start of the table, use name = "".
+*/
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+/* creates a generic table from a file reader. */
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* print table onto stdout for debugging. */
+int reftable_reader_print_file(const char *tablename);
+
+#endif
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 17/28] reftable: reftable file level tests
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (15 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 16/28] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 18/28] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
                               ` (11 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.

Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |   1 +
 reftable/readwrite_test.c | 621 ++++++++++++++++++++++++++++++++++++++
 reftable/reftable-tests.h |   2 +-
 t/helper/test-reftable.c  |   1 +
 4 files changed, 624 insertions(+), 1 deletion(-)
 create mode 100644 reftable/readwrite_test.c

diff --git a/Makefile b/Makefile
index 6593e632ac08..26a805afb5ad 100644
--- a/Makefile
+++ b/Makefile
@@ -2425,6 +2425,7 @@ REFTABLE_OBJS += reftable/writer.o
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/readwrite_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
new file mode 100644
index 000000000000..6520447a346d
--- /dev/null
+++ b/reftable/readwrite_test.c
@@ -0,0 +1,621 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-writer.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.update_index = update_index;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.update_index = update_index;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.new_hash = hash;
+		log.update.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == GIT_SHA256_HASH_ID) ? 2 :
+										  1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = { .refname = "refs/heads/master",
+					   .update_index = 0xa,
+					   .value_type = REFTABLE_LOG_UPDATE,
+					   .update = {
+						   .name = "Han-Wen Nienhuys",
+						   .email = "hanwen@google.com",
+						   .tz_offset = 100,
+						   .time = 0x5e430672,
+						   .message = "commit: 9\n",
+					   } };
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
+	for (int i = 0; i < GIT_SHA1_RAWSZ; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.update.old_hash = hash1;
+	log.update.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.old_hash = hash1;
+		log.update.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, GIT_SHA1_HASH_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, GIT_SHA1_HASH_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, GIT_SHA1_HASH_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
+		EXPECT(i == ref.value.val1[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, GIT_SHA1_HASH_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, GIT_SHA256_HASH_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, GIT_SHA1_HASH_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[GIT_SHA1_RAWSZ];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA1_RAWSZ];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[GIT_SHA1_RAWSZ];
+		uint8_t hash2[GIT_SHA1_RAWSZ];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value_type = REFTABLE_REF_VAL2;
+		ref.value.val2.value = hash1;
+		ref.value.val2.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) ||
+		    !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_table_empty(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+static void test_write_key_order(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record refs[2] = {
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value = {
+				.symref = "target",
+			},
+		}, {
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value = {
+				.symref = "target",
+			},
+		}
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+	err = reftable_writer_add_ref(w, &refs[0]);
+	EXPECT_ERR(err);
+	err = reftable_writer_add_ref(w, &refs[1]);
+	printf("%d\n", err);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_writer_close(w);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+int readwrite_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_log_write_read);
+	RUN_TEST(test_write_key_order);
+	RUN_TEST(test_table_read_write_seek_linear_sha256);
+	RUN_TEST(test_log_buffer_size);
+	RUN_TEST(test_table_write_small_table);
+	RUN_TEST(test_buffer);
+	RUN_TEST(test_table_read_api);
+	RUN_TEST(test_table_read_write_sequential);
+	RUN_TEST(test_table_read_write_seek_linear);
+	RUN_TEST(test_table_read_write_seek_index);
+	RUN_TEST(test_table_refs_for_no_index);
+	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_table_empty);
+	return 0;
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
index 5e7698ae654e..3d541fa5c0ce 100644
--- a/reftable/reftable-tests.h
+++ b/reftable/reftable-tests.h
@@ -14,7 +14,7 @@ int block_test_main(int argc, const char **argv);
 int merged_test_main(int argc, const char **argv);
 int record_test_main(int argc, const char **argv);
 int refname_test_main(int argc, const char **argv);
-int reftable_test_main(int argc, const char **argv);
+int readwrite_test_main(int argc, const char **argv);
 int stack_test_main(int argc, const char **argv);
 int tree_test_main(int argc, const char **argv);
 int reftable_dump_main(int argc, char *const *argv);
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa6985..898aba836fd1 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	readwrite_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 18/28] reftable: add a heap-based priority queue for reftable records
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (16 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 17/28] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 19/28] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
                               ` (10 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This is needed to create a merged view multiple reftables

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |   2 +
 reftable/pq.c             | 115 ++++++++++++++++++++++++++++++++++++++
 reftable/pq.h             |  32 +++++++++++
 reftable/pq_test.c        |  72 ++++++++++++++++++++++++
 reftable/reftable-tests.h |   1 +
 t/helper/test-reftable.c  |   1 +
 6 files changed, 223 insertions(+)
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/pq_test.c

diff --git a/Makefile b/Makefile
index 26a805afb5ad..07814f92fb36 100644
--- a/Makefile
+++ b/Makefile
@@ -2414,6 +2414,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/refname.o
@@ -2424,6 +2425,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/readwrite_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 000000000000..8918d158e2d4
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 000000000000..385d2fb139a6
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/pq_test.c b/reftable/pq_test.c
new file mode 100644
index 000000000000..ad21673e8546
--- /dev/null
+++ b/reftable/pq_test.c
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "constants.h"
+#include "pq.h"
+#include "record.h"
+#include "reftable-tests.h"
+#include "test_framework.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+int pq_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_pq);
+	return 0;
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
index 3d541fa5c0ce..0019cbcfa498 100644
--- a/reftable/reftable-tests.h
+++ b/reftable/reftable-tests.h
@@ -12,6 +12,7 @@ license that can be found in the LICENSE file or at
 int basics_test_main(int argc, const char **argv);
 int block_test_main(int argc, const char **argv);
 int merged_test_main(int argc, const char **argv);
+int pq_test_main(int argc, const char **argv);
 int record_test_main(int argc, const char **argv);
 int refname_test_main(int argc, const char **argv);
 int readwrite_test_main(int argc, const char **argv);
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 898aba836fd1..0b5a1701df15 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
 	readwrite_test_main(argc, argv);
 	tree_test_main(argc, argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 19/28] reftable: add merged table view
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (17 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 18/28] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 20/28] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
                               ` (9 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This adds an abstract, read-only interface to the ref database.

This primitive is used to construct the read view of the ref database
(the read view is constructed by merging several *.ref files). It also
provides the mechanism to provide a unified view of the refs in the main
repository and the per-worktree refs.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                   |   2 +
 reftable/merged.c          | 362 +++++++++++++++++++++++++++++++++++++
 reftable/merged.h          |  35 ++++
 reftable/merged_test.c     | 292 ++++++++++++++++++++++++++++++
 reftable/reftable-merged.h |  72 ++++++++
 t/helper/test-reftable.c   |   1 +
 6 files changed, 764 insertions(+)
 create mode 100644 reftable/merged.c
 create mode 100644 reftable/merged.h
 create mode 100644 reftable/merged_test.c
 create mode 100644 reftable/reftable-merged.h

diff --git a/Makefile b/Makefile
index 07814f92fb36..6139f02faabe 100644
--- a/Makefile
+++ b/Makefile
@@ -2414,6 +2414,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/merged.o
 REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
@@ -2425,6 +2426,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/readwrite_test.o
diff --git a/reftable/merged.c b/reftable/merged.c
new file mode 100644
index 000000000000..e5b53da6db3f
--- /dev/null
+++ b/reftable/merged.c
@@ -0,0 +1,362 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "constants.h"
+#include "iter.h"
+#include "pq.h"
+#include "reader.h"
+#include "record.h"
+#include "generic.h"
+#include "reftable-merged.h"
+#include "reftable-error.h"
+#include "system.h"
+
+static int merged_iter_init(struct merged_iter *mi)
+{
+	int i = 0;
+	for (i = 0; i < mi->stack_len; i++) {
+		struct reftable_record rec = reftable_new_record(mi->typ);
+		int err = iterator_next(&mi->stack[i], &rec);
+		if (err < 0) {
+			return err;
+		}
+
+		if (err > 0) {
+			reftable_iterator_destroy(&mi->stack[i]);
+			reftable_record_destroy(&rec);
+		} else {
+			struct pq_entry e = {
+				.rec = rec,
+				.index = i,
+			};
+			merged_iter_pqueue_add(&mi->pq, e);
+		}
+	}
+
+	return 0;
+}
+
+static void merged_iter_close(void *p)
+{
+	struct merged_iter *mi = p;
+	int i = 0;
+	merged_iter_pqueue_release(&mi->pq);
+	for (i = 0; i < mi->stack_len; i++) {
+		reftable_iterator_destroy(&mi->stack[i]);
+	}
+	reftable_free(mi->stack);
+}
+
+static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi,
+					       size_t idx)
+{
+	struct reftable_record rec = reftable_new_record(mi->typ);
+	struct pq_entry e = {
+		.rec = rec,
+		.index = idx,
+	};
+	int err = iterator_next(&mi->stack[idx], &rec);
+	if (err < 0)
+		return err;
+
+	if (err > 0) {
+		reftable_iterator_destroy(&mi->stack[idx]);
+		reftable_record_destroy(&rec);
+		return 0;
+	}
+
+	merged_iter_pqueue_add(&mi->pq, e);
+	return 0;
+}
+
+static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx)
+{
+	if (iterator_is_null(&mi->stack[idx]))
+		return 0;
+	return merged_iter_advance_nonnull_subiter(mi, idx);
+}
+
+static int merged_iter_next_entry(struct merged_iter *mi,
+				  struct reftable_record *rec)
+{
+	struct strbuf entry_key = STRBUF_INIT;
+	struct pq_entry entry = { 0 };
+	int err = 0;
+
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	entry = merged_iter_pqueue_remove(&mi->pq);
+	err = merged_iter_advance_subiter(mi, entry.index);
+	if (err < 0)
+		return err;
+
+	/*
+	  One can also use reftable as datacenter-local storage, where the ref
+	  database is maintained in globally consistent database (eg.
+	  CockroachDB or Spanner). In this scenario, replication delays together
+	  with compaction may cause newer tables to contain older entries. In
+	  such a deployment, the loop below must be changed to collect all
+	  entries for the same key, and return new the newest one.
+	*/
+	reftable_record_key(&entry.rec, &entry_key);
+	while (!merged_iter_pqueue_is_empty(mi->pq)) {
+		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
+		struct strbuf k = STRBUF_INIT;
+		int err = 0, cmp = 0;
+
+		reftable_record_key(&top.rec, &k);
+
+		cmp = strbuf_cmp(&k, &entry_key);
+		strbuf_release(&k);
+
+		if (cmp > 0) {
+			break;
+		}
+
+		merged_iter_pqueue_remove(&mi->pq);
+		err = merged_iter_advance_subiter(mi, top.index);
+		if (err < 0) {
+			return err;
+		}
+		reftable_record_destroy(&top.rec);
+	}
+
+	reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id));
+	reftable_record_destroy(&entry.rec);
+	strbuf_release(&entry_key);
+	return 0;
+}
+
+static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
+{
+	while (1) {
+		int err = merged_iter_next_entry(mi, rec);
+		if (err == 0 && mi->suppress_deletions &&
+		    reftable_record_is_deletion(rec)) {
+			continue;
+		}
+
+		return err;
+	}
+}
+
+static int merged_iter_next_void(void *p, struct reftable_record *rec)
+{
+	struct merged_iter *mi = p;
+	if (merged_iter_pqueue_is_empty(mi->pq))
+		return 1;
+
+	return merged_iter_next(mi, rec);
+}
+
+static struct reftable_iterator_vtable merged_iter_vtable = {
+	.next = &merged_iter_next_void,
+	.close = &merged_iter_close,
+};
+
+static void iterator_from_merged_iter(struct reftable_iterator *it,
+				      struct merged_iter *mi)
+{
+	assert(!it->ops);
+	it->iter_arg = mi;
+	it->ops = &merged_iter_vtable;
+}
+
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id)
+{
+	struct reftable_merged_table *m = NULL;
+	uint64_t last_max = 0;
+	uint64_t first_min = 0;
+	int i = 0;
+	for (i = 0; i < n; i++) {
+		uint64_t min = reftable_table_min_update_index(&stack[i]);
+		uint64_t max = reftable_table_max_update_index(&stack[i]);
+
+		if (reftable_table_hash_id(&stack[i]) != hash_id) {
+			return REFTABLE_FORMAT_ERROR;
+		}
+		if (i == 0 || min < first_min) {
+			first_min = min;
+		}
+		if (i == 0 || max > last_max) {
+			last_max = max;
+		}
+	}
+
+	m = reftable_calloc(sizeof(struct reftable_merged_table));
+	m->stack = stack;
+	m->stack_len = n;
+	m->min = first_min;
+	m->max = last_max;
+	m->hash_id = hash_id;
+	*dest = m;
+	return 0;
+}
+
+/* clears the list of subtable, without affecting the readers themselves. */
+void merged_table_release(struct reftable_merged_table *mt)
+{
+	FREE_AND_NULL(mt->stack);
+	mt->stack_len = 0;
+}
+
+void reftable_merged_table_free(struct reftable_merged_table *mt)
+{
+	if (!mt) {
+		return;
+	}
+	merged_table_release(mt);
+	reftable_free(mt);
+}
+
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt)
+{
+	return mt->max;
+}
+
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt)
+{
+	return mt->min;
+}
+
+static int reftable_table_seek_record(struct reftable_table *tab,
+				      struct reftable_iterator *it,
+				      struct reftable_record *rec)
+{
+	return tab->ops->seek_record(tab->table_arg, it, rec);
+}
+
+static int merged_table_seek_record(struct reftable_merged_table *mt,
+				    struct reftable_iterator *it,
+				    struct reftable_record *rec)
+{
+	struct reftable_iterator *iters = reftable_calloc(
+		sizeof(struct reftable_iterator) * mt->stack_len);
+	struct merged_iter merged = {
+		.stack = iters,
+		.typ = reftable_record_type(rec),
+		.hash_id = mt->hash_id,
+		.suppress_deletions = mt->suppress_deletions,
+	};
+	int n = 0;
+	int err = 0;
+	int i = 0;
+	for (i = 0; i < mt->stack_len && err == 0; i++) {
+		int e = reftable_table_seek_record(&mt->stack[i], &iters[n],
+						   rec);
+		if (e < 0) {
+			err = e;
+		}
+		if (e == 0) {
+			n++;
+		}
+	}
+	if (err < 0) {
+		int i = 0;
+		for (i = 0; i < n; i++) {
+			reftable_iterator_destroy(&iters[i]);
+		}
+		reftable_free(iters);
+		return err;
+	}
+
+	merged.stack_len = n;
+	err = merged_iter_init(&merged);
+	if (err < 0) {
+		merged_iter_close(&merged);
+		return err;
+	} else {
+		struct merged_iter *p =
+			reftable_malloc(sizeof(struct merged_iter));
+		*p = merged;
+		iterator_from_merged_iter(it, p);
+	}
+	return 0;
+}
+
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	struct reftable_ref_record ref = {
+		.refname = (char *)name,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_ref(&rec, &ref);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index)
+{
+	struct reftable_log_record log = {
+		.refname = (char *)name,
+		.update_index = update_index,
+	};
+	struct reftable_record rec = { NULL };
+	reftable_record_from_log(&rec, &log);
+	return merged_table_seek_record(mt, it, &rec);
+}
+
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name)
+{
+	uint64_t max = ~((uint64_t)0);
+	return reftable_merged_table_seek_log_at(mt, it, name, max);
+}
+
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt)
+{
+	return mt->hash_id;
+}
+
+static int reftable_merged_table_seek_void(void *tab,
+					   struct reftable_iterator *it,
+					   struct reftable_record *rec)
+{
+	return merged_table_seek_record(tab, it, rec);
+}
+
+static uint32_t reftable_merged_table_hash_id_void(void *tab)
+{
+	return reftable_merged_table_hash_id(tab);
+}
+
+static uint64_t reftable_merged_table_min_update_index_void(void *tab)
+{
+	return reftable_merged_table_min_update_index(tab);
+}
+
+static uint64_t reftable_merged_table_max_update_index_void(void *tab)
+{
+	return reftable_merged_table_max_update_index(tab);
+}
+
+static struct reftable_table_vtable merged_table_vtable = {
+	.seek_record = reftable_merged_table_seek_void,
+	.hash_id = reftable_merged_table_hash_id_void,
+	.min_update_index = reftable_merged_table_min_update_index_void,
+	.max_update_index = reftable_merged_table_max_update_index_void,
+};
+
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *merged)
+{
+	assert(!tab->ops);
+	tab->ops = &merged_table_vtable;
+	tab->table_arg = merged;
+}
diff --git a/reftable/merged.h b/reftable/merged.h
new file mode 100644
index 000000000000..8c4d4d58d77a
--- /dev/null
+++ b/reftable/merged.h
@@ -0,0 +1,35 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef MERGED_H
+#define MERGED_H
+
+#include "pq.h"
+
+struct reftable_merged_table {
+	struct reftable_table *stack;
+	size_t stack_len;
+	uint32_t hash_id;
+	int suppress_deletions;
+
+	uint64_t min;
+	uint64_t max;
+};
+
+struct merged_iter {
+	struct reftable_iterator *stack;
+	uint32_t hash_id;
+	size_t stack_len;
+	uint8_t typ;
+	int suppress_deletions;
+	struct merged_iter_pqueue pq;
+};
+
+void merged_table_release(struct reftable_merged_table *mt);
+
+#endif
diff --git a/reftable/merged_test.c b/reftable/merged_test.c
new file mode 100644
index 000000000000..a53251c24f8d
--- /dev/null
+++ b/reftable/merged_test.c
@@ -0,0 +1,292 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "merged.h"
+
+#include "system.h"
+
+#include "basics.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-merged.h"
+#include "reftable-tests.h"
+#include "reftable-generic.h"
+#include "reftable-writer.h"
+
+static void write_test_table(struct strbuf *buf,
+			     struct reftable_ref_record refs[], int n)
+{
+	int min = 0xffffffff;
+	int max = 0;
+	int i = 0;
+	int err;
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_writer *w = NULL;
+	for (i = 0; i < n; i++) {
+		uint64_t ui = refs[i].update_index;
+		if (ui > max) {
+			max = ui;
+		}
+		if (ui < min) {
+			min = ui;
+		}
+	}
+
+	w = reftable_new_writer(&strbuf_add_void, buf, &opts);
+	reftable_writer_set_limits(w, min, max);
+
+	for (i = 0; i < n; i++) {
+		uint64_t before = refs[i].update_index;
+		int n = reftable_writer_add_ref(w, &refs[i]);
+		assert(n == 0);
+		assert(before == refs[i].update_index);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+
+	reftable_writer_free(w);
+}
+
+static struct reftable_merged_table *
+merged_table_from_records(struct reftable_ref_record **refs,
+			  struct reftable_block_source **source,
+			  struct reftable_reader ***readers, int *sizes,
+			  struct strbuf *buf, int n)
+{
+	int i = 0;
+	struct reftable_merged_table *mt = NULL;
+	int err;
+	struct reftable_table *tabs =
+		reftable_calloc(n * sizeof(struct reftable_table));
+	*readers = reftable_calloc(n * sizeof(struct reftable_reader *));
+	*source = reftable_calloc(n * sizeof(**source));
+	for (i = 0; i < n; i++) {
+		write_test_table(&buf[i], refs[i], sizes[i]);
+		block_source_from_strbuf(&(*source)[i], &buf[i]);
+
+		err = reftable_new_reader(&(*readers)[i], &(*source)[i],
+					  "name");
+		EXPECT_ERR(err);
+		reftable_table_from_reader(&tabs[i], (*readers)[i]);
+	}
+
+	err = reftable_new_merged_table(&mt, tabs, n, GIT_SHA1_HASH_ID);
+	EXPECT_ERR(err);
+	return mt;
+}
+
+static void readers_destroy(struct reftable_reader **readers, size_t n)
+{
+	int i = 0;
+	for (; i < n; i++)
+		reftable_reader_free(readers[i]);
+	reftable_free(readers);
+}
+
+static void test_merged_between(void)
+{
+	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 };
+
+	struct reftable_ref_record r1[] = { {
+		.refname = "b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash1,
+	} };
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+
+	struct reftable_ref_record *refs[] = { r1, r2 };
+	int sizes[] = { 1, 1 };
+	struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2);
+	int i;
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+	EXPECT(ref.update_index == 2);
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	readers_destroy(readers, 2);
+	reftable_merged_table_free(mt);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		strbuf_release(&bufs[i]);
+	}
+	reftable_free(bs);
+}
+
+static void test_merged(void)
+{
+	uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 };
+	uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 };
+	struct reftable_ref_record r1[] = {
+		{
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+		{
+			.refname = "c",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		}
+	};
+	struct reftable_ref_record r2[] = { {
+		.refname = "a",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_DELETION,
+	} };
+	struct reftable_ref_record r3[] = {
+		{
+			.refname = "c",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash2,
+		},
+		{
+			.refname = "d",
+			.update_index = 3,
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = hash1,
+		},
+	};
+
+	struct reftable_ref_record want[] = {
+		r2[0],
+		r1[1],
+		r3[0],
+		r3[1],
+	};
+
+	struct reftable_ref_record *refs[] = { r1, r2, r3 };
+	int sizes[3] = { 3, 1, 2 };
+	struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT };
+	struct reftable_block_source *bs = NULL;
+	struct reftable_reader **readers = NULL;
+	struct reftable_merged_table *mt =
+		merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3);
+
+	struct reftable_iterator it = { NULL };
+	int err = reftable_merged_table_seek_ref(mt, &it, "a");
+	struct reftable_ref_record *out = NULL;
+	size_t len = 0;
+	size_t cap = 0;
+	int i = 0;
+
+	EXPECT_ERR(err);
+	while (len < 100) { /* cap loops/recursion. */
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			out = reftable_realloc(
+				out, sizeof(struct reftable_ref_record) * cap);
+		}
+		out[len++] = ref;
+	}
+	reftable_iterator_destroy(&it);
+
+	assert(ARRAY_SIZE(want) == len);
+	for (i = 0; i < len; i++) {
+		assert(reftable_ref_record_equal(&want[i], &out[i],
+						 GIT_SHA1_RAWSZ));
+	}
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&out[i]);
+	}
+	reftable_free(out);
+
+	for (i = 0; i < 3; i++) {
+		strbuf_release(&bufs[i]);
+	}
+	readers_destroy(readers, 3);
+	reftable_merged_table_free(mt);
+	reftable_free(bs);
+}
+
+static void test_default_write_opts(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_ref_record rec = {
+		.refname = "master",
+		.update_index = 1,
+	};
+	int err;
+	struct reftable_block_source source = { NULL };
+	struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1);
+	uint32_t hash_id;
+	struct reftable_reader *rd = NULL;
+	struct reftable_merged_table *merged = NULL;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	hash_id = reftable_reader_hash_id(rd);
+	assert(hash_id == GIT_SHA1_HASH_ID);
+
+	reftable_table_from_reader(&tab[0], rd);
+	err = reftable_new_merged_table(&merged, tab, 1, GIT_SHA1_HASH_ID);
+	EXPECT_ERR(err);
+
+	reftable_reader_free(rd);
+	reftable_merged_table_free(merged);
+	strbuf_release(&buf);
+}
+
+/* XXX test refs_for(oid) */
+
+int merged_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_merged_between);
+	RUN_TEST(test_merged);
+	RUN_TEST(test_default_write_opts);
+	return 0;
+}
diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h
new file mode 100644
index 000000000000..1a6d16915ab4
--- /dev/null
+++ b/reftable/reftable-merged.h
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_MERGED_H
+#define REFTABLE_MERGED_H
+
+#include "reftable-iterator.h"
+
+/*
+ * Merged tables
+ *
+ * A ref database kept in a sequence of table files. The merged_table presents a
+ * unified view to reading (seeking, iterating) a sequence of immutable tables.
+ *
+ * The merged tables are on purpose kept disconnected from their actual storage
+ * (eg. files on disk), because it is useful to merge tables aren't files. For
+ * example, the per-workspace and global ref namespace can be implemented as a
+ * merged table of two stacks of file-backed reftables.
+ */
+
+/* A merged table is implements seeking/iterating over a stack of tables. */
+struct reftable_merged_table;
+
+/* A generic reftable; see below. */
+struct reftable_table;
+
+/* reftable_new_merged_table creates a new merged table. It takes ownership of
+   the stack array.
+*/
+int reftable_new_merged_table(struct reftable_merged_table **dest,
+			      struct reftable_table *stack, int n,
+			      uint32_t hash_id);
+
+/* returns an iterator positioned just before 'name' */
+int reftable_merged_table_seek_ref(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns an iterator for log entry, at given update_index */
+int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt,
+				      struct reftable_iterator *it,
+				      const char *name, uint64_t update_index);
+
+/* like reftable_merged_table_seek_log_at but look for the newest entry. */
+int reftable_merged_table_seek_log(struct reftable_merged_table *mt,
+				   struct reftable_iterator *it,
+				   const char *name);
+
+/* returns the max update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_max_update_index(struct reftable_merged_table *mt);
+
+/* returns the min update_index covered by this merged table. */
+uint64_t
+reftable_merged_table_min_update_index(struct reftable_merged_table *mt);
+
+/* releases memory for the merged_table */
+void reftable_merged_table_free(struct reftable_merged_table *m);
+
+/* return the hash ID of the merged table. */
+uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m);
+
+/* create a generic table from reftable_merged_table */
+void reftable_table_from_merged_table(struct reftable_table *tab,
+				      struct reftable_merged_table *table);
+
+#endif
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 0b5a1701df15..8087f2da4e6a 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	merged_test_main(argc, argv);
 	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
 	readwrite_test_main(argc, argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 20/28] reftable: implement refname validation
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (18 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 19/28] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 21/28] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
                               ` (8 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The packed/loose format has restrictions on refnames: a and a/b cannot
coexist. This limitation does not apply to reftable per se, but must be
maintained for interoperability. This code adds validation routines to
abort transactions that are trying to add invalid names.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 |   1 +
 reftable/refname.c       | 209 +++++++++++++++++++++++++++++++++++++++
 reftable/refname.h       |  29 ++++++
 reftable/refname_test.c  | 102 +++++++++++++++++++
 t/helper/test-reftable.c |   1 +
 5 files changed, 342 insertions(+)
 create mode 100644 reftable/refname.c
 create mode 100644 reftable/refname.h
 create mode 100644 reftable/refname_test.c

diff --git a/Makefile b/Makefile
index 6139f02faabe..b3973d8f5a8f 100644
--- a/Makefile
+++ b/Makefile
@@ -2430,6 +2430,7 @@ REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/readwrite_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/refname.c b/reftable/refname.c
new file mode 100644
index 000000000000..957349693247
--- /dev/null
+++ b/reftable/refname.c
@@ -0,0 +1,209 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+#include "reftable-error.h"
+#include "basics.h"
+#include "refname.h"
+#include "reftable-iterator.h"
+
+struct find_arg {
+	char **names;
+	const char *want;
+};
+
+static int find_name(size_t k, void *arg)
+{
+	struct find_arg *f_arg = arg;
+	return strcmp(f_arg->names[k], f_arg->want) >= 0;
+}
+
+static int modification_has_ref(struct modification *mod, const char *name)
+{
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = name,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len && !strcmp(mod->add[idx], name)) {
+			return 0;
+		}
+	}
+
+	if (mod->del_len > 0) {
+		struct find_arg arg = {
+			.names = mod->del,
+			.want = name,
+		};
+		int idx = binsearch(mod->del_len, find_name, &arg);
+		if (idx < mod->del_len && !strcmp(mod->del[idx], name)) {
+			return 1;
+		}
+	}
+
+	err = reftable_table_read_ref(&mod->tab, name, &ref);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+static void modification_release(struct modification *mod)
+{
+	/* don't delete the strings themselves; they're owned by ref records.
+	 */
+	FREE_AND_NULL(mod->add);
+	FREE_AND_NULL(mod->del);
+	mod->add_len = 0;
+	mod->del_len = 0;
+}
+
+static int modification_has_ref_with_prefix(struct modification *mod,
+					    const char *prefix)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+
+	if (mod->add_len > 0) {
+		struct find_arg arg = {
+			.names = mod->add,
+			.want = prefix,
+		};
+		int idx = binsearch(mod->add_len, find_name, &arg);
+		if (idx < mod->add_len &&
+		    !strncmp(prefix, mod->add[idx], strlen(prefix)))
+			goto done;
+	}
+	err = reftable_table_seek_ref(&mod->tab, &it, prefix);
+	if (err)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err)
+			goto done;
+
+		if (mod->del_len > 0) {
+			struct find_arg arg = {
+				.names = mod->del,
+				.want = ref.refname,
+			};
+			int idx = binsearch(mod->del_len, find_name, &arg);
+			if (idx < mod->del_len &&
+			    !strcmp(ref.refname, mod->del[idx])) {
+				continue;
+			}
+		}
+
+		if (strncmp(ref.refname, prefix, strlen(prefix))) {
+			err = 1;
+			goto done;
+		}
+		err = 0;
+		goto done;
+	}
+
+done:
+	reftable_ref_record_release(&ref);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int validate_refname(const char *name)
+{
+	while (1) {
+		char *next = strchr(name, '/');
+		if (!*name) {
+			return REFTABLE_REFNAME_ERROR;
+		}
+		if (!next) {
+			return 0;
+		}
+		if (next - name == 0 || (next - name == 1 && *name == '.') ||
+		    (next - name == 2 && name[0] == '.' && name[1] == '.'))
+			return REFTABLE_REFNAME_ERROR;
+		name = next + 1;
+	}
+	return 0;
+}
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz)
+{
+	struct modification mod = {
+		.tab = tab,
+		.add = reftable_calloc(sizeof(char *) * sz),
+		.del = reftable_calloc(sizeof(char *) * sz),
+	};
+	int i = 0;
+	int err = 0;
+	for (; i < sz; i++) {
+		if (reftable_ref_record_is_deletion(&recs[i])) {
+			mod.del[mod.del_len++] = recs[i].refname;
+		} else {
+			mod.add[mod.add_len++] = recs[i].refname;
+		}
+	}
+
+	err = modification_validate(&mod);
+	modification_release(&mod);
+	return err;
+}
+
+static void strbuf_trim_component(struct strbuf *sl)
+{
+	while (sl->len > 0) {
+		int is_slash = (sl->buf[sl->len - 1] == '/');
+		strbuf_setlen(sl, sl->len - 1);
+		if (is_slash)
+			break;
+	}
+}
+
+int modification_validate(struct modification *mod)
+{
+	struct strbuf slashed = STRBUF_INIT;
+	int err = 0;
+	int i = 0;
+	for (; i < mod->add_len; i++) {
+		err = validate_refname(mod->add[i]);
+		if (err)
+			goto done;
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		strbuf_addstr(&slashed, "/");
+
+		err = modification_has_ref_with_prefix(mod, slashed.buf);
+		if (err == 0) {
+			err = REFTABLE_NAME_CONFLICT;
+			goto done;
+		}
+		if (err < 0)
+			goto done;
+
+		strbuf_reset(&slashed);
+		strbuf_addstr(&slashed, mod->add[i]);
+		while (slashed.len) {
+			strbuf_trim_component(&slashed);
+			err = modification_has_ref(mod, slashed.buf);
+			if (err == 0) {
+				err = REFTABLE_NAME_CONFLICT;
+				goto done;
+			}
+			if (err < 0)
+				goto done;
+		}
+	}
+	err = 0;
+done:
+	strbuf_release(&slashed);
+	return err;
+}
diff --git a/reftable/refname.h b/reftable/refname.h
new file mode 100644
index 000000000000..a24b40fcb428
--- /dev/null
+++ b/reftable/refname.h
@@ -0,0 +1,29 @@
+/*
+  Copyright 2020 Google LLC
+
+  Use of this source code is governed by a BSD-style
+  license that can be found in the LICENSE file or at
+  https://developers.google.com/open-source/licenses/bsd
+*/
+#ifndef REFNAME_H
+#define REFNAME_H
+
+#include "reftable-record.h"
+#include "reftable-generic.h"
+
+struct modification {
+	struct reftable_table tab;
+
+	char **add;
+	size_t add_len;
+
+	char **del;
+	size_t del_len;
+};
+
+int validate_ref_record_addition(struct reftable_table tab,
+				 struct reftable_ref_record *recs, size_t sz);
+
+int modification_validate(struct modification *mod);
+
+#endif
diff --git a/reftable/refname_test.c b/reftable/refname_test.c
new file mode 100644
index 000000000000..8645cd93bbd8
--- /dev/null
+++ b/reftable/refname_test.c
@@ -0,0 +1,102 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-writer.h"
+#include "system.h"
+
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+struct testcase {
+	char *add;
+	char *del;
+	int error_code;
+};
+
+static void test_conflict(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record rec = {
+		.refname = "a/b",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "destination", /* make sure it's not a symref.
+						*/
+		.update_index = 1,
+	};
+	int err;
+	int i;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct testcase cases[] = {
+		{ "a/b/c", NULL, REFTABLE_NAME_CONFLICT },
+		{ "b", NULL, 0 },
+		{ "a", NULL, REFTABLE_NAME_CONFLICT },
+		{ "a", "a/b", 0 },
+
+		{ "p/", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p//q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/./q", NULL, REFTABLE_REFNAME_ERROR },
+		{ "p/../q", NULL, REFTABLE_REFNAME_ERROR },
+
+		{ "a/b/c", "a/b", 0 },
+		{ NULL, "a//b", 0 },
+	};
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_add_ref(w, &rec);
+	EXPECT_ERR(err);
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+
+	block_source_from_strbuf(&source, &buf);
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	reftable_table_from_reader(&tab, rd);
+
+	for (i = 0; i < ARRAY_SIZE(cases); i++) {
+		struct modification mod = {
+			.tab = tab,
+		};
+
+		if (cases[i].add) {
+			mod.add = &cases[i].add;
+			mod.add_len = 1;
+		}
+		if (cases[i].del) {
+			mod.del = &cases[i].del;
+			mod.del_len = 1;
+		}
+
+		err = modification_validate(&mod);
+		EXPECT(err == cases[i].error_code);
+	}
+
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+int refname_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_conflict);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 8087f2da4e6a..c8db6852c353 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -8,6 +8,7 @@ int cmd__reftable(int argc, const char **argv)
 	merged_test_main(argc, argv);
 	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
+	refname_test_main(argc, argv);
 	readwrite_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 21/28] reftable: implement stack, a mutable database of reftable files.
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (19 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 20/28] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 22/28] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
                               ` (7 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                  |    1 +
 reftable/reftable-stack.h |  128 ++++
 reftable/stack.c          | 1395 +++++++++++++++++++++++++++++++++++++
 reftable/stack.h          |   41 ++
 reftable/stack_test.c     |  940 +++++++++++++++++++++++++
 t/helper/test-reftable.c  |    1 +
 6 files changed, 2506 insertions(+)
 create mode 100644 reftable/reftable-stack.h
 create mode 100644 reftable/stack.c
 create mode 100644 reftable/stack.h
 create mode 100644 reftable/stack_test.c

diff --git a/Makefile b/Makefile
index b3973d8f5a8f..f78d1cbc2d3c 100644
--- a/Makefile
+++ b/Makefile
@@ -2431,6 +2431,7 @@ REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/readwrite_test.o
 REFTABLE_TEST_OBJS += reftable/refname_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h
new file mode 100644
index 000000000000..b64daf120a3c
--- /dev/null
+++ b/reftable/reftable-stack.h
@@ -0,0 +1,128 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_STACK_H
+#define REFTABLE_STACK_H
+
+#include "reftable-writer.h"
+
+/*
+ * The stack presents an interface to a mutable sequence of reftables.
+
+ * A stack can be mutated by pushing a table to the top of the stack.
+
+ * The reftable_stack automatically compacts files on disk to ensure good
+ * amortized performance.
+ *
+ * For windows and other platforms that cannot have open files as rename
+ * destinations, concurrent access from multiple processes needs the rand()
+ * random seed to be randomized.
+ */
+struct reftable_stack;
+
+/* open a new reftable stack. The tables along with the table list will be
+ *  stored in 'dir'. Typically, this should be .git/reftables.
+ */
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config);
+
+/* returns the update_index at which a next table should be written. */
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st);
+
+/* holds a transaction to add tables at the top of a stack. */
+struct reftable_addition;
+
+/*
+ * returns a new transaction to add reftables to the given stack. As a side
+ * effect, the ref database is locked.
+ */
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st);
+
+/* Adds a reftable to transaction. */
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg);
+
+/* Commits the transaction, releasing the lock. After calling this,
+ * reftable_addition_destroy should still be called.
+ */
+int reftable_addition_commit(struct reftable_addition *add);
+
+/* Release all non-committed data from the transaction, and deallocate the
+ * transaction. Releases the lock if held. */
+void reftable_addition_destroy(struct reftable_addition *add);
+
+/* add a new table to the stack. The write_table function must call
+ * reftable_writer_set_limits, add refs and return an error value. */
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write_table)(struct reftable_writer *wr,
+					  void *write_arg),
+		       void *write_arg);
+
+/* returns the merged_table for seeking. This table is valid until the
+ * next write or reload, and should not be closed or deleted.
+ */
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st);
+
+/* frees all resources associated with the stack. */
+void reftable_stack_destroy(struct reftable_stack *st);
+
+/* Reloads the stack if necessary. This is very cheap to run if the stack was up
+ * to date */
+int reftable_stack_reload(struct reftable_stack *st);
+
+/* Policy for expiring reflog entries. */
+struct reftable_log_expiry_config {
+	/* Drop entries older than this timestamp */
+	uint64_t time;
+
+	/* Drop older entries */
+	uint64_t min_update_index;
+};
+
+/* compacts all reftables into a giant table. Expire reflog entries if config is
+ * non-NULL */
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config);
+
+/* heuristically compact unbalanced table stack. */
+int reftable_stack_auto_compact(struct reftable_stack *st);
+
+/* delete stale .ref tables. */
+int reftable_stack_clean(struct reftable_stack *st);
+
+/* convenience function to read a single ref. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref);
+
+/* convenience function to read a single log. Returns < 0 for error, 0 for
+ * success, and 1 if ref not found. */
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log);
+
+/* statistics on past compactions. */
+struct reftable_compaction_stats {
+	uint64_t bytes; /* total number of bytes written */
+	uint64_t entries_written; /* total number of entries written, including
+				     failures. */
+	int attempts; /* how often we tried to compact */
+	int failures; /* failures happen on concurrent updates */
+};
+
+/* return statistics for compaction up till now. */
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st);
+
+/* print the entire stack represented by the directory */
+int reftable_stack_print_directory(const char *stackdir);
+
+#endif
diff --git a/reftable/stack.c b/reftable/stack.c
new file mode 100644
index 000000000000..e3e94e320664
--- /dev/null
+++ b/reftable/stack.c
@@ -0,0 +1,1395 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+#include "merged.h"
+#include "reader.h"
+#include "refname.h"
+#include "reftable-error.h"
+#include "reftable-record.h"
+#include "reftable-merged.h"
+#include "writer.h"
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg);
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config);
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name);
+static void reftable_addition_close(struct reftable_addition *add);
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open);
+
+static void stack_filename(struct strbuf *dest, struct reftable_stack *st,
+			   const char *name)
+{
+	strbuf_reset(dest);
+	strbuf_addstr(dest, st->reftable_dir);
+	strbuf_addstr(dest, "/");
+	strbuf_addstr(dest, name);
+}
+
+static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz)
+{
+	int *fdp = (int *)arg;
+	return write(*fdp, data, sz);
+}
+
+int reftable_new_stack(struct reftable_stack **dest, const char *dir,
+		       struct reftable_write_options config)
+{
+	struct reftable_stack *p =
+		reftable_calloc(sizeof(struct reftable_stack));
+	struct strbuf list_file_name = STRBUF_INIT;
+	int err = 0;
+
+	if (config.hash_id == 0) {
+		config.hash_id = GIT_SHA1_HASH_ID;
+	}
+
+	*dest = NULL;
+
+	strbuf_reset(&list_file_name);
+	strbuf_addstr(&list_file_name, dir);
+	strbuf_addstr(&list_file_name, "/tables.list");
+
+	p->list_file = strbuf_detach(&list_file_name, NULL);
+	p->reftable_dir = xstrdup(dir);
+	p->config = config;
+
+	err = reftable_stack_reload_maybe_reuse(p, 1);
+	if (err < 0) {
+		reftable_stack_destroy(p);
+	} else {
+		*dest = p;
+	}
+	return err;
+}
+
+static int fd_read_lines(int fd, char ***namesp)
+{
+	off_t size = lseek(fd, 0, SEEK_END);
+	char *buf = NULL;
+	int err = 0;
+	if (size < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	err = lseek(fd, 0, SEEK_SET);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	buf = reftable_malloc(size + 1);
+	if (read(fd, buf, size) != size) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	buf[size] = 0;
+
+	parse_names(buf, size, namesp);
+
+done:
+	reftable_free(buf);
+	return err;
+}
+
+int read_lines(const char *filename, char ***namesp)
+{
+	int fd = open(filename, O_RDONLY, 0644);
+	int err = 0;
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			*namesp = reftable_calloc(sizeof(char *));
+			return 0;
+		}
+
+		return REFTABLE_IO_ERROR;
+	}
+	err = fd_read_lines(fd, namesp);
+	close(fd);
+	return err;
+}
+
+struct reftable_merged_table *
+reftable_stack_merged_table(struct reftable_stack *st)
+{
+	return st->merged;
+}
+
+static int has_name(char **names, const char *name)
+{
+	while (*names) {
+		if (!strcmp(*names, name))
+			return 1;
+		names++;
+	}
+	return 0;
+}
+
+/* Close and free the stack */
+void reftable_stack_destroy(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = 0;
+	if (st->merged) {
+		reftable_merged_table_free(st->merged);
+		st->merged = NULL;
+	}
+
+	err = read_lines(st->list_file, &names);
+	if (err < 0) {
+		FREE_AND_NULL(names);
+	}
+
+	if (st->readers) {
+		int i = 0;
+		struct strbuf filename = STRBUF_INIT;
+		for (i = 0; i < st->readers_len; i++) {
+			const char *name = reader_name(st->readers[i]);
+			strbuf_reset(&filename);
+			if (names && !has_name(names, name)) {
+				stack_filename(&filename, st, name);
+			}
+			reftable_reader_free(st->readers[i]);
+
+			if (filename.len) {
+				// On Windows, can only unlink after closing.
+				unlink(filename.buf);
+			}
+		}
+		strbuf_release(&filename);
+		st->readers_len = 0;
+		FREE_AND_NULL(st->readers);
+	}
+	FREE_AND_NULL(st->list_file);
+	FREE_AND_NULL(st->reftable_dir);
+	reftable_free(st);
+	free_names(names);
+}
+
+static struct reftable_reader **stack_copy_readers(struct reftable_stack *st,
+						   int cur_len)
+{
+	struct reftable_reader **cur =
+		reftable_calloc(sizeof(struct reftable_reader *) * cur_len);
+	int i = 0;
+	for (i = 0; i < cur_len; i++) {
+		cur[i] = st->readers[i];
+	}
+	return cur;
+}
+
+static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
+				      int reuse_open)
+{
+	int cur_len = !st->merged ? 0 : st->merged->stack_len;
+	struct reftable_reader **cur = stack_copy_readers(st, cur_len);
+	int err = 0;
+	int names_len = names_length(names);
+	struct reftable_reader **new_readers =
+		reftable_calloc(sizeof(struct reftable_reader *) * names_len);
+	struct reftable_table *new_tables =
+		reftable_calloc(sizeof(struct reftable_table) * names_len);
+	int new_readers_len = 0;
+	struct reftable_merged_table *new_merged = NULL;
+	int i;
+
+	while (*names) {
+		struct reftable_reader *rd = NULL;
+		char *name = *names++;
+
+		/* this is linear; we assume compaction keeps the number of
+		   tables under control so this is not quadratic. */
+		int j = 0;
+		for (j = 0; reuse_open && j < cur_len; j++) {
+			if (cur[j] && 0 == strcmp(cur[j]->name, name)) {
+				rd = cur[j];
+				cur[j] = NULL;
+				break;
+			}
+		}
+
+		if (!rd) {
+			struct reftable_block_source src = { NULL };
+			struct strbuf table_path = STRBUF_INIT;
+			stack_filename(&table_path, st, name);
+
+			err = reftable_block_source_from_file(&src,
+							      table_path.buf);
+			strbuf_release(&table_path);
+
+			if (err < 0)
+				goto done;
+
+			err = reftable_new_reader(&rd, &src, name);
+			if (err < 0)
+				goto done;
+		}
+
+		new_readers[new_readers_len] = rd;
+		reftable_table_from_reader(&new_tables[new_readers_len], rd);
+		new_readers_len++;
+	}
+
+	/* success! */
+	err = reftable_new_merged_table(&new_merged, new_tables,
+					new_readers_len, st->config.hash_id);
+	if (err < 0)
+		goto done;
+
+	new_tables = NULL;
+	st->readers_len = new_readers_len;
+	if (st->merged) {
+		merged_table_release(st->merged);
+		reftable_merged_table_free(st->merged);
+	}
+	if (st->readers) {
+		reftable_free(st->readers);
+	}
+	st->readers = new_readers;
+	new_readers = NULL;
+	new_readers_len = 0;
+
+	new_merged->suppress_deletions = 1;
+	st->merged = new_merged;
+	for (i = 0; i < cur_len; i++) {
+		if (cur[i]) {
+			const char *name = reader_name(cur[i]);
+			struct strbuf filename = STRBUF_INIT;
+			stack_filename(&filename, st, name);
+
+			reader_close(cur[i]);
+			reftable_reader_free(cur[i]);
+
+			// On Windows, can only unlink after closing.
+			unlink(filename.buf);
+
+			strbuf_release(&filename);
+		}
+	}
+
+done:
+	for (i = 0; i < new_readers_len; i++) {
+		reader_close(new_readers[i]);
+		reftable_reader_free(new_readers[i]);
+	}
+	reftable_free(new_readers);
+	reftable_free(new_tables);
+	reftable_free(cur);
+	return err;
+}
+
+/* return negative if a before b. */
+static int tv_cmp(struct timeval *a, struct timeval *b)
+{
+	time_t diff = a->tv_sec - b->tv_sec;
+	int udiff = a->tv_usec - b->tv_usec;
+
+	if (diff != 0)
+		return diff;
+
+	return udiff;
+}
+
+static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
+					     int reuse_open)
+{
+	struct timeval deadline = { 0 };
+	int err = gettimeofday(&deadline, NULL);
+	int64_t delay = 0;
+	int tries = 0;
+	if (err < 0)
+		return err;
+
+	deadline.tv_sec += 3;
+	while (1) {
+		char **names = NULL;
+		char **names_after = NULL;
+		struct timeval now = { 0 };
+		int err = gettimeofday(&now, NULL);
+		int err2 = 0;
+		if (err < 0) {
+			return err;
+		}
+
+		/* Only look at deadlines after the first few times. This
+		   simplifies debugging in GDB */
+		tries++;
+		if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
+			break;
+		}
+
+		err = read_lines(st->list_file, &names);
+		if (err < 0) {
+			free_names(names);
+			return err;
+		}
+		err = reftable_stack_reload_once(st, names, reuse_open);
+		if (err == 0) {
+			free_names(names);
+			break;
+		}
+		if (err != REFTABLE_NOT_EXIST_ERROR) {
+			free_names(names);
+			return err;
+		}
+
+		/* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+		   writer. Check if there was one by checking if the name list
+		   changed.
+		*/
+		err2 = read_lines(st->list_file, &names_after);
+		if (err2 < 0) {
+			free_names(names);
+			return err2;
+		}
+
+		if (names_equal(names_after, names)) {
+			free_names(names);
+			free_names(names_after);
+			return err;
+		}
+		free_names(names);
+		free_names(names_after);
+
+		delay = delay + (delay * rand()) / RAND_MAX + 1;
+		sleep_millisec(delay);
+	}
+
+	return 0;
+}
+
+/* -1 = error
+ 0 = up to date
+ 1 = changed. */
+static int stack_uptodate(struct reftable_stack *st)
+{
+	char **names = NULL;
+	int err = read_lines(st->list_file, &names);
+	int i = 0;
+	if (err < 0)
+		return err;
+
+	for (i = 0; i < st->readers_len; i++) {
+		if (!names[i]) {
+			err = 1;
+			goto done;
+		}
+
+		if (strcmp(st->readers[i]->name, names[i])) {
+			err = 1;
+			goto done;
+		}
+	}
+
+	if (names[st->merged->stack_len]) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	free_names(names);
+	return err;
+}
+
+int reftable_stack_reload(struct reftable_stack *st)
+{
+	int err = stack_uptodate(st);
+	if (err > 0)
+		return reftable_stack_reload_maybe_reuse(st, 1);
+	return err;
+}
+
+int reftable_stack_add(struct reftable_stack *st,
+		       int (*write)(struct reftable_writer *wr, void *arg),
+		       void *arg)
+{
+	int err = stack_try_add(st, write, arg);
+	if (err < 0) {
+		if (err == REFTABLE_LOCK_ERROR) {
+			/* Ignore error return, we want to propagate
+			   REFTABLE_LOCK_ERROR.
+			*/
+			reftable_stack_reload(st);
+		}
+		return err;
+	}
+
+	if (!st->disable_auto_compact)
+		return reftable_stack_auto_compact(st);
+
+	return 0;
+}
+
+static void format_name(struct strbuf *dest, uint64_t min, uint64_t max)
+{
+	char buf[100];
+	uint32_t rnd = (uint32_t)rand();
+	snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x",
+		 min, max, rnd);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, buf);
+}
+
+struct reftable_addition {
+	int lock_file_fd;
+	struct strbuf lock_file_name;
+	struct reftable_stack *stack;
+
+	char **new_tables;
+	int new_tables_len;
+	uint64_t next_update_index;
+};
+
+#define REFTABLE_ADDITION_INIT                \
+	{                                     \
+		.lock_file_name = STRBUF_INIT \
+	}
+
+static int reftable_stack_init_addition(struct reftable_addition *add,
+					struct reftable_stack *st)
+{
+	int err = 0;
+	add->stack = st;
+
+	strbuf_reset(&add->lock_file_name);
+	strbuf_addstr(&add->lock_file_name, st->list_file);
+	strbuf_addstr(&add->lock_file_name, ".lock");
+
+	add->lock_file_fd = open(add->lock_file_name.buf,
+				 O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (add->lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = REFTABLE_LOCK_ERROR;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	err = stack_uptodate(st);
+	if (err < 0)
+		goto done;
+
+	if (err > 1) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	add->next_update_index = reftable_stack_next_update_index(st);
+done:
+	if (err) {
+		reftable_addition_close(add);
+	}
+	return err;
+}
+
+static void reftable_addition_close(struct reftable_addition *add)
+{
+	int i = 0;
+	struct strbuf nm = STRBUF_INIT;
+	for (i = 0; i < add->new_tables_len; i++) {
+		stack_filename(&nm, add->stack, add->new_tables[i]);
+		unlink(nm.buf);
+		reftable_free(add->new_tables[i]);
+		add->new_tables[i] = NULL;
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	if (add->lock_file_fd > 0) {
+		close(add->lock_file_fd);
+		add->lock_file_fd = 0;
+	}
+	if (add->lock_file_name.len > 0) {
+		unlink(add->lock_file_name.buf);
+		strbuf_release(&add->lock_file_name);
+	}
+
+	strbuf_release(&nm);
+}
+
+void reftable_addition_destroy(struct reftable_addition *add)
+{
+	if (!add) {
+		return;
+	}
+	reftable_addition_close(add);
+	reftable_free(add);
+}
+
+int reftable_addition_commit(struct reftable_addition *add)
+{
+	struct strbuf table_list = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+	if (add->new_tables_len == 0)
+		goto done;
+
+	for (i = 0; i < add->stack->merged->stack_len; i++) {
+		strbuf_addstr(&table_list, add->stack->readers[i]->name);
+		strbuf_addstr(&table_list, "\n");
+	}
+	for (i = 0; i < add->new_tables_len; i++) {
+		strbuf_addstr(&table_list, add->new_tables[i]);
+		strbuf_addstr(&table_list, "\n");
+	}
+
+	err = write(add->lock_file_fd, table_list.buf, table_list.len);
+	strbuf_release(&table_list);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = close(add->lock_file_fd);
+	add->lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = rename(add->lock_file_name.buf, add->stack->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	/* success, no more state to clean up. */
+	strbuf_release(&add->lock_file_name);
+	for (i = 0; i < add->new_tables_len; i++) {
+		reftable_free(add->new_tables[i]);
+	}
+	reftable_free(add->new_tables);
+	add->new_tables = NULL;
+	add->new_tables_len = 0;
+
+	err = reftable_stack_reload(add->stack);
+done:
+	reftable_addition_close(add);
+	return err;
+}
+
+int reftable_stack_new_addition(struct reftable_addition **dest,
+				struct reftable_stack *st)
+{
+	int err = 0;
+	struct reftable_addition empty = REFTABLE_ADDITION_INIT;
+	*dest = reftable_calloc(sizeof(**dest));
+	**dest = empty;
+	err = reftable_stack_init_addition(*dest, st);
+	if (err) {
+		reftable_free(*dest);
+		*dest = NULL;
+	}
+	return err;
+}
+
+static int stack_try_add(struct reftable_stack *st,
+			 int (*write_table)(struct reftable_writer *wr,
+					    void *arg),
+			 void *arg)
+{
+	struct reftable_addition add = REFTABLE_ADDITION_INIT;
+	int err = reftable_stack_init_addition(&add, st);
+	if (err < 0)
+		goto done;
+	if (err > 0) {
+		err = REFTABLE_LOCK_ERROR;
+		goto done;
+	}
+
+	err = reftable_addition_add(&add, write_table, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_addition_commit(&add);
+done:
+	reftable_addition_close(&add);
+	return err;
+}
+
+int reftable_addition_add(struct reftable_addition *add,
+			  int (*write_table)(struct reftable_writer *wr,
+					     void *arg),
+			  void *arg)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tab_file_name = STRBUF_INIT;
+	struct strbuf next_name = STRBUF_INIT;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+	int tab_fd = 0;
+
+	strbuf_reset(&next_name);
+	format_name(&next_name, add->next_update_index, add->next_update_index);
+
+	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
+	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab_file_name.buf);
+	if (tab_fd < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd,
+				 &add->stack->config);
+	err = write_table(wr, arg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_writer_close(wr);
+	if (err == REFTABLE_EMPTY_TABLE_ERROR) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	if (err < 0)
+		goto done;
+
+	if (wr->min_update_index < add->next_update_index) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
+	format_name(&next_name, wr->min_update_index, wr->max_update_index);
+	strbuf_addstr(&next_name, ".ref");
+
+	stack_filename(&tab_file_name, add->stack, next_name.buf);
+
+	/*
+	  On windows, this relies on rand() picking a unique destination name.
+	  Maybe we should do retry loop as well?
+	 */
+	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+
+	add->new_tables = reftable_realloc(add->new_tables,
+					   sizeof(*add->new_tables) *
+						   (add->new_tables_len + 1));
+	add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL);
+	add->new_tables_len++;
+done:
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (temp_tab_file_name.len > 0) {
+		unlink(temp_tab_file_name.buf);
+	}
+
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&tab_file_name);
+	strbuf_release(&next_name);
+	reftable_writer_free(wr);
+	return err;
+}
+
+uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
+{
+	int sz = st->merged->stack_len;
+	if (sz > 0)
+		return reftable_reader_max_update_index(st->readers[sz - 1]) +
+		       1;
+	return 1;
+}
+
+static int stack_compact_locked(struct reftable_stack *st, int first, int last,
+				struct strbuf *temp_tab,
+				struct reftable_log_expiry_config *config)
+{
+	struct strbuf next_name = STRBUF_INIT;
+	int tab_fd = -1;
+	struct reftable_writer *wr = NULL;
+	int err = 0;
+
+	format_name(&next_name,
+		    reftable_reader_min_update_index(st->readers[first]),
+		    reftable_reader_max_update_index(st->readers[last]));
+
+	stack_filename(temp_tab, st, next_name.buf);
+	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+
+	tab_fd = mkstemp(temp_tab->buf);
+	wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config);
+
+	err = stack_write_compact(st, wr, first, last, config);
+	if (err < 0)
+		goto done;
+	err = reftable_writer_close(wr);
+	if (err < 0)
+		goto done;
+
+	err = close(tab_fd);
+	tab_fd = 0;
+
+done:
+	reftable_writer_free(wr);
+	if (tab_fd > 0) {
+		close(tab_fd);
+		tab_fd = 0;
+	}
+	if (err != 0 && temp_tab->len > 0) {
+		unlink(temp_tab->buf);
+		strbuf_release(temp_tab);
+	}
+	strbuf_release(&next_name);
+	return err;
+}
+
+static int stack_write_compact(struct reftable_stack *st,
+			       struct reftable_writer *wr, int first, int last,
+			       struct reftable_log_expiry_config *config)
+{
+	int subtabs_len = last - first + 1;
+	struct reftable_table *subtabs = reftable_calloc(
+		sizeof(struct reftable_table) * (last - first + 1));
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_iterator it = { NULL };
+	struct reftable_ref_record ref = { NULL };
+	struct reftable_log_record log = { NULL };
+
+	uint64_t entries = 0;
+
+	int i = 0, j = 0;
+	for (i = first, j = 0; i <= last; i++) {
+		struct reftable_reader *t = st->readers[i];
+		reftable_table_from_reader(&subtabs[j++], t);
+		st->stats.bytes += t->size;
+	}
+	reftable_writer_set_limits(wr, st->readers[first]->min_update_index,
+				   st->readers[last]->max_update_index);
+
+	err = reftable_new_merged_table(&mt, subtabs, subtabs_len,
+					st->config.hash_id);
+	if (err < 0) {
+		reftable_free(subtabs);
+		goto done;
+	}
+
+	err = reftable_merged_table_seek_ref(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (first == 0 && reftable_ref_record_is_deletion(&ref)) {
+			continue;
+		}
+
+		err = reftable_writer_add_ref(wr, &ref);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+	reftable_iterator_destroy(&it);
+
+	err = reftable_merged_table_seek_log(mt, &it, "");
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+		if (first == 0 && reftable_log_record_is_deletion(&log)) {
+			continue;
+		}
+
+		if (config && config->min_update_index > 0 &&
+		    log.update_index < config->min_update_index) {
+			continue;
+		}
+
+		if (config && config->time > 0 &&
+		    log.update.time < config->time) {
+			continue;
+		}
+
+		err = reftable_writer_add_log(wr, &log);
+		if (err < 0) {
+			break;
+		}
+		entries++;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	if (mt) {
+		merged_table_release(mt);
+		reftable_merged_table_free(mt);
+	}
+	reftable_ref_record_release(&ref);
+	reftable_log_record_release(&log);
+	st->stats.entries_written += entries;
+	return err;
+}
+
+/* <  0: error. 0 == OK, > 0 attempt failed; could retry. */
+static int stack_compact_range(struct reftable_stack *st, int first, int last,
+			       struct reftable_log_expiry_config *expiry)
+{
+	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf new_table_name = STRBUF_INIT;
+	struct strbuf lock_file_name = STRBUF_INIT;
+	struct strbuf ref_list_contents = STRBUF_INIT;
+	struct strbuf new_table_path = STRBUF_INIT;
+	int err = 0;
+	int have_lock = 0;
+	int lock_file_fd = 0;
+	int compact_count = last - first + 1;
+	char **listp = NULL;
+	char **delete_on_success =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	char **subtable_locks =
+		reftable_calloc(sizeof(char *) * (compact_count + 1));
+	int i = 0;
+	int j = 0;
+	int is_empty_table = 0;
+
+	if (first > last || (!expiry && first == last)) {
+		err = 0;
+		goto done;
+	}
+
+	st->stats.attempts++;
+
+	strbuf_reset(&lock_file_name);
+	strbuf_addstr(&lock_file_name, st->list_file);
+	strbuf_addstr(&lock_file_name, ".lock");
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	/* Don't want to write to the lock for now.  */
+	close(lock_file_fd);
+	lock_file_fd = 0;
+
+	have_lock = 1;
+	err = stack_uptodate(st);
+	if (err != 0)
+		goto done;
+
+	for (i = first, j = 0; i <= last; i++) {
+		struct strbuf subtab_file_name = STRBUF_INIT;
+		struct strbuf subtab_lock = STRBUF_INIT;
+		int sublock_file_fd = -1;
+
+		stack_filename(&subtab_file_name, st,
+			       reader_name(st->readers[i]));
+
+		strbuf_reset(&subtab_lock);
+		strbuf_addbuf(&subtab_lock, &subtab_file_name);
+		strbuf_addstr(&subtab_lock, ".lock");
+
+		sublock_file_fd = open(subtab_lock.buf,
+				       O_EXCL | O_CREAT | O_WRONLY, 0644);
+		if (sublock_file_fd > 0) {
+			close(sublock_file_fd);
+		} else if (sublock_file_fd < 0) {
+			if (errno == EEXIST) {
+				err = 1;
+			} else {
+				err = REFTABLE_IO_ERROR;
+			}
+		}
+
+		subtable_locks[j] = subtab_lock.buf;
+		delete_on_success[j] = subtab_file_name.buf;
+		j++;
+
+		if (err != 0)
+			goto done;
+	}
+
+	err = unlink(lock_file_name.buf);
+	if (err < 0)
+		goto done;
+	have_lock = 0;
+
+	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
+				   expiry);
+	/* Compaction + tombstones can create an empty table out of non-empty
+	 * tables. */
+	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
+	if (is_empty_table) {
+		err = 0;
+	}
+	if (err < 0)
+		goto done;
+
+	lock_file_fd =
+		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644);
+	if (lock_file_fd < 0) {
+		if (errno == EEXIST) {
+			err = 1;
+		} else {
+			err = REFTABLE_IO_ERROR;
+		}
+		goto done;
+	}
+	have_lock = 1;
+
+	format_name(&new_table_name, st->readers[first]->min_update_index,
+		    st->readers[last]->max_update_index);
+	strbuf_addstr(&new_table_name, ".ref");
+
+	stack_filename(&new_table_path, st, new_table_name.buf);
+
+	if (!is_empty_table) {
+		/* retry? */
+		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
+			goto done;
+		}
+	}
+
+	for (i = 0; i < first; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	if (!is_empty_table) {
+		strbuf_addbuf(&ref_list_contents, &new_table_name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+	for (i = last + 1; i < st->merged->stack_len; i++) {
+		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
+		strbuf_addstr(&ref_list_contents, "\n");
+	}
+
+	err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	err = close(lock_file_fd);
+	lock_file_fd = 0;
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+
+	err = rename(lock_file_name.buf, st->list_file);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
+		unlink(new_table_path.buf);
+		goto done;
+	}
+	have_lock = 0;
+
+	/* Reload the stack before deleting. On windows, we can only delete the
+	   files after we closed them.
+	*/
+	err = reftable_stack_reload_maybe_reuse(st, first < last);
+
+	listp = delete_on_success;
+	while (*listp) {
+		if (strcmp(*listp, new_table_path.buf)) {
+			unlink(*listp);
+		}
+		listp++;
+	}
+
+done:
+	free_names(delete_on_success);
+
+	listp = subtable_locks;
+	while (*listp) {
+		unlink(*listp);
+		listp++;
+	}
+	free_names(subtable_locks);
+	if (lock_file_fd > 0) {
+		close(lock_file_fd);
+		lock_file_fd = 0;
+	}
+	if (have_lock) {
+		unlink(lock_file_name.buf);
+	}
+	strbuf_release(&new_table_name);
+	strbuf_release(&new_table_path);
+	strbuf_release(&ref_list_contents);
+	strbuf_release(&temp_tab_file_name);
+	strbuf_release(&lock_file_name);
+	return err;
+}
+
+int reftable_stack_compact_all(struct reftable_stack *st,
+			       struct reftable_log_expiry_config *config)
+{
+	return stack_compact_range(st, 0, st->merged->stack_len - 1, config);
+}
+
+static int stack_compact_range_stats(struct reftable_stack *st, int first,
+				     int last,
+				     struct reftable_log_expiry_config *config)
+{
+	int err = stack_compact_range(st, first, last, config);
+	if (err > 0) {
+		st->stats.failures++;
+	}
+	return err;
+}
+
+static int segment_size(struct segment *s)
+{
+	return s->end - s->start;
+}
+
+int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n)
+{
+	struct segment *segs = reftable_calloc(sizeof(struct segment) * n);
+	int next = 0;
+	struct segment cur = { 0 };
+	int i = 0;
+
+	if (n == 0) {
+		*seglen = 0;
+		return segs;
+	}
+	for (i = 0; i < n; i++) {
+		int log = fastlog2(sizes[i]);
+		if (cur.log != log && cur.bytes > 0) {
+			struct segment fresh = {
+				.start = i,
+			};
+
+			segs[next++] = cur;
+			cur = fresh;
+		}
+
+		cur.log = log;
+		cur.end = i + 1;
+		cur.bytes += sizes[i];
+	}
+	segs[next++] = cur;
+	*seglen = next;
+	return segs;
+}
+
+struct segment suggest_compaction_segment(uint64_t *sizes, int n)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, sizes, n);
+	struct segment min_seg = {
+		.log = 64,
+	};
+	int i = 0;
+	for (i = 0; i < seglen; i++) {
+		if (segment_size(&segs[i]) == 1) {
+			continue;
+		}
+
+		if (segs[i].log < min_seg.log) {
+			min_seg = segs[i];
+		}
+	}
+
+	while (min_seg.start > 0) {
+		int prev = min_seg.start - 1;
+		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) {
+			break;
+		}
+
+		min_seg.start = prev;
+		min_seg.bytes += sizes[prev];
+	}
+
+	reftable_free(segs);
+	return min_seg;
+}
+
+static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
+{
+	uint64_t *sizes =
+		reftable_calloc(sizeof(uint64_t) * st->merged->stack_len);
+	int version = (st->config.hash_id == GIT_SHA1_HASH_ID) ? 1 : 2;
+	int overhead = header_size(version) - 1;
+	int i = 0;
+	for (i = 0; i < st->merged->stack_len; i++) {
+		sizes[i] = st->readers[i]->size - overhead;
+	}
+	return sizes;
+}
+
+int reftable_stack_auto_compact(struct reftable_stack *st)
+{
+	uint64_t *sizes = stack_table_sizes_for_compaction(st);
+	struct segment seg =
+		suggest_compaction_segment(sizes, st->merged->stack_len);
+	reftable_free(sizes);
+	if (segment_size(&seg) > 0)
+		return stack_compact_range_stats(st, seg.start, seg.end - 1,
+						 NULL);
+
+	return 0;
+}
+
+struct reftable_compaction_stats *
+reftable_stack_compaction_stats(struct reftable_stack *st)
+{
+	return &st->stats;
+}
+
+int reftable_stack_read_ref(struct reftable_stack *st, const char *refname,
+			    struct reftable_ref_record *ref)
+{
+	struct reftable_table tab = { NULL };
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+	return reftable_table_read_ref(&tab, refname, ref);
+}
+
+int reftable_stack_read_log(struct reftable_stack *st, const char *refname,
+			    struct reftable_log_record *log)
+{
+	struct reftable_iterator it = { NULL };
+	struct reftable_merged_table *mt = reftable_stack_merged_table(st);
+	int err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err)
+		goto done;
+
+	err = reftable_iterator_next_log(&it, log);
+	if (err)
+		goto done;
+
+	if (strcmp(log->refname, refname) ||
+	    reftable_log_record_is_deletion(log)) {
+		err = 1;
+		goto done;
+	}
+
+done:
+	if (err) {
+		reftable_log_record_release(log);
+	}
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int stack_check_addition(struct reftable_stack *st,
+				const char *new_tab_name)
+{
+	int err = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_table tab = { NULL };
+	struct reftable_ref_record *refs = NULL;
+	struct reftable_iterator it = { NULL };
+	int cap = 0;
+	int len = 0;
+	int i = 0;
+
+	if (st->config.skip_name_check)
+		return 0;
+
+	err = reftable_block_source_from_file(&src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, new_tab_name);
+	if (err < 0)
+		goto done;
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	if (err > 0) {
+		err = 0;
+		goto done;
+	}
+	if (err < 0)
+		goto done;
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		err = reftable_iterator_next_ref(&it, &ref);
+		if (err > 0) {
+			break;
+		}
+		if (err < 0)
+			goto done;
+
+		if (len >= cap) {
+			cap = 2 * cap + 1;
+			refs = reftable_realloc(refs, cap * sizeof(refs[0]));
+		}
+
+		refs[len++] = ref;
+	}
+
+	reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st));
+
+	err = validate_ref_record_addition(tab, refs, len);
+
+done:
+	for (i = 0; i < len; i++) {
+		reftable_ref_record_release(&refs[i]);
+	}
+
+	free(refs);
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	return err;
+}
+
+static int is_table_name(const char *s)
+{
+	const char *dot = strrchr(s, '.');
+	return dot && !strcmp(dot, ".ref");
+}
+
+static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max,
+				     const char *name)
+{
+	int err = 0;
+	uint64_t update_idx = 0;
+	struct reftable_block_source src = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct strbuf table_path = STRBUF_INIT;
+	stack_filename(&table_path, st, name);
+
+	err = reftable_block_source_from_file(&src, table_path.buf);
+	if (err < 0)
+		goto done;
+
+	err = reftable_new_reader(&rd, &src, name);
+	if (err < 0)
+		goto done;
+
+	update_idx = reftable_reader_max_update_index(rd);
+	reftable_reader_free(rd);
+
+	if (update_idx <= max) {
+		unlink(table_path.buf);
+	}
+done:
+	strbuf_release(&table_path);
+}
+
+static int reftable_stack_clean_locked(struct reftable_stack *st)
+{
+	uint64_t max = reftable_merged_table_max_update_index(
+		reftable_stack_merged_table(st));
+	DIR *dir = opendir(st->reftable_dir);
+	struct dirent *d = NULL;
+	if (!dir) {
+		return REFTABLE_IO_ERROR;
+	}
+
+	while ((d = readdir(dir))) {
+		int i = 0;
+		int found = 0;
+		if (!is_table_name(d->d_name))
+			continue;
+
+		for (i = 0; !found && i < st->readers_len; i++) {
+			found = !strcmp(reader_name(st->readers[i]), d->d_name);
+		}
+		if (found)
+			continue;
+
+		remove_maybe_stale_table(st, max, d->d_name);
+	}
+
+	closedir(dir);
+	return 0;
+}
+
+int reftable_stack_clean(struct reftable_stack *st)
+{
+	struct reftable_addition *add = NULL;
+	int err = reftable_stack_new_addition(&add, st);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(st);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_clean_locked(st);
+
+done:
+	reftable_addition_destroy(add);
+	return err;
+}
+
+int reftable_stack_print_directory(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_merged_table *merged = NULL;
+	struct reftable_table table = { NULL };
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	merged = reftable_stack_merged_table(stack);
+	reftable_table_from_merged_table(&table, merged);
+	err = reftable_table_print(&table);
+done:
+	reftable_stack_destroy(stack);
+	return err;
+}
diff --git a/reftable/stack.h b/reftable/stack.h
new file mode 100644
index 000000000000..f57005846e56
--- /dev/null
+++ b/reftable/stack.h
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef STACK_H
+#define STACK_H
+
+#include "system.h"
+#include "reftable-writer.h"
+#include "reftable-stack.h"
+
+struct reftable_stack {
+	char *list_file;
+	char *reftable_dir;
+	int disable_auto_compact;
+
+	struct reftable_write_options config;
+
+	struct reftable_reader **readers;
+	size_t readers_len;
+	struct reftable_merged_table *merged;
+	struct reftable_compaction_stats stats;
+};
+
+int read_lines(const char *filename, char ***lines);
+
+struct segment {
+	int start, end;
+	int log;
+	uint64_t bytes;
+};
+
+int fastlog2(uint64_t sz);
+struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n);
+struct segment suggest_compaction_segment(uint64_t *sizes, int n);
+
+#endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
new file mode 100644
index 000000000000..3bd78d2a2bc5
--- /dev/null
+++ b/reftable/stack_test.c
@@ -0,0 +1,940 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "stack.h"
+
+#include "system.h"
+
+#include "reftable-reader.h"
+#include "merged.h"
+#include "basics.h"
+#include "constants.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+
+#include <sys/types.h>
+#include <dirent.h>
+
+static void clear_dir(const char *dirname)
+{
+	struct strbuf path = STRBUF_INIT;
+	strbuf_addstr(&path, dirname);
+	remove_dir_recursively(&path, 0);
+	strbuf_release(&path);
+}
+
+static int count_dir_entries(const char *dirname)
+{
+	DIR *dir = opendir(dirname);
+	int len = 0;
+	struct dirent *d;
+	if (dir == NULL)
+		return 0;
+
+	while ((d = readdir(dir))) {
+		if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, "."))
+			continue;
+		len++;
+	}
+	closedir(dir);
+	return len;
+}
+
+static char *get_tmp_template(const char *prefix)
+{
+	const char *tmp = getenv("TMPDIR");
+	static char template[1024];
+	snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX",
+		 tmp ? tmp : "/tmp", prefix);
+	return template;
+}
+
+static char *get_tmp_dir(const char *prefix)
+{
+	char *dir = get_tmp_template(prefix);
+	EXPECT(mkdtemp(dir));
+	return dir;
+}
+
+static void test_read_file(void)
+{
+	char *fn = get_tmp_template(__FUNCTION__);
+	int fd = mkstemp(fn);
+	char out[1024] = "line1\n\nline2\nline3";
+	int n, err;
+	char **names = NULL;
+	char *want[] = { "line1", "line2", "line3" };
+	int i = 0;
+
+	EXPECT(fd > 0);
+	n = write(fd, out, strlen(out));
+	EXPECT(n == strlen(out));
+	err = close(fd);
+	EXPECT(err >= 0);
+
+	err = read_lines(fn, &names);
+	EXPECT_ERR(err);
+
+	for (i = 0; names[i]; i++) {
+		EXPECT(0 == strcmp(want[i], names[i]));
+	}
+	free_names(names);
+	remove(fn);
+}
+
+static void test_parse_names(void)
+{
+	char buf[] = "line\n";
+	char **names = NULL;
+	parse_names(buf, strlen(buf), &names);
+
+	EXPECT(NULL != names[0]);
+	EXPECT(0 == strcmp(names[0], "line"));
+	EXPECT(NULL == names[1]);
+	free_names(names);
+}
+
+static void test_names_equal(void)
+{
+	char *a[] = { "a", "b", "c", NULL };
+	char *b[] = { "a", "b", "d", NULL };
+	char *c[] = { "a", "b", NULL };
+
+	EXPECT(names_equal(a, a));
+	EXPECT(!names_equal(a, b));
+	EXPECT(!names_equal(a, c));
+}
+
+static int write_test_ref(struct reftable_writer *wr, void *arg)
+{
+	struct reftable_ref_record *ref = arg;
+	reftable_writer_set_limits(wr, ref->update_index, ref->update_index);
+	return reftable_writer_add_ref(wr, ref);
+}
+
+struct write_log_arg {
+	struct reftable_log_record *log;
+	uint64_t update_index;
+};
+
+static int write_test_log(struct reftable_writer *wr, void *arg)
+{
+	struct write_log_arg *wla = arg;
+
+	reftable_writer_set_limits(wr, wla->update_index, wla->update_index);
+	return reftable_writer_add_log(wr, wla->log);
+}
+
+static void test_reftable_stack_add_one(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_uptodate(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL;
+	struct reftable_stack *st2 = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "branch2",
+		.update_index = 2,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+
+
+	/* simulate multi-process access to the same stack
+	   by creating two stacks for the same directory.
+	 */
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st1, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_LOCK_ERROR);
+
+	err = reftable_stack_reload(st2);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st2, &write_test_ref, &ref2);
+	EXPECT_ERR(err);
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_transaction_api(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_addition *add = NULL;
+
+	struct reftable_ref_record ref = {
+		.refname = "HEAD",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record dest = { NULL };
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_new_addition(&add, st);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_add(add, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	err = reftable_addition_commit(add);
+	EXPECT_ERR(err);
+
+	reftable_addition_destroy(add);
+
+	err = reftable_stack_read_ref(st, ref.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(REFTABLE_REF_SYMREF == dest.value_type);
+	EXPECT(0 == strcmp("master", dest.value.symref));
+
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_validate_refname(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	int i;
+	struct reftable_ref_record ref = {
+		.refname = "a/b",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	char *additions[] = { "a", "a/b/c" };
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < ARRAY_SIZE(additions); i++) {
+		struct reftable_ref_record ref = {
+			.refname = additions[i],
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT(err == REFTABLE_NAME_CONFLICT);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static int write_error(struct reftable_writer *wr, void *arg)
+{
+	return *((int *)arg);
+}
+
+static void test_reftable_stack_update_index_check(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record ref1 = {
+		.refname = "name1",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+	struct reftable_ref_record ref2 = {
+		.refname = "name2",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "master",
+	};
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref1);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref2);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_lock_failure(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err, i;
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) {
+		err = reftable_stack_add(st, &write_error, &i);
+		EXPECT(err == i);
+	}
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_add(void)
+{
+	int i = 0;
+	int err = 0;
+	struct reftable_write_options cfg = {
+		.exact_log_message = 1,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+	st->disable_auto_compact = 1;
+
+	for (i = 0; i < N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		refs[i].value_type = REFTABLE_REF_VAL1;
+		refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
+		set_test_hash(refs[i].value.val1, i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = N + i + 1;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+
+		logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		struct reftable_ref_record dest = { NULL };
+
+		int err = reftable_stack_read_ref(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_ref_record_equal(&dest, refs + i,
+						 GIT_SHA1_RAWSZ));
+		reftable_ref_record_release(&dest);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct reftable_log_record dest = { NULL };
+		int err = reftable_stack_read_log(st, refs[i].refname, &dest);
+		EXPECT_ERR(err);
+		EXPECT(reftable_log_record_equal(&dest, logs + i,
+						 GIT_SHA1_RAWSZ));
+		reftable_log_record_release(&dest);
+	}
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_log_normalize(void)
+{
+	int err = 0;
+	struct reftable_write_options cfg = {
+		0,
+	};
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+
+	uint8_t h1[GIT_SHA1_RAWSZ] = { 0x01 }, h2[GIT_SHA1_RAWSZ] = { 0x02 };
+
+	struct reftable_log_record input = { .refname = "branch",
+					     .update_index = 1,
+					     .value_type = REFTABLE_LOG_UPDATE,
+					     .update = {
+						     .new_hash = h1,
+						     .old_hash = h2,
+					     } };
+	struct reftable_log_record dest = {
+		.update_index = 0,
+	};
+	struct write_log_arg arg = {
+		.log = &input,
+		.update_index = 1,
+	};
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	input.update.message = "one\ntwo";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	input.update.message = "one";
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "one\n"));
+
+	input.update.message = "two\n";
+	arg.update_index = 2;
+	err = reftable_stack_add(st, &write_test_log, &arg);
+	EXPECT_ERR(err);
+	err = reftable_stack_read_log(st, input.refname, &dest);
+	EXPECT_ERR(err);
+	EXPECT(0 == strcmp(dest.update.message, "two\n"));
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	reftable_log_record_release(&dest);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_tombstone(void)
+{
+	int i = 0;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	struct reftable_ref_record refs[2] = { { NULL } };
+	struct reftable_log_record logs[2] = { { NULL } };
+	int N = ARRAY_SIZE(refs);
+	struct reftable_ref_record dest = { NULL };
+	struct reftable_log_record log_dest = { NULL };
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	/* even entries add the refs, odd entries delete them. */
+	for (i = 0; i < N; i++) {
+		const char *buf = "branch";
+		refs[i].refname = xstrdup(buf);
+		refs[i].update_index = i + 1;
+		if (i % 2 == 0) {
+			refs[i].value_type = REFTABLE_REF_VAL1;
+			refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ);
+			set_test_hash(refs[i].value.val1, i);
+		}
+
+		logs[i].refname = xstrdup(buf);
+		/* update_index is part of the key. */
+		logs[i].update_index = 42;
+		if (i % 2 == 0) {
+			logs[i].value_type = REFTABLE_LOG_UPDATE;
+			logs[i].update.new_hash =
+				reftable_malloc(GIT_SHA1_RAWSZ);
+			set_test_hash(logs[i].update.new_hash, i);
+			logs[i].update.email = xstrdup("identity@invalid");
+		}
+	}
+	for (i = 0; i < N; i++) {
+		int err = reftable_stack_add(st, &write_test_ref, &refs[i]);
+		EXPECT_ERR(err);
+	}
+
+	for (i = 0; i < N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_log_record_release(&log_dest);
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st, "branch", &dest);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, "branch", &log_dest);
+	EXPECT(err == 1);
+	reftable_ref_record_release(&dest);
+	reftable_log_record_release(&log_dest);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i < N; i++) {
+		reftable_ref_record_release(&refs[i]);
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_hash_id(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+
+	struct reftable_ref_record ref = {
+		.refname = "master",
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = "target",
+		.update_index = 1,
+	};
+	struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_HASH_ID };
+	struct reftable_stack *st32 = NULL;
+	struct reftable_write_options cfg_default = { 0 };
+	struct reftable_stack *st_default = NULL;
+	struct reftable_ref_record dest = { NULL };
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_test_ref, &ref);
+	EXPECT_ERR(err);
+
+	/* can't read it with the wrong hash ID. */
+	err = reftable_new_stack(&st32, dir, cfg32);
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+
+	/* check that we can read it back with default config too. */
+	err = reftable_new_stack(&st_default, dir, cfg_default);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_ref(st_default, "master", &dest);
+	EXPECT_ERR(err);
+
+	EXPECT(reftable_ref_record_equal(&ref, &dest, GIT_SHA1_RAWSZ));
+	reftable_ref_record_release(&dest);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st_default);
+	clear_dir(dir);
+}
+
+static void test_log2(void)
+{
+	EXPECT(1 == fastlog2(3));
+	EXPECT(2 == fastlog2(4));
+	EXPECT(2 == fastlog2(5));
+}
+
+static void test_sizes_to_segments(void)
+{
+	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
+	/* .................0  1  2  3  4  5 */
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(segs[2].log == 3);
+	EXPECT(segs[2].start == 5);
+	EXPECT(segs[2].end == 6);
+
+	EXPECT(segs[1].log == 2);
+	EXPECT(segs[1].start == 2);
+	EXPECT(segs[1].end == 5);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_empty(void)
+{
+	int seglen = 0;
+	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
+	EXPECT(seglen == 0);
+	reftable_free(segs);
+}
+
+static void test_sizes_to_segments_all_equal(void)
+{
+	uint64_t sizes[] = { 5, 5 };
+
+	int seglen = 0;
+	struct segment *segs =
+		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
+	EXPECT(seglen == 1);
+	EXPECT(segs[0].start == 0);
+	EXPECT(segs[0].end == 2);
+	reftable_free(segs);
+}
+
+static void test_suggest_compaction_segment(void)
+{
+	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	/* .................0    1    2  3   4  5  6 */
+	struct segment min =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(min.start == 2);
+	EXPECT(min.end == 7);
+}
+
+static void test_suggest_compaction_segment_nothing(void)
+{
+	uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 };
+	struct segment result =
+		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
+	EXPECT(result.start == result.end);
+}
+
+static void test_reflog_expire(void)
+{
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	struct reftable_log_record logs[20] = { { NULL } };
+	int N = ARRAY_SIZE(logs) - 1;
+	int i = 0;
+	int err;
+	struct reftable_log_expiry_config expiry = {
+		.time = 10,
+	};
+	struct reftable_log_record log = { NULL };
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 1; i <= N; i++) {
+		char buf[256];
+		snprintf(buf, sizeof(buf), "branch%02d", i);
+
+		logs[i].refname = xstrdup(buf);
+		logs[i].update_index = i;
+		logs[i].value_type = REFTABLE_LOG_UPDATE;
+		logs[i].update.time = i;
+		logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ);
+		logs[i].update.email = xstrdup("identity@invalid");
+		set_test_hash(logs[i].update.new_hash, i);
+	}
+
+	for (i = 1; i <= N; i++) {
+		struct write_log_arg arg = {
+			.log = &logs[i],
+			.update_index = reftable_stack_next_update_index(st),
+		};
+		int err = reftable_stack_add(st, &write_test_log, &arg);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_stack_compact_all(st, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[9].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[11].refname, &log);
+	EXPECT_ERR(err);
+
+	expiry.min_update_index = 15;
+	err = reftable_stack_compact_all(st, &expiry);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_read_log(st, logs[14].refname, &log);
+	EXPECT(err == 1);
+
+	err = reftable_stack_read_log(st, logs[16].refname, &log);
+	EXPECT_ERR(err);
+
+	/* cleanup */
+	reftable_stack_destroy(st);
+	for (i = 0; i <= N; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	clear_dir(dir);
+	reftable_log_record_release(&log);
+}
+
+static int write_nothing(struct reftable_writer *wr, void *arg)
+{
+	reftable_writer_set_limits(wr, 1, 1);
+	return 0;
+}
+
+static void test_empty_add(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	int err;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	struct reftable_stack *st2 = NULL;
+
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_add(st, &write_nothing, NULL);
+	EXPECT_ERR(err);
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+	clear_dir(dir);
+	reftable_stack_destroy(st);
+	reftable_stack_destroy(st2);
+}
+
+static void test_reftable_stack_auto_compaction(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	int err, i;
+	int N = 100;
+
+	err = reftable_new_stack(&st, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+
+		EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i));
+	}
+
+	EXPECT(reftable_stack_compaction_stats(st)->entries_written <
+	       (uint64_t)(N * fastlog2(N)));
+
+	reftable_stack_destroy(st);
+	clear_dir(dir);
+}
+
+static void test_reftable_stack_compaction_concurrent(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL, *st2 = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	int err, i;
+	int N = 3;
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st1),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st1, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st1, NULL);
+	EXPECT_ERR(err);
+
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+
+	EXPECT(count_dir_entries(dir) == 2);
+	clear_dir(dir);
+}
+
+static void unclean_stack_close(struct reftable_stack *st)
+{
+	// break abstraction boundary to simulate unclean shutdown.
+	int i = 0;
+	for (; i < st->readers_len; i++) {
+		reftable_reader_free(st->readers[i]);
+	}
+	st->readers_len = 0;
+	FREE_AND_NULL(st->readers);
+}
+
+static void test_reftable_stack_compaction_concurrent_clean(void)
+{
+	struct reftable_write_options cfg = { 0 };
+	struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL;
+	char *dir = get_tmp_dir(__FUNCTION__);
+
+	int err, i;
+	int N = 3;
+
+	err = reftable_new_stack(&st1, dir, cfg);
+	EXPECT_ERR(err);
+
+	for (i = 0; i < N; i++) {
+		char name[100];
+		struct reftable_ref_record ref = {
+			.refname = name,
+			.update_index = reftable_stack_next_update_index(st1),
+			.value_type = REFTABLE_REF_SYMREF,
+			.value.symref = "master",
+		};
+		snprintf(name, sizeof(name), "branch%04d", i);
+
+		err = reftable_stack_add(st1, &write_test_ref, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_new_stack(&st2, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_compact_all(st1, NULL);
+	EXPECT_ERR(err);
+
+	unclean_stack_close(st1);
+	unclean_stack_close(st2);
+
+	err = reftable_new_stack(&st3, dir, cfg);
+	EXPECT_ERR(err);
+
+	err = reftable_stack_clean(st3);
+	EXPECT_ERR(err);
+	EXPECT(count_dir_entries(dir) == 2);
+
+	reftable_stack_destroy(st1);
+	reftable_stack_destroy(st2);
+	reftable_stack_destroy(st3);
+
+	clear_dir(dir);
+}
+
+int stack_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_empty_add);
+	RUN_TEST(test_log2);
+	RUN_TEST(test_names_equal);
+	RUN_TEST(test_parse_names);
+	RUN_TEST(test_read_file);
+	RUN_TEST(test_reflog_expire);
+	RUN_TEST(test_reftable_stack_add);
+	RUN_TEST(test_reftable_stack_add_one);
+	RUN_TEST(test_reftable_stack_auto_compaction);
+	RUN_TEST(test_reftable_stack_compaction_concurrent);
+	RUN_TEST(test_reftable_stack_compaction_concurrent_clean);
+	RUN_TEST(test_reftable_stack_hash_id);
+	RUN_TEST(test_reftable_stack_lock_failure);
+	RUN_TEST(test_reftable_stack_log_normalize);
+	RUN_TEST(test_reftable_stack_tombstone);
+	RUN_TEST(test_reftable_stack_transaction_api);
+	RUN_TEST(test_reftable_stack_update_index_check);
+	RUN_TEST(test_reftable_stack_uptodate);
+	RUN_TEST(test_reftable_stack_validate_refname);
+	RUN_TEST(test_sizes_to_segments);
+	RUN_TEST(test_sizes_to_segments_all_equal);
+	RUN_TEST(test_sizes_to_segments_empty);
+	RUN_TEST(test_suggest_compaction_segment);
+	RUN_TEST(test_suggest_compaction_segment_nothing);
+	return 0;
+}
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index c8db6852c353..996da85f7b5c 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -10,6 +10,7 @@ int cmd__reftable(int argc, const char **argv)
 	record_test_main(argc, argv);
 	refname_test_main(argc, argv);
 	readwrite_test_main(argc, argv);
+	stack_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 22/28] reftable: add dump utility
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (20 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 21/28] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 23/28] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
                               ` (6 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

provide a command-line utility for inspecting individual tables, and
inspecting a complete ref database

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/dump.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 reftable/dump.c

diff --git a/reftable/dump.c b/reftable/dump.c
new file mode 100644
index 000000000000..c4201f826b0c
--- /dev/null
+++ b/reftable/dump.c
@@ -0,0 +1,100 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "reftable-blocksource.h"
+#include "reftable-error.h"
+#include "reftable-merged.h"
+#include "reftable-record.h"
+#include "reftable-tests.h"
+#include "reftable-writer.h"
+#include "reftable-iterator.h"
+#include "reftable-reader.h"
+#include "reftable-stack.h"
+#include "reftable-generic.h"
+
+static int compact_stack(const char *stackdir)
+{
+	struct reftable_stack *stack = NULL;
+	struct reftable_write_options cfg = { 0 };
+
+	int err = reftable_new_stack(&stack, stackdir, cfg);
+	if (err < 0)
+		goto done;
+
+	err = reftable_stack_compact_all(stack, NULL);
+	if (err < 0)
+		goto done;
+done:
+	if (stack) {
+		reftable_stack_destroy(stack);
+	}
+	return err;
+}
+
+static void print_help(void)
+{
+	printf("usage: dump [-cst] arg\n\n"
+	       "options: \n"
+	       "  -c compact\n"
+	       "  -t dump table\n"
+	       "  -s dump stack\n"
+	       "  -h this help\n"
+	       "\n");
+}
+
+int reftable_dump_main(int argc, char *const *argv)
+{
+	int err = 0;
+	int opt_dump_table = 0;
+	int opt_dump_stack = 0;
+	int opt_compact = 0;
+	const char *arg = NULL, *argv0 = argv[0];
+
+	for (; argc > 1; argv++, argc--)
+		if (*argv[1] != '-')
+			break;
+		else if (!strcmp("-t", argv[1]))
+			opt_dump_table = 1;
+		else if (!strcmp("-s", argv[1]))
+			opt_dump_stack = 1;
+		else if (!strcmp("-c", argv[1]))
+			opt_compact = 1;
+		else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) {
+			print_help();
+			return 2;
+		}
+
+	if (argc != 2) {
+		fprintf(stderr, "need argument\n");
+		print_help();
+		return 2;
+	}
+
+	arg = argv[1];
+
+	if (opt_dump_table) {
+		err = reftable_reader_print_file(arg);
+	} else if (opt_dump_stack) {
+		err = reftable_stack_print_directory(arg);
+	} else if (opt_compact) {
+		err = compact_stack(arg);
+	}
+
+	if (err < 0) {
+		fprintf(stderr, "%s: %s: %s\n", argv0, arg,
+			reftable_error_str(err));
+		return 1;
+	}
+	return 0;
+}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 23/28] Reftable support for git-core
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (21 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 22/28] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-20 22:44               ` Junio C Hamano
  2021-05-04 17:24               ` Andrzej Hunt
  2021-04-19 11:37             ` [PATCH v7 24/28] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
                               ` (5 subsequent siblings)
  28 siblings, 2 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a reftable-powered
ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or setting
GIT_TEST_REFTABLE in the environment.

Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Junio Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
Co-authored-by: Jeff King <peff@peff.net>
---
 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    1 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/stash.c                               |    8 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   26 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1616 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  271 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 22 files changed, 2053 insertions(+), 34 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdcad..82c5940f1434 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,12 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1].  Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
++
+extensions.refStorage::
+	Specify the ref storage mechanism to use.  The acceptable values are `files` and
+	`reftable`.  If not specified, `files` is assumed.  It is an error to specify
+	this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1].  Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ffde..725762358333 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees/<id>/config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index f78d1cbc2d3c..cffcf3496978 100644
--- a/Makefile
+++ b/Makefile
@@ -978,6 +978,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
diff --git a/builtin/clone.c b/builtin/clone.c
index f6b0c48beda5..bb2509fc1268 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1148,7 +1148,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1299,7 +1299,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo, 1);
+		initialize_repository_version(hash_algo, 1,
+					      default_ref_storage());
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index ff06fe4b9a49..31b718259a64 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -167,12 +167,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo, int reinit)
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
 
-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 ||
+	    !strcmp(ref_storage_format, "reftable"))
 		repo_version = GIT_REPO_VERSION_READ;
 
 	/* This forces creation of new config file */
@@ -225,6 +227,7 @@ static int create_default_files(const char *template_path,
 	is_bare_repository_cfg = init_is_bare_repository || !work_tree;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	the_repository->ref_storage_format = xstrdup(fmt->ref_storage);
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -234,6 +237,24 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
+	/*
+	 * Check to see if .git/HEAD exists; this must happen before
+	 * initializing the ref db, because we want to see if there is an
+	 * existing HEAD.
+	 */
+	path = git_path_buf(&buf, "HEAD");
+	reinit = (!access(path, R_OK) ||
+		  readlink(path, junk, sizeof(junk) - 1) != -1);
+
+	/*
+	 * refs/heads is a file when using reftable. We can't reinitialize with
+	 * a reftable because it will overwrite HEAD
+	 */
+	if (reinit && (!strcmp(fmt->ref_storage, "reftable")) ==
+			      is_directory(git_path_buf(&buf, "refs/heads"))) {
+		die("cannot switch ref storage format.");
+	}
+
 	/*
 	 * We need to create a "refs" dir in any case so that older
 	 * versions of git can tell that this is a repository.
@@ -248,9 +269,6 @@ static int create_default_files(const char *template_path,
 	 * Point the HEAD symref to the initial branch with if HEAD does
 	 * not yet exist.
 	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
 	if (!reinit) {
 		char *ref;
 
@@ -267,7 +285,7 @@ static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo, 0);
+	initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
@@ -382,7 +400,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
 
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash, const char *initial_branch,
-	    unsigned int flags)
+	    const char *ref_storage_format, unsigned int flags)
 {
 	int reinit;
 	int exist_ok = flags & INIT_DB_EXIST_OK;
@@ -421,6 +439,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
 	check_repository_format(&repo_fmt);
+	repo_fmt.ref_storage = xstrdup(ref_storage_format);
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
@@ -475,6 +494,9 @@ int init_db(const char *git_dir, const char *real_git_dir,
 		git_config_set("receive.denyNonFastforwards", "true");
 	}
 
+	if (!strcmp(ref_storage_format, "reftable"))
+		git_config_set("extensions.refStorage", ref_storage_format);
+
 	if (!(flags & INIT_DB_QUIET)) {
 		int len = strlen(git_dir);
 
@@ -548,6 +570,7 @@ static const char *const init_db_usage[] = {
 int cmd_init_db(int argc, const char **argv, const char *prefix)
 {
 	const char *git_dir;
+	const char *ref_storage_format = default_ref_storage();
 	const char *real_git_dir = NULL;
 	const char *work_tree;
 	const char *template_dir = NULL;
@@ -556,15 +579,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	const char *initial_branch = NULL;
 	int hash_algo = GIT_HASH_UNKNOWN;
 	const struct option init_db_options[] = {
-		OPT_STRING(0, "template", &template_dir, N_("template-directory"),
-				N_("directory from which templates will be used")),
+		OPT_STRING(0, "template", &template_dir,
+			   N_("template-directory"),
+			   N_("directory from which templates will be used")),
 		OPT_SET_INT(0, "bare", &is_bare_repository_cfg,
-				N_("create a bare repository"), 1),
+			    N_("create a bare repository"), 1),
 		{ OPTION_CALLBACK, 0, "shared", &init_shared_repository,
-			N_("permissions"),
-			N_("specify that the git repository is to be shared amongst several users"),
-			PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0},
+		  N_("permissions"),
+		  N_("specify that the git repository is to be shared amongst several users"),
+		  PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 },
 		OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET),
+		OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"),
+			   N_("the ref storage format to use")),
 		OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 			   N_("separate git dir from working tree")),
 		OPT_STRING('b', "initial-branch", &initial_branch, N_("name"),
@@ -707,10 +733,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 	}
 
 	UNLEAK(real_git_dir);
+	UNLEAK(ref_storage_format);
 	UNLEAK(git_dir);
 	UNLEAK(work_tree);
 
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, hash_algo,
-		       initial_branch, flags);
+		       initial_branch, ref_storage_format, flags);
 }
diff --git a/builtin/stash.c b/builtin/stash.c
index c56fed33546b..2066b85a3903 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -207,10 +207,16 @@ static int get_stash_info(struct stash_info *info, int argc, const char **argv)
 static int do_clear_stash(void)
 {
 	struct object_id obj;
+	int result;
 	if (get_oid(ref_stash, &obj))
 		return 0;
 
-	return delete_ref(NULL, ref_stash, &obj, 0);
+	result = delete_ref(NULL, ref_stash, &obj, 0);
+
+	/* Ignore error; this is necessary for reftable, which keeps reflogs
+	 * even when refs are deleted. */
+	delete_reflog(ref_stash);
+	return result;
 }
 
 static int clear_stash(int argc, const char **argv, const char *prefix)
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 877145349381..fd389c86f6fa 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -13,6 +13,7 @@
 #include "utf8.h"
 #include "worktree.h"
 #include "quote.h"
+#include "../refs/refs-internal.h"
 
 static const char * const worktree_usage[] = {
 	N_("git worktree add [<options>] <path> [<commit-ish>]"),
@@ -330,9 +331,29 @@ static int add_worktree(const char *path, const char *refname,
 	 * worktree.
 	 */
 	strbuf_reset(&sb);
-	strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
-	write_file(sb.buf, "%s", oid_to_hex(&null_oid));
-	strbuf_reset(&sb);
+	if (get_main_ref_store(the_repository)->be == &refs_be_reftable) {
+		/* XXX this is cut & paste from reftable_init_db. */
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf);
+		write_file(sb.buf, "this repository uses the reftable format");
+		strbuf_reset(&sb);
+
+		strbuf_addf(&sb, "%s/reftable", sb_repo.buf);
+		safe_create_dir(sb.buf, 1);
+		strbuf_reset(&sb);
+	} else {
+		strbuf_addf(&sb, "%s/HEAD", sb_repo.buf);
+		write_file(sb.buf, "%s", oid_to_hex(&null_oid));
+		strbuf_reset(&sb);
+	}
+
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
 
diff --git a/cache.h b/cache.h
index 148d9ab5f185..f299535b0b03 100644
--- a/cache.h
+++ b/cache.h
@@ -629,9 +629,10 @@ int path_inside_repo(const char *prefix, const char *path);
 #define INIT_DB_EXIST_OK 0x0002
 
 int init_db(const char *git_dir, const char *real_git_dir,
-	    const char *template_dir, int hash_algo,
-	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo, int reinit);
+	    const char *template_dir, int hash_algo, const char *initial_branch,
+	    const char *ref_storage_format, unsigned int flags);
+void initialize_repository_version(int hash_algo, int reinit,
+				   const char *ref_storage_format);
 
 void sanitize_stdfds(void);
 int daemonize(void);
@@ -1045,6 +1046,7 @@ struct repository_format {
 	int is_bare;
 	int hash_algo;
 	char *work_tree;
+	char *ref_storage;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
 };
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e023a..96c1268693f1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -701,7 +701,7 @@ vcxproj:
 	# Make .vcxproj files and add them
 	unset QUIET_GEN QUIET_BUILT_IN; \
 	perl contrib/buildsystems/generate -g Vcxproj
-	git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj
+	git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj
 
 	# Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file
 	(echo '<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">' && \
diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm
index d2584450ba17..1a25789d2851 100644
--- a/contrib/buildsystems/Generators/Vcxproj.pm
+++ b/contrib/buildsystems/Generators/Vcxproj.pm
@@ -77,7 +77,7 @@ sub createProject {
     my $libs_release = "\n    ";
     my $libs_debug = "\n    ";
     if (!$static_library) {
-      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
+      $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}}));
       $libs_debug = $libs_release;
       $libs_debug =~ s/zlib\.lib/zlibd\.lib/g;
       $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g;
@@ -232,6 +232,7 @@ sub createProject {
 EOM
     if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') {
       my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"};
+      my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"};
       my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"};
 
       print F << "EOM";
@@ -241,6 +242,14 @@ sub createProject {
       <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
     </ProjectReference>
 EOM
+      if (!($name =~ /xdiff|libreftable/)) {
+        print F << "EOM";
+    <ProjectReference Include="$cdup\\reftable\\libreftable\\libreftable.vcxproj">
+      <Project>$uuid_libreftable</Project>
+      <ReferenceOutputAssembly>false</ReferenceOutputAssembly>
+    </ProjectReference>
+EOM
+      }
       if (!($name =~ 'xdiff')) {
         print F << "EOM";
     <ProjectReference Include="$cdup\\xdiff\\lib\\xdiff_lib.vcxproj">
diff --git a/refs.c b/refs.c
index 8873854a44fb..7292c80a26ab 100644
--- a/refs.c
+++ b/refs.c
@@ -19,10 +19,15 @@
 #include "repository.h"
 #include "sigchain.h"
 
+const char *default_ref_storage(void)
+{
+	return git_env_bool("GIT_TEST_REFTABLE", 0) ? "reftable" : "files";
+}
+
 /*
  * List of all available backends
  */
-static struct ref_storage_be *refs_backends = &refs_be_files;
+static struct ref_storage_be *refs_backends = &refs_be_reftable;
 
 static struct ref_storage_be *find_ref_storage_backend(const char *name)
 {
@@ -1875,13 +1880,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map,
  * Create, record, and return a ref_store instance for the specified
  * gitdir.
  */
-static struct ref_store *ref_store_init(const char *gitdir,
+static struct ref_store *ref_store_init(const char *gitdir, const char *be_name,
 					unsigned int flags)
 {
-	const char *be_name = "files";
-	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_storage_be *be;
 	struct ref_store *refs;
 
+	be = find_ref_storage_backend(be_name);
 	if (!be)
 		BUG("reference backend %s is unknown", be_name);
 
@@ -1897,7 +1902,11 @@ struct ref_store *get_main_ref_store(struct repository *r)
 	if (!r->gitdir)
 		BUG("attempting to get main_ref_store outside of repository");
 
-	r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS);
+	r->refs_private = ref_store_init(r->gitdir,
+					 r->ref_storage_format ?
+						       r->ref_storage_format :
+						       default_ref_storage(),
+					 REF_STORE_ALL_CAPS);
 	r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
 	return r->refs_private;
 }
@@ -1953,7 +1962,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 		goto done;
 
 	/* assume that add_submodule_odb() has been called */
-	refs = ref_store_init(submodule_sb.buf,
+	refs = ref_store_init(submodule_sb.buf, default_ref_storage(),
 			      REF_STORE_READ | REF_STORE_ODB);
 	register_ref_store_map(&submodule_ref_stores, "submodule",
 			       refs, submodule);
@@ -1967,6 +1976,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule)
 
 struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 {
+	const char *format = default_ref_storage();
 	struct ref_store *refs;
 	const char *id;
 
@@ -1980,9 +1990,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt)
 
 	if (wt->id)
 		refs = ref_store_init(git_common_path("worktrees/%s", wt->id),
-				      REF_STORE_ALL_CAPS);
+				      format, REF_STORE_ALL_CAPS);
 	else
-		refs = ref_store_init(get_git_common_dir(),
+		refs = ref_store_init(get_git_common_dir(), format,
 				      REF_STORE_ALL_CAPS);
 
 	if (refs)
diff --git a/refs.h b/refs.h
index 48970dfc7e0f..5a6d4ca9fa8d 100644
--- a/refs.h
+++ b/refs.h
@@ -11,6 +11,9 @@ struct string_list;
 struct string_list_item;
 struct worktree;
 
+/* Returns the ref storage backend to use by default. */
+const char *default_ref_storage(void);
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index a31c1f465beb..7b427b58f61f 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -676,6 +676,7 @@ struct ref_storage_be {
 };
 
 extern struct ref_storage_be refs_be_files;
+extern struct ref_storage_be refs_be_reftable;
 extern struct ref_storage_be refs_be_packed;
 
 /*
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
new file mode 100644
index 000000000000..55d053e5ca65
--- /dev/null
+++ b/refs/reftable-backend.c
@@ -0,0 +1,1616 @@
+#include "../cache.h"
+#include "../chdir-notify.h"
+#include "../config.h"
+#include "../iterator.h"
+#include "../lockfile.h"
+#include "../refs.h"
+#include "../reftable/reftable-stack.h"
+#include "../reftable/reftable-record.h"
+#include "../reftable/reftable-error.h"
+#include "../reftable/reftable-blocksource.h"
+#include "../reftable/reftable-reader.h"
+#include "../reftable/reftable-iterator.h"
+#include "../reftable/reftable-merged.h"
+#include "../reftable/reftable-generic.h"
+#include "../worktree.h"
+#include "refs-internal.h"
+
+extern struct ref_storage_be refs_be_reftable;
+
+struct git_reftable_ref_store {
+	struct ref_store base;
+	unsigned int store_flags;
+
+	int err;
+	char *repo_dir;
+
+	char *reftable_dir;
+	char *worktree_reftable_dir;
+
+	struct reftable_stack *main_stack;
+	struct reftable_stack *worktree_stack;
+};
+
+static struct reftable_stack *stack_for(struct git_reftable_ref_store *store,
+					const char *refname)
+{
+	if (store->worktree_stack == NULL || refname == NULL)
+		return store->main_stack;
+
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+		return store->worktree_stack;
+	default:
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return store->main_stack;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type);
+
+static void clear_reftable_log_record(struct reftable_log_record *log)
+{
+	log->refname = NULL;
+	switch (log->value_type) {
+	case REFTABLE_LOG_UPDATE:
+		log->update.old_hash = NULL;
+		log->update.new_hash = NULL;
+		log->update.message = NULL;
+		break;
+	case REFTABLE_LOG_DELETION:
+		break;
+	}
+	reftable_log_record_release(log);
+}
+
+static void fill_reftable_log_record(struct reftable_log_record *log)
+{
+	const char *info = git_committer_info(0);
+	struct ident_split split = { NULL };
+	int result = split_ident_line(&split, info, strlen(info));
+	int sign = 1;
+	assert(0 == result);
+
+	reftable_log_record_release(log);
+	log->value_type = REFTABLE_LOG_UPDATE;
+	log->update.name =
+		xstrndup(split.name_begin, split.name_end - split.name_begin);
+	log->update.email =
+		xstrndup(split.mail_begin, split.mail_end - split.mail_begin);
+	log->update.time = atol(split.date_begin);
+	if (*split.tz_begin == '-') {
+		sign = -1;
+		split.tz_begin++;
+	}
+	if (*split.tz_begin == '+') {
+		sign = 1;
+		split.tz_begin++;
+	}
+
+	log->update.tz_offset = sign * atoi(split.tz_begin);
+}
+
+static int has_suffix(struct strbuf *b, const char *suffix)
+{
+	size_t len = strlen(suffix);
+
+	if (len > b->len) {
+		return 0;
+	}
+
+	return 0 == strncmp(b->buf + b->len - len, suffix, len);
+}
+
+/* trims the last path component of b. Returns -1 if it is not
+ * present, or 0 on success
+ */
+static int trim_component(struct strbuf *b)
+{
+	char *last;
+	last = strrchr(b->buf, '/');
+	if (!last)
+		return -1;
+	strbuf_setlen(b, last - b->buf);
+	return 0;
+}
+
+/* Returns whether `b` is a worktree path, trimming it to the gitdir
+ */
+static int is_worktree(struct strbuf *b)
+{
+	if (trim_component(b) < 0) {
+		return 0;
+	}
+	if (!has_suffix(b, "/worktrees")) {
+		return 0;
+	}
+	trim_component(b);
+	return 1;
+}
+
+static struct ref_store *git_reftable_ref_store_create(const char *path,
+						       unsigned int store_flags)
+{
+	struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs));
+	struct ref_store *ref_store = (struct ref_store *)refs;
+	struct reftable_write_options cfg = {
+		.block_size = 4096,
+		.hash_id = the_hash_algo->format_id,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	const char *gitdir = path;
+	struct strbuf wt_buf = STRBUF_INIT;
+	int wt = 0;
+
+	strbuf_addstr(&wt_buf, path);
+
+	/* this is clumsy, but the official worktree functions (eg.
+	 * get_worktrees()) function will try to initialize a ref storage
+	 * backend, leading to infinite recursion.  */
+	wt = is_worktree(&wt_buf);
+	if (wt) {
+		gitdir = wt_buf.buf;
+	}
+
+	base_ref_store_init(ref_store, &refs_be_reftable);
+	ref_store->gitdir = xstrdup(gitdir);
+	refs->store_flags = store_flags;
+	strbuf_addf(&sb, "%s/reftable", gitdir);
+	refs->reftable_dir = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	refs->err =
+		reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg);
+	assert(refs->err != REFTABLE_API_ERROR);
+
+	if (refs->err == 0 && wt) {
+		strbuf_addf(&sb, "%s/reftable", path);
+		refs->worktree_reftable_dir = xstrdup(sb.buf);
+
+		refs->err = reftable_new_stack(&refs->worktree_stack,
+					       refs->worktree_reftable_dir,
+					       cfg);
+		assert(refs->err != REFTABLE_API_ERROR);
+	}
+
+	strbuf_release(&sb);
+	strbuf_release(&wt_buf);
+	return ref_store;
+}
+
+static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct strbuf sb = STRBUF_INIT;
+
+	safe_create_dir(refs->reftable_dir, 1);
+	assert(refs->worktree_reftable_dir == NULL);
+
+	strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir);
+	write_file(sb.buf, "ref: refs/heads/.invalid");
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
+	safe_create_dir(sb.buf, 1);
+	strbuf_reset(&sb);
+
+	strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir);
+	write_file(sb.buf, "this repository uses the reftable format");
+
+	return 0;
+}
+
+struct git_reftable_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_ref_record ref;
+	struct object_id oid;
+	struct ref_store *ref_store;
+
+	/* In case we must iterate over 2 stacks, this is non-null. */
+	struct reftable_merged_table *merged;
+	unsigned int flags;
+	int err;
+	const char *prefix;
+};
+
+static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	while (ri->err == 0) {
+		ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref);
+		if (ri->err) {
+			break;
+		}
+
+		if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) {
+			/*
+			  pseudorefs, eg. HEAD, FETCH_HEAD should not be
+			  produced, by default.
+			 */
+			continue;
+		}
+		ri->base.refname = ri->ref.refname;
+		if (ri->prefix != NULL &&
+		    strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) {
+			ri->err = 1;
+			break;
+		}
+		if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
+		    ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE)
+			continue;
+
+		ri->base.flags = 0;
+		switch (ri->ref.value_type) {
+		case REFTABLE_REF_VAL1:
+			oidread(&ri->oid, ri->ref.value.val1);
+			break;
+		case REFTABLE_REF_VAL2:
+			oidread(&ri->oid, ri->ref.value.val2.value);
+			break;
+		case REFTABLE_REF_SYMREF: {
+			int out_flags = 0;
+			const char *resolved = refs_resolve_ref_unsafe(
+				ri->ref_store, ri->ref.refname,
+				RESOLVE_REF_READING, &ri->oid, &out_flags);
+			ri->base.flags = out_flags;
+			if (resolved == NULL &&
+			    !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+			    (ri->base.flags & REF_ISBROKEN)) {
+				continue;
+			}
+			break;
+		}
+		default:
+			abort();
+		}
+
+		ri->base.oid = &ri->oid;
+		if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		    !ref_resolves_to_object(ri->base.refname, ri->base.oid,
+					    ri->base.flags)) {
+			continue;
+		}
+
+		break;
+	}
+
+	if (ri->err > 0) {
+		return ITER_DONE;
+	}
+	if (ri->err < 0) {
+		return ITER_ERROR;
+	}
+
+	return ITER_OK;
+}
+
+static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	if (ri->ref.value_type == REFTABLE_REF_VAL2) {
+		oidread(peeled, ri->ref.value.val2.target_value);
+		return 0;
+	}
+
+	return 1;
+}
+
+static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_iterator *ri =
+		(struct git_reftable_iterator *)ref_iterator;
+	reftable_ref_record_release(&ri->ref);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged) {
+		reftable_merged_table_free(ri->merged);
+	}
+	return 0;
+}
+
+static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
+	reftable_ref_iterator_advance, reftable_ref_iterator_peel,
+	reftable_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
+				unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri));
+
+	if (refs->err < 0) {
+		ri->err = refs->err;
+	} else if (refs->worktree_stack == NULL) {
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(refs->main_stack);
+		ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix);
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		ri->err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						    the_hash_algo->format_id);
+		if (ri->err == 0)
+			ri->err = reftable_merged_table_seek_ref(
+				ri->merged, &ri->iter, prefix);
+	}
+
+	base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1);
+	ri->prefix = prefix;
+	ri->base.oid = &ri->oid;
+	ri->flags = flags;
+	ri->ref_store = ref_store;
+	return &ri->base;
+}
+
+static int fixup_symrefs(struct ref_store *ref_store,
+			 struct ref_transaction *transaction)
+{
+	struct strbuf referent = STRBUF_INIT;
+	int i = 0;
+	int err = 0;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct object_id old_oid;
+
+		err = git_reftable_read_raw_ref(ref_store, update->refname,
+						&old_oid, &referent,
+						/* mutate input, like
+						   files-backend.c */
+						&update->type);
+		if (err < 0 && errno == ENOENT &&
+		    is_null_oid(&update->old_oid)) {
+			err = 0;
+		}
+		if (err < 0)
+			goto done;
+
+		if (!(update->type & REF_ISSYMREF))
+			continue;
+
+		if (update->flags & REF_NO_DEREF) {
+			/* what should happen here? See files-backend.c
+			 * lock_ref_for_update. */
+		} else {
+			/*
+			  If we are updating a symref (eg. HEAD), we should also
+			  update the branch that the symref points to.
+
+			  This is generic functionality, and would be better
+			  done in refs.c, but the current implementation is
+			  intertwined with the locking in files-backend.c.
+			*/
+			int new_flags = update->flags;
+			struct ref_update *new_update = NULL;
+
+			/* if this is an update for HEAD, should also record a
+			   log entry for HEAD? See files-backend.c,
+			   split_head_update()
+			*/
+			new_update = ref_transaction_add_update(
+				transaction, referent.buf, new_flags,
+				&update->new_oid, &update->old_oid,
+				update->msg);
+			new_update->parent_update = update;
+
+			/* files-backend sets REF_LOG_ONLY here. */
+			update->flags |= REF_NO_DEREF | REF_LOG_ONLY;
+			update->flags &= ~REF_HAVE_OLD;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	strbuf_release(&referent);
+	return err;
+}
+
+static int git_reftable_transaction_prepare(struct ref_store *ref_store,
+					    struct ref_transaction *transaction,
+					    struct strbuf *errbuf)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_addition *add = NULL;
+	struct reftable_stack *stack =
+		stack_for(refs,
+			  transaction->nr ? transaction->updates[0]->refname : NULL);
+
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+
+	err = fixup_symrefs(ref_store, transaction);
+	if (err) {
+		goto done;
+	}
+
+	transaction->backend_data = add;
+	transaction->state = REF_TRANSACTION_PREPARED;
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	if (err < 0) {
+		transaction->state = REF_TRANSACTION_CLOSED;
+		strbuf_addf(errbuf, "reftable: transaction prepare: %s",
+			    reftable_error_str(err));
+	}
+
+	return err;
+}
+
+static int git_reftable_transaction_abort(struct ref_store *ref_store,
+					  struct ref_transaction *transaction,
+					  struct strbuf *err)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	reftable_addition_destroy(add);
+	transaction->backend_data = NULL;
+	return 0;
+}
+
+static int reftable_check_old_oid(struct ref_store *refs, const char *refname,
+				  struct object_id *want_oid)
+{
+	struct object_id out_oid;
+	int out_flags = 0;
+	const char *resolved = refs_resolve_ref_unsafe(
+		refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags);
+	if (is_null_oid(want_oid) != (resolved == NULL)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	if (resolved != NULL && !oideq(&out_oid, want_oid)) {
+		return REFTABLE_LOCK_ERROR;
+	}
+
+	return 0;
+}
+
+static int ref_update_cmp(const void *a, const void *b)
+{
+	return strcmp((*(struct ref_update **)a)->refname,
+		      (*(struct ref_update **)b)->refname);
+}
+
+static int write_transaction_table(struct reftable_writer *writer, void *arg)
+{
+	struct ref_transaction *transaction = (struct ref_transaction *)arg;
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)transaction->ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, transaction->updates[0]->refname);
+	uint64_t ts = reftable_stack_next_update_index(stack);
+	int err = 0;
+	int i = 0;
+	struct reftable_log_record *logs =
+		calloc(transaction->nr, sizeof(*logs));
+	struct ref_update **sorted =
+		malloc(transaction->nr * sizeof(struct ref_update *));
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_table tab = {NULL};
+	struct reftable_ref_record ref = {NULL};
+	reftable_table_from_merged_table(&tab, mt);
+	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
+	QSORT(sorted, transaction->nr, ref_update_cmp);
+	reftable_writer_set_limits(writer, ts, ts);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = sorted[i];
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_id;
+		fill_reftable_log_record(log);
+		log->update_index = ts;
+		log->value_type = REFTABLE_LOG_UPDATE;
+		log->refname = (char *)u->refname;
+		log->update.new_hash = u->new_oid.hash;
+		log->update.message = u->msg;
+
+		err = reftable_table_read_ref(&tab, u->refname, &ref);
+		if (err < 0)
+			goto done;
+		else if (err > 0) {
+			old_id = null_oid;
+		} else {
+			oidread(&old_id, ref.value.val1);
+		}
+
+		/* XXX fold together with the old_id check below? */
+
+		log->update.old_hash = old_id.hash;
+		if (u->flags & REF_LOG_ONLY) {
+			continue;
+		}
+
+		if (u->flags & REF_HAVE_NEW) {
+			struct reftable_ref_record ref = { NULL };
+			struct object_id peeled;
+
+			int peel_error = peel_object(&u->new_oid, &peeled);
+			ref.refname = (char *)u->refname;
+			ref.update_index = ts;
+
+			if (!peel_error) {
+				ref.value_type = REFTABLE_REF_VAL2;
+				ref.value.val2.target_value = peeled.hash;
+				ref.value.val2.value = u->new_oid.hash;
+			} else if (!is_null_oid(&u->new_oid)) {
+				ref.value_type = REFTABLE_REF_VAL1;
+				ref.value.val1 = u->new_oid.hash;
+			}
+
+			err = reftable_writer_add_ref(writer, &ref);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+
+	for (i = 0; i < transaction->nr; i++) {
+		err = reftable_writer_add_log(writer, &logs[i]);
+		clear_reftable_log_record(&logs[i]);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	free(logs);
+	free(sorted);
+	return err;
+}
+
+static int git_reftable_transaction_finish(struct ref_store *ref_store,
+					   struct ref_transaction *transaction,
+					   struct strbuf *errmsg)
+{
+	struct reftable_addition *add =
+		(struct reftable_addition *)transaction->backend_data;
+	int err = 0;
+	int i;
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *u = transaction->updates[i];
+		if (u->flags & REF_HAVE_OLD) {
+			err = reftable_check_old_oid(transaction->ref_store,
+						     u->refname, &u->old_oid);
+			if (err < 0) {
+				goto done;
+			}
+		}
+	}
+	if (transaction->nr) {
+		err = reftable_addition_add(add, &write_transaction_table,
+					    transaction);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	err = reftable_addition_commit(add);
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	transaction->state = REF_TRANSACTION_CLOSED;
+	transaction->backend_data = NULL;
+	if (err) {
+		strbuf_addf(errmsg, "reftable: transaction failure: %s",
+			    reftable_error_str(err));
+		return -1;
+	}
+	return err;
+}
+
+static int
+git_reftable_transaction_initial_commit(struct ref_store *ref_store,
+					struct ref_transaction *transaction,
+					struct strbuf *errmsg)
+{
+	int err = git_reftable_transaction_prepare(ref_store, transaction,
+						   errmsg);
+	if (err)
+		return err;
+
+	return git_reftable_transaction_finish(ref_store, transaction, errmsg);
+}
+
+struct write_delete_refs_arg {
+	struct reftable_stack *stack;
+	struct string_list *refnames;
+	const char *logmsg;
+	unsigned int flags;
+};
+
+static int write_delete_refs_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_delete_refs_arg *arg =
+		(struct write_delete_refs_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = 0;
+	int i = 0;
+
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_ref_record ref = {
+			.refname = (char *)arg->refnames->items[i].string,
+			.value_type = REFTABLE_REF_DELETION,
+			.update_index = ts,
+		};
+		err = reftable_writer_add_ref(writer, &ref);
+		if (err < 0) {
+			return err;
+		}
+	}
+
+	for (i = 0; i < arg->refnames->nr; i++) {
+		struct reftable_log_record log = {
+			.update_index = ts,
+		};
+		struct reftable_ref_record current = { NULL };
+		fill_reftable_log_record(&log);
+		log.update_index = ts;
+		log.refname = (char *)arg->refnames->items[i].string;
+
+		log.update.message = xstrdup(arg->logmsg);
+		log.update.new_hash = NULL;
+		log.update.old_hash = NULL;
+		if (reftable_stack_read_ref(arg->stack, log.refname,
+					    &current) == 0) {
+			log.update.old_hash =
+				reftable_ref_record_val1(&current);
+		}
+		err = reftable_writer_add_log(writer, &log);
+		log.update.old_hash = NULL;
+		reftable_ref_record_release(&current);
+
+		clear_reftable_log_record(&log);
+		if (err < 0) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int git_reftable_delete_refs(struct ref_store *ref_store,
+				    const char *msg,
+				    struct string_list *refnames,
+				    unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack =
+		stack_for(refs, refnames->nr ? refnames->items[0].string : NULL);
+	struct write_delete_refs_arg arg = {
+		.stack = stack,
+		.refnames = refnames,
+		.logmsg = msg,
+		.flags = flags,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+
+	string_list_sort(refnames);
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_delete_refs_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_pack_refs(struct ref_store *ref_store,
+				  unsigned int flags)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	int err = refs->err;
+	if (err < 0) {
+		return err;
+	}
+	err = reftable_stack_compact_all(refs->main_stack, NULL);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_compact_all(refs->worktree_stack, NULL);
+	if (err == 0)
+		err = reftable_stack_clean(refs->main_stack);
+	if (err == 0 && refs->worktree_stack != NULL)
+		err = reftable_stack_clean(refs->worktree_stack);
+
+	return err;
+}
+
+struct write_create_symref_arg {
+	struct git_reftable_ref_store *refs;
+	struct reftable_stack *stack;
+	const char *refname;
+	const char *target;
+	const char *logmsg;
+};
+
+static int write_create_symref_table(struct reftable_writer *writer, void *arg)
+{
+	struct write_create_symref_arg *create =
+		(struct write_create_symref_arg *)arg;
+	uint64_t ts = reftable_stack_next_update_index(create->stack);
+	int err = 0;
+
+	struct reftable_ref_record ref = {
+		.refname = (char *)create->refname,
+		.value_type = REFTABLE_REF_SYMREF,
+		.value.symref = (char *)create->target,
+		.update_index = ts,
+	};
+	reftable_writer_set_limits(writer, ts, ts);
+	err = reftable_writer_add_ref(writer, &ref);
+	if (err == 0) {
+		struct reftable_log_record log = { NULL };
+		struct object_id new_oid;
+		struct object_id old_oid;
+
+		fill_reftable_log_record(&log);
+		log.refname = (char *)create->refname;
+		log.update_index = ts;
+		log.update.message = (char *)create->logmsg;
+		if (refs_resolve_ref_unsafe(
+			    (struct ref_store *)create->refs, create->refname,
+			    RESOLVE_REF_READING, &old_oid, NULL) != NULL) {
+			log.update.old_hash = old_oid.hash;
+		}
+
+		if (refs_resolve_ref_unsafe((struct ref_store *)create->refs,
+					    create->target, RESOLVE_REF_READING,
+					    &new_oid, NULL) != NULL) {
+			log.update.new_hash = new_oid.hash;
+		}
+
+		if (log.update.old_hash != NULL ||
+		    log.update.new_hash != NULL) {
+			err = reftable_writer_add_log(writer, &log);
+		}
+		log.refname = NULL;
+		log.update.message = NULL;
+		log.update.old_hash = NULL;
+		log.update.new_hash = NULL;
+		clear_reftable_log_record(&log);
+	}
+	return err;
+}
+
+static int git_reftable_create_symref(struct ref_store *ref_store,
+				      const char *refname, const char *target,
+				      const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_create_symref_arg arg = { .refs = refs,
+					       .stack = stack,
+					       .refname = refname,
+					       .target = target,
+					       .logmsg = logmsg };
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+	err = reftable_stack_add(stack, &write_create_symref_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct write_rename_arg {
+	struct reftable_stack *stack;
+	const char *oldname;
+	const char *newname;
+	const char *logmsg;
+};
+
+static int write_rename_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record old_ref = { NULL };
+	struct reftable_ref_record new_ref = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref);
+
+	if (err) {
+		goto done;
+	}
+
+	/* git-branch supports a --force, but the check is not atomic. */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) {
+		goto done;
+	}
+
+	reftable_writer_set_limits(writer, ts, ts);
+
+	{
+		struct reftable_ref_record todo[2] = {
+			{
+				.refname = (char *)arg->oldname,
+				.update_index = ts,
+				.value_type = REFTABLE_REF_DELETION,
+			},
+			old_ref,
+		};
+		todo[1].update_index = ts;
+		free(todo[1].refname);
+		todo[1].refname = strdup(arg->newname);
+
+		err = reftable_writer_add_refs(writer, todo, 2);
+		if (err < 0) {
+			goto done;
+		}
+	}
+
+	if (reftable_ref_record_val1(&old_ref)) {
+		uint8_t *val1 = reftable_ref_record_val1(&old_ref);
+		struct reftable_log_record todo[2] = { { NULL } };
+		fill_reftable_log_record(&todo[0]);
+		fill_reftable_log_record(&todo[1]);
+
+		todo[0].refname = (char *)arg->oldname;
+		todo[0].update_index = ts;
+		todo[0].update.message = (char *)arg->logmsg;
+		todo[0].update.old_hash = val1;
+		todo[0].update.new_hash = NULL;
+
+		todo[1].refname = (char *)arg->newname;
+		todo[1].update_index = ts;
+		todo[1].update.old_hash = NULL;
+		todo[1].update.new_hash = val1;
+		todo[1].update.message = (char *)arg->logmsg;
+
+		err = reftable_writer_add_logs(writer, todo, 2);
+
+		clear_reftable_log_record(&todo[0]);
+		clear_reftable_log_record(&todo[1]);
+
+		if (err < 0) {
+			goto done;
+		}
+
+	} else {
+		/* XXX symrefs? */
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&new_ref);
+	reftable_ref_record_release(&old_ref);
+	return err;
+}
+
+static int write_copy_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_rename_arg *arg = (struct write_rename_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	struct reftable_ref_record old_ref = { NULL };
+	struct reftable_ref_record new_ref = { NULL };
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref);
+	if (err) {
+		goto done;
+	}
+
+	/* git-branch supports a --force, but the check is not atomic. */
+	if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) {
+		goto done;
+	}
+
+	reftable_writer_set_limits(writer, ts, ts);
+
+	FREE_AND_NULL(old_ref.refname);
+	old_ref.refname = xstrdup(arg->newname);
+	old_ref.update_index = ts;
+	err = reftable_writer_add_ref(writer, &old_ref);
+	if (err < 0) {
+		goto done;
+	}
+
+	/* this copies the entire reflog history. Is this the right semantics? */
+	/* XXX should clear out existing reflog entries for oldname? */
+	err = reftable_merged_table_seek_log(reftable_stack_merged_table(arg->stack), &it, arg->oldname);
+	if (err < 0) {
+		goto done;
+	}
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, arg->oldname)) {
+			break;
+		}
+		FREE_AND_NULL(log.refname);
+		log.refname = xstrdup(arg->newname);
+		reftable_writer_add_log(writer, &log);
+		reftable_log_record_release(&log);
+	}
+
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&new_ref);
+	reftable_ref_record_release(&old_ref);
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_rename_ref(struct ref_store *ref_store,
+				   const char *oldrefname,
+				   const char *newrefname, const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_rename_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+static int git_reftable_copy_ref(struct ref_store *ref_store,
+				 const char *oldrefname, const char *newrefname,
+				 const char *logmsg)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, newrefname);
+	struct write_rename_arg arg = {
+		.stack = stack,
+		.oldname = oldrefname,
+		.newname = newrefname,
+		.logmsg = logmsg,
+	};
+	int err = refs->err;
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_add(stack, &write_copy_table, &arg);
+done:
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct git_reftable_reflog_ref_iterator {
+	struct ref_iterator base;
+	struct reftable_iterator iter;
+	struct reftable_log_record log;
+	struct object_id oid;
+
+	/* Used when iterating over worktree & main */
+	struct reftable_merged_table *merged;
+	char *last_name;
+};
+
+static int
+git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+
+	while (1) {
+		int err = reftable_iterator_next_log(&ri->iter, &ri->log);
+		if (err > 0) {
+			return ITER_DONE;
+		}
+		if (err < 0) {
+			return ITER_ERROR;
+		}
+
+		ri->base.refname = ri->log.refname;
+		if (ri->last_name != NULL &&
+		    !strcmp(ri->log.refname, ri->last_name)) {
+			/* we want the refnames that we have reflogs for, so we
+			 * skip if we've already produced this name. This could
+			 * be faster by seeking directly to
+			 * reflog@update_index==0.
+			 */
+			continue;
+		}
+
+		free(ri->last_name);
+		ri->last_name = xstrdup(ri->log.refname);
+		oidread(&ri->oid, ri->log.update.new_hash);
+		return ITER_OK;
+	}
+}
+
+static int
+git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator,
+				      struct object_id *peeled)
+{
+	BUG("not supported.");
+	return -1;
+}
+
+static int
+git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator)
+{
+	struct git_reftable_reflog_ref_iterator *ri =
+		(struct git_reftable_reflog_ref_iterator *)ref_iterator;
+	reftable_log_record_release(&ri->log);
+	reftable_iterator_destroy(&ri->iter);
+	if (ri->merged)
+		reftable_merged_table_free(ri->merged);
+	return 0;
+}
+
+static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = {
+	git_reftable_reflog_ref_iterator_advance,
+	git_reftable_reflog_ref_iterator_peel,
+	git_reftable_reflog_ref_iterator_abort
+};
+
+static struct ref_iterator *
+git_reftable_reflog_iterator_begin(struct ref_store *ref_store)
+{
+	struct git_reftable_reflog_ref_iterator *ri = xcalloc(1, sizeof(*ri));
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+
+	if (refs->worktree_stack == NULL) {
+		struct reftable_stack *stack = refs->main_stack;
+		struct reftable_merged_table *mt =
+			reftable_stack_merged_table(stack);
+		int err = reftable_merged_table_seek_log(mt, &ri->iter, "");
+		if (err < 0) {
+			free(ri);
+			/* XXX is this allowed? */
+			return NULL;
+		}
+	} else {
+		struct reftable_merged_table *mt1 =
+			reftable_stack_merged_table(refs->main_stack);
+		struct reftable_merged_table *mt2 =
+			reftable_stack_merged_table(refs->worktree_stack);
+		struct reftable_table *tabs =
+			xcalloc(2, sizeof(struct reftable_table));
+		int err = 0;
+		reftable_table_from_merged_table(&tabs[0], mt1);
+		reftable_table_from_merged_table(&tabs[1], mt2);
+		err = reftable_new_merged_table(&ri->merged, tabs, 2,
+						the_hash_algo->format_id);
+		if (err < 0) {
+			free(tabs);
+			/* XXX see above */
+			return NULL;
+		}
+		err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, "");
+		if (err < 0) {
+			return NULL;
+		}
+	}
+	base_ref_iterator_init(&ri->base,
+			       &git_reftable_reflog_ref_iterator_vtable, 1);
+	ri->base.oid = &ri->oid;
+
+	return (struct ref_iterator *)ri;
+}
+
+static int git_reftable_for_each_reflog_ent_newest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	int err = 0;
+	struct reftable_log_record log = { NULL };
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	while (err == 0) {
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		oidread(&old_oid, log.update.old_hash);
+		oidread(&new_oid, log.update.new_hash);
+
+		full_committer = fmt_ident(log.update.name, log.update.email,
+					   WANT_COMMITTER_IDENT,
+					   /*date*/ NULL, IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log.update.time,
+			 log.update.tz_offset, log.update.message, cb_data);
+		if (err)
+			break;
+	}
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_for_each_reflog_ent_oldest_first(
+	struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn,
+	void *cb_data)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reftable_log_record *logs = NULL;
+	int cap = 0;
+	int len = 0;
+	int err = 0;
+	int i = 0;
+
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+
+	while (err == 0) {
+		struct reftable_log_record log = { NULL };
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+		if (err < 0) {
+			break;
+		}
+
+		if (strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (len == cap) {
+			cap = 2 * cap + 1;
+			logs = realloc(logs, cap * sizeof(*logs));
+		}
+
+		logs[len++] = log;
+	}
+
+	for (i = len; i--;) {
+		struct reftable_log_record *log = &logs[i];
+		struct object_id old_oid;
+		struct object_id new_oid;
+		const char *full_committer = "";
+
+		oidread(&old_oid, log->update.old_hash);
+		oidread(&new_oid, log->update.new_hash);
+
+		full_committer = fmt_ident(log->update.name, log->update.email,
+					   WANT_COMMITTER_IDENT, NULL,
+					   IDENT_NO_DATE);
+		err = fn(&old_oid, &new_oid, full_committer, log->update.time,
+			 log->update.tz_offset, log->update.message, cb_data);
+		if (err) {
+			break;
+		}
+	}
+
+	for (i = 0; i < len; i++) {
+		reftable_log_record_release(&logs[i]);
+	}
+	free(logs);
+
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int git_reftable_reflog_exists(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct reftable_iterator it = { NULL };
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
+	struct reftable_log_record log = { NULL };
+	int err = refs->err;
+
+	if (err < 0) {
+		goto done;
+	}
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err) {
+		goto done;
+	}
+	err = reftable_iterator_next_log(&it, &log);
+	if (err) {
+		goto done;
+	}
+
+	if (strcmp(log.refname, refname)) {
+		err = 1;
+	}
+
+done:
+	reftable_iterator_destroy(&it);
+	reftable_log_record_release(&log);
+	return !err;
+}
+
+static int git_reftable_create_reflog(struct ref_store *ref_store,
+				      const char *refname, int force_create,
+				      struct strbuf *err)
+{
+	return 0;
+}
+
+struct write_reflog_delete_arg {
+	struct reftable_stack *stack;
+	const char *refname;
+};
+
+static int write_reflog_delete_table(struct reftable_writer *writer, void *argv)
+{
+	struct write_reflog_delete_arg *arg = argv;
+	struct reftable_merged_table *mt = reftable_stack_merged_table(arg->stack);
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int err = reftable_merged_table_seek_log(mt, &it, arg->refname);
+
+	reftable_writer_set_limits(writer, ts, ts);
+	while (err == 0) {
+		struct reftable_log_record tombstone = {
+			.refname = (char*)arg->refname,
+			.update_index = REFTABLE_LOG_DELETION,
+		};
+		err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			err = 0;
+			break;
+		}
+
+		if (err < 0 || strcmp(log.refname, arg->refname)) {
+			break;
+		}
+		tombstone.update_index = log.update_index;
+		err = reftable_writer_add_log(writer, &tombstone);
+	}
+
+	return err;
+}
+
+
+static int git_reftable_delete_reflog(struct ref_store *ref_store,
+				      const char *refname)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct write_reflog_delete_arg arg = {
+		.stack = stack,
+		.refname = refname,
+	};
+	int err = reftable_stack_add(stack, &write_reflog_delete_table, &arg);
+	assert(err != REFTABLE_API_ERROR);
+	return err;
+}
+
+struct reflog_expiry_arg {
+	struct reftable_stack *stack;
+	struct reftable_log_record *records;
+	int len;
+};
+
+static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv)
+{
+	struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv;
+	uint64_t ts = reftable_stack_next_update_index(arg->stack);
+	int i = 0;
+	reftable_writer_set_limits(writer, ts, ts);
+	for (i = 0; i < arg->len; i++) {
+		int err = reftable_writer_add_log(writer, &arg->records[i]);
+		if (err) {
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname,
+			   const struct object_id *oid, unsigned int flags,
+			   reflog_expiry_prepare_fn prepare_fn,
+			   reflog_expiry_should_prune_fn should_prune_fn,
+			   reflog_expiry_cleanup_fn cleanup_fn,
+			   void *policy_cb_data)
+{
+	/*
+	  For log expiry, we write tombstones in place of the expired entries,
+	  This means that the entries are still retrievable by delving into the
+	  stack, and expiring entries paradoxically takes extra memory.
+
+	  This memory is only reclaimed when some operation issues a
+	  git_reftable_pack_refs(), which will compact the entire stack and get
+	  rid of deletion entries.
+
+	  It would be better if the refs backend supported an API that sets a
+	  criterion for all refs, passing the criterion to pack_refs().
+	*/
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+	struct reftable_merged_table *mt = NULL;
+	struct reflog_expiry_arg arg = {
+		.stack = stack,
+	};
+	struct reftable_log_record *logs = NULL;
+	struct reftable_log_record *rewritten = NULL;
+	int logs_len = 0;
+	int logs_cap = 0;
+	int i = 0;
+	uint8_t *last_hash = NULL;
+	struct reftable_iterator it = { NULL };
+	struct reftable_addition *add = NULL;
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	mt = reftable_stack_merged_table(stack);
+	err = reftable_merged_table_seek_log(mt, &it, refname);
+	if (err < 0) {
+		goto done;
+	}
+
+	err = reftable_stack_new_addition(&add, stack);
+	if (err) {
+		goto done;
+	}
+	prepare_fn(refname, oid, policy_cb_data);
+	while (1) {
+		struct reftable_log_record log = {NULL};
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err < 0) {
+			goto done;
+		}
+
+		if (err > 0 || strcmp(log.refname, refname)) {
+			break;
+		}
+
+		if (logs_len >= logs_cap) {
+			int new_cap = logs_cap * 2 + 1;
+			logs = realloc(logs, new_cap * sizeof(*logs));
+			logs_cap = new_cap;
+		}
+		logs[logs_len++] = log;
+	}
+
+	rewritten = calloc(logs_len, sizeof(*rewritten));
+	for (i = logs_len-1; i >= 0; i--) {
+		struct object_id ooid;
+		struct object_id noid;
+		struct reftable_log_record *dest = &rewritten[i];
+
+		*dest = logs[i];
+		oidread(&ooid, logs[i].update.old_hash);
+		oidread(&noid, logs[i].update.new_hash);
+
+		if (should_prune_fn(&ooid, &noid, logs[i].update.email,
+				    (timestamp_t)logs[i].update.time,
+				    logs[i].update.tz_offset, logs[i].update.message,
+				    policy_cb_data)) {
+			dest->value_type = REFTABLE_LOG_DELETION;
+		} else {
+			if ((flags & EXPIRE_REFLOGS_REWRITE) && last_hash != NULL) {
+				dest->update.old_hash = last_hash;
+			}
+			last_hash = logs[i].update.new_hash;
+		}
+	}
+
+	arg.records = rewritten;
+	arg.len = logs_len;
+	err = reftable_addition_add(add, &write_reflog_expiry_table, &arg);
+	if (err < 0) {
+		goto done;
+	}
+
+	if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
+		/* XXX - skip writing records that were not changed. */
+		err = reftable_addition_commit(add);
+	} else {
+		/* XXX - print something */
+	}
+
+done:
+	if (add) {
+		cleanup_fn(policy_cb_data);
+	}
+	assert(err != REFTABLE_API_ERROR);
+	reftable_addition_destroy(add);
+	for (i = 0; i < logs_len; i++)
+		reftable_log_record_release(&logs[i]);
+	free(logs);
+	free(rewritten);
+	reftable_iterator_destroy(&it);
+	return err;
+}
+
+static int reftable_error_to_errno(int err)
+{
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return EIO;
+	case REFTABLE_FORMAT_ERROR:
+		return EFAULT;
+	case REFTABLE_NOT_EXIST_ERROR:
+		return ENOENT;
+	case REFTABLE_LOCK_ERROR:
+		return EBUSY;
+	case REFTABLE_API_ERROR:
+		return EINVAL;
+	case REFTABLE_ZLIB_ERROR:
+		return EDOM;
+	default:
+		return ERANGE;
+	}
+}
+
+static int git_reftable_read_raw_ref(struct ref_store *ref_store,
+				     const char *refname, struct object_id *oid,
+				     struct strbuf *referent,
+				     unsigned int *type)
+{
+	struct git_reftable_ref_store *refs =
+		(struct git_reftable_ref_store *)ref_store;
+	struct reftable_stack *stack = stack_for(refs, refname);
+
+	struct reftable_ref_record ref = { NULL };
+	int err = 0;
+	if (refs->err < 0) {
+		return refs->err;
+	}
+
+	/* This is usually not needed, but Git doesn't signal to ref backend if
+	   a subprocess updated the ref DB.  So we always check.
+	*/
+	err = reftable_stack_reload(stack);
+	if (err) {
+		goto done;
+	}
+
+	err = reftable_stack_read_ref(stack, refname, &ref);
+	if (err > 0) {
+		errno = ENOENT;
+		err = -1;
+		goto done;
+	}
+	if (err < 0) {
+		errno = reftable_error_to_errno(err);
+		err = -1;
+		goto done;
+	}
+
+	if (ref.value_type == REFTABLE_REF_SYMREF) {
+		strbuf_reset(referent);
+		strbuf_addstr(referent, ref.value.symref);
+		*type |= REF_ISSYMREF;
+	} else if (reftable_ref_record_val1(&ref) != NULL) {
+		oidread(oid, reftable_ref_record_val1(&ref));
+	} else {
+		*type |= REF_ISBROKEN;
+		errno = EINVAL;
+		err = -1;
+	}
+done:
+	assert(err != REFTABLE_API_ERROR);
+	reftable_ref_record_release(&ref);
+	return err;
+}
+
+struct ref_storage_be refs_be_reftable = {
+	&refs_be_files,
+	"reftable",
+	git_reftable_ref_store_create,
+	git_reftable_init_db,
+	git_reftable_transaction_prepare,
+	git_reftable_transaction_finish,
+	git_reftable_transaction_abort,
+	git_reftable_transaction_initial_commit,
+
+	git_reftable_pack_refs,
+	git_reftable_create_symref,
+	git_reftable_delete_refs,
+	git_reftable_rename_ref,
+	git_reftable_copy_ref,
+
+	git_reftable_ref_iterator_begin,
+	git_reftable_read_raw_ref,
+
+	git_reftable_reflog_iterator_begin,
+	git_reftable_for_each_reflog_ent_oldest_first,
+	git_reftable_for_each_reflog_ent_newest_first,
+	git_reftable_reflog_exists,
+	git_reftable_create_reflog,
+	git_reftable_delete_reflog,
+	git_reftable_reflog_expire,
+};
diff --git a/repository.c b/repository.c
index 87b355e7a659..e9f78496d338 100644
--- a/repository.c
+++ b/repository.c
@@ -174,6 +174,8 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	repo->ref_storage_format = xstrdup_or_null(format.ref_storage);
+
 	clear_repository_format(&format);
 	return 0;
 
diff --git a/repository.h b/repository.h
index b385ca3c94b6..8019a7d0a1fc 100644
--- a/repository.h
+++ b/repository.h
@@ -78,6 +78,9 @@ struct repository {
 	 */
 	struct ref_store *refs_private;
 
+	/* The format to use for the ref database. */
+	char *ref_storage_format;
+
 	/*
 	 * Contains path to often used file names.
 	 */
diff --git a/setup.c b/setup.c
index 59e2facd9db6..3a4fe1ccebab 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "refstorage")) {
+		data->ref_storage = xstrdup(value);
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
@@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format)
 	string_list_clear(&format->v1_only_extensions, 0);
 	free(format->work_tree);
 	free(format->partial_clone);
+	free(format->ref_storage);
 	init_repository_format(format);
 }
 
@@ -1300,8 +1304,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env(gitdir);
 		}
-		if (startup_info->have_repository)
+		if (startup_info->have_repository) {
 			repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+			the_repository->ref_storage_format =
+				xstrdup_or_null(repo_fmt.ref_storage);
+		}
 	}
 	/*
 	 * Since precompose_string_if_needed() needs to look at
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
new file mode 100755
index 000000000000..4685b9484214
--- /dev/null
+++ b/t/t0031-reftable.sh
@@ -0,0 +1,271 @@
+#!/bin/sh
+#
+# Copyright (c) 2020 Google LLC
+#
+
+test_description='reftable basics'
+
+. ./test-lib.sh
+
+INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+git_init () {
+	git init -b primary "$@"
+}
+
+initialize ()  {
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled
+}
+
+write_script fake_editor <<\EOF
+echo "$MSG" >"$1"
+echo "$MSG" >&2
+EOF
+GIT_EDITOR=./fake_editor
+export GIT_EDITOR
+
+test_expect_success 'read existing old OID if REF_HAVE_OLD is not set' '
+	initialize &&
+	test_commit 1st &&
+	test_commit 2nd &&
+	MSG=b4 git notes add &&
+	MSG=b3 git notes edit  &&
+	echo b4 >expect &&
+	git notes --ref commits@{1} show >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git reflog delete' '
+	initialize &&
+	test_commit file &&
+	test_commit file2 &&
+	test_commit file3 &&
+	test_commit file4 &&
+	git reflog delete HEAD@{1} &&
+	git reflog > output &&
+	! grep file3 output
+'
+
+test_expect_success 'branch -D delete nonexistent branch' '
+	initialize &&
+	test_commit file &&
+	test_must_fail git branch -D ../../my-private-file
+'
+
+test_expect_success 'branch copy' '
+	initialize &&
+	test_commit file1 &&
+	test_commit file2 &&
+	git branch src &&
+	git reflog src > expect &&
+	git branch -c src dst &&
+	git reflog dst | sed "s/dst/src/g" > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git stash' '
+	initialize &&
+	test_commit file &&
+	touch actual expected &&
+	git -c status.showStash=true status >expected &&
+	echo hoi >> file.t &&
+	git stash push -m stashed &&
+	git stash clear &&
+	git -c status.showStash=true status >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'rename branch' '
+	initialize &&
+	git symbolic-ref HEAD refs/heads/before &&
+	test_commit file &&
+	git show-ref | sed s/before/after/g > expected &&
+	git branch -M after &&
+	git show-ref > actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'SHA256 support, env' '
+	rm -rf .git &&
+	GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH &&
+	git_init --ref-storage=reftable &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'SHA256 support, option' '
+	rm -rf .git &&
+	git_init --ref-storage=reftable --object-format=sha256 &&
+	mv .git/hooks .git/hooks-disabled &&
+	test_commit file
+'
+
+test_expect_success 'delete ref' '
+	initialize &&
+	test_commit file &&
+	SHA=$(git show-ref -s --verify HEAD) &&
+	test_write_lines "$SHA refs/heads/primary" "$SHA refs/tags/file" >expect &&
+	git show-ref >actual &&
+	! git update-ref -d refs/tags/file $INVALID_SHA1 &&
+	test_cmp expect actual &&
+	git update-ref -d refs/tags/file $SHA  &&
+	test_write_lines "$SHA refs/heads/primary" >expect &&
+	git show-ref >actual &&
+	test_cmp expect actual
+'
+
+
+test_expect_success 'clone calls transaction_initial_commit' '
+	test_commit message1 file1 &&
+	git clone . cloned &&
+	(test  -f cloned/file1 || echo "Fixme.")
+'
+
+test_expect_success 'basic operation of reftable storage: commit, show-ref' '
+	initialize &&
+	test_commit file &&
+	test_write_lines refs/heads/primary refs/tags/file >expect &&
+	git show-ref &&
+	git show-ref | cut -f2 -d" " >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'reflog, repack' '
+	initialize &&
+	for count in $(test_seq 1 10)
+	do
+		test_commit "number $count" file.t $count number-$count ||
+		return 1
+	done &&
+	git pack-refs &&
+	ls -1 .git/reftable >table-files &&
+	test_line_count = 2 table-files &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 10 output &&
+	grep "commit (initial): number 1" output &&
+	grep "commit: number 10" output &&
+	git gc &&
+	git reflog refs/heads/primary >output &&
+	test_line_count = 0 output
+'
+
+test_expect_success 'branch switch in reflog output' '
+	initialize &&
+	test_commit file1 &&
+	git checkout -b branch1 &&
+	test_commit file2 &&
+	git checkout -b branch2 &&
+	git switch - &&
+	git rev-parse --symbolic-full-name HEAD >actual &&
+	echo refs/heads/branch1 >expect &&
+	test_cmp actual expect
+'
+
+
+# This matches show-ref's output
+print_ref() {
+	echo "$(git rev-parse "$1") $1"
+}
+
+test_expect_success 'peeled tags are stored' '
+	initialize &&
+	test_commit file &&
+	git tag -m "annotated tag" test_tag HEAD &&
+	{
+		print_ref "refs/heads/primary" &&
+		print_ref "refs/tags/file" &&
+		print_ref "refs/tags/test_tag" &&
+		print_ref "refs/tags/test_tag^{}"
+	} >expect &&
+	git show-ref -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'show-ref works on fresh repo' '
+	initialize &&
+	rm -rf .git &&
+	git_init --ref-storage=reftable &&
+	>expect &&
+	! git show-ref >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'checkout unborn branch' '
+	initialize &&
+	git checkout -b primary
+'
+
+
+test_expect_success 'dir/file conflict' '
+	initialize &&
+	test_commit file &&
+	! git branch primary/forbidden
+'
+
+
+test_expect_success 'do not clobber existing repo' '
+	rm -rf .git &&
+	git_init --ref-storage=files &&
+	cat .git/HEAD >expect &&
+	test_commit file &&
+	(git_init --ref-storage=reftable || true) &&
+	cat .git/HEAD >actual &&
+	test_cmp expect actual
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'pseudo refs' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git cherry-pick source &&
+	test -f file2
+'
+
+# cherry-pick uses a pseudo ref.
+test_expect_success 'rebase' '
+	initialize &&
+	test_commit message1 file1 &&
+	test_commit message2 file2 &&
+	git branch source &&
+	git checkout HEAD^ &&
+	test_commit message3 file3 &&
+	git rebase source &&
+	test -f file2
+'
+
+test_expect_success 'worktrees' '
+	git_init --ref-storage=reftable start &&
+	(cd start && test_commit file1 && git checkout -b branch1 &&
+	git checkout -b branch2 &&
+	git worktree add  ../wt
+	) &&
+	cd wt &&
+	git checkout branch1 &&
+	git branch
+'
+
+test_expect_success 'worktrees 2' '
+	initialize &&
+	test_commit file1 &&
+	mkdir existing_empty &&
+	git worktree add --detach existing_empty primary
+'
+
+test_expect_success 'FETCH_HEAD' '
+	initialize &&
+	test_commit one &&
+	(git_init sub && cd sub && test_commit two) &&
+	git --git-dir sub/.git rev-parse HEAD >expect &&
+	git fetch sub &&
+	git checkout FETCH_HEAD &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh
index be12fb63506e..c6f783255633 100755
--- a/t/t1409-avoid-packing-refs.sh
+++ b/t/t1409-avoid-packing-refs.sh
@@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily'
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 # Add an identifying mark to the packed-refs file header line. This
 # shouldn't upset readers, and it should be omitted if the file is
 # ever rewritten.
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 5071ac63a5b5..16440d58d6bb 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -8,6 +8,12 @@ test_description='git fsck random collection of tests
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success setup '
 	git config gc.auto 0 &&
 	git config i18n.commitencoding ISO-8859-1 &&
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 3b7cdc56ec84..667a23affe36 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -14,6 +14,12 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+if test_have_prereq REFTABLE
+then
+  skip_all='skipping pack-refs tests; incompatible with reftable'
+  test_done
+fi
+
 test_expect_success 'enable reflogs' '
 	git config core.logallrefupdates true
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d3f6af6a6545..943aa9713961 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1481,6 +1481,11 @@ parisc* | hppa*)
 	;;
 esac
 
+if test -n "$GIT_TEST_REFTABLE"
+then
+  test_set_prereq REFTABLE
+fi
+
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 24/28] git-prompt: prepare for reftable refs backend
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (22 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 23/28] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` SZEDER Gábor via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 25/28] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
                               ` (4 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: SZEDER Gábor via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, SZEDER Gábor

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

In our git-prompt script we strive to use Bash builtins wherever
possible, because fork()-ing subshells for command substitutions and
fork()+exec()-ing Git commands are expensive on some platforms.  We
even read and parse '.git/HEAD' using Bash builtins to get the name of
the current branch [1].  However, the upcoming reftable refs backend
won't use '.git/HEAD' at all, but will write an invalid refname as
placeholder for backwards compatibility instead, which will break our
git-prompt script.

Update the git-prompt script to recognize the placeholder '.git/HEAD'
written by the reftable backend (its content is specified in the
reftable specs), and then fall back to use 'git symbolic-ref' to get
the name of the current branch.

[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current
    branch, 2011-03-31)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/completion/git-prompt.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 4640a1535d15..2e5a5d80271d 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -478,10 +478,15 @@ __git_ps1 ()
 			if ! __git_eread "$g/HEAD" head; then
 				return $exit
 			fi
-			# is it a symbolic ref?
 			b="${head#ref: }"
 			if [ "$head" = "$b" ]; then
 				detached=yes
+			elif [ "$b" = "refs/heads/.invalid" ]; then
+				# Reftable
+				b="$(git symbolic-ref HEAD 2>/dev/null)" ||
+				detached=yes
+			fi
+			if [ "$detached" = yes ]; then
 				b="$(
 				case "${GIT_PS1_DESCRIBE_STYLE-}" in
 				(contains)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 25/28] Add "test-tool dump-reftable" command.
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (23 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 24/28] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 26/28] t1301: document what needs to be done for REFTABLE Han-Wen Nienhuys via GitGitGadget
                               ` (3 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This command dumps individual tables or a stack of of tables.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Makefile                 | 1 +
 t/helper/test-reftable.c | 5 +++++
 t/helper/test-tool.c     | 1 +
 t/helper/test-tool.h     | 1 +
 4 files changed, 8 insertions(+)

diff --git a/Makefile b/Makefile
index cffcf3496978..115ba0ccba0a 100644
--- a/Makefile
+++ b/Makefile
@@ -2427,6 +2427,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 996da85f7b5c..26b03d7b7892 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -14,3 +14,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index bea3285eed9f..6922f64c1683 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -58,6 +58,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 90f935310788..3ccc869f4595 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -19,6 +19,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 26/28] t1301: document what needs to be done for REFTABLE
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (24 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 25/28] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 27/28] t1401,t2011: parameterize HEAD.lock " Han-Wen Nienhuys via GitGitGadget
                               ` (2 subsequent siblings)
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 t/t1301-shared-repo.sh | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index ac947bff9fcf..194fa54ea3e7 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -22,9 +22,10 @@ test_expect_success 'shared = 0400 (faulty permission u-w)' '
 	)
 '
 
+# TODO(hanwen): for REFTABLE should inspect group-readable of .git/reftable/
 for u in 002 022
 do
-	test_expect_success POSIXPERM "shared=1 does not clear bits preset by umask $u" '
+	test_expect_success REFFILES,POSIXPERM "shared=1 does not clear bits preset by umask $u" '
 		mkdir sub && (
 			cd sub &&
 			umask $u &&
@@ -114,7 +115,8 @@ test_expect_success POSIXPERM 'info/refs respects umask in unshared repo' '
 	test_cmp expect actual
 '
 
-test_expect_success POSIXPERM 'git reflog expire honors core.sharedRepository' '
+# For reftable, the check on .git/reftable/ is sufficient.
+test_expect_success REFFILES,POSIXPERM 'git reflog expire honors core.sharedRepository' '
 	umask 077 &&
 	git config core.sharedRepository group &&
 	git reflog expire --all &&
@@ -201,7 +203,7 @@ test_expect_success POSIXPERM 're-init respects core.sharedrepository (remote)'
 	test_cmp expect actual
 '
 
-test_expect_success POSIXPERM 'template can set core.sharedrepository' '
+test_expect_success REFFILES,POSIXPERM 'template can set core.sharedrepository' '
 	rm -rf child.git &&
 	umask 0022 &&
 	git config core.sharedrepository 0666 &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 27/28] t1401,t2011: parameterize HEAD.lock for REFTABLE
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (25 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 26/28] t1301: document what needs to be done for REFTABLE Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-19 11:37             ` [PATCH v7 28/28] t1404: annotate test cases with REFFILES Han-Wen Nienhuys via GitGitGadget
  2021-04-21  7:45             ` [PATCH v7 00/28] reftable library Ævar Arnfjörð Bjarmason
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 t/t1401-symbolic-ref.sh          | 11 +++++++++--
 t/t2011-checkout-invalid-head.sh | 11 +++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh
index a4ebb0b65fec..4e130776b40f 100755
--- a/t/t1401-symbolic-ref.sh
+++ b/t/t1401-symbolic-ref.sh
@@ -99,9 +99,16 @@ test_expect_success LONG_REF 'we can parse long symbolic ref' '
 	test_cmp expect actual
 '
 
+if test_have_prereq REFTABLE
+then
+	HEAD_LOCK=reftable/tables.list.lock
+else
+	HEAD_LOCK=HEAD.lock
+fi
+
 test_expect_success 'symbolic-ref reports failure in exit code' '
-	test_when_finished "rm -f .git/HEAD.lock" &&
-	>.git/HEAD.lock &&
+	test_when_finished "rm -f .git/$HEAD_LOCK" &&
+	>.git/$HEAD_LOCK &&
 	test_must_fail git symbolic-ref HEAD refs/heads/whatever
 '
 
diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh
index e52022e15229..ef6471cb0954 100755
--- a/t/t2011-checkout-invalid-head.sh
+++ b/t/t2011-checkout-invalid-head.sh
@@ -22,9 +22,16 @@ test_expect_success 'checkout main from invalid HEAD' '
 	git checkout main --
 '
 
+if test_have_prereq REFTABLE
+then
+	HEAD_LOCK=reftable/tables.list.lock
+else
+	HEAD_LOCK=HEAD.lock
+fi
+
 test_expect_success 'checkout notices failure to lock HEAD' '
-	test_when_finished "rm -f .git/HEAD.lock" &&
-	>.git/HEAD.lock &&
+	test_when_finished "rm -f .git/$HEAD_LOCK" &&
+	>.git/$HEAD_LOCK &&
 	test_must_fail git checkout -b other
 '
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 251+ messages in thread

* [PATCH v7 28/28] t1404: annotate test cases with REFFILES
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (26 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 27/28] t1401,t2011: parameterize HEAD.lock " Han-Wen Nienhuys via GitGitGadget
@ 2021-04-19 11:37             ` Han-Wen Nienhuys via GitGitGadget
  2021-04-21  7:45             ` [PATCH v7 00/28] reftable library Ævar Arnfjörð Bjarmason
  28 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-04-19 11:37 UTC (permalink / raw)
  To: git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Han-Wen Nienhuys g,
	Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

* Reftable for now lacks detailed error messages for directory/file conflicts.
  Skip message comparisons.

* Mark tests that muck with .git directly as REFFILES.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 t/t1404-update-ref-errors.sh | 86 ++++++++++++++++++++++++------------
 1 file changed, 57 insertions(+), 29 deletions(-)

diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
index 8b51c4efc135..811d5bb56d41 100755
--- a/t/t1404-update-ref-errors.sh
+++ b/t/t1404-update-ref-errors.sh
@@ -27,7 +27,9 @@ test_update_rejected () {
 	fi &&
 	printf "create $prefix/%s $C\n" $create >input &&
 	test_must_fail git update-ref --stdin <input 2>output.err &&
-	test_i18ngrep -F "$error" output.err &&
+	if test_have_prereq REFFILES ; then
+		test_i18ngrep -F "$error" output.err
+	fi &&
 	git for-each-ref $prefix >actual &&
 	test_cmp unchanged actual
 }
@@ -101,7 +103,9 @@ df_test() {
 		printf "%s\n" "delete $delname" "create $addname $D"
 	fi >commands &&
 	test_must_fail git update-ref --stdin <commands 2>output.err &&
-	test_cmp expected-err output.err &&
+	if test_have_prereq REFFILES ; then
+		test_cmp expected-err output.err
+	fi &&
 	printf "%s\n" "$C $delref" >expected-refs &&
 	git for-each-ref --format="%(objectname) %(refname)" $prefix/r >actual-refs &&
 	test_cmp expected-refs actual-refs
@@ -189,7 +193,7 @@ test_expect_success 'one new ref is a simple prefix of another' '
 
 '
 
-test_expect_success 'empty directory should not fool rev-parse' '
+test_expect_success REFFILES 'empty directory should not fool rev-parse' '
 	prefix=refs/e-rev-parse &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -199,7 +203,7 @@ test_expect_success 'empty directory should not fool rev-parse' '
 	test_cmp expected actual
 '
 
-test_expect_success 'empty directory should not fool for-each-ref' '
+test_expect_success REFFILES 'empty directory should not fool for-each-ref' '
 	prefix=refs/e-for-each-ref &&
 	git update-ref $prefix/foo $C &&
 	git for-each-ref $prefix >expected &&
@@ -209,14 +213,14 @@ test_expect_success 'empty directory should not fool for-each-ref' '
 	test_cmp expected actual
 '
 
-test_expect_success 'empty directory should not fool create' '
+test_expect_success REFFILES 'empty directory should not fool create' '
 	prefix=refs/e-create &&
 	mkdir -p .git/$prefix/foo/bar/baz &&
 	printf "create %s $C\n" $prefix/foo |
 	git update-ref --stdin
 '
 
-test_expect_success 'empty directory should not fool verify' '
+test_expect_success REFFILES 'empty directory should not fool verify' '
 	prefix=refs/e-verify &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -225,7 +229,7 @@ test_expect_success 'empty directory should not fool verify' '
 	git update-ref --stdin
 '
 
-test_expect_success 'empty directory should not fool 1-arg update' '
+test_expect_success REFFILES 'empty directory should not fool 1-arg update' '
 	prefix=refs/e-update-1 &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -234,7 +238,7 @@ test_expect_success 'empty directory should not fool 1-arg update' '
 	git update-ref --stdin
 '
 
-test_expect_success 'empty directory should not fool 2-arg update' '
+test_expect_success REFFILES 'empty directory should not fool 2-arg update' '
 	prefix=refs/e-update-2 &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -243,7 +247,7 @@ test_expect_success 'empty directory should not fool 2-arg update' '
 	git update-ref --stdin
 '
 
-test_expect_success 'empty directory should not fool 0-arg delete' '
+test_expect_success REFFILES 'empty directory should not fool 0-arg delete' '
 	prefix=refs/e-delete-0 &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -252,7 +256,7 @@ test_expect_success 'empty directory should not fool 0-arg delete' '
 	git update-ref --stdin
 '
 
-test_expect_success 'empty directory should not fool 1-arg delete' '
+test_expect_success REFFILES 'empty directory should not fool 1-arg delete' '
 	prefix=refs/e-delete-1 &&
 	git update-ref $prefix/foo $C &&
 	git pack-refs --all &&
@@ -336,7 +340,9 @@ test_expect_success 'missing old value blocks update' '
 	EOF
 	printf "%s\n" "update $prefix/foo $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks update' '
@@ -347,7 +353,9 @@ test_expect_success 'incorrect old value blocks update' '
 	EOF
 	printf "%s\n" "update $prefix/foo $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks create' '
@@ -358,7 +366,9 @@ test_expect_success 'existing old value blocks create' '
 	EOF
 	printf "%s\n" "create $prefix/foo $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks delete' '
@@ -369,7 +379,9 @@ test_expect_success 'incorrect old value blocks delete' '
 	EOF
 	printf "%s\n" "delete $prefix/foo $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'missing old value blocks indirect update' '
@@ -380,7 +392,9 @@ test_expect_success 'missing old value blocks indirect update' '
 	EOF
 	printf "%s\n" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect update' '
@@ -392,7 +406,9 @@ test_expect_success 'incorrect old value blocks indirect update' '
 	EOF
 	printf "%s\n" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks indirect create' '
@@ -404,7 +420,9 @@ test_expect_success 'existing old value blocks indirect create' '
 	EOF
 	printf "%s\n" "create $prefix/symref $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect delete' '
@@ -416,7 +434,9 @@ test_expect_success 'incorrect old value blocks indirect delete' '
 	EOF
 	printf "%s\n" "delete $prefix/symref $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'missing old value blocks indirect no-deref update' '
@@ -427,7 +447,9 @@ test_expect_success 'missing old value blocks indirect no-deref update' '
 	EOF
 	printf "%s\n" "option no-deref" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect no-deref update' '
@@ -439,7 +461,9 @@ test_expect_success 'incorrect old value blocks indirect no-deref update' '
 	EOF
 	printf "%s\n" "option no-deref" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks indirect no-deref create' '
@@ -451,7 +475,9 @@ test_expect_success 'existing old value blocks indirect no-deref create' '
 	EOF
 	printf "%s\n" "option no-deref" "create $prefix/symref $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect no-deref delete' '
@@ -463,10 +489,12 @@ test_expect_success 'incorrect old value blocks indirect no-deref delete' '
 	EOF
 	printf "%s\n" "option no-deref" "delete $prefix/symref $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
-test_expect_success 'non-empty directory blocks create' '
+test_expect_success REFFILES 'non-empty directory blocks create' '
 	prefix=refs/ne-create &&
 	mkdir -p .git/$prefix/foo/bar &&
 	: >.git/$prefix/foo/bar/baz.lock &&
@@ -485,7 +513,7 @@ test_expect_success 'non-empty directory blocks create' '
 	test_cmp expected output.err
 '
 
-test_expect_success 'broken reference blocks create' '
+test_expect_success REFFILES 'broken reference blocks create' '
 	prefix=refs/broken-create &&
 	mkdir -p .git/$prefix &&
 	echo "gobbledigook" >.git/$prefix/foo &&
@@ -504,7 +532,7 @@ test_expect_success 'broken reference blocks create' '
 	test_cmp expected output.err
 '
 
-test_expect_success 'non-empty directory blocks indirect create' '
+test_expect_success REFFILES 'non-empty directory blocks indirect create' '
 	prefix=refs/ne-indirect-create &&
 	git symbolic-ref $prefix/symref $prefix/foo &&
 	mkdir -p .git/$prefix/foo/bar &&
@@ -524,7 +552,7 @@ test_expect_success 'non-empty directory blocks indirect create' '
 	test_cmp expected output.err
 '
 
-test_expect_success 'broken reference blocks indirect create' '
+test_expect_success REFFILES 'broken reference blocks indirect create' '
 	prefix=refs/broken-indirect-create &&
 	git symbolic-ref $prefix/symref $prefix/foo &&
 	echo "gobbledigook" >.git/$prefix/foo &&
@@ -543,7 +571,7 @@ test_expect_success 'broken reference blocks indirect create' '
 	test_cmp expected output.err
 '
 
-test_expect_success 'no bogus intermediate values during delete' '
+test_expect_success REFFILES 'no bogus intermediate values during delete' '
 	prefix=refs/slow-transaction &&
 	# Set up a reference with differing loose and packed versions:
 	git update-ref $prefix/foo $C &&
@@ -600,7 +628,7 @@ test_expect_success 'no bogus intermediate values during delete' '
 	test_must_fail git rev-parse --verify --quiet $prefix/foo
 '
 
-test_expect_success 'delete fails cleanly if packed-refs file is locked' '
+test_expect_success REFFILES 'delete fails cleanly if packed-refs file is locked' '
 	prefix=refs/locked-packed-refs &&
 	# Set up a reference with differing loose and packed versions:
 	git update-ref $prefix/foo $C &&
@@ -616,7 +644,7 @@ test_expect_success 'delete fails cleanly if packed-refs file is locked' '
 	test_cmp unchanged actual
 '
 
-test_expect_success 'delete fails cleanly if packed-refs.new write fails' '
+test_expect_success REFFILES 'delete fails cleanly if packed-refs.new write fails' '
 	# Setup and expectations are similar to the test above.
 	prefix=refs/failed-packed-refs &&
 	git update-ref $prefix/foo $C &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status
  2021-04-19 11:37             ` [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status Han-Wen Nienhuys via GitGitGadget
@ 2021-04-20 18:47               ` Junio C Hamano
  2021-04-21 10:15                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2021-04-20 18:47 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> Before, the cached ref_iterator would return peel_object() output directly. This
> led to spurious differences in the GIT_TRACE_REFS output, depending on the ref
> storage backend active.
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  refs.c               | 2 +-
>  refs/ref-cache.c     | 2 +-
>  refs/refs-internal.h | 3 +++
>  3 files changed, 5 insertions(+), 2 deletions(-)

A few observations.

 * The ref_iterator_peel() is defined to "Return 0 on success." in
   refs/refs-internal.h and implication of the missing mention of
   what is returned on failure is that it can give any non-zero
   value back, and in this project a failure usually is signaled by
   returning a negative value.

 * If the trace output cares that all non-success trace entries to
   look identical, I wonder if that layer should be the one that
   normalizes "0 is success, any other value is failure" into "0 is
   success, 1 is failure" (even though I find a positive 1 used as a
   failure a bit odd).

 * refs.c::peel_object() is defined to return "enum peel_status"
   which is not even a simple yes/no.

    PEEL_PEELED = 0
    PEEL_INVALID = -1
    PEEL_NON_TAG = -2
    PEEL_IS_SYMREF = -3
    PEEL_BROKEN = -4

   The comment in refs/refs-internal.h about "Reference interators"
   suggests me that ref_iterator_peel() is supposed to be a moral
   equivalent of (recently gone) peel_ref() which was a thin wrapper
   to still surviving peel_object(), so I expect ref_iterator_peel()
   to allow the caller to differentiate various failure modes.  And
   as a way to debug into a running system, I am not sure if losing
   the distinction in the trace output is even desirable.

packed_ref_iterator_peel() does use !!peel_object(), but I have a
feeling that that is what should be fixed and not the other way
around.

I haven't seen a good justification given to help convince me that
this is a good change (and I presume it is, as I trust you or any
other contributores enough not to knowingly make the system worse),

> +/*
> + * Peels the current ref, returning 0 for success.
> + */

And if it does make sense to squash the peel status down to "bool",
then the comment should mention the single acceptable value for
failure, not just "0 for success" which implies "different negative
values depending on the nature of failure".

>  typedef int ref_iterator_peel_fn(struct ref_iterator *ref_iterator,
>  				 struct object_id *peeled);

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument
  2021-04-19 11:37             ` [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument Han-Wen Nienhuys via GitGitGadget
@ 2021-04-20 19:34               ` Junio C Hamano
  2021-04-27 15:21                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2021-04-20 19:34 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  refs/refs-internal.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index 546a6b965dcc..a31c1f465beb 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -592,6 +592,10 @@ typedef int reflog_exists_fn(struct ref_store *ref_store, const char *refname);
>  typedef int create_reflog_fn(struct ref_store *ref_store, const char *refname,
>  			     int force_create, struct strbuf *err);
>  typedef int delete_reflog_fn(struct ref_store *ref_store, const char *refname);
> +
> +/*
> + * `flags` accepts a bitmask of `expire_reflog_flags`.
> + */

OK.  It would have been better to say what expire_reflog_flags is
(i.e. `enum expire_reflog_flags`), though.

>  typedef int reflog_expire_fn(struct ref_store *ref_store,
>  			     const char *refname, const struct object_id *oid,
>  			     unsigned int flags,

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 03/28] refs/debug: trace into reflog expiry too
  2021-04-19 11:37             ` [PATCH v7 03/28] refs/debug: trace into reflog expiry too Han-Wen Nienhuys via GitGitGadget
@ 2021-04-20 19:41               ` Junio C Hamano
  2021-04-22 17:27                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2021-04-20 19:41 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  refs/debug.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 44 insertions(+), 3 deletions(-)

Nicely done.  

I as a reader of this patch do have to wonder, with the above very
limited log message material, how useful did "debug_reflog_expire()"
machinery used to be without any tracing.

It just reported the fact that expire method was called and what the
backend did as a whole, instead of reporting what the machinery
decided for each reflog entry to be pruned (or not pruned)?

Not a problem with this patch at all, and certainly it does not have
to be part of this series, but it feels very backwards, at least to
me, to have the method should_prune in ref backends.  As a function
to make a policy decision (e.g. "this has a timestamp older than X,
so needs to be pruned", "the author of this change I do not like, so
let's prune it ;-)"), it is more natural to have it as independent
as possible from the individual backends, no?

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 04/28] hash.h: provide constants for the hash IDs
  2021-04-19 11:37             ` [PATCH v7 04/28] hash.h: provide constants for the hash IDs Han-Wen Nienhuys via GitGitGadget
@ 2021-04-20 19:49               ` Junio C Hamano
  2021-04-21  1:04                 ` brian m. carlson
  0 siblings, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2021-04-20 19:49 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget, brian m. carlson
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> This will simplify referencing them from code that is not deeply integrated with
> Git, in particular, the reftable library.
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  hash.h        | 6 ++++++
>  object-file.c | 6 ++----
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/hash.h b/hash.h
> index 3fb0c3d4005a..b17fb2927711 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -161,12 +161,18 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
>  	return p - hash_algos;
>  }
>  
> +/* "sha1", big-endian */
> +#define GIT_SHA1_HASH_ID 0x73686131
> +
>  /* The length in bytes and in hex digits of an object name (SHA-1 value). */
>  #define GIT_SHA1_RAWSZ 20
>  #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
>  /* The block size of SHA-1. */
>  #define GIT_SHA1_BLKSZ 64
>  
> +/* "s256", big-endian */
> +#define GIT_SHA256_HASH_ID 0x73323536

I agree that it makes sense to have symbolic constants defined in
this file.  These are values that fit in the ".format_id" member of
"struct git_hash_algo", and it is preferrable to make sure that the
names align (if I were designing in void, I would have called the
member "algo_id" and made the constants GIT_(SHA1|SHA256)_ALGO_ID).

Brian?  What's your preference ("I am fine to store HASH_ID in the
'.format_id' member" is perfectly an acceptable answer).

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-04-19 11:37             ` [PATCH v7 23/28] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
@ 2021-04-20 22:44               ` Junio C Hamano
  2021-04-21 10:19                 ` Han-Wen Nienhuys
  2021-05-04 17:24               ` Andrzej Hunt
  1 sibling, 1 reply; 251+ messages in thread
From: Junio C Hamano @ 2021-04-20 22:44 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +if test -n "$GIT_TEST_REFTABLE"
> +then
> +  test_set_prereq REFTABLE
> +fi

How would this interact with what the test prep series did which was
to unconditionally add

test_seq_prereq REFFILES

around the same place?  Should $GIT_TEST_REFTABLE disable REFFILES?
IOW, do you want the above to read

	if test -n "$GIT_TEST_REFTABLE"
	then
		test_set_prereq REFTABLE
	else
		test_set_prereq REFFILES
	fi

when both series are in effect?

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 04/28] hash.h: provide constants for the hash IDs
  2021-04-20 19:49               ` Junio C Hamano
@ 2021-04-21  1:04                 ` brian m. carlson
  2021-04-21  9:43                   ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: brian m. carlson @ 2021-04-21  1:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 2064 bytes --]

On 2021-04-20 at 19:49:41, Junio C Hamano wrote:
> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > This will simplify referencing them from code that is not deeply integrated with
> > Git, in particular, the reftable library.
> >
> > Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> > ---
> >  hash.h        | 6 ++++++
> >  object-file.c | 6 ++----
> >  2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/hash.h b/hash.h
> > index 3fb0c3d4005a..b17fb2927711 100644
> > --- a/hash.h
> > +++ b/hash.h
> > @@ -161,12 +161,18 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
> >  	return p - hash_algos;
> >  }
> >  
> > +/* "sha1", big-endian */
> > +#define GIT_SHA1_HASH_ID 0x73686131
> > +
> >  /* The length in bytes and in hex digits of an object name (SHA-1 value). */
> >  #define GIT_SHA1_RAWSZ 20
> >  #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
> >  /* The block size of SHA-1. */
> >  #define GIT_SHA1_BLKSZ 64
> >  
> > +/* "s256", big-endian */
> > +#define GIT_SHA256_HASH_ID 0x73323536
> 
> I agree that it makes sense to have symbolic constants defined in
> this file.  These are values that fit in the ".format_id" member of
> "struct git_hash_algo", and it is preferrable to make sure that the
> names align (if I were designing in void, I would have called the
> member "algo_id" and made the constants GIT_(SHA1|SHA256)_ALGO_ID).
> 
> Brian?  What's your preference ("I am fine to store HASH_ID in the
> '.format_id' member" is perfectly an acceptable answer).

I slightly prefer FORMAT_ID because it's consistent (and for that reason
only), but if HASH_ID is more convenient, that's fine; I don't have a
strong opinion at all.  Definitely don't reroll the series because of my
slight preference.  Either way, I think these are fine things to have as
constants, and I appreciate you hoisting the comments here.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 00/28] reftable library
  2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
                               ` (27 preceding siblings ...)
  2021-04-19 11:37             ` [PATCH v7 28/28] t1404: annotate test cases with REFFILES Han-Wen Nienhuys via GitGitGadget
@ 2021-04-21  7:45             ` Ævar Arnfjörð Bjarmason
  2021-04-21  9:52               ` Han-Wen Nienhuys
  28 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-21  7:45 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys g, Han-Wen Nienhuys


On Mon, Apr 19 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> The commits up to "hash.h: provide constants for the hash IDs" should be
> good to merge to 'next'.

With how Junio queues things up perhaps submitting this as another
"prep" series would be good, to reduce future reviewer fatigue,
i.e. anything we can make land on master makes re-rolls that much
smaller.

> There are several test fixups, but I've put them in another series because
> GGG enforces max 30 commits.

I left a bunch of comments on the test prep series now. Probably good to
have it split up regardless of GGG limits.

Re the comments I left on the test series. I'm very happy to see the
start of addressing the "it must be tested" concerns I had in
https://lore.kernel.org/git/87wnt2th1v.fsf@evledraar.gmail.com/

But running the test suite with GIT_TEST_REFTABLE=true still yields a
wall of failed tests. I tested with a merger of this and
pr-git-1008/hanwen/reffiles-v1, i.e. what's on "seen" currently.

I don't see the point of having re-rolls of this topic while the test
changes topic it's based on hasn't finished
marking/splitting/refactoring the various tests that do and don't depend
on the file backend.

At least when I review it I'm just left with going in circles digging
into one of those failing test, figuring out if it's really
refs/files-backend.c specific or not etc., and as long as we can't turn
on GIT_TEST_REFTABLE=true in CI as part of this series I don't see a
path to making it advance to next->master.

> Due to using uncompress2, this build fails on the Linux32 builder; what is
> the magic incantation to run autoconf?

Doesn't this just need a NO_UNCOMPRESS2 for Linux32 in ci/lib.sh? See
the lines just below that for e.g. NO_REGEX for another CI target.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 06/28] reftable: add LICENSE
  2021-04-19 11:37             ` [PATCH v7 06/28] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
@ 2021-04-21  7:48               ` Ævar Arnfjörð Bjarmason
  2021-04-21  9:15                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-21  7:48 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget
  Cc: git, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Felipe Contreras, Derrick Stolee,
	Han-Wen Nienhuys, Han-Wen Nienhuys


On Mon, Apr 19 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> TODO: relicense?

So, still no resolution to the questions in v6 in [1]?

1. https://lore.kernel.org/git/87o8ei2tfo.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 06/28] reftable: add LICENSE
  2021-04-21  7:48               ` Ævar Arnfjörð Bjarmason
@ 2021-04-21  9:15                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-21  9:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 9:48 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Apr 19 2021, Han-Wen Nienhuys via GitGitGadget wrote:
>
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > TODO: relicense?
>
> So, still no resolution to the questions in v6 in [1]?

I reworded the commit message, but that update was lost in one of the
copious rebase operations. I recovered it from reflog, and changed it.
Hopefully it survives this again.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 04/28] hash.h: provide constants for the hash IDs
  2021-04-21  1:04                 ` brian m. carlson
@ 2021-04-21  9:43                   ` Han-Wen Nienhuys
  2021-07-22  8:31                     ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-21  9:43 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano,
	Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 3:04 AM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> > I agree that it makes sense to have symbolic constants defined in
> > this file.  These are values that fit in the ".format_id" member of
> > "struct git_hash_algo", and it is preferrable to make sure that the
> > names align (if I were designing in void, I would have called the
> > member "algo_id" and made the constants GIT_(SHA1|SHA256)_ALGO_ID).
> >
> > Brian?  What's your preference ("I am fine to store HASH_ID in the
> > '.format_id' member" is perfectly an acceptable answer).
>
> I slightly prefer FORMAT_ID because it's consistent (and for that reason
> only), but if HASH_ID is more convenient, that's fine; I don't have a
> strong opinion at all.  Definitely don't reroll the series because of my
> slight preference.  Either way, I think these are fine things to have as
> constants, and I appreciate you hoisting the comments here.

Either way is fine for me. I can paint the bikeshed in your favorite color :)

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 00/28] reftable library
  2021-04-21  7:45             ` [PATCH v7 00/28] reftable library Ævar Arnfjörð Bjarmason
@ 2021-04-21  9:52               ` Han-Wen Nienhuys
  2021-04-21 11:21                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-21  9:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 9:45 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> > The commits up to "hash.h: provide constants for the hash IDs" should be
> > good to merge to 'next'.
>
> With how Junio queues things up perhaps submitting this as another
> "prep" series would be good, to reduce future reviewer fatigue,
> i.e. anything we can make land on master makes re-rolls that much
> smaller.

will do.

> > There are several test fixups, but I've put them in another series because
> > GGG enforces max 30 commits.
>
> I left a bunch of comments on the test prep series now. Probably good to
> have it split up regardless of GGG limits.
>
> Re the comments I left on the test series. I'm very happy to see the
> start of addressing the "it must be tested" concerns I had in
> https://lore.kernel.org/git/87wnt2th1v.fsf@evledraar.gmail.com/

It may look like the start, but I've been improving the number of
tests that pass continuously since I posted the first version of this
code over a year ago.

> I don't see the point of having re-rolls of this topic while the test
> changes topic it's based on hasn't finished
> marking/splitting/refactoring the various tests that do and don't depend
> on the file backend.
>
> At least when I review it I'm just left with going in circles digging
> into one of those failing test, figuring out if it's really
> refs/files-backend.c specific or not etc., and as long as we can't turn
> on GIT_TEST_REFTABLE=true in CI as part of this series I don't see a
> path to making it advance to next->master.

The point of posting updates is to garner feedback, especially from
people familiar with the Git code itself. If you would like this
effort to move forward faster, I am most in need of review for the
part that glues the library together with the git code itself (ie. the
commit introducing refs/reftable-backend.c).

> > Due to using uncompress2, this build fails on the Linux32 builder; what is
> > the magic incantation to run autoconf?
>
> Doesn't this just need a NO_UNCOMPRESS2 for Linux32 in ci/lib.sh? See
> the lines just below that for e.g. NO_REGEX for another CI target.

Thanks!

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status
  2021-04-20 18:47               ` Junio C Hamano
@ 2021-04-21 10:15                 ` Han-Wen Nienhuys
  2021-04-21 23:28                   ` Junio C Hamano
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-21 10:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 20, 2021 at 8:47 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Han-Wen Nienhuys <hanwen@google.com>
> >
> > Before, the cached ref_iterator would return peel_object() output directly. This
> > led to spurious differences in the GIT_TRACE_REFS output, depending on the ref
> > storage backend active.
> >
> > Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> > ---
> >  refs.c               | 2 +-
> >  refs/ref-cache.c     | 2 +-
> >  refs/refs-internal.h | 3 +++
> >  3 files changed, 5 insertions(+), 2 deletions(-)
>
> A few observations.
>
>  * The ref_iterator_peel() is defined to "Return 0 on success." in
>    refs/refs-internal.h and implication of the missing mention of
>    what is returned on failure is that it can give any non-zero
>    value back, and in this project a failure usually is signaled by
>    returning a negative value.
>
>  * If the trace output cares that all non-success trace entries to
>    look identical, I wonder if that layer should be the one that
>    normalizes "0 is success, any other value is failure" into "0 is
>    success, 1 is failure" (even though I find a positive 1 used as a
>    failure a bit odd).

No, it should not.  The whole point of the trace output is to see
where and how the files backend and the reftable backend return
different results. By having the trace layer paper over differences, I
will miss regressions of the reftable backend.

>  * refs.c::peel_object() is defined to return "enum peel_status"
>    which is not even a simple yes/no.
>
>     PEEL_PEELED = 0
>     PEEL_INVALID = -1
>     PEEL_NON_TAG = -2
>     PEEL_IS_SYMREF = -3
>     PEEL_BROKEN = -4
>
>    The comment in refs/refs-internal.h about "Reference interators"
>    suggests me that ref_iterator_peel() is supposed to be a moral
>    equivalent of (recently gone) peel_ref() which was a thin wrapper
>    to still surviving peel_object(), so I expect ref_iterator_peel()
>    to allow the caller to differentiate various failure modes.  And
>    as a way to debug into a running system, I am not sure if losing
>    the distinction in the trace output is even desirable.
>
> packed_ref_iterator_peel() does use !!peel_object(), but I have a
> feeling that that is what should be fixed and not the other way
> around.

All these functions go into ref_iterator_peel, which is only called
from peel_iterated_oid().

That function calls !!peel_object(base, peeled) if there is no
iterator. That leads me to conclude

All callers of peel_iterated_oid use it as a boolean exclusively, i.e.

  if (!peel_iterated_oid( .. )) { .. }

Aside from these considerations, it is also odd to return peel_status.
For example, PEEL_IS_SYMREF is unnecessary, because the symref status
is already returned from iterator_next().

> I haven't seen a good justification given to help convince me that
> this is a good change (and I presume it is, as I trust you or any
> other contributores enough not to knowingly make the system worse),
>
> > +/*
> > + * Peels the current ref, returning 0 for success.
> > + */
>
> And if it does make sense to squash the peel status down to "bool",
> then the comment should mention the single acceptable value for
> failure, not just "0 for success" which implies "different negative
> values depending on the nature of failure".

I can make it return -1 instead.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-04-20 22:44               ` Junio C Hamano
@ 2021-04-21 10:19                 ` Han-Wen Nienhuys
  2021-04-21 23:22                   ` Junio C Hamano
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-21 10:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 12:44 AM Junio C Hamano <gitster@pobox.com> wrote:
> > +if test -n "$GIT_TEST_REFTABLE"
> > +then
> > +  test_set_prereq REFTABLE
> > +fi
>
> How would this interact with what the test prep series did which was
> to unconditionally add
>
> test_seq_prereq REFFILES
>
> around the same place?  Should $GIT_TEST_REFTABLE disable REFFILES?
> IOW, do you want the above to read
>
>         if test -n "$GIT_TEST_REFTABLE"
>         then
>                 test_set_prereq REFTABLE
>         else
>                 test_set_prereq REFFILES
>         fi
>
> when both series are in effect?

Yes, but on 2nd thought it's probably better to stick with just
REFFILES, and rewrite any reftable specific tests as !REFFILES.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 00/28] reftable library
  2021-04-21  9:52               ` Han-Wen Nienhuys
@ 2021-04-21 11:21                 ` Ævar Arnfjörð Bjarmason
  2021-04-26 17:59                   ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-21 11:21 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys


On Wed, Apr 21 2021, Han-Wen Nienhuys wrote:

> On Wed, Apr 21, 2021 at 9:45 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> > The commits up to "hash.h: provide constants for the hash IDs" should be
>> > good to merge to 'next'.
>>
>> With how Junio queues things up perhaps submitting this as another
>> "prep" series would be good, to reduce future reviewer fatigue,
>> i.e. anything we can make land on master makes re-rolls that much
>> smaller.
>
> will do.
>
>> > There are several test fixups, but I've put them in another series because
>> > GGG enforces max 30 commits.
>>
>> I left a bunch of comments on the test prep series now. Probably good to
>> have it split up regardless of GGG limits.
>>
>> Re the comments I left on the test series. I'm very happy to see the
>> start of addressing the "it must be tested" concerns I had in
>> https://lore.kernel.org/git/87wnt2th1v.fsf@evledraar.gmail.com/
>
> It may look like the start, but I've been improving the number of
> tests that pass continuously since I posted the first version of this
> code over a year ago.

FWIW I meant or meant to say something closer to "a start at the
numerous failures I noted in the v6 discussion", not "no work has been
done at all on this front". Sorry.

>> I don't see the point of having re-rolls of this topic while the test
>> changes topic it's based on hasn't finished
>> marking/splitting/refactoring the various tests that do and don't depend
>> on the file backend.
>>
>> At least when I review it I'm just left with going in circles digging
>> into one of those failing test, figuring out if it's really
>> refs/files-backend.c specific or not etc., and as long as we can't turn
>> on GIT_TEST_REFTABLE=true in CI as part of this series I don't see a
>> path to making it advance to next->master.
>
> The point of posting updates is to garner feedback, especially from
> people familiar with the Git code itself.

So you agree that we should make the tests pass first, then shouldn't
these me marked as RFC/PATCH?

I for one would find that a lot less confusing, "PATCH" means "the
author considers this ready to land on master sans undiscovered bugs
etc.

> If you would like this effort to move forward faster,

Yes, I'm keen to help move it forward.

I am saying that I think for me or anyone else to do that in any
sensible way the path forward is to make the tests pass with
GIT_TEST_REFTABLE=true.

IOW a large part of the feedback you're looking for is already part of
the codebase. Nobody can keep all of it in their head, but we've encoded
all the tricky edge cases we could think of in the tests.

So in particular per my feedback on the test series: It's only when we
start digging into those tests that we discover the interesting bits,
i.e. how things like .git/SOME_FILE behaves per[1], and my comments
about reflog behavior in [2].

Anyway, I think I'm just repeating myself at this point, but part of the
reason is that I don't think I've gotten a straight answer about the
point-by-point questions I had in [3].

I.e. reaching some consensus on things like whether
GIT_TEST_REFTABLE=true passing under CI being a hard-prereq for this
series or not is surely one of the first things to sketch out before
figuring out how to move this forward.

Also, with how unhappy Junio is with my patch-dumping I'm probably the
last person to give advice about managing mailing list attention, but
still: here's an attempt:

In v6 I noted that t5510-fetch.sh had a segfault[4], you said you'd
check it out, and reading your cover letter nothing stood out about
that, so I assumed it was sorted out somehow.

But running it now yields a BUG() instead:
    
    BUG: refs.c:1038: free called on a prepared reference transaction
    Aborted
    [...]
    not ok 18 - fetch --atomic aborts all reference updates if hook aborts

And trying the whole test suite with --verbose-log yields:
    
    $ grep -e 'Segmentation fault'  -e BUG: test-results/*
    test-results/t0210-trace2-normal.out:BUG: t/helper/test-trace2.c:206: the bug message
    test-results/t1400-update-ref.out:BUG: refs.c:1038: free called on a prepared reference transaction
    test-results/t1400-update-ref.out:BUG: refs.c:1038: free called on a prepared reference transaction
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t5510-fetch.out:BUG: refs.c:1038: free called on a prepared reference transaction
    test-results/t5510-fetch.out:BUG: refs.c:1038: free called on a prepared reference transaction
    test-results/t5600-clone-fail-cleanup.out:Segmentation fault
    test-results/t5600-clone-fail-cleanup.out:Segmentation fault
    test-results/t5600-clone-fail-cleanup.out:Segmentation fault
    test-results/t5601-clone.out:Segmentation fault
    test-results/t6423-merge-rename-directories.out:                test_i18ngrep ! BUG: err &&

Not all of that is yours, FWIW "seen" is currently doing, and the "diff"
failures are ac14de13b2 (t4058: explore duplicate tree entry handling in
a bit more detail, 2020-12-11).
    
    $ grep -e 'Segmentation fault'  -e BUG: test-results/*
    test-results/t0210-trace2-normal.out:BUG: t/helper/test-trace2.c:206: the bug message
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation pack_refs requires abilities 0x6, but only have 0x5
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation create_symref requires abilities 0x2, but only have 0x5
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation delete_refs requires abilities 0x2, but only have 0x5
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation rename_ref requires abilities 0x2, but only have 0x5
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation delete_reflog requires abilities 0x2, but only have 0x5
    test-results/t1406-submodule-ref-store.out:BUG: refs/files-backend.c:139: operation create_reflog requires abilities 0x2, but only have 0x5
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t4058-diff-duplicates.out:Segmentation fault
    test-results/t6423-merge-rename-directories.out:                test_i18ngrep ! BUG: err &&

Anyway, the "clone"/"fetch"/"update-ref" failures look new with
GIT_TEST_REFTABLE=true.

So to close up the attempt at getting feedback: I think we'd probably be
better off still discussing "I tried this to fix the segfault, but now I
get this BUG" rather than a 30 patch re-roll.

>  I am most in need of review for the part that glues the library
> together with the git code itself (ie. the commit introducing
> refs/reftable-backend.c).

In other and briefer words I don't think this series is in need of
review at this point, it's in need of *addressing* review, where review
is both the ghosts from the past of failing tests, and outstanding
segfaults/BUG()s being hit.

1. https://lore.kernel.org/git/87k0ow2n29.fsf@evledraar.gmail.com/ 
2. https://lore.kernel.org/git/87pmyo3zvw.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/87wnt2th1v.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/87im4qejpk.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-04-21 10:19                 ` Han-Wen Nienhuys
@ 2021-04-21 23:22                   ` Junio C Hamano
  0 siblings, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-04-21 23:22 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

>> IOW, do you want the above to read
>>
>>         if test -n "$GIT_TEST_REFTABLE"
>>         then
>>                 test_set_prereq REFTABLE
>>         else
>>                 test_set_prereq REFFILES
>>         fi
>>
>> when both series are in effect?
>
> Yes, but on 2nd thought it's probably better to stick with just
> REFFILES, and rewrite any reftable specific tests as !REFFILES.

I guess that is both good enough and simpler.  The only possible
downside is that it would be cumbersome if we ever need to support
the third ref backend, but I expect that it won't happen before the
reftable support solidifies, and under that assumption, the simpler
the better.

Thanks.

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status
  2021-04-21 10:15                 ` Han-Wen Nienhuys
@ 2021-04-21 23:28                   ` Junio C Hamano
  0 siblings, 0 replies; 251+ messages in thread
From: Junio C Hamano @ 2021-04-21 23:28 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

> All callers of peel_iterated_oid use it as a boolean exclusively, i.e.
>
>   if (!peel_iterated_oid( .. )) { .. }
>
> Aside from these considerations, it is also odd to return peel_status.
> For example, PEEL_IS_SYMREF is unnecessary, because the symref status
> is already returned from iterator_next().
>
>> I haven't seen a good justification given to help convince me that
>> this is a good change (and I presume it is, as I trust you or any
>> other contributores enough not to knowingly make the system worse),

I am not sure I agree 100% with

	if (!do_this()) {
		...
	}

i.e. "try doing it and do one thing upon success" is a sign that all
error conditions will ever be treated equally and justifies to squash
all the error codes the current code tries to return, but whether I
agree with it or not, I'd want to see it recorded in the proposed
log message.  It would help future developers and allow them to take
into account your motivation behind this change, when they need to
update the implementation.

>> > +/*
>> > + * Peels the current ref, returning 0 for success.
>> > + */
>>
>> And if it does make sense to squash the peel status down to "bool",
>> then the comment should mention the single acceptable value for
>> failure, not just "0 for success" which implies "different negative
>> values depending on the nature of failure".
>
> I can make it return -1 instead.

I do care more about _documenting_ it here, not just ending the
sentence with ", returning 0 for success", it should also say
"returns X upon failure", whether the value of X is 1 or -1 (and
obviously our codebase prefers -1 for an error, but that is
secondary).

Thanks.

	

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 03/28] refs/debug: trace into reflog expiry too
  2021-04-20 19:41               ` Junio C Hamano
@ 2021-04-22 17:27                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-22 17:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 20, 2021 at 9:42 PM Junio C Hamano <gitster@pobox.com> wrote:
> Nicely done.
>
> I as a reader of this patch do have to wonder, with the above very
> limited log message material, how useful did "debug_reflog_expire()"
> machinery used to be without any tracing.

It wasn't; that's why I'm changing it :-)

I noticed a test failure with reftable, and adding this function
showed me I was expiring the entries in the reverse order, so that

  git reflog delete branch@{1}

would remove one but oldest entry in the reflog.

> Not a problem with this patch at all, and certainly it does not have
> to be part of this series, but it feels very backwards, at least to
> me, to have the method should_prune in ref backends.  As a function
> to make a policy decision (e.g. "this has a timestamp older than X,
> so needs to be pruned", "the author of this change I do not like, so
> let's prune it ;-)"), it is more natural to have it as independent
> as possible from the individual backends, no?

the should_prune function is passed in into the ref backend as an
argument of type reflog_expiry_should_prune_fn to the
refs_reflog_expire() function.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
  2021-01-21 16:14           ` Han-Wen Nienhuys
@ 2021-04-23 10:22           ` Han-Wen Nienhuys
  2021-04-26 13:23             ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-23 10:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys

On Thu, Jan 21, 2021 at 4:55 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I've been chasing down some edge cases in refs.c as part of another WIP
> series I have and found that for this particular "errno" stuff we don't
> have any test coverage. And with master & hanwen/reftable if I apply:
>
>     -                       errno = EINVAL;
..
> All tests pass on both.
..
> That I can remove error checking/handling from this place means existing
> general logic was faithfully copied without checking that we have a test
> for it, or adding one.

I think that is an unfair characterization. The API documentation of
read_raw_ref_fn says that implementations should return EINVAL for
certain cases, so that's what I did. The point of having a documented
internal API is that one doesn't have to double check what is behind
the API.

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-04-23 10:22           ` Han-Wen Nienhuys
@ 2021-04-26 13:23             ` Ævar Arnfjörð Bjarmason
  2021-04-26 16:17               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-26 13:23 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys


On Fri, Apr 23 2021, Han-Wen Nienhuys wrote:

> On Thu, Jan 21, 2021 at 4:55 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> I've been chasing down some edge cases in refs.c as part of another WIP
>> series I have and found that for this particular "errno" stuff we don't
>> have any test coverage. And with master & hanwen/reftable if I apply:
>>
>>     -                       errno = EINVAL;
> ..
>> All tests pass on both.
> ..
>> That I can remove error checking/handling from this place means existing
>> general logic was faithfully copied without checking that we have a test
>> for it, or adding one.
>
> I think that is an unfair characterization. The API documentation of
> read_raw_ref_fn says that implementations should return EINVAL for
> certain cases, so that's what I did. The point of having a documented
> internal API is that one doesn't have to double check what is behind
> the API.

I'm only meaning to point out that the behavior being relied on here
doesn't have any tests.

I agree that whether or not we should have any tests at all is a matter
of opinion and circumstance. I don't think in general that someone using
some random internal API needs to be checking what is and isn't tested
in that API.

I do think in this case that it's worth digging a bit further. The APIs
in question are using EINVAL and EISDIR to pass up errors that are had a
1:1 mapping into the FS historically.

Are we really better off faking those up, or is some of that perhaps not
used at all? Maybe we'd find that this level of abstraction isn't what
we need, and it could actually be much simpler.

My main goal here is not to keep creating some infinite amount of work
for you, but to try to help this along into some shape that's better
reviewable & land-able.

It seems to me that a good way to get there is to seek some systematic
ways of focusing review onto various edge cases of this series. I.e. to
begin with having GIT_TEST_REFTABLE pass as noted elsewhere, and in this
case calling attention to some of the underlying assumptions behind this
series.

One of the hardest things I've found about trying to review this has
been closing the gap between things that exist in your mind and commit
messages / code.

E.g. in this case sure, you can be of the opinion that since the point
of an internal API is to provide an interface we can simply assume it's
OK and working as intended.

I don't think given the history and use of this code that that's an
assumption I'd be as comfortable making, but anyway. Getting back on
point, I do think whatever opinion anyone is of on that subject having
something like this in the commit message (and other applicable commits
etc.) would be *very* valuable:

    In functions such as git_reftable_read_raw_ref() (and ????, are
    there more?) we're diligently emulating the documented behavior and
    return values of the refs file backend. According to "make gcov" we
    can see we don't have coverage for this, in particular the behavior
    of EINVAL etc.

I.e. per [1] once if and when we have GIT_TEST_REFTABLE passing surely
one of the best way to garner feedback on the rest is to discover those
parts (using "make gcov", after running with/without
GIT_TEST_REFTABLE=[true|false]) where we still have outstanding blind
spots between the tests and code.

1. https://lore.kernel.org/git/87lf9b3mth.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-04-26 13:23             ` Ævar Arnfjörð Bjarmason
@ 2021-04-26 16:17               ` Han-Wen Nienhuys
  2021-04-28 16:32                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-26 16:17 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys

On Mon, Apr 26, 2021 at 4:28 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I agree that whether or not we should have any tests at all is a matter
> of opinion and circumstance. I don't think in general that someone using
> some random internal API needs to be checking what is and isn't tested
> in that API.
>
> I do think in this case that it's worth digging a bit further. The APIs
> in question are using EINVAL and EISDIR to pass up errors that are had a
> 1:1 mapping into the FS historically.
>
> Are we really better off faking those up, or is some of that perhaps not
> used at all? Maybe we'd find that this level of abstraction isn't what
> we need, and it could actually be much simpler.

I think none of these error functions are being used at all, and I've
made a start at removing them in https://github.com/git/git/pull/1012
(see also my discussion with peff.)

> It seems to me that a good way to get there is to seek some systematic
> ways of focusing review onto various edge cases of this series. I.e. to
> begin with having GIT_TEST_REFTABLE pass as noted elsewhere, and in this
> case calling attention to some of the underlying assumptions behind this
> series.

> One of the hardest things I've found about trying to review this has
> been closing the gap between things that exist in your mind and commit
> messages / code.

thanks, that is valuable feedback.

> something like this in the commit message (and other applicable commits
> etc.) would be *very* valuable:
>
>     In functions such as git_reftable_read_raw_ref() (and ????, are
>     there more?) we're diligently emulating the documented behavior and
>     return values of the refs file backend. According to "make gcov" we
>     can see we don't have coverage for this, in particular the behavior
>     of EINVAL etc.

I haven't done this, because a lot of these considerations are
transient, and I'd rather not spend a lot of time documenting what I
don't know.

> I.e. per [1] once if and when we have GIT_TEST_REFTABLE passing surely
> one of the best way to garner feedback on the rest is to discover those
> parts (using "make gcov", after running with/without
> GIT_TEST_REFTABLE=[true|false]) where we still have outstanding blind
> spots between the tests and code.

Getting GIT_TEST_REFTABLE=1 passing is the hard part, because it means
having  to understand exactly how the current code is supposed to
work.  Once I get to that point (with knowledge being complete and
tests passing), it will be easy to document what is happening and why.

I was hoping that by posting these series with known test failures,
and questions marked "XXX" in reftable-backend.c, I would get feedback
from the other people who know exactly how this part of the code
works.  But from your mail, I get the sense that nobody understands
how the whole picture fits together?

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 00/28] reftable library
  2021-04-21 11:21                 ` Ævar Arnfjörð Bjarmason
@ 2021-04-26 17:59                   ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-26 17:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 1:21 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> >> I don't see the point of having re-rolls of this topic while the test
> >> changes topic it's based on hasn't finished
> >> marking/splitting/refactoring the various tests that do and don't depend
> >> on the file backend.
> >>
> >> At least when I review it I'm just left with going in circles digging
> >> into one of those failing test, figuring out if it's really
> >> refs/files-backend.c specific or not etc., and as long as we can't turn
> >> on GIT_TEST_REFTABLE=true in CI as part of this series I don't see a
> >> path to making it advance to next->master.
> >
> > The point of posting updates is to garner feedback, especially from
> > people familiar with the Git code itself.
>
> So you agree that we should make the tests pass first, then shouldn't
> these me marked as RFC/PATCH?

I added RFC to the patch introducing reftable-backend.c

I consider the code under reftable/*.c to be of production quality,
even if it's stylistically different from the git code.

> IOW a large part of the feedback you're looking for is already part of
> the codebase. Nobody can keep all of it in their head, but we've encoded
> all the tricky edge cases we could think of in the tests.

That's disappointing. I know that I can just fix test failures until
there are none left, but it's not a terribly efficient method if there
is someone who knows how this works.  Is there no other way to solicit
feedback?

> I.e. reaching some consensus on things like whether
> GIT_TEST_REFTABLE=true passing under CI being a hard-prereq for this
> series or not is surely one of the first things to sketch out before
> figuring out how to move this forward.

I've had conflicting feedback about this, but it's ultimately not my
call to make. Jonathan Nieder suggested that this was asking too much,
and it would be OK to have a credible starting point where the effort
to address remaining test failures could be shared across multiple
people. Jun also expressed he didn't want to put all the work on my
shoulders, but so far I've not received much help.

> In v6 I noted that t5510-fetch.sh had a segfault[4], you said you'd
> check it out, and reading your cover letter nothing stood out about
> that, so I assumed it was sorted out somehow.

I fixed a number of segfaults, so it's quite likely (I did see the
freed transaction BUG).

One of the problems of working with a large pending series like this,
is that it is a project in itself, and keeping track of which change
fixes which bug is something I really need version control for, but
the frequent rebasing/squashing erases this kind of information.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument
  2021-04-20 19:34               ` Junio C Hamano
@ 2021-04-27 15:21                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-27 15:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, Apr 20, 2021 at 9:34 PM Junio C Hamano <gitster@pobox.com> wrote:> > +/*
> > + * `flags` accepts a bitmask of `expire_reflog_flags`.
> > + */
>
> OK.  It would have been better to say what expire_reflog_flags is
> (i.e. `enum expire_reflog_flags`), though.

Done, in the last version.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-04-26 16:17               ` Han-Wen Nienhuys
@ 2021-04-28 16:32                 ` Ævar Arnfjörð Bjarmason
  2021-04-28 17:40                   ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-28 16:32 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys


On Mon, Apr 26 2021, Han-Wen Nienhuys wrote:

> On Mon, Apr 26, 2021 at 4:28 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> I agree that whether or not we should have any tests at all is a matter
>> of opinion and circumstance. I don't think in general that someone using
>> some random internal API needs to be checking what is and isn't tested
>> in that API.
>>
>> I do think in this case that it's worth digging a bit further. The APIs
>> in question are using EINVAL and EISDIR to pass up errors that are had a
>> 1:1 mapping into the FS historically.
>>
>> Are we really better off faking those up, or is some of that perhaps not
>> used at all? Maybe we'd find that this level of abstraction isn't what
>> we need, and it could actually be much simpler.
>
> I think none of these error functions are being used at all, and I've
> made a start at removing them in https://github.com/git/git/pull/1012
> (see also my discussion with peff.)
>
>> It seems to me that a good way to get there is to seek some systematic
>> ways of focusing review onto various edge cases of this series. I.e. to
>> begin with having GIT_TEST_REFTABLE pass as noted elsewhere, and in this
>> case calling attention to some of the underlying assumptions behind this
>> series.
>
>> One of the hardest things I've found about trying to review this has
>> been closing the gap between things that exist in your mind and commit
>> messages / code.
>
> thanks, that is valuable feedback.
>
>> something like this in the commit message (and other applicable commits
>> etc.) would be *very* valuable:
>>
>>     In functions such as git_reftable_read_raw_ref() (and ????, are
>>     there more?) we're diligently emulating the documented behavior and
>>     return values of the refs file backend. According to "make gcov" we
>>     can see we don't have coverage for this, in particular the behavior
>>     of EINVAL etc.
>
> I haven't done this, because a lot of these considerations are
> transient, and I'd rather not spend a lot of time documenting what I
> don't know.
>
>> I.e. per [1] once if and when we have GIT_TEST_REFTABLE passing surely
>> one of the best way to garner feedback on the rest is to discover those
>> parts (using "make gcov", after running with/without
>> GIT_TEST_REFTABLE=[true|false]) where we still have outstanding blind
>> spots between the tests and code.
>
> Getting GIT_TEST_REFTABLE=1 passing is the hard part, because it means
> having  to understand exactly how the current code is supposed to
> work.  Once I get to that point (with knowledge being complete and
> tests passing), it will be easy to document what is happening and why.
>
> I was hoping that by posting these series with known test failures,
> and questions marked "XXX" in reftable-backend.c, I would get feedback
> from the other people who know exactly how this part of the code
> works.  But from your mail, I get the sense that nobody understands
> how the whole picture fits together?

Almost definitely not. I don't know about you but when I'm looking at
code I wrote 6 months ago handling some special case it takes me a while
to get up to speed on just knowing what I knew then, and when we're
talking about something like refs.c ...

On the topic of the way forward: I for one would very much be for a plan
where step 0 is to just a series import the reftable code you have
as-is. I.e. we'd include it as an imported external library, maybe have
some light test-tool integration and compile it / run its own tests, but
not have/advertise the "git init" etc. integration yet.

I.e. my opinion on GIT_TEST_REFTABLE=1 needing to pass is implicit (but
I now realize I haven't explicitly said this) on that happening before
the tree is in a state where we'd code/doc-wise be in a state of
shipping this to users.

Whereas just importing the library != that, and I think we'd be in much
better shape if we had it in-tree and would incrementally work on
integration from that point, v.s. having more re-rolls of
mostly-the-same big codebase being re-submitted.

I'm sure there's some things that'll need to change as we start the
test/integration work, e.g. the reflog topic that's been discussed. But
that's surely better done as some patches on top of the already-landed
library import at this point v.s. trying to get the library perfect
before getting it in-tree.

Maybe Junio disagrees, just my 0.02....

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v4 13/15] Reftable support for git-core
  2021-04-28 16:32                 ` Ævar Arnfjörð Bjarmason
@ 2021-04-28 17:40                   ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-04-28 17:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Felipe Contreras, Han-Wen Nienhuys

On Wed, Apr 28, 2021 at 6:40 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> >> I.e. per [1] once if and when we have GIT_TEST_REFTABLE passing surely
> >> one of the best way to garner feedback on the rest is to discover those
> >> parts (using "make gcov", after running with/without
> >> GIT_TEST_REFTABLE=[true|false]) where we still have outstanding blind
> >> spots between the tests and code.
> >
> > Getting GIT_TEST_REFTABLE=1 passing is the hard part, because it means
> > having  to understand exactly how the current code is supposed to
> > work.  Once I get to that point (with knowledge being complete and
> > tests passing), it will be easy to document what is happening and why.
> >
> > I was hoping that by posting these series with known test failures,
> > and questions marked "XXX" in reftable-backend.c, I would get feedback
> > from the other people who know exactly how this part of the code
> > works.  But from your mail, I get the sense that nobody understands
> > how the whole picture fits together?
>
> Almost definitely not. I don't know about you but when I'm looking at
> code I wrote 6 months ago handling some special case it takes me a while
> to get up to speed on just knowing what I knew then, and when we're
> talking about something like refs.c ...

:-(

on the bright side, once this is in, the expectations will be much
more explicit, because there are two backends that have to work in
roughly the same way.

> On the topic of the way forward: I for one would very much be for a plan
> where step 0 is to just a series import the reftable code you have
> as-is. I.e. we'd include it as an imported external library, maybe have
> some light test-tool integration and compile it / run its own tests, but
> not have/advertise the "git init" etc. integration yet.

I would really like that too, and I support this way forward. The bulk
of the work and problems are in the refs/reftable-backend.c and
assorted incompatibilities across the code base. The library itself
seems pretty solid at this point.

> I'm sure there's some things that'll need to change as we start the
> test/integration work, e.g. the reflog topic that's been discussed. But
> that's surely better done as some patches on top of the already-landed
> library import at this point v.s. trying to get the library perfect
> before getting it in-tree.
>
> Maybe Junio disagrees, just my 0.02....

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type.
  2021-04-19 11:37             ` [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
@ 2021-05-04 17:23               ` Andrzej Hunt
  2021-05-18 13:12                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Andrzej Hunt @ 2021-05-04 17:23 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget, git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys



On 19/04/2021 13:37, Han-Wen Nienhuys via GitGitGadget wrote:
> From: Han-Wen Nienhuys <hanwen@google.com>
[...snip...]
> diff --git a/reftable/record.c b/reftable/record.c
> new file mode 100644
> index 000000000000..b111852b667e
> --- /dev/null
> +++ b/reftable/record.c[...snip...> +static void reftable_obj_record_copy_from(void *rec, const 
void *src_rec,
> +					  int hash_size)
> +{
> +	struct reftable_obj_record *obj = rec;
> +	const struct reftable_obj_record *src =
> +		(const struct reftable_obj_record *)src_rec;
> +	int olen;
> +
> +	reftable_obj_record_release(obj);
> +	*obj = *src;
> +	obj->hash_prefix = reftable_malloc(obj->hash_prefix_len);
> +	memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len);
> +
> +	olen = obj->offset_len * sizeof(uint64_t);
> +	obj->offsets = reftable_malloc(olen);
> +	memcpy(obj->offsets, src->offsets, olen);
> +}

If src->offset_len is 0 then offsets will be NULL - and passing NULL 
into memcpy() results in undefined behaviour. I think we should either 
add an "if (src->offset_len)" check around the memcpy, or perhaps switch 
to COPY_ARRAY() which performs that check for us. (We can probably also 
skip the malloc and hence also olen calculation in this scenario though, 
because obj->offsets should already be NULL if src->offsets was NULL?)

Specifically, if I run t0032 as such:

   make CC=clang-11 UBSAN_OPTIONS=print_stacktrace=1 SANITIZE=undefined 
COPTS="-Og -g" GIT_TEST_OPTS=-v T=t0032-reftable-unittest.sh test

UBSAN reports the following error:

reftable/record.c:475:23: runtime error: null pointer passed as argument 
2, which is declared to never be null
/usr/include/string.h:43:28: note: nonnull attribute specified here
     #0 0x698e52 in reftable_obj_record_copy_from record.c:475:2
     #1 0x6ac806 in test_copy record_test.c:21:2
     #2 0x6abb73 in test_reftable_obj_record_roundtrip record_test.c:334:3
     #3 0x6abb73 in record_test_main record_test.c:403:2
     #4 0x437f08 in cmd__reftable test-reftable.c:10:2
     #5 0x427e74 in cmd_main test-tool.c:126:11
     #6 0x428043 in main common-main.c:52:11
     #7 0x7f3cd1044349 in __libc_start_main (/lib64/libc.so.6+0x24349)
     #8 0x407209 in _start start.S:120

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
reftable/record.c:475:23 in

[...snip...]
> +static int reftable_obj_record_decode(void *rec, struct strbuf key,
> +				      uint8_t val_type, struct string_view in,
> +				      int hash_size)
> +{
> +	struct string_view start = in;
> +	struct reftable_obj_record *r = rec;
> +	uint64_t count = val_type;
> +	int n = 0;
> +	uint64_t last;
> +	int j;
> +	r->hash_prefix = reftable_malloc(key.len);
> +	memcpy(r->hash_prefix, key.buf, key.len);
> +	r->hash_prefix_len = key.len;
> +
> +	if (val_type == 0) {
> +		n = get_var_int(&count, &in);
> +		if (n < 0) {
> +			return n;
> +		}
> +
> +		string_view_consume(&in, n);
> +	}
> +
> +	r->offsets = NULL;
> +	r->offset_len = 0;
> +	if (count == 0)
> +		return start.len - in.len;

And it looks like creating a reftable_obj_record with offsets = NULL and 
offset_len = 0 is considered valid based on the code above.

[... snipped the rest ...]

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-04-19 11:37             ` [PATCH v7 23/28] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
  2021-04-20 22:44               ` Junio C Hamano
@ 2021-05-04 17:24               ` Andrzej Hunt
  2021-05-18 13:18                 ` Han-Wen Nienhuys
  1 sibling, 1 reply; 251+ messages in thread
From: Andrzej Hunt @ 2021-05-04 17:24 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget, git
  Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
	Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
	Patrick Steinhardt, Ævar Arnfjörð Bjarmason,
	Felipe Contreras, Derrick Stolee, Han-Wen Nienhuys



On 19/04/2021 13:37, Han-Wen Nienhuys via GitGitGadget wrote:
> From: Han-Wen Nienhuys <hanwen@google.com>
[...snip...]> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> new file mode 100644
> index 000000000000..55d053e5ca65
> --- /dev/null
> +++ b/refs/reftable-backend.c
[...snip...]
> +static int write_transaction_table(struct reftable_writer *writer, void *arg)
> +{
> +	struct ref_transaction *transaction = (struct ref_transaction *)arg;
> +	struct git_reftable_ref_store *refs =
> +		(struct git_reftable_ref_store *)transaction->ref_store;
> +	struct reftable_stack *stack =
> +		stack_for(refs, transaction->updates[0]->refname);
> +	uint64_t ts = reftable_stack_next_update_index(stack);
> +	int err = 0;
> +	int i = 0;
> +	struct reftable_log_record *logs =
> +		calloc(transaction->nr, sizeof(*logs));
> +	struct ref_update **sorted =
> +		malloc(transaction->nr * sizeof(struct ref_update *));
> +	struct reftable_merged_table *mt = reftable_stack_merged_table(stack);
> +	struct reftable_table tab = {NULL};
> +	struct reftable_ref_record ref = {NULL};
> +	reftable_table_from_merged_table(&tab, mt);
> +	COPY_ARRAY(sorted, transaction->updates, transaction->nr);
> +	QSORT(sorted, transaction->nr, ref_update_cmp);
> +	reftable_writer_set_limits(writer, ts, ts);
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		struct ref_update *u = sorted[i];
> +		struct reftable_log_record *log = &logs[i];
> +		struct object_id old_id;
> +		fill_reftable_log_record(log);
> +		log->update_index = ts;
> +		log->value_type = REFTABLE_LOG_UPDATE;
> +		log->refname = (char *)u->refname;
> +		log->update.new_hash = u->new_oid.hash;
> +		log->update.message = u->msg;
> +
> +		err = reftable_table_read_ref(&tab, u->refname, &ref);
> +		if (err < 0)
> +			goto done;
> +		else if (err > 0) {
> +			old_id = null_oid;
> +		} else {
> +			oidread(&old_id, ref.value.val1);
> +		}

This seems to assume that 'ref.value_type == REFTABLE_REF_VAL1' - but do 
we expect to have to handle the other types 
(REFTABLE_REF_VAL2/REFTABLE_REF_SYMREF)? When I run tests in seen 
against ASAN I see the following errors in t0031, which suggests we're 
running this code against REFTABLE_REF_SYMREF too - but I don't know if 
that means that this code should be able to handle the other ref types 
or if there's a bug higher up the stack. (AFAIUI REFTABLE_REF_DELETION 
is already handled because reftable_table_read_ref() already returns 1 
for deletion, but the other cases seem valid?)

ASAN error output:

==25352==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x603000003353 at pc 0x000000499d17 bp 0x7fff0ea0a210 sp 0x7fff0ea099d8
READ of size 32 at 0x603000003353 thread T0
     #0 0x499d16 in __asan_memcpy 
../projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3
     #1 0x97ed0a in oidread hash.h:292:2
     #2 0x97ed0a in write_transaction_table refs/reftable-backend.c:548:4
     #3 0xb5537a in reftable_addition_add reftable/stack.c:650:8
     #4 0x97b024 in git_reftable_transaction_finish 
refs/reftable-backend.c:618:9
     #5 0x95f9bf in ref_transaction_commit refs.c:2218:8
     #6 0x9d3aa6 in update_head_with_reflog sequencer.c:1138:6
     #7 0x52ffbf in cmd_commit builtin/commit.c:1814:6
     #8 0x4ce8fe in run_builtin git.c:461:11
     #9 0x4ccbc8 in handle_builtin git.c:718:3
     #10 0x4cb0cc in run_argv git.c:785:4
     #11 0x4cb0cc in cmd_main git.c:916:19
     #12 0x6beded in main common-main.c:52:11
     #13 0x7f415c762349 in __libc_start_main (/lib64/libc.so.6+0x24349)
     #14 0x420769 in _start ../sysdeps/x86_64/start.S:120

0x603000003353 is located 0 bytes to the right of 19-byte region 
[0x603000003340,0x603000003353)
allocated by thread T0 here:
     #0 0x4868b4 in strdup 
../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
     #1 0xaa08e8 in xstrdup wrapper.c:29:14
     #2 0xb4b280 in reftable_ref_record_copy_from reftable/record.c:229:23
     #3 0xb46754 in merged_iter_next_entry reftable/merged.c:132:2
     #4 0xb46754 in merged_iter_next reftable/merged.c:141:13
     #5 0xb46754 in merged_iter_next_void reftable/merged.c:157:9
     #6 0xb500ae in iterator_next reftable/generic.c:147:9
     #7 0xb500ae in reftable_iterator_next_ref reftable/generic.c:134:9
     #8 0xb500ae in reftable_table_read_ref reftable/generic.c:46:8
     #9 0x97ec67 in write_transaction_table refs/reftable-backend.c:542:9
     #10 0xb5537a in reftable_addition_add reftable/stack.c:650:8
     #11 0x97b024 in git_reftable_transaction_finish 
refs/reftable-backend.c:618:9
     #12 0x95f9bf in ref_transaction_commit refs.c:2218:8
     #13 0x9d3aa6 in update_head_with_reflog sequencer.c:1138:6
     #14 0x52ffbf in cmd_commit builtin/commit.c:1814:6
     #15 0x4ce8fe in run_builtin git.c:461:11
     #16 0x4ccbc8 in handle_builtin git.c:718:3
     #17 0x4cb0cc in run_argv git.c:785:4
     #18 0x4cb0cc in cmd_main git.c:916:19
     #19 0x6beded in main common-main.c:52:11
     #20 0x7f415c762349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: heap-buffer-overflow 
../projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3 in 
__asan_memcpy


Produced using:

   make CC=clang-11 SANITIZE=address COPTS="-Og -g" GIT_TEST_OPTS=-v 
T=t0031-reftable.sh test


> +
> +		/* XXX fold together with the old_id check below? */
> +
> +		log->update.old_hash = old_id.hash;
> +		if (u->flags & REF_LOG_ONLY) {
> +			continue;
> +		}
> +
> +		if (u->flags & REF_HAVE_NEW) {
> +			struct reftable_ref_record ref = { NULL };
> +			struct object_id peeled;
> +
> +			int peel_error = peel_object(&u->new_oid, &peeled);
> +			ref.refname = (char *)u->refname;
> +			ref.update_index = ts;
> +
> +			if (!peel_error) {
> +				ref.value_type = REFTABLE_REF_VAL2;
> +				ref.value.val2.target_value = peeled.hash;
> +				ref.value.val2.value = u->new_oid.hash;
> +			} else if (!is_null_oid(&u->new_oid)) {
> +				ref.value_type = REFTABLE_REF_VAL1;
> +				ref.value.val1 = u->new_oid.hash;
> +			}
> +
> +			err = reftable_writer_add_ref(writer, &ref);
> +			if (err < 0) {
> +				goto done;
> +			}
> +		}
> +	}
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		err = reftable_writer_add_log(writer, &logs[i]);
> +		clear_reftable_log_record(&logs[i]);
> +		if (err < 0) {
> +			goto done;
> +		}
> +	}
> +
> +done:
> +	assert(err != REFTABLE_API_ERROR);
> +	reftable_ref_record_release(&ref);
> +	free(logs);
> +	free(sorted);
> +	return err;
> +}
> +

[...snip...]

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type.
  2021-05-04 17:23               ` Andrzej Hunt
@ 2021-05-18 13:12                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-05-18 13:12 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, May 4, 2021 at 7:23 PM Andrzej Hunt <andrzej@ahunt.org> wrote:
> If src->offset_len is 0 then offsets will be NULL - and passing NULL
> into memcpy() results in undefined behaviour. I think we should either
> add an "if (src->offset_len)" check around the memcpy, or perhaps switch
> to COPY_ARRAY() which performs that check for us. (We can probably also
> skip the malloc and hence also olen calculation in this scenario though,
> because obj->offsets should already be NULL if src->offsets was NULL?)

Thanks for the detailed report. I've fixed this for the next round.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-05-04 17:24               ` Andrzej Hunt
@ 2021-05-18 13:18                 ` Han-Wen Nienhuys
  2021-05-18 13:30                   ` Han-Wen Nienhuys
  0 siblings, 1 reply; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-05-18 13:18 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, May 4, 2021 at 7:24 PM Andrzej Hunt <andrzej@ahunt.org> wrote:
> > +             } else {
> > +                     oidread(&old_id, ref.value.val1);
> > +             }
>
> This seems to assume that 'ref.value_type == REFTABLE_REF_VAL1' - but do
> we expect to have to handle the other types
> (REFTABLE_REF_VAL2/REFTABLE_REF_SYMREF)? When I run tests in seen
> against ASAN I see the following errors in t0031, which suggests we're
> running this code against REFTABLE_REF_SYMREF too - but I don't know if
> that means that this code should be able to handle the other ref types
> or if there's a bug higher up the stack. (AFAIUI REFTABLE_REF_DELETION
> is already handled because reftable_table_read_ref() already returns 1
> for deletion, but the other cases seem valid?)

Curious.  I didn't know it was supported to create symrefs in a
transaction, but I've added an assert to make sure this doesn't
trigger any sanitizers.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 23/28] Reftable support for git-core
  2021-05-18 13:18                 ` Han-Wen Nienhuys
@ 2021-05-18 13:30                   ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-05-18 13:30 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Han-Wen Nienhuys via GitGitGadget, git, Jeff King, Ramsay Jones,
	Jonathan Nieder, Johannes Schindelin, Jonathan Tan,
	Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Tue, May 18, 2021 at 3:18 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> > This seems to assume that 'ref.value_type == REFTABLE_REF_VAL1' - but do
> > we expect to have to handle the other types
> > (REFTABLE_REF_VAL2/REFTABLE_REF_SYMREF)? When I run tests in seen
> > against ASAN I see the following errors in t0031, which suggests we're
> > running this code against REFTABLE_REF_SYMREF too - but I don't know if
> > that means that this code should be able to handle the other ref types
> > or if there's a bug higher up the stack. (AFAIUI REFTABLE_REF_DELETION
> > is already handled because reftable_table_read_ref() already returns 1
> > for deletion, but the other cases seem valid?)
>
> Curious.  I didn't know it was supported to create symrefs in a
> transaction, but I've added an assert to make sure this doesn't
> trigger any sanitizers.

This happens on repo initialization, where we create HEAD and its
referents at the same time.

I've fixed this provisionally for now in my series.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

* Re: [PATCH v7 04/28] hash.h: provide constants for the hash IDs
  2021-04-21  9:43                   ` Han-Wen Nienhuys
@ 2021-07-22  8:31                     ` Han-Wen Nienhuys
  0 siblings, 0 replies; 251+ messages in thread
From: Han-Wen Nienhuys @ 2021-07-22  8:31 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano,
	Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys,
	Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin,
	Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason, Felipe Contreras,
	Derrick Stolee, Han-Wen Nienhuys

On Wed, Apr 21, 2021 at 11:43 AM Han-Wen Nienhuys <hanwen@google.com> wrote:
>
> On Wed, Apr 21, 2021 at 3:04 AM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > > I agree that it makes sense to have symbolic constants defined in
> > > this file.  These are values that fit in the ".format_id" member of
> > > "struct git_hash_algo", and it is preferrable to make sure that the
> > > names align (if I were designing in void, I would have called the
> > > member "algo_id" and made the constants GIT_(SHA1|SHA256)_ALGO_ID).
> > >
> > > Brian?  What's your preference ("I am fine to store HASH_ID in the
> > > '.format_id' member" is perfectly an acceptable answer).
> >
> > I slightly prefer FORMAT_ID because it's consistent (and for that reason
> > only), but if HASH_ID is more convenient, that's fine; I don't have a
> > strong opinion at all.  Definitely don't reroll the series because of my
> > slight preference.  Either way, I think these are fine things to have as
> > constants, and I appreciate you hoisting the comments here.
>
> Either way is fine for me. I can paint the bikeshed in your favorite color :)

This is now the first commit of the reftable series.  I think it could
graduate separately.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 251+ messages in thread

end of thread, other threads:[~2021-07-22  8:31 UTC | newest]

Thread overview: 251+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-16 19:10 [PATCH 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
2020-09-16 19:10 ` [PATCH 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2020-09-20  1:00   ` Junio C Hamano
2020-09-21 13:13     ` Han-Wen Nienhuys
2020-09-24  7:21       ` Jeff King
2020-09-24  7:31         ` Jeff King
2020-09-24 17:22           ` Junio C Hamano
2020-09-16 19:10 ` [PATCH 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
2020-09-16 19:10 ` [PATCH 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:10 ` [PATCH v2 00/13] reftable library Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:10   ` [PATCH v2 01/13] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2020-10-02  3:18     ` Jonathan Nieder
2020-10-01 16:10   ` [PATCH v2 02/13] reftable: define the public API Han-Wen Nienhuys via GitGitGadget
2020-10-02  3:58     ` Jonathan Nieder
2020-10-09 21:13       ` Emily Shaffer
2020-10-10 17:03         ` Han-Wen Nienhuys
2020-11-30 14:44         ` Han-Wen Nienhuys
2020-10-10 13:43       ` Han-Wen Nienhuys
2020-10-12 16:57         ` Jonathan Nieder
2020-11-30 14:55           ` Han-Wen Nienhuys
2020-10-08  1:41     ` Jonathan Tan
2020-10-10 16:57       ` Han-Wen Nienhuys
2020-10-01 16:10   ` [PATCH v2 03/13] vcxproj: adjust for the reftable changes Johannes Schindelin via GitGitGadget
2020-10-02  4:02     ` Jonathan Nieder
2020-10-02 11:43       ` Johannes Schindelin
2020-10-01 16:10   ` [PATCH v2 04/13] reftable: add a barebones unittest framework Han-Wen Nienhuys via GitGitGadget
2020-10-02  4:05     ` Jonathan Nieder
2020-10-08  1:45     ` Jonathan Tan
2020-10-08 22:31       ` Josh Steadmon
2020-10-01 16:10   ` [PATCH v2 05/13] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2020-10-02  4:12     ` Jonathan Nieder
2020-10-10 17:32       ` Han-Wen Nienhuys
2020-10-12 15:25         ` Jonathan Nieder
2020-10-12 17:05           ` Patrick Steinhardt
2020-10-12 17:45             ` Jonathan Nieder
2020-10-13 12:12             ` Johannes Schindelin
2020-10-13 15:47               ` Junio C Hamano
2020-10-15 11:46                 ` Johannes Schindelin
2020-10-15 16:23                   ` Junio C Hamano
2020-10-15 19:39                     ` Johannes Schindelin
2020-10-16  9:15                     ` Patrick Steinhardt
2020-10-02 14:01     ` Johannes Schindelin
2020-10-02 20:47       ` Junio C Hamano
2020-10-03  8:07         ` Johannes Schindelin
2020-10-08  1:48     ` Jonathan Tan
2020-10-10 17:28       ` Han-Wen Nienhuys
2020-10-11 10:52         ` Johannes Schindelin
2020-10-12 15:19           ` Jonathan Nieder
2020-10-12 18:44             ` Johannes Schindelin
2020-10-12 19:41               ` Jonathan Nieder
2020-10-12 20:27                 ` Johannes Schindelin
2020-10-12 16:42           ` Junio C Hamano
2020-10-12 19:01             ` Johannes Schindelin
2020-10-23  9:13         ` Ævar Arnfjörð Bjarmason
2020-10-23 17:36           ` Junio C Hamano
2020-10-01 16:10   ` [PATCH v2 06/13] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2020-10-01 19:23     ` Junio C Hamano
2020-10-01 19:59       ` Ramsay Jones
2020-10-01 16:10   ` [PATCH v2 07/13] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:10   ` [PATCH v2 08/13] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:10   ` [PATCH v2 09/13] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:11   ` [PATCH v2 10/13] reftable: read " Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:11   ` [PATCH v2 11/13] reftable: file level tests Han-Wen Nienhuys via GitGitGadget
2020-10-01 16:11   ` [PATCH v2 12/13] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
2020-10-02 13:57     ` Johannes Schindelin
2020-10-02 17:08       ` Junio C Hamano
2020-10-04 18:39         ` Johannes Schindelin
2020-10-01 16:11   ` [PATCH v2 13/13] reftable: "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42   ` [PATCH v3 00/16] reftable library Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 01/16] move sleep_millisec to git-compat-util.h Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 02/16] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
2020-11-27 10:22       ` Ævar Arnfjörð Bjarmason
2020-11-26 19:42     ` [PATCH v3 03/16] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2020-11-27 10:23       ` Ævar Arnfjörð Bjarmason
2020-11-30 11:26         ` Han-Wen Nienhuys
2020-11-30 20:25           ` Han-Wen Nienhuys
2020-11-30 21:21             ` Felipe Contreras
2020-12-01  9:51               ` Han-Wen Nienhuys
2020-12-01 10:38                 ` Felipe Contreras
2020-12-01 11:45                   ` Ævar Arnfjörð Bjarmason
2020-12-01 13:34                     ` Han-Wen Nienhuys
2020-12-01 23:13                       ` Felipe Contreras
2020-12-01 23:03                     ` Felipe Contreras
2020-11-26 19:42     ` [PATCH v3 04/16] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
2020-11-27  9:13       ` Felipe Contreras
2020-11-27 10:25       ` Ævar Arnfjörð Bjarmason
2020-11-30 11:27         ` Han-Wen Nienhuys
2020-11-26 19:42     ` [PATCH v3 05/16] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2020-11-27 10:18       ` Felipe Contreras
2020-11-27 10:33       ` Ævar Arnfjörð Bjarmason
2020-11-26 19:42     ` [PATCH v3 06/16] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 07/16] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 08/16] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 09/16] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 10/16] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 11/16] reftable: read " Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 12/16] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 13/16] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 14/16] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
2020-11-27 10:59       ` Ævar Arnfjörð Bjarmason
2020-11-26 19:42     ` [PATCH v3 15/16] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
2020-11-26 19:42     ` [PATCH v3 16/16] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00     ` [PATCH v4 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
2021-01-21 15:55         ` Ævar Arnfjörð Bjarmason
2021-01-21 16:14           ` Han-Wen Nienhuys
2021-01-21 16:21             ` Han-Wen Nienhuys
2021-01-26 13:44               ` Ævar Arnfjörð Bjarmason
2021-04-23 10:22           ` Han-Wen Nienhuys
2021-04-26 13:23             ` Ævar Arnfjörð Bjarmason
2021-04-26 16:17               ` Han-Wen Nienhuys
2021-04-28 16:32                 ` Ævar Arnfjörð Bjarmason
2021-04-28 17:40                   ` Han-Wen Nienhuys
2021-02-22  0:41         ` [PATCH] refs: introduce API function to write invalid null ref Stefan Beller
2021-02-22  1:20           ` Eric Sunshine
2021-02-22  3:09             ` Eric Sunshine
2021-02-22 18:38           ` Han-Wen Nienhuys
2020-12-09 14:00       ` [PATCH v4 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
2020-12-09 14:00       ` [PATCH v4 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19       ` [PATCH v5 00/15] reftable library Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 01/15] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 02/15] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 03/15] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 04/15] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 05/15] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 06/15] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 07/15] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 08/15] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 09/15] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 10/15] reftable: read " Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 11/15] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 12/15] reftable: rest of library Han-Wen Nienhuys via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 13/15] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
2021-03-23 11:40           ` Derrick Stolee
2021-03-23 12:20             ` Ævar Arnfjörð Bjarmason
2021-03-23 20:14               ` Junio C Hamano
2021-03-23 20:12             ` Junio C Hamano
2021-03-12 20:19         ` [PATCH v5 14/15] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
2021-03-12 20:19         ` [PATCH v5 15/15] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25         ` [PATCH v6 00/20] reftable library Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 01/20] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 02/20] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2021-04-13  7:28             ` Ævar Arnfjörð Bjarmason
2021-04-13 10:50               ` Han-Wen Nienhuys
2021-04-13 13:41                 ` Ævar Arnfjörð Bjarmason
2021-04-12 19:25           ` [PATCH v6 03/20] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 04/20] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2021-04-13  8:02             ` Ævar Arnfjörð Bjarmason
2021-04-13 10:58               ` Han-Wen Nienhuys
2021-04-13 12:56                 ` Ævar Arnfjörð Bjarmason
2021-04-13 13:14             ` Ævar Arnfjörð Bjarmason
2021-04-15 15:00               ` Han-Wen Nienhuys
2021-04-12 19:25           ` [PATCH v6 05/20] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 06/20] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 07/20] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2021-04-12 21:40             ` Junio C Hamano
2021-04-13  8:19             ` Ævar Arnfjörð Bjarmason
2021-04-15  8:57               ` Han-Wen Nienhuys
2021-04-12 19:25           ` [PATCH v6 08/20] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 09/20] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 10/20] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 11/20] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 12/20] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 13/20] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 14/20] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 15/20] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 16/20] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 17/20] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 18/20] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
2021-04-13  7:18             ` Ævar Arnfjörð Bjarmason
2021-04-14 16:44               ` Han-Wen Nienhuys
2021-04-16 14:55                 ` Ævar Arnfjörð Bjarmason
2021-04-16 18:47                 ` Junio C Hamano
2021-04-12 19:25           ` [PATCH v6 19/20] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
2021-04-12 19:25           ` [PATCH v6 20/20] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37           ` [PATCH v7 00/28] reftable library Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 01/28] refs: ref_iterator_peel returns boolean, rather than peel_status Han-Wen Nienhuys via GitGitGadget
2021-04-20 18:47               ` Junio C Hamano
2021-04-21 10:15                 ` Han-Wen Nienhuys
2021-04-21 23:28                   ` Junio C Hamano
2021-04-19 11:37             ` [PATCH v7 02/28] refs: document reflog_expire_fn's flag argument Han-Wen Nienhuys via GitGitGadget
2021-04-20 19:34               ` Junio C Hamano
2021-04-27 15:21                 ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 03/28] refs/debug: trace into reflog expiry too Han-Wen Nienhuys via GitGitGadget
2021-04-20 19:41               ` Junio C Hamano
2021-04-22 17:27                 ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 04/28] hash.h: provide constants for the hash IDs Han-Wen Nienhuys via GitGitGadget
2021-04-20 19:49               ` Junio C Hamano
2021-04-21  1:04                 ` brian m. carlson
2021-04-21  9:43                   ` Han-Wen Nienhuys
2021-07-22  8:31                     ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 05/28] init-db: set the_repository->hash_algo early on Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 06/28] reftable: add LICENSE Han-Wen Nienhuys via GitGitGadget
2021-04-21  7:48               ` Ævar Arnfjörð Bjarmason
2021-04-21  9:15                 ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 07/28] reftable: add error related functionality Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 08/28] reftable: utility functions Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 09/28] reftable: add blocksource, an abstraction for random access reads Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 10/28] reftable: (de)serialization for the polymorphic record type Han-Wen Nienhuys via GitGitGadget
2021-05-04 17:23               ` Andrzej Hunt
2021-05-18 13:12                 ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 11/28] Provide zlib's uncompress2 from compat/zlib-compat.c Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 12/28] reftable: reading/writing blocks Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 13/28] reftable: a generic binary tree implementation Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 14/28] reftable: write reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 15/28] reftable: generic interface to tables Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 16/28] reftable: read reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 17/28] reftable: reftable file level tests Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 18/28] reftable: add a heap-based priority queue for reftable records Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 19/28] reftable: add merged table view Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 20/28] reftable: implement refname validation Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 21/28] reftable: implement stack, a mutable database of reftable files Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 22/28] reftable: add dump utility Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 23/28] Reftable support for git-core Han-Wen Nienhuys via GitGitGadget
2021-04-20 22:44               ` Junio C Hamano
2021-04-21 10:19                 ` Han-Wen Nienhuys
2021-04-21 23:22                   ` Junio C Hamano
2021-05-04 17:24               ` Andrzej Hunt
2021-05-18 13:18                 ` Han-Wen Nienhuys
2021-05-18 13:30                   ` Han-Wen Nienhuys
2021-04-19 11:37             ` [PATCH v7 24/28] git-prompt: prepare for reftable refs backend SZEDER Gábor via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 25/28] Add "test-tool dump-reftable" command Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 26/28] t1301: document what needs to be done for REFTABLE Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 27/28] t1401,t2011: parameterize HEAD.lock " Han-Wen Nienhuys via GitGitGadget
2021-04-19 11:37             ` [PATCH v7 28/28] t1404: annotate test cases with REFFILES Han-Wen Nienhuys via GitGitGadget
2021-04-21  7:45             ` [PATCH v7 00/28] reftable library Ævar Arnfjörð Bjarmason
2021-04-21  9:52               ` Han-Wen Nienhuys
2021-04-21 11:21                 ` Ævar Arnfjörð Bjarmason
2021-04-26 17:59                   ` Han-Wen Nienhuys

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).