On 2020-02-06 at 22:55:55, Han-Wen Nienhuys via GitGitGadget wrote: > diff --git a/reftable/README.md b/reftable/README.md > new file mode 100644 > index 0000000000..f527da0380 > --- /dev/null > +++ b/reftable/README.md > @@ -0,0 +1,19 @@ > + > +The source code in this directory comes from https://github.com/google/reftable. > + > +The VERSION file keeps track of the current version of the reftable library. > + > +To update the library, do: > + > + ((cd reftable-repo && git fetch origin && git checkout origin/master ) || > + git clone https://github.com/google/reftable reftable-repo) && \ > + cp reftable-repo/c/*.[ch] reftable/ && \ > + cp reftable-repo/LICENSE reftable/ && > + git --git-dir reftable-repo/.git show --no-patch origin/master \ > + > reftable/VERSION && \ > + echo '/* empty */' > reftable/config.h > + rm reftable/*_test.c reftable/test_framework.* > + git add reftable/*.[ch] > + > +Bugfixes should be accompanied by a test and applied to upstream project at > +https://github.com/google/reftable. I think this particular statement may be problematic. The upstream project requires a CLA, which is a non-starter for many folks. I don't think we can expect folks who are working on Git to necessarily send patches upstream with that condition. For example, if I end up needing to patch this series to work with SHA-256, I'm happy for Google to have my changes under the BSD License, but I won't be signing a CLA. > +#define SHA1_SIZE 20 > +#define SHA256_SIZE 32 I'd rather we didn't hard-code these values here. If you need a size suitable for allocation, we have GIT_MAX_RAWSZ and GIT_MAX_HEXSZ. Those are guaranteed to be the right size for any future hash. > + { > + struct merged_table m = { > + .stack = stack, > + .stack_len = n, > + .min = first_min, > + .max = last_max, > + .hash_size = SHA1_SIZE, This is going to cause problems with SHA-256. We'd want to write this as the_hash_algo->rawsz, since it can change at runtime. > +static byte zero[SHA256_SIZE] = {}; It would be better to refer to null_oid here. > +#ifndef REFTABLE_H > +#define REFTABLE_H > + > +#include "system.h" > + > +typedef uint8_t byte; I think we typically prefer writing "unsigned char" or "uint8_t" instead of "byte". > +typedef byte bool; I suspect this is going to cause a whole host of sadness if somebody ever includes stdbool.h, or if that header ever gets included by an OS-specific header, since bool is a #define constant for the built-in type _Bool. We typically use int for booleans, but I'm not opposed to using stdbool.h for it on those systems that we know have it (which, in 2020, is probably all of them). > +struct writer *new_writer(int (*writer_func)(void *, byte *, int), > + void *writer_arg, struct write_options *opts) > +{ > + struct writer *wp = calloc(sizeof(struct writer), 1); > + options_set_defaults(opts); > + if (opts->block_size >= (1 << 24)) { > + /* TODO - error return? */ > + abort(); > + } > + wp->hash_size = SHA1_SIZE; Another SHA-256 problem. Once this series has good test coverage, I'm happy to send a patch that squashes in SHA-256 support if you don't want to implement it yourself and you CC me. If you do want to implement it yourself, you can grab the transition-stage-4 branch from https://github.com/bk2204/git.git, rebase your series on top of it, and run the testsuite with GIT_TEST_DEFAULT_HASH=sha256 to see what breaks. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204