From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Han Xin" <chiyutianyi@gmail.com>,
	"Jiang Xin" <worldhello.net@gmail.com>,
	"René Scharfe" <l.s.r@web.de>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Philip Oakley" <philipoakley@iee.email>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v11 0/8] unpack-objects: support streaming blobs to disk
Date: Sat, 19 Mar 2022 01:23:17 +0100	[thread overview]
Message-ID: <cover-v11-0.8-00000000000-20220319T001411Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-v10-0.6-00000000000-20220204T135538Z-avarab@gmail.com>

This series by Han Xin was waiting on some in-flight patches that
landed in 430883a70c7 (Merge branch 'ab/object-file-api-updates',
2022-03-16).

This series teaches "git unpack-objects" to stream objects larger than
core.bigFileThreshold to disk. As 8/8 shows, streaming e.g. a 100MB
blob now uses ~5MB of memory instead of ~105MB. This streaming method
is slower if you have enough memory to handle the blobs in-core, but
if you don't it lets you unpack the objects at all, where you might
otherwise OOM.
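
The core trick can be sketched outside of git: inflate into a
fixed-size buffer, hash and write each chunk as it is produced, and
only move the file into place once the object ID is known. Below is an
illustrative Python analogue, not git's actual C implementation;
"stream_blob_to_disk" and "read_compressed" are made-up names, and
unlike git it hashes the raw bytes without a "blob <size>" header:

```python
import hashlib
import os
import tempfile
import zlib

CHUNK = 8192  # fixed in-core buffer; independent of the blob's size

def stream_blob_to_disk(read_compressed, out_dir):
    """Inflate a zlib stream into a file under out_dir in CHUNK-sized
    pieces, hashing as we go, so peak memory stays near CHUNK rather
    than the full inflated size."""
    z = zlib.decompressobj()
    h = hashlib.sha1()
    fd, tmp = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "wb") as out:
        while True:
            compressed = read_compressed(CHUNK)
            if not compressed:
                break
            # Limit each inflate step to CHUNK output bytes; leftover
            # compressed input is carried in z.unconsumed_tail.
            data = z.decompress(compressed, CHUNK)
            while data:
                h.update(data)
                out.write(data)
                data = z.decompress(z.unconsumed_tail, CHUNK)
        tail = z.flush()
        h.update(tail)
        out.write(tail)
    oid = h.hexdigest()
    path = os.path.join(out_dir, oid)
    os.rename(tmp, path)  # the final name is only known at the end
    return oid, path
```

Streaming a 100MB blob this way keeps roughly one 8KB chunk in memory
at a time, which is the shape of the ~5MB vs. ~105MB difference
measured in 8/8.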

Changes since v10[1]:

 * Renamed the new test file; its number conflicted with a
   since-landed commit-graph test.

 * Some minor code changes to make diffs to the pre-image smaller
   (e.g. the top of the range-diff below).

 * The whole "find dest.git" check for loose objects is now either a
   test for "do we have any objects at all?" (--dry-run mode), or
   uses the simpler "test_stdout_line_count" helper.

 * We also test that when we use "unpack-objects" against a pack we
   already have, the resulting pack is byte-for-byte the same as the
   source.

 * A new 4/8 that I added allows for more code sharing in
   object-file.c; our two end-state functions now share more logic.

 * Minor typo/grammar/comment etc. fixes throughout.

 * Updated 8/8 with benchmarks; somewhere along the line we lost the
   code to run the benchmark mentioned in the commit message.

1. https://lore.kernel.org/git/cover-v10-0.6-00000000000-20220204T135538Z-avarab@gmail.com/
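
As for 1/8 (the top of the range-diff below): in dry-run mode we only
need to check that the deflated stream is valid before discarding it,
so we can inflate into a small reused buffer instead of allocating the
full declared size up front. A rough Python analogue of that idea,
with illustrative names ("dry_run_inflate_ok" is not a git function):

```python
import zlib

BUFSIZE = 8192  # the same cap 1/8 uses for dry-run mode

def dry_run_inflate_ok(compressed, declared_size):
    """Check that `compressed` inflates to exactly `declared_size`
    bytes while never holding more than BUFSIZE bytes of output,
    discarding each chunk as soon as it is counted."""
    z = zlib.decompressobj()
    total = 0
    data = z.decompress(compressed, BUFSIZE)
    while data:
        total += len(data)  # count, then drop the chunk
        data = z.decompress(z.unconsumed_tail, BUFSIZE)
    total += len(z.flush())
    return z.eof and total == declared_size
```

git's get_data() additionally reports inflate errors as it goes; this
sketch only captures the bounded-buffer aspect.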

Han Xin (4):
  unpack-objects: low memory footprint for get_data() in dry_run mode
  object-file.c: refactor write_loose_object() to several steps
  object-file.c: add "stream_loose_object()" to handle large object
  unpack-objects: use stream_loose_object() to unpack large objects

Ævar Arnfjörð Bjarmason (4):
  object-file.c: do fsync() and close() before post-write die()
  object-file.c: factor out deflate part of write_loose_object()
  core doc: modernize core.bigFileThreshold documentation
  unpack-objects: refactor away unpack_non_delta_entry()

 Documentation/config/core.txt   |  33 +++--
 builtin/unpack-objects.c        | 109 +++++++++++---
 object-file.c                   | 250 +++++++++++++++++++++++++++-----
 object-store.h                  |   8 +
 t/t5351-unpack-large-objects.sh |  61 ++++++++
 5 files changed, 397 insertions(+), 64 deletions(-)
 create mode 100755 t/t5351-unpack-large-objects.sh

Range-diff against v10:
1:  e46eb75b98f ! 1:  2103d5bfd96 unpack-objects: low memory footprint for get_data() in dry_run mode
    @@ builtin/unpack-objects.c: static void use(int bytes)
      {
      	git_zstream stream;
     -	void *buf = xmallocz(size);
    -+	unsigned long bufsize;
    -+	void *buf;
    ++	unsigned long bufsize = dry_run && size > 8192 ? 8192 : size;
    ++	void *buf = xmallocz(bufsize);
      
      	memset(&stream, 0, sizeof(stream));
    -+	if (dry_run && size > 8192)
    -+		bufsize = 8192;
    -+	else
    -+		bufsize = size;
    -+	buf = xmallocz(bufsize);
      
      	stream.next_out = buf;
     -	stream.avail_out = size;
    @@ builtin/unpack-objects.c: static void unpack_delta_entry(enum object_type type,
      		hi = nr;
      		while (lo < hi) {
     
    - ## t/t5328-unpack-large-objects.sh (new) ##
    + ## t/t5351-unpack-large-objects.sh (new) ##
     @@
     +#!/bin/sh
     +#
    @@ t/t5328-unpack-large-objects.sh (new)
     +	git init --bare dest.git
     +}
     +
    -+test_no_loose () {
    -+	test $(find dest.git/objects/?? -type f | wc -l) = 0
    -+}
    -+
     +test_expect_success "create large objects (1.5 MB) and PACK" '
     +	test-tool genrandom foo 1500000 >big-blob &&
     +	test_commit --append foo big-blob &&
     +	test-tool genrandom bar 1500000 >big-blob &&
     +	test_commit --append bar big-blob &&
    -+	PACK=$(echo HEAD | git pack-objects --revs test)
    ++	PACK=$(echo HEAD | git pack-objects --revs pack)
     +'
     +
     +test_expect_success 'set memory limitation to 1MB' '
    @@ t/t5328-unpack-large-objects.sh (new)
     +
     +test_expect_success 'unpack-objects failed under memory limitation' '
     +	prepare_dest &&
    -+	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
    ++	test_must_fail git -C dest.git unpack-objects <pack-$PACK.pack 2>err &&
     +	grep "fatal: attempting to allocate" err
     +'
     +
     +test_expect_success 'unpack-objects works with memory limitation in dry-run mode' '
     +	prepare_dest &&
    -+	git -C dest.git unpack-objects -n <test-$PACK.pack &&
    -+	test_no_loose &&
    ++	git -C dest.git unpack-objects -n <pack-$PACK.pack &&
    ++	test_stdout_line_count = 0 find dest.git/objects -type f &&
     +	test_dir_is_empty dest.git/objects/pack
     +'
     +
2:  48bf9090058 = 2:  6acd8759772 object-file.c: do fsync() and close() before post-write die()
3:  0e33d2a6e35 = 3:  f7b02c307fc object-file.c: refactor write_loose_object() to several steps
-:  ----------- > 4:  20d97cc2605 object-file.c: factor out deflate part of write_loose_object()
4:  9644df5c744 ! 5:  db40f4160c4 object-file.c: add "stream_loose_object()" to handle large object
    @@ Commit message
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## object-file.c ##
     @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +	strbuf_addf(&filename, "%s/", get_object_directory());
     +	hdrlen = format_object_header(hdr, sizeof(hdr), OBJ_BLOB, len);
     +
    -+	/* Common steps for write_loose_object and stream_loose_object to
    -+	 * start writing loose oject:
    ++	/*
    ++	 * Common steps for write_loose_object and stream_loose_object to
    ++	 * start writing loose objects:
     +	 *
     +	 *  - Create tmpfile for the loose object.
     +	 *  - Setup zlib stream for compression.
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +	/* Then the data itself.. */
     +	do {
     +		unsigned char *in0 = stream.next_in;
    ++
     +		if (!stream.avail_in && !in_stream->is_finished) {
     +			const void *in = in_stream->read(in_stream, &stream.avail_in);
     +			stream.next_in = (void *)in;
     +			in0 = (unsigned char *)in;
     +			/* All data has been read. */
     +			if (in_stream->is_finished)
    -+				flush = Z_FINISH;
    ++				flush = 1;
     +		}
    -+		ret = git_deflate(&stream, flush);
    -+		the_hash_algo->update_fn(&c, in0, stream.next_in - in0);
    -+		if (write_buffer(fd, compressed, stream.next_out - compressed) < 0)
    -+			die(_("unable to write loose object file"));
    -+		stream.next_out = compressed;
    -+		stream.avail_out = sizeof(compressed);
    ++		ret = write_loose_object_common(&c, &stream, flush, in0, fd,
    ++						compressed, sizeof(compressed));
     +		/*
     +		 * Unlike write_loose_object(), we do not have the entire
     +		 * buffer. If we get Z_BUF_ERROR due to too few input bytes,
5:  4550f3a2745 = 6:  d8ae2eadb98 core doc: modernize core.bigFileThreshold documentation
-:  ----------- > 7:  2b403e7cd9c unpack-objects: refactor away unpack_non_delta_entry()
6:  6a70e49a346 ! 8:  5eded902496 unpack-objects: use stream_loose_object() to unpack large objects
    @@ Commit message
         malloc() the size of the blob before unpacking it, which could cause
         OOM with very large blobs.
     
    -    We could use this new interface to unpack all blobs, but doing so
    -    would result in a performance penalty of around 10%, as the below
    -    "hyperfine" benchmark will show. We therefore limit this to files
    -    larger than "core.bigFileThreshold":
    -
    -        $ hyperfine \
    -          --setup \
    -          'if ! test -d scalar.git; then git clone --bare
    -           https://github.com/microsoft/scalar.git;
    -           cp scalar.git/objects/pack/*.pack small.pack; fi' \
    -          --prepare 'rm -rf dest.git && git init --bare dest.git' \
    -          ...
    -
    -        Summary
    -          './git -C dest.git -c core.bigFileThreshold=512m
    -          unpack-objects <small.pack' in 'origin/master'
    -            1.01 ± 0.04 times faster than './git -C dest.git
    -                    -c core.bigFileThreshold=512m unpack-objects
    -                    <small.pack' in 'HEAD~1'
    -            1.01 ± 0.04 times faster than './git -C dest.git
    -                    -c core.bigFileThreshold=512m unpack-objects
    -                    <small.pack' in 'HEAD~0'
    -            1.03 ± 0.10 times faster than './git -C dest.git
    -                    -c core.bigFileThreshold=16k unpack-objects
    -                    <small.pack' in 'origin/master'
    -            1.02 ± 0.07 times faster than './git -C dest.git
    -                    -c core.bigFileThreshold=16k unpack-objects
    -                    <small.pack' in 'HEAD~0'
    -            1.10 ± 0.04 times faster than './git -C dest.git
    -                    -c core.bigFileThreshold=16k unpack-objects
    -                    <small.pack' in 'HEAD~1'
    +    We could use the new streaming interface to unpack all blobs, but
    +    doing so would be much slower, as demonstrated e.g. with this
    +    benchmark using git-hyperfine[0]:
    +
    +            rm -rf /tmp/scalar.git &&
    +            git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
    +            mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
    +            git hyperfine \
    +                    -r 2 --warmup 1 \
    +                    -L rev origin/master,HEAD -L v "10,512,1k,1m" \
    +                    -s 'make' \
    +                    -p 'git init --bare dest.git' \
    +                    -c 'rm -rf dest.git' \
    +                    './git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
    +
    +    Here we'll perform worse with lower core.bigFileThreshold settings
    +    with this change in terms of speed, but we're getting lower memory use
    +    in return:
    +
    +            Summary
    +              './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
    +                1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    +                1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    +                1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    +                1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    +                1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    +                1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    +                1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    +
    +    A better benchmark to demonstrate the benefits of this change is
    +    this one, which creates an artificial repo with a 1, 25, 50, 75 and
    +    100MB blob:
    +
    +            rm -rf /tmp/repo &&
    +            git init /tmp/repo &&
    +            (
    +                    cd /tmp/repo &&
    +                    for i in 1 25 50 75 100
    +                    do
    +                            dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
    +                    done &&
    +                    git add blob.* &&
    +                    git commit -mblobs &&
    +                    git gc &&
    +                    PACK=$(echo .git/objects/pack/pack-*.pack) &&
    +                    cp "$PACK" my.pack
    +            ) &&
    +            git hyperfine \
    +                    --show-output \
    +                    -L rev origin/master,HEAD -L v "512,50m,100m" \
    +                    -s 'make' \
    +                    -p 'git init --bare dest.git' \
    +                    -c 'rm -rf dest.git' \
    +                    '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
    +
    +    Using this test we'll always use >100MB of memory on
    +    origin/master (around ~105MB), but max out at e.g. ~55MB if we set
    +    core.bigFileThreshold=50m.
    +
    +    The relevant "Maximum resident set size" lines were manually added
    +    below the relevant benchmark:
    +
    +      '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
    +            Maximum resident set size (kbytes): 107080
    +        1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
    +            Maximum resident set size (kbytes): 106968
    +        1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
    +            Maximum resident set size (kbytes): 107032
    +        1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
    +            Maximum resident set size (kbytes): 107072
    +        1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
    +            Maximum resident set size (kbytes): 55704
    +        2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
    +            Maximum resident set size (kbytes): 4564
    +
    +    This shows that if you have enough memory, this new streaming method
    +    is slower the lower you set the streaming threshold, but in return
    +    memory use is bounded.
     
         An earlier version of this patch introduced a new
         "core.bigFileStreamingThreshold" instead of re-using the existing
    @@ Commit message
         split up "core.bigFileThreshold" in the future if there's a need for
         that.
     
    +    0. https://github.com/avar/git-hyperfine/
         1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
         2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
     
    @@ Commit message
         Helped-by: Derrick Stolee <stolee@gmail.com>
         Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Documentation/config/core.txt ##
     @@ Documentation/config/core.txt: usage, at the slight expense of increased disk usage.
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	return data->buf;
     +}
     +
    -+static void write_stream_blob(unsigned nr, size_t size)
    ++static void stream_blob(unsigned long size, unsigned nr)
     +{
     +	git_zstream zstream = { 0 };
     +	struct input_zstream_data data = { 0 };
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +		.read = feed_input_zstream,
     +		.data = &data,
     +	};
    ++	struct obj_info *info = &obj_list[nr];
     +
     +	data.zstream = &zstream;
     +	git_inflate_init(&zstream);
     +
    -+	if (stream_loose_object(&in_stream, size, &obj_list[nr].oid))
    ++	if (stream_loose_object(&in_stream, size, &info->oid))
     +		die(_("failed to write object in stream"));
     +
     +	if (data.status != Z_STREAM_END)
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	git_inflate_end(&zstream);
     +
     +	if (strict) {
    -+		struct blob *blob =
    -+			lookup_blob(the_repository, &obj_list[nr].oid);
    -+		if (blob)
    -+			blob->object.flags |= FLAG_WRITTEN;
    -+		else
    ++		struct blob *blob = lookup_blob(the_repository, &info->oid);
    ++
    ++		if (!blob)
     +			die(_("invalid blob object from stream"));
    ++		blob->object.flags |= FLAG_WRITTEN;
     +	}
    -+	obj_list[nr].obj = NULL;
    ++	info->obj = NULL;
     +}
     +
    - static void unpack_non_delta_entry(enum object_type type, unsigned long size,
    - 				   unsigned nr)
    + static int resolve_against_held(unsigned nr, const struct object_id *base,
    + 				void *delta_data, unsigned long delta_size)
      {
    --	void *buf = get_data(size);
    -+	void *buf;
    -+
    -+	/* Write large blob in stream without allocating full buffer. */
    -+	if (!dry_run && type == OBJ_BLOB && size > big_file_threshold) {
    -+		write_stream_blob(nr, size);
    -+		return;
    -+	}
    +@@ builtin/unpack-objects.c: static void unpack_one(unsigned nr)
      
    -+	buf = get_data(size);
    - 	if (buf)
    - 		write_object(nr, type, buf, size);
    - }
    + 	switch (type) {
    + 	case OBJ_BLOB:
    ++		if (!dry_run && size > big_file_threshold) {
    ++			stream_blob(size, nr);
    ++			return;
    ++		}
    ++		/* fallthrough */
    + 	case OBJ_COMMIT:
    + 	case OBJ_TREE:
    + 	case OBJ_TAG:
     
    - ## t/t5328-unpack-large-objects.sh ##
    -@@ t/t5328-unpack-large-objects.sh: test_description='git unpack-objects with large objects'
    + ## t/t5351-unpack-large-objects.sh ##
    +@@ t/t5351-unpack-large-objects.sh: test_description='git unpack-objects with large objects'
      
      prepare_dest () {
      	test_when_finished "rm -rf dest.git" &&
     -	git init --bare dest.git
     +	git init --bare dest.git &&
    -+	if test -n "$1"
    -+	then
    -+		git -C dest.git config core.bigFileThreshold $1
    -+	fi
    ++	git -C dest.git config core.bigFileThreshold "$1"
      }
      
    - test_no_loose () {
    -@@ t/t5328-unpack-large-objects.sh: test_expect_success 'set memory limitation to 1MB' '
    + test_expect_success "create large objects (1.5 MB) and PACK" '
    +@@ t/t5351-unpack-large-objects.sh: test_expect_success 'set memory limitation to 1MB' '
      '
      
      test_expect_success 'unpack-objects failed under memory limitation' '
     -	prepare_dest &&
     +	prepare_dest 2m &&
    - 	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
    + 	test_must_fail git -C dest.git unpack-objects <pack-$PACK.pack 2>err &&
      	grep "fatal: attempting to allocate" err
      '
      
      test_expect_success 'unpack-objects works with memory limitation in dry-run mode' '
     -	prepare_dest &&
     +	prepare_dest 2m &&
    - 	git -C dest.git unpack-objects -n <test-$PACK.pack &&
    - 	test_no_loose &&
    + 	git -C dest.git unpack-objects -n <pack-$PACK.pack &&
    + 	test_stdout_line_count = 0 find dest.git/objects -type f &&
      	test_dir_is_empty dest.git/objects/pack
      '
      
     +test_expect_success 'unpack big object in stream' '
     +	prepare_dest 1m &&
    -+	git -C dest.git unpack-objects <test-$PACK.pack &&
    ++	git -C dest.git unpack-objects <pack-$PACK.pack &&
     +	test_dir_is_empty dest.git/objects/pack
     +'
     +
     +test_expect_success 'do not unpack existing large objects' '
     +	prepare_dest 1m &&
    -+	git -C dest.git index-pack --stdin <test-$PACK.pack &&
    -+	git -C dest.git unpack-objects <test-$PACK.pack &&
    -+	test_no_loose
    ++	git -C dest.git index-pack --stdin <pack-$PACK.pack &&
    ++	git -C dest.git unpack-objects <pack-$PACK.pack &&
    ++
    ++	# The destination came up with the exact same pack...
    ++	DEST_PACK=$(echo dest.git/objects/pack/pack-*.pack) &&
    ++	test_cmp pack-$PACK.pack $DEST_PACK &&
    ++
    ++	# ...and wrote no loose objects
    ++	test_stdout_line_count = 0 find dest.git/objects -type f ! -name "pack-*"
     +'
     +
      test_done
-- 
2.35.1.1438.g8874c8eeb35



Thread overview: 211+ messages
2021-10-09  8:20 [PATCH] unpack-objects: unpack large object in stream Han Xin
2021-10-19  7:37 ` Han Xin
2021-10-20 14:42 ` Philip Oakley
2021-10-21  3:42   ` Han Xin
2021-10-21 22:47     ` Philip Oakley
2021-11-03  1:48 ` Han Xin
2021-11-03 10:07   ` Philip Oakley
2021-11-12  9:40 ` [PATCH v2 1/6] object-file: refactor write_loose_object() to support inputstream Han Xin
2021-11-18  4:59   ` Jiang Xin
2021-11-18  6:45     ` Junio C Hamano
2021-11-12  9:40 ` [PATCH v2 2/6] object-file.c: add dry_run mode for write_loose_object() Han Xin
2021-11-18  5:42   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 3/6] object-file.c: handle nil oid in write_loose_object() Han Xin
2021-11-18  5:49   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 4/6] object-file.c: read input stream repeatedly " Han Xin
2021-11-18  5:56   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 5/6] object-store.h: add write_loose_object() Han Xin
2021-11-12  9:40 ` [PATCH v2 6/6] unpack-objects: unpack large object in stream Han Xin
2021-11-18  7:14   ` Jiang Xin
2021-11-22  3:32 ` [PATCH v3 0/5] unpack large objects " Han Xin
2021-11-29  7:01   ` Han Xin
2021-11-29 19:12     ` Jeff King
2021-11-30  2:57       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 " Han Xin
2021-12-07 16:18     ` Derrick Stolee
2021-12-10 10:34     ` [PATCH v5 0/6] unpack large blobs " Han Xin
2021-12-17 11:26       ` Han Xin
2021-12-21 11:51         ` [PATCH v7 0/5] " Han Xin
2021-12-21 11:51         ` [PATCH v7 1/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-21 14:09           ` Ævar Arnfjörð Bjarmason
2021-12-21 14:43             ` René Scharfe
2021-12-21 15:04               ` Ævar Arnfjörð Bjarmason
2021-12-22 11:15               ` Jiang Xin
2021-12-22 11:29             ` Jiang Xin
2021-12-31  3:06           ` Jiang Xin
2021-12-21 11:51         ` [PATCH v7 2/5] object-file API: add a format_object_header() function Han Xin
2021-12-21 14:30           ` René Scharfe
2022-02-01 14:28             ` C99 %z (was: [PATCH v7 2/5] object-file API: add a format_object_header() function) Ævar Arnfjörð Bjarmason
2021-12-31  3:12           ` [PATCH v7 2/5] object-file API: add a format_object_header() function Jiang Xin
2021-12-21 11:51         ` [PATCH v7 3/5] object-file.c: refactor write_loose_object() to reuse in stream version Han Xin
2021-12-21 14:16           ` Ævar Arnfjörð Bjarmason
2021-12-22 12:02             ` Jiang Xin
2021-12-21 11:52         ` [PATCH v7 4/5] object-file.c: add "write_stream_object_file()" to support read in stream Han Xin
2021-12-21 14:20           ` Ævar Arnfjörð Bjarmason
2021-12-21 15:05             ` Ævar Arnfjörð Bjarmason
2021-12-21 11:52         ` [PATCH v7 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-21 15:06           ` Ævar Arnfjörð Bjarmason
2021-12-31  3:19           ` Jiang Xin
2022-01-08  8:54         ` [PATCH v8 0/6] unpack large blobs in stream Han Xin
2022-01-20 11:21           ` [PATCH v9 0/5] " Han Xin
2022-02-01 21:24             ` Ævar Arnfjörð Bjarmason
2022-02-02  8:32               ` Han Xin
2022-02-02 10:59                 ` Ævar Arnfjörð Bjarmason
2022-02-04 14:07             ` [PATCH v10 0/6] unpack-objects: support streaming large objects to disk Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 2/6] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 3/6] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 4/6] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 5/6] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 6/6] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-03-19  0:23               ` Ævar Arnfjörð Bjarmason [this message]
2022-03-19  0:23                 ` [PATCH v11 1/8] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 2/8] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 3/8] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-03-19 10:11                   ` René Scharfe
2022-03-19  0:23                 ` [PATCH v11 4/8] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 5/8] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 6/8] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 7/8] unpack-objects: refactor away unpack_non_delta_entry() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 8/8] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-03-29 13:56                 ` [PATCH v12 0/8] unpack-objects: support streaming blobs to disk Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 1/8] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 2/8] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 3/8] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-03-30  7:13                     ` Han Xin
2022-03-30 17:34                       ` Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 4/8] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 5/8] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-03-31 19:54                     ` Neeraj Singh
2022-03-29 13:56                   ` [PATCH v12 6/8] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 7/8] unpack-objects: refactor away unpack_non_delta_entry() Ævar Arnfjörð Bjarmason
2022-03-30 19:40                     ` René Scharfe
2022-03-31 12:42                       ` Ævar Arnfjörð Bjarmason
2022-03-31 16:38                         ` René Scharfe
2022-03-29 13:56                   ` [PATCH v12 8/8] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-06-04 10:10                   ` [PATCH v13 0/7] unpack-objects: support streaming blobs to disk Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 1/7] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-06-06 18:35                       ` Junio C Hamano
2022-06-09  4:10                         ` Han Xin
2022-06-09 18:27                           ` Junio C Hamano
2022-06-10  1:50                             ` Han Xin
2022-06-10  2:05                               ` Ævar Arnfjörð Bjarmason
2022-06-10 12:04                                 ` Han Xin
2022-06-04 10:10                     ` [PATCH v13 2/7] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-06-06 18:45                       ` Junio C Hamano
2022-06-04 10:10                     ` [PATCH v13 3/7] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 4/7] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 5/7] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-06-06 19:44                       ` Junio C Hamano
2022-06-06 20:02                         ` Junio C Hamano
2022-06-09  6:04                           ` Han Xin
2022-06-09  6:14                         ` Han Xin
2022-06-07 19:53                       ` Neeraj Singh
2022-06-08 15:34                         ` Junio C Hamano
2022-06-09  3:05                         ` [RFC PATCH] object-file.c: batched disk flushes for stream_loose_object() Han Xin
2022-06-09  7:35                           ` Neeraj Singh
2022-06-09  9:30                           ` Johannes Schindelin
2022-06-10 12:55                             ` Han Xin
2022-06-04 10:10                     ` [PATCH v13 6/7] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-06-06 19:50                       ` Junio C Hamano
2022-06-04 10:10                     ` [PATCH v13 7/7] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-06-10 14:46                     ` [PATCH v14 0/7] unpack-objects: support streaming blobs to disk Han Xin
2022-06-10 14:46                       ` [PATCH v14 1/7] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-06-10 14:46                       ` [PATCH v14 2/7] object-file.c: do fsync() and close() before post-write die() Han Xin
2022-06-10 21:10                         ` René Scharfe
2022-06-10 21:33                           ` Junio C Hamano
2022-06-11  1:50                             ` Han Xin
2022-06-10 14:46                       ` [PATCH v14 3/7] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-06-10 14:46                       ` [PATCH v14 4/7] object-file.c: factor out deflate part of write_loose_object() Han Xin
2022-06-10 14:46                       ` [PATCH v14 5/7] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-06-10 14:46                       ` [PATCH v14 6/7] core doc: modernize core.bigFileThreshold documentation Han Xin
2022-06-10 21:01                         ` Junio C Hamano
2022-06-10 14:46                       ` [PATCH v14 7/7] unpack-objects: use stream_loose_object() to unpack large objects Han Xin
2022-06-11  2:44                       ` [PATCH v15 0/6] unpack-objects: support streaming blobs to disk Han Xin
2022-06-11  2:44                         ` [PATCH v15 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-06-11  2:44                         ` [PATCH v15 2/6] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-06-11  2:44                         ` [PATCH v15 3/6] object-file.c: factor out deflate part of write_loose_object() Han Xin
2022-06-11  2:44                         ` [PATCH v15 4/6] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-06-11  2:44                         ` [PATCH v15 5/6] core doc: modernize core.bigFileThreshold documentation Han Xin
2022-06-11  2:44                         ` [PATCH v15 6/6] unpack-objects: use stream_loose_object() to unpack large objects Han Xin
2022-07-01  2:01                           ` Junio C Hamano
2022-05-20  3:05                 ` [PATCH 0/1] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-05-20  3:05                   ` [PATCH 1/1] " Han Xin
2022-01-20 11:21           ` [PATCH v9 1/5] " Han Xin
2022-01-20 11:21           ` [PATCH v9 2/5] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-01-20 11:21           ` [PATCH v9 3/5] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-01-20 11:21           ` [PATCH v9 4/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2022-01-20 11:21           ` [PATCH v9 5/5] object-file API: add a format_object_header() function Han Xin
2022-01-08  8:54         ` [PATCH v8 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-01-08 12:28           ` René Scharfe
2022-01-11 10:41             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 2/6] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-01-08 12:28           ` René Scharfe
2022-01-11 10:33             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 3/6] object-file.c: remove the slash for directory_size() Han Xin
2022-01-08 17:24           ` René Scharfe
2022-01-11 10:14             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 4/6] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-01-08  8:54         ` [PATCH v8 5/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2022-01-08  8:54         ` [PATCH v8 6/6] object-file API: add a format_object_header() function Han Xin
2021-12-17 11:26       ` [PATCH v6 1/6] object-file.c: release strbuf in write_loose_object() Han Xin
2021-12-17 19:28         ` René Scharfe
2021-12-18  0:09           ` Junio C Hamano
2021-12-17 11:26       ` [PATCH v6 2/6] object-file.c: refactor object header generation into a function Han Xin
2021-12-20 12:10         ` [RFC PATCH] object-file API: add a format_loose_header() function Ævar Arnfjörð Bjarmason
2021-12-20 12:48           ` Philip Oakley
2021-12-20 22:25           ` Junio C Hamano
2021-12-21  1:42             ` Ævar Arnfjörð Bjarmason
2021-12-21  2:11               ` Junio C Hamano
2021-12-21  2:27                 ` Ævar Arnfjörð Bjarmason
2021-12-21 11:43           ` Han Xin
2021-12-17 11:26       ` [PATCH v6 3/6] object-file.c: refactor write_loose_object() to reuse in stream version Han Xin
2021-12-17 11:26       ` [PATCH v6 4/6] object-file.c: make "write_object_file_flags()" to support read in stream Han Xin
2021-12-17 22:52         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 5/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-17 21:22         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-10 10:34     ` [PATCH v5 1/6] object-file: refactor write_loose_object() to support read from stream Han Xin
2021-12-10 10:34     ` [PATCH v5 2/6] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-13  7:32       ` Ævar Arnfjörð Bjarmason
2021-12-10 10:34     ` [PATCH v5 3/6] object-file.c: read stream in a loop " Han Xin
2021-12-10 10:34     ` [PATCH v5 4/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-10 10:34     ` [PATCH v5 5/6] object-file.c: make "write_object_file_flags()" to support "HASH_STREAM" Han Xin
2021-12-10 10:34     ` [PATCH v5 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-13  8:05       ` Ævar Arnfjörð Bjarmason
2021-12-03  9:35   ` [PATCH v4 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-12-03 13:28     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:07       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-03 13:21     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:51       ` Han Xin
2021-12-03 13:41     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:12       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 3/5] object-file.c: read stream in a loop " Han Xin
2021-12-03  9:35   ` [PATCH v4 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-03 13:59     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:20       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-03 13:07     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:42       ` Han Xin
2021-12-03 13:54     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:17       ` Han Xin
2021-12-03 14:05     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:48       ` Han Xin
2021-11-22  3:32 ` [PATCH v3 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-11-23 23:24   ` Junio C Hamano
2021-11-24  9:00     ` Han Xin
2021-11-22  3:32 ` [PATCH v3 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-11-29 15:10   ` Derrick Stolee
2021-11-29 20:44     ` Junio C Hamano
2021-11-29 22:18       ` Derrick Stolee
2021-11-30  3:23         ` Han Xin
2021-11-22  3:32 ` [PATCH v3 3/5] object-file.c: read stream in a loop " Han Xin
2021-11-22  3:32 ` [PATCH v3 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-11-22  3:32 ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-11-29 17:37   ` Derrick Stolee
2021-11-30 13:49     ` Han Xin
2021-11-30 18:38       ` Derrick Stolee
2021-12-01 20:37         ` "git hyperfine" (was: [PATCH v3 5/5] unpack-objects[...]) Ævar Arnfjörð Bjarmason
2021-12-02  7:33         ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-02 13:53           ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover-v11-0.8-00000000000-20220319T001411Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=chiyutianyi@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=philipoakley@iee.email \
    --cc=stolee@gmail.com \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank
  line before the message body.