From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Han Xin" <chiyutianyi@gmail.com>,
"Jiang Xin" <worldhello.net@gmail.com>,
"René Scharfe" <l.s.r@web.de>,
"Derrick Stolee" <stolee@gmail.com>,
"Philip Oakley" <philipoakley@iee.email>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v11 0/8] unpack-objects: support streaming blobs to disk
Date: Sat, 19 Mar 2022 01:23:17 +0100
Message-ID: <cover-v11-0.8-00000000000-20220319T001411Z-avarab@gmail.com>
In-Reply-To: <cover-v10-0.6-00000000000-20220204T135538Z-avarab@gmail.com>
This series by Han Xin was waiting on some in-flight patches that
landed in 430883a70c7 (Merge branch 'ab/object-file-api-updates',
2022-03-16).
This series teaches "git unpack-objects" to stream objects larger than
core.bigFileThreshold to disk. As 8/8 shows, streaming e.g. a 100MB
blob now uses ~5MB of memory instead of ~105MB. This streaming method
is slower if you have enough memory to handle the blobs in-core, but
if you don't it allows you to unpack the objects at all, where you
might otherwise OOM.
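The core technique (inflate the input in fixed-size chunks, hashing and
writing each chunk to a temporary file, then moving the file into place
once the hash is known) can be sketched outside of git's C code. This is
an illustrative Python analogue of the idea, not the series'
implementation; the function name and chunk size are made up here:

```python
import hashlib
import os
import tempfile
import zlib

CHUNK = 8192  # scratch size: peak memory is bounded by this, not by blob size

def stream_loose_object(compressed_chunks, destdir):
    """Inflate a zlib stream chunk by chunk into a tmpfile, then rename
    it to a name derived from the content hash (loosely mimicking the
    "write to tmpfile, move into place once the OID is known" dance)."""
    inflater = zlib.decompressobj()
    hasher = hashlib.sha1()
    fd, tmp = tempfile.mkstemp(dir=destdir)
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in compressed_chunks:
                data = inflater.decompress(chunk, CHUNK)
                while data:
                    hasher.update(data)
                    out.write(data)
                    # drain input zlib held back because of the size cap
                    data = inflater.decompress(inflater.unconsumed_tail, CHUNK)
            tail = inflater.flush()
            hasher.update(tail)
            out.write(tail)
    except BaseException:
        os.unlink(tmp)
        raise
    oid = hasher.hexdigest()
    final = os.path.join(destdir, oid)
    os.rename(tmp, final)
    return oid, final
```

At no point does more than one CHUNK of inflated data live in memory,
which is why the 100MB blob above costs ~5MB, not ~105MB.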
Changes since v10[1]:
* Renamed the new test file, its number conflicted with a
since-landed commit-graph test.
* Some minor code changes to make diffs to the pre-image smaller
(e.g. the top of the range-diff below)
* The whole "find dest.git" check for loose objects is now either a
test for "do we have any objects at all?" (--dry-run mode), or a
simpler implementation that uses "test_stdout_line_count".
* We also test that as we use "unpack-objects" to stream directly to
a pack that the result is byte-for-byte the same as the source.
* A new 4/8 that I added allows for more code sharing in
object-file.c, our two end-state functions now share more logic.
* Minor typo/grammar/comment etc. fixes throughout.
* Updated 8/8 with benchmarks; somewhere along the line we had lost
the code to run the benchmark mentioned in the commit message.
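For reference, the idea behind 1/8's dry-run change (cap the inflate
buffer at 8192 bytes and recycle it, since a dry run only needs to walk
the stream, not keep the data) can be sketched as follows. This is an
illustrative Python analogue with made-up names, not git's C code:

```python
import zlib

DRY_RUN_CAP = 8192  # mirrors the 8192-byte cap in the patch

def get_data_dry_run(compressed, size):
    """Walk a zlib stream whose inflated size is `size` using a scratch
    buffer of at most DRY_RUN_CAP bytes; each inflated chunk is counted
    and then dropped, so peak memory is ~min(size, DRY_RUN_CAP) rather
    than `size`. Returns the number of bytes the stream inflated to."""
    # max_length=0 would mean "unlimited" to zlib, so clamp to >= 1
    bufsize = max(1, min(size, DRY_RUN_CAP))
    inflater = zlib.decompressobj()
    total = 0
    data = inflater.decompress(compressed, bufsize)
    while data:
        total += len(data)  # "use" the chunk, then let it be reclaimed
        data = inflater.decompress(inflater.unconsumed_tail, bufsize)
    total += len(inflater.flush())
    return total
```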
1. https://lore.kernel.org/git/cover-v10-0.6-00000000000-20220204T135538Z-avarab@gmail.com/
Han Xin (4):
unpack-objects: low memory footprint for get_data() in dry_run mode
object-file.c: refactor write_loose_object() to several steps
object-file.c: add "stream_loose_object()" to handle large object
unpack-objects: use stream_loose_object() to unpack large objects
Ævar Arnfjörð Bjarmason (4):
object-file.c: do fsync() and close() before post-write die()
object-file.c: factor out deflate part of write_loose_object()
core doc: modernize core.bigFileThreshold documentation
unpack-objects: refactor away unpack_non_delta_entry()
Documentation/config/core.txt | 33 +++--
builtin/unpack-objects.c | 109 +++++++++++---
object-file.c | 250 +++++++++++++++++++++++++++-----
object-store.h | 8 +
t/t5351-unpack-large-objects.sh | 61 ++++++++
5 files changed, 397 insertions(+), 64 deletions(-)
create mode 100755 t/t5351-unpack-large-objects.sh
Range-diff against v10:
1: e46eb75b98f ! 1: 2103d5bfd96 unpack-objects: low memory footprint for get_data() in dry_run mode
@@ builtin/unpack-objects.c: static void use(int bytes)
{
git_zstream stream;
- void *buf = xmallocz(size);
-+ unsigned long bufsize;
-+ void *buf;
++ unsigned long bufsize = dry_run && size > 8192 ? 8192 : size;
++ void *buf = xmallocz(bufsize);
memset(&stream, 0, sizeof(stream));
-+ if (dry_run && size > 8192)
-+ bufsize = 8192;
-+ else
-+ bufsize = size;
-+ buf = xmallocz(bufsize);
stream.next_out = buf;
- stream.avail_out = size;
@@ builtin/unpack-objects.c: static void unpack_delta_entry(enum object_type type,
hi = nr;
while (lo < hi) {
- ## t/t5328-unpack-large-objects.sh (new) ##
+ ## t/t5351-unpack-large-objects.sh (new) ##
@@
+#!/bin/sh
+#
@@ t/t5328-unpack-large-objects.sh (new)
+ git init --bare dest.git
+}
+
-+test_no_loose () {
-+ test $(find dest.git/objects/?? -type f | wc -l) = 0
-+}
-+
+test_expect_success "create large objects (1.5 MB) and PACK" '
+ test-tool genrandom foo 1500000 >big-blob &&
+ test_commit --append foo big-blob &&
+ test-tool genrandom bar 1500000 >big-blob &&
+ test_commit --append bar big-blob &&
-+ PACK=$(echo HEAD | git pack-objects --revs test)
++ PACK=$(echo HEAD | git pack-objects --revs pack)
+'
+
+test_expect_success 'set memory limitation to 1MB' '
@@ t/t5328-unpack-large-objects.sh (new)
+
+test_expect_success 'unpack-objects failed under memory limitation' '
+ prepare_dest &&
-+ test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
++ test_must_fail git -C dest.git unpack-objects <pack-$PACK.pack 2>err &&
+ grep "fatal: attempting to allocate" err
+'
+
+test_expect_success 'unpack-objects works with memory limitation in dry-run mode' '
+ prepare_dest &&
-+ git -C dest.git unpack-objects -n <test-$PACK.pack &&
-+ test_no_loose &&
++ git -C dest.git unpack-objects -n <pack-$PACK.pack &&
++ test_stdout_line_count = 0 find dest.git/objects -type f &&
+ test_dir_is_empty dest.git/objects/pack
+'
+
2: 48bf9090058 = 2: 6acd8759772 object-file.c: do fsync() and close() before post-write die()
3: 0e33d2a6e35 = 3: f7b02c307fc object-file.c: refactor write_loose_object() to several steps
-: ----------- > 4: 20d97cc2605 object-file.c: factor out deflate part of write_loose_object()
4: 9644df5c744 ! 5: db40f4160c4 object-file.c: add "stream_loose_object()" to handle large object
@@ Commit message
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## object-file.c ##
@@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
@@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
+ strbuf_addf(&filename, "%s/", get_object_directory());
+ hdrlen = format_object_header(hdr, sizeof(hdr), OBJ_BLOB, len);
+
-+ /* Common steps for write_loose_object and stream_loose_object to
-+ * start writing loose oject:
++ /*
++ * Common steps for write_loose_object and stream_loose_object to
++ * start writing loose objects:
+ *
+ * - Create tmpfile for the loose object.
+ * - Setup zlib stream for compression.
@@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
+ /* Then the data itself.. */
+ do {
+ unsigned char *in0 = stream.next_in;
++
+ if (!stream.avail_in && !in_stream->is_finished) {
+ const void *in = in_stream->read(in_stream, &stream.avail_in);
+ stream.next_in = (void *)in;
+ in0 = (unsigned char *)in;
+ /* All data has been read. */
+ if (in_stream->is_finished)
-+ flush = Z_FINISH;
++ flush = 1;
+ }
-+ ret = git_deflate(&stream, flush);
-+ the_hash_algo->update_fn(&c, in0, stream.next_in - in0);
-+ if (write_buffer(fd, compressed, stream.next_out - compressed) < 0)
-+ die(_("unable to write loose object file"));
-+ stream.next_out = compressed;
-+ stream.avail_out = sizeof(compressed);
++ ret = write_loose_object_common(&c, &stream, flush, in0, fd,
++ compressed, sizeof(compressed));
+ /*
+ * Unlike write_loose_object(), we do not have the entire
+ * buffer. If we get Z_BUF_ERROR due to too few input bytes,
5: 4550f3a2745 = 6: d8ae2eadb98 core doc: modernize core.bigFileThreshold documentation
-: ----------- > 7: 2b403e7cd9c unpack-objects: refactor away unpack_non_delta_entry()
6: 6a70e49a346 ! 8: 5eded902496 unpack-objects: use stream_loose_object() to unpack large objects
@@ Commit message
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.
- We could use this new interface to unpack all blobs, but doing so
- would result in a performance penalty of around 10%, as the below
- "hyperfine" benchmark will show. We therefore limit this to files
- larger than "core.bigFileThreshold":
-
- $ hyperfine \
- --setup \
- 'if ! test -d scalar.git; then git clone --bare
- https://github.com/microsoft/scalar.git;
- cp scalar.git/objects/pack/*.pack small.pack; fi' \
- --prepare 'rm -rf dest.git && git init --bare dest.git' \
- ...
-
- Summary
- './git -C dest.git -c core.bigFileThreshold=512m
- unpack-objects <small.pack' in 'origin/master'
- 1.01 ± 0.04 times faster than './git -C dest.git
- -c core.bigFileThreshold=512m unpack-objects
- <small.pack' in 'HEAD~1'
- 1.01 ± 0.04 times faster than './git -C dest.git
- -c core.bigFileThreshold=512m unpack-objects
- <small.pack' in 'HEAD~0'
- 1.03 ± 0.10 times faster than './git -C dest.git
- -c core.bigFileThreshold=16k unpack-objects
- <small.pack' in 'origin/master'
- 1.02 ± 0.07 times faster than './git -C dest.git
- -c core.bigFileThreshold=16k unpack-objects
- <small.pack' in 'HEAD~0'
- 1.10 ± 0.04 times faster than './git -C dest.git
- -c core.bigFileThreshold=16k unpack-objects
- <small.pack' in 'HEAD~1'
+ We could use the new streaming interface to unpack all blobs, but
+ doing so would be much slower, as demonstrated e.g. with this
+ benchmark using git-hyperfine[0]:
+
+ rm -rf /tmp/scalar.git &&
+ git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
+ mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
+ git hyperfine \
+ -r 2 --warmup 1 \
+ -L rev origin/master,HEAD -L v "10,512,1k,1m" \
+ -s 'make' \
+ -p 'git init --bare dest.git' \
+ -c 'rm -rf dest.git' \
+ './git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
+
+    With this change we'll perform worse in terms of speed at lower
+    core.bigFileThreshold settings, but we're getting lower memory use
+    in return:
+
+ Summary
+ './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
+ 1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
+ 1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
+ 1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
+ 1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
+ 1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
+ 1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
+ 1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
+
+    A better benchmark to demonstrate the benefits is this one, which
+    creates an artificial repo with 1, 25, 50, 75 and 100MB blobs:
+
+ rm -rf /tmp/repo &&
+ git init /tmp/repo &&
+ (
+ cd /tmp/repo &&
+ for i in 1 25 50 75 100
+ do
+ dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
+ done &&
+ git add blob.* &&
+ git commit -mblobs &&
+ git gc &&
+ PACK=$(echo .git/objects/pack/pack-*.pack) &&
+ cp "$PACK" my.pack
+ ) &&
+ git hyperfine \
+ --show-output \
+ -L rev origin/master,HEAD -L v "512,50m,100m" \
+ -s 'make' \
+ -p 'git init --bare dest.git' \
+ -c 'rm -rf dest.git' \
+ '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
+
+ Using this test we'll always use >100MB of memory on
+ origin/master (around ~105MB), but max out at e.g. ~55MB if we set
+ core.bigFileThreshold=50m.
+
+    The relevant "Maximum resident set size" lines were manually added
+    below each benchmark:
+
+ '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
+ Maximum resident set size (kbytes): 107080
+ 1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
+ Maximum resident set size (kbytes): 106968
+ 1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
+ Maximum resident set size (kbytes): 107032
+ 1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
+ Maximum resident set size (kbytes): 107072
+ 1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
+ Maximum resident set size (kbytes): 55704
+ 2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
+ Maximum resident set size (kbytes): 4564
+
+ This shows that if you have enough memory this new streaming method is
+ slower the lower you set the streaming threshold, but the benefit is
+ more bounded memory use.
An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
@@ Commit message
split up "core.bigFileThreshold" in the future if there's a need for
that.
+ 0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
@@ Commit message
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## Documentation/config/core.txt ##
@@ Documentation/config/core.txt: usage, at the slight expense of increased disk usage.
@@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
+ return data->buf;
+}
+
-+static void write_stream_blob(unsigned nr, size_t size)
++static void stream_blob(unsigned long size, unsigned nr)
+{
+ git_zstream zstream = { 0 };
+ struct input_zstream_data data = { 0 };
@@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
+ .read = feed_input_zstream,
+ .data = &data,
+ };
++ struct obj_info *info = &obj_list[nr];
+
+ data.zstream = &zstream;
+ git_inflate_init(&zstream);
+
-+ if (stream_loose_object(&in_stream, size, &obj_list[nr].oid))
++ if (stream_loose_object(&in_stream, size, &info->oid))
+ die(_("failed to write object in stream"));
+
+ if (data.status != Z_STREAM_END)
@@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
+ git_inflate_end(&zstream);
+
+ if (strict) {
-+ struct blob *blob =
-+ lookup_blob(the_repository, &obj_list[nr].oid);
-+ if (blob)
-+ blob->object.flags |= FLAG_WRITTEN;
-+ else
++ struct blob *blob = lookup_blob(the_repository, &info->oid);
++
++ if (!blob)
+ die(_("invalid blob object from stream"));
++ blob->object.flags |= FLAG_WRITTEN;
+ }
-+ obj_list[nr].obj = NULL;
++ info->obj = NULL;
+}
+
- static void unpack_non_delta_entry(enum object_type type, unsigned long size,
- unsigned nr)
+ static int resolve_against_held(unsigned nr, const struct object_id *base,
+ void *delta_data, unsigned long delta_size)
{
-- void *buf = get_data(size);
-+ void *buf;
-+
-+ /* Write large blob in stream without allocating full buffer. */
-+ if (!dry_run && type == OBJ_BLOB && size > big_file_threshold) {
-+ write_stream_blob(nr, size);
-+ return;
-+ }
+@@ builtin/unpack-objects.c: static void unpack_one(unsigned nr)
-+ buf = get_data(size);
- if (buf)
- write_object(nr, type, buf, size);
- }
+ switch (type) {
+ case OBJ_BLOB:
++ if (!dry_run && size > big_file_threshold) {
++ stream_blob(size, nr);
++ return;
++ }
++ /* fallthrough */
+ case OBJ_COMMIT:
+ case OBJ_TREE:
+ case OBJ_TAG:
- ## t/t5328-unpack-large-objects.sh ##
-@@ t/t5328-unpack-large-objects.sh: test_description='git unpack-objects with large objects'
+ ## t/t5351-unpack-large-objects.sh ##
+@@ t/t5351-unpack-large-objects.sh: test_description='git unpack-objects with large objects'
prepare_dest () {
test_when_finished "rm -rf dest.git" &&
- git init --bare dest.git
+ git init --bare dest.git &&
-+ if test -n "$1"
-+ then
-+ git -C dest.git config core.bigFileThreshold $1
-+ fi
++ git -C dest.git config core.bigFileThreshold "$1"
}
- test_no_loose () {
-@@ t/t5328-unpack-large-objects.sh: test_expect_success 'set memory limitation to 1MB' '
+ test_expect_success "create large objects (1.5 MB) and PACK" '
+@@ t/t5351-unpack-large-objects.sh: test_expect_success 'set memory limitation to 1MB' '
'
test_expect_success 'unpack-objects failed under memory limitation' '
- prepare_dest &&
+ prepare_dest 2m &&
- test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
+ test_must_fail git -C dest.git unpack-objects <pack-$PACK.pack 2>err &&
grep "fatal: attempting to allocate" err
'
test_expect_success 'unpack-objects works with memory limitation in dry-run mode' '
- prepare_dest &&
+ prepare_dest 2m &&
- git -C dest.git unpack-objects -n <test-$PACK.pack &&
- test_no_loose &&
+ git -C dest.git unpack-objects -n <pack-$PACK.pack &&
+ test_stdout_line_count = 0 find dest.git/objects -type f &&
test_dir_is_empty dest.git/objects/pack
'
+test_expect_success 'unpack big object in stream' '
+ prepare_dest 1m &&
-+ git -C dest.git unpack-objects <test-$PACK.pack &&
++ git -C dest.git unpack-objects <pack-$PACK.pack &&
+ test_dir_is_empty dest.git/objects/pack
+'
+
+test_expect_success 'do not unpack existing large objects' '
+ prepare_dest 1m &&
-+ git -C dest.git index-pack --stdin <test-$PACK.pack &&
-+ git -C dest.git unpack-objects <test-$PACK.pack &&
-+ test_no_loose
++ git -C dest.git index-pack --stdin <pack-$PACK.pack &&
++ git -C dest.git unpack-objects <pack-$PACK.pack &&
++
++ # The destination came up with the exact same pack...
++ DEST_PACK=$(echo dest.git/objects/pack/pack-*.pack) &&
++ test_cmp pack-$PACK.pack $DEST_PACK &&
++
++ # ...and wrote no loose objects
++ test_stdout_line_count = 0 find dest.git/objects -type f ! -name "pack-*"
+'
+
test_done
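(Aside: the "Maximum resident set size" figures quoted in 8/8 above come
from "/usr/bin/time -v". On POSIX systems the same high-water mark can be
read in-process; this small Python snippet is only an illustration of
that measurement, not part of the series:)

```python
import resource
import sys

def peak_rss_kbytes():
    """Peak resident set size of the current process in kbytes, i.e.
    the figure /usr/bin/time -v prints as "Maximum resident set size".
    Note: ru_maxrss is in kbytes on Linux, but in bytes on macOS."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak // 1024 if sys.platform == "darwin" else peak
```

Sampling this before and after inflating a large blob shows the
high-water mark growing, which is exactly the effect the
core.bigFileThreshold streaming keeps bounded.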
--
2.35.1.1438.g8874c8eeb35