git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Han Xin" <chiyutianyi@gmail.com>,
	"Jiang Xin" <worldhello.net@gmail.com>,
	"René Scharfe" <l.s.r@web.de>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Philip Oakley" <philipoakley@iee.email>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v10 0/6] unpack-objects: support streaming large objects to disk
Date: Fri,  4 Feb 2022 15:07:06 +0100
Message-ID: <cover-v10-0.6-00000000000-20220204T135538Z-avarab@gmail.com>
In-Reply-To: <20220120112114.47618-1-chiyutianyi@gmail.com>

This is a v10 re-roll of Han Xin's series[1] to stream large objects
to disk in "git unpack-objects". The v9 had integrated a proposed
cleanup patch of mine. That patch is now part of its own series[2],
which this series depends on. This v10 is sent with Han Xin's
approval[3].

Changes since v9:

 * Now based on [2]
 * Small grammar/typo fixes in commit messages
 * Replaced an echo/eval pattern in a test with a $(find ... | wc -l)
   comparison, which is a pattern we already use in another test for
   the same (or similar) assertion.
 * I added a new 2/6 to do an fsync() before an oideq() assertion. I
   don't think it matters in practice, but it allows 3/6 to be smaller:
   the code that 3/6 turns into a utility function can now share more
   logic between its two callers.
 * Changed inline comments in 3/6 to API docs where appropriate. The
   helper function now gets a "fd", per 2/6.
 * 4/6 could use the format_object_header() function in the base
   topic, and now does so (in v9 that conversion came in a later
   patch).
 * A new 5/6 updates the core.bigFileThreshold documentation to
   account for 12 years of behavior changes we hadn't documented.
 * The updated 6/6 now links to those docs, and I removed a very
   detailed accounting of all in-tree uses of core.bigFileThreshold
   from the commit message. I think linking to the summary docs should
   suffice, and for anyone digging in the future, 5/6 links to the more
   detailed summary in the old patch.
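
For readers unfamiliar with the "$(find ... | wc -l)" idiom mentioned
in the list above, here is a minimal stand-alone sketch of it. The
names count_loose() and no_loose_in(), and passing the object directory
as "$1", are made up for this illustration; the helper in 1/6 is
test_no_loose() and hardcodes "dest.git/objects". Discarding stderr and
stripping whitespace with "tr" are additions of this sketch as well.

```shell
#!/bin/sh
# Sketch only: count_loose() and no_loose_in() are hypothetical names,
# not helpers from this series or from git's test library.

# Print the number of regular files under the two-character fan-out
# directories (objects/00 .. objects/ff) of the object directory "$1".
# "pack" and "info" do not match "??" and are therefore ignored.
count_loose () {
	# If no fan-out directory exists the glob stays literal and find
	# complains on stderr; we discard that and still count 0 lines.
	# tr strips the padding some wc implementations emit.
	find "$1"/?? -type f 2>/dev/null | wc -l | tr -d ' '
}

# Succeed only if "$1" contains no loose objects.
no_loose_in () {
	test "$(count_loose "$1")" = 0
}
```

A caller would use it as e.g. "no_loose_in dest.git/objects". Compared
to the echo/eval glob comparison this replaces, an empty fan-out
directory does not count as a loose object, since only regular files
are counted.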

More generally, I've been heavily involved in the review of the past
iterations, and I think that barring any last-minute nits in this v10,
this topic should be ready to advance. As the above summary shows,
we're down to typo fixes, doc and test tweaks etc. at this point.

The core functionality being added here isn't changed in any
meaningful way, and has had a lot of careful review already.

1. https://lore.kernel.org/git/20220120112114.47618-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/cover-v2-00.11-00000000000-20220204T135005Z-avarab@gmail.com/
3. https://lore.kernel.org/git/CAO0brD2Pe0aKSiBphZS861gC=nZk+q2GtXDN4pPjAQnPdns3TA@mail.gmail.com/

Han Xin (4):
  unpack-objects: low memory footprint for get_data() in dry_run mode
  object-file.c: refactor write_loose_object() to several steps
  object-file.c: add "stream_loose_object()" to handle large object
  unpack-objects: use stream_loose_object() to unpack large objects

Ævar Arnfjörð Bjarmason (2):
  object-file.c: do fsync() and close() before post-write die()
  core doc: modernize core.bigFileThreshold documentation

 Documentation/config/core.txt   |  33 +++--
 builtin/unpack-objects.c        | 110 ++++++++++++++--
 object-file.c                   | 221 +++++++++++++++++++++++++++-----
 object-store.h                  |   8 ++
 t/t5328-unpack-large-objects.sh |  62 +++++++++
 5 files changed, 381 insertions(+), 53 deletions(-)
 create mode 100755 t/t5328-unpack-large-objects.sh

Range-diff against v9:
1:  553a9377eb3 ! 1:  e46eb75b98f unpack-objects: low memory footprint for get_data() in dry_run mode
    @@ Commit message
         unpack-objects: low memory footprint for get_data() in dry_run mode
     
         As the name implies, "get_data(size)" will allocate and return a given
    -    size of memory. Allocating memory for a large blob object may cause the
    +    amount of memory. Allocating memory for a large blob object may cause the
         system to run out of memory. Before preparing to replace calling of
         "get_data()" to unpack large blob objects in latter commits, refactor
         "get_data()" to reduce memory footprint for dry_run mode.
    @@ Commit message
         in dry_run mode, "get_data()" will release the allocated buffer and
         return NULL instead of returning garbage data.
     
    +    The "find [...]objects/?? -type f | wc -l" test idiom being used here
    +    is adapted from the same "find" use added to another test in
    +    d9545c7f465 (fast-import: implement unpack limit, 2016-04-25).
    +
         Suggested-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/unpack-objects.c ##
     @@ builtin/unpack-objects.c: static void use(int bytes)
    @@ t/t5328-unpack-large-objects.sh (new)
     +}
     +
     +test_no_loose () {
    -+	glob=dest.git/objects/?? &&
    -+	echo "$glob" >expect &&
    -+	eval "echo $glob" >actual &&
    -+	test_cmp expect actual
    ++	test $(find dest.git/objects/?? -type f | wc -l) = 0
     +}
     +
     +test_expect_success "create large objects (1.5 MB) and PACK" '
-:  ----------- > 2:  48bf9090058 object-file.c: do fsync() and close() before post-write die()
2:  88c91affd61 ! 3:  0e33d2a6e35 object-file.c: refactor write_loose_object() to several steps
    @@ Commit message
         When writing a large blob using "write_loose_object()", we have to pass
         a buffer with the whole content of the blob, and this behavior will
         consume lots of memory and may cause OOM. We will introduce a stream
    -    version function ("stream_loose_object()") in latter commit to resolve
    +    version function ("stream_loose_object()") in later commit to resolve
         this issue.
     
    -    Before introducing a stream vesion function for writing loose object,
    -    do some refactoring on "write_loose_object()" to reuse code for both
    -    versions.
    +    Before introducing that streaming function, do some refactoring on
    +    "write_loose_object()" to reuse code for both versions.
     
         Rewrite "write_loose_object()" as follows:
     
    @@ Commit message
     
          3. Compress data.
     
    -     4. Move common steps for ending zlib stream into a new funciton
    +     4. Move common steps for ending zlib stream into a new function
             "end_loose_object_common()".
     
          5. Close fd and finalize the object file.
    @@ Commit message
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## object-file.c ##
     @@ object-file.c: static int create_tmpfile(struct strbuf *tmp, const char *filename)
      	return fd;
      }
      
    ++/**
    ++ * Common steps for loose object writers to start writing loose
    ++ * objects:
    ++ *
    ++ * - Create tmpfile for the loose object.
    ++ * - Setup zlib stream for compression.
    ++ * - Start to feed header to zlib stream.
    ++ *
    ++ * Returns a "fd", which should later be provided to
    ++ * end_loose_object_common().
    ++ */
     +static int start_loose_object_common(struct strbuf *tmp_file,
     +				     const char *filename, unsigned flags,
     +				     git_zstream *stream,
    @@ object-file.c: static int create_tmpfile(struct strbuf *tmp, const char *filenam
     +	return fd;
     +}
     +
    -+static void end_loose_object_common(int ret, git_hash_ctx *c,
    ++/**
    ++ * Common steps for loose object writers to end writing loose objects:
    ++ *
    ++ * - End the compression of zlib stream.
    ++ * - Get the calculated oid to "parano_oid".
    ++ * - fsync() and close() the "fd"
    ++ */
    ++static void end_loose_object_common(int fd, int ret, git_hash_ctx *c,
     +				    git_zstream *stream,
     +				    struct object_id *parano_oid,
     +				    const struct object_id *expected_oid,
    @@ object-file.c: static int create_tmpfile(struct strbuf *tmp, const char *filenam
     +	if (ret != Z_OK)
     +		die(_(die_msg2_fmt), ret, expected_oid);
     +	the_hash_algo->final_oid_fn(parano_oid, c);
    ++
    ++	/*
    ++	 * We already did a write_buffer() to the "fd", let's fsync()
    ++	 * and close().
    ++	 *
    ++	 * We might still die() on a subsequent sanity check, but
    ++	 * let's not add to that confusion by not flushing any
    ++	 * outstanding writes to disk first.
    ++	 */
    ++	close_loose_object(fd);
     +}
     +
      static int write_loose_object(const struct object_id *oid, char *hdr,
    @@ object-file.c: static int write_loose_object(const struct object_id *oid, char *
     -	while (git_deflate(&stream, 0) == Z_OK)
     -		; /* nothing */
     -	the_hash_algo->update_fn(&c, hdr, hdrlen);
    -+	/* Common steps for write_loose_object and stream_loose_object to
    -+	 * start writing loose oject:
    -+	 *
    -+	 *  - Create tmpfile for the loose object.
    -+	 *  - Setup zlib stream for compression.
    -+	 *  - Start to feed header to zlib stream.
    -+	 */
     +	fd = start_loose_object_common(&tmp_file, filename.buf, flags,
     +				       &stream, compressed, sizeof(compressed),
     +				       &c, hdr, hdrlen);
    @@ object-file.c: static int write_loose_object(const struct object_id *oid, char *
     -		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
     -		    ret);
     -	the_hash_algo->final_oid_fn(&parano_oid, &c);
    -+	/* Common steps for write_loose_object and stream_loose_object to
    -+	 * end writing loose oject:
    -+	 *
    -+	 *  - End the compression of zlib stream.
    -+	 *  - Get the calculated oid to "parano_oid".
    -+	 */
    -+	end_loose_object_common(ret, &c, &stream, &parano_oid, oid,
    +-
    +-	/*
    +-	 * We already did a write_buffer() to the "fd", let's fsync()
    +-	 * and close().
    +-	 *
    +-	 * We might still die() on a subsequent sanity check, but
    +-	 * let's not add to that confusion by not flushing any
    +-	 * outstanding writes to disk first.
    +-	 */
    +-	close_loose_object(fd);
    ++	end_loose_object_common(fd, ret, &c, &stream, &parano_oid, oid,
     +				N_("unable to deflate new object %s (%d)"),
     +				N_("deflateEnd on object %s failed (%d)"));
    -+
    + 
      	if (!oideq(oid, &parano_oid))
      		die(_("confused by unstable object source data for %s"),
    - 		    oid_to_hex(oid));
3:  054a00ed21d ! 4:  9644df5c744 object-file.c: add "stream_loose_object()" to handle large object
    @@ Commit message
     
         Add a new function "stream_loose_object()", which is a stream version of
         "write_loose_object()" but with a low memory footprint. We will use this
    -    function to unpack large blob object in latter commit.
    +    function to unpack large blob object in later commit.
     
         Another difference with "write_loose_object()" is that we have no chance
         to run "write_object_file_prepare()" to calculate the oid in advance.
         In "write_loose_object()", we know the oid and we can write the
         temporary file in the same directory as the final object, but for an
         object with an undetermined oid, we don't know the exact directory for
    -    the object, so we have to save the temporary file in ".git/objects/"
    -    directory instead.
    +    the object.
    +
    +    Still, we need to save the temporary file we're preparing
    +    somewhere. We'll do that in the top-level ".git/objects/"
    +    directory (or whatever "GIT_OBJECT_DIRECTORY" is set to). Once we've
    +    streamed it we'll know the OID, and will move it to its canonical
    +    path.
     
         "freshen_packed_object()" or "freshen_loose_object()" will be called
         inside "stream_loose_object()" after obtaining the "oid".
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +
     +	/* Since oid is not determined, save tmp file to odb path. */
     +	strbuf_addf(&filename, "%s/", get_object_directory());
    -+	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %"PRIuMAX, type_name(OBJ_BLOB), len) + 1;
    ++	hdrlen = format_object_header(hdr, sizeof(hdr), OBJ_BLOB, len);
     +
     +	/* Common steps for write_loose_object and stream_loose_object to
     +	 * start writing loose oject:
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +	 *  - End the compression of zlib stream.
     +	 *  - Get the calculated oid.
     +	 */
    -+	end_loose_object_common(ret, &c, &stream, oid, NULL,
    ++	end_loose_object_common(fd, ret, &c, &stream, oid, NULL,
     +				N_("unable to stream deflate new object (%d)"),
     +				N_("deflateEnd on stream object failed (%d)"));
     +
    -+	close_loose_object(fd);
    -+
     +	if (freshen_packed_object(oid) || freshen_loose_object(oid)) {
     +		unlink_or_warn(tmp_file.buf);
     +		goto cleanup;
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +}
     +
      int write_object_file_flags(const void *buf, unsigned long len,
    - 			    const char *type, struct object_id *oid,
    + 			    enum object_type type, struct object_id *oid,
      			    unsigned flags)
     
      ## object-store.h ##
    @@ object-store.h: struct object_directory {
      	struct object_directory *, 1, fspathhash, fspatheq)
      
     @@ object-store.h: static inline int write_object_file(const void *buf, unsigned long len,
    - 	return write_object_file_flags(buf, len, type, oid, 0);
    - }
    - 
    + int write_object_file_literally(const void *buf, unsigned long len,
    + 				const char *type, struct object_id *oid,
    + 				unsigned flags);
     +int stream_loose_object(struct input_stream *in_stream, size_t len,
     +			struct object_id *oid);
    -+
    - int hash_object_file_literally(const void *buf, unsigned long len,
    - 			       const char *type, struct object_id *oid,
    - 			       unsigned flags);
    + 
    + /*
    +  * Add an object file to the in-memory object store, without writing it
-:  ----------- > 5:  4550f3a2745 core doc: modernize core.bigFileThreshold documentation
4:  6bcba6bce66 ! 6:  6a70e49a346 unpack-objects: unpack_non_delta_entry() read data in a stream
    @@ Metadata
     Author: Han Xin <hanxin.hx@alibaba-inc.com>
     
      ## Commit message ##
    -    unpack-objects: unpack_non_delta_entry() read data in a stream
    +    unpack-objects: use stream_loose_object() to unpack large objects
     
    -    We used to call "get_data()" in "unpack_non_delta_entry()" to read the
    -    entire contents of a blob object, no matter how big it is. This
    -    implementation may consume all the memory and cause OOM.
    +    Make use of the stream_loose_object() function introduced in the
    +    preceding commit to unpack large objects. Before this we'd need to
    +    malloc() the size of the blob before unpacking it, which could cause
    +    OOM with very large blobs.
     
    -    By implementing a zstream version of input_stream interface, we can use
    -    a small fixed buffer for "unpack_non_delta_entry()". However, unpack
    -    non-delta objects from a stream instead of from an entrie buffer will
    -    have 10% performance penalty.
    +    We could use this new interface to unpack all blobs, but doing so
    +    would result in a performance penalty of around 10%, as the below
    +    "hyperfine" benchmark will show. We therefore limit this to files
    +    larger than "core.bigFileThreshold":
     
             $ hyperfine \
               --setup \
    @@ Commit message
                         -c core.bigFileThreshold=16k unpack-objects
                         <small.pack' in 'HEAD~1'
     
    -    Therefore, only unpack objects larger than the "core.bigFileThreshold"
    -    in zstream. Until now, the config variable has been used in the
    -    following cases, and our new case belongs to the packfile category.
    +    An earlier version of this patch introduced a new
    +    "core.bigFileStreamingThreshold" instead of re-using the existing
    +    "core.bigFileThreshold" variable[1]. As noted in a detailed overview
    +    of its users in [2] using it has several different meanings.
     
    -     * Archive:
    +    Still, we consider it good enough to simply re-use it. While it's
    +    possible that someone might want to e.g. consider objects "small" for
    +    the purposes of diffing but "big" for the purposes of writing them
    +    such use-cases are probably too obscure to worry about. We can always
    +    split up "core.bigFileThreshold" in the future if there's a need for
    +    that.
     
    -       + archive.c: write_entry(): write large blob entries to archive in
    -         stream.
    -
    -     * Loose objects:
    -
    -       + object-file.c: index_fd(): when hashing large files in worktree,
    -         read files in a stream, and create one packfile per large blob if
    -         want to save files to git object store.
    -
    -       + object-file.c: read_loose_object(): when checking loose objects
    -         using "git-fsck", do not read full content of large loose objects.
    -
    -     * Packfile:
    -
    -       + fast-import.c: parse_and_store_blob(): streaming large blob from
    -         foreign source to packfile.
    -
    -       + index-pack.c: check_collison(): read and check large blob in stream.
    -
    -       + index-pack.c: unpack_entry_data(): do not return the entire
    -         contents of the big blob from packfile, but uses a fixed buf to
    -         perform some integrity checks on the object.
    -
    -       + pack-check.c: verify_packfile(): used by "git-fsck" and will call
    -         check_object_signature() to check large blob in pack with the
    -         streaming interface.
    -
    -       + pack-objects.c: get_object_details(): set "no_try_delta" for large
    -         blobs when counting objects.
    -
    -       + pack-objects.c: write_no_reuse_object(): streaming large blob to
    -         pack.
    -
    -       + unpack-objects.c: unpack_non_delta_entry(): unpack large blob in
    -         stream from packfile.
    -
    -     * Others:
    -
    -       + diff.c: diff_populate_filespec(): treat large blob file as binary.
    -
    -       + streaming.c: istream_source(): as a helper of "open_istream()" to
    -         select proper streaming interface to read large blob from packfile.
    +    1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
    +    2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
     
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Helped-by: Derrick Stolee <stolee@gmail.com>
         Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
     
    + ## Documentation/config/core.txt ##
    +@@ Documentation/config/core.txt: usage, at the slight expense of increased disk usage.
    + * Will be generally be streamed when written, which avoids excessive
    + memory usage, at the cost of some fixed overhead. Commands that make
    + use of this include linkgit:git-archive[1],
    +-linkgit:git-fast-import[1], linkgit:git-index-pack[1] and
    +-linkgit:git-fsck[1].
    ++linkgit:git-fast-import[1], linkgit:git-index-pack[1],
    ++linkgit:git-unpack-objects[1] and linkgit:git-fsck[1].
    + 
    + core.excludesFile::
    + 	Specifies the pathname to the file that contains patterns to
    +
      ## builtin/unpack-objects.c ##
     @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type type,
      	}
5:  1bfaf89ee0b < -:  ----------- object-file API: add a format_object_header() function
-- 
2.35.1.940.ge7a5b4b05f2


2021-12-17 22:52         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 5/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-17 21:22         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-10 10:34     ` [PATCH v5 1/6] object-file: refactor write_loose_object() to support read from stream Han Xin
2021-12-10 10:34     ` [PATCH v5 2/6] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-13  7:32       ` Ævar Arnfjörð Bjarmason
2021-12-10 10:34     ` [PATCH v5 3/6] object-file.c: read stream in a loop " Han Xin
2021-12-10 10:34     ` [PATCH v5 4/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-10 10:34     ` [PATCH v5 5/6] object-file.c: make "write_object_file_flags()" to support "HASH_STREAM" Han Xin
2021-12-10 10:34     ` [PATCH v5 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-13  8:05       ` Ævar Arnfjörð Bjarmason
2021-12-03  9:35   ` [PATCH v4 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-12-03 13:28     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:07       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-03 13:21     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:51       ` Han Xin
2021-12-03 13:41     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:12       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 3/5] object-file.c: read stream in a loop " Han Xin
2021-12-03  9:35   ` [PATCH v4 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-03 13:59     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:20       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-03 13:07     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:42       ` Han Xin
2021-12-03 13:54     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:17       ` Han Xin
2021-12-03 14:05     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:48       ` Han Xin
2021-11-22  3:32 ` [PATCH v3 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-11-23 23:24   ` Junio C Hamano
2021-11-24  9:00     ` Han Xin
2021-11-22  3:32 ` [PATCH v3 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-11-29 15:10   ` Derrick Stolee
2021-11-29 20:44     ` Junio C Hamano
2021-11-29 22:18       ` Derrick Stolee
2021-11-30  3:23         ` Han Xin
2021-11-22  3:32 ` [PATCH v3 3/5] object-file.c: read stream in a loop " Han Xin
2021-11-22  3:32 ` [PATCH v3 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-11-22  3:32 ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-11-29 17:37   ` Derrick Stolee
2021-11-30 13:49     ` Han Xin
2021-11-30 18:38       ` Derrick Stolee
2021-12-01 20:37         ` "git hyperfine" (was: [PATCH v3 5/5] unpack-objects[...]) Ævar Arnfjörð Bjarmason
2021-12-02  7:33         ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-02 13:53           ` Derrick Stolee
