git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Han Xin <chiyutianyi@gmail.com>
To: "Junio C Hamano" <gitster@pobox.com>,
	"Git List" <git@vger.kernel.org>, "Jeff King" <peff@peff.net>,
	"Jiang Xin" <zhiyou.jx@alibaba-inc.com>,
	"Philip Oakley" <philipoakley@iee.email>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	"René Scharfe" <l.s.r@web.de>
Cc: Han Xin <hanxin.hx@alibaba-inc.com>
Subject: [PATCH v7 0/5] unpack large blobs in stream
Date: Tue, 21 Dec 2021 19:51:56 +0800	[thread overview]
Message-ID: <20211221115201.12120-1-chiyutianyi@gmail.com> (raw)
In-Reply-To: <20211217112629.12334-1-chiyutianyi@gmail.com>

From: Han Xin <hanxin.hx@alibaba-inc.com>

Changes since v6:
* Remove "object-file.c: release strbuf in write_loose_object()" which is not
  needed anymore. Thanks to René Scharfe[1] for reporting this.

* Reorder the patch series and put "unpack-objects.c: add dry_run mode for get_data()"
  and its testcases to the front.

* Replace "refactor object header generation into a function" with
  "object-file API: add a format_object_header() function" sugguested by
  Ævar Arnfjörð Bjarmason[2].

* Export "write_stream_object_file()" instead of "reusing write_object_file_flags()"
  sugguested by René Scharfe[3]. The new flag "HASH_STREAM" has been removed.

* Fix the directory creation error and the "strbuf dir" leak in
  "write_stream_object_file()".

* Change "unsigned long size" to "size_t size" in "write_stream_blob()" and
  "get_data()" in "unpack-objects.c".

1. https://lore.kernel.org/git/c860c56f-ce25-4391-7f65-50c9d5d80c2c@web.de/
2. https://lore.kernel.org/git/RFC-patch-1.1-bda62567f6b-20211220T120740Z-avarab@gmail.com/
3. https://lore.kernel.org/git/e959e4f1-7500-5f6b-5bd2-2f060287eeff@web.de/

Han Xin (4):
  unpack-objects.c: add dry_run mode for get_data()
  object-file.c: refactor write_loose_object() to reuse in stream
    version
  object-file.c: add "write_stream_object_file()" to support read in
    stream
  unpack-objects: unpack_non_delta_entry() read data in a stream

Ævar Arnfjörð Bjarmason (1):
  object-file API: add a format_object_header() function

 Documentation/config/core.txt       |  11 ++
 builtin/index-pack.c                |   3 +-
 builtin/unpack-objects.c            |  94 ++++++++++++-
 bulk-checkin.c                      |   4 +-
 cache.h                             |  22 +++
 config.c                            |   5 +
 environment.c                       |   1 +
 http-push.c                         |   2 +-
 object-file.c                       | 199 ++++++++++++++++++++++------
 object-store.h                      |   9 ++
 t/t5590-unpack-non-delta-objects.sh |  91 +++++++++++++
 11 files changed, 392 insertions(+), 49 deletions(-)
 create mode 100755 t/t5590-unpack-non-delta-objects.sh

Range-diff against v6:
1:  59d35dac5f < -:  ---------- object-file.c: release strbuf in write_loose_object()
2:  2174a6cbad < -:  ---------- object-file.c: refactor object header generation into a function
5:  1acbb6e849 ! 1:  a8f232f553 unpack-objects.c: add dry_run mode for get_data()
    @@ builtin/unpack-objects.c: static void use(int bytes)
      }
      
     -static void *get_data(unsigned long size)
    -+static void *get_data(unsigned long size, int dry_run)
    ++static void *get_data(size_t size, int dry_run)
      {
      	git_zstream stream;
     -	void *buf = xmallocz(size);
    -+	unsigned long bufsize;
    ++	size_t bufsize;
     +	void *buf;
      
      	memset(&stream, 0, sizeof(stream));
    @@ builtin/unpack-objects.c: static void unpack_delta_entry(enum object_type type,
      		if (dry_run || !delta_data) {
      			free(delta_data);
      			return;
    +
    + ## t/t5590-unpack-non-delta-objects.sh (new) ##
    +@@
    ++#!/bin/sh
    ++#
    ++# Copyright (c) 2021 Han Xin
    ++#
    ++
    ++test_description='Test unpack-objects with non-delta objects'
    ++
    ++GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    ++export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    ++
    ++. ./test-lib.sh
    ++
    ++prepare_dest () {
    ++	test_when_finished "rm -rf dest.git" &&
    ++	git init --bare dest.git
    ++}
    ++
    ++test_expect_success "setup repo with big blobs (1.5 MB)" '
    ++	test-tool genrandom foo 1500000 >big-blob &&
    ++	test_commit --append foo big-blob &&
    ++	test-tool genrandom bar 1500000 >big-blob &&
    ++	test_commit --append bar big-blob &&
    ++	(
    ++		cd .git &&
    ++		find objects/?? -type f | sort
    ++	) >expect &&
    ++	PACK=$(echo main | git pack-objects --revs test)
    ++'
    ++
    ++test_expect_success 'setup env: GIT_ALLOC_LIMIT to 1MB' '
    ++	GIT_ALLOC_LIMIT=1m &&
    ++	export GIT_ALLOC_LIMIT
    ++'
    ++
    ++test_expect_success 'fail to unpack-objects: cannot allocate' '
    ++	prepare_dest &&
    ++	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
    ++	grep "fatal: attempting to allocate" err &&
    ++	(
    ++		cd dest.git &&
    ++		find objects/?? -type f | sort
    ++	) >actual &&
    ++	test_file_not_empty actual &&
    ++	! test_cmp expect actual
    ++'
    ++
    ++test_expect_success 'unpack-objects dry-run' '
    ++	prepare_dest &&
    ++	git -C dest.git unpack-objects -n <test-$PACK.pack &&
    ++	(
    ++		cd dest.git &&
    ++		find objects/ -type f
    ++	) >actual &&
    ++	test_must_be_empty actual
    ++'
    ++
    ++test_done
-:  ---------- > 2:  0d2e0f3a00 object-file API: add a format_object_header() function
3:  8a704ecc59 ! 3:  a571b8f16c object-file.c: refactor write_loose_object() to reuse in stream version
    @@ object-file.c: static int write_loose_object(const struct object_id *oid, char *
      	loose_object_path(the_repository, &filename, oid);
      
     -	fd = create_tmpfile(&tmp_file, filename.buf);
    -+	fd = create_tmpfile(&tmp_file, filename.buf, flags);
    - 	if (fd < 0) {
    +-	if (fd < 0) {
     -		if (flags & HASH_SILENT)
    --			ret = -1;
    +-			return -1;
     -		else if (errno == EACCES)
    --			ret = error(_("insufficient permission for adding an "
    --				      "object to repository database %s"),
    --				    get_object_directory());
    +-			return error(_("insufficient permission for adding an object to repository database %s"), get_object_directory());
     -		else
    --			ret = error_errno(_("unable to create temporary file"));
    -+		ret = -1;
    - 		goto cleanup;
    - 	}
    - 
    +-			return error_errno(_("unable to create temporary file"));
    +-	}
    +-
     -	/* Set it up */
     -	git_deflate_init(&stream, zlib_compression_level);
     -	stream.next_out = compressed;
     -	stream.avail_out = sizeof(compressed);
     -	the_hash_algo->init_fn(&c);
    --
    ++	fd = create_tmpfile(&tmp_file, filename.buf, flags);
    ++	if (fd < 0)
    ++		return -1;
    + 
     -	/* First header.. */
     -	stream.next_in = (unsigned char *)hdr;
     -	stream.avail_in = hdrlen;
    @@ object-file.c: static int write_loose_object(const struct object_id *oid, char *
     -			warning_errno(_("failed utime() on %s"), tmp_file.buf);
     -	}
     -
    --	ret = finalize_object_file(tmp_file.buf, filename.buf);
    -+	ret = finalize_object_file_with_mtime(tmp_file.buf, filename.buf, mtime, flags);
    - cleanup:
    - 	strbuf_release(&filename);
    - 	strbuf_release(&tmp_file);
    +-	return finalize_object_file(tmp_file.buf, filename.buf);
    ++	return finalize_object_file_with_mtime(tmp_file.buf, filename.buf,
    ++					       mtime, flags);
    + }
    + 
    + static int freshen_loose_object(const struct object_id *oid)
4:  96f05632a2 ! 4:  1de06a8f5c object-file.c: make "write_object_file_flags()" to support read in stream
    @@ Metadata
     Author: Han Xin <hanxin.hx@alibaba-inc.com>
     
      ## Commit message ##
    -    object-file.c: make "write_object_file_flags()" to support read in stream
    +    object-file.c: add "write_stream_object_file()" to support read in stream
     
         We used to call "get_data()" in "unpack_non_delta_entry()" to read the
         entire contents of a blob object, no matter how big it is. This
         implementation may consume all the memory and cause OOM.
     
    -    This can be improved by feeding data to "stream_loose_object()" in a
    -    stream. The input stream is implemented as an interface.
    -
    -    When streaming a large blob object to "write_loose_object()", we have no
    -    chance to run "write_object_file_prepare()" to calculate the oid in
    -    advance. So we need to handle undetermined oid in a new function called
    -    "stream_loose_object()".
    +    This can be improved by feeding data to "write_stream_object_file()"
    +    in a stream. The input stream is implemented as an interface.
     
    +    The difference with "write_loose_object()" is that we have no chance
    +    to run "write_object_file_prepare()" to calculate the oid in advance.
         In "write_loose_object()", we know the oid and we can write the
         temporary file in the same directory as the final object, but for an
         object with an undetermined oid, we don't know the exact directory for
         the object, so we have to save the temporary file in ".git/objects/"
         directory instead.
     
    -    We will reuse "write_object_file_flags()" in "unpack_non_delta_entry()" to
    -    read the entire data contents in stream, so a new flag "HASH_STREAM" is
    -    added. When read in stream, we needn't prepare the "oid" before
    -    "write_loose_object()", only generate the header.
         "freshen_packed_object()" or "freshen_loose_object()" will be called
    -    inside "stream_loose_object()" after obtaining the "oid".
    +    inside "write_stream_object_file()" after obtaining the "oid".
     
    +    Helped-by: René Scharfe <l.s.r@web.de>
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
         Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
     
    - ## cache.h ##
    -@@ cache.h: int ie_modified(struct index_state *, const struct cache_entry *, struct stat *,
    - #define HASH_FORMAT_CHECK 2
    - #define HASH_RENORMALIZE  4
    - #define HASH_SILENT 8
    -+#define HASH_STREAM 16
    - int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
    - int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
    - 
    -
      ## object-file.c ##
     @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
      	return 1;
      }
      
    -+static int stream_loose_object(struct object_id *oid, char *hdr, int hdrlen,
    -+			       const struct input_stream *in_stream,
    -+			       unsigned long len, time_t mtime, unsigned flags)
    ++int write_stream_object_file(struct input_stream *in_stream, size_t len,
    ++			     enum object_type type, time_t mtime,
    ++			     unsigned flags, struct object_id *oid)
     +{
    -+	int fd, ret, err = 0, flush = 0;
    ++	int fd, ret, flush = 0;
     +	unsigned char compressed[4096];
     +	git_zstream stream;
     +	git_hash_ctx c;
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +	static struct strbuf tmp_file = STRBUF_INIT;
     +	static struct strbuf filename = STRBUF_INIT;
     +	int dirlen;
    ++	char hdr[MAX_HEADER_LEN];
    ++	int hdrlen = sizeof(hdr);
     +
    ++	/* Since "filename" is defined as static, it will be reused. So reset it
    ++	 * first before using it. */
    ++	strbuf_reset(&filename);
     +	/* When oid is not determined, save tmp file to odb path. */
     +	strbuf_addf(&filename, "%s/", get_object_directory());
     +
     +	fd = create_tmpfile(&tmp_file, filename.buf, flags);
    -+	if (fd < 0) {
    -+		err = -1;
    -+		goto cleanup;
    -+	}
    ++	if (fd < 0)
    ++		return -1;
    ++
    ++	hdrlen = format_object_header(hdr, hdrlen, type, len);
     +
     +	/* Set it up and write header */
     +	setup_stream_and_header(&stream, compressed, sizeof(compressed),
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +
     +	if (freshen_packed_object(oid) || freshen_loose_object(oid)) {
     +		unlink_or_warn(tmp_file.buf);
    -+		goto cleanup;
    ++		return 0;
     +	}
     +
     +	loose_object_path(the_repository, &filename, oid);
    @@ object-file.c: static int freshen_packed_object(const struct object_id *oid)
     +		struct strbuf dir = STRBUF_INIT;
     +		strbuf_add(&dir, filename.buf, dirlen - 1);
     +
    -+		if (mkdir_in_gitdir(dir.buf) < 0) {
    -+			err = -1;
    -+			goto cleanup;
    ++		if (mkdir_in_gitdir(dir.buf) && errno != EEXIST) {
    ++			ret = error_errno(_("unable to create directory %s"), dir.buf);
    ++			strbuf_release(&dir);
    ++			return ret;
     +		}
    ++		strbuf_release(&dir);
     +	}
     +
    -+	err = finalize_object_file_with_mtime(tmp_file.buf, filename.buf, mtime, flags);
    -+cleanup:
    -+	strbuf_release(&tmp_file);
    -+	strbuf_release(&filename);
    -+	return err;
    ++	return finalize_object_file_with_mtime(tmp_file.buf, filename.buf, mtime, flags);
     +}
     +
      int write_object_file_flags(const void *buf, unsigned long len,
      			    const char *type, struct object_id *oid,
      			    unsigned flags)
    -@@ object-file.c: int write_object_file_flags(const void *buf, unsigned long len,
    - 	char hdr[MAX_HEADER_LEN];
    - 	int hdrlen = sizeof(hdr);
    - 
    -+	/* When streaming a large blob object (marked as HASH_STREAM),
    -+	 * we have no chance to run "write_object_file_prepare()" to
    -+	 * calculate the "oid" in advance.  Call "stream_loose_object()"
    -+	 * to write loose object in stream.
    -+	 */
    -+	if (flags & HASH_STREAM) {
    -+		hdrlen = generate_object_header(hdr, hdrlen, type, len);
    -+		return stream_loose_object(oid, hdr, hdrlen, buf, len, 0, flags);
    -+	}
    -+
    - 	/* Normally if we have it in the pack then we do not bother writing
    - 	 * it out into .git/objects/??/?{38} file.
    - 	 */
     
      ## object-store.h ##
     @@ object-store.h: struct object_directory {
    @@ object-store.h: struct object_directory {
      };
      
     +struct input_stream {
    -+	const void *(*read)(const struct input_stream *, unsigned long *len);
    ++	const void *(*read)(struct input_stream *, unsigned long *len);
     +	void *data;
     +};
     +
      KHASH_INIT(odb_path_map, const char * /* key: odb_path */,
      	struct object_directory *, 1, fspathhash, fspatheq)
      
    +@@ object-store.h: static inline int write_object_file(const void *buf, unsigned long len,
    + 	return write_object_file_flags(buf, len, type, oid, 0);
    + }
    + 
    ++int write_stream_object_file(struct input_stream *in_stream, size_t len,
    ++			     enum object_type type, time_t mtime,
    ++			     unsigned flags, struct object_id *oid);
    ++
    + int hash_object_file_literally(const void *buf, unsigned long len,
    + 			       const char *type, struct object_id *oid,
    + 			       unsigned flags);
6:  476aaba527 ! 5:  e7b4e426ef unpack-objects: unpack_non_delta_entry() read data in a stream
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	int status;
     +};
     +
    -+static const void *feed_input_zstream(const struct input_stream *in_stream,
    ++static const void *feed_input_zstream(struct input_stream *in_stream,
     +				      unsigned long *readlen)
     +{
     +	struct input_zstream_data *data = in_stream->data;
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	return data->buf;
     +}
     +
    -+static void write_stream_blob(unsigned nr, unsigned long size)
    ++static void write_stream_blob(unsigned nr, size_t size)
     +{
     +	git_zstream zstream;
     +	struct input_zstream_data data;
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	data.zstream = &zstream;
     +	git_inflate_init(&zstream);
     +
    -+	if (write_object_file_flags(&in_stream, size,
    -+				    type_name(OBJ_BLOB),
    -+				    &obj_list[nr].oid,
    -+				    HASH_STREAM))
    ++	if (write_stream_object_file(&in_stream, size, OBJ_BLOB, 0, 0,
    ++				     &obj_list[nr].oid))
     +		die(_("failed to write object in stream"));
     +
     +	if (zstream.total_out != size || data.status != Z_STREAM_END)
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	git_inflate_end(&zstream);
     +
     +	if (strict) {
    -+		struct blob *blob = lookup_blob(the_repository, &obj_list[nr].oid);
    ++		struct blob *blob =
    ++			lookup_blob(the_repository, &obj_list[nr].oid);
     +		if (blob)
     +			blob->object.flags |= FLAG_WRITTEN;
     +		else
    @@ builtin/unpack-objects.c: static void added_object(unsigned nr, enum object_type
     +	void *buf;
     +
     +	/* Write large blob in stream without allocating full buffer. */
    -+	if (!dry_run && type == OBJ_BLOB && size > big_file_streaming_threshold) {
    ++	if (!dry_run && type == OBJ_BLOB &&
    ++	    size > big_file_streaming_threshold) {
     +		write_stream_blob(nr, size);
     +		return;
     +	}
    @@ environment.c: size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
      const char *editor_program;
      const char *askpass_program;
     
    - ## t/t5590-unpack-non-delta-objects.sh (new) ##
    -@@
    -+#!/bin/sh
    -+#
    -+# Copyright (c) 2021 Han Xin
    -+#
    -+
    -+test_description='Test unpack-objects when receive pack'
    -+
    -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    -+
    -+. ./test-lib.sh
    -+
    -+prepare_dest () {
    -+	test_when_finished "rm -rf dest.git" &&
    -+	git init --bare dest.git &&
    -+	git -C dest.git config core.bigFileStreamingThreshold $1 &&
    -+	git -C dest.git config core.bigFileThreshold $1
    -+}
    -+
    -+test_expect_success "setup repo with big blobs (1.5 MB)" '
    -+	test-tool genrandom foo 1500000 >big-blob &&
    -+	test_commit --append foo big-blob &&
    -+	test-tool genrandom bar 1500000 >big-blob &&
    -+	test_commit --append bar big-blob &&
    -+	(
    -+		cd .git &&
    -+		find objects/?? -type f | sort
    -+	) >expect &&
    -+	PACK=$(echo main | git pack-objects --revs test)
    -+'
    -+
    -+test_expect_success 'setup env: GIT_ALLOC_LIMIT to 1MB' '
    -+	GIT_ALLOC_LIMIT=1m &&
    -+	export GIT_ALLOC_LIMIT
    -+'
    -+
    -+test_expect_success 'fail to unpack-objects: cannot allocate' '
    + ## t/t5590-unpack-non-delta-objects.sh ##
    +@@ t/t5590-unpack-non-delta-objects.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    + prepare_dest () {
    + 	test_when_finished "rm -rf dest.git" &&
    + 	git init --bare dest.git
    ++	if test -n "$1"
    ++	then
    ++		git -C dest.git config core.bigFileStreamingThreshold $1
    ++		git -C dest.git config core.bigFileThreshold $1
    ++	fi
    + }
    + 
    + test_expect_success "setup repo with big blobs (1.5 MB)" '
    +@@ t/t5590-unpack-non-delta-objects.sh: test_expect_success 'setup env: GIT_ALLOC_LIMIT to 1MB' '
    + '
    + 
    + test_expect_success 'fail to unpack-objects: cannot allocate' '
    +-	prepare_dest &&
     +	prepare_dest 2m &&
    -+	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
    -+	grep "fatal: attempting to allocate" err &&
    -+	(
    -+		cd dest.git &&
    -+		find objects/?? -type f | sort
    -+	) >actual &&
    -+	test_file_not_empty actual &&
    -+	! test_cmp expect actual
    -+'
    -+
    + 	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
    + 	grep "fatal: attempting to allocate" err &&
    + 	(
    +@@ t/t5590-unpack-non-delta-objects.sh: test_expect_success 'fail to unpack-objects: cannot allocate' '
    + 	! test_cmp expect actual
    + '
    + 
     +test_expect_success 'unpack big object in stream' '
     +	prepare_dest 1m &&
    ++	mkdir -p dest.git/objects/05 &&
     +	git -C dest.git unpack-objects <test-$PACK.pack &&
     +	git -C dest.git fsck &&
     +	(
    @@ t/t5590-unpack-non-delta-objects.sh (new)
     +	test_must_be_empty actual
     +'
     +
    -+test_expect_success 'unpack-objects dry-run' '
    -+	prepare_dest 1m &&
    -+	git -C dest.git unpack-objects -n <test-$PACK.pack &&
    -+	(
    -+		cd dest.git &&
    -+		find objects/ -type f
    -+	) >actual &&
    -+	test_must_be_empty actual
    -+'
    -+
    -+test_done
    + test_expect_success 'unpack-objects dry-run' '
    + 	prepare_dest &&
    + 	git -C dest.git unpack-objects -n <test-$PACK.pack &&
-- 
2.34.1.52.g80008efde6.agit.6.5.6


  reply	other threads:[~2021-12-21 11:54 UTC|newest]

Thread overview: 211+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-09  8:20 [PATCH] unpack-objects: unpack large object in stream Han Xin
2021-10-19  7:37 ` Han Xin
2021-10-20 14:42 ` Philip Oakley
2021-10-21  3:42   ` Han Xin
2021-10-21 22:47     ` Philip Oakley
2021-11-03  1:48 ` Han Xin
2021-11-03 10:07   ` Philip Oakley
2021-11-12  9:40 ` [PATCH v2 1/6] object-file: refactor write_loose_object() to support inputstream Han Xin
2021-11-18  4:59   ` Jiang Xin
2021-11-18  6:45     ` Junio C Hamano
2021-11-12  9:40 ` [PATCH v2 2/6] object-file.c: add dry_run mode for write_loose_object() Han Xin
2021-11-18  5:42   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 3/6] object-file.c: handle nil oid in write_loose_object() Han Xin
2021-11-18  5:49   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 4/6] object-file.c: read input stream repeatedly " Han Xin
2021-11-18  5:56   ` Jiang Xin
2021-11-12  9:40 ` [PATCH v2 5/6] object-store.h: add write_loose_object() Han Xin
2021-11-12  9:40 ` [PATCH v2 6/6] unpack-objects: unpack large object in stream Han Xin
2021-11-18  7:14   ` Jiang Xin
2021-11-22  3:32 ` [PATCH v3 0/5] unpack large objects " Han Xin
2021-11-29  7:01   ` Han Xin
2021-11-29 19:12     ` Jeff King
2021-11-30  2:57       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 " Han Xin
2021-12-07 16:18     ` Derrick Stolee
2021-12-10 10:34     ` [PATCH v5 0/6] unpack large blobs " Han Xin
2021-12-17 11:26       ` Han Xin
2021-12-21 11:51         ` Han Xin [this message]
2021-12-21 11:51         ` [PATCH v7 1/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-21 14:09           ` Ævar Arnfjörð Bjarmason
2021-12-21 14:43             ` René Scharfe
2021-12-21 15:04               ` Ævar Arnfjörð Bjarmason
2021-12-22 11:15               ` Jiang Xin
2021-12-22 11:29             ` Jiang Xin
2021-12-31  3:06           ` Jiang Xin
2021-12-21 11:51         ` [PATCH v7 2/5] object-file API: add a format_object_header() function Han Xin
2021-12-21 14:30           ` René Scharfe
2022-02-01 14:28             ` C99 %z (was: [PATCH v7 2/5] object-file API: add a format_object_header() function) Ævar Arnfjörð Bjarmason
2021-12-31  3:12           ` [PATCH v7 2/5] object-file API: add a format_object_header() function Jiang Xin
2021-12-21 11:51         ` [PATCH v7 3/5] object-file.c: refactor write_loose_object() to reuse in stream version Han Xin
2021-12-21 14:16           ` Ævar Arnfjörð Bjarmason
2021-12-22 12:02             ` Jiang Xin
2021-12-21 11:52         ` [PATCH v7 4/5] object-file.c: add "write_stream_object_file()" to support read in stream Han Xin
2021-12-21 14:20           ` Ævar Arnfjörð Bjarmason
2021-12-21 15:05             ` Ævar Arnfjörð Bjarmason
2021-12-21 11:52         ` [PATCH v7 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-21 15:06           ` Ævar Arnfjörð Bjarmason
2021-12-31  3:19           ` Jiang Xin
2022-01-08  8:54         ` [PATCH v8 0/6] unpack large blobs in stream Han Xin
2022-01-20 11:21           ` [PATCH v9 0/5] " Han Xin
2022-02-01 21:24             ` Ævar Arnfjörð Bjarmason
2022-02-02  8:32               ` Han Xin
2022-02-02 10:59                 ` Ævar Arnfjörð Bjarmason
2022-02-04 14:07             ` [PATCH v10 0/6] unpack-objects: support streaming large objects to disk Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 2/6] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 3/6] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 4/6] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 5/6] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-02-04 14:07               ` [PATCH v10 6/6] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-03-19  0:23               ` [PATCH v11 0/8] unpack-objects: support streaming blobs to disk Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 1/8] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 2/8] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 3/8] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-03-19 10:11                   ` René Scharfe
2022-03-19  0:23                 ` [PATCH v11 4/8] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 5/8] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 6/8] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 7/8] unpack-objects: refactor away unpack_non_delta_entry() Ævar Arnfjörð Bjarmason
2022-03-19  0:23                 ` [PATCH v11 8/8] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-03-29 13:56                 ` [PATCH v12 0/8] unpack-objects: support streaming blobs to disk Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 1/8] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 2/8] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 3/8] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-03-30  7:13                     ` Han Xin
2022-03-30 17:34                       ` Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 4/8] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 5/8] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-03-31 19:54                     ` Neeraj Singh
2022-03-29 13:56                   ` [PATCH v12 6/8] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-03-29 13:56                   ` [PATCH v12 7/8] unpack-objects: refactor away unpack_non_delta_entry() Ævar Arnfjörð Bjarmason
2022-03-30 19:40                     ` René Scharfe
2022-03-31 12:42                       ` Ævar Arnfjörð Bjarmason
2022-03-31 16:38                         ` René Scharfe
2022-03-29 13:56                   ` [PATCH v12 8/8] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-06-04 10:10                   ` [PATCH v13 0/7] unpack-objects: support streaming blobs to disk Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 1/7] unpack-objects: low memory footprint for get_data() in dry_run mode Ævar Arnfjörð Bjarmason
2022-06-06 18:35                       ` Junio C Hamano
2022-06-09  4:10                         ` Han Xin
2022-06-09 18:27                           ` Junio C Hamano
2022-06-10  1:50                             ` Han Xin
2022-06-10  2:05                               ` Ævar Arnfjörð Bjarmason
2022-06-10 12:04                                 ` Han Xin
2022-06-04 10:10                     ` [PATCH v13 2/7] object-file.c: do fsync() and close() before post-write die() Ævar Arnfjörð Bjarmason
2022-06-06 18:45                       ` Junio C Hamano
2022-06-04 10:10                     ` [PATCH v13 3/7] object-file.c: refactor write_loose_object() to several steps Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 4/7] object-file.c: factor out deflate part of write_loose_object() Ævar Arnfjörð Bjarmason
2022-06-04 10:10                     ` [PATCH v13 5/7] object-file.c: add "stream_loose_object()" to handle large object Ævar Arnfjörð Bjarmason
2022-06-06 19:44                       ` Junio C Hamano
2022-06-06 20:02                         ` Junio C Hamano
2022-06-09  6:04                           ` Han Xin
2022-06-09  6:14                         ` Han Xin
2022-06-07 19:53                       ` Neeraj Singh
2022-06-08 15:34                         ` Junio C Hamano
2022-06-09  3:05                         ` [RFC PATCH] object-file.c: batched disk flushes for stream_loose_object() Han Xin
2022-06-09  7:35                           ` Neeraj Singh
2022-06-09  9:30                           ` Johannes Schindelin
2022-06-10 12:55                             ` Han Xin
2022-06-04 10:10                     ` [PATCH v13 6/7] core doc: modernize core.bigFileThreshold documentation Ævar Arnfjörð Bjarmason
2022-06-06 19:50                       ` Junio C Hamano
2022-06-04 10:10                     ` [PATCH v13 7/7] unpack-objects: use stream_loose_object() to unpack large objects Ævar Arnfjörð Bjarmason
2022-06-10 14:46                     ` [PATCH v14 0/7] unpack-objects: support streaming blobs to disk Han Xin
2022-06-10 14:46                       ` [PATCH v14 1/7] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-06-10 14:46                       ` [PATCH v14 2/7] object-file.c: do fsync() and close() before post-write die() Han Xin
2022-06-10 21:10                         ` René Scharfe
2022-06-10 21:33                           ` Junio C Hamano
2022-06-11  1:50                             ` Han Xin
2022-06-10 14:46                       ` [PATCH v14 3/7] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-06-10 14:46                       ` [PATCH v14 4/7] object-file.c: factor out deflate part of write_loose_object() Han Xin
2022-06-10 14:46                       ` [PATCH v14 5/7] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-06-10 14:46                       ` [PATCH v14 6/7] core doc: modernize core.bigFileThreshold documentation Han Xin
2022-06-10 21:01                         ` Junio C Hamano
2022-06-10 14:46                       ` [PATCH v14 7/7] unpack-objects: use stream_loose_object() to unpack large objects Han Xin
2022-06-11  2:44                       ` [PATCH v15 0/6] unpack-objects: support streaming blobs to disk Han Xin
2022-06-11  2:44                         ` [PATCH v15 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-06-11  2:44                         ` [PATCH v15 2/6] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-06-11  2:44                         ` [PATCH v15 3/6] object-file.c: factor out deflate part of write_loose_object() Han Xin
2022-06-11  2:44                         ` [PATCH v15 4/6] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-06-11  2:44                         ` [PATCH v15 5/6] core doc: modernize core.bigFileThreshold documentation Han Xin
2022-06-11  2:44                         ` [PATCH v15 6/6] unpack-objects: use stream_loose_object() to unpack large objects Han Xin
2022-07-01  2:01                           ` Junio C Hamano
2022-05-20  3:05                 ` [PATCH 0/1] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-05-20  3:05                   ` [PATCH 1/1] " Han Xin
2022-01-20 11:21           ` [PATCH v9 1/5] " Han Xin
2022-01-20 11:21           ` [PATCH v9 2/5] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-01-20 11:21           ` [PATCH v9 3/5] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-01-20 11:21           ` [PATCH v9 4/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2022-01-20 11:21           ` [PATCH v9 5/5] object-file API: add a format_object_header() function Han Xin
2022-01-08  8:54         ` [PATCH v8 1/6] unpack-objects: low memory footprint for get_data() in dry_run mode Han Xin
2022-01-08 12:28           ` René Scharfe
2022-01-11 10:41             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 2/6] object-file.c: refactor write_loose_object() to several steps Han Xin
2022-01-08 12:28           ` René Scharfe
2022-01-11 10:33             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 3/6] object-file.c: remove the slash for directory_size() Han Xin
2022-01-08 17:24           ` René Scharfe
2022-01-11 10:14             ` Han Xin
2022-01-08  8:54         ` [PATCH v8 4/6] object-file.c: add "stream_loose_object()" to handle large object Han Xin
2022-01-08  8:54         ` [PATCH v8 5/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2022-01-08  8:54         ` [PATCH v8 6/6] object-file API: add a format_object_header() function Han Xin
2021-12-17 11:26       ` [PATCH v6 1/6] object-file.c: release strbuf in write_loose_object() Han Xin
2021-12-17 19:28         ` René Scharfe
2021-12-18  0:09           ` Junio C Hamano
2021-12-17 11:26       ` [PATCH v6 2/6] object-file.c: refactor object header generation into a function Han Xin
2021-12-20 12:10         ` [RFC PATCH] object-file API: add a format_loose_header() function Ævar Arnfjörð Bjarmason
2021-12-20 12:48           ` Philip Oakley
2021-12-20 22:25           ` Junio C Hamano
2021-12-21  1:42             ` Ævar Arnfjörð Bjarmason
2021-12-21  2:11               ` Junio C Hamano
2021-12-21  2:27                 ` Ævar Arnfjörð Bjarmason
2021-12-21 11:43           ` Han Xin
2021-12-17 11:26       ` [PATCH v6 3/6] object-file.c: refactor write_loose_object() to reuse in stream version Han Xin
2021-12-17 11:26       ` [PATCH v6 4/6] object-file.c: make "write_object_file_flags()" to support read in stream Han Xin
2021-12-17 22:52         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 5/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-17 21:22         ` René Scharfe
2021-12-17 11:26       ` [PATCH v6 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-10 10:34     ` [PATCH v5 1/6] object-file: refactor write_loose_object() to support read from stream Han Xin
2021-12-10 10:34     ` [PATCH v5 2/6] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-13  7:32       ` Ævar Arnfjörð Bjarmason
2021-12-10 10:34     ` [PATCH v5 3/6] object-file.c: read stream in a loop " Han Xin
2021-12-10 10:34     ` [PATCH v5 4/6] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-10 10:34     ` [PATCH v5 5/6] object-file.c: make "write_object_file_flags()" to support "HASH_STREAM" Han Xin
2021-12-10 10:34     ` [PATCH v5 6/6] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-13  8:05       ` Ævar Arnfjörð Bjarmason
2021-12-03  9:35   ` [PATCH v4 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-12-03 13:28     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:07       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-12-03 13:21     ` Ævar Arnfjörð Bjarmason
2021-12-06  2:51       ` Han Xin
2021-12-03 13:41     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:12       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 3/5] object-file.c: read stream in a loop " Han Xin
2021-12-03  9:35   ` [PATCH v4 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-12-03 13:59     ` Ævar Arnfjörð Bjarmason
2021-12-06  3:20       ` Han Xin
2021-12-03  9:35   ` [PATCH v4 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-03 13:07     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:42       ` Han Xin
2021-12-03 13:54     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:17       ` Han Xin
2021-12-03 14:05     ` Ævar Arnfjörð Bjarmason
2021-12-07  6:48       ` Han Xin
2021-11-22  3:32 ` [PATCH v3 1/5] object-file: refactor write_loose_object() to read buffer from stream Han Xin
2021-11-23 23:24   ` Junio C Hamano
2021-11-24  9:00     ` Han Xin
2021-11-22  3:32 ` [PATCH v3 2/5] object-file.c: handle undetermined oid in write_loose_object() Han Xin
2021-11-29 15:10   ` Derrick Stolee
2021-11-29 20:44     ` Junio C Hamano
2021-11-29 22:18       ` Derrick Stolee
2021-11-30  3:23         ` Han Xin
2021-11-22  3:32 ` [PATCH v3 3/5] object-file.c: read stream in a loop " Han Xin
2021-11-22  3:32 ` [PATCH v3 4/5] unpack-objects.c: add dry_run mode for get_data() Han Xin
2021-11-22  3:32 ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-11-29 17:37   ` Derrick Stolee
2021-11-30 13:49     ` Han Xin
2021-11-30 18:38       ` Derrick Stolee
2021-12-01 20:37         ` "git hyperfine" (was: [PATCH v3 5/5] unpack-objects[...]) Ævar Arnfjörð Bjarmason
2021-12-02  7:33         ` [PATCH v3 5/5] unpack-objects: unpack_non_delta_entry() read data in a stream Han Xin
2021-12-02 13:53           ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211221115201.12120-1-chiyutianyi@gmail.com \
    --to=chiyutianyi@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hanxin.hx@alibaba-inc.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    --cc=philipoakley@iee.email \
    --cc=stolee@gmail.com \
    --cc=zhiyou.jx@alibaba-inc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).