From: Han Xin <chiyutianyi@gmail.com>
To: Junio C Hamano <gitster@pobox.com>,
Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>,
Jiang Xin <zhiyou.jx@alibaba-inc.com>,
Philip Oakley <philipoakley@iee.email>
Cc: Han Xin <hanxin.hx@alibaba-inc.com>
Subject: [PATCH v2 4/6] object-file.c: read input stream repeatedly in write_loose_object()
Date: Fri, 12 Nov 2021 17:40:08 +0800
Message-ID: <20211112094010.73468-4-chiyutianyi@gmail.com>
In-Reply-To: <20211009082058.41138-1-chiyutianyi@gmail.com>
From: Han Xin <hanxin.hx@alibaba-inc.com>
Read the input stream repeatedly in write_loose_object() until it
reaches the end, so that writing a large blob can be split into many
small blocks.
Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
---
object-file.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/object-file.c b/object-file.c
index 8393659f0d..e333448c54 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1891,7 +1891,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
static struct strbuf tmp_file = STRBUF_INIT;
static struct strbuf filename = STRBUF_INIT;
const char *buf;
- unsigned long len;
+ int flush = 0;
if (is_null_oid(oid)) {
/* When oid is not determined, save tmp file to odb path. */
@@ -1927,12 +1927,16 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
the_hash_algo->update_fn(&c, hdr, hdrlen);
/* Then the data itself.. */
- buf = in_stream->read(in_stream->data, &len);
- stream.next_in = (void *)buf;
- stream.avail_in = len;
do {
unsigned char *in0 = stream.next_in;
- ret = git_deflate(&stream, Z_FINISH);
+ if (!stream.avail_in) {
+ if ((buf = in_stream->read(in_stream->data, &stream.avail_in))) {
+ stream.next_in = (void *)buf;
+ in0 = (unsigned char *)buf;
+ } else
+ flush = Z_FINISH;
+ }
+ ret = git_deflate(&stream, flush);
the_hash_algo->update_fn(&c, in0, stream.next_in - in0);
if (!dry_run && write_buffer(fd, compressed, stream.next_out - compressed) < 0)
die(_("unable to write loose object file"));
--
2.33.1.44.g9344627884.agit.6.5.4
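
The control flow of the new loop can be tried outside of git with a toy
model. The sketch below is only an illustration under stated assumptions:
the shape of struct input_stream is inferred from the call
in_stream->read(in_stream->data, &len) in the diff, while chunk_source,
toy_deflate() and drain_stream() are hypothetical names. toy_deflate() is
not zlib; it merely consumes at most three bytes per call, so the loop
sees both a partly consumed buffer and a drained one.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of the series' input_stream: read() returns the next
 * chunk and stores its length in *len, or returns NULL at end of input. */
struct input_stream {
	const void *(*read)(void *data, unsigned long *len);
	void *data;
};

/* Hypothetical source that hands out a memory buffer in fixed-size chunks. */
struct chunk_source {
	const char *buf;
	size_t size, pos, chunk;
};

static const void *chunk_read(void *data, unsigned long *len)
{
	struct chunk_source *src = data;
	const char *p = src->buf + src->pos;
	size_t n = src->size - src->pos;

	if (!n)
		return NULL;		/* end of input, *len is left untouched */
	if (n > src->chunk)
		n = src->chunk;
	src->pos += n;
	*len = (unsigned long)n;
	return p;
}

/* Toy stand-in for git_deflate(); NOT zlib. */
enum { TOY_OK, TOY_STREAM_END };
enum { TOY_NO_FLUSH = 0, TOY_FINISH = 1 };

struct toy_stream {
	const unsigned char *next_in;
	unsigned long avail_in;
};

/* Consumes at most 3 bytes per call; ends once asked to finish and drained. */
static int toy_deflate(struct toy_stream *s, int flush)
{
	unsigned long n = s->avail_in < 3 ? s->avail_in : 3;

	s->next_in += n;
	s->avail_in -= n;
	return (flush == TOY_FINISH && !s->avail_in) ? TOY_STREAM_END : TOY_OK;
}

/* Same structure as the patched loop: refill only when the previous chunk
 * is fully consumed, and switch to "finish" once read() returns NULL.
 * Returns the number of bytes consumed; *reads counts non-empty reads. */
static unsigned long drain_stream(struct input_stream *in, unsigned long *reads)
{
	static const unsigned char empty[1] = { 0 };
	struct toy_stream stream = { empty, 0 };
	unsigned long total = 0;
	int flush = TOY_NO_FLUSH, ret;

	assert(in && in->read && reads);
	*reads = 0;
	do {
		const unsigned char *in0 = stream.next_in;

		if (!stream.avail_in) {
			const void *buf = in->read(in->data, &stream.avail_in);
			if (buf) {
				stream.next_in = buf;
				in0 = buf;	/* restart the hashed range */
				(*reads)++;
			} else {
				flush = TOY_FINISH;
			}
		}
		ret = toy_deflate(&stream, flush);
		/* The real code feeds [in0, next_in) to the hash here. */
		total += (unsigned long)(stream.next_in - in0);
	} while (ret == TOY_OK);
	return total;
}
```

Feeding a 10-byte buffer in chunks of 4 through drain_stream() takes three
reads and consumes all 10 bytes. Resetting in0 after a refill matters:
without it, the range passed to the hash update would span two unrelated
buffers.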