From: Nicolas Pitre <nico@fluxnic.net>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] pack-objects: use streaming interface for reading large loose blobs
Date: Sat, 12 May 2012 12:51:05 -0400 (EDT)
Message-ID: <alpine.LFD.2.02.1205121220070.21030@xanadu.home>
In-Reply-To: <1336818375-16895-1-git-send-email-pclouds@gmail.com>

On Sat, 12 May 2012, Nguyễn Thái Ngọc Duy wrote:

> git usually streams large blobs directly to packs. But there are cases
> where git can create large loose blobs (unpack-objects or hash-object
> over pipe). Or they can come from other git implementations.
> core.bigfilethreshold can also be lowered down and introduce a new
> wave of large loose blobs.
> 
> Use streaming interface to read these blobs and compress/write at the
> same time.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

Comments below.

> ---
>  index-pack's streaming support is on the way. unpack-objects is
>  another story because I'm thinking of merging it back to index-pack
>  first, which may take more than one release cycle.
> 
>  builtin/pack-objects.c |   73 ++++++++++++++++++++++++++++++++++++++++++++----
>  t/t1050-large.sh       |   16 ++++++++++
>  2 files changed, 83 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 1861093..98b51c1 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -259,9 +309,14 @@ static unsigned long write_object(struct sha1file *f,
>  	if (!to_reuse) {
>  		no_reuse:
>  		if (!usable_delta) {
> -			buf = read_sha1_file(entry->idx.sha1, &type, &size);
> -			if (!buf)
> -				die("unable to read %s", sha1_to_hex(entry->idx.sha1));
> +			type = sha1_object_info(entry->idx.sha1, &size);

Please don't use sha1_object_info() lightly.  This is a potentially 
expensive operation, and you really don't want to do it on every object.

And as a matter of fact, the information you are looking for has already 
been determined earlier.  See the code in check_object() which tries 
hard to avoid sha1_object_info() as much as possible.

Therefore you should have entry->type and entry->size already set for 
you to use.
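
To illustrate the point, here is a rough standalone sketch, not git's 
actual code.  The struct, the enum, slow_object_info() and the threshold 
parameter are all hypothetical stand-ins: the struct mimics an entry whose 
type and size were cached by an earlier scan, and the slow function mimics 
a per-object header lookup.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not git's real definitions. */
enum sketch_type { SKETCH_NONE = 0, SKETCH_BLOB = 3 };

struct entry_sketch {
	enum sketch_type type;	/* assumed cached by an earlier scan */
	unsigned long size;	/* assumed cached by an earlier scan */
};

/* Counts how many simulated expensive lookups were performed. */
static int lookups;

/* Simulates a costly per-object lookup of type and size. */
static enum sketch_type slow_object_info(const struct entry_sketch *e,
					 unsigned long *size)
{
	lookups++;
	*size = e->size;
	return e->type;
}

/* Asks the (simulated) object store again for every object. */
static int should_stream_slow(const struct entry_sketch *e,
			      unsigned long threshold)
{
	unsigned long size;
	enum sketch_type type = slow_object_info(e, &size);

	return type == SKETCH_BLOB && size > threshold;
}

/* Reuses the values already stored in the entry; no lookup needed. */
static int should_stream_cached(const struct entry_sketch *e,
				unsigned long threshold)
{
	return e->type == SKETCH_BLOB && e->size > threshold;
}
```

Both helpers reach the same decision, but only the first one pays for a 
lookup per object, which adds up when it runs for every entry being 
written.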


Nicolas
