All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Taylor Blau <me@ttaylorr.com>, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 1/2] blob.c: remove buffer & size arguments to parse_blob_buffer()
Date: Sat, 10 Apr 2021 14:57:12 +0200	[thread overview]
Message-ID: <87y2dqfgbb.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YHCUFwZMitOXELpq@coredump.intra.peff.net>


On Fri, Apr 09 2021, Jeff King wrote:

> On Fri, Apr 09, 2021 at 10:07:27AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> As noted in the comment introduced in 837d395a5c0 (Replace
>> parse_blob() with an explanatory comment, 2010-01-18) the old
>> parse_blob() function and the current parse_blob_buffer() exist merely
>> to provide consistency in the API.
>> 
>> We're not going to parse blobs like we "parse" commits, trees or
>> tags. So let's not have the parse_blob_buffer() take arguments that
>> pretends that we do. Its only use is to set the "parsed" flag.
>> 
>> See bd2c39f58f9 ([PATCH] don't load and decompress objects twice with
>> parse_object(), 2005-05-06) for the introduction of parse_blob_buffer().
>
> OK. Calling it parse_blob_buffer() is a little silly since it doesn't
> even take a buffer anymore. But I guess parse_blob() might imply that it
> actually loads the contents from disk to check them (which the other
> parse_foo() functions do), so that's not a good name.
>
> So this might be the least bad thing. Given that there are only two
> callers, just setting blob->object.parsed might not be unreasonable,
> either. But I don't think it's worth spending too much time on.
>
>> @@ -266,7 +266,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
>>  			error(_("hash mismatch %s"), oid_to_hex(oid));
>>  			return NULL;
>>  		}
>> -		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
>> +		parse_blob_buffer(lookup_blob(r, oid));
>>  		return lookup_object(r, oid);
>
> Not new in your patch, but I wondered if this could cause a segfault
> when lookup_blob() returns NULL. I _think_ the answer is "no". We'd hit
> this code path when either:
>
>   - lookup_object() returns an object with type OBJ_BLOB, in which case
>     lookup_blob() would return that same object
>
>   - lookup_object() returned NULL, in which case lookup_blob() will call
>     it again, get NULL again, and then auto-create the blob and return
>     it
>
> So I think it is OK. But there are a bunch of duplicate hash lookups in
> this code. It would be clearer and more efficient as:
>
> diff --git a/object.c b/object.c
> index 2c32691dc4..2dfa038f13 100644
> --- a/object.c
> +++ b/object.c
> @@ -262,12 +262,14 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
>  	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
>  	    (!obj && repo_has_object_file(r, oid) &&
>  	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
> +		if (!obj)
> +			obj = create_object(r, oid, alloc_blob_node(r));
>  		if (check_object_signature(r, repl, NULL, 0, NULL) < 0) {
>  			error(_("hash mismatch %s"), oid_to_hex(oid));
>  			return NULL;
>  		}
> -		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
> -		return lookup_object(r, oid);
> +		parse_blob_buffer(obj, NULL, 0);
> +		return obj;
>  	}
>  
>  	buffer = repo_read_object_file(r, oid, &type, &size);
>
> but I doubt the efficiency matters much in practice. Those hash lookups
> will be lost in the noise of computing the hash of the blob contents.

I was trying to keep the changes smaller, but what about just doing this?:

diff --git a/blob.c b/blob.c
index 182718aba9..69293e7d8e 100644
--- a/blob.c
+++ b/blob.c
@@ -5,16 +5,16 @@
 
 const char *blob_type = "blob";
 
+struct blob *create_blob(struct repository *r, const struct object_id *oid)
+{
+	return create_object(r, oid, alloc_blob_node(r));
+}
+
 struct blob *lookup_blob(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid, alloc_blob_node(r));
-	return object_as_type(obj, OBJ_BLOB, 0);
-}
+		return create_blob(r, oid);
 
-int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size)
-{
-	item->object.parsed = 1;
-	return 0;
+	return object_as_type(obj, OBJ_BLOB, 0);
 }
diff --git a/blob.h b/blob.h
index 1664872055..ad34f0e9cc 100644
--- a/blob.h
+++ b/blob.h
@@ -9,10 +9,9 @@ struct blob {
 	struct object object;
 };
 
+struct blob *create_blob(struct repository *r, const struct object_id *oid);
 struct blob *lookup_blob(struct repository *r, const struct object_id *oid);
 
-int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size);
-
 /**
  * Blobs do not contain references to other objects and do not have
  * structured data that needs parsing. However, code may use the
diff --git a/object.c b/object.c
index 78343781ae..2699431404 100644
--- a/object.c
+++ b/object.c
@@ -195,8 +195,7 @@ struct object *parse_object_buffer(struct repository *r, const struct object_id
 	if (type == OBJ_BLOB) {
 		struct blob *blob = lookup_blob(r, oid);
 		if (blob) {
-			if (parse_blob_buffer(blob, buffer, size))
-				return NULL;
+			blob->object.parsed = 1;
 			obj = &blob->object;
 		}
 	} else if (type == OBJ_TREE) {
@@ -262,12 +261,16 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
 	    (!obj && repo_has_object_file(r, oid) &&
 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
+		struct blob *blob;
+		if (!obj)
+			blob = create_blob(r, oid);
 		if (check_object_signature(r, repl, NULL, 0, NULL) < 0) {
 			error(_("hash mismatch %s"), oid_to_hex(oid));
 			return NULL;
 		}
-		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
-		return lookup_object(r, oid);
+		obj = &blob->object;
+		obj->parsed = 1;
+		return obj;
 	}
 
 	buffer = repo_read_object_file(r, oid, &type, &size);

  parent reply	other threads:[~2021-04-10 12:57 UTC|newest]

Thread overview: 142+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-22  0:33 [PATCH 0/2] Pull objects of various types Daniel Barkalow
2005-06-22  0:35 ` [PATCH 1/2] Parse tags for absent objects Daniel Barkalow
2021-03-08 20:04   ` [PATCH 0/7] improve reporting of unexpected objects Ævar Arnfjörð Bjarmason
2021-03-28  2:13     ` [PATCH v2 00/10] " Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 01/10] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-28  5:35         ` Junio C Hamano
2021-03-28 15:46           ` Ævar Arnfjörð Bjarmason
2021-03-28 18:25             ` Junio C Hamano
2021-04-22 18:09               ` Felipe Contreras
2021-03-28  2:13       ` [PATCH v2 02/10] object.c: refactor type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 03/10] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 04/10] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 05/10] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 06/10] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 07/10] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 08/10] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 09/10] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2021-03-30  5:50         ` Junio C Hamano
2021-03-31 11:02           ` Jeff King
2021-03-31 18:05             ` Junio C Hamano
2021-03-31 18:31             ` Ævar Arnfjörð Bjarmason
2021-03-31 18:59               ` Jeff King
2021-03-31 20:46                 ` Ævar Arnfjörð Bjarmason
2021-04-01  7:54                   ` Jeff King
2021-04-01  8:32                     ` [PATCH] ref-filter: fix NULL check for parse object failure Jeff King
2021-04-01 13:56                       ` [PATCH v2 0/5] mktag tests & fix for-each-ref segfault Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 1/5] mktag tests: parse out options in helper Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 2/5] mktag tests: invert --no-strict test Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 3/5] mktag tests: do fsck on failure Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 4/5] mktag tests: test for maybe segfaulting for-each-ref Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 5/5] ref-filter: fix NULL check for parse object failure Ævar Arnfjörð Bjarmason
2021-04-01 19:19                           ` Ramsay Jones
2021-04-01 19:56                         ` [PATCH v2 0/5] mktag tests & fix for-each-ref segfault Junio C Hamano
2021-04-02 11:37                           ` Ævar Arnfjörð Bjarmason
2021-04-02 20:51                             ` Junio C Hamano
2021-04-01 19:52                       ` [PATCH] ref-filter: fix NULL check for parse object failure Junio C Hamano
2021-03-31 18:41             ` [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors Junio C Hamano
2021-03-31 19:00               ` Jeff King
2021-03-28  9:27       ` [PATCH v2 00/10] improve reporting of unexpected objects Jeff King
2021-03-29 13:34         ` Ævar Arnfjörð Bjarmason
2021-03-31 10:43           ` Jeff King
2021-04-09  8:07       ` [PATCH 0/2] blob/object.c: trivial readability improvements Ævar Arnfjörð Bjarmason
2021-04-09  8:07         ` [PATCH 1/2] blob.c: remove buffer & size arguments to parse_blob_buffer() Ævar Arnfjörð Bjarmason
2021-04-09 17:51           ` Jeff King
2021-04-09 22:31             ` Junio C Hamano
2021-04-10 12:57             ` Ævar Arnfjörð Bjarmason [this message]
2021-04-10 13:01               ` Ævar Arnfjörð Bjarmason
2021-04-13  8:25               ` Jeff King
2021-04-09  8:07         ` [PATCH 2/2] object.c: initialize automatic variable in lookup_object() Ævar Arnfjörð Bjarmason
2021-04-09 17:53           ` Jeff King
2021-04-09 22:32             ` Junio C Hamano
2021-04-09  8:32         ` [PATCH 0/6] {tag,object}*.c: refactorings + prep for a larger change Ævar Arnfjörð Bjarmason
2021-04-09  8:32           ` [PATCH 1/6] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-09 18:06             ` Jeff King
2021-04-09 18:10               ` Jeff King
2021-04-09  8:32           ` [PATCH 2/6] object.c: remove "gently" argument to type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-09 18:10             ` Jeff King
2021-04-09  8:32           ` [PATCH 3/6] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-04-09 18:14             ` Jeff King
2021-04-09 19:42               ` Ævar Arnfjörð Bjarmason
2021-04-09 21:29                 ` Jeff King
2021-04-09  8:32           ` [PATCH 4/6] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-04-09 18:24             ` Jeff King
2021-04-09  8:32           ` [PATCH 5/6] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-04-09 18:36             ` Jeff King
2021-04-09  8:32           ` [PATCH 6/6] tag.c: use type_from_string_gently() when parsing tags Ævar Arnfjörð Bjarmason
2021-04-09 18:42             ` Jeff King
2021-04-09  8:49           ` [PATCH 0/7] object.c: add and use "is expected" utility function + object_as_type() use Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 1/7] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 2/7] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 3/7] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 4/7] commit-graph: use obj->type, not object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 5/7] commit.c: don't use deref_tag() -> object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 6/7] object.c: normalize brace style in object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 7/7] object.c: remove "quiet" parameter from object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:36             ` [PATCH v2 0/8] object.c: add and use "is expected" utility function + object_as_type() use Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 1/8] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 2/8] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-04-21 22:02                 ` Jonathan Tan
2021-04-22  6:10                   ` Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 3/8] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-04-21 22:07                 ` Jonathan Tan
2021-04-21 23:28                 ` Josh Steadmon
2021-04-28  4:12                   ` Junio C Hamano
2021-04-20 13:36               ` [PATCH v2 4/8] commit-graph: use obj->type, not object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 5/8] branch tests: assert lookup_commit_reference_gently() error Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 6/8] commit.c: don't use deref_tag() -> object_as_type() Ævar Arnfjörð Bjarmason
2021-04-21 22:26                 ` Jonathan Tan
2021-04-20 13:36               ` [PATCH v2 7/8] object.c: normalize brace style in object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:37               ` [PATCH v2 8/8] object.c: remove "quiet" parameter from object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:00           ` [PATCH v2 00/10] {tag,object}*.c: refactorings + prep for a larger change Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 01/10] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 02/10] object.c: remove "gently" argument to type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 03/10] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 04/10] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 05/10] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 06/10] tag.c: use type_from_string_gently() when parsing tags Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 07/10] hash-object: pass along type length to object.c Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 08/10] hash-object: refactor nested else/if/if into else if/else if Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 09/10] hash-object: show usage on invalid --type Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 10/10] object.c: move type_from_string() code to its last user Ævar Arnfjörð Bjarmason
2021-04-20 12:50         ` [PATCH v2 00/10] object.c et al: tests, small bug fixes etc Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 01/10] cat-file tests: test for bogus type name handling Ævar Arnfjörð Bjarmason
2021-04-29  4:15             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 02/10] hash-object tests: more detailed test for invalid type Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 03/10] mktree tests: add test for invalid object type Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 04/10] object-file.c: take type id, not string, in read_object_with_reference() Ævar Arnfjörð Bjarmason
2021-04-29  4:37             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 05/10] {commit,tree,blob,tag}.c: add a create_{commit,tree,blob,tag}() Ævar Arnfjörð Bjarmason
2021-04-29  4:45             ` Junio C Hamano
2021-04-29 12:01               ` Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 06/10] blob.c: remove parse_blob_buffer() Ævar Arnfjörð Bjarmason
2021-04-29  4:51             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 07/10] object.c: simplify return semantic of parse_object_buffer() Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 08/10] object.c: don't go past "len" under die() in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-29  4:55             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 09/10] mktree: stop setting *ntr++ to NIL Ævar Arnfjörð Bjarmason
2021-04-29  5:01             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 10/10] mktree: emit a more detailed error when the <type> is invalid Ævar Arnfjörð Bjarmason
2021-03-08 20:04   ` [PATCH 1/7] object.c: refactor type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-08 20:52     ` Taylor Blau
2021-03-09 10:46     ` Jeff King
2021-03-08 20:04   ` [PATCH 2/7] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-03-08 20:56     ` Taylor Blau
2021-03-08 21:48     ` Junio C Hamano
2021-03-08 20:04   ` [PATCH 3/7] oid_object_info(): " Ævar Arnfjörð Bjarmason
2021-03-08 21:54     ` Junio C Hamano
2021-03-08 22:32       ` Junio C Hamano
2021-03-09 10:34     ` Jeff King
2021-03-08 20:04   ` [PATCH 4/7] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-03-08 20:04   ` [PATCH 5/7] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-03-08 20:59     ` Taylor Blau
2021-03-08 22:15     ` Junio C Hamano
2021-03-08 20:04   ` [PATCH 6/7] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2021-03-09 10:44     ` Jeff King
2021-03-28  1:35       ` Ævar Arnfjörð Bjarmason
2021-03-28  9:06         ` Jeff King
2021-03-28 15:39           ` Ævar Arnfjörð Bjarmason
2021-03-29  9:16             ` Jeff King
2021-03-08 20:04   ` [PATCH 7/7] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2005-06-22  0:35 ` [PATCH 2/2] Pull misc objects Daniel Barkalow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y2dqfgbb.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.