All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Oliver <roliver@roku.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Taylor Blau <me@ttaylorr.com>,
	Derrick Stolee <derrickstolee@github.com>,
	git@vger.kernel.org, jonathantanmy@google.com
Subject: [PATCH] mktree: do not check type of remote objects
Date: Tue, 21 Jun 2022 14:59:39 +0100	[thread overview]
Message-ID: <748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com> (raw)
In-Reply-To: <xmqqy1xwtsds.fsf@gitster.g>

On 16/06/2022 18:44, Junio C Hamano wrote:
> This patch would be a good first cut as a starting point, but we
> probably can do better by doing oid_object_info_extended() call with
> OBJECT_INFO_SKIP_FETCH_OBJECT bit (and probably QUICK bit, too) set,
> with the current code structure.
> 
> And when we do so, the title would not match the purpose of the
> change.  The verification was disabled with "--missing" all along
> and that is not what we are changing.  What we will be fixing is the
> wasteful implementation.
> 
>     mktree: do not check types of remote objects
> 
>     With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
>     called the sha1_object_info() API to obtain the type information,
>     but allowed the call to silently fail when the object was missing
>     locally, so that we can sanity-check the types opportunistically
>     when the object did exist.
> 
>     The implementation is understandable because back then there was no
>     lazy/on-demand downloading of individual objects from the promisor
>     remotes that causes a long delay and materializes the object, hence
>     defeating the point of using "--missing".  The design is hurting us
>     now.
> 
>     We could bypass the opportunistic type/mode consistency check
>     altogether when "--missing" is given, but instead, use the
>     oid_object_info_extended() API and tell it that we are only
>     interested in objects that locally exist and are immediately
>     available by passing OBJECT_INFO_SKIP_FETCH_OBJECT bit to it.  That
>     way, we will still retain the cheap and opportunistic sanity check
>     for local objects.

I've prepared a patch below as per your suggestion.

As a side note, do you think we need to re-work some uses of the word
'missing' in the documentation? Some uses of the word, such as in
mktree, predate the concept of promisor remotes. The partial-clone.txt
documentation differentiates between missing "due to a partial clone
or fetch" and missing "due to repository corruption".  Would making
such a distinction elsewhere be useful?

Cheers,
Richard

-- >8 --
Subject: [PATCH] mktree: do not check type of remote objects

With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
called the sha1_object_info() API to obtain the type information, but
allowed the call to silently fail when the object was missing locally,
so that we can sanity-check the types opportunistically when the
object did exist.

The implementation is understandable because back then there was no
lazy/on-demand downloading of individual objects from the promisor
remotes that causes a long delay and materializes the object, hence
defeating the point of using "--missing".  The design is hurting us
now.

We could bypass the opportunistic type/mode consistency check
altogether when "--missing" is given, but instead, use the
oid_object_info_extended() API and tell it that we are only interested
in objects that locally exist and are immediately available by passing
OBJECT_INFO_SKIP_FETCH_OBJECT bit to it.  That way, we will still
retain the cheap and opportunistic sanity check for local objects.

Signed-off-by: Richard Oliver <roliver@roku.com>
---
 builtin/mktree.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 902edba6d2..cfadb52670 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -116,8 +116,15 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
 			path, ptr, type_name(mode_type));
 	}
 
-	/* Check the type of object identified by sha1 */
-	obj_type = oid_object_info(the_repository, &oid, NULL);
+	/* Check the type of object identified by oid without fetching objects */
+	struct object_info oi = OBJECT_INFO_INIT;
+	oi.typep = &obj_type;
+	if (oid_object_info_extended(the_repository, &oid, &oi,
+				     OBJECT_INFO_LOOKUP_REPLACE |
+				     OBJECT_INFO_QUICK |
+				     OBJECT_INFO_SKIP_FETCH_OBJECT) < 0)
+		obj_type = -1;
+
 	if (obj_type < 0) {
 		if (allow_missing) {
 			; /* no problem - missing objects are presumed to be of the right type */
-- 
2.36.1.467.g4f6db706e6.dirty


  reply	other threads:[~2022-06-21 13:59 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14 13:36 [PATCH] mktree: learn about promised objects Richard Oliver
2022-06-14 14:14 ` Derrick Stolee
2022-06-14 16:33   ` Richard Oliver
2022-06-14 17:27     ` Derrick Stolee
2022-06-15  0:35       ` Taylor Blau
2022-06-15  4:00         ` Jeff King
2022-06-15 17:40           ` Richard Oliver
2022-06-15 18:17             ` Derrick Stolee
2022-06-16  6:07               ` Jeff King
2022-06-16  6:54                 ` [PATCH] is_promisor_object(): walk promisor packs in pack-order Jeff King
2022-06-16 14:00                   ` Derrick Stolee
2022-06-17 19:50                   ` Jonathan Tan
2022-06-16 13:59                 ` [PATCH] mktree: learn about promised objects Derrick Stolee
2022-06-15 21:01             ` Junio C Hamano
2022-06-16  5:02               ` Jeff King
2022-06-16 15:46               ` [PATCH] mktree: Make '--missing' behave as documented Richard Oliver
2022-06-16 17:44                 ` Junio C Hamano
2022-06-21 13:59                   ` Richard Oliver [this message]
2022-06-21 16:51                     ` [PATCH] mktree: do not check type of remote objects Junio C Hamano
2022-06-21 17:48                     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com \
    --to=roliver@roku.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.