All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, peff@peff.net
Subject: Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support
Date: Thu, 15 Jun 2017 11:34:45 -0700	[thread overview]
Message-ID: <xmqqy3st14my.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <aa3904dbe16050188a6b70a209bfcbfa86ea3213.1497387714.git.jonathantanmy@google.com> (Jonathan Tan's message of "Tue, 13 Jun 2017 14:06:00 -0700")

Jonathan Tan <jonathantanmy@google.com> writes:

> diff --git a/sha1_file.c b/sha1_file.c
> index 98086e21e..75fe2174d 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -27,6 +27,9 @@
>  #include "list.h"
>  #include "mergesort.h"
>  #include "quote.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
> +#include "sha1-lookup.h"
>  
>  #define SZ_FMT PRIuMAX
>  static inline uintmax_t sz_fmt(size_t s) { return s; }
> @@ -1624,6 +1627,72 @@ static const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
>  	return NULL;
>  }
>  
> +struct missing_blob_manifest {
> +	struct missing_blob_manifest *next;
> +	const char *data;
> +};
> +struct missing_blob_manifest *missing_blobs;
> +int missing_blobs_initialized;

I do not think you meant to make these non-static.  The type of the
former is not even visible to the outside world, and the latter is
something that could be made into static to prepare_missing_blobs()
function (unless and until you start allowing the missing-blobs
manifest to be re-initialized).  Your ensure_configured() below
seems to do the "static" right, on the other hand ;-).

Do we expect that we will have only a handful of these missing blob
manifests?  Each manifest seems to be efficiently looked-up with a
binary search, but it makes me wonder if it is a good idea to
consolidate these manifests into a single list of object names to
eliminate the outer loop in has_missing_blob().  Unlike pack .idx
files that must stay one-to-one with .pack files, it appears to me
that there is no reason why we need to keep multiple ones separate
for extended period of time (e.g. whenever we learn that we receieved
an incomplete pack from the other side with a list of newly missing
blobs, we could incorporate that into existing missing blob list).

> +int has_missing_blob(const unsigned char *sha1, unsigned long *size)
> +{

This function that answers "is it expected to be missing?" is
confusingly named.  Is it missing, or does it exist?

> @@ -2981,11 +3050,55 @@ static int sha1_loose_object_info(const unsigned char *sha1,
>  	return (status < 0) ? status : 0;
>  }
>  
> +static char *missing_blob_command;
> +static int sha1_file_config(const char *conf_key, const char *value, void *cb)
> +{
> +	if (!strcmp(conf_key, "core.missingblobcommand")) {
> +		missing_blob_command = xstrdup(value);
> +	}
> +	return 0;
> +}
> +
> +static int configured;
> +static void ensure_configured(void)
> +{
> +	if (configured)
> +		return;

Do not be selfish and pretend that this is the _only_ kind of
configuration that needs to be done inside sha1_file.c.  Call the
function ensure_<something>_is_configured() and rename the run-once
guard to match.

The run-once guard can be made static to the "ensure" function, and
if you do so, then its name can stay to be "configured", as at that
point it is clear what it is guarding.

> diff --git a/t/t3907-missing-blob.sh b/t/t3907-missing-blob.sh
> new file mode 100755
> index 000000000..e0ce0942d
> --- /dev/null
> +++ b/t/t3907-missing-blob.sh
> @@ -0,0 +1,69 @@
> +#!/bin/sh
> +
> +test_description='core.missingblobcommand option'
> +
> +. ./test-lib.sh
> +
> +pack() {

Style: "pack () {"

> +	perl -e '$/ = undef; $input = <>; print pack("H*", $input)'

high-nybble first to match ntohll() done in has_missing_blob()?  OK.

> +}
> +
> +test_expect_success 'sha1_object_info_extended and read_sha1_file (through git cat-file -p)' '
> +	rm -rf server client &&
> +
> +	git init server &&
> +	test_commit -C server 1 &&
> +	test_config -C server uploadpack.allowanysha1inwant 1 &&
> +	HASH=$(git hash-object server/1.t) &&
> +
> +	git init client &&
> +	test_config -C client core.missingblobcommand \
> +		"git -C \"$(pwd)/server\" pack-objects --stdout | git unpack-objects" &&
> +
> +	# does not work if missing blob is not registered
> +	test_must_fail git -C client cat-file -p "$HASH" &&
> +
> +	mkdir -p client/.git/objects/missing &&
> +	printf "%016x%s%016x" 1 "$HASH" "$(wc -c <server/1.t)" |
> +		pack >client/.git/objects/missing/x &&
> +
> +	# works when missing blob is registered
> +	git -C client cat-file -p "$HASH"
> +'

OK, by passing printf '%016x', implementations of "$(wc -c)" that
gives extra whitespace around its output can still work correctly.
Good.

  reply	other threads:[~2017-06-15 18:34 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-09 19:23 [RFC PATCH 0/4] Improvements to sha1_file Jonathan Tan
2017-06-09 19:23 ` [RFC PATCH 1/4] sha1_file: teach packed_object_info about typename Jonathan Tan
2017-06-12 20:55   ` Junio C Hamano
2017-06-09 19:23 ` [RFC PATCH 2/4] sha1_file: extract type and size from object_info Jonathan Tan
2017-06-10  7:01   ` Jeff King
2017-06-12 19:52     ` Jonathan Tan
2017-06-12 21:13       ` Jeff King
2017-06-09 19:23 ` [RFC PATCH 3/4] sha1_file: consolidate storage-agnostic object fns Jonathan Tan
2017-06-09 19:23 ` [RFC PATCH 4/4] sha1_file, fsck: add missing blob support Jonathan Tan
2017-06-13 21:05 ` [PATCH v2 0/4] Improvements to sha1_file Jonathan Tan
2017-06-13 21:05 ` [PATCH v2 1/4] sha1_file: teach packed_object_info about typename Jonathan Tan
2017-06-13 21:05 ` [PATCH v2 2/4] sha1_file: move delta base cache code up Jonathan Tan
2017-06-15 17:00   ` Junio C Hamano
2017-06-13 21:05 ` [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns Jonathan Tan
2017-06-15 17:50   ` Junio C Hamano
2017-06-15 18:14     ` Jonathan Tan
2017-06-17 12:19     ` Jeff King
2017-06-19  4:18       ` Junio C Hamano
2017-06-13 21:06 ` [PATCH v2 4/4] sha1_file, fsck: add missing blob support Jonathan Tan
2017-06-15 18:34   ` Junio C Hamano [this message]
2017-06-15 20:31     ` Jonathan Tan
2017-06-15 20:52       ` Junio C Hamano
2017-06-15 20:39 ` [PATCH v3 0/4] Improvements to sha1_file Jonathan Tan
2017-06-15 20:39 ` [PATCH v3 1/4] sha1_file: teach packed_object_info about typename Jonathan Tan
2017-06-15 20:39 ` [PATCH v3 2/4] sha1_file: move delta base cache code up Jonathan Tan
2017-06-15 20:39 ` [PATCH v3 3/4] sha1_file: consolidate storage-agnostic object fns Jonathan Tan
2017-06-15 20:39 ` [PATCH v3 4/4] sha1_file, fsck: add missing blob support Jonathan Tan
2017-06-20  1:03 ` [PATCH v4 0/8] Improvements to sha1_file Jonathan Tan
2017-06-21 18:18   ` Junio C Hamano
2017-06-24 12:51   ` Jeff King
2017-06-20  1:03 ` [PATCH v4 1/8] sha1_file: teach packed_object_info about typename Jonathan Tan
2017-06-20  1:03 ` [PATCH v4 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT Jonathan Tan
2017-06-21 17:22   ` Junio C Hamano
2017-06-21 17:34     ` Jonathan Tan
2017-06-20  1:03 ` [PATCH v4 3/8] sha1_file: rename LOOKUP_REPLACE_OBJECT Jonathan Tan
2017-06-21 17:33   ` Junio C Hamano
2017-06-20  1:03 ` [PATCH v4 4/8] sha1_file: move delta base cache code up Jonathan Tan
2017-06-20  1:03 ` [PATCH v4 5/8] sha1_file: refactor read_object Jonathan Tan
2017-06-21 17:58   ` Junio C Hamano
2017-06-20  1:03 ` [PATCH v4 6/8] sha1_file: improve sha1_object_info_extended Jonathan Tan
2017-06-24 12:45   ` Jeff King
2017-06-26 16:45     ` Jonathan Tan
2017-06-26 17:28       ` Junio C Hamano
2017-06-26 17:35         ` Jonathan Tan
2017-06-26 17:26     ` Junio C Hamano
2017-06-20  1:03 ` [PATCH v4 7/8] sha1_file: do not access pack if unneeded Jonathan Tan
2017-06-21 18:15   ` Junio C Hamano
2017-06-24 12:48     ` Jeff King
2017-06-24 18:41       ` Junio C Hamano
2017-06-24 20:39         ` Jeff King
2017-06-26 16:28           ` Jonathan Tan
2017-06-20  1:03 ` [PATCH v4 8/8] sha1_file: refactor has_sha1_file_with_flags Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 0/8] Improvements to sha1_file Jonathan Tan
2017-06-22  1:40   ` Junio C Hamano
2017-06-22  0:40 ` [PATCH v5 1/8] sha1_file: teach packed_object_info about typename Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 3/8] sha1_file: rename LOOKUP_REPLACE_OBJECT Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 4/8] sha1_file: move delta base cache code up Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 5/8] sha1_file: refactor read_object Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 6/8] sha1_file: improve sha1_object_info_extended Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 7/8] sha1_file: do not access pack if unneeded Jonathan Tan
2017-06-22  0:40 ` [PATCH v5 8/8] sha1_file: refactor has_sha1_file_with_flags Jonathan Tan
2017-07-18 10:30   ` Christian Couder
2017-07-18 16:39     ` Jonathan Tan
2017-07-19 12:52       ` Johannes Schindelin
2017-07-19 17:12         ` [PATCH] sha1_file: use access(), not lstat(), if possible Jonathan Tan
2017-07-20 21:48           ` Junio C Hamano
2017-07-22 11:16             ` Johannes Schindelin
2017-07-22 16:15               ` Junio C Hamano
2017-07-25 10:19                 ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy3st14my.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.