git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/9] refs: make _advance() check struct repo, not flag
Date: Wed, 22 Sep 2021 18:00:52 -0700
Message-ID: <xmqq1r5g3y2j.fsf@gitster.g>
In-Reply-To: <493fff7f4716d889da751b5f8c6740cc1e3aa360.1632242495.git.jonathantanmy@google.com> (Jonathan Tan's message of "Tue, 21 Sep 2021 09:51:03 -0700")

Jonathan Tan <jonathantanmy@google.com> writes:

> As a first step in resolving both these problems, replace the
> DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
> commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
> is set, a NULL repository (representing access to no object store) is
> used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
> non-NULL repository (representing access to that repository's object
> store) is used instead.

Hmph, so the lack of "include broken" is a signal to validate the
object the ref points at, and the new parameter is "if this pointer
is not NULL, then expect to find the object in this repository and
validate it" that replaces the original "validate it" with a bit
more detailed instruction (i.e. "how to validate--use the object
store associated with this repository")?
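
To make that reading concrete, here is a minimal toy model of it, not
the actual Git code.  The names toy_ref and toy_advance are
hypothetical, and the "resolves" field merely stands in for a real
lookup in the repository's object store:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Git's real types; only the shape matters. */
struct repository { int unused; };
struct toy_ref {
	const char *name;
	int resolves;	/* 1 if the object at the tip exists */
};

/*
 * Return the next ref to yield at or after *pos, or NULL when the
 * list is exhausted.  A NULL repo means "do not validate" (what
 * DO_FOR_EACH_INCLUDE_BROKEN used to request); a non-NULL repo means
 * "validate against that repository's object store", which this toy
 * models with the precomputed "resolves" field.
 */
static const struct toy_ref *toy_advance(const struct toy_ref *refs,
					 size_t nr, size_t *pos,
					 const struct repository *repo)
{
	assert(refs && pos);
	while (*pos < nr) {
		const struct toy_ref *ref = &refs[(*pos)++];

		if (repo && !ref->resolves)
			continue;	/* broken, and validation was requested */
		return ref;
	}
	return NULL;
}
```

With a non-NULL repo the broken ref is silently skipped; with NULL
every ref is yielded, broken or not.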

> Right now, the locations in which
> non-the_repository support needs to be added are marked with BUG()
> statements - in a future patch, these will be replaced. (NEEDSWORK: in
> this RFC patch set, this has not been done)

> - Change the _advance() callback to also have a repository object
>   parameter, and either skip or not skip depending on whether that
>   parameter is NULL. This burdens callers to have to carry this
>   information along with the iterator, and such calling code may be
>   unclear as to why that parameter can be NULL in some cases and cannot
>   in others.

Hmph.

> diff --git a/refs.c b/refs.c
> index 8b9f7c3a80..49ddcdac53 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1413,16 +1413,16 @@ int head_ref(each_ref_fn fn, void *cb_data)
>  
>  struct ref_iterator *refs_ref_iterator_begin(
>  		struct ref_store *refs,
> -		const char *prefix, int trim, int flags)
> +		const char *prefix, int trim, struct repository *repo,
> +		int flags)
>  {
>  	struct ref_iterator *iter;
>  
>  	if (ref_paranoia < 0)
>  		ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 0);
> -	if (ref_paranoia)
> -		flags |= DO_FOR_EACH_INCLUDE_BROKEN;
>  
>  	iter = refs->be->iterator_begin(refs, prefix, flags);
> +	iter->repo = ref_paranoia ? NULL : repo;

OK.  "flags" is still kept because there are bits other than
"include broken" that need to be propagated.

> @@ -1442,13 +1442,16 @@ struct ref_iterator *refs_ref_iterator_begin(
>   * Call fn for each reference in the specified submodule for which the
>   * refname begins with prefix. If trim is non-zero, then trim that
>   * many characters off the beginning of each refname before passing
> - * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
> - * include broken references in the iteration. If fn ever returns a
> + * the refname to fn. If fn ever returns a
>   * non-zero value, stop the iteration and return that value;
>   * otherwise, return 0.
> + *
> + * See the documentation of refs_ref_iterator_begin() for more information on
> + * the repo parameter.
>   */
>  static int do_for_each_repo_ref(struct repository *r, const char *prefix,
> -				each_repo_ref_fn fn, int trim, int flags,
> +				each_repo_ref_fn fn, int trim,
> +				struct repository *repo, int flags,
>  				void *cb_data)

Confusing.  We are iterating refs that exist in the repository "r",
right?  Why do we need to have an extra "repo" parameter?  Can they
ever diverge (beyond "repo" being NULL to signal what the now-lost
"include broken" bit used to convey)?  It's not like a valid caller
can pass the superproject in 'r' and a submodule in 'repo', right?

Enhancing an interface this way, by adding a "repo" parameter that
allows an arbitrary repository instance to be passed only to convey
one bit of information, smells like inviting bugs in the future.

I have a feeling that the function signature for this one should
stay as before, and "repo" should be a local variable that is
initialized as

	struct repository *repo = (flags & DO_FOR_EACH_INCLUDE_BROKEN)
				? NULL
				: r;

to avoid such a future bug, but given that there is only one caller
to this helper, I do not mind

	if (repo && r != repo)
		BUG(...);

to catch any such mistake.
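
Both alternatives can be sketched in isolation.  This is only an
illustration under stated assumptions: the struct is a stand-in, the
flag value is assumed, and validation_repo() and repo_arg_is_valid()
are hypothetical helper names.  The polarity follows the quoted commit
message (flag set means NULL repository):

```c
#include <assert.h>
#include <stddef.h>

struct repository { int unused; };		/* hypothetical stand-in */
#define DO_FOR_EACH_INCLUDE_BROKEN 0x01		/* value assumed here */

/*
 * Alternative 1: keep the old signature and derive the pointer
 * locally, so it can only ever be NULL or "r".
 */
static struct repository *validation_repo(struct repository *r, int flags)
{
	assert(r != NULL);
	return (flags & DO_FOR_EACH_INCLUDE_BROKEN) ? NULL : r;
}

/*
 * Alternative 2: accept a separate "repo" argument but treat anything
 * other than NULL or "r" as a programming error.  Real code would
 * call BUG() when this returns 0.
 */
static int repo_arg_is_valid(const struct repository *r,
			     const struct repository *repo)
{
	return !repo || repo == r;
}
```

The first form makes the invalid state unrepresentable; the second
only detects it at run time, which is why it is the fallback.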

>  int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
>  {
>  	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
>  				    strlen(git_replace_ref_base),
> -				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
> +				    NULL, 0, cb_data);
>  }

And this is the only such caller, if I am reading the code right.

Do we ever pass non-NULL "repo" to do_for_each_repo_ref() in future
steps?

If not, perhaps we do not even have to add "repo" as a new parameter
to do_for_each_repo_ref(), and instead always pass NULL down to
refs_ref_iterator_begin() from do_for_each_repo_ref()?

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 677b7e4cdd..cd145301d0 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -744,12 +744,6 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  		    ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
>  			continue;
>  
> -		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
> -		    !ref_resolves_to_object(iter->iter0->refname,
> -					    iter->iter0->oid,
> -					    iter->iter0->flags))
> -			continue;
> -
>  		iter->base.refname = iter->iter0->refname;
>  		iter->base.oid = iter->iter0->oid;
>  		iter->base.flags = iter->iter0->flags;
> @@ -801,9 +795,6 @@ static struct ref_iterator *files_ref_iterator_begin(
>  	struct ref_iterator *ref_iterator;
>  	unsigned int required_flags = REF_STORE_READ;
>  
> -	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
> -		required_flags |= REF_STORE_ODB;
> -
>  	refs = files_downcast(ref_store, required_flags, "ref_iterator_begin");
>  
>  	/*

Hmph, I am not sure where the loss in these two hunks is
compensated for.  Perhaps in the backend-independent layer in
refs/iterator.c?  Let's read on.

> @@ -836,10 +827,13 @@ static struct ref_iterator *files_ref_iterator_begin(
>  	 * references, and (if needed) do our own check for broken
>  	 * ones in files_ref_iterator_advance(), after we have merged
>  	 * the packed and loose references.
> +	 *
> +	 * Do this by not supplying any repo, regardless of whether a repo was
> +	 * supplied to files_ref_iterator_begin().
>  	 */
>  	packed_iter = refs_ref_iterator_begin(
>  			refs->packed_ref_store, prefix, 0,
> -			DO_FOR_EACH_INCLUDE_BROKEN);
> +			NULL, 0);

OK.

> diff --git a/refs/iterator.c b/refs/iterator.c
> index a89d132d4f..5af6554887 100644
> --- a/refs/iterator.c
> +++ b/refs/iterator.c
> @@ -10,7 +10,23 @@
>  
>  int ref_iterator_advance(struct ref_iterator *ref_iterator)
>  {
> -	return ref_iterator->vtable->advance(ref_iterator);
> +	int ok;
> +
> +	if (ref_iterator->repo && ref_iterator->repo != the_repository)

OK. refs_ref_iterator_begin() assigned its "repo" parameter, which
tells which repository to consult to validate the objects at the tip
of refs, to the .repo member of the iterator object, and we check it
here.

It is a bit surprising that ref_iterator does not know which
repository it is working in (regardless of "include broken" bit).
Do you think it will stay that way?  I have this nagging feeling
that it won't, and having "struct repository *repository" pointer
that always points at the repository the ref-store belongs to in a
ref_iterator instance would become necessary in the longer run.

In which case, this .repo member this patch adds would become a big
problem, no?  If we were to validate objects at the tip of the refs
against the object store, we will always use the object store that
belongs to the iterator->repository, so the only valid states for
iterator->repo are either NULL or iterator->repository.  That again
is the same problem I pointed out already about the parameter of the
do_for_each_repo_ref() helper, which invites future bugs, it seems
to me.  Wouldn't it make more sense to add

 * iterator->repository that points at the repository in which we
   are iterating the refs

 * a bit in iterator that chooses between "do not bother checking"
   and "do check the tip of refs against the object store of
   iterator->repository"

to avoid such a mess?  Perhaps we already have such a bit in the
flags word in the ref_iterator but I didn't check.
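
A rough sketch of that shape, for illustration only.  The struct, the
flag, and the helper names below are all hypothetical and are not
taken from Git's refs code:

```c
#include <assert.h>
#include <stddef.h>

struct repository { int unused; };	/* hypothetical stand-in */

#define TOY_ITER_CHECK_OBJECTS 0x01	/* hypothetical flag bit */

struct toy_ref_iterator {
	struct repository *repository;	/* always the repo being iterated */
	unsigned int flags;
};

static void toy_iter_init(struct toy_ref_iterator *iter,
			  struct repository *repository, int check_objects)
{
	assert(iter && repository);
	iter->repository = repository;
	iter->flags = check_objects ? TOY_ITER_CHECK_OBJECTS : 0;
}

/*
 * The object store to validate against, or NULL for "do not bother
 * checking".  Because it is derived from the two members above, it
 * can never name a repository other than the one being iterated.
 */
static struct repository *
toy_iter_validation_repo(const struct toy_ref_iterator *iter)
{
	return (iter->flags & TOY_ITER_CHECK_OBJECTS) ? iter->repository
						      : NULL;
}
```

Here the "which repository" question and the "validate or not"
question are stored separately, so the iterator always knows where it
is working even when validation is off.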

Thanks.
