From: Stefan Beller <sbeller@google.com>
To: Brandon Williams <bmwill@google.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 5/6] submodule: improve submodule_has_commits
Date: Mon, 1 May 2017 18:34:35 -0700
Message-ID: <CAGZ79kbbz3AAjbg_dV9RVS8kgLs-zWZxt5tsFbQczCm78LcTVw@mail.gmail.com>
In-Reply-To: <20170502010239.179369-6-bmwill@google.com>

On Mon, May 1, 2017 at 6:02 PM, Brandon Williams <bmwill@google.com> wrote:
> Teach 'submodule_has_commits()' to ensure that if a commit exists in a
> submodule, that it is also reachable from a ref.
>
> This is a preparatory step prior to merging the logic which checks for
> changed submodules when fetching or pushing.
>
> Change-Id: I4fed2acfa7e69a5fbbca534df165671e77a90f22
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  submodule.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
>
> diff --git a/submodule.c b/submodule.c
> index 3bcf44521..057695e64 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -644,10 +644,44 @@ static int submodule_has_commits(const char *path, struct oid_array *commits)
>  {
>         int has_commit = 1;
>
> +       /*
> +        * Perform a cheap, but incorrect check for the existence of 'commits'.
> +        * This is done by adding the submodule's object store to the in-core
> +        * object store, and then querying for each commit's existence.  If we
> +        * do not have the commit object anywhere, there is no chance we have
> +        * it in the object store of the correct submodule and have it
> +        * reachable from a ref, so we can fail early without spawning rev-list
> +        * which is expensive.
> +        */
>         if (add_submodule_odb(path))
>                 return 0;

Thanks for the comment!

>
>         oid_array_for_each_unique(commits, check_has_commit, &has_commit);
> +
> +       if (has_commit) {
> +               /*
> +                * Even if the submodule is checked out and the commit is
> +                * present, make sure it exists in the submodule's object store
> +                * and that it is reachable from a ref.
> +                */
> +               struct child_process cp = CHILD_PROCESS_INIT;
> +               struct strbuf out = STRBUF_INIT;
> +
> +               argv_array_pushl(&cp.args, "rev-list", "-n", "1", NULL);
> +               oid_array_for_each_unique(commits, append_oid_to_argv, &cp.args);
> +               argv_array_pushl(&cp.args, "--not", "--all", NULL);
> +
> +               prepare_submodule_repo_env(&cp.env_array);
> +               cp.git_cmd = 1;
> +               cp.no_stdin = 1;
> +               cp.dir = path;
> +
> +               if (capture_command(&cp, &out, GIT_MAX_HEXSZ + 1) || out.len)

Eh, I gave too much, and self-contradictory, feedback here earlier.
Ideally I'd like this to look similar to:

    if (capture_command(&cp, &out, GIT_MAX_HEXSZ + 1))
        die("cannot capture git-rev-list in submodule '%s'", path);

    if (out.len)
        has_commit = 0;

instead, as that does not fail silently (though the current code errs
on the safe side, so maybe it is not too bad).

I could understand if the callers do not want to have
`submodule_has_commits` die()-ing on them, so maybe

    if (capture_command(&cp, &out, GIT_MAX_HEXSZ + 1)) {
        warning("cannot capture git-rev-list in submodule '%s'", path);
        has_commit = -1;
        /* this would require auditing all callers and handling -1 though */
    } else if (out.len) {
        /* only trust the output when the command itself succeeded */
        has_commit = 0;
    }

As the comment alludes to, we'd then have
 0 -> has no commits
 1 -> has commits
-1 -> error

So to group (error || has_no_commits), we could write

    if (submodule_has_commits(..) <= 0)

which is awkward. So maybe we can rename the function
to misses_submodule_commits instead, as then we could
flip the return value as well and have

 0 -> has commits
 1 -> has no commits
-1 -> error

and the lazy invoker could just go with

    if (!misses_submodule_commits(..))
        proceed();
    else
        die("missing submodule commits or errors; I don't care");

whereas the careful invoker could go with

    switch (misses_submodule_commits(..)) {
    case 0:
        proceed(); break;
    case 1:
        pull_magic_trick(); break;
    case -1:
        make_errors_go_away_and_retry(); break;
    }
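
To make the flipped convention concrete, here is a minimal standalone
sketch. It is not code from submodule.c: the enum and the helpers
classify_rev_list() and may_proceed() are hypothetical names, and
cmd_failed / out_len merely stand in for the return value of
capture_command() and for out.len. In this sketch a failed command takes
priority over whatever output was captured.

```c
#include <stddef.h>

/* Hypothetical tri-state result; names are illustrative only. */
enum missing_result {
	MISSING_ERROR = -1,	/* the check itself could not be performed */
	MISSING_NONE = 0,	/* every commit is present and reachable */
	MISSING_SOME = 1	/* at least one commit is missing or unreachable */
};

/*
 * Map the outcome of the rev-list invocation to the tri-state.
 * cmd_failed models a non-zero return from capture_command(),
 * out_len models out.len (non-empty output means rev-list printed
 * a commit that is not reachable from any ref).
 */
static enum missing_result classify_rev_list(int cmd_failed, size_t out_len)
{
	if (cmd_failed)
		return MISSING_ERROR;
	return out_len ? MISSING_SOME : MISSING_NONE;
}

/* Lazy caller: both "missing" and "error" are non-zero, so one test suffices. */
static int may_proceed(enum missing_result r)
{
	return !r;
}
```

The point of the flip is visible in may_proceed(): with 0 meaning
"nothing is missing", a plain truth test groups the two failure cases,
while a careful caller can still tell -1 and 1 apart.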



---
On the longer-term plan:
Since you wrote about costs: maybe instead of invoking rev-list,
we could try to do this in-core as a first try-out for
"classified-repos". Looking at refs.h, there is e.g.

    int for_each_ref_submodule(const char *submodule_path,
          each_ref_fn fn, void *cb_data);

which we could use to obtain all submodule refs and then
use the revision walking machinery to find out ourselves if
we have or do not have the commits. (As we loaded the
odb of the submodule, this would *just work*, building one
kludgy hack upon the next.)
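
As a rough illustration of what "finding out ourselves" means, here is a
toy model in plain C. It does not use Git's ref iteration or revision
walking machinery at all; struct toy_commit, mark_reachable() and
all_reachable() are made-up names, commits are just indices into a small
table, and the "refs" are an array of tip indices. It only shows the
idea: mark everything reachable from the ref tips, then check that every
wanted commit got marked.

```c
#include <stddef.h>

#define MAX_COMMITS 16
#define MAX_PARENTS 2

/* Toy commit: parents are indices into the commit table, -1 means none. */
struct toy_commit {
	int parents[MAX_PARENTS];
};

/* Mark every commit reachable from the ref tips (iterative DFS). */
static void mark_reachable(const struct toy_commit *commits, size_t nr,
			   const int *tips, size_t nr_tips, char *seen)
{
	int stack[MAX_COMMITS];	/* each commit is pushed at most once */
	size_t top = 0, i;

	for (i = 0; i < nr; i++)
		seen[i] = 0;
	for (i = 0; i < nr_tips; i++) {
		int t = tips[i];
		if (t >= 0 && (size_t)t < nr && !seen[t]) {
			seen[t] = 1;
			stack[top++] = t;
		}
	}
	while (top) {
		int c = stack[--top];
		int p;
		for (p = 0; p < MAX_PARENTS; p++) {
			int parent = commits[c].parents[p];
			if (parent >= 0 && (size_t)parent < nr && !seen[parent]) {
				seen[parent] = 1;
				stack[top++] = parent;
			}
		}
	}
}

/*
 * Returns 1 if every wanted commit is reachable from some tip,
 * 0 if at least one is not, and -1 if the table is too large.
 */
static int all_reachable(const struct toy_commit *commits, size_t nr,
			 const int *tips, size_t nr_tips,
			 const int *wanted, size_t nr_wanted)
{
	char seen[MAX_COMMITS];
	size_t i;

	if (nr > MAX_COMMITS)
		return -1;
	mark_reachable(commits, nr, tips, nr_tips, seen);
	for (i = 0; i < nr_wanted; i++) {
		int w = wanted[i];
		if (w < 0 || (size_t)w >= nr || !seen[w])
			return 0;
	}
	return 1;
}
```

A commit that sits in the table but is not an ancestor of any tip is the
toy equivalent of an object that exists in the object store yet is not
reachable from a ref, which is exactly the case the rev-list call is
meant to catch.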

Thanks,
Stefan
