git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Glen Choo <chooglen@google.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>, git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Jonas Bernoulli" <jonas@bernoul.li>, "Jeff King" <peff@peff.net>,
	"Emily Shaffer" <emilyshaffer@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH 07/10] submodule: make it a built-in, remove git-submodule.sh
Date: Thu, 20 Oct 2022 15:49:42 -0700	[thread overview]
Message-ID: <kl6lpmemxg8p.fsf@chooglen-macbookpro.roam.corp.google.com> (raw)
In-Reply-To: <patch-07.10-2c77ed38d90-20221017T115544Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Replace the "git-submodule.sh" script with a built-in
> "builtin/submodule.c. For" now this new command is only a dumb
> dispatcher that uses run-command.c to invoke "git submodule--helper",
> just as "git-submodule.sh" used to do.
>
> This is obviously not ideal, and we should eventually follow-up and
> merge the "builtin/submodule--helper.c" code into
> "builtin/submodule.c". Doing it this way makes it easy to review that
> this new C implementation isn't doing anything more clever than the
> old shellscript implementation.
>
> This is a large win for performance, we're now more than 4x as fast as
> before in terms of the fixed cost of invoking any "git submodule"
> command[1]:
>
> 	$ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git --exec-path=$PWD submodule foreach "echo \$name"'
> 	Benchmark 1: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1
> 	  Time (mean ± σ):      42.2 ms ±   0.4 ms    [User: 34.9 ms, System: 9.1 ms]
> 	  Range (min … max):    41.3 ms …  43.2 ms    70 runs
>
> 	Benchmark 2: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD
> 	  Time (mean ± σ):       9.7 ms ±   0.1 ms    [User: 7.6 ms, System: 2.2 ms]
> 	  Range (min … max):     9.5 ms …  10.3 ms    282 runs
>
> 	Summary
> 	  './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD' ran
> 	    4.33 ± 0.07 times faster than './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1'
>
> We're taking pains here to faithfully reproduce existing
> "git-submodule.sh" behavior, even when that behavior is stupid. Some
> of it we'll fix in subsequent commits, but let's first faithfully
> reproduce the behavior.
>
> One exception is the change in the behavior of the exit code
> stand-alone "-h" and "--" yield, see the altered tests. Returning 129
> instead of 0 and 1 for "-h" and "--" respectively is a concession to
> basic sanity.

Sounds reasonable.

> The pattern of using "define BUILTIN_" macros here isn't needed for
> now, but as we'll move code out of "builtin/submodule--helper.c" we'll
> want to re-use these strings. See 8757b35d443 (commit-graph: define
> common usage with a macro, 2021-08-23) and 1e91d3faf6c (reflog: move
> "usage" variables and use macros, 2022-03-17) for prior art using this
> pattern.
>
> The "(argc < 2 || !strcmp(argv[1], "-h"))" path at the top of
> cmd_submodule__helper() could now be a "(argc < 2)" if not for
> t0012-help.sh (which invokes all built-ins manually with "-h"). Let's
> leave it for now, eventually we'll consolidate the two.
>
> 1. Using the "git hyperfine" wrapper for "hyperfine":
>    https://lore.kernel.org/git/211201.86r1aw9gbd.gmgdl@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  Makefile                   |   2 +-
>  builtin.h                  |   1 +
>  builtin/submodule.c        | 151 +++++++++++++++++++++++++++++++++++++
>  git-submodule.sh           |  91 ----------------------
>  git.c                      |   1 +
>  t/t7400-submodule-basic.sh |  12 ++-
>  6 files changed, 159 insertions(+), 99 deletions(-)
>  create mode 100644 builtin/submodule.c
>  delete mode 100755 git-submodule.sh
>
> diff --git a/Makefile b/Makefile
> index 6bfb62cbe94..d8e2c02ad42 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -635,7 +635,6 @@ SCRIPT_SH += git-merge-resolve.sh
>  SCRIPT_SH += git-mergetool.sh
>  SCRIPT_SH += git-quiltimport.sh
>  SCRIPT_SH += git-request-pull.sh
> -SCRIPT_SH += git-submodule.sh
>  SCRIPT_SH += git-web--browse.sh
>  
>  SCRIPT_LIB += git-mergetool--lib
> @@ -1235,6 +1234,7 @@ BUILTIN_OBJS += builtin/show-ref.o
>  BUILTIN_OBJS += builtin/sparse-checkout.o
>  BUILTIN_OBJS += builtin/stash.o
>  BUILTIN_OBJS += builtin/stripspace.o
> +BUILTIN_OBJS += builtin/submodule.o
>  BUILTIN_OBJS += builtin/submodule--helper.o
>  BUILTIN_OBJS += builtin/symbolic-ref.o
>  BUILTIN_OBJS += builtin/tag.o
> diff --git a/builtin.h b/builtin.h
> index 8901a34d6bf..475c79b6a5a 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -224,6 +224,7 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix);
>  int cmd_status(int argc, const char **argv, const char *prefix);
>  int cmd_stash(int argc, const char **argv, const char *prefix);
>  int cmd_stripspace(int argc, const char **argv, const char *prefix);
> +int cmd_submodule(int argc, const char **argv, const char *prefix);
>  int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
>  int cmd_switch(int argc, const char **argv, const char *prefix);
>  int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/submodule.c b/builtin/submodule.c
> new file mode 100644
> index 00000000000..7e3499f3376
> --- /dev/null
> +++ b/builtin/submodule.c
> @@ -0,0 +1,151 @@
> +/*
> + * Copyright (c) 2007-2022 Lars Hjemli & others
> + * Copyright(c) 2022 Ævar Arnfjörð Bjarmason
> + */
> +#include "builtin.h"
> +#include "parse-options.h"
> +#include "run-command.h"
> +#include "strvec.h"
> +
> +#define BUILTIN_SUBMODULE_USAGE \
> +	"git submodule [--quiet] [--cached]"
> +
> +#define BUILTIN_SUBMODULE_ADD_USAGE \
> +	N_("git submodule [--quiet] add [-b <branch>] [-f | --force] [--name <name>]\n" \
> +	   "              [--reference <repository>] [--] <repository> [<path>]")
> +
> +#define BUILTIN_SUBMODULE_STATUS_USAGE \
> +	N_("git submodule [--quiet] status [--cached] [--recursive] [--] [<path>...]")
> +
> +#define BUILTIN_SUBMODULE_INIT_USAGE \
> +	N_("git submodule [--quiet] init [--] [<path>...]")
> +
> +#define BUILTIN_SUBMODULE_DEINIT_USAGE \
> +	N_("git submodule [--quiet] deinit [-f | --force] (--all | [--] <path>...)")
> +
> +#define BUILTIN_SUBMODULE_UPDATE_USAGE \
> +	N_("git submodule [--quiet] update [-v] [--init [--filter=<filter-spec>]]\n" \
> +	   "              [--remote] [-N | --no-fetch] [-f | --force] [--checkout |--merge | --rebase]\n" \
> +	   "              [--[no-]recommend-shallow] [--reference <repository>] [--recursive]\n" \
> +	   "              [--[no-]single-branch] [--] [<path>...]")
> +
> +#define BUILTIN_SUBMODULE_SET_BRANCH_USAGE \
> +	N_("git submodule [--quiet] set-branch (--default | --branch <branch>) [--] <path>")
> +
> +#define BUILTIN_SUBMODULE_SET_URL_USAGE \
> +	N_("git submodule [--quiet] set-url [--] <path> <newurl>")
> +
> +#define BUILTIN_SUBMODULE_SUMMARY_USAGE \
> +	N_("git submodule [--quiet] summary [--cached | --files] [--summary-limit <n>]\n"  \
> +	   "              [commit] [--] [<path>...]")
> +#define BUILTIN_SUBMODULE_FOREACH_USAGE \
> +	N_("git submodule [--quiet] foreach [--recursive] <command>")
> +
> +#define BUILTIN_SUBMODULE_SYNC_USAGE \
> +	N_("git submodule [--quiet] sync [--recursive] [--] [<path>...]")
> +
> +#define BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE \
> +	N_("git submodule [--quiet] absorbgitdirs [--] [<path>...]")
> +

I was surprised to see that these strings aren't deduped from the ones
in builtin/submodule--helper.c, and are, in fact, different. Is there a
reason for that?

> +static const char * const git_submodule_usage[] = {
> +	BUILTIN_SUBMODULE_USAGE,
> +	BUILTIN_SUBMODULE_ADD_USAGE,
> +	BUILTIN_SUBMODULE_STATUS_USAGE,
> +	BUILTIN_SUBMODULE_INIT_USAGE,
> +	BUILTIN_SUBMODULE_DEINIT_USAGE,
> +	BUILTIN_SUBMODULE_UPDATE_USAGE,
> +	BUILTIN_SUBMODULE_SET_BRANCH_USAGE,
> +	BUILTIN_SUBMODULE_SET_URL_USAGE,
> +	BUILTIN_SUBMODULE_SUMMARY_USAGE,
> +	BUILTIN_SUBMODULE_FOREACH_USAGE,
> +	BUILTIN_SUBMODULE_SYNC_USAGE,
> +	BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE,
> +	NULL,
> +};
> +
> +static void setup_helper_args(int argc, const char **argv, const char *prefix,
> +			      int quiet, int cached, struct strvec *args,
> +			      const struct option *options)
> +{
> +	const char *cmd;
> +	int do_quiet_cache = 1;
> +	int do_prefix = 1;
> +
> +	strvec_push(args, "submodule--helper");
> +
> +	/* No command word defaults to "status" */
> +	if (!argc) {
> +		strvec_push(args, "status");
> +		return;
> +	}

We can't return without processing "--cached", e.g. removing the
explicit "status" subcommand like so fails.

  diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
  index d8f7d6ee29..5e61cef18e 100755
  --- a/t/t7400-submodule-basic.sh
  +++ b/t/t7400-submodule-basic.sh
  @@ -587,7 +587,7 @@ test_expect_success 'status should be "modified" after submodule commit' '
  '

  test_expect_success 'the --cached sha1 should be rev1' '
  -	git submodule status --cached >list &&
  +	git submodule --cached >list &&
    grep "^+$rev1" list
  '

> +
> +	/* Did we get --cached with a command? */
> +	if (cached)
> +		usage_msg_optf(_("'%s' option is only supported with explicit 'status'"),
> +			       git_submodule_usage, options, "--cached");
> +
> +
> +	/* Either a valid command, or submodule--helper will barf! */
> +	cmd = argv[0];

submodule--helper does die, but the help message that it emits is
submodule--helper specific, instead of the "git submodule" usage string
from before.

> +	strvec_push(args, cmd);
> +	argv++;
> +	argc--;
> +
> +	/*
> +	  * This is stupid, but don't support "[--]" to
> +	 * subcommand-less "git-submodule" for now.
> +	 */
> +	if (!strcmp(cmd, "--") || !strcmp(cmd, "--end-of-options"))
> +		usage_msg_optf(_("need explicit sub-command name to delimit with '%s'"),
> +			       git_submodule_usage, options, cmd);

If this is the only "stupid" behavior, maybe just call this out
specifically in the commit message.

> +
> +	/* Options that need to go before user-supplied options */
> +	if (!strcmp(cmd, "absorbgitdirs"))
> +		do_quiet_cache = 0;
> +	else if (!strcmp(cmd, "update"))
> +		;
> +	else
> +		do_prefix = 0;
> +	if (do_quiet_cache) {
> +		if (quiet)
> +			strvec_push(args, "--quiet");
> +		if (cached)
> +			strvec_push(args, "--cached");
> +
> +		if (prefix && do_prefix)
> +			strvec_pushl(args, "--prefix", prefix, NULL);
> +	}

This looks like a good reason to get rid of "--prefix" from "git
submodule--helper update" like I mentioned upthread.

I didn't notice earlier that absorbgitdirs had the same problem, but
that's an even easier fixup:

  diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
  index d11e100301..1f7f142270 100644
  --- a/builtin/submodule--helper.c
  +++ b/builtin/submodule--helper.c
  @@ -2825,9 +2825,6 @@ static int absorb_git_dirs(int argc, const char **argv, const char *prefix)
    struct module_list list = MODULE_LIST_INIT;
    unsigned flags = ABSORB_GITDIR_RECURSE_SUBMODULES;
    struct option embed_gitdir_options[] = {
  -		OPT_STRING(0, "prefix", &prefix,
  -			   N_("path"),
  -			   N_("path into the working tree")),
      OPT_BIT(0, "--recursive", &flags, N_("recurse into submodules"),
        ABSORB_GITDIR_RECURSE_SUBMODULES),
      OPT_END()

  reply	other threads:[~2022-10-20 22:49 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-17 12:09 [PATCH 00/10] submodule: make it a built-in, remove git-submodule.sh Ævar Arnfjörð Bjarmason
2022-10-17 12:09 ` [PATCH 01/10] git-submodule.sh: create a "case" dispatch statement Ævar Arnfjörð Bjarmason
2022-10-17 12:09 ` [PATCH 02/10] git-submodule.sh: dispatch "sync" to helper Ævar Arnfjörð Bjarmason
2022-10-20 20:42   ` Glen Choo
2022-10-17 12:09 ` [PATCH 03/10] git-submodule.sh: dispatch directly " Ævar Arnfjörð Bjarmason
2022-10-17 12:09 ` [PATCH 04/10] git-submodule.sh: dispatch "foreach" " Ævar Arnfjörð Bjarmason
2022-10-20 21:14   ` Glen Choo
2022-10-17 12:09 ` [PATCH 05/10] git-submodule.sh: dispatch "update" " Ævar Arnfjörð Bjarmason
2022-10-20 21:50   ` Glen Choo
2022-10-17 12:09 ` [PATCH 06/10] git-submodule.sh: don't support top-level "--cached" Ævar Arnfjörð Bjarmason
2022-10-20 22:14   ` Glen Choo
2022-10-17 12:09 ` [PATCH 07/10] submodule: make it a built-in, remove git-submodule.sh Ævar Arnfjörð Bjarmason
2022-10-20 22:49   ` Glen Choo [this message]
2022-10-17 12:09 ` [PATCH 08/10] submodule: support "--" with no other arguments Ævar Arnfjörð Bjarmason
2022-10-17 12:09 ` [PATCH 09/10] submodule: support sub-command-less "--recursive" option Ævar Arnfjörð Bjarmason
2022-10-20 23:05   ` Glen Choo
2022-10-17 12:09 ` [PATCH 10/10] submodule: don't use a subprocess to invoke "submodule--helper" Ævar Arnfjörð Bjarmason
2022-10-20 23:18   ` Glen Choo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=kl6lpmemxg8p.fsf@chooglen-macbookpro.roam.corp.google.com \
    --to=chooglen@google.com \
    --cc=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonas@bernoul.li \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).