* [PATCH v2 0/4] propose config-based hooks
@ 2020-05-21 18:54 Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
                   ` (4 more replies)
  0 siblings, 5 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon

This series implements "Stage 1" of the config-based hooks rollout
process as proposed in the design doc. It does not touch the existing
hook implementation or change the way that Git functions - it only adds
a new, independent command.

In the design doc, I mentioned the possibility of including 'git hook
add' and 'git hook edit' in this stage. However, I'd like to get input
from our UX team internally before I get started - I know my own limits,
and coming up with good UX design is one of them ;) Unfortunately, I
won't be able to get time with them until the first week of June, so I
haven't included those commands here.

The series is listed as v2 because I included the updated design doc
with changes pointed out by Junio and brian. That's a good place to
start if you're reviewing the series for the first time. (I'm also
breaking thread with the contributor summit notes to bring the series to
the attention of more contributors who may be interested.)

One point I'd especially like discussion on is the '--porcelain'
option. The intent was to make it very easy for non-builtins to run
hooks; but I'm starting to wonder whether it makes more sense to include
a `git hook run <hookname>`, which makes parallelization possible in the
future if we decide to implement that. Even if we decide it makes sense
to keep 'list --porcelain', I'm not sure what information to include;
providing simply the line to pass to 'sh' seems a little thin.

The next stage from here is to migrate internal callers who use
'find_hook()' now to call the hook library (and teach the hook library
to call find_hook()), which will essentially turn on config-based hooks;
does it make sense to include that stage at the same time as this
series so we aren't checking in unused code?

Thanks all.
 - Emily

Emily Shaffer (4):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                |  77 +++++
 git.c                                         |   1 +
 hook.c                                        |  90 +++++
 hook.h                                        |  15 +
 t/t1360-config-based-hooks.sh                 |  69 ++++
 11 files changed, 640 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.27.0.rc0.183.gde8f92d652-goog



* [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-22 10:13   ` Phillip Wood
  2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
 2 files changed, 321 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 15d9d04f31..5b21f31d31 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..59cdc25a47
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,320 @@
+Configuration-based hook management
+===================================
+
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the `.git/hooks/<hookname>`
+path as the only source of hooks to execute, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+== User interfaces
+
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+==== `hook`
+
+Primarily contains subsections for each hook event. These subsections define
+hook command execution order; hook commands can be specified by passing the
+command directly if no additional configuration is needed, or by passing the
+name of a `hookcmd`. If Git does not find a `hookcmd` whose subsection matches
+the value of the given command string, Git will try to execute the string
+directly. Hooks are executed by passing the resolved command string to the
+shell. Hook event subsections can also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-event>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+== Implementation
+
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+};
+----
+
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+=== Migration path
+
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+== Caveats
+
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+== Future work
+
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but makes it easy to determine a definitive order.
+
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow one-line hooks like this if they were configured outside of the local
+scope; or another approach, like a list of safe projects, might be useful. It
+may also be sufficient (or at least useful) to teach a `hook.disableAll` config
+or similar flag to the Git executable.
+
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.27.0.rc0.183.gde8f92d652-goog



* [PATCH v2 2/4] hook: scaffolding for git-hook subcommand
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 3d3a39fc19..fce6ee154e 100644
--- a/Makefile
+++ b/Makefile
@@ -1080,6 +1080,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index a2d337eed7..99372529a2 100644
--- a/git.c
+++ b/git.c
@@ -517,6 +517,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog



* [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-22 10:27   ` Phillip Wood
  2020-05-24 23:00   ` Johannes Schindelin
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  4 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh
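
The lookup described above can be approximated with plain 'git config',
which may help when reasoning about the expected output. This is only a
sketch: the function names are made up for illustration, it does not
print the config scope, and it is not how hook.c implements the lookup.
It does mirror one behavior of the patch below: a command that is
declared again later moves to the position of its last declaration.

```sh
#!/bin/sh

# Keep only the last occurrence of each input line, preserving the
# relative order of the lines that remain.
keep_last_occurrence () {
	awk '
		{ line[NR] = $0; last[$0] = NR }
		END {
			for (i = 1; i <= NR; i++)
				if (last[line[i]] == i)
					print line[i]
		}
	'
}

# Print the resolved commands for hook event $1, one per line.
list_hook_commands () {
	git config --get-all "hook.$1.command" |
	while IFS= read -r value
	do
		# A value naming a hookcmd resolves to hookcmd.<value>.command;
		# anything else is taken as the command itself.
		git config --get "hookcmd.$value.command" 2>/dev/null ||
		printf '%s\n' "$value"
	done |
	keep_last_occurrence
}
```

With the config from the example above, 'list_hook_commands pre-commit'
should print the same two lines as 'git hook list pre-commit'.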

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 37 +++++++++++++-
 Makefile                      |  1 +
 builtin/hook.c                | 55 +++++++++++++++++++--
 hook.c                        | 90 +++++++++++++++++++++++++++++++++++
 hook.h                        | 15 ++++++
 t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
 6 files changed, 242 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, and note the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index fce6ee154e..b7bbf3be7b 100644
--- a/Makefile
+++ b/Makefile
@@ -894,6 +894,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..cfd8e388bd 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (!head) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..9dfc1a885e
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,90 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = xmalloc(sizeof(struct hook));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	strbuf_addstr(&to_add->command, command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head *hook_list(const struct strbuf *hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head *hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..4e46d7dd4e 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'setup hooks in global and local' '
+	git config --add --local hook.pre-commit.command "/path/ghi" &&
+	git config --add --global hook.pre-commit.command "/path/def"
+'
+
+test_expect_success 'git hook list orders by config order' '
+	cat >expected <<-\EOF &&
+	global:	/path/def
+	local:	/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	git config --add --local hook.pre-commit.command "abc" &&
+	git config --add --global hookcmd.abc.command "/path/abc" &&
+
+	cat >expected <<-\EOF &&
+	global:	/path/def
+	local:	/path/ghi
+	local:	/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	git config --add --local hook.pre-commit.command "/path/def" &&
+
+	cat >expected <<-\EOF &&
+	local:	/path/ghi
+	local:	/path/abc
+	local:	/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog



* [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
                   ` (2 preceding siblings ...)
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-24 23:00   ` Johannes Schindelin
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  4 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach my $hook (@hooks) {
            system $hook;
    }
  }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index cfd8e388bd..2e51c84c81 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 4e46d7dd4e..3296d8af45 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	cat >expected <<-\EOF &&
+	/path/ghi
+	/path/abc
+	/path/def
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog



* Re: [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
@ 2020-05-22 10:13   ` Phillip Wood
  2020-06-09 20:26     ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-05-22 10:13 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

Thanks for working on this.

On 21/05/2020 19:54, Emily Shaffer wrote:
> Begin a design document for config-based hooks, managed via git-hook.
> Focus on an overview of the implementation and motivation for design
> decisions. Briefly discuss the alternatives considered before this
> point. Also, attempt to redefine terms to fit into a multihook world.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/Makefile                        |   1 +
>  .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
>  2 files changed, 321 insertions(+)
>  create mode 100644 Documentation/technical/config-based-hooks.txt
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 15d9d04f31..5b21f31d31 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
>  TECH_DOCS += MyFirstContribution
>  TECH_DOCS += MyFirstObjectWalk
>  TECH_DOCS += SubmittingPatches
> +TECH_DOCS += technical/config-based-hooks
>  TECH_DOCS += technical/hash-function-transition
>  TECH_DOCS += technical/http-protocol
>  TECH_DOCS += technical/index-format
> diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
> new file mode 100644
> index 0000000000..59cdc25a47
> --- /dev/null
> +++ b/Documentation/technical/config-based-hooks.txt
> @@ -0,0 +1,320 @@
> +Configuration-based hook management
> +===================================
> +
> +== Motivation
> +
> +Treat hooks as a first-class citizen by replacing the .git/hooks/hookname path as
> +the only source of hooks to execute, in a way which is friendly to users with
> +multiple repos which have similar needs.
> +
> +Redefine "hook" as an event rather than a single script, allowing users to
> +perform unrelated actions on a single event.
> +
> +Take a step closer to safety when copying zipped Git repositories from untrusted
> +users.

Having read through this (admittedly fairly quickly), I'm not sure what
that step is.

> +
> +Make it easier for users to discover Git's hook feature and automate their
> +workflows.
> +
> +== User interfaces
> +
> +=== Config schema
> +
> +Hooks can be introduced by editing the configuration manually. There are two new
> +sections added, `hook` and `hookcmd`.
> +
> +==== `hook`
> +
> +Primarily contains subsections for each hook event. These subsections define
> +hook command execution order;

Maybe "The order of these subsections defines the hook command execution
order"?

> hook commands can be specified by passing the
> +command directly if no additional configuration is needed, or by passing the
> +name of a `hookcmd`.

I know what you mean by "passing" but as this section is talking about
config settings perhaps it should refer to the keys and values.

> If Git does not find a `hookcmd` whose subsection matches
> +the value of the given command string, Git will try to execute the string
> +directly. Hooks are executed by passing the resolved command string to the
> +shell.

Do we really need to invoke the shell just to split a command-line and
look up the command in $PATH? If we used split_cmdline() in alias.c
then we could avoid invoking this extra process for each hook command.
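
To make the idea concrete, here is a rough, self-contained sketch of
splitting a command string into words without a shell. It is only an
illustration: `split_words()` is a made-up helper, not the function in
alias.c, and its quoting rules (single quotes, double quotes, backslash
escapes) are an assumption about what hook commands would need.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not Git's API: split `cmd` in place into at most
 * `max - 1` words and NULL-terminate argv[]. Single quotes, double quotes
 * and backslash escapes are honored. Returns the number of words, or -1
 * on an unterminated quote, a trailing backslash, or too many words.
 */
static int split_words(char *cmd, char **argv, int max)
{
	int argc = 0;
	char *src = cmd, *dst = cmd;
	char quote = 0;

	for (;;) {
		while (isspace((unsigned char)*src))
			src++;
		if (!*src)
			break;
		if (argc >= max - 1)
			return -1;
		argv[argc++] = dst;

		while (*src && (quote || !isspace((unsigned char)*src))) {
			char c = *src++;

			if (c == quote) {
				quote = 0;		/* closing quote */
			} else if (!quote && (c == '\'' || c == '"')) {
				quote = c;		/* opening quote */
			} else if (c == '\\' && quote != '\'') {
				if (!*src)
					return -1;	/* trailing backslash */
				*dst++ = *src++;	/* keep escaped char */
			} else {
				*dst++ = c;
			}
		}
		if (quote)
			return -1;
		/* step over the separator before it is overwritten */
		if (*src)
			src++;
		*dst++ = '\0';
	}
	argv[argc] = NULL;
	return argc;
}
```

The resulting NULL-terminated array could then be handed to something
like execvp(), which searches $PATH itself, so no intermediate shell
process would be needed for simple commands.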

> Hook event subsections can also contain per-hook-event settings.
> +
> +Also contains top-level hook execution settings, for example,
> +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.

Maybe add "(see sections ...)" for the forward references to these settings?

> +
> +----
> +[hook "pre-commit"]
> +  command = perl-linter
> +  command = /usr/bin/git-secrets --pre-commit
> +
> +[hook "pre-applypatch"]
> +  command = perl-linter
> +  error = ignore
> +
> +[hook]
> +  runHookDir = interactive
> +----
> +
> +==== `hookcmd`
> +
> +Defines a hook command and its attributes, which will be used when a hook event
> +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> +events, but event-specific attributes can also be supplied. The example runs
> +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> +include this config, the hook command will be skipped for all events to which
> +it's normally subscribed _except_ `pre-commit`.
> +
> +----
> +[hookcmd "perl-linter"]
> +  command = /usr/bin/lint-it --language=perl
> +  skip = true
> +  pre-commit-skip = false
> +----
> +
> +=== Command-line API
> +
> +Users should be able to view, reorder, and create hook commands via the command
> +line. External tools should be able to view a list of hooks in the correct order
> +to run.
> +
> +*`git hook list <hook-event>`*
> +
> +*`git hook list (--system|--global|--local|--worktree)`*
> +
> +*`git hook edit <hook-event>`*
> +
> +*`git hook add <hook-command> <hook-event> <options...>`*
> +
> +=== Hook editor
> +
> +The tool which is presented by `git hook edit <hook-command>`. Ideally, this
> +tool should be easier to use than manually editing the config, and then produce
> +a concise config afterwards. It may take a form similar to `git rebase
> +--interactive`.

rebase -i is not necessarily an exemplar of user interface design; what
sort of thing do you have in mind?

> +
> +== Implementation
> +
> +=== Library
> +
> +`hook.c` and `hook.h` are responsible for interacting with the config files. In
> +the case when the code generating a hook event doesn't have special concerns
> +about how to run the hooks, the hook library will provide a basic API to call
> +all hooks in config order with an `argv_array` provided by the code which
> +generates the hook event:
> +
> +*`int run_hooks(const char *hookname, struct argv_array *args)`*
> +
> +This call includes the hook command provided by `run-command.h:find_hook()`;
> +eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
> +config is checked against a number of cases:
> +
> +- "no": the legacy hook will not be run
> +- "interactive": Git will prompt the user before running the legacy hook
> +- "warn": Git will print a warning to stderr before running the legacy hook
> +- "yes" (default): Git will silently run the legacy hook
> +
> +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> +given which Git does not recognize, Git should discard that config entry. For
> +example, if "warn" was specified at system level and "junk" was specified at
> +global level, Git would resolve the value to "warn"; if the only time the config
> +was set was to "junk", Git would use the default value of "yes".
> +
> +If the caller wants to do something more complicated, the hook library can also
> +provide a callback API:
> +
> +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> +
> +Finally, to facilitate the builtin, the library will also provide the following
> +APIs to interact with the config:
> +
> +----
> +int set_hook_commands(const char *hookname, struct string_list *commands,
> +	enum config_scope scope);
> +int set_hookcmd(const char *hookcmd, struct hookcmd options);
> +
> +int list_hook_commands(const char *hookname, struct string_list *commands);
> +int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
> +----
> +
> +`struct hookcmd` is expected to grow in size over time as more functionality is
> +added to hooks; so that other parts of the code don't need to understand the
> +config schema, `struct hookcmd` should contain logical values instead of string
> +pairs.
> +
> +----
> +struct hookcmd {
> +  const char *name;
> +  const char *command;
> +
> +  /* for illustration only; not planned at present */
> +  int parallelizable;
> +  const char *hookcmd_before;
> +  const char *hookcmd_after;
> +  enum recovery_action on_fail;
> +}
> +----
> +
> +=== Builtin
> +
> +`builtin/hook.c` is responsible for providing the frontend. It's responsible for
> +formatting user-provided data and then calling the library API to set the
> +configs as appropriate. The builtin frontend is not responsible for calling the
> +config directly, so that other areas of Git can rely on the hook library to
> +understand the most recent config schema for hooks.
> +
> +=== Migration path
> +
> +==== Stage 0
> +
> +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> +executing the result. The hook library and builtin do not exist. Hooks only
> +exist as specially named scripts within `.git/hooks/`.
> +
> +==== Stage 1
> +
> +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> +output. Modifier commands like `git hook add` and `git hook edit` can be
> +implemented around this time as well.
> +
> +==== Stage 2
> +
> +`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
> +end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
> +opt-in to config-based hooks simply by creating some in their config; otherwise
> +users should remain unaffected by the change.
> +
> +==== Stage 3
> +
> +The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
> +`hook.runHookDir`. Users can opt into managing their hooks completely via the
> +config this way.
> +
> +==== Stage 4
> +
> +`.git/hooks` is removed from the template and the hook directory is considered
> +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> +not changed, and `find_hook()` is not removed.
> +
> +== Caveats
> +
> +=== Security and repo config
> +
> +Part of the motivation behind this refactor is to mitigate hooks as an attack
> +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> +however, as the design stands, users can still provide hooks in the repo-level
> +config, which is included when a repo is zipped and sent elsewhere.  The
> +security of the repo-level config is still under discussion; this design
> +generally assumes the repo-level config is secure, which is not true yet. The
> +goal is to avoid an overcomplicated design to work around a problem which has
> +ceased to exist.
> +
> +=== Ease of use
> +
> +The config schema is nontrivial; that's why it's important for the `git hook`
> +modifier commands to be usable.

That's an important point.

> Contributors with UX expertise are encouraged to
> +share their suggestions.
> +
> +== Alternative approaches
> +
> +A previous summary of alternatives exists in the
> +archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
> +
> +=== Status quo
> +
> +Today users can implement multihooks themselves by using a "trampoline script"
> +as their hook, and pointing that script to a directory or list of other scripts
> +they wish to run.
> +
> +=== Hook directories
> +
> +Other contributors have suggested Git learn about the existence of a directory
> +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> +
> +=== Comparison table
> +
> +.Comparison of alternatives
> +|===
> +|Feature |Config-based hooks |Hook directories |Status quo
> +
> +|Supports multiple hooks
> +|Natively
> +|Natively
> +|With user effort
> +
> +|Safer for zipped repos
> +|A little
> +|No
> +|No
> +
> +|Previous hooks just work
> +|If configured
> +|Yes
> +|Yes
> +
> +|Can install one hook to many repos
> +|Yes
> +|No
> +|No
> +
> +|Discoverability
> +|Better (in `git help git`)
> +|Same as before
> +|Same as before
> +
> +|Hard to run unexpected hook
> +|If configured
> +|No
> +|No
> +|===
> +
> +== Future work
> +
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
> +
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but makes it easy to determine a definitive order.
> +
> +=== Parallelization
> +
> +Users with many hooks might want to run them simultaneously, if the hooks don't
> +modify state; if one hook depends on another's output, then users will want to
> +specify those dependencies. If we decide to solve this problem, we may want to
> +look to modern build systems for inspiration on how to manage dependencies and
> +parallel tasks.
> +
> +=== Securing hookdir hooks
> +
> +With the design as written in this doc, it's still possible for a malicious user
> +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> +zip their repo and send it to another user. It may be necessary to teach Git to
> +only allow one-line hooks like this if they were configured outside of the local
> +scope;

Does "disabling one-line hooks" mean "disable passing command line
arguments to the hook"? I'm not sure that gains much security - can't I
just set 'hook.pre-receive.command = ./delete-everything' and include
delete-everything in my malicious repo?

Best Wishes

Phillip

> or another approach, like a list of safe projects, might be useful. It
> +may also be sufficient (or at least useful) to teach a `hook.disableAll` config
> +or similar flag to the Git executable.
> +
> +=== Submodule inheritance
> +
> +It's possible some submodules may want to run the identical set of hooks that
> +their superrepo runs. While a globally-configured hook set is helpful, it's not
> +a great solution for users who have multiple repos-with-submodules under the
> +same user. It would be useful for submodules to learn how to run hooks from
> +their superrepo's config, or inherit that hook setting.
> +
> +== Glossary
> +
> +*hook event*
> +
> +A point during Git's execution where user scripts may be run, for example,
> +_prepare-commit-msg_ or _pre-push_.
> +
> +*hook command*
> +
> +A user script or executable which will be run on one or more hook events.
> 



* Re: [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
@ 2020-05-22 10:27   ` Phillip Wood
  2020-06-09 21:49     ` Emily Shaffer
  2020-05-24 23:00   ` Johannes Schindelin
  1 sibling, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-05-22 10:27 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 21/05/2020 19:54, Emily Shaffer wrote:
> Teach 'git hook list <hookname>', which checks the known configs in
> order to create an ordered list of hooks to run on a given hook event.
> 
> Multiple commands can be specified for a given hook by providing
> multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> run in config order. If more properties need to be set on a given hook
> in the future, commands can also be specified by providing
> "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> 
> For example:
> 
>   $ git config --list | grep ^hook
>   hook.pre-commit.command=baz
>   hook.pre-commit.command=~/bar.sh
>   hookcmd.baz.command=~/baz/from/hookcmd.sh
> 
>   $ git hook list pre-commit
>   ~/baz/from/hookcmd.sh
>   ~/bar.sh
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/git-hook.txt    | 37 +++++++++++++-
>  Makefile                      |  1 +
>  builtin/hook.c                | 55 +++++++++++++++++++--
>  hook.c                        | 90 +++++++++++++++++++++++++++++++++++
>  hook.h                        | 15 ++++++
>  t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
>  6 files changed, 242 insertions(+), 7 deletions(-)
>  create mode 100644 hook.c
>  create mode 100644 hook.h
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index 2d50c414cc..e458586e96 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
>  SYNOPSIS
>  --------
>  [verse]
> -'git hook'
> +'git hook' list <hook-name>
>  
>  DESCRIPTION
>  -----------
>  You can list, add, and modify hooks with this command.
>  
> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a
> +particular hook event; commands are run in config order. "hookcmd" is used to
> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:
> +
> +Global config
> +----
> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"
> +----
> +
> +Local config
> +----
> +  [hook "prepare-commit-msg"]
> +    command = "linter"
> +  [hook "post-commit"]
> +    command = "python ~/run-test-suite.py"
> +----
> +
> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> +
>  GIT
>  ---
>  Part of the linkgit:git[1] suite
> diff --git a/Makefile b/Makefile
> index fce6ee154e..b7bbf3be7b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -894,6 +894,7 @@ LIB_OBJS += grep.o
>  LIB_OBJS += hashmap.o
>  LIB_OBJS += help.o
>  LIB_OBJS += hex.o
> +LIB_OBJS += hook.o
>  LIB_OBJS += ident.o
>  LIB_OBJS += interdiff.o
>  LIB_OBJS += json-writer.o
> diff --git a/builtin/hook.c b/builtin/hook.c
> index b2bbc84d4d..cfd8e388bd 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -1,21 +1,68 @@
>  #include "cache.h"
>  
>  #include "builtin.h"
> +#include "config.h"
> +#include "hook.h"
>  #include "parse-options.h"
> +#include "strbuf.h"
>  
>  static const char * const builtin_hook_usage[] = {
> -	N_("git hook"),
> +	N_("git hook list <hookname>"),
>  	NULL
>  };
>  
> -int cmd_hook(int argc, const char **argv, const char *prefix)
> +static int list(int argc, const char **argv, const char *prefix)
>  {
> -	struct option builtin_hook_options[] = {
> +	struct list_head *head, *pos;
> +	struct hook *item;
> +	struct strbuf hookname = STRBUF_INIT;
> +
> +	struct option list_options[] = {
>  		OPT_END(),
>  	};
>  
> -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> +	argc = parse_options(argc, argv, prefix, list_options,
>  			     builtin_hook_usage, 0);
>  
> +	if (argc < 1) {
> +		usage_msg_opt("a hookname must be provided to operate on.",
> +			      builtin_hook_usage, list_options);
> +	}
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +
> +	head = hook_list(&hookname);
> +
> +	if (!head) {
> +		printf(_("no commands configured for hook '%s'\n"),
> +		       hookname.buf);
> +		return 0;
> +	}
> +
> +	list_for_each(pos, head) {
> +		item = list_entry(pos, struct hook, list);
> +		if (item)
> +			printf("%s:\t%s\n",
> +			       config_scope_name(item->origin),
> +			       item->command.buf);
> +	}
> +
> +	clear_hook_list();
> +	strbuf_release(&hookname);
> +
>  	return 0;
>  }
> +
> +int cmd_hook(int argc, const char **argv, const char *prefix)
> +{
> +	struct option builtin_hook_options[] = {
> +		OPT_END(),
> +	};
> +	if (argc < 2)
> +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> +
> +	if (!strcmp(argv[1], "list"))
> +		return list(argc - 1, argv + 1, prefix);
> +
> +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> +}
> diff --git a/hook.c b/hook.c
> new file mode 100644
> index 0000000000..9dfc1a885e
> --- /dev/null
> +++ b/hook.c
> @@ -0,0 +1,90 @@
> +#include "cache.h"
> +
> +#include "hook.h"
> +#include "config.h"
> +
> +static LIST_HEAD(hook_head);
> +
> +void free_hook(struct hook *ptr)
> +{
> +	if (ptr) {
> +		strbuf_release(&ptr->command);
> +		free(ptr);
> +	}
> +}
> +
> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	strbuf_addstr(&to_add->command, command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}
> +
> +static void remove_hook(struct list_head *to_remove)
> +{
> +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> +	list_del(to_remove);
> +	free_hook(hook_to_remove);
> +}
> +
> +void clear_hook_list(void)
> +{
> +	struct list_head *pos, *tmp;
> +	list_for_each_safe(pos, tmp, &hook_head)
> +		remove_hook(pos);
> +}
> +
> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> +{
> +	const char *hook_key = hook_key_cb;
> +
> +	if (!strcmp(key, hook_key)) {
> +		const char *command = value;
> +		struct strbuf hookcmd_name = STRBUF_INIT;
> +		struct list_head *pos = NULL, *tmp = NULL;
> +
> +		/* Check if a hookcmd with that name exists. */
> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> +		git_config_get_value(hookcmd_name.buf, &command);

This looks dodgy to me. This code is called by git_config() as it parses
the config files, so it has not had a chance to fully populate the
config cache used by git_config_get_value(). I think the test below
passes because the hookcmd setting is set in the global file and the
hook setting is set in the local file, so we have already parsed the
hookcmd setting when we come to look it up. The same comment applies to
the hypothetical ordering config mentioned below. I think it would be
better to collect the list of hook.<event>.command settings in this
callback and then look up any hookcmd settings for those hook commands
after we've finished reading all of the config files.
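
Roughly what I have in mind, as a standalone sketch rather than real Git
code: `struct kv`, `collect()`, `resolve()` and `list_hooks()` are all
invented for illustration, the parsed config is modelled as a plain
array, and the reordering of duplicate commands is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One parsed config entry; stands in for what the config parser sees. */
struct kv {
	const char *key;
	const char *value;
};

#define MAX_HOOKS 16

struct hook_cmds {
	const char *cmd[MAX_HOOKS];
	size_t nr;
};

/* Phase 1: per-entry callback; only records the raw values, in order. */
static void collect(const struct kv *e, const char *hook_key,
		    struct hook_cmds *out)
{
	if (!strcmp(e->key, hook_key) && out->nr < MAX_HOOKS)
		out->cmd[out->nr++] = e->value;
}

/* Look a key up in the complete config; the last definition wins. */
static const char *lookup(const struct kv *cfg, size_t n, const char *key)
{
	const char *found = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(cfg[i].key, key))
			found = cfg[i].value;
	return found;
}

/* Phase 2: runs after parsing has finished, so every hookcmd is visible. */
static void resolve(const struct kv *cfg, size_t n, struct hook_cmds *hooks)
{
	char key[256];
	size_t i;

	for (i = 0; i < hooks->nr; i++) {
		const char *cmd;

		snprintf(key, sizeof(key), "hookcmd.%s.command",
			 hooks->cmd[i]);
		cmd = lookup(cfg, n, key);
		if (cmd)
			hooks->cmd[i] = cmd;	/* dereference the hookcmd */
	}
}

static void list_hooks(const struct kv *cfg, size_t n, const char *hook_key,
		       struct hook_cmds *out)
{
	size_t i;

	out->nr = 0;
	for (i = 0; i < n; i++)
		collect(&cfg[i], hook_key, out);
	resolve(cfg, n, out);
}
```

With this shape the result should not depend on whether the hookcmd
entry appears before or after the hook entry that names it.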

> +
> +		if (!command)
> +			BUG("git_config_get_value overwrote a string it shouldn't have");
> +
> +		/*
> +		 * TODO: implement an option-getting callback, e.g.
> +		 *   get configs by pattern hookcmd.$value.*
> +		 *   for each key+value, do_callback(key, value, cb_data)
> +		 */
> +
> +		list_for_each_safe(pos, tmp, &hook_head) {
> +			struct hook *hook = list_entry(pos, struct hook, list);
> +			/*
> +			 * The list of hooks to run can be reordered by being redeclared
> +			 * in the config. Options about hook ordering should be checked
> +			 * here.
> +			 */
> +			if (0 == strcmp(hook->command.buf, command))
> +				remove_hook(pos);
> +		}
> +		emplace_hook(pos, command);
> +	}
> +
> +	return 0;
> +}
> +
> +struct list_head* hook_list(const struct strbuf* hookname)
> +{
> +	struct strbuf hook_key = STRBUF_INIT;
> +
> +	if (!hookname)
> +		return NULL;
> +
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> +
> +	git_config(hook_config_lookup, (void*)hook_key.buf);
> +
> +	return &hook_head;
> +}
> diff --git a/hook.h b/hook.h
> new file mode 100644
> index 0000000000..aaf6511cff
> --- /dev/null
> +++ b/hook.h
> @@ -0,0 +1,15 @@
> +#include "config.h"
> +#include "list.h"
> +#include "strbuf.h"
> +
> +struct hook
> +{
> +	struct list_head list;
> +	enum config_scope origin;
> +	struct strbuf command;
> +};
> +
> +struct list_head* hook_list(const struct strbuf *hookname);
> +
> +void free_hook(struct hook *ptr);
> +void clear_hook_list(void);
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..4e46d7dd4e 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>  
>  . ./test-lib.sh
>  
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'
> +
> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'setup hooks in global, and local' '
> +	git config --add --local hook.pre-commit.command "/path/ghi" &&

Can I make a plea for the use of test_config, please? Writing tests which
rely on previous tests for their set-up creates a chain of hidden
dependencies that make it hard to add/alter tests later or run a subset
of the tests when developing a new patch. t3404-rebase-interactive.sh is
a prime example of this and I dread touching it.

> +	git config --add --global hook.pre-commit.command "/path/def"
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	git config --add --local hook.pre-commit.command "abc" &&
> +	git config --add --global hookcmd.abc.command "/path/abc" &&
> +
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	local:	/path/abc

We should make it clear in the documentation that the config origin
applies to the hook setting, even though we display the hookcmd command
which is set globally here for the last hook.

Best Wishes

Phillip

> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	git config --add --local hook.pre-commit.command "/path/def" &&
> +
> +	cat >expected <<-\EOF &&
> +	local:	/path/ghi
> +	local:	/path/abc
> +	local:	/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>  '
>  
>  test_done
> 



* Re: [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
  2020-05-22 10:27   ` Phillip Wood
@ 2020-05-24 23:00   ` Johannes Schindelin
  2020-05-27 23:37     ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-24 23:00 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Thu, 21 May 2020, Emily Shaffer wrote:

> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..4e46d7dd4e 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>
>  . ./test-lib.sh
>
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'
> +
> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'setup hooks in global, and local' '
> +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> +	git config --add --global hook.pre-commit.command "/path/def"
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual

This, as well as the next two test cases, won't work on Windows, as you
almost certainly realized from looking at the failed GitHub workflow run
of your branch.

The reason is that Unix-like absolute paths like `/path/def` do _not_ do
what you think on Windows: they are relative to the MSYS2 root (because
the shell script runs in an MSYS2 Bash). The Git executable, however,
knows nothing about MSYS2 and cannot resolve such paths. To remedy that,
the MSYS2 Bash converts them by prepending the absolute _Windows-style_
path of the MSYS2 root when passing them to `git.exe` (in your case, this
already happens in the `setup hooks` test case above).

So you will need to squash this (or an equivalent fix) into your patch:

-- snip --
From f2568d47509130a9c35590d907797d2eb813ac0d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 25 May 2020 15:03:16 +0200
Subject: [PATCH] fixup??? hook: add list command

This is needed to make the tests pass on Windows, where Unix-like
absolute paths are not what you think they are.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1360-config-based-hooks.sh | 39 +++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 3296d8af4587..c862655fd4d9 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -18,10 +18,19 @@ test_expect_success 'setup hooks in global, and local' '
 	git config --add --global hook.pre-commit.command "/path/def"
 '

+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
 test_expect_success 'git hook list orders by config order' '
-	cat >expected <<-\EOF &&
-	global:	/path/def
-	local:	/path/ghi
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
 	EOF

 	git hook list pre-commit >actual &&
@@ -32,10 +41,10 @@ test_expect_success 'git hook list dereferences a hookcmd' '
 	git config --add --local hook.pre-commit.command "abc" &&
 	git config --add --global hookcmd.abc.command "/path/abc" &&

-	cat >expected <<-\EOF &&
-	global:	/path/def
-	local:	/path/ghi
-	local:	/path/abc
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
 	EOF

 	git hook list pre-commit >actual &&
@@ -45,10 +54,10 @@ test_expect_success 'git hook list dereferences a hookcmd' '
 test_expect_success 'git hook list reorders on duplicate commands' '
 	git config --add --local hook.pre-commit.command "/path/def" &&

-	cat >expected <<-\EOF &&
-	local:	/path/ghi
-	local:	/path/abc
-	local:	/path/def
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	local:	$ROOT/path/def
 	EOF

 	git hook list pre-commit >actual &&
@@ -56,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 '

 test_expect_success 'git hook list --porcelain prints just the command' '
-	cat >expected <<-\EOF &&
-	/path/ghi
-	/path/abc
-	/path/def
+	cat >expected <<-EOF &&
+	$ROOT/path/ghi
+	$ROOT/path/abc
+	$ROOT/path/def
 	EOF

 	git hook list --porcelain pre-commit >actual &&
--
2.27.0.rc1.windows.1

-- snap --
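
For anyone puzzled by the heredoc change in the fixup above: a quoted
delimiter (`<<-\EOF`) takes the body literally, while an unquoted one
(`<<-EOF`) lets the shell expand `$ROOT`. Below is a minimal sketch of
that difference, runnable in a POSIX shell. The `ROOT` value is a
made-up stand-in for the MSYS2 pseudo root, not what the test suite
would actually compute.

```shell
#!/bin/sh
# Hypothetical value; on a real setup this would be the Windows-style
# path of the MSYS2 root.
ROOT="C:/git-sdk-64"

# Quoted delimiter: the body is literal, so $ROOT is NOT expanded.
quoted=$(cat <<\EOF
$ROOT/path/def
EOF
)

# Unquoted delimiter: the shell expands $ROOT before cat sees the text.
unquoted=$(cat <<EOF
$ROOT/path/def
EOF
)

printf '%s\n' "$quoted" "$unquoted"
```

With the quoted form the expected file would contain the literal string
`$ROOT/path/def`, which is why the fixup has to drop the backslash.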

Ciao,
Dscho

> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	git config --add --local hook.pre-commit.command "abc" &&
> +	git config --add --global hookcmd.abc.command "/path/abc" &&
> +
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	local:	/path/abc
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	git config --add --local hook.pre-commit.command "/path/def" &&
> +
> +	cat >expected <<-\EOF &&
> +	local:	/path/ghi
> +	local:	/path/abc
> +	local:	/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>  '
>
>  test_done
> --
> 2.27.0.rc0.183.gde8f92d652-goog
>
>
>


* Re: [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
@ 2020-05-24 23:00   ` Johannes Schindelin
  2020-05-25  0:29     ` Johannes Schindelin
  0 siblings, 1 reply; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-24 23:00 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Thu, 21 May 2020, Emily Shaffer wrote:

> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 4e46d7dd4e..3296d8af45 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
>  	test_cmp expected actual
>  '
>
> +test_expect_success 'git hook list --porcelain prints just the command' '
> +	cat >expected <<-\EOF &&
> +	/path/ghi
> +	/path/abc
> +	/path/def
> +	EOF
> +
> +	git hook list --porcelain pre-commit >actual &&
> +	test_cmp expected actual
> +'

As you surely found out from the GitHub workflow running in your fork,
this does not work on Windows. I need this (and strongly suggest you
squash that into your patch):

-- snipsnap --
From 97e3dfa6155785363c881ce2dcaf4f5ddead83ed Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 25 May 2020 15:04:24 +0200
Subject: [PATCH] fixup??? hook: add --porcelain to list command

This is required to let the test pass on Windows, where Git reports
Windows-style absolute paths and has no idea about the pseudo Unix
absolute paths that the Bash knows about.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1360-config-based-hooks.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index c862655fd4d9..fce7335e97b9 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -65,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 '

 test_expect_success 'git hook list --porcelain prints just the command' '
-	cat >expected <<-EOF &&
-	$ROOT/path/ghi
-	$ROOT/path/abc
-	$ROOT/path/def
+	cat >expected <<-\EOF &&
+	/path/ghi
+	/path/abc
+	/path/def
 	EOF

 	git hook list --porcelain pre-commit >actual &&
--
2.27.0.rc1.windows.1



* Re: [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-24 23:00   ` Johannes Schindelin
@ 2020-05-25  0:29     ` Johannes Schindelin
  0 siblings, 0 replies; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-25  0:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Mon, 25 May 2020, Johannes Schindelin wrote:

> Hi Emily,
>
> On Thu, 21 May 2020, Emily Shaffer wrote:
>
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 4e46d7dd4e..3296d8af45 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
> >  	test_cmp expected actual
> >  '
> >
> > +test_expect_success 'git hook list --porcelain prints just the command' '
> > +	cat >expected <<-\EOF &&
> > +	/path/ghi
> > +	/path/abc
> > +	/path/def
> > +	EOF
> > +
> > +	git hook list --porcelain pre-commit >actual &&
> > +	test_cmp expected actual
> > +'
>
> As you surely found out from the GitHub workflow running in your fork,
> this does not work on Windows. I need this (and strongly suggest you
> squash that into your patch):
>
> -- snipsnap --
> From 97e3dfa6155785363c881ce2dcaf4f5ddead83ed Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Mon, 25 May 2020 15:04:24 +0200
> Subject: [PATCH] fixup??? hook: add --porcelain to list command
>
> This is required to let the test pass on Windows, where Git reports
> Windows-style absolute paths and has no idea about the pseudo Unix
> absolute paths that the Bash knows about.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  t/t1360-config-based-hooks.sh | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index c862655fd4d9..fce7335e97b9 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -65,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
>  '
>
>  test_expect_success 'git hook list --porcelain prints just the command' '
> -	cat >expected <<-EOF &&
> -	$ROOT/path/ghi
> -	$ROOT/path/abc
> -	$ROOT/path/def
> +	cat >expected <<-\EOF &&
> +	/path/ghi
> +	/path/abc
> +	/path/def

Due to an oversight on my part, this is actually the _reverse_ diff, and
the corresponding part in my mail answering your PATCH 3/4 should be
skipped from that fixup. Sorry for that.

Ciao,
Dscho

>  	EOF
>
>  	git hook list --porcelain pre-commit >actual &&
> --
> 2.27.0.rc1.windows.1
>
>


* Re: [PATCH v2 3/4] hook: add list command
  2020-05-24 23:00   ` Johannes Schindelin
@ 2020-05-27 23:37     ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-27 23:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Mon, May 25, 2020 at 01:00:03AM +0200, Johannes Schindelin wrote:
> cc: git@vger.kernel.org
> 
> Hi Emily,
> 
> On Thu, 21 May 2020, Emily Shaffer wrote:
> 
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 34b0df5216..4e46d7dd4e 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
> >
> >  . ./test-lib.sh
> >
> > -test_expect_success 'git hook command does not crash' '
> > -	git hook
> > +test_expect_success 'git hook rejects commands without a mode' '
> > +	test_must_fail git hook pre-commit
> > +'
> > +
> > +
> > +test_expect_success 'git hook rejects commands without a hookname' '
> > +	test_must_fail git hook list
> > +'
> > +
> > +test_expect_success 'setup hooks in global, and local' '
> > +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> > +	git config --add --global hook.pre-commit.command "/path/def"
> > +'
> > +
> > +test_expect_success 'git hook list orders by config order' '
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	EOF
> > +
> > +	git hook list pre-commit >actual &&
> > +	test_cmp expected actual
> 
> This, as well as the next two test cases, won't work on Windows, as you
> almost certainly realized from looking at the failed GitHub workflow run
> of your branch.

Thanks very much for sending this. To be honest, the failed workflow
run appeared to be caused by the earlier SDK download issue, and I have
not yet rebased onto a fix for that, so I missed any actionable failures
when I ran the CI locally. I'll take this into account; much
appreciated.

 - Emily


* Re: [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-22 10:13   ` Phillip Wood
@ 2020-06-09 20:26     ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-06-09 20:26 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, May 22, 2020 at 11:13:07AM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> Thanks for working on this
> 
> On 21/05/2020 19:54, Emily Shaffer wrote:
> > Begin a design document for config-based hooks, managed via git-hook.
> > Focus on an overview of the implementation and motivation for design
> > decisions. Briefly discuss the alternatives considered before this
> > point. Also, attempt to redefine terms to fit into a multihook world.
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  Documentation/Makefile                        |   1 +
> >  .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
> >  2 files changed, 321 insertions(+)
> >  create mode 100644 Documentation/technical/config-based-hooks.txt
> > 
> > diff --git a/Documentation/Makefile b/Documentation/Makefile
> > index 15d9d04f31..5b21f31d31 100644
> > --- a/Documentation/Makefile
> > +++ b/Documentation/Makefile
> > @@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
> >  TECH_DOCS += MyFirstContribution
> >  TECH_DOCS += MyFirstObjectWalk
> >  TECH_DOCS += SubmittingPatches
> > +TECH_DOCS += technical/config-based-hooks
> >  TECH_DOCS += technical/hash-function-transition
> >  TECH_DOCS += technical/http-protocol
> >  TECH_DOCS += technical/index-format
> > diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
> > new file mode 100644
> > index 0000000000..59cdc25a47
> > --- /dev/null
> > +++ b/Documentation/technical/config-based-hooks.txt
> > @@ -0,0 +1,320 @@
> > +Configuration-based hook management
> > +===================================
> > +
> > +== Motivation
> > +
> > +Treat hooks as a first-class citizen by replacing the .git/hooks/hookname path as
> > +the only source of hooks to execute, in a way which is friendly to users with
> > +multiple repos which have similar needs.
> > +
> > +Redefine "hook" as an event rather than a single script, allowing users to
> > +perform unrelated actions on a single event.
> > +
> > +Take a step closer to safety when copying zipped Git repositories from untrusted
> > +users.
> 
> Having read through this (admittedly fairly quickly) I'm not sure what
> that step is

Ok, I'll try to clarify it a little here.

> 
> > +
> > +Make it easier for users to discover Git's hook feature and automate their
> > +workflows.
> > +
> > +== User interfaces
> > +
> > +=== Config schema
> > +
> > +Hooks can be introduced by editing the configuration manually. There are two new
> > +sections added, `hook` and `hookcmd`.
> > +
> > +==== `hook`
> > +
> > +Primarily contains subsections for each hook event. These subsections define
> > +hook command execution order;
> 
> May be "The order of these subsections define the hook command execution
> order" ?

Nice. Took it verbatim.

> 
> > hook commands can be specified by passing the
> > +command directly if no additional configuration is needed, or by passing the
> > +name of a `hookcmd`.
> 
> I know what you mean by "passing" but as this section is talking about
> config settings perhaps it should refer to the keys and values.

Sure.

> 
> > If Git does not find a `hookcmd` whose subsection matches
> > +the value of the given command string, Git will try to execute the string
> > +directly. Hooks are executed by passing the resolved command string to the
> > +shell.
> 
> Do we really need to invoke the shell just to split a command-line and
> look up the command in $PATH? If we used split_commandline() in alias.c
> then we could avoid invoking this extra process for each hook command.

I'll want to experiment a little with this and figure out what works
best. You may be right, and I could also be wrong about the platform
compatibility of doing it the way I described. I haven't written this
bit yet, so I'd like to update this section of the design doc when I get
to the implementation, so that the two match.
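
As a rough illustration of what "passing the resolved command string to
the shell" means in the doc, and why the string may carry its own
arguments, the sketch below hands a command string to `sh -c` and
forwards extra arguments through `"$@"`. The command string and the file
names are made up for the example. This is only the general shell idiom,
not the planned implementation.

```shell
#!/bin/sh
# Hypothetical resolved hook command, including its own option.
# `echo` stands in for a real linter so the sketch has no dependencies.
hook_command='echo lint-it --language=perl'

# Run the string through the shell. Arguments supplied by the caller are
# appended via "$@"; the word right after the script becomes $0 of the
# inner shell.
output=$(sh -c "$hook_command \"\$@\"" "$hook_command" a.pl b.pl)

printf '%s\n' "$output"
```

Avoiding the shell, as suggested above, would mean splitting the string
into words in-process instead of spawning `sh` for every hook command.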

> 
> > Hook event subsections can also contain per-hook-event settings.
> > +
> > +Also contains top-level hook execution settings, for example,
> > +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
> 
> (see sections ...) ? for the forward references to these settings?

Sure. I think the best way to do this is if I use anchors for all the
sections; this works without me specifying it in Asciidoctor but needs
to be explicitly specified in Asciidoc. So I'll make sure to include
that with the next iteration.

> 
> > +
> > +----
> > +[hook "pre-commit"]
> > +  command = perl-linter
> > +  command = /usr/bin/git-secrets --pre-commit
> > +
> > +[hook "pre-applypatch"]
> > +  command = perl-linter
> > +  error = ignore
> > +
> > +[hook]
> > +  runHookDir = interactive
> > +----
> > +
> > +==== `hookcmd`
> > +
> > +Defines a hook command and its attributes, which will be used when a hook event
> > +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> > +events, but event-specific attributes can also be supplied. The example runs
> > +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> > +include this config, the hook command will be skipped for all events to which
> > +it's normally subscribed _except_ `pre-commit`.
> > +
> > +----
> > +[hookcmd "perl-linter"]
> > +  command = /usr/bin/lint-it --language=perl
> > +  skip = true
> > +  pre-commit-skip = false
> > +----
> > +
> > +=== Command-line API
> > +
> > +Users should be able to view, reorder, and create hook commands via the command
> > +line. External tools should be able to view a list of hooks in the correct order
> > +to run.
> > +
> > +*`git hook list <hook-event>`*
> > +
> > +*`git hook list (--system|--global|--local|--worktree)`*
> > +
> > +*`git hook edit <hook-event>`*
> > +
> > +*`git hook add <hook-command> <hook-event> <options...>`*
> > +
> > +=== Hook editor
> > +
> > +The tool which is presented by `git hook edit <hook-command>`. Ideally, this
> > +tool should be easier to use than manually editing the config, and then produce
> > +a concise config afterwards. It may take a form similar to `git rebase
> > +--interactive`.
> 
> rebase -i is not necessarily an exemplar of user interface design, what
> sort of thing do you have in mind?

Thanks for your patience on this. I didn't have a clear idea when I
wrote the doc, because I don't have much expertise in user interfaces.
Since then I have worked with some UX experts here, so I'll include a
better writeup in the next iteration; I now have a much clearer idea of
how that should look.

> 
> > +
> > +== Implementation
> > +
> > +=== Library
> > +
> > +`hook.c` and `hook.h` are responsible for interacting with the config files. In
> > +the case when the code generating a hook event doesn't have special concerns
> > +about how to run the hooks, the hook library will provide a basic API to call
> > +all hooks in config order with an `argv_array` provided by the code which
> > +generates the hook event:
> > +
> > +*`int run_hooks(const char *hookname, struct argv_array *args)`*
> > +
> > +This call includes the hook command provided by `run-command.h:find_hook()`;
> > +eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
> > +config is checked against a number of cases:
> > +
> > +- "no": the legacy hook will not be run
> > +- "interactive": Git will prompt the user before running the legacy hook
> > +- "warn": Git will print a warning to stderr before running the legacy hook
> > +- "yes" (default): Git will silently run the legacy hook
> > +
> > +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> > +given which Git does not recognize, Git should discard that config entry. For
> > +example, if "warn" was specified at system level and "junk" was specified at
> > +global level, Git would resolve the value to "warn"; if the only time the config
> > +was set was to "junk", Git would use the default value of "yes".
> > +
> > +If the caller wants to do something more complicated, the hook library can also
> > +provide a callback API:
> > +
> > +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> > +
> > +Finally, to facilitate the builtin, the library will also provide the following
> > +APIs to interact with the config:
> > +
> > +----
> > +int set_hook_commands(const char *hookname, struct string_list *commands,
> > +	enum config_scope scope);
> > +int set_hookcmd(const char *hookcmd, struct hookcmd options);
> > +
> > +int list_hook_commands(const char *hookname, struct string_list *commands);
> > +int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
> > +----
> > +
> > +`struct hookcmd` is expected to grow in size over time as more functionality is
> > +added to hooks; so that other parts of the code don't need to understand the
> > +config schema, `struct hookcmd` should contain logical values instead of string
> > +pairs.
> > +
> > +----
> > +struct hookcmd {
> > +  const char *name;
> > +  const char *command;
> > +
> > +  /* for illustration only; not planned at present */
> > +  int parallelizable;
> > +  const char *hookcmd_before;
> > +  const char *hookcmd_after;
> > +  enum recovery_action on_fail;
> > +}
> > +----
> > +
> > +=== Builtin
> > +
> > +`builtin/hook.c` is responsible for providing the frontend. It's responsible for
> > +formatting user-provided data and then calling the library API to set the
> > +configs as appropriate. The builtin frontend is not responsible for calling the
> > +config directly, so that other areas of Git can rely on the hook library to
> > +understand the most recent config schema for hooks.
> > +
> > +=== Migration path
> > +
> > +==== Stage 0
> > +
> > +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> > +executing the result. The hook library and builtin do not exist. Hooks only
> > +exist as specially named scripts within `.git/hooks/`.
> > +
> > +==== Stage 1
> > +
> > +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> > +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> > +output. Modifier commands like `git hook add` and `git hook edit` can be
> > +implemented around this time as well.
> > +
> > +==== Stage 2
> > +
> > +`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
> > +end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
> > +opt-in to config-based hooks simply by creating some in their config; otherwise
> > +users should remain unaffected by the change.
> > +
> > +==== Stage 3
> > +
> > +The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
> > +`hook.runHookDir`. Users can opt into managing their hooks completely via the
> > +config this way.
> > +
> > +==== Stage 4
> > +
> > +`.git/hooks` is removed from the template and the hook directory is considered
> > +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> > +not changed, and `find_hook()` is not removed.
> > +
> > +== Caveats
> > +
> > +=== Security and repo config
> > +
> > +Part of the motivation behind this refactor is to mitigate hooks as an attack
> > +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> > +however, as the design stands, users can still provide hooks in the repo-level
> > +config, which is included when a repo is zipped and sent elsewhere.  The
> > +security of the repo-level config is still under discussion; this design
> > +generally assumes the repo-level config is secure, which is not true yet. The
> > +goal is to avoid an overcomplicated design to work around a problem which has
> > +ceased to exist.
> > +
> > +=== Ease of use
> > +
> > +The config schema is nontrivial; that's why it's important for the `git hook`
> > +modifier commands to be usable.
> 
> That's an important point
> 
> > Contributors with UX expertise are encouraged to
> > +share their suggestions.
> > +
> > +== Alternative approaches
> > +
> > +A previous summary of alternatives exists in the
> > +archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
> > +
> > +=== Status quo
> > +
> > +Today users can implement multihooks themselves by using a "trampoline script"
> > +as their hook, and pointing that script to a directory or list of other scripts
> > +they wish to run.
> > +
> > +=== Hook directories
> > +
> > +Other contributors have suggested Git learn about the existence of a directory
> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> > +
> > +=== Comparison table
> > +
> > +.Comparison of alternatives
> > +|===
> > +|Feature |Config-based hooks |Hook directories |Status quo
> > +
> > +|Supports multiple hooks
> > +|Natively
> > +|Natively
> > +|With user effort
> > +
> > +|Safer for zipped repos
> > +|A little
> > +|No
> > +|No
> > +
> > +|Previous hooks just work
> > +|If configured
> > +|Yes
> > +|Yes
> > +
> > +|Can install one hook to many repos
> > +|Yes
> > +|No
> > +|No
> > +
> > +|Discoverability
> > +|Better (in `git help git`)
> > +|Same as before
> > +|Same as before
> > +
> > +|Hard to run unexpected hook
> > +|If configured
> > +|No
> > +|No
> > +|===
> > +
> > +== Future work
> > +
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> > +
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> > +
> > +=== Parallelization
> > +
> > +Users with many hooks might want to run them simultaneously, if the hooks don't
> > +modify state; if one hook depends on another's output, then users will want to
> > +specify those dependencies. If we decide to solve this problem, we may want to
> > +look to modern build systems for inspiration on how to manage dependencies and
> > +parallel tasks.
> > +
> > +=== Securing hookdir hooks
> > +
> > +With the design as written in this doc, it's still possible for a malicious user
> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> > +zip their repo and send it to another user. It may be necessary to teach Git to
> > +only allow one-line hooks like this if they were configured outside of the local
> > +scope;
> 
> Does "disabling one-line hooks" mean "disable passing command line
> arguments to the hook"? I'm not sure that gains much security - can't I
> just set 'hook.pre-receive.command = ./delete-everything' and include
> delete-everything in my malicious repo?

No, I meant something more along the lines of:

- hookcmds cannot be specified at the repo/worktree level
- hook.pre-receive.command's value *must* be a hookcmd name

I'll try to make that clearer next round.
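
To make the two rules above concrete, here is a small sketch of the
check I have in mind, written as plain shell. The function name
`is_allowed_local_value` and the hookcmd names are hypothetical; nothing
like this exists in the series yet. It assumes the list of trusted names
comes from hookcmds defined outside the repo/worktree config.

```shell
#!/bin/sh
# Hypothetical check: accept a hook.<event>.command value from the local
# config only if it names a hookcmd defined in a trusted (non-repo) scope.
# $1 is the value; the remaining arguments are the trusted hookcmd names.
is_allowed_local_value () {
	value=$1
	shift
	for name in "$@"
	do
		if test "$value" = "$name"
		then
			return 0
		fi
	done
	return 1
}

# "perl-linter" is assumed to be a hookcmd from the global config.
if is_allowed_local_value "perl-linter" "perl-linter" "typocheck"
then
	echo "perl-linter: allowed"
fi

# A path or one-liner in the local config names no trusted hookcmd.
if ! is_allowed_local_value "./delete-everything" "perl-linter" "typocheck"
then
	echo "./delete-everything: rejected"
fi
```

Under this sketch, both `./delete-everything` and `rm -rf /` are refused
when they come from the repo config, since neither matches a hookcmd
name.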

Thanks for reading.
 - Emily

> > or another approach, like a list of safe projects, might be useful. It
> > +may also be sufficient (or at least useful) to teach a `hook.disableAll` config
> > +or similar flag to the Git executable.
> > +
> > +=== Submodule inheritance
> > +
> > +It's possible some submodules may want to run the identical set of hooks that
> > +their superrepo runs. While a globally-configured hook set is helpful, it's not
> > +a great solution for users who have multiple repos-with-submodules under the
> > +same user. It would be useful for submodules to learn how to run hooks from
> > +their superrepo's config, or inherit that hook setting.
> > +
> > +== Glossary
> > +
> > +*hook event*
> > +
> > +A point during Git's execution where user scripts may be run, for example,
> > +_prepare-commit-msg_ or _pre-push_.
> > +
> > +*hook command*
> > +
> > +A user script or executable which will be run on one or more hook events.
> > 
> 


* Re: [PATCH v2 3/4] hook: add list command
  2020-05-22 10:27   ` Phillip Wood
@ 2020-06-09 21:49     ` Emily Shaffer
  2020-08-17 13:36       ` Phillip Wood
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-06-09 21:49 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, May 22, 2020 at 11:27:44AM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> On 21/05/2020 19:54, Emily Shaffer wrote:
> > Teach 'git hook list <hookname>', which checks the known configs in
> > order to create an ordered list of hooks to run on a given hook event.
> > 
> > Multiple commands can be specified for a given hook by providing
> > multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> > run in config order. If more properties need to be set on a given hook
> > in the future, commands can also be specified by providing
> > "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> > <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> > "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> > 
> > For example:
> > 
> >   $ git config --list | grep ^hook
> >   hook.pre-commit.command=baz
> >   hook.pre-commit.command=~/bar.sh
> >   hookcmd.baz.command=~/baz/from/hookcmd.sh
> > 
> >   $ git hook list pre-commit
> >   ~/baz/from/hookcmd.sh
> >   ~/bar.sh
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  Documentation/git-hook.txt    | 37 +++++++++++++-
> >  Makefile                      |  1 +
> >  builtin/hook.c                | 55 +++++++++++++++++++--
> >  hook.c                        | 90 +++++++++++++++++++++++++++++++++++
> >  hook.h                        | 15 ++++++
> >  t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
> >  6 files changed, 242 insertions(+), 7 deletions(-)
> >  create mode 100644 hook.c
> >  create mode 100644 hook.h
> > 
> > diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> > index 2d50c414cc..e458586e96 100644
> > --- a/Documentation/git-hook.txt
> > +++ b/Documentation/git-hook.txt
> > @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
> >  SYNOPSIS
> >  --------
> >  [verse]
> > -'git hook'
> > +'git hook' list <hook-name>
> >  
> >  DESCRIPTION
> >  -----------
> >  You can list, add, and modify hooks with this command.
> >  
> > +This command parses the default configuration files for sections "hook" and
> > +"hookcmd". "hook" is used to describe the commands which will be run during a
> > +particular hook event; commands are run in config order. "hookcmd" is used to
> > +describe attributes of a specific command. If additional attributes don't need
> > +to be specified, a command to run can be specified directly in the "hook"
> > +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> > +provided value directly. For example:
> > +
> > +Global config
> > +----
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> > +----
> > +
> > +Local config
> > +----
> > +  [hook "prepare-commit-msg"]
> > +    command = "linter"
> > +  [hook "post-commit"]
> > +    command = "python ~/run-test-suite.py"
> > +----
> > +
> > +COMMANDS
> > +--------
> > +
> > +list <hook-name>::
> > +
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> > +
> >  GIT
> >  ---
> >  Part of the linkgit:git[1] suite
> > diff --git a/Makefile b/Makefile
> > index fce6ee154e..b7bbf3be7b 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -894,6 +894,7 @@ LIB_OBJS += grep.o
> >  LIB_OBJS += hashmap.o
> >  LIB_OBJS += help.o
> >  LIB_OBJS += hex.o
> > +LIB_OBJS += hook.o
> >  LIB_OBJS += ident.o
> >  LIB_OBJS += interdiff.o
> >  LIB_OBJS += json-writer.o
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index b2bbc84d4d..cfd8e388bd 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -1,21 +1,68 @@
> >  #include "cache.h"
> >  
> >  #include "builtin.h"
> > +#include "config.h"
> > +#include "hook.h"
> >  #include "parse-options.h"
> > +#include "strbuf.h"
> >  
> >  static const char * const builtin_hook_usage[] = {
> > -	N_("git hook"),
> > +	N_("git hook list <hookname>"),
> >  	NULL
> >  };
> >  
> > -int cmd_hook(int argc, const char **argv, const char *prefix)
> > +static int list(int argc, const char **argv, const char *prefix)
> >  {
> > -	struct option builtin_hook_options[] = {
> > +	struct list_head *head, *pos;
> > +	struct hook *item;
> > +	struct strbuf hookname = STRBUF_INIT;
> > +
> > +	struct option list_options[] = {
> >  		OPT_END(),
> >  	};
> >  
> > -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> > +	argc = parse_options(argc, argv, prefix, list_options,
> >  			     builtin_hook_usage, 0);
> >  
> > +	if (argc < 1) {
> > +		usage_msg_opt("a hookname must be provided to operate on.",
> > +			      builtin_hook_usage, list_options);
> > +	}
> > +
> > +	strbuf_addstr(&hookname, argv[0]);
> > +
> > +	head = hook_list(&hookname);
> > +
> > +	if (!head) {
> > +		printf(_("no commands configured for hook '%s'\n"),
> > +		       hookname.buf);
> > +		return 0;
> > +	}
> > +
> > +	list_for_each(pos, head) {
> > +		item = list_entry(pos, struct hook, list);
> > +		if (item)
> > +			printf("%s:\t%s\n",
> > +			       config_scope_name(item->origin),
> > +			       item->command.buf);
> > +	}
> > +
> > +	clear_hook_list();
> > +	strbuf_release(&hookname);
> > +
> >  	return 0;
> >  }
> > +
> > +int cmd_hook(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct option builtin_hook_options[] = {
> > +		OPT_END(),
> > +	};
> > +	if (argc < 2)
> > +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> > +
> > +	if (!strcmp(argv[1], "list"))
> > +		return list(argc - 1, argv + 1, prefix);
> > +
> > +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> > +}
> > diff --git a/hook.c b/hook.c
> > new file mode 100644
> > index 0000000000..9dfc1a885e
> > --- /dev/null
> > +++ b/hook.c
> > @@ -0,0 +1,90 @@
> > +#include "cache.h"
> > +
> > +#include "hook.h"
> > +#include "config.h"
> > +
> > +static LIST_HEAD(hook_head);
> > +
> > +void free_hook(struct hook *ptr)
> > +{
> > +	if (ptr) {
> > +		strbuf_release(&ptr->command);
> > +		free(ptr);
> > +	}
> > +}
> > +
> > +static void emplace_hook(struct list_head *pos, const char *command)
> > +{
> > +	struct hook *to_add = malloc(sizeof(struct hook));
> > +	to_add->origin = current_config_scope();
> > +	strbuf_init(&to_add->command, 0);
> > +	strbuf_addstr(&to_add->command, command);
> > +
> > +	list_add_tail(&to_add->list, pos);
> > +}
> > +
> > +static void remove_hook(struct list_head *to_remove)
> > +{
> > +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> > +	list_del(to_remove);
> > +	free_hook(hook_to_remove);
> > +}
> > +
> > +void clear_hook_list(void)
> > +{
> > +	struct list_head *pos, *tmp;
> > +	list_for_each_safe(pos, tmp, &hook_head)
> > +		remove_hook(pos);
> > +}
> > +
> > +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> > +{
> > +	const char *hook_key = hook_key_cb;
> > +
> > +	if (!strcmp(key, hook_key)) {
> > +		const char *command = value;
> > +		struct strbuf hookcmd_name = STRBUF_INIT;
> > +		struct list_head *pos = NULL, *tmp = NULL;
> > +
> > +		/* Check if a hookcmd with that name exists. */
> > +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> > +		git_config_get_value(hookcmd_name.buf, &command);
> 
> This looks dodgy to me. This code is called by git_config() as it parses
> the config files, so it has not had a chance to fully populate the
> config cache used by git_config_get_value(). I think the test below
> passes because the hookcmd setting is set in the global file and the
> hook setting is set in the local file, so we have already parsed the
> hookcmd setting when we come to look it up. The same comment applies to
> the hypothetical ordering config mentioned below. I think it would be
> better to collect the list of hook.<event>.command settings in this
> callback and then look up any hookcmd settings for those hook commands
> after we've finished reading all of the config files.

git_config_get_value() calls repo_read_config(the_repository) if the
config hasn't been fully parsed yet, so I think what you're worrying
about is not an issue. It's ugly, I agree, but since the new hotness
(git_config_get_value() and friends) doesn't offer the same
functionality as the old solution (config origin) this seemed like an
okay approach. As I understand it, moving this hookcmd lookup section
outside of the config callback will save us up to one additional pass
through the configs, at the expense of a more convoluted code path.
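
For reference, here is roughly what I understand the two-pass shape to
be, as a toy sketch. It uses a flat key/value array in place of the real
config machinery, so 'struct kv', collect_commands() and
resolve_hookcmds() are made-up names for illustration and not anything
in config.h:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the parsed config files; not Git's config API. */
struct kv {
	const char *key;
	const char *value;
};

#define MAX_CMDS 16

/* Pass 1: gather every hook.<event>.command value, in config order. */
size_t collect_commands(const struct kv *cfg, size_t n, const char *event,
			const char *out[MAX_CMDS])
{
	char key[128];
	size_t i, found = 0;

	snprintf(key, sizeof(key), "hook.%s.command", event);
	for (i = 0; i < n; i++) {
		if (!strcmp(cfg[i].key, key)) {
			assert(found < MAX_CMDS);
			out[found++] = cfg[i].value;
		}
	}
	return found;
}

/*
 * Pass 2: runs after every file has been read. A value naming a
 * hookcmd is replaced by that hookcmd's command (last definition
 * wins); anything else is kept as the literal command to run.
 */
void resolve_hookcmds(const struct kv *cfg, size_t n,
		      const char *cmds[], size_t n_cmds)
{
	char key[128];
	size_t i, j;

	for (i = 0; i < n_cmds; i++) {
		snprintf(key, sizeof(key), "hookcmd.%s.command", cmds[i]);
		for (j = 0; j < n; j++)
			if (!strcmp(cfg[j].key, key))
				cmds[i] = cfg[j].value;
	}
}
```

In this shape a hookcmd resolves regardless of where it sits relative to
the hook line that names it, and the first pass never has to consult the
config cache.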

> 
> > +
> > +		if (!command)
> > +			BUG("git_config_get_value overwrote a string it shouldn't have");
> > +
> > +		/*
> > +		 * TODO: implement an option-getting callback, e.g.
> > +		 *   get configs by pattern hookcmd.$value.*
> > +		 *   for each key+value, do_callback(key, value, cb_data)
> > +		 */
> > +
> > +		list_for_each_safe(pos, tmp, &hook_head) {
> > +			struct hook *hook = list_entry(pos, struct hook, list);
> > +			/*
> > +			 * The list of hooks to run can be reordered by being redeclared
> > +			 * in the config. Options about hook ordering should be checked
> > +			 * here.
> > +			 */
> > +			if (0 == strcmp(hook->command.buf, command))
> > +				remove_hook(pos);
> > +		}
> > +		emplace_hook(pos, command);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +struct list_head* hook_list(const struct strbuf* hookname)
> > +{
> > +	struct strbuf hook_key = STRBUF_INIT;
> > +
> > +	if (!hookname)
> > +		return NULL;
> > +
> > +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> > +
> > +	git_config(hook_config_lookup, (void*)hook_key.buf);
> > +
> > +	return &hook_head;
> > +}
> > diff --git a/hook.h b/hook.h
> > new file mode 100644
> > index 0000000000..aaf6511cff
> > --- /dev/null
> > +++ b/hook.h
> > @@ -0,0 +1,15 @@
> > +#include "config.h"
> > +#include "list.h"
> > +#include "strbuf.h"
> > +
> > +struct hook
> > +{
> > +	struct list_head list;
> > +	enum config_scope origin;
> > +	struct strbuf command;
> > +};
> > +
> > +struct list_head* hook_list(const struct strbuf *hookname);
> > +
> > +void free_hook(struct hook *ptr);
> > +void clear_hook_list(void);
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 34b0df5216..4e46d7dd4e 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
> >  
> >  . ./test-lib.sh
> >  
> > -test_expect_success 'git hook command does not crash' '
> > -	git hook
> > +test_expect_success 'git hook rejects commands without a mode' '
> > +	test_must_fail git hook pre-commit
> > +'
> > +
> > +
> > +test_expect_success 'git hook rejects commands without a hookname' '
> > +	test_must_fail git hook list
> > +'
> > +
> > +test_expect_success 'setup hooks in global, and local' '
> > +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> 
> Can I make a plea for the use of test_config please. Writing tests which
> rely on previous tests for their set-up creates a chain of hidden
> dependencies that make it hard to add/alter tests later or run a subset
> of the tests when developing a new patch. t3404-rebase-interactive.sh is
> a prime example of this and I dread touching it.

Sure. I'll redo them.

> 
> > +	git config --add --global hook.pre-commit.command "/path/def"
> > +'
> > +
> > +test_expect_success 'git hook list orders by config order' '
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	EOF
> > +
> > +	git hook list pre-commit >actual &&
> > +	test_cmp expected actual
> > +'
> > +
> > +test_expect_success 'git hook list dereferences a hookcmd' '
> > +	git config --add --local hook.pre-commit.command "abc" &&
> > +	git config --add --global hookcmd.abc.command "/path/abc" &&
> > +
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	local:	/path/abc
> 
> We should make it clear in the documentation that the config origin
> applies to the hook setting, even though we display the hookcmd command
> which is set globally here for the last hook.

One of the suggestions from our UX team last week was to make this list
output clearer to indicate the origin of the command plus the origin of
the hookcmd object; I'll try to straighten this out and make sure the
doc agrees.

 - Emily


* [PATCH v3 0/6] propose config-based hooks
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
                   ` (3 preceding siblings ...)
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
@ 2020-07-28 22:24 ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
                     ` (6 more replies)
  4 siblings, 7 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Hi all,

After taking a few weeks to work on other items, I've got another update
to the config-based hook series. Patches 5 and 6 are RFC - a sketch of
how the hook library could run the appropriate set of hooks. There's
more work to do, which I'll outline later in the cover letter.

Since last time, I took into account review comments, including Dscho's
fixups to make the tests work in Windows. It seems those tests are
passing now, according to the GH Actions run:
https://github.com/nasamuffin/git/actions/runs/186242637

One thing I didn't decide on was the benefit of moving the hookcmd
resolution outside of the hook config pass; that code is unchanged. I
still haven't decided quite which approach I like better, but it's still
on my mind.

In the 'run_hook()' implementation I flipped the 'use_shell' bit, which
by my understanding only uses a shell if it can't find the command in
PATH; this seems like a reasonable approach especially because the code
is so brief, but I'm interested in hearing why I'm wrong or it won't
work well :)

There is still some work I've got locally which isn't quite ready:
 - support for hook.runHookDir. This is turning into a yak shave about
   who decides where and when to display or run the hookdir hook. I
   think I've got it mostly figured out and there's a patch locally, but
   it's not polished.
 - Drafts for 'git hook add' and 'git hook edit'. These features are
   probably the most complicated part of the series, but it's possible
   to use config-based hooks without them. In the interest of getting
   something out for people to try on their own, I'll probably leave
   these for later.
 - Support for stdin redirection to hooks. Since this means we want to
   point the same stdin to multiple processes, I'm thinking it will be
   slightly complicated. Maybe someone has a hint for me? :) Without
   having looked at what's available or not yet, I'm planning to do this
   by reading the whole stdin to memory and then streaming it to each
   process in turn, as I can't seek back to the beginning of the stream
   when I start each new process.
 - Conversion of codebase to use the hook library instead. Partly, this
   is gated on the previous point - there are plenty of callers who,
   instead of using run-command's run_hook_*(), just use find_hook() and
   roll their own struct child_process so they can use stdin/stdout. I
   do plan to consider the hook lib's run_hooks() implementation as
   non-final until I start this process - I'm expecting to learn more
   about what I do and don't have to support when I do this.
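
To make the stdin idea above a bit more concrete, here is a rough sketch
of "read it all, then replay it to each process in turn". It uses plain
POSIX popen() rather than run-command's struct child_process, and
slurp() and feed_each() are hypothetical names, so please read it as an
illustration of the approach and not as the planned implementation:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read all of 'in' into a heap buffer. Caller frees. NULL on error. */
char *slurp(FILE *in, size_t *len)
{
	size_t cap = 4096, used = 0, got;
	char *buf = malloc(cap);

	if (!buf)
		return NULL;
	while ((got = fread(buf + used, 1, cap - used, in)) > 0) {
		used += got;
		if (used == cap) {
			/* Buffer is full: double it before reading more. */
			char *grown = realloc(buf, cap * 2);
			if (!grown) {
				free(buf);
				return NULL;
			}
			buf = grown;
			cap *= 2;
		}
	}
	if (ferror(in)) {
		free(buf);
		return NULL;
	}
	*len = used;
	return buf;
}

/*
 * Replay the same buffer to each command's stdin, one command at a
 * time. Returns how many commands could not be started, could not be
 * fed, or exited non-zero.
 */
int feed_each(const char *buf, size_t len, const char *const *cmds, size_t n)
{
	size_t i;
	int failed = 0;

	assert(cmds || n == 0);
	for (i = 0; i < n; i++) {
		FILE *child = popen(cmds[i], "w");
		int short_write;

		if (!child) {
			failed++;
			continue;
		}
		/*
		 * A child that exits without draining stdin would raise
		 * SIGPIPE here; real code would have to handle that.
		 */
		short_write = fwrite(buf, 1, len, child) != len;
		if (pclose(child) != 0 || short_write)
			failed++;
	}
	return failed;
}
```

The obvious cost is holding the whole input in memory, which may matter
for hooks like pre-receive on large pushes.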

Thanks, all. Hopefully I can do better than a 2-month wait for the
series after this one... although I imagine I cursed myself by saying
that. :)

 - Emily


Emily Shaffer (6):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command
  parse-options: parse into argv_array
  hook: add 'run' subcommand

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                | 107 ++++++
 git.c                                         |   1 +
 hook.c                                        | 132 +++++++
 hook.h                                        |  18 +
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 t/t1360-config-based-hooks.sh                 | 115 ++++++
 14 files changed, 820 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v3 1/6] doc: propose hooks managed by the config
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
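
Note for reviewers (not part of the commit): the doc's Library section
says an unrecognized `hook.runHookDir` value is discarded, so an earlier
recognized value or the default of "yes" still applies. A minimal sketch
of that rule follows. The enum and function names are hypothetical, and
exact lowercase string matching is an assumption; the doc does not say
how the values are compared:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum run_hookdir {
	HOOKDIR_NO,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_WARN,
	HOOKDIR_YES
};

/* Returns 1 and sets *out if 'value' is a recognized setting, else 0. */
int parse_run_hookdir(const char *value, enum run_hookdir *out)
{
	/* Order matches the enum above. */
	static const char *const names[] = {
		"no", "interactive", "warn", "yes"
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!strcmp(value, names[i])) {
			*out = (enum run_hookdir)i;
			return 1;
		}
	}
	return 0;
}

/*
 * 'values' holds every hook.runHookDir setting in config order
 * (system first, then global, and so on). Later recognized values
 * override earlier ones; unrecognized values change nothing.
 */
enum run_hookdir resolve_run_hookdir(const char *const *values, size_t n)
{
	enum run_hookdir result = HOOKDIR_YES; /* documented default */
	size_t i;

	assert(values || n == 0);
	for (i = 0; i < n; i++)
		parse_run_hookdir(values[i], &result);
	return result;
}
```

With system "warn" followed by global "junk" this resolves to warn, and
with "junk" alone it resolves to yes, matching the two cases the doc
spells out.
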
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 2 files changed, 355 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index ecd0b340b1..5483995113 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..c6e762b192
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,354 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the .git/hooks/hookname path as
+the only source of hooks to execute, in a way which is friendly to users with
+multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of these
+subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. Hook event subsections can
+also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
+described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but makes it easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v3 2/6] hook: scaffolding for git-hook subcommand
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 372139f1f2..e13e58e23f 100644
--- a/Makefile
+++ b/Makefile
@@ -1077,6 +1077,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 2f021b97f3..7f3328c63f 100644
--- a/git.c
+++ b/git.c
@@ -517,6 +517,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v3 3/6] hook: add list command
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh
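
To make the ordering rule concrete, here is a minimal, self-contained C
sketch of how the list above could be derived. It is not the code from
this patch: 'struct config_entry', 'resolve_hookcmd()', 'list_hooks()'
and 'MAX_HOOKS' are hypothetical names, and the config is modelled as a
flat array in read order. It assumes that a redeclared command moves to
the end of the list, which is what the tests in this patch expect.

```c
#include <stdio.h>
#include <string.h>

#define MAX_HOOKS 16	/* hypothetical cap, only to keep the sketch simple */

/* One config entry, stored in the order the config files are read. */
struct config_entry {
	const char *key;
	const char *value;
};

/*
 * Return the value of hookcmd.<name>.command if such a key exists (the
 * last definition wins); otherwise return name itself, so it is treated
 * as the command to run.
 */
static const char *resolve_hookcmd(const struct config_entry *cfg, size_t n,
				   const char *name)
{
	char key[256];
	const char *found = name;
	size_t i;

	snprintf(key, sizeof(key), "hookcmd.%s.command", name);
	for (i = 0; i < n; i++)
		if (!strcmp(cfg[i].key, key))
			found = cfg[i].value;
	return found;
}

/*
 * Fill out[] with the commands configured under hook_key, in config
 * order. A command that is declared again moves to the later position.
 * Returns the number of commands stored.
 */
static size_t list_hooks(const struct config_entry *cfg, size_t n,
			 const char *hook_key, const char **out)
{
	size_t count = 0, i, j, k;

	for (i = 0; i < n; i++) {
		const char *command;

		if (strcmp(cfg[i].key, hook_key))
			continue;
		command = resolve_hookcmd(cfg, n, cfg[i].value);

		/* Drop an earlier copy so the redeclaration takes the later slot. */
		for (j = 0; j < count; j++) {
			if (!strcmp(out[j], command)) {
				for (k = j + 1; k < count; k++)
					out[k - 1] = out[k];
				count--;
				break;
			}
		}
		if (count < MAX_HOOKS)
			out[count++] = command;
	}
	return count;
}
```

Feeding the three config lines from the example into 'list_hooks()' with
the key "hook.pre-commit.command" yields the hookcmd path first and
'~/bar.sh' second, matching the output shown above.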

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 37 +++++++++++++-
 Makefile                      |  1 +
 builtin/hook.c                | 55 +++++++++++++++++++--
 hook.c                        | 90 +++++++++++++++++++++++++++++++++++
 hook.h                        | 15 ++++++
 t/t1360-config-based-hooks.sh | 68 +++++++++++++++++++++++++-
 6 files changed, 259 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, and note the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index e13e58e23f..50e7c911d1 100644
--- a/Makefile
+++ b/Makefile
@@ -891,6 +891,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..a0759a4c26 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..9dfc1a885e
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,90 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = malloc(sizeof(struct hook));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	strbuf_addstr(&to_add->command, command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head* hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..46d1ed354a 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v3 4/6] hook: add --porcelain to list command
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (2 preceding siblings ...)
  2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach my $hook (@hooks) {
            system $hook;
    }
  }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 12 ++++++++++++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index a0759a4c26..0d92124ca6 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 46d1ed354a..ebf8f38d68 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -72,4 +72,16 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	$ROOT/path/def
+	$ROOT/path/ghi
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (3 preceding siblings ...)
  2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-29 19:33     ` Junio C Hamano
  2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  6 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an argv_array as a passthrough (that is, including the
argument as well as its value). string_list and argv_array serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
argv_array without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting argv_array would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.
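
As a rough illustration of the semantics described above (each value is
appended, and the negated form clears what was collected so far), here
is a self-contained C sketch. It does not use Git's parse-options or
argv_array code: 'struct str_vec', 'str_vec_push()', 'str_vec_clear()'
and 'collect_value()' are hypothetical stand-ins written for this
example only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for argv_array: a growable, NULL-terminated list. */
struct str_vec {
	char **v;
	size_t nr, alloc;
};

static void str_vec_clear(struct str_vec *sv)
{
	size_t i;

	for (i = 0; i < sv->nr; i++)
		free(sv->v[i]);
	free(sv->v);
	sv->v = NULL;
	sv->nr = sv->alloc = 0;
}

/* Append a copy of s, keeping the array NULL-terminated. */
static int str_vec_push(struct str_vec *sv, const char *s)
{
	size_t len;
	char *copy;

	/* Need room for the new element plus the trailing NULL. */
	if (sv->nr + 2 > sv->alloc) {
		size_t alloc = sv->alloc ? sv->alloc * 2 : 8;
		char **grown = realloc(sv->v, alloc * sizeof(*grown));

		if (!grown)
			return -1;
		sv->v = grown;
		sv->alloc = alloc;
	}

	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, s, len);

	sv->v[sv->nr++] = copy;
	sv->v[sv->nr] = NULL;
	return 0;
}

/*
 * Same shape as the callback in this patch: the negated option clears
 * the list, a missing value is an error, anything else is appended
 * without the option name.
 */
static int collect_value(struct str_vec *sv, const char *arg, int unset)
{
	if (unset) {
		str_vec_clear(sv);
		return 0;
	}
	if (!arg)
		return -1;
	return str_vec_push(sv, arg);
}
```

Calling 'collect_value()' with "--quiet" and then "--format=pretty"
leaves a two-element list holding exactly those strings, which is the
result described for the 'git hook run' example.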

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 2e2e7c10c6..1e97343338 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `argv_array`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 86cd393013..94c2dd397a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,6 +205,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
+{
+	struct argv_array *v = opt->value;
+
+	if (unset) {
+		argv_array_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	argv_array_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 46af942093..e2e2de75c8 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_ARGV_ARRAY(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_argv_array }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_argv_array(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [RFC PATCH v3 6/6] hook: add 'run' subcommand
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (4 preceding siblings ...)
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. If
alternate ordering or parallelism is supported in the future, we should
add command-line knobs for those as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GIT_DIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.
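
The "in series, stdout to stderr, no stdin" behavior can be sketched
outside of Git as follows. This is only an approximation under stated
assumptions: it uses POSIX system(3) and shell redirection instead of
Git's run-command API, and 'run_one()' and 'run_in_series()' are
hypothetical names. Like the loop in this patch, it ORs the exit codes
together, so every command runs even if an earlier one fails.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Run one command through the shell with stdin closed off and stdout
 * redirected to stderr. Returns the exit code, or -1 if the command
 * could not be run or did not exit normally.
 */
static int run_one(const char *command)
{
	char line[1024];
	int len, status;

	len = snprintf(line, sizeof(line), "( %s ) </dev/null 1>&2", command);
	if (len < 0 || (size_t)len >= sizeof(line))
		return -1;

	status = system(line);
	if (status == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Run every command in order; a nonzero result means at least one failed. */
static int run_in_series(const char *const *commands, size_t n)
{
	int rc = 0;
	size_t i;

	for (i = 0; i < n; i++)
		rc |= run_one(commands[i]);
	return rc;
}
```

Because the results are combined with a bitwise OR, the return value
only says whether something failed; it does not identify which command
did.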

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/hook.c                | 30 +++++++++++++++++++++++++
 hook.c                        | 42 +++++++++++++++++++++++++++++++++++
 hook.h                        |  3 +++
 t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++++++
 4 files changed, 103 insertions(+)

diff --git a/builtin/hook.c b/builtin/hook.c
index 0d92124ca6..cd61fad5fb 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "argv-array.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -62,6 +64,32 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct argv_array env_argv = ARGV_ARRAY_INIT;
+	struct argv_array arg_argv = ARGV_ARRAY_INIT;
+
+	struct option run_options[] = {
+		OPT_ARGV_ARRAY('e', "env", &env_argv, N_("var"),
+			       N_("environment variables for hook to use")),
+		OPT_ARGV_ARRAY('a', "arg", &arg_argv, N_("args"),
+			       N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("a hookname must be provided to operate on."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(env_argv.argv, &hookname, &arg_argv);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	struct option builtin_hook_options[] = {
@@ -72,6 +100,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "run"))
+		return run(argc - 1, argv + 1, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index 9dfc1a885e..902e213173 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 static LIST_HEAD(hook_head);
 
@@ -78,6 +79,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
+	const char *legacy_hook_path = NULL;
 
 	if (!hookname)
 		return NULL;
@@ -86,5 +88,45 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)hook_key.buf);
 
+	legacy_hook_path = find_hook(hookname->buf);
+
+	/* TODO: check hook.runHookDir */
+	if (legacy_hook_path)
+		emplace_hook(&hook_head, legacy_hook_path);
+
 	return &hook_head;
 }
+
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct argv_array *args)
+{
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	to_run = hook_list(hookname);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		/* add command */
+		argv_array_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			argv_array_pushv(&hook_proc.args, args->argv);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index aaf6511cff..cf598d6ccb 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "argv-array.h"
 
 struct hook
 {
@@ -10,6 +11,8 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct argv_array *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebf8f38d68..ee8114250d 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
 	test_cmp expected actual
 '
 
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	cat >~/sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm ~/sample-hook.sh" &&
+
+	chmod +x ~/sample-hook.sh &&
+
+	test_config hook.pre-commit.command "~/sample-hook.sh" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* Re: [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
@ 2020-07-29 19:33     ` Junio C Hamano
  2020-07-30 23:41       ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-07-29 19:33 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> parse-options already knows how to read into a string_list, and it knows
> how to read into an argv_array as a passthrough (that is, including the
> argument as well as its value). string_list and argv_array serve similar
> purposes but are somewhat painful to convert between; so, let's teach
> parse-options to read values of string arguments directly into an
> argv_array without preserving the argument name.
>
> This is useful if collecting generic arguments to pass through to
> another command, for example, 'git hook run --arg "--quiet" --arg
> "--format=pretty" some-hook'. The resulting argv_array would contain
> { "--quiet", "--format=pretty" }.
>
> The implementation is based on that of OPT_STRING_LIST.

Be it argv_array or strvec, I think this is a useful thing to do.

I grepped for the users of OPT_STRING_LIST() to see if some of them
are better served by this, but none of them stood out as candidates
that are a particularly good match.

> +int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
> +{
> +	struct argv_array *v = opt->value;
> +
> +	if (unset) {
> +		argv_array_clear(v);
> +		return 0;
> +	}
> +
> +	if (!arg)
> +		return -1;

I think the calling parse_options() loop would catch this negative
return and raise an error, but is it better for this code to stay
silent or would it be better to say that opt->long_name/short_name 
is not a boolean?

> +	argv_array_push(v, arg);
> +	return 0;
> +}


* Re: [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-29 19:33     ` Junio C Hamano
@ 2020-07-30 23:41       ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-07-30 23:41 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Be it argv_array or strvec, I think this is a useful thing to do.
>
> I grepped for the users of OPT_STRING_LIST() to see if some of them
> are better served by this, but none of them stood out as candidates
> that are particularly good match.
>
>> +int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
>> +{
>> +	struct argv_array *v = opt->value;
>> +
>> +	if (unset) {
>> +		argv_array_clear(v);
>> +		return 0;
>> +	}
>> +
>> +	if (!arg)
>> +		return -1;
>
> I think the calling parse_options() loop would catch this negative
> return and raise an error, but is it better for this code to stay
> silent or would it be better to say that opt->long_name/short_name 
> is not a boolean?

I am still waiting for this to be answered, but I queued the whole
topic, these last two steps included, just to see how bad adjusting
to the strvec API migration would be.  It wasn't too bad.

I would not recommend that you, or other contributors who use the
argv-array API in their topics, build on top of jk/strvec just yet, as
I expect it to go through at least one more reroll to update the
details.

Thanks.



* Re: [PATCH v2 3/4] hook: add list command
  2020-06-09 21:49     ` Emily Shaffer
@ 2020-08-17 13:36       ` Phillip Wood
  0 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-08-17 13:36 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily

Sorry it has taken me so long to reply.

On 09/06/2020 22:49, Emily Shaffer wrote:
> On Fri, May 22, 2020 at 11:27:44AM +0100, Phillip Wood wrote:
>>
>> Hi Emily
>>
>> On 21/05/2020 19:54, Emily Shaffer wrote:
>>> [...]
>>> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
>>> +{
>>> +	const char *hook_key = hook_key_cb;
>>> +
>>> +	if (!strcmp(key, hook_key)) {
>>> +		const char *command = value;
>>> +		struct strbuf hookcmd_name = STRBUF_INIT;
>>> +		struct list_head *pos = NULL, *tmp = NULL;
>>> +
>>> +		/* Check if a hookcmd with that name exists. */
>>> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
>>> +		git_config_get_value(hookcmd_name.buf, &command);
>>
>> This looks dodgy to me. This code is called by git_config() as it parses
>> the config files, so it has not had a chance to fully populate the
>> config cache used by git_config_get_value(). I think the test below
>> passes because the hookcmd setting is set in the global file and the
>> hook setting is set in the local file, so we have already parsed the
>> hookcmd setting when we come to look it up. The same comment applies to
>> the hypothetical ordering config mentioned below. I think it would be
>> better to collect the list of hook.<event>.command settings in this
>> callback and then look up any hookcmd settings for those hook commands
>> after we've finished reading all of the config files.
> 
> git_config_get_value() calls repo_read_config(the_repository) if the
> config hasn't been fully parsed yet, so I think what you're worrying
> about is not an issue. It's ugly, I agree, but since the new hotness
> (git_config_get_value() and friends) doesn't offer the same
> functionality as the old solution (config origin) this seemed like an
> okay approach. As I understand it, moving this hookcmd lookup section
> outside of the config callback will save us up to one additional pass
> through the configs, at the expense of a more convoluted code path.

Oh, I didn't realize that; thanks for explaining it. Below you mention
showing the origin for hookcmds as well as the origin of the command,
which would mean having to change this code anyway, I think.
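
For readers following along, the behavior described in the quoted reply
(a lookup made from inside the parsing callback triggers a full read
first, so a hookcmd defined later is still found) can be modelled with
the small C sketch below. It is not Git's config code: 'config_file',
'fill_cache()', 'cached_get()', 'for_each_entry()' and 'resolve_cb()'
are hypothetical names, and the two entries are made up for the
example.

```c
#include <stdio.h>
#include <string.h>

struct entry {
	const char *key;
	const char *value;
};

/* Hypothetical stand-in for the config, in read order. */
static const struct entry config_file[] = {
	{ "hook.pre-commit.command", "abc" },
	{ "hookcmd.abc.command", "/path/abc" },	/* defined after its use */
};
#define NR_ENTRIES (sizeof(config_file) / sizeof(config_file[0]))

static const struct entry *cache[NR_ENTRIES];
static size_t cache_nr;
static int cache_ready;

/* Read everything into the cache in one pass. */
static void fill_cache(void)
{
	size_t i;

	for (i = 0; i < NR_ENTRIES; i++)
		cache[cache_nr++] = &config_file[i];
	cache_ready = 1;
}

/* Cached lookup: fills the cache on first use; the last value wins. */
static const char *cached_get(const char *key)
{
	const char *value = NULL;
	size_t i;

	if (!cache_ready)
		fill_cache();
	for (i = 0; i < cache_nr; i++)
		if (!strcmp(cache[i]->key, key))
			value = cache[i]->value;
	return value;
}

typedef int (*config_fn)(const char *key, const char *value, void *data);

/* Callback-style iteration over the entries, one at a time. */
static int for_each_entry(config_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < NR_ENTRIES; i++)
		if (fn(config_file[i].key, config_file[i].value, data))
			return -1;
	return 0;
}

struct resolved {
	const char *command;
};

/* Resolve a hook command, dereferencing a hookcmd of the same name. */
static int resolve_cb(const char *key, const char *value, void *data)
{
	struct resolved *r = data;
	const char *target;
	char name[256];

	if (strcmp(key, "hook.pre-commit.command"))
		return 0;

	snprintf(name, sizeof(name), "hookcmd.%s.command", value);
	target = cached_get(name);	/* forces the full read if needed */
	r->command = target ? target : value;
	return 0;
}
```

The cost is the extra pass made by 'fill_cache()' the first time the
callback performs a lookup, which corresponds to the "one additional
pass through the configs" mentioned above.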

>>
>>> +
>>> +		if (!command)
>>> +			BUG("git_config_get_value overwrote a string it shouldn't have");
>>> +
>>> +		/*
>>> +		 * TODO: implement an option-getting callback, e.g.
>>> +		 *   get configs by pattern hookcmd.$value.*
>>> +		 *   for each key+value, do_callback(key, value, cb_data)
>>> +		 */
>>> +
>>> +		list_for_each_safe(pos, tmp, &hook_head) {
>>> +			struct hook *hook = list_entry(pos, struct hook, list);
>>> +			/*
>>> +			 * The list of hooks to run can be reordered by being redeclared
>>> +			 * in the config. Options about hook ordering should be checked
>>> +			 * here.
>>> +			 */
>>> +			if (0 == strcmp(hook->command.buf, command))
>>> +				remove_hook(pos);
>>> +		}
>>> +		emplace_hook(pos, command);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +struct list_head* hook_list(const struct strbuf* hookname)
>>> +{
>>> +	struct strbuf hook_key = STRBUF_INIT;
>>> +
>>> +	if (!hookname)
>>> +		return NULL;
>>> +
>>> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
>>> +
>>> +	git_config(hook_config_lookup, (void*)hook_key.buf);
>>> +
>>> +	return &hook_head;
>>> +}
>>> diff --git a/hook.h b/hook.h
>>> new file mode 100644
>>> index 0000000000..aaf6511cff
>>> --- /dev/null
>>> +++ b/hook.h
>>> @@ -0,0 +1,15 @@
>>> +#include "config.h"
>>> +#include "list.h"
>>> +#include "strbuf.h"
>>> +
>>> +struct hook
>>> +{
>>> +	struct list_head list;
>>> +	enum config_scope origin;
>>> +	struct strbuf command;
>>> +};
>>> +
>>> +struct list_head* hook_list(const struct strbuf *hookname);
>>> +
>>> +void free_hook(struct hook *ptr);
>>> +void clear_hook_list(void);
>>> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
>>> index 34b0df5216..4e46d7dd4e 100755
>>> --- a/t/t1360-config-based-hooks.sh
>>> +++ b/t/t1360-config-based-hooks.sh
>>> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>>>   
>>>   . ./test-lib.sh
>>>   
>>> -test_expect_success 'git hook command does not crash' '
>>> -	git hook
>>> +test_expect_success 'git hook rejects commands without a mode' '
>>> +	test_must_fail git hook pre-commit
>>> +'
>>> +
>>> +
>>> +test_expect_success 'git hook rejects commands without a hookname' '
>>> +	test_must_fail git hook list
>>> +'
>>> +
>>> +test_expect_success 'setup hooks in global, and local' '
>>> +	git config --add --local hook.pre-commit.command "/path/ghi" &&
>>
>> Can I make a plea for the use of test_config please. Writing tests which
>> rely on previous tests for their set-up creates a chain of hidden
>> dependencies that make it hard to add/alter tests later or run a subset
>> of the tests when developing a new patch. t3404-rebase-interactive.sh is
>> a prime example of this and I dread touching it.
> 
> Sure. I'll redo them.

That's great, thanks

Best Wishes

Phillip

>>
>>> +	git config --add --global hook.pre-commit.command "/path/def"
>>> +'
>>> +
>>> +test_expect_success 'git hook list orders by config order' '
>>> +	cat >expected <<-\EOF &&
>>> +	global:	/path/def
>>> +	local:	/path/ghi
>>> +	EOF
>>> +
>>> +	git hook list pre-commit >actual &&
>>> +	test_cmp expected actual
>>> +'
>>> +
>>> +test_expect_success 'git hook list dereferences a hookcmd' '
>>> +	git config --add --local hook.pre-commit.command "abc" &&
>>> +	git config --add --global hookcmd.abc.command "/path/abc" &&
>>> +
>>> +	cat >expected <<-\EOF &&
>>> +	global:	/path/def
>>> +	local:	/path/ghi
>>> +	local:	/path/abc
>>
>> We should make it clear in the documentation that the config origin
>> applies to the hook setting, even though we display the hookcmd command
>> which is set globally here for the last hook.
> 
> One of the suggestions from our UX team last week was to make this list
> output clearer to indicate the origin of the command plus the origin of
> the hookcmd object; I'll try to straighten this out and make sure the
> doc agrees.
> 
>   - Emily
> 


* [PATCH v4 0/9] propose config-based hooks
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (5 preceding siblings ...)
  2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-09  0:49   ` Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
                       ` (10 more replies)
  6 siblings, 11 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v3, the biggest change is the conversion of commit hooks to use the new
hook machinery. The first change ("commit: use config-based hooks") is the
important part; the second change ("run_commit_hook: take strvec instead of varargs")
is probably subjective, but I thought it was a decent tech debt reduction.

I wanted to send this reroll quickly since I had promised it in standup last
week, but I've got pretty good progress locally on the patch for configuring
"hook.runHookDir"; I'm planning to send that soon, probably this week.
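
For context, the resolution rule the design doc in patch 1/9 proposes for `hook.runHookDir` (unrecognized values are discarded, the last recognized value wins, and the default is "yes") could be sketched roughly as below. This is a standalone illustration, not the pending patch: `enum run_hookdir`, `parse_run_hookdir()` and `resolve_run_hookdir()` are hypothetical names, and the settings are passed in as a plain array in config order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum; the order matches the names[] table below. */
enum run_hookdir {
	HOOKDIR_NO,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_WARN,
	HOOKDIR_YES
};

/* Returns 1 and sets *out if `value` is recognized, 0 otherwise. */
static int parse_run_hookdir(const char *value, enum run_hookdir *out)
{
	static const char *names[] = { "no", "interactive", "warn", "yes" };
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!strcmp(value, names[i])) {
			*out = (enum run_hookdir)i;
			return 1;
		}
	}
	return 0;
}

/*
 * `values` holds every hook.runHookDir setting in config order
 * (system first, then global, then local, and so on).
 */
static enum run_hookdir resolve_run_hookdir(const char **values, size_t nr)
{
	enum run_hookdir result = HOOKDIR_YES; /* documented default */
	size_t i;

	assert(values || !nr);
	for (i = 0; i < nr; i++)
		parse_run_hookdir(values[i], &result); /* unrecognized: unchanged */
	return result;
}
```

Under these assumptions, "warn" at system level followed by "junk" at global level resolves to "warn", and a lone "junk" falls back to the default "yes", matching the examples in the doc.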

 - Emily

Emily Shaffer (9):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace run-command.h:find_hook
  commit: use config-based hooks
  run_commit_hook: take strvec instead of varargs

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/commit.c                              |  49 +--
 builtin/hook.c                                | 107 ++++++
 builtin/merge.c                               |  23 +-
 commit.c                                      |  12 +-
 commit.h                                      |   5 +-
 git.c                                         |   1 +
 hook.c                                        | 155 ++++++++
 hook.h                                        |  19 +
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 sequencer.c                                   |  15 +-
 t/t1360-config-based-hooks.sh                 | 115 ++++++
 ...3-pre-commit-and-pre-merge-commit-hooks.sh |  13 +
 20 files changed, 918 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-23 22:59       ` Jonathan Tan
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
                       ` (9 subsequent siblings)
  10 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 2 files changed, 355 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..c6e762b192
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,354 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the .git/hooks/hookname path as
+the only source of hooks to execute, in a way which is friendly to users with
+multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of these
+subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. Hook event subsections can
+also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
+described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+};
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but makes it easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-10-05 23:24       ` Jonathan Nieder
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
                       ` (8 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..6eee75555e 100644
--- a/Makefile
+++ b/Makefile
@@ -1077,6 +1077,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 8bd1d7551d..1cdb3221a5 100644
--- a/git.c
+++ b/git.c
@@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
                         ` (3 more replies)
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
                       ` (7 subsequent siblings)
  10 siblings, 4 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh
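
Besides running hooks in config order, the lookup callback in this patch drops an earlier entry when the same command is declared again, so a redeclared command moves to the end of the list. A rough standalone sketch of that ordering rule follows. It uses a plain array instead of `list.h`, and `add_or_move_to_end()` is a hypothetical helper that is not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append `command` to the ordered list `cmds` (currently `nr` entries).
 * If the command is already present, the earlier entry is dropped first,
 * so a redeclared command moves to the end. Returns the new count.
 * The caller must provide room for nr + 1 entries.
 */
static size_t add_or_move_to_end(const char **cmds, size_t nr,
				 const char *command)
{
	size_t i, kept = 0;

	assert(cmds && command);
	for (i = 0; i < nr; i++)
		if (strcmp(cmds[i], command))
			cmds[kept++] = cmds[i]; /* keep entries that differ */
	cmds[kept++] = command;
	return kept;
}
```

Feeding it "/path/def", "/path/ghi", "/path/def" in that order leaves "/path/ghi" first and "/path/def" last.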

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    |  37 +++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  55 ++++++++++++++++--
 hook.c                        | 102 ++++++++++++++++++++++++++++++++++
 hook.h                        |  15 +++++
 t/t1360-config-based-hooks.sh |  68 ++++++++++++++++++++++-
 6 files changed, 271 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, each shown with the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the scope of the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 6eee75555e..804de45b16 100644
--- a/Makefile
+++ b/Makefile
@@ -890,6 +890,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..a0759a4c26 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..b006950eb8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,102 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+/*
+ * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
+ * background at the same time - which might be ok, or might not.
+ *
+ * Maybe it's better to cache a list head per hookname, since we can probably
+ * guess that the hook list won't change during a user-initiated operation. For
+ * now, within hook_list(), call clear_hook_list() at the outset.
+ */
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = xmalloc(sizeof(*to_add));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	/* store the command verbatim so the strcmp() duplicate check matches */
+	strbuf_addstr(&to_add->command, command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	/* hook_head is stateful */
+	clear_hook_list();
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head* hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..46d1ed354a 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 4/9] hook: add --porcelain to list command
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (2 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-28 19:29       ` Josh Steadmon
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
                       ` (6 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach my $hook (@hooks) {
            system $hook;
    }
  }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 12 ++++++++++++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index a0759a4c26..0d92124ca6 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 46d1ed354a..ebf8f38d68 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -72,4 +72,16 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	$ROOT/path/def
+	$ROOT/path/ghi
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 5/9] parse-options: parse into strvec
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (3 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-10-05 23:30       ` Jonathan Nieder
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
                       ` (5 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.
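
As a rough illustration of the semantics described above, here is a
standalone sketch. It does not use Git's headers: `struct str_vec`,
`str_vec_push()`, `str_vec_clear()` and `collect_option_value()` are
hypothetical stand-ins for strvec and the new callback. It only shows
the intended behavior: a value is appended, a missing value is an
error, and the negated form of the option clears earlier values.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Git's struct strvec; not the real API. */
struct str_vec {
	char **v;          /* NULL-terminated once anything is pushed */
	size_t nr, alloc;
};

static void str_vec_clear(struct str_vec *sv)
{
	for (size_t i = 0; i < sv->nr; i++)
		free(sv->v[i]);
	free(sv->v);
	sv->v = NULL;
	sv->nr = sv->alloc = 0;
}

static int str_vec_push(struct str_vec *sv, const char *s)
{
	/* Keep room for the trailing NULL so v can be used as an argv. */
	if (sv->nr + 2 > sv->alloc) {
		size_t n = sv->alloc ? sv->alloc * 2 : 8;
		char **grown = realloc(sv->v, n * sizeof(*grown));
		if (!grown)
			return -1;
		sv->v = grown;
		sv->alloc = n;
	}
	sv->v[sv->nr] = malloc(strlen(s) + 1);
	if (!sv->v[sv->nr])
		return -1;
	strcpy(sv->v[sv->nr], s);
	sv->v[++sv->nr] = NULL;
	return 0;
}

/*
 * Same shape as the option callback: "unset" (the --no-<opt> form)
 * clears earlier values, a missing argument is an error, and any
 * other value is appended without the option name.
 */
static int collect_option_value(struct str_vec *sv, const char *arg, int unset)
{
	if (unset) {
		str_vec_clear(sv);
		return 0;
	}
	if (!arg)
		return -1;
	return str_vec_push(sv, arg);
}
```

Feeding this the two values from the example invocation above leaves
the vector holding "--quiet" and "--format=pretty", in that order,
followed by a terminating NULL.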

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..b4f1fc4a1a 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index d9d3b0819f..d2b8b7b98a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,6 +205,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 46af942093..177259488b 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (4 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
                         ` (2 more replies)
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
                       ` (4 subsequent siblings)
  10 siblings, 3 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. As alternate
ordering or parallelism is supported in the future, we should add knobs
to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into a strvec, then expanded with
'expand_user_path()'.
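
The "in config order, in series" behavior can be sketched in
isolation as below. This is only an illustration: `command_runner`,
`run_in_series()`, `demo_runner()` and `calls_seen` are hypothetical
names, and the runner is a stub where the real code would spawn a
child process. The point is that every command runs even when an
earlier one fails, and the statuses are OR-ed together.

```c
#include <stddef.h>

/* Hypothetical callback: runs one command line, returns its status. */
typedef int (*command_runner)(const char *command);

/*
 * Run every command in list order. A failure does not stop the loop;
 * the statuses are OR-ed so that any nonzero status makes the overall
 * result nonzero.
 */
static int run_in_series(const char *const *commands, size_t nr,
			 command_runner runner)
{
	int rc = 0;

	for (size_t i = 0; i < nr; i++)
		rc |= runner(commands[i]);
	return rc;
}

/*
 * Stub runner for illustration only: counts invocations and "fails"
 * for any command that starts with '!'.
 */
static int calls_seen;

static int demo_runner(const char *command)
{
	calls_seen++;
	return command[0] == '!' ? 1 : 0;
}
```

One consequence of OR-ing the statuses is that the caller learns
whether anything failed, but not which command did or with what exit
code.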

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/hook.c                | 30 ++++++++++++++++++++
 hook.c                        | 52 ++++++++++++++++++++++++++++++++---
 hook.h                        |  3 ++
 t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
 4 files changed, 109 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 0d92124ca6..a8f8b03699 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -62,6 +64,32 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct strvec envs = STRVEC_INIT;
+	struct strvec args = STRVEC_INIT;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &envs, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("a hookname must be provided to operate on."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(envs.v, &hookname, &args);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	struct option builtin_hook_options[] = {
@@ -72,6 +100,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "run"))
+		return run(argc - 1, argv + 1, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index b006950eb8..0dab981681 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 /*
  * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
@@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void emplace_hook(struct list_head *pos, const char *command)
+static void emplace_hook(struct list_head *pos, const char *command, int quoted)
 {
 	struct hook *to_add = malloc(sizeof(struct hook));
 	to_add->origin = current_config_scope();
 	strbuf_init(&to_add->command, 0);
-	/* even with use_shell, run_command() needs quotes */
-	strbuf_addf(&to_add->command, "'%s'", command);
+	if (quoted)
+		strbuf_addf(&to_add->command, "'%s'", command);
+	else
+		strbuf_addstr(&to_add->command, command);
 
 	list_add_tail(&to_add->list, pos);
 }
@@ -78,7 +81,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 			if (0 == strcmp(hook->command.buf, command))
 				remove_hook(pos);
 		}
-		emplace_hook(pos, command);
+		emplace_hook(pos, command, 0);
 	}
 
 	return 0;
@@ -87,6 +90,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
+	const char *legacy_hook_path = NULL;
 
 	if (!hookname)
 		return NULL;
@@ -98,5 +102,45 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)hook_key.buf);
 
+	legacy_hook_path = find_hook(hookname->buf);
+
+	/* TODO: check hook.runHookDir */
+	if (legacy_hook_path)
+		emplace_hook(&hook_head, legacy_hook_path, 1);
+
 	return &hook_head;
 }
+
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct strvec *args)
+{
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	to_run = hook_list(hookname);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			strvec_pushv(&hook_proc.args, args->v);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index aaf6511cff..d020788a6b 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -10,6 +11,8 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct strvec *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebf8f38d68..ee8114250d 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
 	test_cmp expected actual
 '
 
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	cat >~/sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm ~/sample-hook.sh" &&
+
+	chmod +x ~/sample-hook.sh &&
+
+	test_config hook.pre-commit.command "~/sample-hook.sh" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (5 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
                         ` (2 more replies)
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
                       ` (3 subsequent siblings)
  10 siblings, 3 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 hook.c | 9 +++++++++
 hook.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/hook.c b/hook.c
index 0dab981681..7c7b922369 100644
--- a/hook.c
+++ b/hook.c
@@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	return &hook_head;
 }
 
+int hook_exists(const char *hookname)
+{
+	const char *value = NULL;
+	struct strbuf hook_key = STRBUF_INIT;
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
+}
+
 int run_hooks(const char *const *env, const struct strbuf *hookname,
 	      const struct strvec *args)
 {
diff --git a/hook.h b/hook.h
index d020788a6b..d94511b609 100644
--- a/hook.h
+++ b/hook.h
@@ -11,6 +11,7 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int hook_exists(const char *hookname);
 int run_hooks(const char *const *env, const struct strbuf *hookname,
 	      const struct strvec *args);
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (6 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-10 13:50       ` Phillip Wood
  2020-09-23 23:47       ` Jonathan Tan
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
                       ` (2 subsequent siblings)
  10 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

As part of the adoption of config-based hooks, teach run_commit_hook()
to call hook.h instead of run-command.h. This covers 'pre-commit',
'commit-msg', and 'prepare-commit-msg'. Additionally, ask the hook
library - not run-command - whether any hooks will be run, as it's
possible hooks may exist in the config but not the hookdir.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/commit.c                                 |  3 ++-
 builtin/merge.c                                  |  3 ++-
 commit.c                                         | 13 ++++++++++++-
 t/t7503-pre-commit-and-pre-merge-commit-hooks.sh | 13 +++++++++++++
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 69ac78d5e5..a19c6478eb 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -36,6 +36,7 @@
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "hook.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
@@ -985,7 +986,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (!no_verify && find_hook("pre-commit")) {
+	if (!no_verify && hook_exists("pre-commit")) {
 		/*
 		 * Re-read the index as pre-commit hook could have updated it,
 		 * and write it out as a tree.  We must do this before we invoke
diff --git a/builtin/merge.c b/builtin/merge.c
index 74829a838e..c1a9d0083d 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "hook.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -829,7 +830,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	 * and write it out as a tree.  We must do this before we invoke
 	 * the editor and after we invoke run_status above.
 	 */
-	if (find_hook("pre-merge-commit"))
+	if (hook_exists("pre-merge-commit"))
 		discard_cache();
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
diff --git a/commit.c b/commit.c
index 4ce8cb38d5..c7a243e848 100644
--- a/commit.c
+++ b/commit.c
@@ -21,6 +21,7 @@
 #include "commit-reach.h"
 #include "run-command.h"
 #include "shallow.h"
+#include "hook.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1632,8 +1633,13 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 {
 	struct strvec hook_env = STRVEC_INIT;
 	va_list args;
+	const char *arg;
+	struct strvec hook_args = STRVEC_INIT;
+	struct strbuf hook_name = STRBUF_INIT;
 	int ret;
 
+	strbuf_addstr(&hook_name, name);
+
 	strvec_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
 
 	/*
@@ -1643,9 +1649,14 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 		strvec_push(&hook_env, "GIT_EDITOR=:");
 
 	va_start(args, name);
-	ret = run_hook_ve(hook_env.v, name, args);
+	while ((arg = va_arg(args, const char *)))
+		strvec_push(&hook_args, arg);
 	va_end(args);
+
+	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
 	strvec_clear(&hook_env);
+	strvec_clear(&hook_args);
+	strbuf_release(&hook_name);
 
 	return ret;
 }
diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
index b3485450a2..cef8085dcc 100755
--- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
+++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
@@ -103,6 +103,19 @@ test_expect_success 'with succeeding hook' '
 	test_cmp expected_hooks actual_hooks
 '
 
+# NEEDSWORK: when 'git hook add' and 'git hook remove' have been added, use that
+# instead
+test_expect_success 'with succeeding hook (config-based)' '
+	test_when_finished "git config --unset hook.pre-commit.command success.sample" &&
+	test_when_finished "rm -f expected_hooks actual_hooks" &&
+	git config hook.pre-commit.command "$HOOKDIR/success.sample" &&
+	echo "$HOOKDIR/success.sample" >expected_hooks &&
+	echo "more" >>file &&
+	git add file &&
+	git commit -m "more" &&
+	test_cmp expected_hooks actual_hooks
+'
+
 test_expect_success 'with succeeding hook (merge)' '
 	test_when_finished "rm -f \"$PREMERGE\" expected_hooks actual_hooks" &&
 	cp "$HOOKDIR/success.sample" "$PREMERGE" &&
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (7 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-10 14:16       ` Phillip Wood
  2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Taking varargs in run_commit_hook() led to some bizarre patterns, like
callers using two string variables (which may or may not be filled) to
express different argument lists for the commit hooks. Because
run_commit_hook() no longer needs to call a variadic function for the
hook run itself, we can use strvec to make the calling code more
conventional.
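
For readers unfamiliar with the pattern being removed, here is a
standalone sketch of collecting a NULL-terminated varargs list into an
array. `collect_args()` is a hypothetical helper, not part of Git; it
only shows why every caller has to remember the trailing NULL, and
why optional arguments are awkward to express this way.

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Collect a NULL-terminated list of strings into out[], which has
 * room for max entries including the trailing NULL. Returns the
 * number of strings stored, or -1 if they do not fit.
 */
static int collect_args(const char **out, size_t max, ...)
{
	va_list ap;
	const char *arg;
	size_t n = 0;

	if (!max)
		return -1;

	va_start(ap, max);
	while ((arg = va_arg(ap, const char *))) {
		if (n + 1 >= max) {
			va_end(ap);
			return -1;
		}
		out[n++] = arg;
	}
	va_end(ap);

	out[n] = NULL;
	return (int)n;
}
```

A caller that may or may not have a second argument must either pass
a placeholder or branch between two call sites, whereas with a vector
it can push only the arguments it actually has.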

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
 builtin/merge.c  | 20 ++++++++++++++++----
 commit.c         | 13 ++-----------
 commit.h         |  5 +++--
 sequencer.c      | 15 ++++++++-------
 5 files changed, 52 insertions(+), 47 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index a19c6478eb..f029d4f5ac 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -691,8 +691,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	struct strbuf committer_ident = STRBUF_INIT;
 	int committable;
 	struct strbuf sb = STRBUF_INIT;
-	const char *hook_arg1 = NULL;
-	const char *hook_arg2 = NULL;
+	struct strvec hook_args = STRVEC_INIT;
 	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
 	int old_display_comment_prefix;
 	int merge_contains_scissors = 0;
@@ -700,7 +699,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	/* This checks and barfs if author is badly specified */
 	determine_author_info(author_ident);
 
-	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
+	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit",
+					  &hook_args))
 		return 0;
 
 	if (squash_message) {
@@ -722,27 +722,28 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		}
 	}
 
+	strvec_push(&hook_args, git_path_commit_editmsg());
+
 	if (have_option_m && !fixup_message) {
 		strbuf_addbuf(&sb, &message);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (logfile && !strcmp(logfile, "-")) {
 		if (isatty(0))
 			fprintf(stderr, _("(reading log message from standard input)\n"));
 		if (strbuf_read(&sb, 0, 0) < 0)
 			die_errno(_("could not read log from standard input"));
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (logfile) {
 		if (strbuf_read_file(&sb, logfile, 0) < 0)
 			die_errno(_("could not read log file '%s'"),
 				  logfile);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (use_message) {
 		char *buffer;
 		buffer = strstr(use_message_buffer, "\n\n");
 		if (buffer)
 			strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
-		hook_arg1 = "commit";
-		hook_arg2 = use_message;
+		strvec_pushl(&hook_args, "commit", use_message, NULL);
 	} else if (fixup_message) {
 		struct pretty_print_context ctx = {0};
 		struct commit *commit;
@@ -754,7 +755,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 				      &sb, &ctx);
 		if (have_option_m)
 			strbuf_addbuf(&sb, &message);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
 		size_t merge_msg_start;
 
@@ -765,9 +766,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
 			if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
 				die_errno(_("could not read SQUASH_MSG"));
-			hook_arg1 = "squash";
+			strvec_push(&hook_args, "squash");
 		} else
-			hook_arg1 = "merge";
+			strvec_push(&hook_args, "merge");
 
 		merge_msg_start = sb.len;
 		if (strbuf_read_file(&sb, git_path_merge_msg(the_repository), 0) < 0)
@@ -781,11 +782,11 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	} else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
 		if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
 			die_errno(_("could not read SQUASH_MSG"));
-		hook_arg1 = "squash";
+		strvec_push(&hook_args, "squash");
 	} else if (template_file) {
 		if (strbuf_read_file(&sb, template_file, 0) < 0)
 			die_errno(_("could not read '%s'"), template_file);
-		hook_arg1 = "template";
+		strvec_push(&hook_args, "template");
 		clean_message_contents = 0;
 	}
 
@@ -794,11 +795,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	 * just set the argument(s) to the prepare-commit-msg hook.
 	 */
 	else if (whence == FROM_MERGE)
-		hook_arg1 = "merge";
-	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK) {
-		hook_arg1 = "commit";
-		hook_arg2 = "CHERRY_PICK_HEAD";
-	}
+		strvec_push(&hook_args, "merge");
+	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
+		strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
 
 	if (squash_message) {
 		/*
@@ -806,8 +805,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		 * then we're possibly hijacking other commit log options.
 		 * Reset the hook args to tell the real story.
 		 */
-		hook_arg1 = "message";
-		hook_arg2 = "";
+		strvec_clear(&hook_args);
+		strvec_pushl(&hook_args, git_path_commit_editmsg(), "message", NULL);
 	}
 
 	s->fp = fopen_for_writing(git_path_commit_editmsg());
@@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
-			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
+	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", &hook_args))
 		return 0;
 
 	if (use_editor) {
@@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		strvec_clear(&env);
 	}
 
+	strvec_clear(&hook_args);
+	strvec_push(&hook_args, git_path_commit_editmsg());
 	if (!no_verify &&
-	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
+	    run_commit_hook(use_editor, index_file, "commit-msg", &hook_args)) {
 		return 0;
 	}
 
diff --git a/builtin/merge.c b/builtin/merge.c
index c1a9d0083d..863c9039a3 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -821,10 +821,14 @@ static void write_merge_heads(struct commit_list *);
 static void prepare_to_commit(struct commit_list *remoteheads)
 {
 	struct strbuf msg = STRBUF_INIT;
+	struct strvec hook_args = STRVEC_INIT;
+	struct strbuf hook_name = STRBUF_INIT;
 	const char *index_file = get_index_file();
 
-	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
+	if (!no_verify && run_commit_hook(0 < option_edit, index_file,
+					  "pre-merge-commit", &hook_args))
 		abort_commit(remoteheads, NULL);
+
 	/*
 	 * Re-read the index as pre-merge-commit hook could have updated it,
 	 * and write it out as a tree.  We must do this before we invoke
@@ -832,6 +836,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	 */
 	if (hook_exists("pre-merge-commit"))
 		discard_cache();
+
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
 	if (squash)
@@ -851,17 +856,22 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
 	write_merge_heads(remoteheads);
 	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
+
+	strvec_clear(&hook_args);
+	strvec_pushl(&hook_args, git_path_merge_msg(the_repository), "merge", NULL);
 	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
-			    git_path_merge_msg(the_repository), "merge", NULL))
+			    &hook_args))
 		abort_commit(remoteheads, NULL);
+
 	if (0 < option_edit) {
 		if (launch_editor(git_path_merge_msg(the_repository), NULL, NULL))
 			abort_commit(remoteheads, NULL);
 	}
 
+	strvec_clear(&hook_args);
+	strvec_push(&hook_args, git_path_merge_msg(the_repository));
 	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
-					  "commit-msg",
-					  git_path_merge_msg(the_repository), NULL))
+					  "commit-msg", &hook_args))
 		abort_commit(remoteheads, NULL);
 
 	read_merge_msg(&msg);
@@ -871,6 +881,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	strbuf_release(&merge_msg);
 	strbuf_addbuf(&merge_msg, &msg);
 	strbuf_release(&msg);
+	strbuf_release(&hook_name);
+	strvec_clear(&hook_args);
 }
 
 static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
diff --git a/commit.c b/commit.c
index c7a243e848..726407152c 100644
--- a/commit.c
+++ b/commit.c
@@ -1629,12 +1629,9 @@ size_t ignore_non_trailer(const char *buf, size_t len)
 }
 
 int run_commit_hook(int editor_is_used, const char *index_file,
-		    const char *name, ...)
+		    const char *name, struct strvec *args)
 {
 	struct strvec hook_env = STRVEC_INIT;
-	va_list args;
-	const char *arg;
-	struct strvec hook_args = STRVEC_INIT;
 	struct strbuf hook_name = STRBUF_INIT;
 	int ret;
 
@@ -1648,14 +1645,8 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 	if (!editor_is_used)
 		strvec_push(&hook_env, "GIT_EDITOR=:");
 
-	va_start(args, name);
-	while ((arg = va_arg(args, const char *)))
-		strvec_push(&hook_args, arg);
-	va_end(args);
-
-	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
+	ret = run_hooks(hook_env.v, &hook_name, args);
 	strvec_clear(&hook_env);
-	strvec_clear(&hook_args);
 	strbuf_release(&hook_name);
 
 	return ret;
diff --git a/commit.h b/commit.h
index e901538909..978da3c3e0 100644
--- a/commit.h
+++ b/commit.h
@@ -9,6 +9,7 @@
 #include "string-list.h"
 #include "pretty.h"
 #include "commit-slab.h"
+#include "strvec.h"
 
 #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
 #define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
@@ -353,7 +354,7 @@ void verify_merge_signature(struct commit *commit, int verbose,
 int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused);
 int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
 
-LAST_ARG_MUST_BE_NULL
-int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
+int run_commit_hook(int editor_is_used, const char *index_file,
+		    const char *name, struct strvec *args);
 
 #endif /* COMMIT_H */
diff --git a/sequencer.c b/sequencer.c
index cc3f8fa88e..5dd4b134d6 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1124,22 +1124,23 @@ static int run_prepare_commit_msg_hook(struct repository *r,
 				       const char *commit)
 {
 	int ret = 0;
-	const char *name, *arg1 = NULL, *arg2 = NULL;
+	struct strvec args = STRVEC_INIT;
+	const char *name = git_path_commit_editmsg();
 
-	name = git_path_commit_editmsg();
+	strvec_push(&args, name);
 	if (write_message(msg->buf, msg->len, name, 0))
 		return -1;
 
 	if (commit) {
-		arg1 = "commit";
-		arg2 = commit;
+		strvec_push(&args, "commit");
+		strvec_push(&args, commit);
 	} else {
-		arg1 = "message";
+		strvec_push(&args, "message");
 	}
-	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
-			    arg1, arg2, NULL))
+	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
 		ret = error(_("'prepare-commit-msg' hook failed"));
 
+	strvec_clear(&args);
 	return ret;
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
@ 2020-09-09 20:32       ` Junio C Hamano
  2020-09-10 19:08         ` Emily Shaffer
  2020-09-23 23:20       ` Jonathan Tan
  2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-09-09 20:32 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> Add a helper to easily determine whether any hooks exist for a given
> hook event.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  hook.c | 9 +++++++++
>  hook.h | 1 +
>  2 files changed, 10 insertions(+)

Should we consider the last three patches still work-in-progress
technology demonstration, or are these meant as a proposal for a new
API element as-is?

It is perfectly fine if it is the former.  I just want to make sure
we share a common understanding on the direction in which we want
these patches to take us.  Here is my take:

 - For now, a hook/event that is aware of the config-based hook
   system is supposed to use hook_exists(), while the traditional
   ones still use find_hook().  We expect more and more will be
   converted to the former over time.

 - Invoking hook scripts under the new world order is done by
   including hook.h and calling run_hooks(), not by driving the
   run-command API yourself (I count run_hook_ve() as part of the
   latter) like the traditional code did.  We expect more and more
   will be converted to the former over time.

 - From the point of view of the end users who have been happily
   using scripts in $GIT_DIR/hooks, everything will stay the same.
   hook_exists() will find them (by calling find_hook() as a
   fallback) and run_hooks() will run them (by relying on
   hook_list() to include them).

I am guessing that the above gives us a high-level description.

The new interface needs to be described in hook.h once the series
graduates from the technology demonstration state, in order to help
others who want to help updating the callsites of traditional hooks
to the new API.  And the above three-bullet point list is my attempt
to figure out what kind of things need to be documented to help
them.
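
To make the fallback in the third bullet concrete, here is a minimal,
self-contained sketch of the lookup order.  It is a toy model, not
Git's code: the toy_ names and the two static tables standing in for
the config and for $GIT_DIR/hooks are assumptions made purely for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one entry per "hook.<event>.command" config value. */
struct toy_config_hook {
	const char *event;
	const char *command;
};

/* Hypothetical config contents. */
static const struct toy_config_hook toy_config[] = {
	{ "pre-commit", "/path/to/linter" },
};

/* Hypothetical scripts present in the legacy hook directory. */
static const char *const toy_hookdir[] = { "commit-msg" };

/* Return the configured command for an event, or NULL if there is none. */
static const char *toy_config_command(const char *event)
{
	size_t i;

	for (i = 0; i < sizeof(toy_config) / sizeof(toy_config[0]); i++)
		if (!strcmp(toy_config[i].event, event))
			return toy_config[i].command;
	return NULL;
}

/* Return 1 if the legacy hook directory has a script for the event. */
static int toy_hookdir_has(const char *event)
{
	size_t i;

	for (i = 0; i < sizeof(toy_hookdir) / sizeof(toy_hookdir[0]); i++)
		if (!strcmp(toy_hookdir[i], event))
			return 1;
	return 0;
}

/* Config first, then the legacy hook directory as a fallback. */
static int toy_hook_exists(const char *event)
{
	assert(event);
	return toy_config_command(event) != NULL || toy_hookdir_has(event);
}
```

With these tables, "pre-commit" is found through the config entry,
"commit-msg" through the legacy directory, and "pre-push" through
neither.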

I am not seeing anything in run_hooks() that consumes input from us
over pipe, by the way, without which we cannot do things like the
"pre-receive" hooks under the new world order.  Are they planned to
come in the future, after these "we feed anything they need from the
command line and from the environment" hooks are dealt with in this
first pass?
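
For reference, feeding a hook over a pipe could look roughly like the
sketch below.  It is plain POSIX C rather than Git's run-command API,
and run_with_stdin() is a hypothetical helper name.  A "pre-receive"
hook, for example, reads one "<old-value> <new-value> <ref-name>" line
per updated ref from its standard input.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Git: run argv[0] with `input` on its
 * standard input and return its exit code, or -1 on failure.  A real
 * implementation would also have to deal with SIGPIPE when the child
 * exits before reading everything.
 */
static int run_with_stdin(char *const argv[], const char *input, size_t len)
{
	int fds[2];
	int status;
	pid_t pid;

	assert(argv && argv[0]);
	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: the read end of the pipe becomes stdin. */
		close(fds[1]);
		if (fds[0] != STDIN_FILENO) {
			if (dup2(fds[0], STDIN_FILENO) < 0)
				_exit(127);
			close(fds[0]);
		}
		execvp(argv[0], argv);
		_exit(127); /* only reached if exec failed */
	}

	/* Parent: write all input, then close so the child sees EOF. */
	close(fds[0]);
	while (len > 0) {
		ssize_t n = write(fds[1], input, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			break;
		}
		input += n;
		len -= (size_t)n;
	}
	close(fds[1]);

	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build the ref-update lines into one buffer and hand it
over in a single call.  Whether run_hooks() should grow such a
parameter or a separate entry point is left open here.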

Thanks.

> diff --git a/hook.c b/hook.c
> index 0dab981681..7c7b922369 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
>  	return &hook_head;
>  }
>  
> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}
> +
>  int run_hooks(const char *const *env, const struct strbuf *hookname,
>  	      const struct strvec *args)
>  {
> diff --git a/hook.h b/hook.h
> index d020788a6b..d94511b609 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -11,6 +11,7 @@ struct hook
>  };
>  
>  struct list_head* hook_list(const struct strbuf *hookname);
> +int hook_exists(const char *hookname);
>  int run_hooks(const char *const *env, const struct strbuf *hookname,
>  	      const struct strvec *args);


* Re: [PATCH v4 0/9] propose config-based hooks
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (8 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
@ 2020-09-09 21:04     ` Junio C Hamano
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  10 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-09-09 21:04 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, James Ramsay, Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> Since v3, the biggest change is the conversion of commit hooks to use the new
> hook machinery. The first change ("commit: use config-based hooks") is the
> important part; the second change ("run_commit_hook: take strvec instead of varargs")
> is probably subjective, but I thought it was a decent tech debt reduction.
>
> I wanted to send this reroll quickly since I had promised it in standup last
> week, but I've got pretty good progress locally on the patch for configuring
> "hook.runHookDir"; I'm planning to send that soon, probably this week.

I've been carrying the attached merge-fix patch to adjust this topic
for the argv_array to strvec transition [*1*].  Now that *most*, but
not all, parts of this series have been migrated to the strvec API,
you should apply the remaining parts of the merge-fix patch to your
copy.  I think its changes to *.c and *.h are already in your series,
which has been rebased on a newer 'master' that has strvec, but the
documentation and possibly in-code comments may still need to be
adjusted.

Another way to sanity check the result would be to run this:

    $ git diff master..es/config-hooks | grep -i argv.array

Thanks.  

[Footnote]

*1* The way I work with a topic that causes conflicts with other
    topics is to merge the new version of the topic and let the
    rerere records I created while resolving the conflicts with the
    previous round do their work.  After textual conflicts are thus
    resolved, if further changes that do not cause textual conflicts
    are necessary, they are written in the form of a "merge-fix"
    patch like the attached.

-- >8 --

 Documentation/technical/api-parse-options.txt  |  4 ++--
 Documentation/technical/config-based-hooks.txt |  4 ++--
 builtin/hook.c                                 | 16 ++++++++--------
 hook.c                                         |  6 +++---
 hook.h                                         |  4 ++--
 parse-options-cb.c                             |  8 ++++----
 parse-options.h                                |  6 +++---
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index b4f1fc4a1a..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,9 +173,9 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
-`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
 	Introduce an option with a string argument.
-	The string argument is stored as an element in `argv_array`.
+	The string argument is stored as an element in `strvec`.
 	Use of `--no-option` will clear the list of preceding values.
 
 `OPT_INTEGER(short, long, &int_var, description)`::
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
index c6e762b192..4443f70ded 100644
--- a/Documentation/technical/config-based-hooks.txt
+++ b/Documentation/technical/config-based-hooks.txt
@@ -106,10 +106,10 @@ a concise config afterwards. It may take a form similar to `git rebase
 `hook.c` and `hook.h` are responsible for interacting with the config files. In
 the case when the code generating a hook event doesn't have special concerns
 about how to run the hooks, the hook library will provide a basic API to call
-all hooks in config order with an `argv_array` provided by the code which
+all hooks in config order with an `strvec` provided by the code which
 generates the hook event:
 
-*`int run_hooks(const char *hookname, struct argv_array *args)`*
+*`int run_hooks(const char *hookname, struct strvec *args)`*
 
 This call includes the hook command provided by `run-command.h:find_hook()`;
 eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
diff --git a/builtin/hook.c b/builtin/hook.c
index cd61fad5fb..debcb5a77a 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,7 +5,7 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
-#include "argv-array.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
@@ -67,14 +67,14 @@ static int list(int argc, const char **argv, const char *prefix)
 static int run(int argc, const char **argv, const char *prefix)
 {
 	struct strbuf hookname = STRBUF_INIT;
-	struct argv_array env_argv = ARGV_ARRAY_INIT;
-	struct argv_array arg_argv = ARGV_ARRAY_INIT;
+	struct strvec env_argv = STRVEC_INIT;
+	struct strvec arg_argv = STRVEC_INIT;
 
 	struct option run_options[] = {
-		OPT_ARGV_ARRAY('e', "env", &env_argv, N_("var"),
-			       N_("environment variables for hook to use")),
-		OPT_ARGV_ARRAY('a', "arg", &arg_argv, N_("args"),
-			       N_("argument to pass to hook")),
+		OPT_STRVEC('e', "env", &env_argv, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &arg_argv, N_("args"),
+			   N_("argument to pass to hook")),
 		OPT_END(),
 	};
 
@@ -87,7 +87,7 @@ static int run(int argc, const char **argv, const char *prefix)
 
 	strbuf_addstr(&hookname, argv[0]);
 
-	return run_hooks(env_argv.argv, &hookname, &arg_argv);
+	return run_hooks(env_argv.v, &hookname, &arg_argv);
 }
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
diff --git a/hook.c b/hook.c
index 902e213173..40d319adb1 100644
--- a/hook.c
+++ b/hook.c
@@ -98,7 +98,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 }
 
 int run_hooks(const char *const *env, const struct strbuf *hookname,
-	      const struct argv_array *args)
+	      const struct strvec *args)
 {
 	struct list_head *to_run, *pos = NULL, *tmp = NULL;
 	int rc = 0;
@@ -110,14 +110,14 @@ int run_hooks(const char *const *env, const struct strbuf *hookname,
 		struct hook *hook = list_entry(pos, struct hook, list);
 
 		/* add command */
-		argv_array_push(&hook_proc.args, hook->command.buf);
+		strvec_push(&hook_proc.args, hook->command.buf);
 
 		/*
 		 * add passed-in argv, without expanding - let the user get back
 		 * exactly what they put in
 		 */
 		if (args)
-			argv_array_pushv(&hook_proc.args, args->argv);
+			strvec_pushv(&hook_proc.args, args->v);
 
 		hook_proc.env = env;
 		hook_proc.no_stdin = 1;
diff --git a/hook.h b/hook.h
index cf598d6ccb..d020788a6b 100644
--- a/hook.h
+++ b/hook.h
@@ -1,7 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
-#include "argv-array.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -12,7 +12,7 @@ struct hook
 
 struct list_head* hook_list(const struct strbuf *hookname);
 int run_hooks(const char *const *env, const struct strbuf *hookname,
-	      const struct argv_array *args);
+	      const struct strvec *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4f993cd734..d2b8b7b98a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,19 +205,19 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
-int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
 {
-	struct argv_array *v = opt->value;
+	struct strvec *v = opt->value;
 
 	if (unset) {
-		argv_array_clear(v);
+		strvec_clear(v);
 		return 0;
 	}
 
 	if (!arg)
 		return -1;
 
-	argv_array_push(v, arg);
+	strvec_push(v, arg);
 	return 0;
 }
 
diff --git a/parse-options.h b/parse-options.h
index e2e2de75c8..177259488b 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,9 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
-#define OPT_ARGV_ARRAY(s, l, v, a, h) \
+#define OPT_STRVEC(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
-				      (h), 0, &parse_opt_argv_array }
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -299,7 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
-int parse_opt_argv_array(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0-558-g7a0184fd7b



* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
@ 2020-09-10 13:50       ` Phillip Wood
  2020-09-10 22:21         ` Junio C Hamano
  2020-09-23 23:47       ` Jonathan Tan
  1 sibling, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-10 13:50 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> As part of the adoption of config-based hooks, teach run_commit_hook()
> to call hook.h instead of run-command.h. This covers 'pre-commit',
> 'commit-msg', and 'prepare-commit-msg'. Additionally, ask the hook
> library - not run-command - whether any hooks will be run, as it's
> possible hooks may exist in the config but not the hookdir.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   builtin/commit.c                                 |  3 ++-
>   builtin/merge.c                                  |  3 ++-
>   commit.c                                         | 13 ++++++++++++-
>   t/t7503-pre-commit-and-pre-merge-commit-hooks.sh | 13 +++++++++++++
>   4 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 69ac78d5e5..a19c6478eb 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -36,6 +36,7 @@
>   #include "help.h"
>   #include "commit-reach.h"
>   #include "commit-graph.h"
> +#include "hook.h"
>   
>   static const char * const builtin_commit_usage[] = {
>   	N_("git commit [<options>] [--] <pathspec>..."),
> @@ -985,7 +986,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		return 0;
>   	}
>   
> -	if (!no_verify && find_hook("pre-commit")) {
> +	if (!no_verify && hook_exists("pre-commit")) {
>   		/*
>   		 * Re-read the index as pre-commit hook could have updated it,
>   		 * and write it out as a tree.  We must do this before we invoke
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 74829a838e..c1a9d0083d 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -41,6 +41,7 @@
>   #include "commit-reach.h"
>   #include "wt-status.h"
>   #include "commit-graph.h"
> +#include "hook.h"
>   
>   #define DEFAULT_TWOHEAD (1<<0)
>   #define DEFAULT_OCTOPUS (1<<1)
> @@ -829,7 +830,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	 * and write it out as a tree.  We must do this before we invoke
>   	 * the editor and after we invoke run_status above.
>   	 */
> -	if (find_hook("pre-merge-commit"))
> +	if (hook_exists("pre-merge-commit"))
>   		discard_cache();
>   	read_cache_from(index_file);
>   	strbuf_addbuf(&msg, &merge_msg);
> diff --git a/commit.c b/commit.c
> index 4ce8cb38d5..c7a243e848 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -21,6 +21,7 @@
>   #include "commit-reach.h"
>   #include "run-command.h"
>   #include "shallow.h"
> +#include "hook.h"
>   
>   static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
>   
> @@ -1632,8 +1633,13 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   {
>   	struct strvec hook_env = STRVEC_INIT;
>   	va_list args;
> +	const char *arg;
> +	struct strvec hook_args = STRVEC_INIT;
> +	struct strbuf hook_name = STRBUF_INIT;
>   	int ret;
>   
> +	strbuf_addstr(&hook_name, name);

Seeing this makes me wonder if it would be better for run_hooks() to
take a string for the name rather than a strbuf; I suspect that
virtually all callers have a fixed hook name.
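
A rough sketch of the difference at the call site, using toy types
only: toy_buf is a stand-in and not Git's strbuf, and both toy_run_*
functions are hypothetical placeholders that merely check for a
non-empty name instead of running anything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a growable string buffer; NOT Git's strbuf. */
struct toy_buf {
	char *buf;
	size_t len;
};

static void toy_buf_addstr(struct toy_buf *b, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(b->buf, b->len + n + 1);

	if (!p)
		abort();
	memcpy(p + b->len, s, n + 1); /* copies the trailing NUL too */
	b->buf = p;
	b->len += n;
}

static void toy_buf_release(struct toy_buf *b)
{
	free(b->buf);
	b->buf = NULL;
	b->len = 0;
}

/* Buffer-taking entry point, shaped like run_hooks() in this series. */
static int toy_run_hooks_buf(const struct toy_buf *name)
{
	assert(name);
	return name->len ? 0 : -1;
}

/* String-taking entry point, as suggested above. */
static int toy_run_hooks(const char *name)
{
	return (name && *name) ? 0 : -1;
}

/* What a caller with a fixed hook name has to do with the buffer API. */
static int toy_call_with_buf(const char *name)
{
	struct toy_buf b = { NULL, 0 };
	int ret;

	toy_buf_addstr(&b, name);
	ret = toy_run_hooks_buf(&b);
	toy_buf_release(&b);
	return ret;
}
```

With the string-taking variant the same caller shrinks to a single
toy_run_hooks("pre-commit") call, with nothing to initialize or
release.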

Best Wishes

Phillip

>   	strvec_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
>   
>   	/*
> @@ -1643,9 +1649,14 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   		strvec_push(&hook_env, "GIT_EDITOR=:");
>   
>   	va_start(args, name);
> -	ret = run_hook_ve(hook_env.v, name, args);
> +	while ((arg = va_arg(args, const char *)))
> +		strvec_push(&hook_args, arg);
>   	va_end(args);
> +
> +	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
>   	strvec_clear(&hook_env);
> +	strvec_clear(&hook_args);
> +	strbuf_release(&hook_name);
>   
>   	return ret;
>   }
> diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> index b3485450a2..cef8085dcc 100755
> --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> @@ -103,6 +103,19 @@ test_expect_success 'with succeeding hook' '
>   	test_cmp expected_hooks actual_hooks
>   '
>   
> +# NEEDSWORK: when 'git hook add' and 'git hook remove' have been added, use that
> +# instead
> +test_expect_success 'with succeeding hook (config-based)' '
> +	test_when_finished "git config --unset hook.pre-commit.command success.sample" &&
> +	test_when_finished "rm -f expected_hooks actual_hooks" &&
> +	git config hook.pre-commit.command "$HOOKDIR/success.sample" &&
> +	echo "$HOOKDIR/success.sample" >expected_hooks &&
> +	echo "more" >>file &&
> +	git add file &&
> +	git commit -m "more" &&
> +	test_cmp expected_hooks actual_hooks
> +'
> +
>   test_expect_success 'with succeeding hook (merge)' '
>   	test_when_finished "rm -f \"$PREMERGE\" expected_hooks actual_hooks" &&
>   	cp "$HOOKDIR/success.sample" "$PREMERGE" &&
> 


* Re: [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
@ 2020-09-10 14:16       ` Phillip Wood
  2020-09-11 13:20         ` Phillip Wood
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-10 14:16 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> Taking varargs in run_commit_hook() led to some bizarre patterns, like
> callers using two string variables (which may or may not be filled) to
> express different argument lists for the commit hooks. Because
> run_commit_hook() no longer needs to call a variadic function for the
> hook run itself, we can use strvec to make the calling code more
> conventional.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
>   builtin/merge.c  | 20 ++++++++++++++++----
>   commit.c         | 13 ++-----------
>   commit.h         |  5 +++--
>   sequencer.c      | 15 ++++++++-------
>   5 files changed, 52 insertions(+), 47 deletions(-)
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index a19c6478eb..f029d4f5ac 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -691,8 +691,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	struct strbuf committer_ident = STRBUF_INIT;
>   	int committable;
>   	struct strbuf sb = STRBUF_INIT;
> -	const char *hook_arg1 = NULL;
> -	const char *hook_arg2 = NULL;
> +	struct strvec hook_args = STRVEC_INIT;
>   	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
>   	int old_display_comment_prefix;
>   	int merge_contains_scissors = 0;
> @@ -700,7 +699,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	/* This checks and barfs if author is badly specified */
>   	determine_author_info(author_ident);
>   
> -	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
> +	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit",
> +					  &hook_args))
>   		return 0;
>   
>   	if (squash_message) {
> @@ -722,27 +722,28 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		}
>   	}
>   
> +	strvec_push(&hook_args, git_path_commit_editmsg());

This is a long way from the call where we use hook_args. With the
variadic interface, it is clear from looking at the call to
run_commit_hook() what the first argument is, and that it is always the
same.

>   	if (have_option_m && !fixup_message) {
>   		strbuf_addbuf(&sb, &message);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (logfile && !strcmp(logfile, "-")) {
>   		if (isatty(0))
>   			fprintf(stderr, _("(reading log message from standard input)\n"));
>   		if (strbuf_read(&sb, 0, 0) < 0)
>   			die_errno(_("could not read log from standard input"));
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (logfile) {
>   		if (strbuf_read_file(&sb, logfile, 0) < 0)
>   			die_errno(_("could not read log file '%s'"),
>   				  logfile);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (use_message) {
>   		char *buffer;
>   		buffer = strstr(use_message_buffer, "\n\n");
>   		if (buffer)
>   			strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
> -		hook_arg1 = "commit";
> -		hook_arg2 = use_message;
> +		strvec_pushl(&hook_args, "commit", use_message, NULL);
>   	} else if (fixup_message) {
>   		struct pretty_print_context ctx = {0};
>   		struct commit *commit;
> @@ -754,7 +755,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   				      &sb, &ctx);
>   		if (have_option_m)
>   			strbuf_addbuf(&sb, &message);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
>   		size_t merge_msg_start;
>   
> @@ -765,9 +766,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>   			if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
>   				die_errno(_("could not read SQUASH_MSG"));
> -			hook_arg1 = "squash";
> +			strvec_push(&hook_args, "squash");
>   		} else
> -			hook_arg1 = "merge";
> +			strvec_push(&hook_args, "merge");
>   
>   		merge_msg_start = sb.len;
>   		if (strbuf_read_file(&sb, git_path_merge_msg(the_repository), 0) < 0)
> @@ -781,11 +782,11 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	} else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>   		if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
>   			die_errno(_("could not read SQUASH_MSG"));
> -		hook_arg1 = "squash";
> +		strvec_push(&hook_args, "squash");
>   	} else if (template_file) {
>   		if (strbuf_read_file(&sb, template_file, 0) < 0)
>   			die_errno(_("could not read '%s'"), template_file);
> -		hook_arg1 = "template";
> +		strvec_push(&hook_args, "template");
>   		clean_message_contents = 0;
>   	}
>   
> @@ -794,11 +795,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	 * just set the argument(s) to the prepare-commit-msg hook.
>   	 */
>   	else if (whence == FROM_MERGE)
> -		hook_arg1 = "merge";
> -	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK) {
> -		hook_arg1 = "commit";
> -		hook_arg2 = "CHERRY_PICK_HEAD";
> -	}
> +		strvec_push(&hook_args, "merge");
> +	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
> +		strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
>   
>   	if (squash_message) {
>   		/*
> @@ -806,8 +805,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		 * then we're possibly hijacking other commit log options.
>   		 * Reset the hook args to tell the real story.
>   		 */
> -		hook_arg1 = "message";
> -		hook_arg2 = "";
> +		strvec_clear(&hook_args);
> +		strvec_pushl(&hook_args, git_path_commit_editmsg(), "message", NULL);

It's a shame we have to clear the strvec and remember to re-add 
git_path_commit_editmsg() here.
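
One possible shape that avoids this, sketched with a toy vector rather
than Git's strvec: collect only the variable tail while walking the
branches, and prepend the fixed message-file path right where the hook
is run.  All toy_ names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy NULL-terminated string vector; this is NOT Git's strvec. */
struct toy_vec {
	char **v;
	size_t nr, alloc;
};

static void toy_vec_push(struct toy_vec *vec, const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (!copy)
		abort();
	memcpy(copy, s, n);
	if (vec->nr + 2 > vec->alloc) { /* room for item plus NULL */
		vec->alloc = vec->alloc ? 2 * vec->alloc : 8;
		vec->v = realloc(vec->v, vec->alloc * sizeof(*vec->v));
		if (!vec->v)
			abort();
	}
	vec->v[vec->nr++] = copy;
	vec->v[vec->nr] = NULL;
}

static void toy_vec_clear(struct toy_vec *vec)
{
	size_t i;

	for (i = 0; i < vec->nr; i++)
		free(vec->v[i]);
	free(vec->v);
	vec->v = NULL;
	vec->nr = vec->alloc = 0;
}

/*
 * Build the full argument list at the point of use: the fixed first
 * argument, followed by whatever tail the caller collected earlier.
 */
static void toy_build_hook_args(struct toy_vec *out, const char *msg_file,
				const struct toy_vec *tail)
{
	size_t i;

	assert(out && msg_file && tail);
	toy_vec_push(out, msg_file);
	for (i = 0; i < tail->nr; i++)
		toy_vec_push(out, tail->v[i]);
}
```

Resetting the tail (as the squash_message branch does) then only
touches the variable part, and the first argument stays visible next to
the call that runs the hook.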

>   	}
>   
>   	s->fp = fopen_for_writing(git_path_commit_editmsg());
> @@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		return 0;
>   	}
>   
> -	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
> -			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
> +	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", &hook_args))
>   		return 0;
>   
>   	if (use_editor) {
> @@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		strvec_clear(&env);
>   	}
>   
> +	strvec_clear(&hook_args);
> +	strvec_push(&hook_args, git_path_commit_editmsg());
>   	if (!no_verify &&
> -	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
> +	    run_commit_hook(use_editor, index_file, "commit-msg", &hook_args)) {
>   		return 0;
>   	}
>   
> diff --git a/builtin/merge.c b/builtin/merge.c
> index c1a9d0083d..863c9039a3 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -821,10 +821,14 @@ static void write_merge_heads(struct commit_list *);
>   static void prepare_to_commit(struct commit_list *remoteheads)
>   {
>   	struct strbuf msg = STRBUF_INIT;
> +	struct strvec hook_args = STRVEC_INIT;
> +	struct strbuf hook_name = STRBUF_INIT;

As far as I can see hook_name is never used except to free it at the end.

>   	const char *index_file = get_index_file();
>   
> -	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
> +	if (!no_verify && run_commit_hook(0 < option_edit, index_file,
> +					  "pre-merge-commit", &hook_args))
>   		abort_commit(remoteheads, NULL);
> +
>   	/*
>   	 * Re-read the index as pre-merge-commit hook could have updated it,
>   	 * and write it out as a tree.  We must do this before we invoke
> @@ -832,6 +836,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	 */
>   	if (hook_exists("pre-merge-commit"))
>   		discard_cache();
> +
>   	read_cache_from(index_file);
>   	strbuf_addbuf(&msg, &merge_msg);
>   	if (squash)
> @@ -851,17 +856,22 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
>   	write_merge_heads(remoteheads);
>   	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
> +
> +	strvec_clear(&hook_args);
> +	strvec_pushl(&hook_args, git_path_merge_msg(the_repository), "merge", NULL);
>   	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
> -			    git_path_merge_msg(the_repository), "merge", NULL))
> +			    &hook_args))
>   		abort_commit(remoteheads, NULL);
> +
>   	if (0 < option_edit) {
>   		if (launch_editor(git_path_merge_msg(the_repository), NULL, NULL))
>   			abort_commit(remoteheads, NULL);
>   	}
>   
> +	strvec_clear(&hook_args);
> +	strvec_push(&hook_args, git_path_merge_msg(the_repository));
>   	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
> -					  "commit-msg",
> -					  git_path_merge_msg(the_repository), NULL))
> +					  "commit-msg", &hook_args))
>   		abort_commit(remoteheads, NULL);
>   
>   	read_merge_msg(&msg);
> @@ -871,6 +881,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	strbuf_release(&merge_msg);
>   	strbuf_addbuf(&merge_msg, &msg);
>   	strbuf_release(&msg);
> +	strbuf_release(&hook_name);
> +	strvec_clear(&hook_args);
>   }
>   
>   static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
> diff --git a/commit.c b/commit.c
> index c7a243e848..726407152c 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1629,12 +1629,9 @@ size_t ignore_non_trailer(const char *buf, size_t len)
>   }
>   
>   int run_commit_hook(int editor_is_used, const char *index_file,
> -		    const char *name, ...)
> +		    const char *name, struct strvec *args)
>   {
>   	struct strvec hook_env = STRVEC_INIT;
> -	va_list args;
> -	const char *arg;
> -	struct strvec hook_args = STRVEC_INIT;
>   	struct strbuf hook_name = STRBUF_INIT;
>   	int ret;
>   
> @@ -1648,14 +1645,8 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   	if (!editor_is_used)
>   		strvec_push(&hook_env, "GIT_EDITOR=:");
>   
> -	va_start(args, name);
> -	while ((arg = va_arg(args, const char *)))
> -		strvec_push(&hook_args, arg);
> -	va_end(args);
> -
> -	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
> +	ret = run_hooks(hook_env.v, &hook_name, args);
>   	strvec_clear(&hook_env);
> -	strvec_clear(&hook_args);
>   	strbuf_release(&hook_name);
>   
>   	return ret;
> diff --git a/commit.h b/commit.h
> index e901538909..978da3c3e0 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -9,6 +9,7 @@
>   #include "string-list.h"
>   #include "pretty.h"
>   #include "commit-slab.h"
> +#include "strvec.h"
>   
>   #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
>   #define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> @@ -353,7 +354,7 @@ void verify_merge_signature(struct commit *commit, int verbose,
>   int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused);
>   int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
>   
> -LAST_ARG_MUST_BE_NULL
> -int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
> +int run_commit_hook(int editor_is_used, const char *index_file,
> +		    const char *name, struct strvec *args);
>   
>   #endif /* COMMIT_H */
> diff --git a/sequencer.c b/sequencer.c
> index cc3f8fa88e..5dd4b134d6 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1124,22 +1124,23 @@ static int run_prepare_commit_msg_hook(struct repository *r,
>   				       const char *commit)
>   {
>   	int ret = 0;
> -	const char *name, *arg1 = NULL, *arg2 = NULL;
> +	struct strvec args = STRVEC_INIT;
> +	const char *name = git_path_commit_editmsg();
>   
> -	name = git_path_commit_editmsg();
> +	strvec_push(&args, name);

I think you could drop name altogether and just pass 
git_path_commit_editmsg() instead.

>   	if (write_message(msg->buf, msg->len, name, 0))
>   		return -1;
>   
>   	if (commit) {
> -		arg1 = "commit";
> -		arg2 = commit;
> +		strvec_push(&args, "commit");
> +		strvec_push(&args, commit);

Complete nitpick, but the other conversions all used strvec_pushl() 
rather than two strvec_push() calls.

I don't have a strong opinion about these changes (though I'm not 
particularly enthusiastic). Having to push the arguments in order is not 
particularly convenient and the use of strvec_pushl() means we are 
replacing a small number of variadic calls to run_commit_hook() with a 
larger number of calls to a different variadic interface.

Best Wishes

Phillip

>   	} else {
> -		arg1 = "message";
> +		strvec_push(&args, "message");
>   	}
> -	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
> -			    arg1, arg2, NULL))
> +	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
>   		ret = error(_("'prepare-commit-msg' hook failed"));
>   
> +	strvec_clear(&args);
>   	return ret;
>   }
>   
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09 20:32       ` Junio C Hamano
@ 2020-09-10 19:08         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-10 19:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Sep 09, 2020 at 01:32:12PM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > Add a helper to easily determine whether any hooks exist for a given
> > hook event.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  hook.c | 9 +++++++++
> >  hook.h | 1 +
> >  2 files changed, 10 insertions(+)
> 
> Should we consider the last three patches still work-in-progress
> technology demonstration, or are these meant as a proposal for a new
> API element as-is?

The former. I'm irritated with myself for spending a long time fidgeting
with the wording on this reroll and still forgetting to mark the last
three "RFC" as I had planned to do.

> It is perfectly fine if it is the former.  I just want to make sure
> we share a common understanding on the direction in which we want
> these patches to take us.  Here is my take:
> 
>  - For now, a hook/event that is aware of the config-based hook
>    system is supposed to use hook_exists(), while the traditional
>    ones still use find_hook().  We expect more and more will be
>    converted to the former over time.
> 
>  - Invoking hook scripts under the new world order is done by
>    including hook.h and calling run_hooks(), not by driving the
>    run-command API yourself (I count run_hook_ve() as part of the
>    latter) like the traditional code did.  We expect more and more
>    will be converted to the former over time.
> 
>  - From the point of view of the end users who have been happily
>    using scripts in $GIT_DIR/hooks, everything will stay the same.
>    hook_exists() will find them (by calling find_hook() as a
>    fallback) and run_hooks() will run them (by relying on
>    hook_list() to include them).
> 
> I am guessing that the above gives us a high-level description.

Yes. I am also working on a patch locally to include a config -
optionally users could shut off the $GIT_DIR/hooks, but I don't see us
making that the default behavior any time soon (or ever).

> 
> The new interface needs to be described in hook.h once the series
> graduates from the technology demonstration state, in order to help
> others who want to help updating the callsites of traditional hooks
> to the new API.  And the above three-bullet point list is my attempt
> to figure out what kind of things need to be documented to help
> them.

Sure. Agreed. Thanks for pointing it out - I had planned on updating the
`git help hook` manpage but adding API comments in hook.h had slipped my
mind, so the reminder is useful.

> 
> I am not seeing anything in run_hooks() that consumes input from us
> over pipe, by the way, without which we cannot do things like the
> "pre-receive" hooks under the new world order.  Are they planned to
> come in the future, after these "we feed anything they need from the
> command line and from the environment" hooks are dealt with in this
> first pass?

I included this conversion to demonstrate the tech and give people
something to look at (and shout to stop if so needed). I do plan to
include hooks which need piped input; in fact, I'm hoping to target one
such for the next conversion I do. The todo list looks like so:

 1. semantics for checking hook.runHookDir config
 2. convert all the hooks which take input in interesting ways (or, just
 all the hooks)
 3. add user friendliness via 'git hook add', 'git hook edit', etc

The config semantics are in progress and I'm hoping to send this week.

As for submission plan, I don't mind including new architecture (if
unused) except for the code bloat; I'd rather push all the
"conversions" simultaneously, so users don't have to wonder "is this
hook a new and supported one, or not?".  I don't mind adding the
niceties ('git hook add' etc) later as the config is a little annoying
for a human to write themselves, but not impossible.

  - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-10 13:50       ` Phillip Wood
@ 2020-09-10 22:21         ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-09-10 22:21 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Emily Shaffer, git

Phillip Wood <phillip.wood123@gmail.com> writes:

>> +	const char *arg;
>> +	struct strvec hook_args = STRVEC_INIT;
>> +	struct strbuf hook_name = STRBUF_INIT;
>>   	int ret;
>>   +	strbuf_addstr(&hook_name, name);
>
> Seeing this makes me wonder if it would be better for run_hooks() to
> take a string for the name rather than an strbuf, I suspect that
> virtually all callers have a fixed hook name.

Yeah, that is a good point.  It is always a good discipline to keep
the type of the parameters callers need to pass to the minimum.
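
To illustrate the point, here is a minimal sketch of what a caller might
look like if the hook name were passed as a plain string. This is not
code from the series: run_hooks_sketch() is a hypothetical stand-in for
run_hooks(), and its behaviour (only "pre-commit" is treated as
configured) is invented purely so the example can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for run_hooks(); only the parameter types
 * matter here. The name is a plain C string instead of a strbuf.
 */
static int run_hooks_sketch(const char *const *env, const char *hook_name)
{
	assert(hook_name);
	(void)env;
	/* Assumption for the sketch: only "pre-commit" is configured. */
	return strcmp(hook_name, "pre-commit") ? -1 : 0;
}

/* A caller with a fixed hook name needs no temporary strbuf. */
static int run_pre_commit_sketch(void)
{
	return run_hooks_sketch(NULL, "pre-commit");
}
```

With this shape the strbuf_addstr()/strbuf_release() pair around each
call site would no longer be needed for callers with a literal name.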




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-10 14:16       ` Phillip Wood
@ 2020-09-11 13:20         ` Phillip Wood
  0 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:20 UTC (permalink / raw)
  To: Emily Shaffer, git

On 10/09/2020 15:16, Phillip Wood wrote:
> Hi Emily
> 
> On 09/09/2020 01:49, Emily Shaffer wrote:
>> Taking varargs in run_commit_hook() led to some bizarre patterns, like
>> callers using two string variables (which may or may not be filled) to
>> express different argument lists for the commit hooks. Because
>> run_commit_hook() no longer needs to call a variadic function for the
>> hook run itself, we can use strvec to make the calling code more
>> conventional.
>>
>> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
>> ---
>>   builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
>>   builtin/merge.c  | 20 ++++++++++++++++----
>>   commit.c         | 13 ++-----------
>>   commit.h         |  5 +++--
>>   sequencer.c      | 15 ++++++++-------
>>   5 files changed, 52 insertions(+), 47 deletions(-)
>>
>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index a19c6478eb..f029d4f5ac 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -691,8 +691,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       struct strbuf committer_ident = STRBUF_INIT;
>>       int committable;
>>       struct strbuf sb = STRBUF_INIT;
>> -    const char *hook_arg1 = NULL;
>> -    const char *hook_arg2 = NULL;
>> +    struct strvec hook_args = STRVEC_INIT;
>>       int clean_message_contents = (cleanup_mode != 
>> COMMIT_MSG_CLEANUP_NONE);
>>       int old_display_comment_prefix;
>>       int merge_contains_scissors = 0;
>> @@ -700,7 +699,8 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       /* This checks and barfs if author is badly specified */
>>       determine_author_info(author_ident);
>> -    if (!no_verify && run_commit_hook(use_editor, index_file, 
>> "pre-commit", NULL))
>> +    if (!no_verify && run_commit_hook(use_editor, index_file, 
>> "pre-commit",
>> +                      &hook_args))
>>           return 0;
>>       if (squash_message) {
>> @@ -722,27 +722,28 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           }
>>       }
>> +    strvec_push(&hook_args, git_path_commit_editmsg());
> 
> This is a long way from the call where we use hook_args. With the 
> variadic interface it is clear by looking at the call to 
> run_commit_hook() what the first argument is and that is always the same.
> 
>>       if (have_option_m && !fixup_message) {
>>           strbuf_addbuf(&sb, &message);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (logfile && !strcmp(logfile, "-")) {
>>           if (isatty(0))
>>               fprintf(stderr, _("(reading log message from standard 
>> input)\n"));
>>           if (strbuf_read(&sb, 0, 0) < 0)
>>               die_errno(_("could not read log from standard input"));
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (logfile) {
>>           if (strbuf_read_file(&sb, logfile, 0) < 0)
>>               die_errno(_("could not read log file '%s'"),
>>                     logfile);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (use_message) {
>>           char *buffer;
>>           buffer = strstr(use_message_buffer, "\n\n");
>>           if (buffer)
>>               strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
>> -        hook_arg1 = "commit";
>> -        hook_arg2 = use_message;
>> +        strvec_pushl(&hook_args, "commit", use_message, NULL);
>>       } else if (fixup_message) {
>>           struct pretty_print_context ctx = {0};
>>           struct commit *commit;
>> @@ -754,7 +755,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>                         &sb, &ctx);
>>           if (have_option_m)
>>               strbuf_addbuf(&sb, &message);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
>>           size_t merge_msg_start;
>> @@ -765,9 +766,9 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>>               if (strbuf_read_file(&sb, 
>> git_path_squash_msg(the_repository), 0) < 0)
>>                   die_errno(_("could not read SQUASH_MSG"));
>> -            hook_arg1 = "squash";
>> +            strvec_push(&hook_args, "squash");
>>           } else
>> -            hook_arg1 = "merge";
>> +            strvec_push(&hook_args, "merge");
>>           merge_msg_start = sb.len;
>>           if (strbuf_read_file(&sb, 
>> git_path_merge_msg(the_repository), 0) < 0)
>> @@ -781,11 +782,11 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       } else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>>           if (strbuf_read_file(&sb, 
>> git_path_squash_msg(the_repository), 0) < 0)
>>               die_errno(_("could not read SQUASH_MSG"));
>> -        hook_arg1 = "squash";
>> +        strvec_push(&hook_args, "squash");
>>       } else if (template_file) {
>>           if (strbuf_read_file(&sb, template_file, 0) < 0)
>>               die_errno(_("could not read '%s'"), template_file);
>> -        hook_arg1 = "template";
>> +        strvec_push(&hook_args, "template");
>>           clean_message_contents = 0;
>>       }
>> @@ -794,11 +795,9 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>        * just set the argument(s) to the prepare-commit-msg hook.
>>        */
>>       else if (whence == FROM_MERGE)
>> -        hook_arg1 = "merge";
>> -    else if (is_from_cherry_pick(whence) || whence == 
>> FROM_REBASE_PICK) {
>> -        hook_arg1 = "commit";
>> -        hook_arg2 = "CHERRY_PICK_HEAD";
>> -    }
>> +        strvec_push(&hook_args, "merge");
>> +    else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
>> +        strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
>>       if (squash_message) {
>>           /*
>> @@ -806,8 +805,8 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>            * then we're possibly hijacking other commit log options.
>>            * Reset the hook args to tell the real story.
>>            */
>> -        hook_arg1 = "message";
>> -        hook_arg2 = "";
>> +        strvec_clear(&hook_args);
>> +        strvec_pushl(&hook_args, git_path_commit_editmsg(), 
>> "message", NULL);
> 
> It's a shame we have to clear the strvec and remember to re-add 
> git_path_commit_editmsg() here.
> 
>>       }
>>       s->fp = fopen_for_writing(git_path_commit_editmsg());
>> @@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           return 0;
>>       }
>> -    if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
>> -                git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
>> +    if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", 
>> &hook_args))
>>           return 0;
>>       if (use_editor) {
>> @@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           strvec_clear(&env);
>>       }
>> +    strvec_clear(&hook_args);
>> +    strvec_push(&hook_args, git_path_commit_editmsg());
>>       if (!no_verify &&
>> -        run_commit_hook(use_editor, index_file, "commit-msg", 
>> git_path_commit_editmsg(), NULL)) {
>> +        run_commit_hook(use_editor, index_file, "commit-msg", 
>> &hook_args)) {
>>           return 0;
>>       }
> [...]
> 
> I don't have a strong opinion about these changes (though I'm not 
> particularly enthusiastic). Having to push the arguments in order is not 
> particularly convenient and the use of strvec_pushl() means we are 
> replacing a small number of variadic calls to run_commit_hook() with a 
> larger number of calls to a different variadic interface.

On reflection I think it is the conversion in builtin/commit.c rather 
than the change in the API that makes me uncomfortable. If it kept 
`hook_arg1` and `hook_arg2` and just did

strvec_push(&hook_args, git_path_commit_editmsg());
strvec_push(&hook_args, hook_arg1);
if (hook_arg2)
	strvec_push(&hook_args, hook_arg2);
run_commit_hook(..., &hook_args);

It would keep the fixed first argument near the call to 
run_commit_hook() and avoid the problem of having to clear hook_args in 
the hunk at line 806.
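
A self-contained sketch of that shape, for illustration only: struct
argv_sketch is a toy stand-in for strvec (not git's implementation), and
build_prepare_commit_msg_args() is a hypothetical helper name. Guarding
hook_arg1 against NULL is an assumption of the sketch, made because the
variable starts out as NULL in the existing code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for strvec: a small NULL-terminated array of borrowed strings. */
struct argv_sketch {
	const char *v[8];
	size_t nr;
};

static void argv_sketch_push(struct argv_sketch *a, const char *s)
{
	assert(a->nr < 7); /* keep room for the NULL terminator */
	a->v[a->nr++] = s;
	a->v[a->nr] = NULL;
}

/*
 * Build the prepare-commit-msg arguments right before the hook call:
 * the fixed first argument, then hook_arg1, then hook_arg2 if set.
 */
static void build_prepare_commit_msg_args(struct argv_sketch *out,
					  const char *editmsg_path,
					  const char *hook_arg1,
					  const char *hook_arg2)
{
	out->nr = 0;
	out->v[0] = NULL;
	argv_sketch_push(out, editmsg_path);
	if (hook_arg1)
		argv_sketch_push(out, hook_arg1);
	if (hook_arg2)
		argv_sketch_push(out, hook_arg2);
}
```

Because the vector is assembled in one place, the squash_message case
only has to reassign the two string variables and nothing needs to be
cleared and re-pushed.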

Thank you for adding the last couple of patches that show an example 
conversion, it is really helpful to see how the API would be used.

Best Wishes

Phillip

> Best Wishes
> 
> Phillip
> 
>>       } else {
>> -        arg1 = "message";
>> +        strvec_push(&args, "message");
>>       }
>> -    if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
>> -                arg1, arg2, NULL))
>> +    if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
>>           ret = error(_("'prepare-commit-msg' hook failed"));
>> +    strvec_clear(&args);
>>       return ret;
>>   }
>>


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
@ 2020-09-11 13:27       ` Phillip Wood
  2020-09-11 16:51         ` Emily Shaffer
  2020-09-23 23:04       ` Jonathan Tan
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:27 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> Teach 'git hook list <hookname>', which checks the known configs in
> order to create an ordered list of hooks to run on a given hook event.
> 
> Multiple commands can be specified for a given hook by providing
> multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> run in config order. If more properties need to be set on a given hook
> in the future, commands can also be specified by providing
> "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> 
> For example:
> 
>    $ git config --list | grep ^hook
>    hook.pre-commit.command=baz
>    hook.pre-commit.command=~/bar.sh
>    hookcmd.baz.command=~/baz/from/hookcmd.sh
> 
>    $ git hook list pre-commit
>    ~/baz/from/hookcmd.sh
>    ~/bar.sh
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   Documentation/git-hook.txt    |  37 +++++++++++-
>   Makefile                      |   1 +
>   builtin/hook.c                |  55 ++++++++++++++++--
>   hook.c                        | 102 ++++++++++++++++++++++++++++++++++
>   hook.h                        |  15 +++++
>   t/t1360-config-based-hooks.sh |  68 ++++++++++++++++++++++-
>   6 files changed, 271 insertions(+), 7 deletions(-)
>   create mode 100644 hook.c
>   create mode 100644 hook.h
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index 2d50c414cc..e458586e96 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
>   SYNOPSIS
>   --------
>   [verse]
> -'git hook'
> +'git hook' list <hook-name>
>   
>   DESCRIPTION
>   -----------
>   You can list, add, and modify hooks with this command.
>   
> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a
> +particular hook event; commands are run in config order. "hookcmd" is used to
> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:
> +
> +Global config
> +----
> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"
> +----
> +
> +Local config
> +----
> +  [hook "prepare-commit-msg"]
> +    command = "linter"
> +  [hook "post-commit"]
> +    command = "python ~/run-test-suite.py"
> +----

I think it would be helpful to have a couple of lines explaining what 
the example configuration sets up

> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

Thanks for clarifying that it is the origin of the 
hook.<hook-name>.command that is printed. An example of the output of 
the config above would be useful I think.

>   GIT
>   ---
>   Part of the linkgit:git[1] suite
> diff --git a/Makefile b/Makefile
> index 6eee75555e..804de45b16 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -890,6 +890,7 @@ LIB_OBJS += grep.o
>   LIB_OBJS += hashmap.o
>   LIB_OBJS += help.o
>   LIB_OBJS += hex.o
> +LIB_OBJS += hook.o
>   LIB_OBJS += ident.o
>   LIB_OBJS += interdiff.o
>   LIB_OBJS += json-writer.o
> diff --git a/builtin/hook.c b/builtin/hook.c
> index b2bbc84d4d..a0759a4c26 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -1,21 +1,68 @@
>   #include "cache.h"
>   
>   #include "builtin.h"
> +#include "config.h"
> +#include "hook.h"
>   #include "parse-options.h"
> +#include "strbuf.h"
>   
>   static const char * const builtin_hook_usage[] = {
> -	N_("git hook"),
> +	N_("git hook list <hookname>"),
>   	NULL
>   };
>   
> -int cmd_hook(int argc, const char **argv, const char *prefix)
> +static int list(int argc, const char **argv, const char *prefix)
>   {
> -	struct option builtin_hook_options[] = {
> +	struct list_head *head, *pos;
> +	struct hook *item;
> +	struct strbuf hookname = STRBUF_INIT;
> +
> +	struct option list_options[] = {
>   		OPT_END(),
>   	};
>   
> -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> +	argc = parse_options(argc, argv, prefix, list_options,
>   			     builtin_hook_usage, 0);
>   
> +	if (argc < 1) {
> +		usage_msg_opt("a hookname must be provided to operate on.",
> +			      builtin_hook_usage, list_options);
> +	}
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +
> +	head = hook_list(&hookname);
> +
> +	if (list_empty(head)) {
> +		printf(_("no commands configured for hook '%s'\n"),
> +		       hookname.buf);
> +		return 0;
> +	}
> +
> +	list_for_each(pos, head) {
> +		item = list_entry(pos, struct hook, list);
> +		if (item)
> +			printf("%s:\t%s\n",
> +			       config_scope_name(item->origin),
> +			       item->command.buf);
> +	}
> +
> +	clear_hook_list();
> +	strbuf_release(&hookname);
> +
>   	return 0;
>   }
> +
> +int cmd_hook(int argc, const char **argv, const char *prefix)
> +{
> +	struct option builtin_hook_options[] = {
> +		OPT_END(),
> +	};
> +	if (argc < 2)
> +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> +
> +	if (!strcmp(argv[1], "list"))
> +		return list(argc - 1, argv + 1, prefix);
> +
> +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> +}
> diff --git a/hook.c b/hook.c
> new file mode 100644
> index 0000000000..b006950eb8
> --- /dev/null
> +++ b/hook.c
> @@ -0,0 +1,102 @@
> +#include "cache.h"
> +
> +#include "hook.h"
> +#include "config.h"
> +
> +/*
> + * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> + * background at the same time - which might be ok, or might not.
> + *
> + * Maybe it's better to cache a list head per hookname, since we can probably
> + * guess that the hook list won't change during a user-initiated operation. For
> + * now, within list_hooks, call clear_hook_list() at the outset.
> + */
> +static LIST_HEAD(hook_head);

I can see a cache might be useful for the sequencer, which needs to run 
the prepare-commit-msg hook for each commit (it should probably not be 
running the post-commit hook but does at the moment), and for am, which 
runs some hooks for each patch. Until then I'm not sure why we need a 
global variable here; can't we just declare `hook_head` in `hook_list()`?
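
As a rough sketch of the alternative (an illustration under my own
assumptions, not a proposed replacement): the list could be owned by the
caller, so two hook events can have their lists alive at the same time.
All names below (struct hook_sketch, hook_list_sketch_*) are
hypothetical, and a plain singly linked list is used instead of list.h
to keep the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct hook_sketch {
	struct hook_sketch *next;
	char *command;
};

/* Caller-owned list: no file-scope state is involved. */
struct hook_list_sketch {
	struct hook_sketch *head;
	struct hook_sketch **tail;
	size_t nr;
};

static void hook_list_sketch_init(struct hook_list_sketch *list)
{
	list->head = NULL;
	list->tail = &list->head;
	list->nr = 0;
}

/* Append a copy of command, preserving insertion (config) order. */
static void hook_list_sketch_append(struct hook_list_sketch *list,
				    const char *command)
{
	size_t len = strlen(command) + 1;
	struct hook_sketch *hook = malloc(sizeof(*hook));

	assert(hook);
	hook->command = malloc(len);
	assert(hook->command);
	memcpy(hook->command, command, len);
	hook->next = NULL;
	*list->tail = hook;
	list->tail = &hook->next;
	list->nr++;
}

/* Free every entry and leave the list empty and reusable. */
static void hook_list_sketch_clear(struct hook_list_sketch *list)
{
	struct hook_sketch *hook = list->head;

	while (hook) {
		struct hook_sketch *next = hook->next;
		free(hook->command);
		free(hook);
		hook = next;
	}
	hook_list_sketch_init(list);
}
```

The config callback would then receive the list through its callback
data pointer rather than reaching for a static variable.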

> +void free_hook(struct hook *ptr)
> +{
> +	if (ptr) {
> +		strbuf_release(&ptr->command);
> +		free(ptr);
> +	}
> +}
> +
> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	/* even with use_shell, run_command() needs quotes */
> +	strbuf_addf(&to_add->command, "'%s'", command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}
> +
> +static void remove_hook(struct list_head *to_remove)
> +{
> +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> +	list_del(to_remove);
> +	free_hook(hook_to_remove);
> +}
> +
> +void clear_hook_list(void)
> +{
> +	struct list_head *pos, *tmp;
> +	list_for_each_safe(pos, tmp, &hook_head)
> +		remove_hook(pos);
> +}
> +
> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> +{
> +	const char *hook_key = hook_key_cb;
> +
> +	if (!strcmp(key, hook_key)) {
> +		const char *command = value;
> +		struct strbuf hookcmd_name = STRBUF_INIT;
> +		struct list_head *pos = NULL, *tmp = NULL;
> +
> +		/* Check if a hookcmd with that name exists. */
> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> +		git_config_get_value(hookcmd_name.buf, &command);
> +
> +		if (!command)
> +			BUG("git_config_get_value overwrote a string it shouldn't have");
> +
> +		/*
> +		 * TODO: implement an option-getting callback, e.g.
> +		 *   get configs by pattern hookcmd.$value.*
> +		 *   for each key+value, do_callback(key, value, cb_data)
> +		 */
> +
> +		list_for_each_safe(pos, tmp, &hook_head) {
> +			struct hook *hook = list_entry(pos, struct hook, list);
> +			/*
> +			 * The list of hooks to run can be reordered by being redeclared
> +			 * in the config. Options about hook ordering should be checked
> +			 * here.
> +			 */
> +			if (0 == strcmp(hook->command.buf, command))

We normally write this as !strcmp(...)

> +				remove_hook(pos);
> +		}
> +		emplace_hook(pos, command);
> +	}
> +
> +	return 0;
> +}
> +
> +struct list_head* hook_list(const struct strbuf* hookname)
> +{
> +	struct strbuf hook_key = STRBUF_INIT;
> +
> +	if (!hookname)
> +		return NULL;
> +
> +	/* hook_head is stateful */
> +	clear_hook_list();
> +
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> +
> +	git_config(hook_config_lookup, (void*)hook_key.buf);
> +
> +	return &hook_head;
> +}
> diff --git a/hook.h b/hook.h
> new file mode 100644
> index 0000000000..aaf6511cff
> --- /dev/null
> +++ b/hook.h
> @@ -0,0 +1,15 @@
> +#include "config.h"
> +#include "list.h"
> +#include "strbuf.h"
> +
> +struct hook
> +{
> +	struct list_head list;
> +	enum config_scope origin;
> +	struct strbuf command;
> +};
> +
> +struct list_head* hook_list(const struct strbuf *hookname);
> +
> +void free_hook(struct hook *ptr);
> +void clear_hook_list(void);
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..46d1ed354a 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
>   
>   . ./test-lib.sh
>   
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +ROOT=
> +if test_have_prereq MINGW
> +then
> +	# In Git for Windows, Unix-like paths work only in shell scripts;
> +	# `git.exe`, however, will prefix them with the pseudo root directory
> +	# (of the Unix shell). Let's accommodate for that.
> +	ROOT="$(cd / && pwd)"
> +fi
> +
> +setup_hooks () {
> +	test_config hook.pre-commit.command "/path/ghi" --add
> +	test_config_global hook.pre-commit.command "/path/def" --add
> +}
> +
> +setup_hookcmd () {
> +	test_config hook.pre-commit.command "abc" --add
> +	test_config_global hookcmd.abc.command "/path/abc" --add
> +}
> +
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'

Thanks for changing the tests to be independent of each other

Best Wishes

Phillip

> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	setup_hooks &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	setup_hooks &&
> +	setup_hookcmd &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi
> +	local:	$ROOT/path/abc
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	setup_hooks &&
> +
> +	test_config hook.pre-commit.command "/path/def" --add &&
> +
> +	cat >expected <<-EOF &&
> +	local:	$ROOT/path/ghi
> +	local:	$ROOT/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>   '
>   
>   test_done
> 



* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-11 13:30       ` Phillip Wood
  2020-09-28 19:29       ` Josh Steadmon
  2020-10-05 23:39       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:30 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will in config order, in series. As alternate
> ordering or parallelism is supported in the future, we should add knobs
> to use those to the command line as well.
> 
> As with the legacy hook implementation, all stdout generated by hook
> commands is redirected to stderr. Piping from stdin is not yet
> supported.
> 
> Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> execution list. For now, there is no way to disable them.
> 
> Users may wish to provide hook commands like 'git config
> hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> first split by space or quotes into an argv_array, then expanded with
> 'expand_user_path()'.
> 
> [...]
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index ebf8f38d68..ee8114250d 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
>   	test_cmp expected actual
>   '
>   
> +test_expect_success 'inline hook definitions execute oneliners' '
> +	test_config hook.pre-commit.command "echo \"Hello World\"" &&
> +
> +	echo "Hello World" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'inline hook definitions resolve paths' '
> +	cat >~/sample-hook.sh <<-EOF &&
> +	echo \"Sample Hook\"
> +	EOF

I think this could use `write_script`. I'm rather scared of the '~' in
the script path; can we write it to the test directory instead, please?

Best Wishes

Phillip

> +	test_when_finished "rm ~/sample-hook.sh" &&
> +
> +	chmod +x ~/sample-hook.sh &&
> +
> +	test_config hook.pre-commit.command "~/sample-hook.sh" &&
> +
> +	echo \"Sample Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
>   test_done
> 



* Re: [PATCH v4 3/9] hook: add list command
  2020-09-11 13:27       ` Phillip Wood
@ 2020-09-11 16:51         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-11 16:51 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, Sep 11, 2020 at 02:27:42PM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> > +Global config
> > +----
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> > +----
> > +
> > +Local config
> > +----
> > +  [hook "prepare-commit-msg"]
> > +    command = "linter"
> > +  [hook "post-commit"]
> > +    command = "python ~/run-test-suite.py"
> > +----
> 
> I think it would be helpful to have a couple of lines explaining what the
> example configuration sets up

Sure.

> 
> > +COMMANDS
> > +--------
> > +
> > +list <hook-name>::
> > +
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> 
> Thanks for clarifying that it is the origin of the hook.<hook-name>.command
> that is printed. An example of the output of the config above would be
> useful I think.

Oh, that's a good idea - you're absolutely right. I'll do that.

> > +/*
> > + * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> > + * background at the same time - which might be ok, or might not.
> > + *
> > + * Maybe it's better to cache a list head per hookname, since we can probably
> > + * guess that the hook list won't change during a user-initiated operation. For
> > + * now, within list_hooks, call clear_hook_list() at the outset.
> > + */
> > +static LIST_HEAD(hook_head);
> 
> I can see a cache might be useful for the sequencer which needs to run the
> prepare-msg hook for each commit (it should probably not be running the
> post-commit hook but does at the moment) and for am which runs some hooks
> for each patch but until then I'm not sure why we need a global variable
> here, can't we just declare `hook_head` in `list_hook()`?

Yeah, I agree. I'll make that change with the next reroll.
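
A rough sketch of what a caller-owned list could look like, assuming a
simplified singly linked list instead of Git's list.h. All names here
(`struct hook_entry`, `struct hook_entries`, `collect_hooks()`,
`hook_entries_clear()`) are hypothetical and only stand in for the patch's
types; the input array stands in for whatever the config walk finds:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's struct hook and list_head. */
struct hook_entry {
	struct hook_entry *next;
	char *command;
};

struct hook_entries {
	struct hook_entry *head;
	struct hook_entry *tail;
};

/* Free every entry and leave the list empty and reusable. */
static void hook_entries_clear(struct hook_entries *list)
{
	struct hook_entry *e = list->head;

	while (e) {
		struct hook_entry *next = e->next;
		free(e->command);
		free(e);
		e = next;
	}
	list->head = list->tail = NULL;
}

/*
 * Fill a caller-owned list rather than a file-scope hook_head.
 * Returns 0 on success, -1 on allocation failure (the list is
 * cleared in that case).
 */
static int collect_hooks(struct hook_entries *out,
			 const char *const *commands, size_t nr)
{
	size_t i;

	out->head = out->tail = NULL;
	for (i = 0; i < nr; i++) {
		size_t len = strlen(commands[i]) + 1;
		struct hook_entry *e = malloc(sizeof(*e));

		if (!e || !(e->command = malloc(len))) {
			free(e);
			hook_entries_clear(out);
			return -1;
		}
		memcpy(e->command, commands[i], len);
		e->next = NULL;
		if (out->tail)
			out->tail->next = e;
		else
			out->head = e;
		out->tail = e;
	}
	return 0;
}
```

Because each caller passes its own `struct hook_entries`, two hook events
can hold their lists at the same time without clearing each other's state.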

Thanks for reading.
 - Emily


* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
@ 2020-09-23 22:59       ` Jonathan Tan
  2020-09-24 21:54         ` Emily Shaffer
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 22:59 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

For this review, I'll just concern myself with overall design and
structure.

For this patch, overall I think it's better if there's a clear
distinction between what we are implementing now and what we are
implementing later.

> +[[motivation]]
> +== Motivation
> +
> +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> +the only source of hooks to execute, in a way which is friendly to users with
> +multiple repos which have similar needs.

I don't understand what "first-class citizen" here means - probably
better to just omit that phrase and describe the new way of doing hooks.

> +[[config-schema-hook]]
> +==== `hook`
> +
> +Primarily contains subsections for each hook event. These order of these
> +subsections defines the hook command execution order

The execution order is defined by the order of a multivalue config
variable, I think, not the order of subsections? Besides, I believe that
there's one subsection per hook event (e.g. hook."pre-commit"), not one
subsection per command.

> ; hook commands can be
> +specified by setting the value directly to the command if no additional
> +configuration is needed, or by setting the value as the name of a `hookcmd`. If
> +Git does not find a `hookcmd` whose subsection matches the value of the given
> +command string, Git will try to execute the string directly. Hooks are executed
> +by passing the resolved command string to the shell.

[snip]

> Hook event subsections can
> +also contain per-hook-event settings.

If this is not yet implemented, maybe list under "future work".

> +
> +Also contains top-level hook execution settings, for example,
> +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
> +described more in <<library,Library>>.)

I think it's clearer if you list this under "future work" - I didn't see
any implementation of this.

> +[hook "pre-commit"]
> +  command = perl-linter
> +  command = /usr/bin/git-secrets --pre-commit
> +
> +[hook "pre-applypatch"]
> +  command = perl-linter
> +  error = ignore

Is "error" implemented?

> +
> +[hook]
> +  runHookDir = interactive

Same question for "runHookDir".

> +[[config-schema-hookcmd]]
> +==== `hookcmd`
> +
> +Defines a hook command and its attributes, which will be used when a hook event
> +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> +events, but event-specific attributes can also be supplied. The example runs
> +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> +include this config, the hook command will be skipped for all events to which
> +it's normally subscribed _except_ `pre-commit`.
> +
> +----
> +[hookcmd "perl-linter"]
> +  command = /usr/bin/lint-it --language=perl
> +  skip = true
> +  pre-commit-skip = false
> +----

And the skips. (And several more below which I will skip.)

> +If the caller wants to do something more complicated, the hook library can also
> +provide a callback API:
> +
> +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*

Is there a use case that would need such a function?

> +[[migration]]
> +=== Migration path
> +
> +[[stage-0]]
> +==== Stage 0
> +
> +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> +executing the result. The hook library and builtin do not exist. Hooks only
> +exist as specially named scripts within `.git/hooks/`.
> +
> +[[stage-1]]
> +==== Stage 1
> +
> +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> +output. Modifier commands like `git hook add` and `git hook edit` can be
> +implemented around this time as well.

This seems to contradict patch 8, which teaches Git to use the configs
directly without any change to .git/hooks/<hook-event> (at least for
certain commit-related hooks).

> +[[future-work]]
> +== Future work
> +
> +[[execution-ordering]]
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like a
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
> +
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.

With this schema, and with the "skip" behavior described above (but not
implemented in this patch set), rudimentary ordering can already be
done; because a hook is removed and reinserted whenever it appears in
the config, even a hook X in the system config can be made to run after a
hook Y in the worktree config by adding Y then X in the worktree config,
and if we want to disable X instead, we can just add "skip" to X.
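
To make the remove-and-reinsert behavior concrete, here is a small sketch
under stated assumptions: `struct hook_order` and `hook_order_add()` are
hypothetical, the array is a stand-in for the real hook list, and the calls
are assumed to arrive in the order Git reads config (system first, worktree
later):

```c
#include <string.h>

#define MAX_HOOKS 16

/* Hypothetical stand-in for the hook list; entries are borrowed strings. */
struct hook_order {
	const char *cmd[MAX_HOOKS];
	size_t nr;
};

/*
 * Mimic "a hook is removed and reinserted whenever it appears in the
 * config": drop any earlier occurrence of command, then append it.
 * Returns 0 on success, -1 if the fixed-size array is full.
 */
static int hook_order_add(struct hook_order *o, const char *command)
{
	size_t i, j = 0;

	/* compact the array, skipping entries equal to command */
	for (i = 0; i < o->nr; i++)
		if (strcmp(o->cmd[i], command))
			o->cmd[j++] = o->cmd[i];
	o->nr = j;

	if (o->nr == MAX_HOOKS)
		return -1;
	o->cmd[o->nr++] = command;
	return 0;
}
```

Feeding it "X" (system config), then "Y" and "X" (worktree config) leaves
the list as Y, X, which is the ordering described above.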


* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
@ 2020-09-23 23:04       ` Jonathan Tan
  2020-10-06 20:46         ` Emily Shaffer
  2020-09-27 19:23       ` Martin Ågren
  2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:04 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

>   $ git hook list pre-commit
>   ~/baz/from/hookcmd.sh
>   ~/bar.sh

In the tests below, there is a "local:" prefix (or similar). It's
clearer if the commit message has that too.

Also, looking at a later commit, the "list" command probably should
include the legacy hook if it exists.

> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	/* even with use_shell, run_command() needs quotes */
> +	strbuf_addf(&to_add->command, "'%s'", command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}

It might be odd to a programmer reading this that an existing "struct
hook" with the same name is not reused - the scanning of the list done
in hook_config_lookup() could probably go here instead.
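
One possible shape for that, as a sketch only: `struct hook_node` and
`emplace_or_move()` are hypothetical names, the list is a plain singly
linked list rather than list.h, and `origin` is an int standing in for
enum config_scope. The existing node is reused and moved to the tail, and
its origin is updated to the scope of the latest declaration (which is what
the "reorders on duplicate commands" test appears to expect):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplification of the patch's struct hook. */
struct hook_node {
	struct hook_node *next;
	const char *command;	/* borrowed, not copied, to keep this short */
	int origin;		/* stand-in for enum config_scope */
};

/*
 * Append command to the list. If a node with the same command already
 * exists, unlink it and reuse it instead of allocating a new one.
 * Returns the node, or NULL on allocation failure.
 */
static struct hook_node *emplace_or_move(struct hook_node **head,
					 const char *command, int origin)
{
	struct hook_node **pp = head;
	struct hook_node *found = NULL;

	while (*pp) {
		if (!found && !strcmp((*pp)->command, command)) {
			found = *pp;
			*pp = found->next;	/* unlink; pp now names the successor */
		} else {
			pp = &(*pp)->next;
		}
	}
	/* pp now points at the NULL link that terminates the list */

	if (!found) {
		found = malloc(sizeof(*found));
		if (!found)
			return NULL;
		found->command = command;
	}
	found->origin = origin;
	found->next = NULL;
	*pp = found;
	return found;
}
```

With this, the duplicate scan lives next to the insertion, so the config
callback only has to call one function per value.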

> +test_expect_success 'git hook list orders by config order' '
> +	setup_hooks &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi

Will the "global" strings etc. be translated? If yes, it's probably not
worth it to align the paths in this way.


* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
@ 2020-09-23 23:20       ` Jonathan Tan
  2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:20 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}

I was surprised that this didn't share code with hook_list. Upon further
thought, hook_list might be expensive if hooks are present, but if we
can cache results, I think it's worth it. A caller that calls this
function usually will run hooks if they are present, so it's not wasted
work to construct the hook list.
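
As a sketch of how the two could share code, assuming a one-entry cache:
everything below is hypothetical (`struct hook_cache`, `slow_lookup()`,
`cached_hook_list()`, `hook_exists_sketch()`), and `slow_lookup()` merely
pretends to be the config walk plus the legacy hook check:

```c
#include <stddef.h>
#include <string.h>

#define MAX_CMDS 8

/* Hypothetical one-entry cache; the patch has no such structure. */
struct hook_cache {
	char hookname[64];
	int valid;
	size_t nr;
	const char *cmds[MAX_CMDS];
};

/* Counts how often the expensive lookup ran; for demonstration only. */
static int lookup_calls;

/*
 * Stand-in for the real lookup. It pretends that only "pre-commit"
 * has a configured command.
 */
static size_t slow_lookup(const char *hookname, const char **out, size_t max)
{
	lookup_calls++;
	if (max && !strcmp(hookname, "pre-commit")) {
		out[0] = "/path/def";
		return 1;
	}
	return 0;
}

/* Build the list once per hookname and hand back the cached copy. */
static const struct hook_cache *cached_hook_list(struct hook_cache *c,
						 const char *hookname)
{
	if (!c->valid || strcmp(c->hookname, hookname)) {
		if (strlen(hookname) >= sizeof(c->hookname))
			return NULL;
		strcpy(c->hookname, hookname);
		c->nr = slow_lookup(hookname, c->cmds, MAX_CMDS);
		c->valid = 1;
	}
	return c;
}

/* "Does a hook exist?" becomes "is the cached list non-empty?". */
static int hook_exists_sketch(struct hook_cache *c, const char *hookname)
{
	const struct hook_cache *list = cached_hook_list(c, hookname);

	return list && list->nr > 0;
}
```

A caller that checks for existence and then runs the hooks would pay for
the lookup only once, as long as the hookname does not change in between.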


* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
  2020-09-10 13:50       ` Phillip Wood
@ 2020-09-23 23:47       ` Jonathan Tan
  2020-10-05 21:27         ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:47 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> -	if (!no_verify && find_hook("pre-commit")) {
> +	if (!no_verify && hook_exists("pre-commit")) {

A reviewer would probably need to look at all instances of "pre-commit"
(and likewise for the other hooks) but if the plan is to convert all
hooks, then the reviewer wouldn't need to do this since we could just
delete the "find_hook" function.

Overall comments about the design and scope of the patch set:

 - I think that the abilities of the current patch set regarding
   overriding order of globally-set hook commands is sufficient. We
   should also have some way of disabling globally-set hooks, perhaps
   by implementing the "skip" variable mentioned in patch 1 or by
   allowing the redefinition of hookcmd sections (e.g. by redefining a
   command to "/usr/bin/true"). To me, these provide substantial
   user-facing value, and would be sufficient for a first version - and
   other things like parallelization can come later.

 - As for the UI that should be exposed through the "git hook" command,
   I think that "git hook list" and "git hook run" are sufficient.
   Editing the config files are not too difficult, and "git hook add"
   etc. can be added later.

 - As for whether (1) it is OK for none of the hooks to be converted (and
   instead rely on the user to edit their hook scripts to call "git hook
   run ???"), or if (2) we should require some hooks to be
   converted, or if (3) we should require all hooks to be converted: I'd
   rather have (2) or (3) so that we don't have dead code. I prefer (3),
   especially since a reviewer wouldn't have to worry about leftover
   usages of old functions like find_hook() (as I mentioned at the start
   of this email), but I'm not fully opposed to (2) either.


* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-23 22:59       ` Jonathan Tan
@ 2020-09-24 21:54         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-24 21:54 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Wed, Sep 23, 2020 at 03:59:10PM -0700, Jonathan Tan wrote:
> 
> For this review, I'll just concern myself with overall design and
> structure.

Thanks - the design doc is now slightly old, so it's nice to have some
fresh eyes on it.

> 
> For this patch, overall I think it's better if there's a clear
> distinction between what we are implementing now and what we are
> implementing later.

I took a light hand when I checked for this - the topic isn't complete
yet, and there's some work in the design doc which I want to include in
this topic, but which hasn't been sent around (or written) yet.

> 
> > +[[motivation]]
> > +== Motivation
> > +
> > +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> > +the only source of hooks to execute, in a way which is friendly to users with
> > +multiple repos which have similar needs.
> 
> I don't understand what "first-class citizen" here means - probably
> better to just omit that phrase and describe the new way of doing hooks.

Sure.

> 
> > +[[config-schema-hook]]
> > +==== `hook`
> > +
> > +Primarily contains subsections for each hook event. These order of these
> > +subsections defines the hook command execution order
> 
> The execution order is defined by the order of a multivalue config
> variable, I think, not the order of subsections? Besides, I believe that
> there's one subsection per hook event (e.g. hook."pre-commit"), not one
> subsection per command.

Ok. Have changed to "The order of variables in these subsections
defines..."

> 
> > ; hook commands can be
> > +specified by setting the value directly to the command if no additional
> > +configuration is needed, or by setting the value as the name of a `hookcmd`. If
> > +Git does not find a `hookcmd` whose subsection matches the value of the given
> > +command string, Git will try to execute the string directly. Hooks are executed
> > +by passing the resolved command string to the shell.
> 
> [snip]
> 
> > Hook event subsections can
> > +also contain per-hook-event settings.
> 
> If this is not yet implemented, maybe list under "future work".

Good idea. Done.

> 
> > +
> > +Also contains top-level hook execution settings, for example,
> > +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
> > +described more in <<library,Library>>.)
> 
> I think it's clearer if you list this under "future work" - I didn't see
> any implementation of this.

Yeah, this is out of sync with what the implementation ended up looking
like; disableAll might still be a useful thing to include in the initial
feature topic, so I won't remove it, but warnHookDir is not necessary.

> 
> > +[hook "pre-commit"]
> > +  command = perl-linter
> > +  command = /usr/bin/git-secrets --pre-commit
> > +
> > +[hook "pre-applypatch"]
> > +  command = perl-linter
> > +  error = ignore
> 
> Is "error" implemented?

No, have marked it with a comment.

> 
> > +
> > +[hook]
> > +  runHookDir = interactive
> 
> Same question for "runHookDir".

It is in the reroll I'm trying to get out this week :)

> 
> > +[[config-schema-hookcmd]]
> > +==== `hookcmd`
> > +
> > +Defines a hook command and its attributes, which will be used when a hook event
> > +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> > +events, but event-specific attributes can also be supplied. The example runs
> > +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> > +include this config, the hook command will be skipped for all events to which
> > +it's normally subscribed _except_ `pre-commit`.
> > +
> > +----
> > +[hookcmd "perl-linter"]
> > +  command = /usr/bin/lint-it --language=perl
> > +  skip = true
> > +  pre-commit-skip = false
> > +----
> 
> And the skips. (And several more below which I will skip.)

Again, this is in the reroll I'm working on.

> 
> > +If the caller wants to do something more complicated, the hook library can also
> > +provide a callback API:
> > +
> > +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> 
> Is there a use case that would need such a function?

I'm not sure yet - but I'm not quite ready to cut it from the design
doc, until I finish migrating the existing hooks and know that it's not
needed. At that point it'll make sense to move it into the future work
section.

> 
> > +[[migration]]
> > +=== Migration path
> > +
> > +[[stage-0]]
> > +==== Stage 0
> > +
> > +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> > +executing the result. The hook library and builtin do not exist. Hooks only
> > +exist as specially named scripts within `.git/hooks/`.
> > +
> > +[[stage-1]]
> > +==== Stage 1
> > +
> > +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> > +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> > +output. Modifier commands like `git hook add` and `git hook edit` can be
> > +implemented around this time as well.
> 
> This seems to contradict patch 8, which teaches Git to use the configs
> directly without any change to .git/hooks/<hook-event> (at least for
> certain commit-related hooks).

Yeah, I think this needs to be rephrased; at this point locally I've
completely removed the --porcelain patch - I'm pretty sure it needs to
be a format string instead.

> 
> > +[[future-work]]
> > +== Future work
> > +
> > +[[execution-ordering]]
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like a
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> > +
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> 
> With this schema, and with the "skip" behavior described above (but not
> implemented in this patch set), rudimentary ordering can already be
> done; because a hook is removed and reinserted whenever it appears in
> the config, even a hook X in the system config can be made to run after a
> hook Y in the worktree config by adding Y then X in the worktree config,
> and if we want to disable X instead, we can just add "skip" to X.

Yep, that's why reordering is in the future work section :)

The problem with config ordering is this: if I want everyone in
my enterprise to run 'git-secrets --prepare-commit-msg' as the very last
prepare-commit-msg hook, but I can only ship them an /etc/gitconfig,
then the best I can do is email my users and ask them to run 'git config
hook.prepare-commit-msg.command git-secrets-prepare-commit-msg' in every
new repo and include a 'hookcmd.git-secrets-prepare-commit-msg.command'
config in the /etc/gitconfig I ship. (I mention git-secrets here because
it's possible other hooks could have introduced credential secrets into
my user's commit message after git-secrets already ran.)

 - Emily


* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
  2020-09-23 23:04       ` Jonathan Tan
@ 2020-09-27 19:23       ` Martin Ågren
  2020-10-06 20:20         ` Emily Shaffer
  2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 1 reply; 170+ messages in thread
From: Martin Ågren @ 2020-09-27 19:23 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Git Mailing List

Hi Emily,

On Wed, 9 Sep 2020 at 02:54, Emily Shaffer <emilyshaffer@google.com> wrote:

>  DESCRIPTION
>  -----------
>  You can list, add, and modify hooks with this command.

(BTW, I think this patch could teach this to say "You can list hooks
with this command." If/when we add the other commands, we can expand
on this.)

> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a

I propose s/"hook"/`hook`/ and similar to set this as monospace since we
are discussing configuration sections. If we want to avoid starting
sentences with "hook" (or `hookcmd`; do we?), maybe something like "The
section `hook` ..." would work fine.

> +particular hook event; commands are run in config order. "hookcmd" is used to

"config order" feels a bit too colloquial/vague. You use the same phrase
in the commit message and I think it works well there for the intended
audience. But for this document, I'm not so sure. How about

  Commands are run in the order they are encountered as the Git
  configuration files are processed (see linkgit:git-config[1]).

? It's also quite possible that "config order" hits the exact right tone
-- please trust your judgment.

> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:

> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"

Hmm. "hook", "command" and "hookcmd". Should that be "cmd", or
"hookcommand"? I'd favour the latter, but the current proposal somehow
feels asymmetric. (If code uses, and is consistent about using,
"hookcmd" that's another thing entirely, I think. It's just that for the
configuration, it looks a bit odd.)

> +List the hooks which have been configured for <hook-name>. Hooks appear

`<hook-name>` with backticks.

> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

I had to read and re-read this a few times. The "and note the" does not
mean "and please observe that", but rather "and they make note of". Not
sure how that can be done clearer. The second thing that tripped me up
was that last part. Maybe end the sentence after "specified", then add
something like "The scope is not affected by if and where
`hookcmd.<hook-name>.command` appears.".

I think you could add

  CONFIGURATION
  -------------
  include::config/hook.txt[]

here and add such a file

  hook.<hook-name>.command::
         ...

  hookcmd.<hook-name>.command::
         ...

where you define/describe those items. And you can include it from
config.txt as well.

Martin


* Re: [PATCH v4 4/9] hook: add --porcelain to list command
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
@ 2020-09-28 19:29       ` Josh Steadmon
  0 siblings, 0 replies; 170+ messages in thread
From: Josh Steadmon @ 2020-09-28 19:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On 2020.09.08 17:49, Emily Shaffer wrote:
> Teach 'git hook list --porcelain <hookname>', which prints simply the
> commands to be run in the order suggested by the config. This option is
> intended for use by user scripts, wrappers, or out-of-process Git
> commands which still want to execute hooks. For example, the following
> snippet might be added to git-send-email.perl to introduce a
> `pre-send-email` hook:
> 
>   sub pre_send_email {
>     open(my $fh, 'git hook list --porcelain pre-send-email |');
>     chomp(my @hooks = <$fh>);
>     close($fh);
> 
>     foreach $hook (@hooks) {
>             system $hook
>     }
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/git-hook.txt    | 13 +++++++++++--
>  builtin/hook.c                | 17 +++++++++++++----
>  t/t1360-config-based-hooks.sh | 12 ++++++++++++
>  3 files changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index e458586e96..0854035ce2 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,7 +8,7 @@ git-hook - Manage configured hooks
>  SYNOPSIS
>  --------
>  [verse]
> -'git hook' list <hook-name>
> +'git hook' list [--porcelain] <hook-name>
>  
>  DESCRIPTION
>  -----------
> @@ -43,11 +43,20 @@ Local config
>  COMMANDS
>  --------
>  
> -list <hook-name>::
> +list [--porcelain] <hook-name>::
>  
>  List the hooks which have been configured for <hook-name>. Hooks appear
>  in the order they should be run, and note the config scope where the relevant
>  `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> ++
> +If `--porcelain` is specified, instead print the commands alone, separated by
> +newlines, for easy parsing by a script.
> +
> +OPTIONS
> +-------
> +--porcelain::
> +	With `list`, print the commands in the order they should be run,
> +	separated by newlines, for easy parsing by a script.

Rather than a hard-coded porcelain format, perhaps we could accept a
format string to allow callers to specify which items they want, for
greater forwards-compatibility?

Also, we may want a "-z / --null" option like in `git config` to delimit
by null bytes rather than newlines, in case any commands end up with
embedded newlines.
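
A small sketch of why the terminator matters, with the caveat that
`format_commands()` is a hypothetical helper and not how builtin/hook.c
prints anything: with a NUL terminator, a command that itself contains a
newline still occupies exactly one record.

```c
#include <string.h>

/*
 * Hypothetical helper: write each command into buf, followed by either
 * '\n' or '\0'. Returns the number of bytes written, or 0 if the output
 * does not fit into size bytes.
 */
static size_t format_commands(char *buf, size_t size,
			      const char *const *cmds, size_t nr,
			      int nul_terminated)
{
	size_t used = 0, i;

	for (i = 0; i < nr; i++) {
		size_t len = strlen(cmds[i]);

		if (used + len + 1 > size)
			return 0;
		memcpy(buf + used, cmds[i], len);
		used += len;
		buf[used++] = nul_terminated ? '\0' : '\n';
	}
	return used;
}
```

A reader splitting on '\n' would see three lines for the two commands
"echo a<newline>b" and "/path/def", while a reader splitting on NUL sees
two records.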


* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
@ 2020-09-28 19:29       ` Josh Steadmon
  2020-10-05 23:39       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Josh Steadmon @ 2020-09-28 19:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On 2020.09.08 17:49, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will in config order, in series. As alternate

Looks like a small typo here:
s/will in config order/will run in config order/


* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-23 23:47       ` Jonathan Tan
@ 2020-10-05 21:27         ` Emily Shaffer
  2020-10-05 23:48           ` Jonathan Nieder
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-05 21:27 UTC (permalink / raw)
  To: Jonathan Tan, Junio C Hamano; +Cc: git

On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:
> 
> > -	if (!no_verify && find_hook("pre-commit")) {
> > +	if (!no_verify && hook_exists("pre-commit")) {
> 
> A reviewer would probably need to look at all instances of "pre-commit"
> (and likewise for the other hooks) but if the plan is to convert all
> hooks, then the reviewer wouldn't need to do this since we could just
> delete the "find_hook" function.
> 
> Overall comments about the design and scope of the patch set:
> 
>  - I think that the abilities of the current patch set regarding
>    overriding order of globally-set hook commands is sufficient. We
>    should also have some way of disabling globally-set hooks, perhaps
>    by implementing the "skip" variable mentioned in patch 1 or by
>    allowing the redefinition of hookcmd sections (e.g. by redefining a
>    command to "/usr/bin/true"). To me, these provide substantial
>    user-facing value, and would be sufficient for a first version - and
>    other things like parallelization can come later.

OK. I will send 'skip' in the next reroll. Thanks for pointing it out!

> 
>  - As for the UI that should be exposed through the "git hook" command,
>    I think that "git hook list" and "git hook run" are sufficient.
>    Editing the config files are not too difficult, and "git hook add"
>    etc. can be added later.
> 
>  - As for whether (1) it is OK for none of the hooks to be converted (and
>    instead rely on the user to edit their hook scripts to call "git hook
>    run ???"), or if (2) we should require some hooks to be
>    converted, or if (3) we should require all hooks to be converted: I'd
>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
>    especially since a reviewer wouldn't have to worry about leftover
>    usages of old functions like find_hook() (as I mentioned at the start
>    of this email), but I'm not fully opposed to (2) either.

I personally prefer (3) - I think the user experience with (2) in a
release (or even in 'next', which all Googlers use) is pretty bad. The
downside, of course, is that a large topic gets merged all at once and
makes some pretty nasty reviewer overhead.

Junio, I wonder if you can give any advice here? What would be really
ideal for me would be to do something like Stolee has been doing with
his maintenance series - config-based hooks pt. I containing the library
code and config-based hooks pt. II containing the conversion of
preexisting hooks. Does that make the overhead for you significantly
worse?

 - Emily


* Re: [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-10-05 23:24       ` Jonathan Nieder
  2020-10-06 19:06         ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:24 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi,

Emily Shaffer wrote:

> Introduce infrastructure for a new subcommand, git-hook, which will be
> used to ease config-based hook management. This command will handle
> parsing configs to compose a list of hooks to run for a given event, as
> well as adding or modifying hook configs in an interactive fashion.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  .gitignore                    |  1 +
>  Documentation/git-hook.txt    | 19 +++++++++++++++++++
>  Makefile                      |  1 +
>  builtin.h                     |  1 +
>  builtin/hook.c                | 21 +++++++++++++++++++++
>  git.c                         |  1 +
>  t/t1360-config-based-hooks.sh | 11 +++++++++++
>  7 files changed, 55 insertions(+)
>  create mode 100644 Documentation/git-hook.txt
>  create mode 100644 builtin/hook.c
>  create mode 100755 t/t1360-config-based-hooks.sh

optional: I could imagine this being squashed into patch 3 --- that way,
the command has functionality as soon as it exists.  Alternatively:

[...]
> --- /dev/null
> +++ b/Documentation/git-hook.txt
> @@ -0,0 +1,19 @@
> +git-hook(1)
> +===========
> +
> +NAME
> +----
> +git-hook - Manage configured hooks
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'git hook'
> +
> +DESCRIPTION
> +-----------
> +You can list, add, and modify hooks with this command.

This could say something like "This is a placeholder command that will
gain functionality in subsequent patches" to make the current state
clear.

[...]
> --- a/git.c
> +++ b/git.c
> @@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
>  	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
>  	{ "hash-object", cmd_hash_object },
>  	{ "help", cmd_help },
> +	{ "hook", cmd_hook, RUN_SETUP },

This makes the command require that it run within a git repository,
but I can imagine wanting to list hooks outside of any.  Should it use
RUN_SETUP_GENTLY instead?

Thanks,
Jonathan


* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
                         ` (2 preceding siblings ...)
  2020-09-27 19:23       ` Martin Ågren
@ 2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 0 replies; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:27 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
[...]
> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

A little bit of futureproofing: this may want to mention that the
output is intended to be human-readable and is subject to change over
time (scripters beware!).

Thanks,
Jonathan


* Re: [PATCH v4 5/9] parse-options: parse into strvec
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
@ 2020-10-05 23:30       ` Jonathan Nieder
  2020-10-06  4:49         ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:30 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> This is useful if collecting generic arguments to pass through to
> another command, for example, 'git hook run --arg "--quiet" --arg
> "--format=pretty" some-hook'. The resulting strvec would contain
> { "--quiet", "--format=pretty" }.

An alternative is to use OPT_STRING_LIST and then convert in the
caller.  One advantage of that is that it would guarantee the behavior
with --no-arg etc is going to match exactly.

I prefer this OPT_STRVEC approach nonetheless.  Can the
parse_opt_strvec and parse_opt_string_list functions get comments
pointing to each other as an alternative way to encourage that kind of
consistency?
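
For reference, the behavior the two callbacks would need to agree on can be
sketched roughly as below. This is a standalone illustration, not the
parse-options API: `struct arg_list` and `collect_args()` are hypothetical
names, and I am assuming `--no-arg` clears previously collected values the
same way the documentation describes for `OPT_STRING_LIST`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16

/* A fixed-size stand-in for a string vector; the type is hypothetical. */
struct arg_list {
	const char *items[MAX_ARGS];
	size_t nr;
};

/*
 * Collect every "--arg <value>" pair from argv into `out`, in order.
 * "--no-arg" discards everything collected so far. Returns 0 on success,
 * or -1 if a value is missing or the list is full.
 */
static int collect_args(int argc, const char **argv, struct arg_list *out)
{
	int i;

	assert(argv != NULL && out != NULL);
	out->nr = 0;
	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--no-arg")) {
			out->nr = 0;
		} else if (!strcmp(argv[i], "--arg")) {
			if (i + 1 >= argc || out->nr == MAX_ARGS)
				return -1;
			out->items[out->nr++] = argv[++i];
		}
		/* anything else (e.g. the hook name) is ignored here */
	}
	return 0;
}
```

With the example from the commit message, the list ends up holding
"--quiet" and "--format=pretty", in that order.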

[...]
> --- a/Documentation/technical/api-parse-options.txt
> +++ b/Documentation/technical/api-parse-options.txt
> @@ -173,6 +173,11 @@ There are some macros to easily define options:
>  	The string argument is stored as an element in `string_list`.
>  	Use of `--no-option` will clear the list of preceding values.
>  
> +`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::

nit: this should be OPT_STRVEC

Thanks,
Jonathan


* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
  2020-09-28 19:29       ` Josh Steadmon
@ 2020-10-05 23:39       ` Jonathan Nieder
  2020-10-06 22:57         ` Emily Shaffer
  2 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:39 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi,

Emily Shaffer wrote:

> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.

Exciting!

I would even be tempted to put this earlier in the series: providing a
"git hook run" command that only supports legacy hooks and then
improving it from there to support config-based hooks.  This ordering is
also fine, though.

[...]
> ---
>  builtin/hook.c                | 30 ++++++++++++++++++++
>  hook.c                        | 52 ++++++++++++++++++++++++++++++++---
>  hook.h                        |  3 ++
>  t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
>  4 files changed, 109 insertions(+), 4 deletions(-)

Needs docs.

[...]
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -5,9 +5,11 @@
[...]
> +static int run(int argc, const char **argv, const char *prefix)
> +{
> +	struct strbuf hookname = STRBUF_INIT;
> +	struct strvec envs = STRVEC_INIT;
> +	struct strvec args = STRVEC_INIT;
> +
> +	struct option run_options[] = {
> +		OPT_STRVEC('e', "env", &envs, N_("var"),
> +			   N_("environment variables for hook to use")),
> +		OPT_STRVEC('a', "arg", &args, N_("args"),
> +			   N_("argument to pass to hook")),
> +		OPT_END(),
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, run_options,
> +			     builtin_hook_usage, 0);
> +
> +	if (argc < 1)
> +		usage_msg_opt(_("a hookname must be provided to operate on."),
> +			      builtin_hook_usage, run_options);

Error message nit: what does it mean to operate on a hookname?

Perhaps this should allude to the usage string?

	usage_msg_opt(_("missing <hookname> parameter"), ...);

Or to match the conversational approach of commands like "clone":

	usage_msg_opt(_("You must specify a hook to run."), ...);

[...]
> --- a/hook.c
> +++ b/hook.c
> @@ -2,6 +2,7 @@
>  
>  #include "hook.h"
>  #include "config.h"
> +#include "run-command.h"
>  
>  /*
>   * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> @@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
>  	}
>  }
>  
> -static void emplace_hook(struct list_head *pos, const char *command)
> +static void emplace_hook(struct list_head *pos, const char *command, int quoted)
>  {
>  	struct hook *to_add = malloc(sizeof(struct hook));
>  	to_add->origin = current_config_scope();
>  	strbuf_init(&to_add->command, 0);
> -	/* even with use_shell, run_command() needs quotes */
> -	strbuf_addf(&to_add->command, "'%s'", command);
> +	if (quoted)
> +		strbuf_addf(&to_add->command, "'%s'", command);
> +	else
> +		strbuf_addstr(&to_add->command, command);
>  
>  	list_add_tail(&to_add->list, pos);
>  }

This would need to use sq_quote_* to be safe, but we can do something
simpler: if we accumulate parameters in an argv_array passed to
run_command, then they will be safely passed to the shell without
triggering expansion.
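
As an illustration of why plain `'%s'` is not enough: a command containing a
single quote, such as `it's`, would terminate the quoting early. A rough
standalone sketch of the escaping involved follows; `shell_quote()` is a
hypothetical helper written for this example, not Git's own quoting code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write `src` into `dst` as one single-quoted POSIX shell word. Each
 * embedded "'" becomes "'\''" (close quote, escaped quote, reopen quote).
 * Returns the quoted length, or -1 if `dst` might be too small.
 */
static int shell_quote(char *dst, size_t dstsize, const char *src)
{
	size_t n = 0;

	assert(dst != NULL && src != NULL);
	/* conservative bound: 4 bytes per input byte, two quotes, and NUL */
	if (dstsize < 4 * strlen(src) + 3)
		return -1;
	dst[n++] = '\'';
	for (; *src; src++) {
		if (*src == '\'') {
			memcpy(dst + n, "'\\''", 4);
			n += 4;
		} else {
			dst[n++] = *src;
		}
	}
	dst[n++] = '\'';
	dst[n] = '\0';
	return (int)n;
}
```

Passing the command as a separate argv element avoids this step entirely,
which is why that route is the simpler one.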

Thanks,
Jonathan


* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
  2020-09-23 23:20       ` Jonathan Tan
@ 2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:42 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> Subject: hook: replace run-command.h:find_hook

tiny nit: This doesn't remove find_hook, so this may want to be
described as "add replacement for" instead of "replace".

[...]
> --- a/hook.c
> +++ b/hook.c
> @@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
>  	return &hook_head;
>  }
>  
> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}

This feels a bit fragile, since it can go out of sync with run_hooks.
I think I'd prefer if they shared code and this function either
returned a parsed structure that could be used later to run hooks or
cached the result keyed by hookname.
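
A rough sketch of the second option, caching by hookname, might look like
this. It is standalone C rather than Git code: `struct hook_cache`,
`cached_hook_exists()` and `demo_lookup()` are all hypothetical names, and
the lookup callback stands in for whatever config and legacy-hook check the
real code would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED 8

struct hook_cache_entry {
	const char *hookname;	/* borrowed; the caller keeps it alive */
	int exists;
};

struct hook_cache {
	struct hook_cache_entry entries[MAX_CACHED];
	size_t nr;
	int (*lookup)(const char *hookname);	/* the expensive check */
};

/* Return the cached answer for hookname, doing the lookup at most once. */
static int cached_hook_exists(struct hook_cache *cache, const char *hookname)
{
	size_t i;
	int exists;

	assert(cache != NULL && cache->lookup != NULL && hookname != NULL);
	for (i = 0; i < cache->nr; i++)
		if (!strcmp(cache->entries[i].hookname, hookname))
			return cache->entries[i].exists;

	exists = cache->lookup(hookname);
	if (cache->nr < MAX_CACHED) {	/* when full, just skip caching */
		cache->entries[cache->nr].hookname = hookname;
		cache->entries[cache->nr].exists = exists;
		cache->nr++;
	}
	return exists;
}

/* Demo lookup: pretends only "pre-commit" is configured. */
static int lookup_calls;
static int demo_lookup(const char *hookname)
{
	lookup_calls++;
	return !strcmp(hookname, "pre-commit");
}
```

If the code that runs the hooks consulted the same cache, the "exists"
answer and the later run would be based on a single lookup.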

Thanks,
Jonathan


* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-10-05 21:27         ` Emily Shaffer
@ 2020-10-05 23:48           ` Jonathan Nieder
  2020-10-06 19:08             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:48 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Jonathan Tan, Junio C Hamano, git

Emily Shaffer wrote:
> On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:

>>  - As for whether (1) it is OK for none of the hooks to be converted (and
>>    instead rely on the user to edit their hook scripts to call "git hook
>>    run ???"), or if (2) we should require some hooks to be
>>    converted, or if (3) we should require all hooks to be converted: I'd
>>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
>>    especially since a reviewer wouldn't have to worry about leftover
>>    usages of old functions like find_hook() (as I mentioned at the start
>>    of this email), but I'm not fully opposed to (2) either.
>
> I personally prefer (3) - I think the user experience with (2) in a
> release (or even in 'next', which all Googlers use) is pretty bad. The
> downside, of course, is that a large topic gets merged all at once and
> makes some pretty nasty reviewer overhead.

One approach is to build up a series with "git hook run" and "git hook
list" demonstrating and testing the functionality and [PATCH n+1/n]
extra patches at the end converting existing hooks.  The user
experience from "git hook run" and even "git hook list" supporting a
preview of the future without built-in commands living in that future
yet would not be so bad, methinks.  And then a final series could
update the built-in commands' usage of hooks and would still be fairly
small.

In other words, I think I like (1), except *without* the
recommendation for users to edit their hook scripts to call "git hook
run" --- instead, the recommendation would be "try running this
command if you want to see what hooks will do in the future".

Thanks,
Jonathan


* Re: [PATCH v4 5/9] parse-options: parse into strvec
  2020-10-05 23:30       ` Jonathan Nieder
@ 2020-10-06  4:49         ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-10-06  4:49 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Emily Shaffer, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Emily Shaffer wrote:
>
>> This is useful if collecting generic arguments to pass through to
>> another command, for example, 'git hook run --arg "--quiet" --arg
>> "--format=pretty" some-hook'. The resulting strvec would contain
>> { "--quiet", "--format=pretty" }.
>
> An alternative is to use OPT_STRING_LIST and then convert in the
> caller.  One advantage of that is that it would guarantee the behavior
> with --no-arg etc is going to match exactly.
>
> I prefer this OPT_STRVEC approach nonetheless.  Can the
> parse_opt_strvec and parse_opt_string_list functions get comments
> pointing to each other as an alternative way to encourage that kind of
> consistency?
>
> [...]
>> --- a/Documentation/technical/api-parse-options.txt
>> +++ b/Documentation/technical/api-parse-options.txt
>> @@ -173,6 +173,11 @@ There are some macros to easily define options:
>>  	The string argument is stored as an element in `string_list`.
>>  	Use of `--no-option` will clear the list of preceding values.
>>  
>> +`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
>
> nit: this should be OPT_STRVEC

Sigh.  I thought I caught all of these with a SQUASH fix-up patch
the last round.  Thanks for being extra careful.


* Re: [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-10-05 23:24       ` Jonathan Nieder
@ 2020-10-06 19:06         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 19:06 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Mon, Oct 05, 2020 at 04:24:18PM -0700, Jonathan Nieder wrote:
> 
> Hi,
> 
> Emily Shaffer wrote:
> 
> > Introduce infrastructure for a new subcommand, git-hook, which will be
> > used to ease config-based hook management. This command will handle
> > parsing configs to compose a list of hooks to run for a given event, as
> > well as adding or modifying hook configs in an interactive fashion.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  .gitignore                    |  1 +
> >  Documentation/git-hook.txt    | 19 +++++++++++++++++++
> >  Makefile                      |  1 +
> >  builtin.h                     |  1 +
> >  builtin/hook.c                | 21 +++++++++++++++++++++
> >  git.c                         |  1 +
> >  t/t1360-config-based-hooks.sh | 11 +++++++++++
> >  7 files changed, 55 insertions(+)
> >  create mode 100644 Documentation/git-hook.txt
> >  create mode 100644 builtin/hook.c
> >  create mode 100755 t/t1360-config-based-hooks.sh
> 
> optional: I could imagine this being squashed into patch 3 --- that way,
> the command has functionality as soon as it exists.  Alternatively:

I would prefer to leave it on its own. Changes like
builtin<->standalone, or even the one you mentioned below about
RUN_SETUP_GENTLY, are somewhat easier to manage when they aren't in the
same patch as the business logic, IMO.

> 
> [...]
> > --- /dev/null
> > +++ b/Documentation/git-hook.txt
> > @@ -0,0 +1,19 @@
> > +git-hook(1)
> > +===========
> > +
> > +NAME
> > +----
> > +git-hook - Manage configured hooks
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'git hook'
> > +
> > +DESCRIPTION
> > +-----------
> > +You can list, add, and modify hooks with this command.
> 
> This could say something like "This is a placeholder command that will
> gain functionality in subsequent patches" to make the current state
> clear.

Done.

> 
> [...]
> > --- a/git.c
> > +++ b/git.c
> > @@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
> >  	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
> >  	{ "hash-object", cmd_hash_object },
> >  	{ "help", cmd_help },
> > +	{ "hook", cmd_hook, RUN_SETUP },
> 
> This makes the command require that it run within a git repository,
> but I can imagine wanting to list hooks outside of any.  Should it use
> RUN_SETUP_GENTLY instead?

Nice catch. I'll add a test to the list patch to that effect also.


* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-10-05 23:48           ` Jonathan Nieder
@ 2020-10-06 19:08             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 19:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Jonathan Tan, Junio C Hamano, git

On Mon, Oct 05, 2020 at 04:48:39PM -0700, Jonathan Nieder wrote:
> 
> Emily Shaffer wrote:
> > On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:
> 
> >>  - As for whether (1) it is OK for none of the hooks to be converted (and
> >>    instead rely on the user to edit their hook scripts to call "git hook
> >>    run ???"), or if (2) we should require some hooks to be
> >>    converted, or if (3) we should require all hooks to be converted: I'd
> >>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
> >>    especially since a reviewer wouldn't have to worry about leftover
> >>    usages of old functions like find_hook() (as I mentioned at the start
> >>    of this email), but I'm not fully opposed to (2) either.
> >
> > I personally prefer (3) - I think the user experience with (2) in a
> > release (or even in 'next', which all Googlers use) is pretty bad. The
> > downside, of course, is that a large topic gets merged all at once and
> > makes some pretty nasty reviewer overhead.
> 
> One approach is to build up a series with "git hook run" and "git hook
> list" demonstrating and testing the functionality and [PATCH n+1/n]
> extra patches at the end converting existing hooks.  The user
> experience from "git hook run" and even "git hook list" supporting a
> preview of the future without built-in commands living in that future
> yet would not be so bad, methinks.  And then a final series could
> update the built-in commands' usage of hooks and would still be fairly
> small.
> 
> In other words, I think I like (1), except *without* the
> recommendation for users to edit their hook scripts to call "git hook
> run" --- instead, the recommendation would be "try running this
> command if you want to see what hooks will do in the future".

Ok. I'll fix up the wording in the design doc and follow through with my
plan to split the series into two parts.

 - Emily


* Re: [PATCH v4 3/9] hook: add list command
  2020-09-27 19:23       ` Martin Ågren
@ 2020-10-06 20:20         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 20:20 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Mailing List

On Sun, Sep 27, 2020 at 09:23:35PM +0200, Martin Ågren wrote:
> 
> Hi Emily,

Firstly, thanks for the doc review - this is great stuff.

> 
> On Wed, 9 Sep 2020 at 02:54, Emily Shaffer <emilyshaffer@google.com> wrote:
> 
> >  DESCRIPTION
> >  -----------
> >  You can list, add, and modify hooks with this command.
> 
> (BTW, I think this patch could teach this to say "You can list hooks
> with this command." If/when we add the other commands, we can expand
> on this.)

Done. I sort of glued this together with Jonathan Nieder's suggestion in
the setup patch, and ended up saying "later you will be able to blah".

> 
> > +This command parses the default configuration files for sections "hook" and
> > +"hookcmd". "hook" is used to describe the commands which will be run during a
> 
> I propose s/"hook"/`hook`/ and similar to set this as monospace since we
> are discussing configuration sections. If we want to avoid starting
> sentences with "hook" (or `hookcmd`; do we?), maybe something like "The
> section `hook` ..." would work fine.

Nice - done. I don't see much problem with starting a sentence with
monospaced lower-cased section name... someone can disagree with me :)

> 
> > +particular hook event; commands are run in config order. "hookcmd" is used to
> 
> "config order" feels a bit too colloquial/vague. You use the same phrase
> in the commit message and I think it works well there for the indented
> audience. But for this document, I'm not so sure. How about
> 
>   Commands are run in the order they are encountered as the Git
>   configuration files are processed (see linkgit:git-config[1]).

I don't mind colloquial - I think that improves the readability of user
documentation - but you're right that it's vague. "...commands are run
in the order Git encounters them during the configuration parse (see
linkgitblah)" seemed like an okay balance to me.

> 
> ? It's also quite possible that "config order" hits the exact right tone
> -- please trust your judgment.

Nah, I think you're right that "config order" is easily understood by
Git devs, but probably not by Git users. I like that linking out to the
config doc invites users to also learn a little more about how config
files work :)

> 
> > +describe attributes of a specific command. If additional attributes don't need
> > +to be specified, a command to run can be specified directly in the "hook"
> > +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> > +provided value directly. For example:
> 
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> 
> Hmm. "hook", "command" and "hookcmd". Should that be "cmd", or
> "hookcommand"? I'd favour the latter, but the current proposal somehow
> feels asymmetric. (If code uses, and is consistent about using,
> "hookcmd" that's another thing entirely, I think. It's just that for the
> configuration, it looks a bit odd.)

I'm not entirely in love with the name "hookcmd" but somehow I like
"hookcommand" even less - especially since you end up with
"hook.command" referencing a "hookcommand" which also has a
"hookcommand.command" - blech.

Some possible alternatives to "hookcmd":
- hookmodule/hook-module
- reusable-hook
- hook-with-options/hook-options (nah, this sounds like it means
  "options for hook execution")
- hook-details/detailed-hook
- named-hook

I'll think on this more. I like "named-hook" quite a lot. Very
interested in hearing other ideas - "the two hardest problems in
computer science are naming, cache invalidation, and off-by-one errors"
;)
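
Whatever the section ends up being called, the lookup described in the
documentation quoted above (a `hook.<hook-name>.command` value that names a
`hookcmd` resolves to that section's `command`, otherwise it is run
directly) can be sketched roughly as follows. This is a standalone
illustration, not the code in the patch; `struct hookcmd` and
`resolve_hook_command()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One `[hookcmd "<name>"] command = <command>` entry; hypothetical type. */
struct hookcmd {
	const char *name;
	const char *command;
};

/*
 * Resolve one `hook.<hook-name>.command` value: if a hookcmd with that
 * name exists, return its command; otherwise return the value itself.
 */
static const char *resolve_hook_command(const struct hookcmd *cmds, size_t nr,
					const char *value)
{
	size_t i;

	assert(value != NULL);
	assert(cmds != NULL || nr == 0);
	for (i = 0; i < nr; i++)
		if (!strcmp(cmds[i].name, value))
			return cmds[i].command;
	return value;
}
```

With the example config from the doc, "linter" resolves to
"/bin/linter --c", while "~/typocheck.sh" has no matching section and is
returned unchanged.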

> 
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> 
> `<hook-name>` with backticks.
> 
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> 
> I had to read and re-read this a few times. The "and note the" does not
> mean "and please observe that", but rather "and they make note of". Not
> sure how that can be done clearer. The second thing that tripped me up
> was that last part. Maybe end the sentence after "specified", then add
> something like "The scope is not affected by if and where
> `hookcmd.<hook-name>.command` appears.".

Occam's Razor suggests "Hooks appear in the order they should be run,
and print the config scope blah". Thanks for pointing out the "and note
the" ambiguity - I never use that phrase so it didn't occur to me!

> 
> I think you could add
> 
>   CONFIGURATION
>   -------------
>   include::config/hook.txt[]
> 
> here and add such a file
> 
>   hook.<hook-name>.command::
>          ...
> 
>   hookcmd.<hook-name>.command::
>          ...
> 
> where you define/describe those items. And you can include it from
> config.txt as well.

Yes, totally. Thanks.

 - Emily


* Re: [PATCH v4 3/9] hook: add list command
  2020-09-23 23:04       ` Jonathan Tan
@ 2020-10-06 20:46         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 20:46 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Wed, Sep 23, 2020 at 04:04:51PM -0700, Jonathan Tan wrote:
> 
> >   $ git hook list pre-commit
> >   ~/baz/from/hookcmd.sh
> >   ~/bar.sh
> 
> In the tests below, there is a "local:" prefix (or similar). It's
> clearer if the commit message has that too.
> 
> Also, looking at a later commit, the "list" command probably should
> include the legacy hook if it exists.

Have added it as a separate patch for v5, hopefully that will make more
sense.

> 
> > +static void emplace_hook(struct list_head *pos, const char *command)
> > +{
> > +	struct hook *to_add = malloc(sizeof(struct hook));
> > +	to_add->origin = current_config_scope();
> > +	strbuf_init(&to_add->command, 0);
> > +	/* even with use_shell, run_command() needs quotes */
> > +	strbuf_addf(&to_add->command, "'%s'", command);
> > +
> > +	list_add_tail(&to_add->list, pos);
> > +}
> 
> It might be odd to a programmer reading this that an existing "struct
> hook" with the same name is not reused - the scanning of the list done
> in hook_config_lookup() could probably go here instead.

Sure, done.

> 
> > +test_expect_success 'git hook list orders by config order' '
> > +	setup_hooks &&
> > +
> > +	cat >expected <<-EOF &&
> > +	global:	$ROOT/path/def
> > +	local:	$ROOT/path/ghi
> 
> Will the "global" strings etc. be translated? If yes, it's probably not
> worth it to align the paths in this way.

Asked more offline. Jonathan was saying that translation might result in
scope name + tab character leaving the path in different columns
depending on the scope anyways, so there's no point in using a tab
character instead of a space character here. That seems reasonable; I'll
switch.

 - Emily



* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-10-05 23:39       ` Jonathan Nieder
@ 2020-10-06 22:57         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 22:57 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Mon, Oct 05, 2020 at 04:39:03PM -0700, Jonathan Nieder wrote:
> 
> Hi,
> 
> Emily Shaffer wrote:
> 
> > In order to enable hooks to be run as an external process, by a
> > standalone Git command, or by tools which wrap Git, provide an external
> > means to run all configured hook commands for a given hook event.
> 
> Exciting!
> 
> I would even be tempted to put this earlier in the series: providing a
> "git hook run" command that only supports legacy hooks and then
> improving it from there to support config-based hooks.  This ordering is
> also fine, though.

Oh, interesting! I sort of wish I had started with that ordering... but
now it seems a little unwieldy to switch. I'd probably want to do 100%
of the run_hook_(ve|le) conversions first, in that case, and delete the
old hook API. But at this point I think it would be a pretty large
amount of overhead to switch.

> 
> [...]
> > ---
> >  builtin/hook.c                | 30 ++++++++++++++++++++
> >  hook.c                        | 52 ++++++++++++++++++++++++++++++++---
> >  hook.h                        |  3 ++
> >  t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
> >  4 files changed, 109 insertions(+), 4 deletions(-)
> 
> Needs docs.

Done

> 
> [...]
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -5,9 +5,11 @@
> [...]
> > +static int run(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct strbuf hookname = STRBUF_INIT;
> > +	struct strvec envs = STRVEC_INIT;
> > +	struct strvec args = STRVEC_INIT;
> > +
> > +	struct option run_options[] = {
> > +		OPT_STRVEC('e', "env", &envs, N_("var"),
> > +			   N_("environment variables for hook to use")),
> > +		OPT_STRVEC('a', "arg", &args, N_("args"),
> > +			   N_("argument to pass to hook")),
> > +		OPT_END(),
> > +	};
> > +
> > +	argc = parse_options(argc, argv, prefix, run_options,
> > +			     builtin_hook_usage, 0);
> > +
> > +	if (argc < 1)
> > +		usage_msg_opt(_("a hookname must be provided to operate on."),
> > +			      builtin_hook_usage, run_options);
> 
> Error message nit: what does it mean to operate on a hookname?
> 
> Perhaps this should allude to the usage string?
> 
> 	usage_msg_opt(_("missing <hookname> parameter"), ...);
> 
> Or to match the conversational approach of commands like "clone":
> 
> 	usage_msg_opt(_("You must specify a hook to run."), ...);
> 

Yeah, I like this one. I noticed the same error (untranslated, even!) is
used for list, so I'll fix that too.

> [...]
> > --- a/hook.c
> > +++ b/hook.c
> > @@ -2,6 +2,7 @@
> >  
> >  #include "hook.h"
> >  #include "config.h"
> > +#include "run-command.h"
> >  
> >  /*
> >   * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> > @@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
> >  	}
> >  }
> >  
> > -static void emplace_hook(struct list_head *pos, const char *command)
> > +static void emplace_hook(struct list_head *pos, const char *command, int quoted)
> >  {
> >  	struct hook *to_add = malloc(sizeof(struct hook));
> >  	to_add->origin = current_config_scope();
> >  	strbuf_init(&to_add->command, 0);
> > -	/* even with use_shell, run_command() needs quotes */
> > -	strbuf_addf(&to_add->command, "'%s'", command);
> > +	if (quoted)
> > +		strbuf_addf(&to_add->command, "'%s'", command);
> > +	else
> > +		strbuf_addstr(&to_add->command, command);
> >  
> >  	list_add_tail(&to_add->list, pos);
> >  }
> 
> This would need to use sq_quote_* to be safe, but we can do something
> simpler: if we accumulate parameters in an argv_array passed to
> run_command, then they will be safely passed to the shell without
> triggering expansion.

Thanks. I'll do that - no point in duplicating the work :)
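
For readers following along, the difference can be sketched outside of Git.
The helper below is a hypothetical, standalone illustration (run_with_arg()
is not part of Git and does not use run_command()); it assumes a POSIX system
with `sh` on the PATH. It hands the parameter to the shell as a separate argv
entry and lets the script refer to it as "$@", so the parameter is never
re-parsed by the shell and no manual '%s' quoting is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: run `command` through the
 * shell with `arg` passed as its own argv entry. Up to outlen - 1 bytes of
 * the child's stdout are captured into `out`. Returns the child's exit
 * status, or -1 on error.
 */
int run_with_arg(const char *command, const char *arg,
		 char *out, size_t outlen)
{
	int fds[2];
	pid_t pid;
	size_t used = 0;
	ssize_t n;
	int status;

	assert(outlen > 0);
	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: send stdout to the pipe, then exec the shell. */
		char script[1024];

		dup2(fds[1], STDOUT_FILENO);
		close(fds[0]);
		close(fds[1]);
		/* The script only ever mentions "$@", never the raw argument. */
		snprintf(script, sizeof(script), "%s \"$@\"", command);
		execlp("sh", "sh", "-c", script, command, arg, (char *)NULL);
		_exit(127);
	}

	/* Parent: read the child's output until EOF or the buffer is full. */
	close(fds[1]);
	while (used < outlen - 1 &&
	       (n = read(fds[0], out + used, outlen - 1 - used)) > 0)
		used += (size_t)n;
	out[used] = '\0';
	close(fds[0]);

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling run_with_arg("printf '%s\\n'", "$(echo pwned) 'quoted'", buf,
sizeof(buf)) should leave the argument text in buf unchanged, including the
`$(...)` and the embedded single quotes, which is exactly the case where
wrapping the command in '%s' would go wrong.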

 - Emily





* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
  2020-09-23 22:59       ` Jonathan Tan
@ 2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  2020-10-22  0:58         ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-07  9:23 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Wed, Sep 09 2020, Emily Shaffer wrote:

First, thanks a lot for working on this. As you may have found I've done
some small amount of actual work in this area before, but mostly just
blathered about it on the ML.

> Begin a design document for config-based hooks, managed via git-hook.
> Focus on an overview of the implementation and motivation for design
> decisions. Briefly discuss the alternatives considered before this
> point. Also, attempt to redefine terms to fit into a multihook world.
> [...]
> +[[status-quo]]
> +=== Status quo
> +
> +Today users can implement multihooks themselves by using a "trampoline script"
> +as their hook, and pointing that script to a directory or list of other scripts
> +they wish to run.

...or by setting core.hooksPath in their local/global/system
config. Granted it doesn't cover the malicious hook injection case
you're also trying to solve, but does address e.g. having a git server
with a lot of centralized hooks.

The "trampoline script" also isn't needed for the common case you
mention, you just symlink the .git/hooks directory (as e.g. GitLab
does). People usually use a trampoline script for e.g. using GNU
parallel or something to execute N hooks.


> +[[hook-directories]]
> +=== Hook directories
> +
> +Other contributors have suggested Git learn about the existence of a directory
> +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.

...which seems like an easy thing to add later by having a "hookdir" in
addition to "hookcmd", i.e. just specify a glob there instead of a
cmd/path.

You already use "hookdir" for something else though, so that's a bit
confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
perhaps more confusing...

> [...]
> +[[execution-ordering]]
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
>
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.
> +
> +[[parallelization]]
> +=== Parallelization
> +
> +Users with many hooks might want to run them simultaneously, if the hooks don't
> +modify state; if one hook depends on another's output, then users will want to
> +specify those dependencies. If we decide to solve this problem, we may want to
> +look to modern build systems for inspiration on how to manage dependencies and
> +parallel tasks.

If you're taking requests it would make me very happy if we had
parallelism in this from day one. It's the kind of thing that's hard to
do by default once a feature is shipped since people will implicitly
depend on it not being there, i.e. we won't know what we're breaking.

I think doing it this way is simple, covers most use cases, and solves a
lot of the problems you note:

1. Don't use config order to execute hooks; use glob'd name order
   regardless of origin. I.e. a system-level hook called "001-first"
   is executed before a local hook called "999-at-the-end" (or the other
   way around, i.e. hook origin doesn't matter).

2. We execute hooks in parallel in that glob order, i.e. a pthread for-loop
   that starts the 001-first task first, eventually getting to
   999-at-the-end N at a time. I.e. the same as:

       parallel --jobs N --halt-on-error soon,fail=1 ::: <hooks-in-glob-order>

   This allows for parallelism while still covering the very useful case
   of a global log hook that is guaranteed to execute.

3. A hook can define "parallel=no" in its config. We'll then run it
   while no other hook is running.

4. We don't attempt to do dependencies etc, if you need that sort of
   complexity you can just make one of the hooks be a hook runner as
   users do now for the common "make it parallel" case.

It's a relatively small change to the code you have already. I.e. the
for_each() in run_hooks() would be called N times, once for each contiguous
glob'd parallel/non-parallel segment, and hook_list()'s config parsing
would learn to spew those out as a list-of-lists.

This also gives you a rudimentary implementation of the dependency
schema you proposed for free. I.e. a definition of (pseudocode):

    hookcmd=000-first
    parallel=no

    hookcmd=250-middle-abc
    hookcmd=250-middle-xyz

    hookcmd=300-gather
    parallel=no

    hookcmd=999-the-end

Would result in the pseudocode execution of:

    segments=[[000-first],
              [250-middle-abc, 250-middle-xyz],
              [300-gather],
              [999-the-end]]
    for each s in segments:
        ok = run_in_parallel(s)
        last if !ok # or, depending on "early exit?" config

I.e.:

 * The common case of people adding N hooks won't take sum(N) time.

 * parallel=no hooks aren't run in parallel with other non-parallel
   hooks

 * We support a rudimentary dependency schema as a side-effect,
   i.e. defining 300-gather as non-parallel allows it to act as the sole
   "reduce" step in a map/reduce in a "map" step started with the 250-*
   hooks.
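
The segmentation step above can be sketched in C roughly as follows. This is
only an illustration under stated assumptions: struct hook_entry and
split_segments() are hypothetical names that appear nowhere in the series, the
"glob order" is approximated by a plain strcmp() sort on the hook name, and
actually running each segment (in parallel or not) is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical representation of one configured hook. */
struct hook_entry {
	const char *name;
	int parallel;	/* 0 corresponds to "parallel=no" */
};

/* qsort() comparator: order hooks by name only, ignoring their origin. */
int hook_entry_cmp(const void *a, const void *b)
{
	const struct hook_entry *x = a, *y = b;

	return strcmp(x->name, y->name);
}

/*
 * Sort `hooks` by name, then store in seg_start[] the index at which each
 * segment begins. A non-parallel hook always forms a segment of its own;
 * adjacent parallel hooks share one segment. seg_start[] must have room
 * for n entries. Returns the number of segments.
 */
size_t split_segments(struct hook_entry *hooks, size_t n, size_t *seg_start)
{
	size_t i, nseg = 0;

	assert(seg_start != NULL);
	qsort(hooks, n, sizeof(*hooks), hook_entry_cmp);
	for (i = 0; i < n; i++) {
		/*
		 * Start a new segment at the first hook, at every
		 * non-parallel hook, and right after a non-parallel hook.
		 */
		if (i == 0 || !hooks[i].parallel || !hooks[i - 1].parallel)
			seg_start[nseg++] = i;
	}
	return nseg;
}
```

Fed the five hooks from the example in any order, this yields four segments
starting at sorted indices 0, 1, 3 and 4, i.e. the same list-of-lists as the
pseudocode above. A caller would then run the hooks of one segment
concurrently and wait for all of them before moving on to the next segment.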

> +[[securing-hookdir-hooks]]
> +=== Securing hookdir hooks
> +
> +With the design as written in this doc, it's still possible for a malicious user
> +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> +zip their repo and send it to another user. It may be necessary to teach Git to
> +only allow inlined hooks like this if they were configured outside of the local
> +scope (in other words, only run hookcmds, and only allow hookcmds to be
> +configured in global or system scope); or another approach, like a list of safe
> +projects, might be useful. It may also be sufficient (or at least useful) to
> +teach a `hook.disableAll` config or similar flag to the Git executable.

I think this part of the doc should note a bit of the context in
https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/

I.e. even if we get a 100% secure hook implementation we've done
practically nothing for overall security, since we'll still run the
pager, aliases etc. from that local repo.

This is a great step in the right direction, but it behooves us to note
that, so some user reading this documentation without context doesn't
think inspecting untrusted repositories like that is safe just because
they set the right hook settings in their config (once what's being
proposed here is implemented).


* [PATCH v5 0/8] propose config-based hooks (part I)
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (9 preceding siblings ...)
  2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
@ 2020-10-14 23:24     ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
                         ` (8 more replies)
  10 siblings, 9 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v4:
- Reordered the commits. Hookdir support is added sooner and conversion
  of existing hooks is moved to another branch (part II) for hopefully
  more granular reviewing. If folks hate this, let me know and I'll
  reintegrate the two topics.
- Removed the --porcelain option on 'git hook list'. General consensus
  is that this should use a format string instead, and I didn't want to
  write that new feature while I had been promising v5 "any day now".
- Added functionality for 'skip' to remove hooks from the execution
  list.
- General nits from folks.

Coming soon:
- 'git hook list --format'
- More conversions (in the other topic)
- As required by new conversions, stdin support for hooks

Coming much later:
- 'git hook add'/'git hook edit'. The config isn't too ugly to manually
  edit, for now, so I'd like to get the hooks themselves all figured out
  before adding these convenience tools. I do still think they're a good
  idea, as they'll increase the discoverability of the feature for new
  users.

More detailed notes in each commit. Thanks all for your patience and
reviews.

 - Emily

Emily Shaffer (8):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  14 +
 Documentation/git-hook.txt                    |  81 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                | 163 ++++++++
 git.c                                         |   1 +
 hook.c                                        | 282 ++++++++++++++
 hook.h                                        |  58 +++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 t/t1360-config-based-hooks.sh                 | 232 +++++++++++
 15 files changed, 1228 insertions(+)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
                         ` (7 subsequent siblings)
  8 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, addressed comments from Jonathan Tan about wording.

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 2 files changed, 368 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..dac391f505
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,367 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hooks/<hookname> path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`
+or `hook.disableAll`. (These settings are described more in
+<<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-event>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `strvec` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct strvec *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+};
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 2/8] hook: scaffolding for git-hook subcommand
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
                         ` (6 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 56 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index 6232d33924..432e0b11cb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 5311b1d2c4..9152f6d7c8 100644
--- a/Makefile
+++ b/Makefile
@@ -1095,6 +1095,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..3b20689d1a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -162,6 +162,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f1e8b56d99..caad1c877f 100644
--- a/git.c
+++ b/git.c
@@ -524,6 +524,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 3/8] hook: add list command
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
                         ` (5 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh
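
As a rough, standalone model of the behavior described above (not the
code in this patch), the sketch below resolves a configured value
through a hookcmd table and appends it to an ordered list, moving a
command to the end if it is configured again. The names
resolve_command(), append_or_move(), struct hookcmd, struct hook_entry
and struct hook_list are assumptions made for illustration only; the
patch itself builds on git's list.h and config API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 16

/* Hypothetical stand-in for one "hookcmd.<name>.command" entry. */
struct hookcmd {
	const char *name;
	const char *command;
};

/* Hypothetical stand-in for one element of the hook list. */
struct hook_entry {
	const char *scope;   /* e.g. "global" or "local" */
	const char *command; /* the literal command to run */
};

struct hook_list {
	struct hook_entry items[MAX_HOOKS];
	size_t nr;
};

/* Return the hookcmd's command if 'value' names one, else 'value' itself. */
const char *resolve_command(const struct hookcmd *cmds, size_t nr_cmds,
			    const char *value)
{
	size_t i;

	for (i = 0; i < nr_cmds; i++)
		if (!strcmp(cmds[i].name, value))
			return cmds[i].command;
	return value;
}

/*
 * Append 'command' to the list. If it is already present, drop the old
 * entry first so the command moves to the end and takes the new scope.
 */
void append_or_move(struct hook_list *list, const char *scope,
		    const char *command)
{
	size_t i;

	for (i = 0; i < list->nr; i++) {
		if (!strcmp(list->items[i].command, command)) {
			memmove(&list->items[i], &list->items[i + 1],
				(list->nr - i - 1) * sizeof(list->items[0]));
			list->nr--;
			break;
		}
	}

	assert(list->nr < MAX_HOOKS);
	list->items[list->nr].scope = scope;
	list->items[list->nr].command = command;
	list->nr++;
}
```

Feeding this model the example config (global "baz", then local
"~/bar.sh") gives the same two entries, in the same order, as the
'git hook list pre-commit' output shown above.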

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 9152f6d7c8..5cd1486e42 100644
--- a/Makefile
+++ b/Makefile
@@ -902,6 +902,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+		    list_del(pos);
+		    /* we'll simply move the hook to the end */
+		    to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 4/8] hook: include hookdir hook in list
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (2 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
                         ` (4 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
those hooks should not stop working for users who already have them.

When adding hooks from $HOOKDIR to the list of all hooks to run, quote
the legacy hook paths so that paths containing spaces are supported.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 Documentation/config/hook.txt |  5 +++
 builtin/hook.c                | 70 +++++++++++++++++++++++++++++++----
 hook.c                        | 36 ++++++++++++++++++
 hook.h                        | 16 ++++++++
 t/t1360-config-based-hooks.sh | 62 +++++++++++++++++++++++++++++++
 5 files changed, 182 insertions(+), 7 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] ("core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..16324d4195 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,11 +11,14 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -40,12 +43,39 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case hookdir_no:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case hookdir_interactive:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case hookdir_warn:
+		case hookdir_unknown:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case hookdir_yes:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
@@ -56,14 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = hookdir_no;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = hookdir_yes;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = hookdir_warn;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = hookdir_interactive;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index 937dc768c8..340e5a35c8 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -95,11 +97,33 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return hookdir_yes; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return hookdir_no;
+
+	if (!strcmp(key, "yes"))
+		return hookdir_yes;
+
+	if (!strcmp(key, "warn"))
+		return hookdir_warn;
+
+	if (!strcmp(key, "interactive"))
+		return hookdir_interactive;
+
+	return hookdir_unknown;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +134,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..ca45d388d3 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
@@ -20,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	hookdir_no,
+	hookdir_warn,
+	hookdir_interactive,
+	hookdir_yes,
+	hookdir_unknown,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,58 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 5/8] hook: implement hookcmd.<name>.skip
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (3 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
                         ` (3 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user wants a specific repo to skip execution of a hook which is set
at a global or system level, they can now do so by specifying 'skip' in
their repo config:

~/.gitconfig
  [hook.pre-commit]
    command = skippable-oneliner
    command = skippable-hookcmd

  [hookcmd.skippable-hookcmd]
    command = foo.sh

$GIT_DIR/config
  [hookcmd.skippable-oneliner]
    skip = true
  [hookcmd.skippable-hookcmd]
    skip = true

Later it may make sense to add an option like
"hookcmd.<name>.<hook-event>-skip" - but for simplicity, let's start
with a universal skip setting like this.
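
A minimal sketch of the intended effect, assuming the skip setting is
already known for every configured value: entries whose name (for a
hookcmd) or command text (for an inlined hook) is marked as skipped are
left out of the resulting list. The function list_without_skipped() and
its parameters are hypothetical, and the reordering of duplicate
commands is deliberately left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if 'value' appears in the 'skipped' array, 0 otherwise. */
int is_skipped(const char **skipped, size_t nr_skipped, const char *value)
{
	size_t i;

	for (i = 0; i < nr_skipped; i++)
		if (!strcmp(skipped[i], value))
			return 1;
	return 0;
}

/*
 * Copy every configured value that is not skipped into 'out', keeping
 * config order. 'out' must have room for 'nr_values' pointers.
 * Returns the number of values written.
 */
size_t list_without_skipped(const char **values, size_t nr_values,
			    const char **skipped, size_t nr_skipped,
			    const char **out)
{
	size_t nr_out = 0, i;

	assert(out != NULL);
	for (i = 0; i < nr_values; i++)
		if (!is_skipped(skipped, nr_skipped, values[i]))
			out[nr_out++] = values[i];
	return nr_out;
}
```

With the two configs above, both "skippable-oneliner" and
"skippable-hookcmd" are marked as skipped, so the resulting list for
pre-commit is empty.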

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    New since v4.
    
    During the Google team's review club I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to
    exclude a given hook without this; however, I think I have some more
    work to do on it, so consider it RFC for now and tell me what you think
    :)
     - Emily

 hook.c                        | 37 +++++++++++++++++++++++++----------
 t/t1360-config-based-hooks.sh | 23 ++++++++++++++++++++++
 2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/hook.c b/hook.c
index 340e5a35c8..f4084e33c8 100644
--- a/hook.c
+++ b/hook.c
@@ -12,23 +12,24 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void append_or_move_hook(struct list_head *head, const char *command)
+static struct hook* find_hook_by_command(struct list_head *head, const char *command)
 {
 	struct list_head *pos = NULL, *tmp = NULL;
-	struct hook *to_add = NULL;
+	struct hook *found = NULL;
 
-	/*
-	 * remove the prior entry with this command; we'll replace it at the
-	 * end.
-	 */
 	list_for_each_safe(pos, tmp, head) {
 		struct hook *it = list_entry(pos, struct hook, list);
 		if (!strcmp(it->command.buf, command)) {
 		    list_del(pos);
-		    /* we'll simply move the hook to the end */
-		    to_add = it;
+		    found = it;
 		}
 	}
+	return found;
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct hook *to_add = find_hook_by_command(head, command);
 
 	if (!to_add) {
 		/* adding a new hook, not moving an old one */
@@ -41,7 +42,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 	/* re-set the scope so we show where an override was specified */
 	to_add->origin = current_config_scope();
 
-	list_add_tail(&to_add->list, pos);
+	list_add_tail(&to_add->list, head);
 }
 
 static void remove_hook(struct list_head *to_remove)
@@ -73,8 +74,18 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	if (!strcmp(key, hook_key)) {
 		const char *command = value;
 		struct strbuf hookcmd_name = STRBUF_INIT;
+		int skip = 0;
+
+		/*
+		 * Check if we're removing that hook instead. Hookcmds are
+		 * removed by name, and inlined hooks are removed by command
+		 * content.
+		 */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.skip", command);
+		git_config_get_bool(hookcmd_name.buf, &skip);
 
 		/* Check if a hookcmd with that name exists. */
+		strbuf_reset(&hookcmd_name);
 		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
 		git_config_get_value(hookcmd_name.buf, &command);
 
@@ -89,7 +100,13 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 		 *   for each key+value, do_callback(key, value, cb_data)
 		 */
 
-		append_or_move_hook(head, command);
+		if (skip) {
+			struct hook *to_remove = find_hook_by_command(head, command);
+			if (to_remove)
+				remove_hook(&(to_remove->list));
+		} else {
+			append_or_move_hook(head, command);
+		}
 
 		strbuf_release(&hookcmd_name);
 	}
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 91127a50a4..ebd3bc623f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -132,6 +132,29 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 	test_i18ncmp expected actual
 '
 
+test_expect_success 'git hook list removes skipped hookcmd' '
+	setup_hookcmd &&
+	test_config hookcmd.abc.skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	no commands configured for hook '\''pre-commit'\''
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'git hook list removes skipped inlined hook' '
+	setup_hooks &&
+	test_config hookcmd."$ROOT/path/ghi".skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
 
 test_expect_success 'hook.runHookDir = interactive is respected by list' '
 	setup_hookdir &&
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v5 6/8] parse-options: parse into strvec
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (4 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
                         ` (2 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.
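
To make the intended semantics concrete, here is a small standalone
model, independent of parse-options: it collects the values of a
repeatable "--arg" option in order and lets "--no-arg" clear whatever
was collected so far. The names collect_args() and struct value_vec are
assumptions for illustration; the real callback stores into git's
'struct strvec'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VALUES 16

/* Hypothetical fixed-size stand-in for git's 'struct strvec'. */
struct value_vec {
	const char *v[MAX_VALUES];
	size_t nr;
};

/*
 * Collect the values of a repeatable "--arg <value>" or "--arg=<value>"
 * option. "--no-arg" clears everything collected so far.
 * Returns 0 on success, -1 if a trailing "--arg" has no value.
 */
int collect_args(int argc, const char **argv, struct value_vec *out)
{
	int i;

	out->nr = 0;
	for (i = 0; i < argc; i++) {
		const char *value = NULL;

		if (!strcmp(argv[i], "--no-arg")) {
			out->nr = 0;
			continue;
		}
		if (!strncmp(argv[i], "--arg=", 6)) {
			value = argv[i] + 6;
		} else if (!strcmp(argv[i], "--arg")) {
			if (i + 1 >= argc)
				return -1;
			value = argv[++i];
		} else {
			continue; /* not ours, e.g. the hook name */
		}
		assert(out->nr < MAX_VALUES);
		out->v[out->nr++] = value;
	}
	return 0;
}
```

Only the values are kept, not the "--arg" option name itself, which is
the difference from the existing passthrough behavior described above.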

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, fixed one or two more places where I missed the argv_array->strvec
    rename.

 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4542d4d3f9..c2451dfb1b 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -207,6 +207,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 7030d8f3da..75cc8c7c96 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v5 7/8] hook: add 'run' subcommand
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (5 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. As
alternate ordering or parallelism is supported in the future, we should
add knobs to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the docs, and did less local application of single
    quotes. In order for hookdir hooks to run successfully with a space in
    the path, though, they must not be run with 'sh -c'. So we can treat the
    hookdir hooks specially, and warn users via doc about special
    considerations for configured hooks with spaces in their path.

 Documentation/git-hook.txt    |  12 +++-
 builtin/hook.c                |  40 +++++++++++++-
 hook.c                        | 100 ++++++++++++++++++++++++++++++++++
 hook.h                        |   7 +++
 t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++++-
 5 files changed, 218 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index f19875ed68..95d3687905 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,11 +9,12 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
+'git hook' run <hook-name>
 
 DESCRIPTION
 -----------
-You can list configured hooks with this command. Later, you will be able to run,
-add, and modify hooks with this command.
+You can list and run configured hooks with this command. Later, you will be able
+to add and modify hooks with this command.
 
 This command parses the default configuration files for sections `hook` and
 `hookcmd`. `hook` is used to describe the commands which will be run during a
@@ -64,6 +65,13 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
+run `<hook-name>`::
+
+Runs hooks configured for `<hook-name>`, in the same order displayed by `git
+hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
+containing special characters or spaces should be wrapped in single quotes:
+`command = '/my/path with spaces/script.sh' some args`.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 16324d4195..64aad28e54 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -84,6 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct strvec envs = STRVEC_INIT;
+	struct strvec args = STRVEC_INIT;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &envs, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	/*
+	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
+	 * to execute them. Hooks usually want to look at repository artifacts.
+	 */
+	if (!have_git_dir())
+		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
+			      builtin_hook_usage, run_options);
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("You must specify a hook event to run."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(envs.v, hookname.buf, &args, should_run_hookdir);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	const char *run_hookdir = NULL;
@@ -98,7 +134,7 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 			     builtin_hook_usage, 0);
 
 	/* after the parse, we should have "<command> <hookname> <args...>" */
-	if (argc < 1)
+	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 
@@ -120,6 +156,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[0], "list"))
 		return list(argc, argv, prefix);
+	if (!strcmp(argv[0], "run"))
+		return run(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index f4084e33c8..1494a32c1a 100644
--- a/hook.c
+++ b/hook.c
@@ -3,6 +3,7 @@
 #include "hook.h"
 #include "config.h"
 #include "run-command.h"
+#include "prompt.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
+{
+	struct strbuf prompt = STRBUF_INIT;
+	/*
+	 * If the path doesn't exist, don't bother adding the empty hook and
+	 * don't bother checking the config or prompting the user.
+	 */
+	if (!path)
+		return 0;
+
+	switch (cfg)
+	{
+		case hookdir_no:
+			return 0;
+		case hookdir_unknown:
+			fprintf(stderr,
+				_("Unrecognized value for 'hook.runHookDir'. "
+				  "Is there a typo? "));
+			/* FALLTHROUGH */
+		case hookdir_warn:
+			fprintf(stderr, _("Running legacy hook at '%s'\n"),
+				path);
+			return 1;
+		case hookdir_interactive:
+			do {
+				/*
+				 * TRANSLATORS: Make sure to include [Y] and [n]
+				 * in your translation. Only English input is
+				 * accepted. Default option is "yes".
+				 */
+				fprintf(stderr, _("Run '%s'? [Yn] "), path);
+				git_read_line_interactively(&prompt);
+				strbuf_tolower(&prompt);
+				if (starts_with(prompt.buf, "n")) {
+					strbuf_release(&prompt);
+					return 0;
+				} else if (starts_with(prompt.buf, "y")) {
+					strbuf_release(&prompt);
+					return 1;
+				}
+				/* otherwise, we didn't understand the input */
+			} while (prompt.len); /* an empty reply means "Yes" */
+			strbuf_release(&prompt);
+			return 1;
+		case hookdir_yes:
+		default:
+			return 1;
+	}
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
@@ -166,3 +217,52 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	strbuf_release(&hook_key);
 	return hook_head;
 }
+
+
+int run_hooks(const char *const *env, const char *hookname,
+	      const struct strvec *args, enum hookdir_opt run_hookdir)
+{
+	struct strbuf hookname_str = STRBUF_INIT;
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	strbuf_addstr(&hookname_str, hookname);
+
+	to_run = hook_list(&hookname_str);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+
+		if (hook->from_hookdir) {
+		    if (!should_include_hookdir(hook->command.buf, run_hookdir))
+			continue;
+		    /*
+		     * Commands from the config could be oneliners, but we know
+		     * for certain that hookdir commands are not.
+		     */
+		    hook_proc.use_shell = 0;
+		}
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			strvec_pushv(&hook_proc.args, args->v);
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index ca45d388d3..6eb1dc99c4 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -35,6 +36,12 @@ enum hookdir_opt
  * command line arguments.
  */
 enum hookdir_opt configured_hookdir_opt(void);
+/*
+ * Runs all hooks associated to the 'hookname' event in order. Each hook will be
+ * passed 'env' and 'args'.
+ */
+int run_hooks(const char *const *env, const char *hookname,
+	      const struct strvec *args, enum hookdir_opt run_hookdir);
 
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebd3bc623f..5b3003d59b 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	git hook run pre-commit 2>actual &&
+	test_must_be_empty actual
 '
 
 test_expect_success 'hook.runHookDir = warn is respected by list' '
@@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
+	test_i18ncmp expected actual &&
+
+	cat >expected <<-EOF &&
+	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
+	"Legacy Hook"
+	EOF
+
+	git hook run pre-commit 2>actual &&
 	test_i18ncmp expected actual
 '
 
@@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
 	test_cmp expected actual
 '
 
-test_expect_success 'hook.runHookDir = interactive is respected by list' '
+test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
 	setup_hookdir &&
 
 	test_config hook.runHookDir "interactive" &&
@@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	test_write_lines n | git hook run pre-commit 2>actual &&
+	! grep "Legacy Hook" actual &&
+
+	test_write_lines y | git hook run pre-commit 2>actual &&
+	grep "Legacy Hook" actual
+'
+
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	write_script sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm sample-hook.sh" &&
+
+	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hookdir hook included in git hook run' '
+	setup_hookdir &&
+
+	echo \"Legacy Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'out-of-repo runs excluded' '
+	setup_hooks &&
+
+	nongit test_must_fail git hook run pre-commit
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v5 8/8] hook: replace find_hook() with hook_exists()
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (6 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Many callers want to check whether some state could be modified by a
hook; that check should include the config-based hooks as well. Optimize
by checking the config directly. Since commands which execute hooks
might want to take args to replace 'hook.runHookDir', let
'hook_exists()' mirror the behavior of 'hook.runHookDir'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, a little more nuance when deciding whether a hookdir hook can happen.

 hook.c | 14 ++++++++++++++
 hook.h |  9 +++++++++
 2 files changed, 23 insertions(+)

diff --git a/hook.c b/hook.c
index 1494a32c1a..e3d289d0e9 100644
--- a/hook.c
+++ b/hook.c
@@ -218,6 +218,20 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	return hook_head;
 }
 
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
+{
+	const char *value = NULL; /* throwaway */
+	struct strbuf hook_key = STRBUF_INIT;
+
+	int could_run_hookdir = (should_run_hookdir == hookdir_interactive ||
+				should_run_hookdir == hookdir_warn ||
+				should_run_hookdir == hookdir_yes)
+				&& !!find_hook(hookname);
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || could_run_hookdir;
+}
 
 int run_hooks(const char *const *env, const char *hookname,
 	      const struct strvec *args, enum hookdir_opt run_hookdir)
diff --git a/hook.h b/hook.h
index 6eb1dc99c4..bf8ea3ee11 100644
--- a/hook.h
+++ b/hook.h
@@ -36,6 +36,15 @@ enum hookdir_opt
  * command line arguments.
  */
 enum hookdir_opt configured_hookdir_opt(void);
+
+/*
+ * Returns 1 if any hooks are specified in the config or if a hook exists in the
+ * hookdir. Typically, invoke hook_exists() like:
+ *   hook_exists(hookname, configured_hookdir_opt());
+ * Like with run_hooks, if you take a --run-hookdir flag, reflect that
+ * user-specified behavior here instead.
+ */
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
  * passed 'env' and 'args'.
-- 
2.28.0.rc0.142.g3c755180ce-goog


* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
@ 2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-16 17:29           ` Junio C Hamano
  2020-10-21 23:37           ` Emily Shaffer
  0 siblings, 2 replies; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-15 16:31 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Thu, Oct 15 2020, Emily Shaffer wrote:

> Notes:
>     Since v4, addressed comments from Jonathan Tan about wording.

I had some extensive comments on the v4 here:
https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/

Your CL & this patch don't mention it. I'd be interested in
collaborating on this depending on if/how our goals/wants align, but I'd
like to get your thoughts on that feedback first.

* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
@ 2020-10-16 17:29           ` Junio C Hamano
  2020-10-21 23:37           ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-10-16 17:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Emily Shaffer, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Oct 15 2020, Emily Shaffer wrote:
>
>> Notes:
>>     Since v4, addressed comments from Jonathan Tan about wording.
>
> I had some extensive comments on the v4 here:
> https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/
>
> Your CL & this patch don't mention it. I'd be interested in
> collaborating on this depending on if/how our goals/wants align, but I'd
> like to get your thoughts on that feedback first.

True.

It seems that it wasn't responded to (not even a single-liner "Thanks,
I'll get to it later" or "Thanks, but the goal I am aiming is
different from yours and your experience does not translate directly
here") and I can only conclude that it somehow was overlooked?

Emily?

Side note: perhaps it is just me, but after making a review and
giving extensive comments and suggestions, it is often disorienting
to read the next round without getting any hint on which parts of
the comments were heard and which other parts were dismissed (and
why).  I think your earlier review is a kind that deserves a
separate response before the updated patchset.

Thanks.



* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-16 17:29           ` Junio C Hamano
@ 2020-10-21 23:37           ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-21 23:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Thu, Oct 15, 2020 at 06:31:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Thu, Oct 15 2020, Emily Shaffer wrote:
> 
> > Notes:
> >     Since v4, addressed comments from Jonathan Tan about wording.
> 
> I had some extensive comments on the v4 here:
> https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/

Hum, it seems I completely missed it. I'm sorry - that was very rude of
me! I'll have a look now and reply there.

 - Emily

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
@ 2020-10-22  0:58         ` Emily Shaffer
  2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-22  0:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Wed, Oct 07, 2020 at 11:23:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Wed, Sep 09 2020, Emily Shaffer wrote:
> 
> First, thanks a lot for working on this. As you may have found I've done
> some small amount of actual work in this area before, but mostly just
> blathered about it on the ML.
> 
> > Begin a design document for config-based hooks, managed via git-hook.
> > Focus on an overview of the implementation and motivation for design
> > decisions. Briefly discuss the alternatives considered before this
> > point. Also, attempt to redefine terms to fit into a multihook world.
> > [...]
> > +[[status-quo]]
> > +=== Status quo
> > +
> > +Today users can implement multihooks themselves by using a "trampoline script"
> > +as their hook, and pointing that script to a directory or list of other scripts
> > +they wish to run.
> 
> ...or by setting core.hooksPath in their local/global/system
> config. Granted it doesn't cover the malicious hook injection case
> you're also trying to solve, but does address e.g. having a git server
> with a lot of centralized hooks.

Aha, setting core.hooksPath in the global/system config had not occurred
to me.

> 
> The "trampoline script" also isn't needed for the common case you
> mention, you just symlink the .git/hooks directory (as e.g. GitLab
> does). People usually use a trampoline script for e.g. using GNU
> parallel or something to execute N hooks.

Hm, I don't think that's quite true. Symlinking out .git/hooks doesn't
give me more than one $HOOKDIR/pre-commit - it just gives me a different
one. So if I wanted to run three different hooks, $HOOKDIR/pre-commit
would need to do the work of all three, regardless of where $HOOKDIR
points. That's what I meant when I said "multihooks" in this section.

But I think what you're trying to say is this: the "status quo" section
doesn't fully cover the status quo. There are more tricks than I
mentioned, e.g. 'git config --global core.hooksPath
/home/emily/githook/' to get the same set of hooks to run everywhere.
This approach still has some drawbacks - for example, it doesn't allow
me to use language-specific linters if I have repos in various
languages, without exempting an individual repo from the ~/githook/ by
'git config --local core.hooksPath
/home/emily/my-python-thing/.git/hook'.

It looks like, then, the "status quo" section needs some rework for the
next iteration.

> 
> 
> > +[[hook-directories]]
> > +=== Hook directories
> > +
> > +Other contributors have suggested Git learn about the existence of a directory
> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> 
> ...which seems like an easy thing to add later by having a "hookdir" in
> addition to "hookcmd", i.e. just specify a glob there instead of a
> cmd/path.

Hum, interesting! Something like so:

[hook.pre-commit]
  command = last-minute-checks

[hookdir.last-minute-checks]
  dir = /home/emily/last-minute-checks/*

And then the hooks library knows to go and run everything in
~/last-minute-checks/. This is easier to keep fresh than:

[hook.pre-commit]
  command = /home/emily/last-minute-checks/c-linter
  command = /home/emily/last-minute-checks/check-for-debug-prints
  command = /home/emily/last-minute-checks/check-for-notes
  ...

I actually like the idea of this for folks who might have a small number
of hooks they wrote for themselves. I wonder if it's applicable for
something like git-secrets, which presumably users would grab with a
'git clone' later.

It doesn't seem at odds with the rest of the design - how would you feel
about me adding it to the "future work" section at the end? Future work,
rather than "Emily will do this in the next couple of rounds", because:
 - I think nobody already has their hooks in $HOOKDIR/hook/pre-commit.d
   without a corresponding trampoline in $HOOKDIR/hook/pre-commit; so
   they could still call that trampoline, for now
 - I think it might be prone to some bikeshedding - e.g. should we
   recurse into ~/last-minute-checks/linters/c/? how far? what if some
   script requires magic options? etc? But as I'm typing those questions
   out they sound mostly trivial or ridiculous, so maybe my assessment
   is wrong here.
 - It sounds like you might be keen to write it, or at the very least,
   more keen than me
 - Practically speaking, I am not sure I have time to do it alongside
   the rest of the series. Again, my bikeshedding assessment could be
   wrong, and this extra feature could be totally trivial.
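
(A minimal sketch of the "run everything in a directory" idea, assuming the
hypothetical 'hookdir.<name>.dir' value is handed straight to POSIX glob(3).
The helper name for_each_hookdir_entry() and the callback type are made up
for illustration and are not part of the series. glob() returns matches
sorted by name unless GLOB_NOSORT is given, and this sketch does not recurse
into subdirectories.)

```c
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Callback invoked once per matched path, in glob (sorted) order. */
typedef void (*hook_path_fn)(const char *path, void *data);

/*
 * Hypothetical helper: expand a directory pattern such as
 * "/home/emily/last-minute-checks/*" and hand each match to 'fn'.
 * Returns the number of matches, 0 if nothing matched, -1 on error.
 */
static int for_each_hookdir_entry(const char *pattern, hook_path_fn fn,
				  void *data)
{
	glob_t g;
	size_t i;
	int n;

	memset(&g, 0, sizeof(g));

	switch (glob(pattern, 0, NULL, &g)) {
	case 0:
		break;
	case GLOB_NOMATCH:
		globfree(&g);
		return 0;
	default:
		globfree(&g);
		return -1;
	}

	for (i = 0; i < g.gl_pathc; i++)
		fn(g.gl_pathv[i], data);

	n = (int)g.gl_pathc;
	globfree(&g);
	return n;
}

/* Example callback: only reports what would be run. */
static void print_hook_path(const char *path, void *data)
{
	(void)data;
	printf("would run: %s\n", path);
}
```

Calling for_each_hookdir_entry("/home/emily/last-minute-checks/*",
print_hook_path, NULL) would list the candidates in name order. Whether to
check the executable bit or pass per-script options is left open, as above.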

> You already use "hookdir" for something else though, so that's a bit
> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
> perhaps more confusing...

"Hookdir" might be the wrong word to use, too - maybe it's better to
mirror "hookspath" there. Either way, "hookdir" and "hookspath" are
similar enough that I think it would be confusing, and "hookcmd" is
already getting some side-eye from me for not being a great choice.

Some thoughts for "a path to a directory in which multiple scripts for a
single hook live":
 - hookset
 - hookbatch (ugh, redundant with MS scripting)
 - hook.pre-commit.all-of = ~/last-minute-checks/
 -  "   "  .everything-in = "   "
...?

I think I named a couple silly ideas for "hookcmd" in another mail.

> 
> > [...]
> > +[[execution-ordering]]
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> >
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> > +
> > +[[parallelization]]
> > +=== Parallelization
> > +
> > +Users with many hooks might want to run them simultaneously, if the hooks don't
> > +modify state; if one hook depends on another's output, then users will want to
> > +specify those dependencies. If we decide to solve this problem, we may want to
> > +look to modern build systems for inspiration on how to manage dependencies and
> > +parallel tasks.
> 
> If you're taking requests it would make me very happy if we had
> parallelism in this from day one. It's the kind of thing that's hard to
> do by default once a feature is shipped since people will implicitly
> depend on it not being there, i.e. we won't know what we're breaking.

Hm. This might be tricky.

Some hooks are inherently not able to be parallelized - for example,
hooks which modify a given file, like the commit message draft. In
general, based on the handful of hooks I've converted locally, it's hard
to check whether a callsite assumes a hook could have modified state.
Usually this seems to be done with a call to find_hook() ("was there a
hook that might have run?") and then reopening the file. Sometimes a
file is reopened unconditionally. Sometimes the find_hook() call is
very far away from the run_hook_le() call.

The rest, then, which only read a file and say yes or no, probably don't
need to have a strict ordering - at least as far as Git is concerned.
And I think that's what you're worried about:

[hook.theoretical-parallelizable-event]
  command = check-and-mark-a-file-foo
  command = check-file-foo-and-do-something-else
  command = do-something-totally-unrelated

On day 1 of this feature, as written, this is safe. But if we aren't
careful and we start to parallelize *without* setting up dependency
ordering, e.g. 'git config --global hook.parallelize', and turn that on
by default without warning anyone, then the author of this config will
be unhappy.

But as I read further, you're talking about specifically *not* allowing
dependency ordering...

> 
> I think doing it this way is simple, covers most use cases, and solves a
> lot of the problems you note:
> 
> 1. Don't use config order to execute hooks, use glob'd name order
>    regardless of origin. I.e. a system-level hook is called "001-first"
>    is executed before a local hook called "999-at-the-end" (or the other
>    way around, i.e. hook origin doesn't matter).

Can you say a little more about why different ordering schema would
matter, if we effectively don't care which jobs are in parallel with
which, as you describe? I'm not quite following.
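
(For reference, a minimal sketch of what "glob'd name order regardless of
origin" could look like, assuming each hook carries a name and the config
scope it came from. 'struct named_hook' and sort_hooks_by_name() are
hypothetical names, not something in the series.)

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: a hook name plus the config scope it came from. */
struct named_hook {
	const char *name;	/* e.g. "001-first" */
	const char *origin;	/* e.g. "system", "global", "local" */
};

/* Compare by name only; the origin plays no role in the ordering. */
static int cmp_hook_name(const void *a, const void *b)
{
	const struct named_hook *ha = a;
	const struct named_hook *hb = b;

	return strcmp(ha->name, hb->name);
}

/*
 * Sort hooks into name order. qsort() is not stable, so the relative
 * order of two hooks with identical names is unspecified here.
 */
static void sort_hooks_by_name(struct named_hook *hooks, size_t nr)
{
	qsort(hooks, nr, sizeof(*hooks), cmp_hook_name);
}
```

With this ordering a system-level "001-first" sorts ahead of a local
"999-at-the-end" no matter which config file was read first.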

> 
> 2. We execute hooks parallel in that glob order, i.e. a pthread for-loop
>    that starts the 001-first task first, eventually getting to
>    999-at-the-end N at a time. I.e. the same as:
> 
>        parallel --jobs N --halt-on-error soon,fail=1 ::: <hooks-in-glob-order>
> 
>    This allows for parallelism but guarantees the very useful case of
>    having a global log hook being guaranteed to execute.

Ah, I think you're suggesting the glob order specifically to make up for
--halt-on-error in this case.

> 
> 3. A hook can define "parallel=no" in its config. We'll then run it
>    while no other hook is running.
> 
> 4. We don't attempt to do dependencies etc, if you need that sort of
>    complexity you can just make one of the hooks be a hook runner as
>    users do now for the common "make it parallel" case.

If we aren't attempting any magical ordering, then I don't really see a
big difference between glob vs. config order - presumably for most users
the effect would be the same. E.g. with N = (nproc * hyperthreading) and
M = (number of scripts I care to run), we will often have M < N, so all
jobs would run simultaneously anyways.

> 
> It's a relatively small change to the code you have already. I.e. the
> for_each() in run_hooks() would be called N times for each contiguous
> glob'd parallel/non-parallel segment, and hook_list()'s config parsing
> would learn to spew those out as a list-of-lists.
> 
> This also gives you a rudimentary implementation of the dependency
> schema you proposed for free. I.e. a definition of (pseudocode):
> 
>     hookcmd=000-first
>     parallel=no
> 
>     hookcmd=250-middle-abc
>     hookcmd=250-middle-xyz
> 
>     hookcmd=300-gather
>     parallel=no
> 
>     hookcmd=999-the-end
> 
> Would result in the pseudocode execution of;
> 
>     segments=[[000-first],
>               [250-middle-abc, 250-middle-xyz],

Hum. This seems to say "folks who started their hooks with the same
number agree that their hooks should also run simultaneously" - which
sounds like an even harder problem than "how do I know my ordering
number isn't the same as someone else's in another config file". Or else
I'm misunderstanding your pseudo :)

Ah, I see later you mention it directly as a dependency schema. I think
this offers the same set of problems I saw trying to use this as an
ordering schema, but worse in all the usual ways parallelism provides.
It is still impossible for someone writing a global or system config to
know where in the dependency chain more local hooks reside.

>               [300-gather],
>               [999-the-end]]
>     for each s in segments:
>         ok = run_in_parallel(s)
>         last if !ok # or, depending on "early exit?" config
> 
> I.e.:
> 
>  * The common case of people adding N hooks won't take sum(N) time.
> 
>  * parallel=no hooks aren't run in parallel with other non-parallel
>    hooks
> 
>  * We support a rudimentary dependency schema as a side-effect,
>    i.e. defining 300-gather as non-parallel allows it to act as the sole
>    "reduce" step in a map/reduce in a "map" step started with the 250-*
>    hooks.
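
(The segmenting in the quoted pseudocode can be sketched roughly as below,
assuming the hooks are already sorted by name and that each one carries a
flag derived from 'parallel=no'. 'struct hook_entry', segment_length() and
count_segments() are hypothetical names used only for this illustration.)

```c
#include <stddef.h>

/* Hypothetical record for one configured hook. */
struct hook_entry {
	const char *name;	/* e.g. "250-middle-abc" */
	int parallel;		/* 0 if the hook set parallel=no */
};

/*
 * Return the length of the segment starting at index 'start'.
 * A parallel=no hook is a segment by itself; a run of consecutive
 * parallel hooks forms one segment that may be started together.
 */
static size_t segment_length(const struct hook_entry *hooks, size_t nr,
			     size_t start)
{
	size_t end = start;

	if (start >= nr)
		return 0;
	if (!hooks[start].parallel)
		return 1;
	while (end < nr && hooks[end].parallel)
		end++;
	return end - start;
}

/* Walk the whole list and count how many segments it splits into. */
static size_t count_segments(const struct hook_entry *hooks, size_t nr)
{
	size_t i = 0, segments = 0;

	while (i < nr) {
		i += segment_length(hooks, nr, i);
		segments++;
	}
	return segments;
}
```

For the five hooks in the quoted example this yields four segments, with
the two 250-* hooks sharing one.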

As I understand it, the main concerns you have about getting
parallelization to happen on day 1 are like so:

 - keep users from assuming serial execution
 - avoid a messy schema change to deal with dependencies

I see the benefit of the former; I don't like the new schema proposed by
the latter. I do see that not turning it on day 1 would prevent us from
turning it on by default later, in case users did something silly like
assume dependencies.

Hrm.

I think we could turn on parallelization day 1 by providing an
explicitly-parallel API in hook.h (and a similar 'git hook run foo
--parallel' flag), and being more careful when converting hooks to call
run_hooks_parallel() instead of run_hooks(). That way hooks which will
never be parallelizable (e.g. commit-msg) won't get burned later by us
trying to be clever. Everyone else who can be parallelized is, in config
order, with no dependency management whatsoever. That leaves the door
open for us to add dependency management however we want later on, but
users can still roll their own with a launcher script today.
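
To make that split concrete, here is a minimal Python sketch of the two
entry points. It is an illustration only, not Git's hook.h: run_hooks
and run_hooks_parallel mirror the names above, but the signatures, the
`jobs` parameter, running each command through the shell, and the use of
a thread pool are all assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one hook command through the shell; True means exit status 0."""
    return subprocess.run(command, shell=True).returncode == 0


def run_hooks(commands):
    """Serial entry point: config order, stop at the first failure."""
    for command in commands:
        if not run_one(command):
            return False
    return True


def run_hooks_parallel(commands, jobs=4):
    """Parallel entry point: start hooks in config order, `jobs` at a time.

    No dependency management: every hook is treated as independent, and
    the result is success only if all of them succeed.
    """
    if not commands:
        return True
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # list() forces every hook to finish before the results are checked.
        return all(list(pool.map(run_one, commands)))
```

A caller for a hook like commit-msg would keep using run_hooks(), while
a "should we proceed?" hook could be switched to run_hooks_parallel()
once its callsite has been audited.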

I know I rambled a lot - I was trying to convince myself :) For now, I'd
prefer to add more detail to the "future work" section of the doc and
then not touch this problem with a very long pole... ;) Thoughts
welcome.

> 
> > +[[securing-hookdir-hooks]]
> > +=== Securing hookdir hooks
> > +
> > +With the design as written in this doc, it's still possible for a malicious user
> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> > +zip their repo and send it to another user. It may be necessary to teach Git to
> > +only allow inlined hooks like this if they were configured outside of the local
> > +scope (in other words, only run hookcmds, and only allow hookcmds to be
> > +configured in global or system scope); or another approach, like a list of safe
> > +projects, might be useful. It may also be sufficient (or at least useful) to
> > +teach a `hook.disableAll` config or similar flag to the Git executable.
> 
> I think this part of the doc should note a bit of the context in
> https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/
> 
> I.e. even if we get a 100% secure hook implementation we've done
> practically nothing for overall security, since we'll still run the
> pager, aliases etc. from that local repo.
> 
> This is a great step in the right direction, but it behooves us to note
> that, so some user reading this documentation without context doesn't
> think inspecting untrusted repositories like that is safe just because
> they set the right hook settings in their config (once what's being
> proposed here is implemented).

Yeah, I agree. I'll try to make that clearer in the doc in the next
reroll.

Very sorry again for having missed this - I think the first weeks of
October I was working from my local todo list instead of from the list
of replies in mutt. Urk.

 - Emily

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-22  0:58         ` Emily Shaffer
@ 2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
  2020-10-29 15:38             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-23 19:10 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Thu, Oct 22 2020, Emily Shaffer wrote:

> On Wed, Oct 07, 2020 at 11:23:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> 
>> On Wed, Sep 09 2020, Emily Shaffer wrote:
>> 
>> First, thanks a lot for working on this. As you may have found I've done
>> some small amount of actual work in this area before, but mostly just
>> blathered about it on the ML.
>> 
>> > Begin a design document for config-based hooks, managed via git-hook.
>> > Focus on an overview of the implementation and motivation for design
>> > decisions. Briefly discuss the alternatives considered before this
>> > point. Also, attempt to redefine terms to fit into a multihook world.
>> > [...]
>> > +[[status-quo]]
>> > +=== Status quo
>> > +
>> > +Today users can implement multihooks themselves by using a "trampoline script"
>> > +as their hook, and pointing that script to a directory or list of other scripts
>> > +they wish to run.
>> 
>> ...or by setting core.hooksPath in their local/global/system
>> config. Granted it doesn't cover the malicious hook injection case
>> you're also trying to solve, but does address e.g. having a git server
>> with a lot of centralized hooks.
>
> Aha, setting core.hooksPath in the global/system config had not occurred
> to me.

It's a useful hack.

>> 
>> The "trampoline script" also isn't needed for the common case you
>> mention, you just symlink the .git/hooks directory (as e.g. GitLab
>> does). People usually use a trampoline script for e.g. using GNU
>> parallel or something to execute N hooks.
>
> Hm, I don't think that's quite true. Symlinking out .git/hooks doesn't
> give me more than one $HOOKDIR/pre-commit - it just gives me a different
> one. So if I wanted to run three different hooks, $HOOKDIR/pre-commit
> would need to do the work of all three, regardless of where $HOOKDIR
> points. That's what I meant when I said "multihooks" in this section.
>
> But I think what you're trying to say is this: the "status quo" section
> doesn't fully cover the status quo. There are more tricks than I
> mentioned, e.g. 'git config --global core.hooksPath
> /home/emily/githook/' to get the same set of hooks to run everywhere.
> This approach still has some drawbacks - for example, it doesn't allow
> me to use language-specific linters if I have repos in various
> languages, without exempting an individual repo from the ~/githook/ by
> 'git config --local core.hooksPath
> /home/emily/my-python-thing/.git/hook'.
>
> It looks like, then, the "status quo" section needs some rework for the
> next iteration.

Re-reading your original patch I think I just misread that. I thought
you were saying a stub script was needed in the .git to point to a
multi-hook script, but I was pointing out that you can just symlink to
the multi-hook script (as e.g. GitLab does), but reading it again & this
I don't think you meant that at all. Never mind.

>> 
>> 
>> > +[[hook-directories]]
>> > +=== Hook directories
>> > +
>> > +Other contributors have suggested Git learn about the existence of a directory
>> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
>> 
>> ...which seems like an easy thing to add later by having a "hookdir" in
>> addition to "hookcmd", i.e. just specify a glob there instead of a
>> cmd/path.
>
> Hum, interesting! Something like so:
>
> [hook.pre-commit]
>   command = last-minute-checks
>
> [hookdir.last-minute-checks]
>   dir = /home/emily/last-minute-checks/*
>
> And then the hooks library knows to go and run everything in
> ~/last-minute-checks/. This is easier to keep fresh than:
>
> [hook.pre-commit]
>   command = /home/emily/last-minute-checks/c-linter
>   command = /home/emily/last-minute-checks/check-for-debug-prints
>   command = /home/emily/last-minute-checks/check-for-notes
>   ...
>
> I actually like the idea of this for folks who might have a small number
> of hooks they wrote for themselves. I wonder if it's applicable for
> something like git-secrets, which presumably users would grab with a
> 'git clone' later.
>
> It doesn't seem at odds with the rest of the design - how would you feel
> about me adding it to the "future work" section at the end? Future work,
> rather than "Emily will do this in the next couple of rounds", because:
>  - I think nobody already has their hooks in $HOOKDIR/hook/pre-commit.d
>    without a corresponding trampoline in $HOOKDIR/hook/pre-commit; so
>    they could still call that trampoline, for now
>  - I think it might be prone to some bikeshedding - e.g. should we
>    recurse into ~/last-minute-checks/linters/c/? how far? what if some
>    script requires magic options? etc? But as I'm typing those questions
>    out they sound mostly trivial or ridiculous, so maybe my assessment
>    is wrong here.
>  - It sounds like you might be keen to write it, or at the very least,
>    more keen than me
>  - Practically speaking, I am not sure I have time to do it alongside
>    the rest of the series. Again, my bikeshedding assessment could be
>    wrong, and this extra feature could be totally trivial.
>
>> You already use "hookdir" for something else though, so that's a bit
>> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
>> perhaps more confusing...
>
> "Hookdir" might be the wrong word to use, too - maybe it's better to
> mirror "hookspath" there. Either way, "hookdir" and "hookspath" are
> similar enough that I think it would be confusing, and "hookcmd" is
> already getting some side-eye from me for not being a great choice.
>
> Some thoughts for "a path to a directory in which multiple scripts for a
> single hook live":
>  - hookset
>  - hookbatch (ugh, redundant with MS scripting)
>  - hook.pre-commit.all-of = ~/last-minute-checks/
>  -  "   "  .everything-in = "   "
> ...?
>
> I think I named a couple silly ideas for "hookcmd" in another mail.

To both of the above: Yeah I'm not saying you need to do the work, just
that I think it would be a useful case to bikeshed now since it seems
inevitable that we'll get a "find hooks in this dir by glob" once we
have this facility. So a config syntax for that which isn't overly
confusing, and which is extensible to that case, would be useful,
particularly as the current syntax uses "dir" already.

>> 
>> > [...]
>> > +[[execution-ordering]]
>> > +=== Execution ordering
>> > +
>> > +We may find that config order is insufficient for some users; for example,
>> > +config order makes it difficult to add a new hook to the system or global config
>> > +which runs at the end of the hook list. A new ordering schema should be:
>> > +
>> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
>> > +their order change;
>> > +
>> > +2) Either dependency or numerically based.
>> > +
>> > +Dependency-based ordering is prone to classic linked-list problems, like
>> > +cycles and handling of missing dependencies. But, it paves the way for enabling
>> > +parallelization if some tasks truly depend on others.
>> >
>> > +Numerical ordering makes it tricky for Git to generate suggested ordering
>> > +numbers for each command, but is easy to determine a definitive order.
>> > +
>> > +[[parallelization]]
>> > +=== Parallelization
>> > +
>> > +Users with many hooks might want to run them simultaneously, if the hooks don't
>> > +modify state; if one hook depends on another's output, then users will want to
>> > +specify those dependencies. If we decide to solve this problem, we may want to
>> > +look to modern build systems for inspiration on how to manage dependencies and
>> > +parallel tasks.
>> 
>> If you're taking requests it would make me very happy if we had
>> parallelism in this from day one. It's the kind of thing that's hard to
>> do by default once a feature is shipped since people will implicitly
>> depend on it not being there, i.e. we won't know what we're breaking.
>
> Hm. This might be tricky.
>
> Some hooks are inherently not able to be parallelized - for example,
> hooks which modify a given file, like the commit message draft. In
> general, based on the handful of hooks I've converted locally, it's hard
> to check whether a callsite assumes a hook could have modified state.
> Usually this seems to be done with a call to find_hook() ("was there a
> hook that might have run?") and then reopening the file. Sometimes a
> file is reopened unconditionally. Sometimes the find_hook() call is
> very far away from the run_hook_le() call.
>
> The rest, then, which only read a file and say yes or no, probably don't
> need to have a strict ordering - at least as far as Git is concerned.
> And I think that's what you're worried about:
>
> [hook.theoretical-parallelizable-event]
>   command = check-and-mark-a-file-foo
>   command = check-file-foo-and-do-something-else
>   command = do-something-totally-unrelated
>
> On day 1 of this feature, as written, this is safe. But if we aren't
> careful and we start to parallelize *without* setting up dependency
> ordering, e.g. 'git config --global hook.parallelize', and turn that on
> by default without warning anyone, then the author of this config will
> be unhappy.
>
> But as I read further, you're talking about specifically *not* allowing
> dependency ordering...
>
>> 
>> I think doing it this way is simple, covers most use cases, and solves a
>> lot of the problems you note:
>> 
>> 1. Don't use config order to execute hooks, use glob'd name order
>>    regardless of origin. I.e. a system-level hook is called "001-first"
>>    is executed before a local hook called "999-at-the-end" (or the other
>>    way around, i.e. hook origin doesn't matter).
>
> Can you say a little more about why different ordering schema would
> matter, if we effectively don't care which jobs are in parallel with
> which, as you describe? I'm not quite following.
>
>> 
>> 2. We execute hooks parallel in that glob order, i.e. a pthread for-loop
>>    that starts the 001-first task first, eventually getting to
>>    999-at-the-end N at a time. I.e. the same as:
>> 
>>        parallel --jobs N --halt-on-error soon,fail=1 ::: <hooks-in-glob-order>
>> 
>>    This allows for parallelism but guarantees the very useful case of
>>    having a global log hook being guaranteed to execute.
>
> Ah, I think you're suggesting the glob order specifically to make up for
> --halt-on-error in this case.
>
>> 
>> 3. A hook can define "parallel=no" in its config. We'll then run it
>>    while no other hook is running.
>> 
>> 4. We don't attempt to do dependencies etc, if you need that sort of
>>    complexity you can just make one of the hooks be a hook runner as
>>    users do now for the common "make it parallel" case.
>
> If we aren't attempting any magical ordering, then I don't really see a
> big difference between glob vs. config order - presumably for most users
> the effect would be same, e.g. N = $(nproc * hyperthreading), M = (number of scripts I
> care to run) probably will often result in M < N, so all jobs would run
> simultaneously anyways.
>
>> 
>> It's a relatively small change to the code you have already. I.e. the
>> for_each() in run_hooks() would be called N times for each continuous
>> glob'd parallel/non-parallel segment, and hook_list()'s config parsing
>> would learn to spew those out as a list-of-lists.
>> 
>> This also gives you a rudimentary implementation of the dependency
>> schema you proposed for free. I.e. a definition of (pseudocode):
>> 
>>     hookcmd=000-first
>>     parallel=no
>> 
>>     hookcmd=250-middle-abc
>>     hookcmd=250-middle-xyz
>> 
>>     hookcmd=300-gather
>>     parallel=no
>> 
>>     hookcmd=999-the-end
>> 
>> Would result in the pseudocode execution of;
>> 
>>     segments=[[000-first],
>>               [250-middle-abc, 250-middle-xyz],
>
> Hum. This seems to say "folks who started their hooks with the same
> number agree that their hooks should also run simultaneously" - which
> sounds like an even harder problem than "how do I know my ordering
> number isn't the same as someone else's in another config file". Or else
> I'm misunderstanding your pseudo :)

The prefix number isn't meaningful in that way, i.e. if you have 10
threads and 5 hooks starting with 250-* they won't all be invoked at the
same time.

> Ah, I see later you mention it directly as a dependency schema. I think
> this offers the same set of problems I saw trying to use this as an
> ordering schema, but worse in all the usual ways parallelism provides.
> It is still impossible for someone writing a global or system config to
> know where in the dependency chain more local hooks reside.
>
>>               [300-gather],
>>               [999-the-end]]
>>     for each s in segments:
>>         ok = run_in_parallel(s)
>>         last if !ok # or, depending on "early exit?" config
>> 
>> I.e.:
>> 
>>  * The common case of people adding N hooks won't take sum(N) time.
>> 
>>  * parallel=no hooks aren't run in parallel with other non-parallel
>>    hooks
>> 
>>  * We support a rudimentary dependency schema as a side-effect,
>>    i.e. defining 300-gather as non-parallel allows it to act as the sole
>>    "reduce" step in a map/reduce in a "map" step started with the 250-*
>>    hooks.
>
> As I understand it, the main concerns you have about getting
> parallelization to happen on day 1 are like so:
>
>  - keep users from assuming serial execution
>  - avoid a messy schema change to deal with dependencies
>
> I see the benefit of the former; I don't like the new schema proposed by
> the latter. I do see that not turning it on day 1 would prevent us from
> turning it on by default later, in case users did something silly like
> assume dependencies.
>
> Hrm.
>
> I think we could turn on parallelization day 1 by providing an
> explicitly-parallel API in hook.h (and a similar 'git hook run foo
> --parallel' flag), and being more careful when converting hooks to call
> run_hooks_parallel() instead of run_hooks(). That way hooks which will
> never be parallelizable (e.g. commit-msg) won't get burned later by us
> trying to be clever. Everyone else who can be parallelized is, in config
> order, with no dependency management whatsoever. That leaves the door
> open for us to add dependency management however we want later on, but
> users can still roll their own with a launcher script today.
>
> I know I rambled a lot - I was trying to convince myself :) For now, I'd
> prefer to add more detail to the "future work" section of the doc and
> then not touch this problem with a very long pole... ;) Thoughts
> welcome.

I'm replying to much of the above in general here, particularly since
much of it was in the form of a question you answered yourself later :)

Yes as you point out the reason I'm raising the parallel thing now is
"keep users from assuming serial execution", i.e. any implementation
that isn't like that from day 1 will need more verbose syntax to opt-in
to that.

I think parallel is the sane default, although there's a really strong
case as you point out with the "commit-msg" hook for treating that on a
hook-type basis. E.g. commit-msg (in-place editing of a single file)
being non-parallel by default, but e.g. post-commit, pre-applypatch,
pre-receive and other "should we proceed?" hooks being parallel.

But I'm also raising a general concern with the design of the API /
command around this.

I don't see the need for having a git hook list/edit/add command at
all. We should just keep this simpler and be able to point to "git
config --add/--get-regexp" etc.

It seems the reason to introduce this command API around it is because
you're imagining that git needs to manage hooks whose relative execution
order is important, and that once this lands you'll aim to implement a
much more complex dependency management schema.

I just can't imagine a case that needs that, where say 10 hooks need to
execute in the exact order 1/2/3/4, and where the author of that tight
coupling wouldn't also want to roll it all into one script. Or at least
it's an obscure enough case that we can just say "do that".

Whereas I do think "run a bunch of independent checks, if all pass
proceed" is *the* common case, e.g. adding a bunch of pre-receive
hooks. If we tell the user we'll treat those as independent programs we
can run them in parallel. The vast majority of users will benefit from
the default faster execution.

The "glob order" case I mentioned is extra complexity on top of that,
yes, but I think that concession is sane for the common case of "yes
parallel, but I want to always run the always-exit-0 log
hook". E.g. I've used this to set up a hook to log push
attempts/successes in a hook framework that runs N pre-receive hooks.
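
The glob-ordered segment scheme from the quoted pseudocode can be
sketched in Python as follows. This is an illustration under
assumptions, not code from the series: build_segments and run_segments
are made-up names, hooks are modelled as a dict from name to a "may run
in parallel" flag, and a thread pool stands in for whatever job runner
Git would use.

```python
from concurrent.futures import ThreadPoolExecutor


def build_segments(hooks):
    """Group hooks, sorted by name, into a list of segments.

    `hooks` maps a hook name to True (may run in parallel) or False
    (parallel=no). A parallel=no hook always gets a segment to itself;
    neighbouring parallel hooks share one.
    """
    segments = []
    open_segment = None  # the parallel segment currently being filled
    for name in sorted(hooks):
        if hooks[name]:
            if open_segment is None:
                open_segment = []
                segments.append(open_segment)
            open_segment.append(name)
        else:
            segments.append([name])
            open_segment = None
    return segments


def run_segments(segments, run_one, jobs=4):
    """Run segments in order; stop after the first segment with a failure."""
    for segment in segments:
        with ThreadPoolExecutor(max_workers=jobs) as pool:
            results = list(pool.map(run_one, segment))
        if not all(results):
            return False
    return True


example = {
    "000-first": False,
    "250-middle-abc": True,
    "250-middle-xyz": True,
    "300-gather": False,
    "999-the-end": True,
}
print(build_segments(example))
```

For the five hooks in the example this prints the same list-of-lists as
the pseudocode: 000-first alone, the two 250-* hooks together,
300-gather alone, then 999-the-end.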

All that being said I'm open to being convinced, I just don't see what
the target user is, and the submitted docs don't really make a case for
it. I.e. there's plenty of "what" not "why would someone want this...".

>> 
>> > +[[securing-hookdir-hooks]]
>> > +=== Securing hookdir hooks
>> > +
>> > +With the design as written in this doc, it's still possible for a malicious user
>> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
>> > +zip their repo and send it to another user. It may be necessary to teach Git to
>> > +only allow inlined hooks like this if they were configured outside of the local
>> > +scope (in other words, only run hookcmds, and only allow hookcmds to be
>> > +configured in global or system scope); or another approach, like a list of safe
>> > +projects, might be useful. It may also be sufficient (or at least useful) to
>> > +teach a `hook.disableAll` config or similar flag to the Git executable.
>> 
>> I think this part of the doc should note a bit of the context in
>> https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/
>> 
>> I.e. even if we get a 100% secure hook implementation we've done
>> practically nothing for overall security, since we'll still run the
>> pager, aliases etc. from that local repo.
>> 
>> This is a great step in the right direction, but it behooves us to note
>> that, so some user reading this documentation without context doesn't
>> think inspecting untrusted repositories like that is safe just because
>> they set the right hook settings in their config (once what's being
>> proposed here is implemented).
>
> Yeah, I agree. I'll try to make that clearer in the doc in the next
> reroll.
>
> Very sorry again for having missed this - I think the first weeks of
> October I was working from my local todo list instead of from the list
> of replies in mutt. Urk.

*nod*

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
@ 2020-10-29 15:38             ` Emily Shaffer
  2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-29 15:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, h; +Cc: git

On Fri, Oct 23, 2020 at 09:10:24PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> You already use "hookdir" for something else though, so that's a bit
> >> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
> >> perhaps more confusing...
> >
> > "Hookdir" might be the wrong word to use, too - maybe it's better to
> > mirror "hookspath" there. Either way, "hookdir" and "hookspath" are
> > similar enough that I think it would be confusing, and "hookcmd" is
> > already getting some side-eye from me for not being a great choice.
> >
> > Some thoughts for "a path to a directory in which multiple scripts for a
> > single hook live":
> >  - hookset
> >  - hookbatch (ugh, redundant with MS scripting)
> >  - hook.pre-commit.all-of = ~/last-minute-checks/
> >  -  "   "  .everything-in = "   "
> > ...?
> >
> > I think I named a couple silly ideas for "hookcmd" in another mail.
> 
> To both of the above: Yeah I'm not saying you need to do the work, just
> that I think it would be a useful case to bikeshed now since it seems
> inevitable that we'll get a "find hooks in this dir by glob" once we
> have this facility. So having a config syntax for that which isn't
> overly confusing / extensible to that case would be useful, i.e. as the
> current syntax uses "dir" already.

Yeah. I'm not sure that it needs to happen right away. Because
hook.*.command // hookcommand.*.command gets passed right into
run_command()-with-shell, it's possible for a user who's keen to also
set `hook.*.command = find /some/path -type f | xargs` in the meantime.
And also because it's passed right into run_command()-with-shell, it's
hard to do some smart wildcarding on the .command config and try to
figure out the right syntax. I'd just as soon see something explicit
like the configs I mentioned above, which can be added pretty easily
after the fact. I think what you're mostly saying, though, is "Leave
some words for glob execution!" and that I can appreciate.
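
For comparison, an explicit directory-style key such as the
hookdir.<name>.dir idea from earlier in the thread could be expanded
into an ordered command list roughly like this. The Python sketch is
hypothetical: expand_hookdir is an invented name, and skipping
non-executable files and sorting by path are assumptions, not settled
behavior.

```python
import glob
import os


def expand_hookdir(pattern):
    """Return executable regular files matching `pattern`, sorted by path."""
    return sorted(
        path
        for path in glob.glob(os.path.expanduser(pattern))
        if os.path.isfile(path) and os.access(path, os.X_OK)
    )
```

Each returned path could then be treated as if it had been listed as
its own hook.<event>.command entry. Whether to recurse into
subdirectories is left open here; this sketch does not.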

> > Hum. This seems to say "folks who started their hooks with the same
> > number agree that their hooks should also run simultaneously" - which
> > sounds like an even harder problem than "how do I know my ordering
> > number isn't the same as someone else's in another config file". Or else
> > I'm misunderstanding your pseudo :)
> 
> The prefix number isn't meaningful in that way, i.e. if you have 10
> threads and 5 hooks starting with 250-* they won't all be invoked at the
> same time.

Ok. I misunderstood, then.

> > I know I rambled a lot - I was trying to convince myself :) For now, I'd
> > prefer to add more detail to the "future work" section of the doc and
> > then not touch this problem with a very long pole... ;) Thoughts
> > welcome.
> 
> I'm replying to much of the above in general here, particularly since
> much of it was in the form of a question you answered yourself later :)
> 
> Yes as you point out the reason I'm raising the parallel thing now is
> "keep users from assuming serial execution", i.e. any implementation
> that isn't like that from day 1 will need more verbose syntax to opt-in
> to that.
> 
> I think parallel is the sane default, although there's a really strong
> case as you point out with the "commit-msg" hook for treating that on a
> hook-type basis. E.g. commit-msg (in-place editing of a single file)
> being non-parallel by default, but e.g. post-commit, pre-applypatch,
> pre-receive and other "should we proceed?" hooks being parallel.

Yeah. I think you've sold me. So what I will do is thus: before I send
the next reroll (as I'm pretty much done, locally, and hope to be ready
for nits next time) I'll take a look in 'git help githooks' and see
which ones expect writes to occur. I think there are more than just
"commit-msg". I'll add a bit to run_hooks() and a corresponding flag to
'git hook run', plus relevant documentation. I'll also plan to add
explicit documentation to 'git help githooks' mentioning parallel vs.
serial execution.

But I will plan on writing it stupidly - user configurable job number
but no dependency checking; and let the user turn off parallel execution
for everyone (hook.jobs=1) or for just one hook
(hook.pre-commit.parallel = false (?)). Like you and Jonathan N say, we
can add more sugar like hookcmd.*.depends later on when we need it.
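
As a rough illustration of how those knobs could combine, the Python
sketch below resolves the number of jobs for one hook event. hook.jobs
and hook.<name>.parallel are the tentative config names from the
paragraph above; the default of 4 jobs, the SERIAL_BY_DEFAULT set, and
the function name are assumptions made for the example.

```python
# Hook events assumed to write to shared state, so serial unless told
# otherwise. Only commit-msg is named in this thread; the real list is
# likely longer.
SERIAL_BY_DEFAULT = {"commit-msg"}


def effective_jobs(config, hook_name, default_jobs=4):
    """Return how many hooks may run at once for `hook_name`.

    `config` is a plain dict of already-parsed values, for example
    {"hook.jobs": 8, "hook.pre-commit.parallel": False}.
    """
    jobs = config.get("hook.jobs", default_jobs)
    parallel = config.get(
        f"hook.{hook_name}.parallel", hook_name not in SERIAL_BY_DEFAULT
    )
    if not parallel or jobs < 1:
        return 1
    return jobs
```

With an empty config this gives 4 jobs for pre-receive and 1 for
commit-msg; setting hook.jobs to 1 serializes everything, and setting
hook.pre-commit.parallel to false serializes only pre-commit.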

> 
> But I'm also raising a general concern with the design of the API /
> command around this.
> 
> I don't see the need for having a git hook list/edit/add command at
> all. We should just keep this simpler and be able to point to "git
> config --add/--get-regexp" etc.
> 
> It seems the reason to introduce this command API around it is because
> you're imagining that git needs to manage hooks whose relative execution
> order is important, and to later on once this lands aim to implement a
> much more complex dependency management schema.

No, I don't think that's the reason to have list/edit/add. The reason is
more for discoverability (if I 'git help git' or 'git^TAB', do I see
something handy in the command list that I didn't know about before?)
and user friendliness ("I can't remember the right config options to set
this up every dang time"). And 'list', I think, is handy for giving
users a dry run of what they can expect to see happen (and where to fix
them, since it lists the origin). Yes, a user could put it all together
from invocations of 'git config', but I personally think it's more
useful for Git to tell me what Git is going to do/what Git wants than
for my meat brain to try and guess :)
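
As a rough picture of what such a dry run might report, the Python
sketch below pairs each configured command with its origin. The tuple
layout and the "origin: command" output format are assumptions; nothing
in this thread specifies what 'git hook list' prints.

```python
def list_hooks(entries, hook_name):
    """Return 'origin: command' lines for `hook_name`, in config order.

    `entries` is a list of (origin, hook_name, command) tuples, already
    in the order the config was read (system, then global, then local).
    """
    return [
        f"{origin}: {command}"
        for origin, name, command in entries
        if name == hook_name
    ]
```

The origin column is what tells the user which config file to edit when
a listed hook is not the one they expected.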

> 
> I just can't imagine a case that needs that where say those 10 hooks
> need to execute in exact order 1/2/3/4 where the author of that tight
> coupling wouldn't also desire to roll that all into one script, or at
> least that it's an obscure enough case that we can just say "do that".
> 
> Whereas I do think "run a bunch of independent checks, if all pass
> proceed" is *the* common case, e.g. adding a bunch of pre-receive
> hooks. If we tell the user we'll treat those as independent programs we
> can run them in parallel. The vast majority of users will benefit from
> the default faster execution.
> 
> The "glob order" case I mentioned is extra complexity on top of that,
> yes, but I think that concession is sane for the common case of "yes
> parallel, but I want to always run the always-exit-0 log
> hook". E.g. I've used this to set up a hook to log push
> attempts/successes in a hook framework that runs N pre-receive hooks.

Reading this, I think I'm still missing something key about what you
think glob ordering provides. I'm not following why having the log hook
set early requires glob ordering over config ordering (since the config
ordering schema allows reordering via replacement), and I'm not
following why it's required to halt on failure.

> 
> All that being said I'm open to being convinced, I just don't see what
> the target user is, and the submitted docs don't really make a case for
> it. I.e. there's plenty of "what" not "why would someone want this...".

ACK. I'll try and go over the doc again before I reroll.

 - Emily

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-29 15:38             ` Emily Shaffer
@ 2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-29 20:04 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: h, git


On Thu, Oct 29 2020, Emily Shaffer wrote:

> On Fri, Oct 23, 2020 at 09:10:24PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> You already use "hookdir" for something else though, so that's a bit
>> >> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
>> >> perhaps more confusing...
>> >
>> > "Hookdir" might be the wrong word to use, too - maybe it's better to
>> > mirror "hookspath" there. Either way, "hookdir" and "hookspath" are
>> > similar enough that I think it would be confusing, and "hookcmd" is
>> > already getting some side-eye from me for not being a great choice.
>> >
>> > Some thoughts for "a path to a directory in which multiple scripts for a
>> > single hook live":
>> >  - hookset
>> >  - hookbatch (ugh, redundant with MS scripting)
>> >  - hook.pre-commit.all-of = ~/last-minute-checks/
>> >  -  "   "  .everything-in = "   "
>> > ...?
>> >
>> > I think I named a couple silly ideas for "hookcmd" in another mail.
>> 
>> To both of the above: Yeah I'm not saying you need to do the work, just
>> that I think it would be a useful case to bikeshed now since it seems
>> inevitable that we'll get a "find hooks in this dir by glob" once we
>> have this facility. So having a config syntax for that which isn't
>> overly confusing / extensible to that case would be useful, i.e. as the
>> current syntax uses "dir" already.
>
> Yeah. I'm not sure that it needs to happen right away. Because
> hook.*.command // hookcommand.*.command gets passed right into
> run_command()-with-shell, it's possible for a user who's keen to also
> set `hook.*.command = find /some/path -type f | xargs` in the meantime.
> And also because it's passed right into run_command()-with-shell, it's
> hard to do some smart wildcarding on the .command config and try to
> figure out the right syntax. I'd just as soon see something explicit
> like the configs I mentioned above, which can be added pretty easily
> after the fact. I think what you're mostly saying, though, is "Leave
> some words for glob execution!" and that I can appreciate.

Yeah, or rather: when naming config keys now, think about whether the
naming still makes sense if it's later expanded to support such glob
inclusion, which seems like a desired addition. But I won't belabor
that point.

Just one thing to add: We don't really need to come up with a syntax &
semantics for glob inclusion special to this, we'd use the sort of glob
patterns "Conditional includes" use, as documented in  git-config(1).

>> > Hum. This seems to say "folks who started their hooks with the same
>> > number agree that their hooks should also run simultaneously" - which
>> > sounds like an even harder problem than "how do I know my ordering
>> > number isn't the same as someone else's in another config file". Or else
>> > I'm misunderstanding your pseudo :)
>> 
>> The prefix number isn't meaningful in that way, i.e. if you have 10
>> threads and 5 hooks starting with 250-* they won't all be invoked at the
>> same time.
>
> Ok. I misunderstood, then.
>
>> > I know I rambled a lot - I was trying to convince myself :) For now, I'd
>> > prefer to add more detail to the "future work" section of the doc and
>> > then not touch this problem with a very long pole... ;) Thoughts
>> > welcome.
>> 
>> I'm replying to much of the above in general here, particularly since
>> much of it was in the form of a question you answered yourself later :)
>> 
>> Yes as you point out the reason I'm raising the parallel thing now is
>> "keep users from assuming serial execution", i.e. any implementation
>> that isn't like that from day 1 will need more verbose syntax to opt-in
>> to that.
>> 
>> I think parallel is the sane default, although there's a really strong
>> case as you point out with the "commit-msg" hook for treating that on a
>> hook-type basis. E.g. commit-msg (in-place editing of a single file)
>> being non-parallel by default, but e.g. post-commit, pre-applypatch,
>> pre-receive and other "should we proceed?" hooks being parallel.
>
> Yeah. I think you've sold me. So what I will do is thus: before I send
> the next reroll (as I'm pretty much done, locally, and hope to be ready
> for nits next time) I'll take a look in 'git help githooks' and see
> which ones expect writes to occur. I think there are more than just
> "commit-msg". I'll add a bit to run_hooks() and a corresponding flag to
> 'git hook run', plus relevant documentation. I'll also plan to add
> explicit documentation to 'git help githooks' mentioning parallel vs.
> serial execution.

Sounds good.

> But I will plan on writing it stupidly - user configurable job number
> but no dependency checking; and let the user turn off parallel execution
> for everyone (hook.jobs=1) or for just one hook
> (hook.pre-commit.parallel = false (?)). Like you and Jonathan N say, we
> can add more sugar like hookcmd.*.depends later on when we need it.

Yeah, that sounds great. As long as there's parallelism that stuff can
always be tweaked later.

>> 
>> But I'm also raising a general concern with the design of the API /
>> command around this.
>> 
>> I don't see the need for having a git hook list/edit/add command at
>> all. We should just keep this simpler and be able to point to "git
>> config --add/--get-regexp" etc.
>> 
>> It seems the reason to introduce this command API around it is because
>> you're imagining that git needs to manage hooks whose relative execution
>> order is important, and to later on once this lands aim to implement a
>> much more complex dependency management schema.
>
> No, I don't think that's the reason to have list/edit/add. The reason is
> more for discoverability (if I 'git help git' or 'git^TAB', do I see
> something handy in the command list that I didn't know about before?)
> and user friendliness ("I can't remember the right config options to set
> this up every dang time"). And 'list', I think, is handy for giving
> users a dry run of what they can expect to see happen (and where to fix
> them, since it lists the origin). Yes, a user could put it all together
> from invocations of 'git config', but I personally think it's more
> useful for Git to tell me what Git is going to do/what Git wants than
> for my meat brain to try and guess :)

Okay, that makes sense & I've got nothing against that, just clarifying
since it *looked* like it was the first step in some future addition of
complexity around this.

It would be nice if the docs for the new command were modified to state
that clearly, even to the point of saying "this is really just sugar for
this similar git-config invocation".

>> 
>> I just can't imagine a case that needs that where say those 10 hooks
>> need to execute in exact order 1/2/3/4 where the author of that tight
>> coupling wouldn't also desire to roll that all into one script, or at
>> least that it's an obscure enough case that we can just say "do that".
>> 
>> Whereas I do think "run a bunch of independent checks, if all pass
>> proceed" is *the* common case, e.g. adding a bunch of pre-receive
>> hooks. If we tell the user we'll treat those as independent programs we
>> can run them in parallel. The vast majority of users will benefit from
>> the default faster execution.
>> 
>> The "glob order" case I mentioned is extra complexity on top of that,
>> yes, but I think that concession is sane for the common case of "yes
>> parallel, but I want to always run the always-exit-0 log
>> hook". E.g. I've used this to setup a hook to run push
>> attempts/successes in a hook framework that runs N pre-receive hooks.
>
> Reading this, I think I'm still missing something key about what you
> think glob ordering provides. 

For context, I feel strongly that we should do parallel by default for
implementing something like this, it's great that per the above
discussion you're open to that.
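
To make "parallel by default" concrete, here's a minimal shell sketch,
assuming every hook is an independent check passed as a command string.
run_hooks_parallel is a hypothetical helper for illustration only, not
something in this series; it has no job limit and no dependency
checking:

```shell
#!/bin/sh
# Hypothetical helper: start every hook command in the background,
# then wait for each one and report failure if any of them failed.
run_hooks_parallel() {
	pids=""
	for cmd in "$@"; do
		sh -c "$cmd" &
		pids="$pids $!"
	done
	rc=0
	for pid in $pids; do
		# wait returns the exit status of that particular child
		wait "$pid" || rc=1
	done
	return $rc
}
```

Something like `run_hooks_parallel ./check-format ./check-syntax` (both
script names made up) would then succeed only if both checks pass. A
hook such as commit-msg that edits a single file in place would not fit
this model and would need to stay serial.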

This "glob ordering" is an entirely separate idea that I'm not strongly
advocating; there are pros & cons of doing that vs. config ordering.

 * Con: less obvious than config order, you write hooks "a c b" in the
   config and we execute in "a b c" order.

 * Pro: Sidesteps the issues you noted in "Execution ordering" in the
   docs you're adding, i.e. with config order it's impossible to execute
   a repo-local hook before a system-wide one; with glob order you can
   override that by having a local one called "000-something".

   I.e. now we'd read the config in the normal config order, and thus if
   there's a system hook there's no way to define a local hook to run
   first, until we get some sort of override for that.

> I'm not following why having the log hook set early requires glob
> ordering over config ordering (since the config ordering schema allows
> reordering via replacement)
> [...]
>  and I'm not following why it's required to halt on failure.

I realize I didn't elaborate on this; there's some past discussion[1][2]
about it.

I.e. when running N hooks sometimes you'd want to run them all (e.g. to
send notifications), but for others such as pre-receive.d guard checks
you don't have to run all N, if one check (say one checks commit format
validity, another code syntax) fails you'd like to abort early.

So halting on failure just saves CPU and time: you might have 10 hooks
that each take 1 second, and there's no point in making the user wait 10
seconds for all 10 checks if a failure of any one of them fails the push.

But OTOH you have other use-cases where users want to run them all
(talked about in the [1][2] discussion above), so it's been anticipated
as something we'd grow config for with multi-hook support.
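
The two behaviors can be sketched in shell like this. This is only an
illustration under my own assumptions: run_hooks_serial and its
"abort"/"all" mode names are hypothetical, not proposed config values:

```shell
#!/bin/sh
# Hypothetical helper: run_hooks_serial MODE CMD...
#   MODE=abort  stop at the first failing hook (guard checks)
#   MODE=all    run every hook, report failure at the end (notifications)
run_hooks_serial() {
	mode=$1
	shift
	rc=0
	for cmd in "$@"; do
		if ! sh -c "$cmd"; then
			rc=1
			if [ "$mode" = abort ]; then
				return 1
			fi
		fi
	done
	return $rc
}
```

With "abort", hooks listed after the first failure never start, which
is where the saved time comes from; with "all", everything runs and the
overall result is still a failure if any hook failed.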

Combined with such early abort, glob ordering allows for common cases
that aren't possible with config order.

E.g. consider a server with some common system-wide pre-receive.d hook
(e.g. author e-mail envelope check), and a SOX/PCI controlled repository
where some compliance thing says all push attempts must be logged.

You could then do:

    /etc/git/hooks/pre-receive.d/email-check
    /path/to/repo/hooks/pre-receive.d/000-log-push-attempt-to-db
    /path/to/repo/hooks/pre-receive.d/some-other-check

And we'd always run the 000-* hook first, whereas in the current schema
you can't do that without editing the system-wide config.
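
As a rough sketch of what "glob order" means here, assuming hooks are
plain files collected from several directories and ordered by file name
alone (list_hooks_glob_order is a hypothetical helper, and it doesn't
cope with tabs or newlines in paths):

```shell
#!/bin/sh
# Hypothetical helper: list_hooks_glob_order DIR...
# Prints the hook files found in all given directories, ordered by
# basename, so a repo-local "000-*" hook sorts before system-wide ones.
list_hooks_glob_order() {
	for dir in "$@"; do
		for path in "$dir"/*; do
			[ -f "$path" ] || continue
			# emit "basename<TAB>full path" so sort keys on the basename
			printf '%s\t%s\n' "${path##*/}" "$path"
		done
	done | LC_ALL=C sort | cut -f2-
}
```

Given the three paths above, this would list 000-log-push-attempt-to-db
first, then email-check, then some-other-check, regardless of which
directory is passed first.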

>> 
>> All that being said I'm open to being convinced, I just don't see what
>> the target user is, and the submitted docs don't really make a case for
>> it. I.e. there's plenty of "what" not "why would someone want this...".
>
> ACK. I'll try and go over the doc again before I reroll.
>
>  - Emily

1. https://lore.kernel.org/git/87wojjsv9p.fsf@evledraar.gmail.com/
2. https://public-inbox.org/git/CACBZZX6j6q2DUN_Z-Pnent1u714dVNPFBrL_PiEQyLmCzLUVxg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v6 00/17] propose config-based hooks (part I)
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (7 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2020-12-05  1:45       ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
                           ` (18 more replies)
  8 siblings, 19 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Hi folks, and thanks for the patience - I ran into many, many last-mile
challenges.

I haven't addressed many comments on the design doc yet - I was keen to get the
"functionally complete" implementation and conversion to the list.

Next on my plate:
 - Update the design doc to make sense with what's in the implementation.
 - A blog post! How to set up new hooks, why they're neat, etc.
 - We seem to have some Googlers interested in trying it out internally, so
   I'm hoping we'll gather and collate feedback from that soon too.
 - And of course addressing comments on this series.

Thanks!
 - Emily

Emily Shaffer (17):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: respect hook.runHookDir
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()
  hook: support passing stdin to hooks
  run-command: allow stdin for run_processes_parallel
  hook: allow parallel hook execution
  hook: allow specifying working directory for hooks
  run-command: add stdin callback for parallelization
  hook: provide stdin by string_list or callback
  run-command: allow capturing of collated output
  hooks: allow callers to capture output

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  19 +
 Documentation/git-hook.txt                    | 118 +++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 367 +++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/bugreport.c                           |   4 +-
 builtin/fetch.c                               |   1 +
 builtin/hook.c                                | 174 ++++++++
 builtin/submodule--helper.c                   |   2 +-
 git.c                                         |   1 +
 hook.c                                        | 417 ++++++++++++++++++
 hook.h                                        | 154 +++++++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 run-command.c                                 |  85 +++-
 run-command.h                                 |  31 ++
 submodule.c                                   |   1 +
 t/helper/test-run-command.c                   |  46 +-
 t/t0061-run-command.sh                        |  37 ++
 t/t1360-config-based-hooks.sh                 | 256 +++++++++++
 23 files changed, 1728 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 01/17] doc: propose hooks managed by the config
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
                           ` (17 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, addressed comments from Jonathan Tan about wording. However, I have
    not addressed AEvar's comments or done a full re-review of this document.
    I wanted to get the rest of the series out for initial review first.
    
     - Emily
    

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 2 files changed, 368 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..dac391f505
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,367 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hooks/hookname path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`
+or `hook.disableAll`. (These settings are described more in
+<<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-event>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `strvec` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct strvec *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+};
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but makes it easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 02/17] hook: scaffolding for git-hook subcommand
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
                           ` (16 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 56 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index f22b7a4cf1..094f58a175 100644
--- a/.gitignore
+++ b/.gitignore
@@ -76,6 +76,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 45bce31016..6ef9c0ee4e 100644
--- a/Makefile
+++ b/Makefile
@@ -1100,6 +1100,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..8df1d36a7a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -163,6 +163,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 4b7bd77b80..8e92b5d3f6 100644
--- a/git.c
+++ b/git.c
@@ -525,6 +525,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 03/17] hook: add list command
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
  2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
                           ` (15 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh
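
The lookup rule above (use the hookcmd's command when a hookcmd by that
name exists, otherwise run the configured value directly) can be sketched
in isolation. This is only an illustration, not the code in this patch:
'struct hookcmd_entry' and 'resolve_hook_command()' are hypothetical names,
and the config is assumed to be already flattened into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened view of "hookcmd.<name>.command" entries. */
struct hookcmd_entry {
	const char *name;
	const char *command;
};

/*
 * Resolve one "hook.<hookname>.command" value. If a hookcmd with that
 * name exists, its command wins (the last matching entry, mirroring
 * "last one wins" config lookups); otherwise the value itself is the
 * command to run.
 */
static const char *resolve_hook_command(const struct hookcmd_entry *cmds,
					size_t nr, const char *value)
{
	const char *resolved = value;
	size_t i;

	assert(value);
	for (i = 0; i < nr; i++)
		if (!strcmp(cmds[i].name, value))
			resolved = cmds[i].command;
	return resolved;
}
```

With the sample config above, "baz" would resolve to
"~/baz/from/hookcmd.sh", while "~/bar.sh" has no hookcmd and is returned
unchanged.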

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the sample in the commit message to reflect reality better.
    
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).
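
For reviewers who want to reason about the ordering rule without list.h,
here is a minimal standalone sketch of the "append, or move to the end if
already present" behaviour that the tests below expect. It assumes a plain
growable array rather than the linked list used in hook.c; every name in it
('struct hook_entry', 'struct hook_order', 'dup_or_die', 'append_or_move')
is hypothetical and the scope is reduced to a string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for 'struct hook'; the scope is a plain string. */
struct hook_entry {
	char *command;
	const char *scope;
};

/* Hypothetical growable array used here instead of Git's list.h. */
struct hook_order {
	struct hook_entry *items;
	size_t nr, alloc;
};

/* Copy a string or abort; stands in for an allocating helper. */
static char *dup_or_die(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/*
 * Append 'command'. If it is already listed, move the existing entry to
 * the end instead and relabel it with the scope that named it last.
 */
static void append_or_move(struct hook_order *list, const char *command,
			   const char *scope)
{
	struct hook_entry entry = { NULL, scope };
	size_t i;

	assert(list && command && scope);

	for (i = 0; i < list->nr; i++) {
		if (strcmp(list->items[i].command, command))
			continue;
		/* Reuse the old entry and close the gap it leaves behind. */
		entry.command = list->items[i].command;
		memmove(&list->items[i], &list->items[i + 1],
			(list->nr - i - 1) * sizeof(*list->items));
		list->nr--;
		break;
	}

	if (!entry.command)
		entry.command = dup_or_die(command);

	if (list->nr == list->alloc) {
		list->alloc = list->alloc ? 2 * list->alloc : 4;
		list->items = realloc(list->items,
				      list->alloc * sizeof(*list->items));
		if (!list->items)
			abort();
	}
	list->items[list->nr++] = entry;
}
```

Feeding it "/path/def" (global), "/path/ghi" (local) and then "/path/def"
again (local) leaves "/path/ghi" first and "/path/def" last, labelled
local, which is the order the 'reorders on duplicate commands' test checks.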

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 6ef9c0ee4e..4bf158c4f8 100644
--- a/Makefile
+++ b/Makefile
@@ -903,6 +903,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+		    list_del(pos);
+		    /* we'll simply move the hook to the end */
+		    to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 04/17] hook: include hookdir hook in list
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (2 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
                           ` (14 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
hookdir hooks should not stop working for users who already have them.

Legacy hooks should be run directly, not in a shell. We know that each
one is a path to an executable, not a oneliner script, and running it
directly takes care of path quoting concerns for free.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.
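
To make the labelling in 'git hook list' concrete, here is a small
standalone sketch of how one output line is assembled: a hookdir hook is
labelled "hookdir" regardless of config scope, and only hookdir hooks get
the trailing annotation (which is still empty in this patch). The names
'struct listed_hook' and 'format_hook_line()' are hypothetical, and the
scope is assumed to be available as a string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, flattened view of one entry printed by 'git hook list'. */
struct listed_hook {
	const char *scope;	/* e.g. "global" or "local" */
	const char *command;
	int from_hookdir;
};

/*
 * Write "<label>: <command><annotation>" into 'out'. The label is the
 * literal "hookdir" for hookdir hooks and the config scope otherwise;
 * the annotation is appended to hookdir hooks only. Returns what
 * snprintf() returns.
 */
static int format_hook_line(char *out, size_t len,
			    const struct listed_hook *hook,
			    const char *annotation)
{
	assert(out && hook && annotation);
	return snprintf(out, len, "%s: %s%s",
			hook->from_hookdir ? "hookdir" : hook->scope,
			hook->command,
			hook->from_hookdir ? annotation : "");
}
```

A configured hook ignores the annotation entirely, so passing a non-empty
annotation only changes the lines for hooks found in the hookdir.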

 builtin/hook.c                | 16 ++++++++++++----
 hook.c                        | 15 +++++++++++++++
 hook.h                        |  1 +
 t/t1360-config-based-hooks.sh | 19 +++++++++++++++++++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..45bbc83b2b 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -42,10 +43,17 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
diff --git a/hook.c b/hook.c
index 937dc768c8..ffbdcfd987 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -100,6 +102,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..5750634c83 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..0f12af4659 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 05/17] hook: respect hook.runHookDir
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (3 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
                           ` (13 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Include hooks specified in the hook directory in the list of hooks to
run. These hooks do need to be treated differently from config-specified
ones - they do not need to run in a shell, and later on may be disabled
or warned about based on a config setting.

Because they are at least as local as the local config, we'll run them
last - to keep the hook execution order from global to local.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.
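
The precedence this patch implements ("argument > config", with "yes" as
the default when nothing is set) can be sketched without any Git
internals. The enum and function names below are hypothetical mirrors of
'enum hookdir_opt' and friends. One deliberate simplification: the patch
die()s on an unrecognized --run-hookdir value, whereas this sketch maps
any unrecognized string to "unknown", which is how the patch treats an
unrecognized config value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of 'enum hookdir_opt'; names are illustrative. */
enum sketch_hookdir_opt {
	SKETCH_HOOKDIR_NO,
	SKETCH_HOOKDIR_WARN,
	SKETCH_HOOKDIR_INTERACTIVE,
	SKETCH_HOOKDIR_YES,
	SKETCH_HOOKDIR_UNKNOWN
};

/* Map a setting string to an option; NULL means "not set anywhere". */
static enum sketch_hookdir_opt parse_hookdir_opt(const char *value)
{
	if (!value)
		return SKETCH_HOOKDIR_YES; /* by default, just run it */
	if (!strcmp(value, "no"))
		return SKETCH_HOOKDIR_NO;
	if (!strcmp(value, "yes"))
		return SKETCH_HOOKDIR_YES;
	if (!strcmp(value, "warn"))
		return SKETCH_HOOKDIR_WARN;
	if (!strcmp(value, "interactive"))
		return SKETCH_HOOKDIR_INTERACTIVE;
	return SKETCH_HOOKDIR_UNKNOWN;
}

/* A command-line value, when given, beats the configured one. */
static enum sketch_hookdir_opt effective_hookdir_opt(const char *cli_value,
						     const char *config_value)
{
	return parse_hookdir_opt(cli_value ? cli_value : config_value);
}

/* Suffix shown next to hookdir hooks in the list output. */
static const char *hookdir_annotation(enum sketch_hookdir_opt opt)
{
	switch (opt) {
	case SKETCH_HOOKDIR_NO:
		return " (will not run)";
	case SKETCH_HOOKDIR_INTERACTIVE:
		return " (will prompt)";
	case SKETCH_HOOKDIR_WARN:
	case SKETCH_HOOKDIR_UNKNOWN:
		return " (will warn)";
	case SKETCH_HOOKDIR_YES:
	default:
		return "";
	}
}
```

Treating "unknown" like "warn" follows the switch in builtin/hook.c in
this patch; a caller that prefers to reject bad values outright would
check for the unknown value before using the result.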

 Documentation/config/hook.txt |  5 ++++
 builtin/hook.c                | 54 +++++++++++++++++++++++++++++++++--
 hook.c                        | 21 ++++++++++++++
 hook.h                        | 15 ++++++++++
 t/t1360-config-based-hooks.sh | 43 ++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] ("core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index 45bbc83b2b..16324d4195 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,6 +11,8 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
@@ -41,6 +43,26 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case hookdir_no:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case hookdir_interactive:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case hookdir_warn:
+		case hookdir_unknown:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case hookdir_yes:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
 		if (item) {
@@ -64,14 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = hookdir_no;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = hookdir_yes;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = hookdir_warn;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = hookdir_interactive;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index ffbdcfd987..340e5a35c8 100644
--- a/hook.c
+++ b/hook.c
@@ -97,6 +97,27 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return hookdir_yes; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return hookdir_no;
+
+	if (!strcmp(key, "yes"))
+		return hookdir_yes;
+
+	if (!strcmp(key, "warn"))
+		return hookdir_warn;
+
+	if (!strcmp(key, "interactive"))
+		return hookdir_interactive;
+
+	return hookdir_unknown;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
diff --git a/hook.h b/hook.h
index 5750634c83..ca45d388d3 100644
--- a/hook.h
+++ b/hook.h
@@ -21,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	hookdir_no,
+	hookdir_warn,
+	hookdir_interactive,
+	hookdir_yes,
+	hookdir_unknown,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 0f12af4659..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -104,4 +104,47 @@ test_expect_success 'git hook list shows hooks from the hookdir' '
 	test_cmp expected actual
 '
 
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 06/17] hook: implement hookcmd.<name>.skip
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (4 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
                           ` (12 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user wants a specific repo to skip execution of a hook which is set
at a global or system level, they can now do so by specifying 'skip' in
their repo config:

~/.gitconfig
  [hook.pre-commit]
    command = skippable-oneliner
    command = skippable-hookcmd

  [hookcmd.skippable-hookcmd]
    command = foo.sh

$GIT_DIR/config
  [hookcmd.skippable-oneliner]
    skip = true
  [hookcmd.skippable-hookcmd]
    skip = true

Later it may make sense to add an option like
"hookcmd.<name>.<hook-event>-skip" - but for simplicity, let's start
with a universal skip setting like this.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    In addition to being handy for turning off global hooks one project doesn't
    care about, this setting will be necessary much later for the 'proc-receive'
    hook, which can only cope with up to one hook being specified.
    
    New since v4.
    
    During the Google team's review club this week I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to exclude
    a given hook without this; however, I think we have some more work to do on it,
    so consider it RFC for now and tell me what you think :)
    
     - Emily
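
As a rough model of what 'skip' does to the list: each configured value is
either queued (moving to the end if it was already queued) or, when its
skip setting is true, dropped from the queue. The sketch below assumes the
caller has already looked up the skip setting for the value; 'struct
hook_queue', 'drop_command()' and 'apply_hook_value()' are hypothetical
names, and a fixed-size array stands in for the linked list.

```c
#include <assert.h>
#include <string.h>

#define MAX_HOOKS 16

/* Hypothetical fixed-size queue of commands, in execution order. */
struct hook_queue {
	const char *commands[MAX_HOOKS];
	size_t nr;
};

/* Remove 'command' from the queue if present; returns 1 if removed. */
static int drop_command(struct hook_queue *q, const char *command)
{
	size_t i;

	for (i = 0; i < q->nr; i++) {
		if (strcmp(q->commands[i], command))
			continue;
		memmove(&q->commands[i], &q->commands[i + 1],
			(q->nr - i - 1) * sizeof(q->commands[0]));
		q->nr--;
		return 1;
	}
	return 0;
}

/*
 * Apply one configured command: a skipped command is removed and not
 * re-added; any other command ends up at the tail of the queue.
 */
static void apply_hook_value(struct hook_queue *q, const char *command,
			     int skip)
{
	assert(q && command);

	drop_command(q, command);
	if (skip)
		return;

	assert(q->nr < MAX_HOOKS);
	q->commands[q->nr++] = command;
}
```

Applying "/path/def" with skip unset and "/path/ghi" with skip set leaves
only "/path/def" queued, which matches the 'removes skipped inlined hook'
test in this patch.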

 hook.c                        | 37 +++++++++++++++++++++++++----------
 t/t1360-config-based-hooks.sh | 23 ++++++++++++++++++++++
 2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/hook.c b/hook.c
index 340e5a35c8..f4084e33c8 100644
--- a/hook.c
+++ b/hook.c
@@ -12,23 +12,24 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void append_or_move_hook(struct list_head *head, const char *command)
+static struct hook* find_hook_by_command(struct list_head *head, const char *command)
 {
 	struct list_head *pos = NULL, *tmp = NULL;
-	struct hook *to_add = NULL;
+	struct hook *found = NULL;
 
-	/*
-	 * remove the prior entry with this command; we'll replace it at the
-	 * end.
-	 */
 	list_for_each_safe(pos, tmp, head) {
 		struct hook *it = list_entry(pos, struct hook, list);
 		if (!strcmp(it->command.buf, command)) {
 		    list_del(pos);
-		    /* we'll simply move the hook to the end */
-		    to_add = it;
+		    found = it;
 		}
 	}
+	return found;
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct hook *to_add = find_hook_by_command(head, command);
 
 	if (!to_add) {
 		/* adding a new hook, not moving an old one */
@@ -41,7 +42,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 	/* re-set the scope so we show where an override was specified */
 	to_add->origin = current_config_scope();
 
-	list_add_tail(&to_add->list, pos);
+	list_add_tail(&to_add->list, head);
 }
 
 static void remove_hook(struct list_head *to_remove)
@@ -73,8 +74,18 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	if (!strcmp(key, hook_key)) {
 		const char *command = value;
 		struct strbuf hookcmd_name = STRBUF_INIT;
+		int skip = 0;
+
+		/*
+		 * Check if we're removing that hook instead. Hookcmds are
+		 * removed by name, and inlined hooks are removed by command
+		 * content.
+		 */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.skip", command);
+		git_config_get_bool(hookcmd_name.buf, &skip);
 
 		/* Check if a hookcmd with that name exists. */
+		strbuf_reset(&hookcmd_name);
 		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
 		git_config_get_value(hookcmd_name.buf, &command);
 
@@ -89,7 +100,13 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 		 *   for each key+value, do_callback(key, value, cb_data)
 		 */
 
-		append_or_move_hook(head, command);
+		if (skip) {
+			struct hook *to_remove = find_hook_by_command(head, command);
+			if (to_remove)
+				remove_hook(&(to_remove->list));
+		} else {
+			append_or_move_hook(head, command);
+		}
 
 		strbuf_release(&hookcmd_name);
 	}
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 91127a50a4..ebd3bc623f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -132,6 +132,29 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 	test_i18ncmp expected actual
 '
 
+test_expect_success 'git hook list removes skipped hookcmd' '
+	setup_hookcmd &&
+	test_config hookcmd.abc.skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	no commands configured for hook '\''pre-commit'\''
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'git hook list removes skipped inlined hook' '
+	setup_hooks &&
+	test_config hookcmd."$ROOT/path/ghi".skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
 
 test_expect_success 'hook.runHookDir = interactive is respected by list' '
 	setup_hookdir &&
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 07/17] parse-options: parse into strvec
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (5 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
                           ` (11 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.
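
As a rough illustration of the callback semantics described above, here
is a standalone sketch. It does not use Git's real strvec or
parse-options code: 'struct demo_vec' and every 'demo_*' function are
hypothetical stand-ins, assumed only for the sake of a self-contained
example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Git's struct strvec; not the real API. */
struct demo_vec {
	char **v;
	size_t nr;
};

/* Free every stored string and reset the vector to empty. */
void demo_vec_clear(struct demo_vec *vec)
{
	size_t i;

	for (i = 0; i < vec->nr; i++)
		free(vec->v[i]);
	free(vec->v);
	vec->v = NULL;
	vec->nr = 0;
}

/* Append a copy of 's'; returns 0 on success, -1 on allocation failure. */
int demo_vec_push(struct demo_vec *vec, const char *s)
{
	char **grown = realloc(vec->v, (vec->nr + 1) * sizeof(*grown));
	char *copy;

	if (!grown)
		return -1;
	vec->v = grown;

	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	vec->v[vec->nr++] = copy;
	return 0;
}

/*
 * Same shape as the callback in this patch: a negated option
 * (unset != 0) clears the vector, a missing value is an error, and
 * otherwise only the value (not the option name) is appended.
 */
int demo_parse_value(struct demo_vec *vec, const char *arg, int unset)
{
	if (unset) {
		demo_vec_clear(vec);
		return 0;
	}
	if (!arg)
		return -1;
	return demo_vec_push(vec, arg);
}
```

Feeding the sketch "--quiet" and then "--format=pretty" leaves exactly
those two strings in the vector, matching the example in the message.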

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, fixed one or two more places where I missed the argv_array->strvec
    rename.

 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4542d4d3f9..c2451dfb1b 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -207,6 +207,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 7030d8f3da..75cc8c7c96 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 08/17] hook: add 'run' subcommand
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (6 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-11 10:15           ` Phillip Wood
  2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
                           ` (10 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. Once
alternate ordering or parallelism is supported, we should add
command-line knobs for them as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GIT_DIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into a strvec, then expanded with
'expand_user_path()'.
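
The "in series, in config order" behavior can be sketched in isolation.
This is only a model under stated assumptions: 'demo_hook_fn',
'demo_run_in_series()', 'demo_ok()', 'demo_fail()' and 'demo_calls' are
hypothetical names, and plain function pointers stand in for child
processes. The one detail taken from the diff below is that results are
combined with 'rc |= ...', so a failing hook does not stop later hooks
from running.

```c
/* Hypothetical hook type: returns 0 on success, non-zero on failure. */
typedef int (*demo_hook_fn)(void);

/* Counts how many hooks actually ran; only here to make that observable. */
int demo_calls;

int demo_ok(void)
{
	demo_calls++;
	return 0;
}

int demo_fail(void)
{
	demo_calls++;
	return 1;
}

/*
 * Run every hook in list order. Exit codes are OR-ed together, so the
 * result is non-zero if any hook failed, and every hook still runs.
 */
int demo_run_in_series(demo_hook_fn *hooks, int nr)
{
	int rc = 0;
	int i;

	for (i = 0; i < nr; i++)
		rc |= hooks[i]();

	return rc;
}
```

A caller that wants fail-fast behavior would need to break out of the
loop instead; that is a different design from the one posted here.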

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the docs, and did less local application of single
    quotes. In order for hookdir hooks to run successfully with a space in
    the path, though, they must not be run with 'sh -c'. So we can treat the
    hookdir hooks specially, and warn users via doc about special
    considerations for configured hooks with spaces in their path.

 Documentation/git-hook.txt    |  31 +++++++++-
 builtin/hook.c                |  48 ++++++++++++++-
 hook.c                        | 112 ++++++++++++++++++++++++++++++++++
 hook.h                        |  32 ++++++++++
 t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++-
 5 files changed, 281 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index f19875ed68..18a817d832 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,11 +9,12 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
 
 DESCRIPTION
 -----------
-You can list configured hooks with this command. Later, you will be able to run,
-add, and modify hooks with this command.
+You can list and run configured hooks with this command. Later, you will be able
+to add and modify hooks with this command.
 
 This command parses the default configuration files for sections `hook` and
 `hookcmd`. `hook` is used to describe the commands which will be run during a
@@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+
+Runs hooks configured for `<hook-name>`, in the same order displayed by `git
+hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
+containing special characters or spaces should be wrapped in single quotes:
+`command = '/my/path with spaces/script.sh' some args`.
+
+OPTIONS
+-------
+--run-hookdir::
+	Overrides the hook.runHookDir config. Must be 'yes', 'warn',
+	'interactive', or 'no'. Specifies how to handle hooks located in the Git
+	hook directory (core.hooksPath).
+
+-a::
+--arg::
+	Only valid for `run`.
++
+Specify arguments to pass to every hook that is run.
+
+-e::
+--env::
+	Only valid for `run`.
++
+Specify environment variables to set for every hook that is run.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 16324d4195..26f7050387 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -84,6 +86,46 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	int rc = 0;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &opt.env, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	/*
+	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
+	 * to execute them. Hooks usually want to look at repository artifacts.
+	 */
+	if (!have_git_dir())
+		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
+			      builtin_hook_usage, run_options);
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("You must specify a hook event to run."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+	opt.run_hookdir = should_run_hookdir;
+
+	rc = run_hooks(hookname.buf, &opt);
+
+	strbuf_release(&hookname);
+	run_hooks_opt_clear(&opt);
+
+	return rc;
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	const char *run_hookdir = NULL;
@@ -95,10 +137,10 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 	};
 
 	argc = parse_options(argc, argv, prefix, builtin_hook_options,
-			     builtin_hook_usage, 0);
+			     builtin_hook_usage, PARSE_OPT_KEEP_UNKNOWN);
 
 	/* after the parse, we should have "<command> <hookname> <args...>" */
-	if (argc < 1)
+	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 
@@ -120,6 +162,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[0], "list"))
 		return list(argc, argv, prefix);
+	if (!strcmp(argv[0], "run"))
+		return run(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index f4084e33c8..c4595a2324 100644
--- a/hook.c
+++ b/hook.c
@@ -3,6 +3,7 @@
 #include "hook.h"
 #include "config.h"
 #include "run-command.h"
+#include "prompt.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
+{
+	struct strbuf prompt = STRBUF_INIT;
+	/*
+	 * If the path doesn't exist, don't bother adding the empty hook and
+	 * don't bother checking the config or prompting the user.
+	 */
+	if (!path)
+		return 0;
+
+	switch (cfg)
+	{
+		case hookdir_no:
+			return 0;
+		case hookdir_unknown:
+			fprintf(stderr,
+				_("Unrecognized value for 'hook.runHookDir'. "
+				  "Is there a typo? "));
+			/* FALLTHROUGH */
+		case hookdir_warn:
+			fprintf(stderr, _("Running legacy hook at '%s'\n"),
+				path);
+			return 1;
+		case hookdir_interactive:
+			do {
+				/*
+				 * TRANSLATORS: Make sure to include [Y] and [n]
+				 * in your translation. Only English input is
+				 * accepted. Default option is "yes".
+				 */
+				fprintf(stderr, _("Run '%s'? [Yn] "), path);
+				git_read_line_interactively(&prompt);
+				strbuf_tolower(&prompt);
+				if (starts_with(prompt.buf, "n")) {
+					strbuf_release(&prompt);
+					return 0;
+				} else if (starts_with(prompt.buf, "y")) {
+					strbuf_release(&prompt);
+					return 1;
+				}
+				/* otherwise, we didn't understand the input */
+			} while (prompt.len); /* an empty reply means "Yes" */
+			strbuf_release(&prompt);
+			return 1;
+		case hookdir_yes:
+		default:
+			return 1;
+	}
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
@@ -166,3 +217,64 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	strbuf_release(&hook_key);
 	return hook_head;
 }
+
+void run_hooks_opt_init(struct run_hooks_opt *o)
+{
+	strvec_init(&o->env);
+	strvec_init(&o->args);
+	o->run_hookdir = configured_hookdir_opt();
+}
+
+void run_hooks_opt_clear(struct run_hooks_opt *o)
+{
+	strvec_clear(&o->env);
+	strvec_clear(&o->args);
+}
+
+int run_hooks(const char *hookname, struct run_hooks_opt *options)
+{
+	struct strbuf hookname_str = STRBUF_INIT;
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	if (!options)
+		BUG("a struct run_hooks_opt must be provided to run_hooks");
+
+	strbuf_addstr(&hookname_str, hookname);
+
+	to_run = hook_list(&hookname_str);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		hook_proc.env = options->env.v;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		if (hook->from_hookdir) {
+			if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
+				continue;
+			/*
+			 * Commands from the config could be oneliners, but we know
+			 * for certain that hookdir commands are not.
+			 */
+			hook_proc.use_shell = 0;
+		}
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		strvec_pushv(&hook_proc.args, options->args.v);
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index ca45d388d3..d1c3d71e82 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -36,6 +37,37 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+struct run_hooks_opt
+{
+	/* Environment vars to be set for each hook */
+	struct strvec env;
+
+	/* Args to be passed to each hook */
+	struct strvec args;
+
+	/*
+	 * How should the hookdir be handled?
+	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
+	 * to be overridden if the user can override it at the command line.
+	 */
+	enum hookdir_opt run_hookdir;
+};
+
+#define RUN_HOOKS_OPT_INIT  {   		\
+	.env = STRVEC_INIT, 				\
+	.args = STRVEC_INIT, 			\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+void run_hooks_opt_init(struct run_hooks_opt *o);
+void run_hooks_opt_clear(struct run_hooks_opt *o);
+
+/*
+ * Runs all hooks associated to the 'hookname' event in order. Each hook will be
+ * passed 'env' and 'args'.
+ */
+int run_hooks(const char *hookname, struct run_hooks_opt *options);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebd3bc623f..5b3003d59b 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	git hook run pre-commit 2>actual &&
+	test_must_be_empty actual
 '
 
 test_expect_success 'hook.runHookDir = warn is respected by list' '
@@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
+	test_i18ncmp expected actual &&
+
+	cat >expected <<-EOF &&
+	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
+	"Legacy Hook"
+	EOF
+
+	git hook run pre-commit 2>actual &&
 	test_i18ncmp expected actual
 '
 
@@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
 	test_cmp expected actual
 '
 
-test_expect_success 'hook.runHookDir = interactive is respected by list' '
+test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
 	setup_hookdir &&
 
 	test_config hook.runHookDir "interactive" &&
@@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	test_write_lines n | git hook run pre-commit 2>actual &&
+	! grep "Legacy Hook" actual &&
+
+	test_write_lines y | git hook run pre-commit 2>actual &&
+	grep "Legacy Hook" actual
+'
+
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	write_script sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm sample-hook.sh" &&
+
+	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hookdir hook included in git hook run' '
+	setup_hookdir &&
+
+	echo \"Legacy Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'out-of-repo runs excluded' '
+	setup_hooks &&
+
+	nongit test_must_fail git hook run pre-commit
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 09/17] hook: replace find_hook() with hook_exists()
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (7 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
                           ` (9 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Many callers want to check whether some state could be modified by a
hook; that check should include the config-based hooks as well. Optimize
by checking the config directly. Since commands which execute hooks
might want to take args to replace 'hook.runHookDir', let
'hook_exists()' mirror the behavior of 'hook.runHookDir'.
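
The decision made by 'hook_exists()' boils down to a small truth table.
The sketch below models it with plain integers in place of the config
lookup and 'find_hook()'; the 'demo_*' names and the 'DEMO_HOOKDIR_*'
enum are hypothetical mirrors of 'enum hookdir_opt', assumed here for
illustration only.

```c
/* Hypothetical mirror of enum hookdir_opt from hook.h. */
enum demo_hookdir_opt {
	DEMO_HOOKDIR_NO,
	DEMO_HOOKDIR_WARN,
	DEMO_HOOKDIR_INTERACTIVE,
	DEMO_HOOKDIR_YES,
	DEMO_HOOKDIR_UNKNOWN
};

/*
 * has_config_hook:  non-zero if hook.<name>.command is set in the config
 * has_hookdir_hook: non-zero if a legacy hook exists in the hookdir
 *
 * A legacy hook only counts when the hookdir setting would let it run.
 */
int demo_hook_exists(int has_config_hook, int has_hookdir_hook,
		     enum demo_hookdir_opt opt)
{
	int could_run_hookdir = (opt == DEMO_HOOKDIR_INTERACTIVE ||
				 opt == DEMO_HOOKDIR_WARN ||
				 opt == DEMO_HOOKDIR_YES) &&
				has_hookdir_hook;

	return has_config_hook || could_run_hookdir;
}
```

Note that, as posted, an unrecognized 'hook.runHookDir' value is not
counted here, while 'should_include_hookdir()' in the previous patch
treats it like "warn" and runs the legacy hook. The sketch keeps that
behavior as-is rather than guessing which one is intended.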

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated this commit to include bugreport as a builtin instead of
    as a standalone.
    
    Since v4, a little more nuance when deciding whether a hookdir hook can happen.

 builtin/bugreport.c |  4 ++--
 hook.c              | 15 +++++++++++++++
 hook.h              |  9 +++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 3ad4b9b62e..11043f4a22 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -3,7 +3,7 @@
 #include "strbuf.h"
 #include "help.h"
 #include "compat/compiler.h"
-#include "run-command.h"
+#include "hook.h"
 
 
 static void get_system_info(struct strbuf *sys_info)
@@ -82,7 +82,7 @@ static void get_populated_hooks(struct strbuf *hook_info, int nongit)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(hook); i++)
-		if (find_hook(hook[i]))
+		if (hook_exists(hook[i], configured_hookdir_opt()))
 			strbuf_addf(hook_info, "%s\n", hook[i]);
 }
 
diff --git a/hook.c b/hook.c
index c4595a2324..a7a4abdcac 100644
--- a/hook.c
+++ b/hook.c
@@ -225,6 +225,21 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	o->run_hookdir = configured_hookdir_opt();
 }
 
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
+{
+	const char *value = NULL; /* throwaway */
+	struct strbuf hook_key = STRBUF_INIT;
+
+	int could_run_hookdir = (should_run_hookdir == hookdir_interactive ||
+				should_run_hookdir == hookdir_warn ||
+				should_run_hookdir == hookdir_yes)
+				&& !!find_hook(hookname);
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || could_run_hookdir;
+}
+
 void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
diff --git a/hook.h b/hook.h
index d1c3d71e82..94a25c7cd0 100644
--- a/hook.h
+++ b/hook.h
@@ -62,6 +62,15 @@ struct run_hooks_opt
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
+/*
+ * Returns 1 if any hooks are specified in the config or if a hook exists in the
+ * hookdir. Typically, invoke hook_exists() like:
+ *   hook_exists(hookname, configured_hookdir_opt());
+ * Like with run_hooks, if you take a --run-hookdir flag, reflect that
+ * user-specified behavior here instead.
+ */
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
+
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
  * passed 'env' and 'args'.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 10/17] hook: support passing stdin to hooks
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (8 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
                           ` (8 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some hooks (such as post-rewrite) need to take input via stdin.
Previously, callers provided stdin to hooks by setting
run-command.h:child_process.in, which takes a FD. Callers would open the
file in question themselves before calling run_command(). However, since
we will now need to seek to the front of the file and read it again for
every hook which runs, hook.h:run_hooks() takes a path and handles FD
management itself. Since this file is opened for read only, it should
not prevent later parallel execution support.

On the frontend, this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) and reading it back from the
beginning to each hook. We'd want to support cases like insufficient
memory or storage for the file. While this may prove useful later, for
now the path of least resistance is to just ask the user to make this
interim file themselves.
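
To show why a path is taken instead of an already-open FD, here is a
small POSIX sketch. 'demo_drain()' and 'demo_drain_path()' are
hypothetical helpers, not part of this series: the first plays the role
of a hook reading its stdin to EOF, the second reopens the file for each
consumer, which is the approach taken in the diff below.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read 'fd' to EOF; return the number of bytes seen, or -1 on error. */
long demo_drain(int fd)
{
	char buf[256];
	long total = 0;
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;

	return n < 0 ? -1 : total;
}

/*
 * Open a fresh read-only descriptor for each consumer, so every
 * consumer starts at offset 0 and sees the whole file.
 */
long demo_drain_path(const char *path)
{
	int fd = open(path, O_RDONLY);
	long total;

	if (fd < 0)
		return -1;

	total = demo_drain(fd);
	close(fd);
	return total;
}
```

Calling 'demo_drain_path()' twice on the same file yields the full
length both times, whereas calling 'demo_drain()' twice on one shared
descriptor yields nothing the second time, because the file offset is
already at EOF.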

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 11 +++++++++--
 builtin/hook.c                |  5 ++++-
 hook.c                        |  7 ++++++-
 hook.h                        |  9 +++++++--
 t/t1360-config-based-hooks.sh | 24 ++++++++++++++++++++++++
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 18a817d832..cce30a80d0 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,7 +9,8 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
-'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
+	<hook-name>
 
 DESCRIPTION
 -----------
@@ -65,7 +66,7 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -91,6 +92,12 @@ Specify arguments to pass to every hook that is run.
 +
 Specify environment variables to set for every hook that is run.
 
+--to-stdin::
+	Only valid for `run`.
++
+Specify a file which will be streamed into stdin for every hook that is run.
+Each hook will receive the entire file from beginning to EOF.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 26f7050387..e45831e01d 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -9,7 +9,8 @@
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
-	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] "
+	   "[--to-stdin=<path>] <hookname>"),
 	NULL
 };
 
@@ -97,6 +98,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("environment variables for hook to use")),
 		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
 			   N_("argument to pass to hook")),
+		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
+			   N_("file to read into hooks' stdin")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index a7a4abdcac..c7fdf556fe 100644
--- a/hook.c
+++ b/hook.c
@@ -263,8 +263,13 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
+		/* reopen the file for stdin; run_command closes it. */
+		if (options->path_to_stdin)
+			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
+		else
+			hook_proc.no_stdin = 1;
+
 		hook_proc.env = options->env.v;
-		hook_proc.no_stdin = 1;
 		hook_proc.stdout_to_stderr = 1;
 		hook_proc.trace2_hook_name = hook->command.buf;
 		hook_proc.use_shell = 1;
diff --git a/hook.h b/hook.h
index 94a25c7cd0..5184dcaa5a 100644
--- a/hook.h
+++ b/hook.h
@@ -51,11 +51,15 @@ struct run_hooks_opt
 	 * to be overridden if the user can override it at the command line.
 	 */
 	enum hookdir_opt run_hookdir;
+
+	/* Path to file which should be piped to stdin for each hook */
+	const char *path_to_stdin;
 };
 
 #define RUN_HOOKS_OPT_INIT  {   		\
-	.env = STRVEC_INIT, 				\
+	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -73,7 +77,8 @@ int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
 
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
- * passed 'env' and 'args'.
+ * passed 'env' and 'args'. The file at 'path_to_stdin' will be closed and reopened
+ * for each hook that runs.
  */
 int run_hooks(const char *hookname, struct run_hooks_opt *options);
 
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 5b3003d59b..c672269ee4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -229,4 +229,28 @@ test_expect_success 'out-of-repo runs excluded' '
 	nongit test_must_fail git hook run pre-commit
 '
 
+test_expect_success 'stdin to multiple hooks' '
+	git config --add hook.test.command "xargs -P1 -I% echo a%" &&
+	git config --add hook.test.command "xargs -P1 -I% echo b%" &&
+	test_when_finished "test_unconfig hook.test.command" &&
+
+	cat >input <<-EOF &&
+	1
+	2
+	3
+	EOF
+
+	cat >expected <<-EOF &&
+	a1
+	a2
+	a3
+	b1
+	b2
+	b3
+	EOF
+
+	git hook run --to-stdin=input test 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 11/17] run-command: allow stdin for run_processes_parallel
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (9 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
                           ` (7 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

While it makes sense not to inherit stdin from the parent process to
avoid deadlocking, it's not necessary to completely ban stdin to
children. An informed user should be able to configure stdin safely. By
setting `some_child.process.no_stdin=1` before calling `get_next_task()`
we provide a reasonable default behavior but enable users to set up
stdin streaming for themselves during the callback.

`some_child.process.stdout_to_stderr`, however, remains unmodifiable by
`get_next_task()` - the rest of the run_processes_parallel() API depends
on child output in stderr.
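
The ordering this patch relies on (set a default, call the callback,
then force what must not change) can be modeled without run-command.c.
Everything below is a hypothetical sketch: 'struct demo_child',
'demo_prepare()' and 'demo_enable_stdin()' are invented for this
example, and the callback type is far simpler than the real
'get_next_task()' signature.

```c
/* Hypothetical, reduced stand-in for struct child_process. */
struct demo_child {
	int no_stdin;
	int stdout_to_stderr;
	int in;
};

/* Hypothetical stand-in for the get_next_task() callback. */
typedef void (*demo_setup_fn)(struct demo_child *child);

/*
 * no_stdin is set BEFORE the callback, so it is only a default the
 * callback may override. stdout_to_stderr is set AFTER the callback,
 * so whatever the callback wrote there is overwritten.
 */
void demo_prepare(struct demo_child *child, demo_setup_fn setup)
{
	child->no_stdin = 1;

	if (setup)
		setup(child);

	child->stdout_to_stderr = 1;
}

/* Example callback: asks for stdin and tries (in vain) to keep stdout. */
void demo_enable_stdin(struct demo_child *child)
{
	child->no_stdin = 0;
	child->in = 3; /* some already-open descriptor, assumed for the demo */
	child->stdout_to_stderr = 0;
}
```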

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 run-command.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index ea4d0fb4b1..80c8c97bc1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1683,6 +1683,9 @@ static int pp_start_one(struct parallel_processes *pp)
 	if (i == pp->max_processes)
 		BUG("bookkeeping is hard");
 
+	/* disallow by default, but allow users to set up stdin if they wish */
+	pp->children[i].process.no_stdin = 1;
+
 	code = pp->get_next_task(&pp->children[i].process,
 				 &pp->children[i].err,
 				 pp->data,
@@ -1694,7 +1697,6 @@ static int pp_start_one(struct parallel_processes *pp)
 	}
 	pp->children[i].process.err = -1;
 	pp->children[i].process.stdout_to_stderr = 1;
-	pp->children[i].process.no_stdin = 1;
 
 	if (start_command(&pp->children[i].process)) {
 		code = pp->start_failure(&pp->children[i].err,
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 12/17] hook: allow parallel hook execution
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (10 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
                           ` (6 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In many cases, there's no reason not to allow hooks to execute in
parallel. run_processes_parallel() is well-suited - it's a task queue
that runs its housekeeping in series, which means users don't
need to worry about thread safety on their callback data. True
multithreaded execution with the async_* functions isn't necessary here.
Synchronous hook execution can be achieved by only allowing 1 job to run
at a time.

Teach run_hooks() to use that function for simple hooks which don't
require stdin or capture of stderr.
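
To illustrate why callback data needs no locking here, the following is a rough, self-contained sketch of the "hand out the next task, fold in the result" pattern. All names (`sketch_hook`, `sketch_cb_data`, `sketch_run_all()`) are hypothetical, and the "result" of each hook is a stored integer rather than a real child process; it is only meant to show that both callbacks run one at a time on the caller's side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the structures in hook.h. */
struct sketch_hook {
	const char *command;
	int exit_code;			/* pretend result of running the command */
	struct sketch_hook *next;
};

struct sketch_cb_data {
	int rc;				/* OR of all results seen so far */
	struct sketch_hook *run_me;	/* next hook to hand out */
};

/* Hand out the next hook, or return 0 when the list is exhausted. */
int sketch_pick_next(struct sketch_cb_data *cb, struct sketch_hook **task)
{
	assert(cb && task);

	if (!cb->run_me)
		return 0;
	*task = cb->run_me;
	cb->run_me = cb->run_me->next;
	return 1;
}

/* Fold one result into the shared state. */
void sketch_finished(struct sketch_cb_data *cb, int result)
{
	cb->rc |= result;
}

/*
 * Serial driver: both callbacks are invoked one after another from this
 * loop, so the shared sketch_cb_data is never touched concurrently.
 */
int sketch_run_all(struct sketch_hook *head)
{
	struct sketch_cb_data cb = { 0, head };
	struct sketch_hook *task = NULL;

	while (sketch_pick_next(&cb, &task))
		sketch_finished(&cb, task->exit_code);
	return cb.rc;
}
```

A list whose second entry "fails" with 1 yields an overall result of 1, while an empty list yields 0.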

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Per AEvar's request - parallel hook execution on day zero.
    
    In most ways run_processes_parallel() worked great for me - but it didn't
    have great support for hooks where we pipe to and from. I had to add this
    support later in the series.
    
    Since I modified an existing and in-use library I'd appreciate a close look at
    these patches.
    
     - Emily
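
The fallback order documented for --jobs in this patch (command-line flag, then `hook.jobs`, then the CPU count) could be sketched as below. `sketch_resolve_jobs()` is a hypothetical helper, and treating any value below 1 as "not specified" is an assumption of the sketch, not something the patch itself does in one place.

```c
#include <assert.h>

/*
 * Hypothetical helper: any value below 1 is treated as "not specified".
 * That convention is an assumption made for this sketch.
 */
int sketch_resolve_jobs(int flag_jobs, int config_jobs, int ncpus)
{
	assert(ncpus >= 1);

	if (flag_jobs >= 1)
		return flag_jobs;	/* -j/--jobs on the command line */
	if (config_jobs >= 1)
		return config_jobs;	/* hook.jobs from the config */
	return ncpus;			/* fall back to the CPU count */
}
```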

 Documentation/config/hook.txt |   5 ++
 Documentation/git-hook.txt    |  15 +++-
 builtin/hook.c                |   6 +-
 hook.c                        | 142 ++++++++++++++++++++++++++--------
 hook.h                        |  28 ++++++-
 5 files changed, 158 insertions(+), 38 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 75312754ae..a423d13781 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -12,3 +12,8 @@ hook.runHookDir::
 	Controls how hooks contained in your hookdir are executed. Can be any of
 	"yes", "warn", "interactive", or "no". Defaults to "yes". See
 	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
+
+hook.jobs::
+	Specifies how many hooks can be run simultaneously during parallelized
+	hook execution. If unspecified, defaults to the number of processors on
+	the current system.
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index cce30a80d0..c2678c61b2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git hook' list <hook-name>
 'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
-	<hook-name>
+	[(-j|--jobs) <n>] <hook-name>
 
 DESCRIPTION
 -----------
@@ -66,7 +66,8 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
+	[(-j|--jobs) <n>] `<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -98,6 +99,16 @@ Specify environment variables to set for every hook that is run.
 Specify a file which will be streamed into stdin for every hook that is run.
 Each hook will receive the entire file from beginning to EOF.
 
+-j::
+--jobs::
+	Only valid for `run`.
++
+Specify how many hooks to run simultaneously. If this flag is not specified, use
+the value of the `hook.jobs` config. If the config is not specified, use the
+number of CPUs on the current system. Some hooks may be ineligible for
+parallelization: for example, 'commit-msg' hooks are expected to modify the
+commit message body, so they cannot be parallelized.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index e45831e01d..064a0fea29 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -10,7 +10,7 @@
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
 	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...]"
-	   "[--to-stdin=<path>] <hookname>"),
+	   " [--to-stdin=<path>] [(-j|--jobs) <n>] <hookname>"),
 	NULL
 };
 
@@ -90,7 +90,7 @@ static int list(int argc, const char **argv, const char *prefix)
 static int run(int argc, const char **argv, const char *prefix)
 {
 	struct strbuf hookname = STRBUF_INIT;
-	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
 	int rc = 0;
 
 	struct option run_options[] = {
@@ -100,6 +100,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("argument to pass to hook")),
 		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
 			   N_("file to read into hooks' stdin")),
+		OPT_INTEGER('j', "jobs", &opt.jobs,
+			    N_("run up to <n> hooks simultaneously")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index c7fdf556fe..edea54f95c 100644
--- a/hook.c
+++ b/hook.c
@@ -136,6 +136,14 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+int configured_hook_jobs(void)
+{
+	int n = online_cpus();
+	git_config_get_int("hook.jobs", &n);
+
+	return n;
+}
+
 static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
 {
 	struct strbuf prompt = STRBUF_INIT;
@@ -223,6 +231,7 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	strvec_init(&o->env);
 	strvec_init(&o->args);
 	o->run_hookdir = configured_hookdir_opt();
+	o->jobs = configured_hook_jobs();
 }
 
 int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
@@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 	strvec_clear(&o->args);
 }
 
+
+static int pick_next_hook(struct child_process *cp,
+			  struct strbuf *out,
+			  void *pp_cb,
+			  void **pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
+
+	if (hook_cb->head == hook_cb->run_me)
+		return 0;
+
+	cp->env = hook_cb->options->env.v;
+	cp->stdout_to_stderr = 1;
+	cp->trace2_hook_name = hook->command.buf;
+
+	/* reopen the file for stdin; run_command closes it. */
+	if (hook_cb->options->path_to_stdin) {
+		cp->no_stdin = 0;
+		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else {
+		cp->no_stdin = 1;
+	}
+
+	/*
+	 * Commands from the config could be oneliners, but we know
+	 * for certain that hookdir commands are not.
+	 */
+	if (hook->from_hookdir)
+		cp->use_shell = 0;
+	else
+		cp->use_shell = 1;
+
+	/* add command */
+	strvec_push(&cp->args, hook->command.buf);
+
+	/*
+	 * add passed-in argv, without expanding - let the user get back
+	 * exactly what they put in
+	 */
+	strvec_pushv(&cp->args, hook_cb->options->args.v);
+
+	/* Provide context for errors if necessary */
+	*pp_task_cb = hook;
+
+	/* Get the next entry ready */
+	hook_cb->run_me = hook_cb->run_me->next;
+
+	return 1;
+}
+
+static int notify_start_failure(struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+	struct hook *attempted = pp_task_cb;
+
+	/* record the failure in the shared callback data */
+	hook_cb->rc |= 1;
+
+	strbuf_addf(out, _("Couldn't start '%s', configured in '%s'\n"),
+		    attempted->command.buf,
+		    attempted->from_hookdir ? "hookdir"
+		    	: config_scope_name(attempted->origin));
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
+static int notify_hook_finished(int result,
+				struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	/* fold this hook's exit code into the shared callback data */
+	hook_cb->rc |= result;
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
 int run_hooks(const char *hookname, struct run_hooks_opt *options)
 {
 	struct strbuf hookname_str = STRBUF_INIT;
 	struct list_head *to_run, *pos = NULL, *tmp = NULL;
-	int rc = 0;
+	struct hook_cb_data cb_data = { 0, NULL, NULL, options };
 
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
@@ -260,41 +354,23 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	to_run = hook_list(&hookname_str);
 
 	list_for_each_safe(pos, tmp, to_run) {
-		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
-		/* reopen the file for stdin; run_command closes it. */
-		if (options->path_to_stdin)
-			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
-		else
-			hook_proc.no_stdin = 1;
-
-		hook_proc.env = options->env.v;
-		hook_proc.stdout_to_stderr = 1;
-		hook_proc.trace2_hook_name = hook->command.buf;
-		hook_proc.use_shell = 1;
-
-		if (hook->from_hookdir) {
-		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
-			continue;
-		    /*
-		     * Commands from the config could be oneliners, but we know
-		     * for certain that hookdir commands are not.
-		     */
-		    hook_proc.use_shell = 0;
-		}
-
-		/* add command */
-		strvec_push(&hook_proc.args, hook->command.buf);
+		if (hook->from_hookdir &&
+		    !should_include_hookdir(hook->command.buf, options->run_hookdir))
+			list_del(pos);
+	}
 
-		/*
-		 * add passed-in argv, without expanding - let the user get back
-		 * exactly what they put in
-		 */
-		strvec_pushv(&hook_proc.args, options->args.v);
+	cb_data.head = to_run;
+	cb_data.run_me = to_run->next;
 
-		rc |= run_command(&hook_proc);
-	}
+	run_processes_parallel_tr2(options->jobs,
+				   pick_next_hook,
+				   notify_start_failure,
+				   notify_hook_finished,
+				   &cb_data,
+				   "hook",
+				   hookname);
 
-	return rc;
+	return cb_data.rc;
 }
diff --git a/hook.h b/hook.h
index 5184dcaa5a..f54568afe3 100644
--- a/hook.h
+++ b/hook.h
@@ -37,6 +37,9 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+/* Provides the number of threads to use for parallel hook execution. */
+int configured_hook_jobs(void);
+
 struct run_hooks_opt
 {
 	/* Environment vars to be set for each hook */
@@ -54,15 +57,38 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+
+	/* Number of threads to parallelize across */
+	int jobs;
 };
 
-#define RUN_HOOKS_OPT_INIT  {   		\
+/*
+ * Callback provided to feed_pipe_fn and consume_sideband_fn.
+ */
+struct hook_cb_data {
+	int rc;
+	struct list_head *head;
+	struct list_head *run_me;
+	struct run_hooks_opt *options;
+};
+
+#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
 	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
+	.jobs = 1,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
+#define RUN_HOOKS_OPT_INIT_ASYNC {		\
+	.env = STRVEC_INIT, 			\
+	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
+	.jobs = configured_hook_jobs(),		\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 13/17] hook: allow specifying working directory for hooks
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (11 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
                           ` (5 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Hooks like "post-checkout" must run in a different working directory
than the initial process. Pipe that directly through to struct
child_process.

Because we can just run 'git -C <some-dir> hook run ...' it shouldn't be
necessary to pipe this option through the frontend. In fact, leaving it
out of the frontend reduces the possibility of users running hooks which
affect some part of the filesystem outside of the repo in question.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Needed later for "post-checkout" conversion.

 hook.c | 1 +
 hook.h | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/hook.c b/hook.c
index edea54f95c..f0c052d847 100644
--- a/hook.c
+++ b/hook.c
@@ -271,6 +271,7 @@ static int pick_next_hook(struct child_process *cp,
 	cp->env = hook_cb->options->env.v;
 	cp->stdout_to_stderr = 1;
 	cp->trace2_hook_name = hook->command.buf;
+	cp->dir = hook_cb->options->dir;
 
 	/* reopen the file for stdin; run_command closes it. */
 	if (hook_cb->options->path_to_stdin) {
diff --git a/hook.h b/hook.h
index f54568afe3..4aae8e2dbb 100644
--- a/hook.h
+++ b/hook.h
@@ -60,6 +60,9 @@ struct run_hooks_opt
 
 	/* Number of threads to parallelize across */
 	int jobs;
+
+	/* Path to initial working directory for subprocess */
+	const char *dir;
 };
 
 /*
@@ -77,6 +80,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -85,6 +89,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 14/17] run-command: add stdin callback for parallelization
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (12 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
                           ` (4 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user of the run_processes_parallel() API wants to pipe a large
amount of information to stdin of each parallel command, that
information could exceed the buffer of the pipe allocated for that
process's stdin.  Generally this is solved by repeatedly writing to
child_process.in between calls to start_command() and finish_command();
run_processes_parallel() did not provide users an opportunity to access
child_process at that time.

Because the data might be extremely large (for example, a list of all
refs received during a push from a client) simply taking a string_list
or strbuf is not as scalable as using a callback; the rest of the
run_processes_parallel() API also uses callbacks, so making this feature
match the rest of the API reduces mental load on the user.
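
As an illustration of the problem described above, here is a minimal POSIX sketch of feeding a child's stdin through a callback, one chunk per call, so the total can exceed the pipe buffer while the child drains it. It does not use the run-command API: `sketch_feed_fn`, `sketch_feed_lines()` and `sketch_feed_child()` are hypothetical names, the "child" merely counts newlines, and SIGPIPE and EINTR handling are left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical callback type: put the next chunk into buf (capacity cap),
 * store its length in *len, and return nonzero once no more data follows.
 */
typedef int (*sketch_feed_fn)(char *buf, size_t cap, size_t *len, void *state);

/* Example callback: emits one line per call until *remaining reaches 0. */
int sketch_feed_lines(char *buf, size_t cap, size_t *len, void *state)
{
	int *remaining = state;

	*len = 0;
	if (*remaining > 0) {
		int n = snprintf(buf, cap, "line %d\n", --(*remaining));
		assert(n > 0 && (size_t)n < cap);
		*len = (size_t)n;
	}
	return *remaining <= 0;
}

/*
 * Fork a child that counts newlines on its stdin pipe, feed it through
 * the callback until it reports completion, then close the pipe.
 * Returns the number of lines the child saw, or -1 on error.
 */
int sketch_feed_child(sketch_feed_fn feed, void *state)
{
	int in_pipe[2], out_pipe[2], count = -1, done = 0;
	char buf[64];
	pid_t pid;

	if (pipe(in_pipe) < 0 || pipe(out_pipe) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (!pid) {
		/* child: drain in_pipe, report the line count on out_pipe */
		char rbuf[4096];
		ssize_t got, i;
		int lines = 0;

		close(in_pipe[1]);
		close(out_pipe[0]);
		while ((got = read(in_pipe[0], rbuf, sizeof(rbuf))) > 0)
			for (i = 0; i < got; i++)
				if (rbuf[i] == '\n')
					lines++;
		if (write(out_pipe[1], &lines, sizeof(lines)) != sizeof(lines))
			_exit(1);
		_exit(0);
	}
	close(in_pipe[0]);
	close(out_pipe[1]);
	while (!done) {
		size_t len = 0, off = 0;

		done = feed(buf, sizeof(buf), &len, state);
		while (off < len) {
			ssize_t w = write(in_pipe[1], buf + off, len - off);
			if (w < 0) {
				done = 1;	/* give up on write errors */
				break;
			}
			off += (size_t)w;
		}
	}
	close(in_pipe[1]);	/* signals EOF to the child */
	if (read(out_pipe[0], &count, sizeof(count)) != sizeof(count))
		count = -1;
	close(out_pipe[0]);
	waitpid(pid, NULL, 0);
	return count;
}
```

Feeding 20000 lines this way pushes roughly 200KB through the pipe, which is more than a typical pipe buffer holds at once, without ever materializing the whole input in memory.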

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since run_processes_parallel() is used elsewhere, I'd appreciate a close
    look at this patch, which modifies it. Thanks :)

 builtin/fetch.c             |  1 +
 builtin/submodule--helper.c |  2 +-
 run-command.c               | 54 +++++++++++++++++++++++++++++++++++--
 run-command.h               | 17 +++++++++++-
 submodule.c                 |  1 +
 t/helper/test-run-command.c | 31 ++++++++++++++++++---
 t/t0061-run-command.sh      | 30 +++++++++++++++++++++
 7 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..5e153b5193 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,6 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
+						    NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c30896c897..bb623c1852 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure,
+				   update_clone_start_failure, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/run-command.c b/run-command.c
index 80c8c97bc1..7b65c087f8 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1548,6 +1548,7 @@ struct parallel_processes {
 
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
+	feed_pipe_fn feed_pipe;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1575,6 +1576,13 @@ static int default_start_failure(struct strbuf *out,
 	return 0;
 }
 
+static int default_feed_pipe(struct strbuf *pipe,
+			     void *pp_cb,
+			     void *pp_task_cb)
+{
+	return 1;
+}
+
 static int default_task_finished(int result,
 				 struct strbuf *out,
 				 void *pp_cb,
@@ -1605,6 +1613,7 @@ static void pp_init(struct parallel_processes *pp,
 		    int n,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
+		    feed_pipe_fn feed_pipe,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1623,6 +1632,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->get_next_task = get_next_task;
 
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
+	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
 
 	pp->nr_processes = 0;
@@ -1715,6 +1725,37 @@ static int pp_start_one(struct parallel_processes *pp)
 	return 0;
 }
 
+static void pp_buffer_stdin(struct parallel_processes *pp)
+{
+	int i;
+	struct strbuf sb = STRBUF_INIT;
+
+	/* Buffer stdin for each pipe. */
+	for (i = 0; i < pp->max_processes; i++) {
+		if (pp->children[i].state == GIT_CP_WORKING &&
+		    pp->children[i].process.in > 0) {
+			int done;
+			strbuf_reset(&sb);
+			done = pp->feed_pipe(&sb, pp->data,
+					      pp->children[i].data);
+			if (sb.len) {
+				if (write_in_full(pp->children[i].process.in,
+					      sb.buf, sb.len) < 0) {
+					if (errno != EPIPE)
+						die_errno("write");
+					done = 1;
+				}
+			}
+			if (done) {
+				close(pp->children[i].process.in);
+				pp->children[i].process.in = 0;
+			}
+		}
+	}
+
+	strbuf_release(&sb);
+}
+
 static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 {
 	int i;
@@ -1779,6 +1820,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 		pp->nr_processes--;
 		pp->children[i].state = GIT_CP_FREE;
 		pp->pfd[i].fd = -1;
+		pp->children[i].process.in = 0;
 		child_process_init(&pp->children[i].process);
 
 		if (i != pp->output_owner) {
@@ -1812,6 +1854,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
+			   feed_pipe_fn feed_pipe,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1820,7 +1863,9 @@ int run_processes_parallel(int n,
 	int spawn_cap = 4;
 	struct parallel_processes pp;
 
-	pp_init(&pp, n, get_next_task, start_failure, task_finished, pp_cb);
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1837,6 +1882,7 @@ int run_processes_parallel(int n,
 		}
 		if (!pp.nr_processes)
 			break;
+		pp_buffer_stdin(&pp);
 		pp_buffer_stderr(&pp, output_timeout);
 		pp_output(&pp);
 		code = pp_collect_finished(&pp);
@@ -1848,11 +1894,15 @@ int run_processes_parallel(int n,
 	}
 
 	pp_cleanup(&pp);
+
+	sigchain_pop(SIGPIPE);
+
 	return 0;
 }
 
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
+			       feed_pipe_fn feed_pipe,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1862,7 +1912,7 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					task_finished, pp_cb);
+					feed_pipe, task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index 6472b38bde..e058c0e2c8 100644
--- a/run-command.h
+++ b/run-command.h
@@ -436,6 +436,20 @@ typedef int (*start_failure_fn)(struct strbuf *out,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * This callback is called repeatedly for every child process that asks
+ * start_command() to create a pipe by setting child_process.in < 0.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel, and
+ * pp_task_cb is the callback cookie as passed into get_next_task_fn.
+ * The contents of 'pipe' will be written to the child's stdin.
+ *
+ * Return nonzero to close the pipe.
+ */
+typedef int (*feed_pipe_fn)(struct strbuf *pipe,
+			    void *pp_cb,
+			    void *pp_task_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -470,10 +484,11 @@ typedef int (*task_finished_fn)(int result,
 int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
+			   feed_pipe_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index b3bb59f066..953f41818c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,6 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
+				   NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 7ae03dc712..9348184d30 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -32,8 +32,13 @@ static int parallel_next(struct child_process *cp,
 		return 0;
 
 	strvec_pushv(&cp->args, d->argv);
+	cp->in = d->in;
+	cp->no_stdin = d->no_stdin;
 	strbuf_addstr(err, "preloaded output of a child\n");
 	number_callbacks++;
+
+	*task_cb = xmalloc(sizeof(int));
+	*(int*)(*task_cb) = 2;
 	return 1;
 }
 
@@ -55,6 +60,17 @@ static int task_finished(int result,
 	return 1;
 }
 
+static int test_stdin(struct strbuf *pipe, void *cb, void *task_cb)
+{
+	int *lines_remaining = task_cb;
+
+	if (*lines_remaining)
+		strbuf_addf(pipe, "sample stdin %d\n", --(*lines_remaining));
+
+	return !(*lines_remaining);
+}
+
+
 struct testsuite {
 	struct string_list tests, failed;
 	int next;
@@ -185,7 +201,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_finished, &suite);
+				     test_stdin, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -413,15 +429,22 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, &proc));
+					    NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
+
+	if (!strcmp(argv[1], "run-command-stdin")) {
+		proc.in = -1;
+		proc.no_stdin = 0;
+		exit(run_processes_parallel(jobs, parallel_next, NULL,
+					    test_stdin, NULL, &proc));
+	}
 
 	fprintf(stderr, "check usage\n");
 	return 1;
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 7d599675e3..3eb572e6cd 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,36 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+cat >expect <<-EOF
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+EOF
+
+test_expect_success 'run_command listens to stdin' '
+	write_script stdin-script <<-\EOF &&
+	echo "listening for stdin:"
+	while read line; do
+		echo "$line"
+	done </dev/stdin
+	EOF
+	test-tool run-command run-command-stdin 2 ./stdin-script 2>actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 asking for a quick stop
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (13 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-08 21:09           ` SZEDER Gábor
  2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
                           ` (3 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In cases where a hook requires only a small amount of information via
stdin, it should be simple for users to provide a string_list alone. But
in more complicated cases where the stdin is too large to hold in
memory, let's provide a callback which users can use to supply stdin
one line at a time instead.
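
The string_list case boils down to a callback that remembers, per hook, how far into the list it has got. A minimal sketch of that idea follows; `struct sketch_list`, `struct sketch_task` and `sketch_pipe_from_list()` are hypothetical stand-ins for string_list, struct hook and the real callback, and a fixed-size buffer replaces the strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for string_list and the per-hook state pointer. */
struct sketch_list {
	const char **items;
	size_t nr;
};

struct sketch_task {
	void *feed_state;	/* lazily allocated cursor, owned by the task */
};

/*
 * Write the next item plus a newline into buf. Returns 0 while more
 * items may follow, 1 once the list is exhausted (buf is left empty).
 */
int sketch_pipe_from_list(char *buf, size_t cap,
			  const struct sketch_list *list,
			  struct sketch_task *task)
{
	size_t *idx;

	assert(cap > 0);
	buf[0] = '\0';

	/* Bootstrap the per-task cursor on first use. */
	if (!task->feed_state) {
		task->feed_state = calloc(1, sizeof(size_t));
		if (!task->feed_state)
			return 1;
	}
	idx = task->feed_state;

	if (*idx < list->nr) {
		/* item + '\n' + NUL must fit in the caller's buffer */
		assert(strlen(list->items[*idx]) + 2 <= cap);
		snprintf(buf, cap, "%s\n", list->items[*idx]);
		(*idx)++;
		return 0;
	}
	return 1;
}
```

Because the cursor lives in the task rather than in the shared options, two hooks fed from the same list each receive every item from the beginning.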

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 hook.c | 39 +++++++++++++++++++++++++++++++++++++++
 hook.h | 25 +++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/hook.c b/hook.c
index f0c052d847..fbb49f241d 100644
--- a/hook.c
+++ b/hook.c
@@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
 {
 	if (ptr) {
 		strbuf_release(&ptr->command);
+		if (ptr->feed_pipe_cb_data)
+			free(ptr->feed_pipe_cb_data);
 		free(ptr);
 	}
 }
@@ -38,6 +40,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
 		to_add->from_hookdir = 0;
+		to_add->feed_pipe_cb_data = NULL;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -253,9 +256,32 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
 	strvec_clear(&o->args);
+	string_list_clear(&o->str_stdin, 0);
 }
 
 
+static int pipe_from_string_list(struct strbuf *pipe, void *pp_cb, void *pp_task_cb)
+{
+	int *item_idx;
+	struct hook *ctx = pp_task_cb;
+	struct string_list *to_pipe = &((struct hook_cb_data*)pp_cb)->options->str_stdin;
+
+	/* Bootstrap the state manager if necessary. */
+	if (!ctx->feed_pipe_cb_data) {
+		ctx->feed_pipe_cb_data = xmalloc(sizeof(int));
+		*(int*)ctx->feed_pipe_cb_data = 0;
+	}
+
+	item_idx = ctx->feed_pipe_cb_data;
+
+	if (*item_idx < to_pipe->nr) {
+		strbuf_addf(pipe, "%s\n", to_pipe->items[*item_idx].string);
+		(*item_idx)++;
+		return 0;
+	}
+	return 1;
+}
+
 static int pick_next_hook(struct child_process *cp,
 			  struct strbuf *out,
 			  void *pp_cb,
@@ -277,6 +303,10 @@ static int pick_next_hook(struct child_process *cp,
 	if (hook_cb->options->path_to_stdin) {
 		cp->no_stdin = 0;
 		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else if (hook_cb->options->feed_pipe) {
+		/* ask for start_command() to make a pipe for us */
+		cp->in = -1;
+		cp->no_stdin = 0;
 	} else {
 		cp->no_stdin = 1;
 	}
@@ -350,6 +380,14 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if ((options->path_to_stdin && options->str_stdin.nr) ||
+	    (options->path_to_stdin && options->feed_pipe) ||
+	    (options->str_stdin.nr && options->feed_pipe))
+		BUG("choose only one method to populate stdin");
+
+	if (options->str_stdin.nr)
+		options->feed_pipe = &pipe_from_string_list;
+
 	strbuf_addstr(&hookname_str, hookname);
 
 	to_run = hook_list(&hookname_str);
@@ -368,6 +406,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	run_processes_parallel_tr2(options->jobs,
 				   pick_next_hook,
 				   notify_start_failure,
+				   options->feed_pipe,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index 4aae8e2dbb..ace26c637e 100644
--- a/hook.h
+++ b/hook.h
@@ -2,6 +2,7 @@
 #include "list.h"
 #include "strbuf.h"
 #include "strvec.h"
+#include "run-command.h"
 
 struct hook
 {
@@ -14,6 +15,12 @@ struct hook
 	/* The literal command to run. */
 	struct strbuf command;
 	int from_hookdir;
+
+	/*
+	 * Use this to keep state for your feed_pipe_fn if you are using
+	 * run_hooks_opt.feed_pipe. Otherwise, do not touch it.
+	 */
+	void *feed_pipe_cb_data;
 };
 
 /*
@@ -57,12 +64,24 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+	/* Pipe each string to stdin, separated by newlines */
+	struct string_list str_stdin;
+	/*
+	 * Callback and state pointer to ask for more content to pipe to stdin.
+	 * Will be called repeatedly, for each hook. See
+	 * hook.c:pipe_from_string_list() for an example. Keep per-hook state in
+	 * hook.feed_pipe_cb_data (per process). Keep initialization context in
+	 * feed_pipe_ctx (shared by all processes).
+	 */
+	feed_pipe_fn feed_pipe;
+	void *feed_pipe_ctx;
 
 	/* Number of threads to parallelize across */
 	int jobs;
 
 	/* Path to initial working directory for subprocess */
 	const char *dir;
+
 };
 
 /*
@@ -81,6 +100,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -90,6 +112,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH 16/17] run-command: allow capturing of collated output
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (14 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
                           ` (2 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some callers, for example server-side hooks which wish to relay hook
output to clients across a transport, want to capture what would
normally print to stderr and do something else with it. Allow that via a
callback.

By calling the callback regardless of whether there's output available,
we allow clients to send e.g. a keepalive if necessary.

Because we expose a strbuf, not a fd or FILE*, there's no need to create
a temporary pipe or similar - we can just skip the print to stderr and
instead hand it to the caller.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Originally when writing this patch I attempted to use a pipe in memory -
    but managing its lifetime was actually pretty tricky, and I found I could
    achieve the same thing with less code by doing it this way. Critique welcome,
    including "no, you really need to do it with a pipe".
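To make the callback contract concrete, here is a rough caller-side sketch. It is not code from this series: `struct sketch_buf` is a stand-in for Git's `struct strbuf`, and `struct relay_state` and `relay_sideband()` are hypothetical names used only for illustration. It assumes the callback may be handed an empty buffer, which a caller could treat as a cue to send a keepalive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Git's struct strbuf; only the fields this sketch needs. */
struct sketch_buf {
	const char *buf;
	size_t len;
};

/* Hypothetical caller state, passed in as the pp_cb cookie. */
struct relay_state {
	char relayed[256];	/* output forwarded so far, NUL-terminated */
	size_t relayed_len;
	int keepalives;		/* number of calls that carried no output */
};

/* Same shape as consume_sideband_fn, but using the stand-in buffer type. */
static void relay_sideband(struct sketch_buf *output, void *pp_cb)
{
	struct relay_state *st = pp_cb;
	size_t room, n;

	if (!output->len) {
		/* Nothing to forward; a real caller might send a keepalive. */
		st->keepalives++;
		return;
	}

	/* Forward as much as fits; a real caller would write to its transport. */
	room = sizeof(st->relayed) - 1 - st->relayed_len;
	n = output->len < room ? output->len : room;
	memcpy(st->relayed + st->relayed_len, output->buf, n);
	st->relayed_len += n;
	st->relayed[st->relayed_len] = '\0';
}
```

The callback only reads from the buffer it is given; in the patch below, run-command.c resets the strbuf itself after the callback returns.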

 builtin/fetch.c             |  2 +-
 builtin/submodule--helper.c |  2 +-
 hook.c                      |  1 +
 run-command.c               | 33 +++++++++++++++++++++++++--------
 run-command.h               | 18 +++++++++++++++++-
 submodule.c                 |  2 +-
 t/helper/test-run-command.c | 25 ++++++++++++++++++++-----
 t/t0061-run-command.sh      |  7 +++++++
 8 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5e153b5193..6a634085d9 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,7 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
-						    NULL,
+						    NULL, NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index bb623c1852..8c543d33fd 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure, NULL,
+				   update_clone_start_failure, NULL, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/hook.c b/hook.c
index fbb49f241d..1186ee41b3 100644
--- a/hook.c
+++ b/hook.c
@@ -407,6 +407,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
+				   NULL,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/run-command.c b/run-command.c
index 7b65c087f8..0dce6bec83 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1549,6 +1549,7 @@ struct parallel_processes {
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
 	feed_pipe_fn feed_pipe;
+	consume_sideband_fn consume_sideband;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1614,6 +1615,7 @@ static void pp_init(struct parallel_processes *pp,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
 		    feed_pipe_fn feed_pipe,
+		    consume_sideband_fn consume_sideband,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1634,6 +1636,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
 	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
+	pp->consume_sideband = consume_sideband;
 
 	pp->nr_processes = 0;
 	pp->output_owner = 0;
@@ -1670,7 +1673,10 @@ static void pp_cleanup(struct parallel_processes *pp)
 	 * When get_next_task added messages to the buffer in its last
 	 * iteration, the buffered output is non empty.
 	 */
-	strbuf_write(&pp->buffered_output, stderr);
+	if (pp->consume_sideband)
+		pp->consume_sideband(&pp->buffered_output, pp->data);
+	else
+		strbuf_write(&pp->buffered_output, stderr);
 	strbuf_release(&pp->buffered_output);
 
 	sigchain_pop_common();
@@ -1786,9 +1792,13 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 static void pp_output(struct parallel_processes *pp)
 {
 	int i = pp->output_owner;
+
 	if (pp->children[i].state == GIT_CP_WORKING &&
 	    pp->children[i].err.len) {
-		strbuf_write(&pp->children[i].err, stderr);
+		if (pp->consume_sideband)
+			pp->consume_sideband(&pp->children[i].err, pp->data);
+		else
+			strbuf_write(&pp->children[i].err, stderr);
 		strbuf_reset(&pp->children[i].err);
 	}
 }
@@ -1827,11 +1837,15 @@ static int pp_collect_finished(struct parallel_processes *pp)
 			strbuf_addbuf(&pp->buffered_output, &pp->children[i].err);
 			strbuf_reset(&pp->children[i].err);
 		} else {
-			strbuf_write(&pp->children[i].err, stderr);
+			/* Output errors, then all other finished child processes */
+			if (pp->consume_sideband) {
+				pp->consume_sideband(&pp->children[i].err, pp->data);
+				pp->consume_sideband(&pp->buffered_output, pp->data);
+			} else {
+				strbuf_write(&pp->children[i].err, stderr);
+				strbuf_write(&pp->buffered_output, stderr);
+			}
 			strbuf_reset(&pp->children[i].err);
-
-			/* Output all other finished child processes */
-			strbuf_write(&pp->buffered_output, stderr);
 			strbuf_reset(&pp->buffered_output);
 
 			/*
@@ -1855,6 +1869,7 @@ int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
 			   feed_pipe_fn feed_pipe,
+			   consume_sideband_fn consume_sideband,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1865,7 +1880,7 @@ int run_processes_parallel(int n,
 
 	sigchain_push(SIGPIPE, SIG_IGN);
 
-	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, consume_sideband, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1903,6 +1918,7 @@ int run_processes_parallel(int n,
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
 			       feed_pipe_fn feed_pipe,
+			       consume_sideband_fn consume_sideband,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1912,7 +1928,8 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					feed_pipe, task_finished, pp_cb);
+					feed_pipe, consume_sideband,
+					task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index e058c0e2c8..2ad8271f56 100644
--- a/run-command.h
+++ b/run-command.h
@@ -450,6 +450,20 @@ typedef int (*feed_pipe_fn)(struct strbuf *pipe,
 			    void *pp_cb,
 			    void *pp_task_cb);
 
+/**
+ * If this callback is provided, process output is not collated to stderr;
+ * instead, it is handed to consume_sideband_fn in a strbuf. The callback is
+ * called repeatedly. When output is available, it will be contained in
+ * 'output'. But it will be called with an empty 'output' too, to allow for
+ * keepalives or similar operations if necessary.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel.
+ *
+ * Since this callback is provided with the collated output, no task cookie is
+ * provided.
+ */
+typedef void (*consume_sideband_fn)(struct strbuf *output, void *pp_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -485,10 +499,12 @@ int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
 			   feed_pipe_fn,
+			   consume_sideband_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       feed_pipe_fn, task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, consume_sideband_fn,
+			       task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index 953f41818c..215bff22d9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,7 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
-				   NULL,
+				   NULL, NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 9348184d30..d53db6d11c 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -51,6 +51,16 @@ static int no_job(struct child_process *cp,
 	return 0;
 }
 
+static void test_consume_sideband(struct strbuf *output, void *cb)
+{
+	FILE *sideband;
+
+	sideband = fopen("./sideband", "a");
+
+	strbuf_write(output, sideband);
+	fclose(sideband);
+}
+
 static int task_finished(int result,
 			 struct strbuf *err,
 			 void *pp_cb,
@@ -201,7 +211,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_stdin, test_finished, &suite);
+				     test_stdin, NULL, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -429,23 +439,28 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, NULL, &proc));
+					    NULL, NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-stdin")) {
 		proc.in = -1;
 		proc.no_stdin = 0;
 		exit (run_processes_parallel(jobs, parallel_next, NULL,
-					     test_stdin, NULL, &proc));
+					     test_stdin, NULL, NULL, &proc));
 	}
 
+	if (!strcmp(argv[1], "run-command-sideband"))
+		exit(run_processes_parallel(jobs, parallel_next, NULL, NULL,
+					    test_consume_sideband, NULL,
+					    &proc));
+
 	fprintf(stderr, "check usage\n");
 	return 1;
 }
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 3eb572e6cd..c5a5b6df6c 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,13 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command can divert output' '
+	test_when_finished rm sideband &&
+	test-tool run-command run-command-sideband 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
+	test_must_be_empty actual &&
+	test_cmp expect sideband
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 listening for stdin:
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH 17/17] hooks: allow callers to capture output
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (15 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some server-side hooks will require capturing output to send over
sideband instead of printing directly to stderr. Expose that capability.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    You can see this in practice in the conversions for some of the push hooks,
    like 'receive-pack'.
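As a rough illustration of the opt-in behaviour, the sketch below mimics the pattern with stand-in types: the option defaults to NULL, in which case output goes to stderr, and a caller that sets it receives the output instead. `struct sketch_run_hooks_opt`, `SKETCH_RUN_HOOKS_OPT_INIT`, `sketch_emit()` and `count_bytes()` are hypothetical names, not part of hook.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback type; the real one takes a struct strbuf. */
typedef void (*sketch_consume_fn)(const char *output, size_t len, void *cb);

/* Trimmed-down stand-in for struct run_hooks_opt. */
struct sketch_run_hooks_opt {
	int jobs;
	sketch_consume_fn consume_sideband;
};

/* Default: no capture, so output falls through to stderr. */
#define SKETCH_RUN_HOOKS_OPT_INIT {	\
	.jobs = 1,			\
	.consume_sideband = NULL	\
}

/* Hand output to the callback if one is set, otherwise print to stderr. */
static void sketch_emit(const struct sketch_run_hooks_opt *opt,
			const char *output, size_t len, void *cb)
{
	if (opt->consume_sideband)
		opt->consume_sideband(output, len, cb);
	else
		fwrite(output, 1, len, stderr);
}

/* Example callback: count the bytes that would have gone to stderr. */
static void count_bytes(const char *output, size_t len, void *cb)
{
	(void)output;
	*(size_t *)cb += len;
}
```

A caller would start from the INIT macro and set only `consume_sideband`, leaving the other defaults untouched.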

 hook.c |  2 +-
 hook.h | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hook.c b/hook.c
index 1186ee41b3..78d7721b74 100644
--- a/hook.c
+++ b/hook.c
@@ -407,7 +407,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
-				   NULL,
+				   options->consume_sideband,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index ace26c637e..7059e0db77 100644
--- a/hook.h
+++ b/hook.h
@@ -76,6 +76,14 @@ struct run_hooks_opt
 	feed_pipe_fn feed_pipe;
 	void *feed_pipe_ctx;
 
+	/*
+	 * Populate this to capture output and prevent it from being printed to
+	 * stderr. This will be passed directly through to
+	 * run-command.h:run_processes_parallel(). See t/helper/test-run-command.c
+	 * for an example.
+	 */
+	consume_sideband_fn consume_sideband;
+
 	/* Number of threads to parallelize across */
 	int jobs;
 
@@ -103,6 +111,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -115,6 +124,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


* Re: [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2020-12-08 21:09           ` SZEDER Gábor
  2020-12-08 22:11             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: SZEDER Gábor @ 2020-12-08 21:09 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On Fri, Dec 04, 2020 at 05:46:05PM -0800, Emily Shaffer wrote:
> diff --git a/hook.c b/hook.c
> index f0c052d847..fbb49f241d 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
>  {
>  	if (ptr) {
>  		strbuf_release(&ptr->command);
> +		if (ptr->feed_pipe_cb_data)

Coccinelle suggests dropping this condition, because free() can handle
a NULL pointer just fine.

> +			free(ptr->feed_pipe_cb_data);
>  		free(ptr);
>  	}
>  }
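The suggestion above can be sketched as follows. `struct sketch_hook` and `free_sketch_hook()` are trimmed-down, hypothetical stand-ins for the patch's `struct hook` and `free_hook()`. The outer check on `ptr` stays because the pointer is dereferenced; only the inner check is redundant, since the C standard defines free(NULL) as a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for the patch's struct hook. */
struct sketch_hook {
	void *feed_pipe_cb_data;
};

static void free_sketch_hook(struct sketch_hook *ptr)
{
	if (ptr) {
		/* No NULL check needed here: free(NULL) does nothing. */
		free(ptr->feed_pipe_cb_data);
		free(ptr);
	}
}
```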

* Re: [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-08 21:09           ` SZEDER Gábor
@ 2020-12-08 22:11             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-08 22:11 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On Tue, Dec 08, 2020 at 10:09:25PM +0100, SZEDER Gábor wrote:
> 
> On Fri, Dec 04, 2020 at 05:46:05PM -0800, Emily Shaffer wrote:
> > diff --git a/hook.c b/hook.c
> > index f0c052d847..fbb49f241d 100644
> > --- a/hook.c
> > +++ b/hook.c
> > @@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
> >  {
> >  	if (ptr) {
> >  		strbuf_release(&ptr->command);
> > +		if (ptr->feed_pipe_cb_data)
> 
> Coccinelle suggests dropping this condition, because free() can handle
> a NULL pointer just fine.

Done (locally). Thanks (and thanks for checking the coccinelle output
too).

> 
> > +			free(ptr->feed_pipe_cb_data);
> >  		free(ptr);
> >  	}
> >  }

* Re: [PATCH 08/17] hook: add 'run' subcommand
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2020-12-11 10:15           ` Phillip Wood
  2020-12-15 21:41             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-12-11 10:15 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 05/12/2020 01:45, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will run in config order, in series. As
> alternate ordering or parallelism is supported in the future, we should
> add knobs to use those to the command line as well.
> 
> As with the legacy hook implementation, all stdout generated by hook
> commands is redirected to stderr. Piping from stdin is not yet
> supported.
> 
> Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> execution list. For now, there is no way to disable them.
> 
> Users may wish to provide hook commands like 'git config
> hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> first split by space or quotes into an argv_array, then expanded with
> 'expand_user_path()'.

I'm a bit confused by this last paragraph, the docs below say we pass 
the string to the shell and that's what the implementation seems to do. 
If we're running a lot of hooks then maybe it would be worth using 
split_cmdline() and expand_user_path() rather than invoking the shell 
for each hook we run.

I'm afraid I've only had time to skim the patch; there are a couple of
minor comments below.

> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> 
> Notes:
>      Since v4, updated the docs, and did less local application of single
>      quotes. In order for hookdir hooks to run successfully with a space in
>      the path, though, they must not be run with 'sh -c'. So we can treat the
>      hookdir hooks specially, and warn users via doc about special
>      considerations for configured hooks with spaces in their path.
> 
>   Documentation/git-hook.txt    |  31 +++++++++-
>   builtin/hook.c                |  48 ++++++++++++++-
>   hook.c                        | 112 ++++++++++++++++++++++++++++++++++
>   hook.h                        |  32 ++++++++++
>   t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++-
>   5 files changed, 281 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index f19875ed68..18a817d832 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -9,11 +9,12 @@ SYNOPSIS
>   --------
>   [verse]
>   'git hook' list <hook-name>
> +'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
>   
>   DESCRIPTION
>   -----------
> -You can list configured hooks with this command. Later, you will be able to run,
> -add, and modify hooks with this command.
> +You can list and run configured hooks with this command. Later, you will be able
> +to add and modify hooks with this command.
>   
>   This command parses the default configuration files for sections `hook` and
>   `hookcmd`. `hook` is used to describe the commands which will be run during a
> @@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
>   `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
>   This output is human-readable and the format is subject to change over time.
>   
> +run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
> +
> +Runs hooks configured for `<hook-name>`, in the same order displayed by `git
> +hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
> +containing special characters or spaces should be wrapped in single quotes:
> +`command = '/my/path with spaces/script.sh' some args`.
> +
> +OPTIONS
> +-------
> +--run-hookdir::
> +	Overrides the hook.runHookDir config. Must be 'yes', 'warn',
> +	'interactive', or 'no'. Specifies how to handle hooks located in the Git
> +	hook directory (core.hooksPath).
> +
> +-a::
> +--arg::
> +	Only valid for `run`.
> ++
> +Specify arguments to pass to every hook that is run.
> +
> +-e::
> +--env::
> +	Only valid for `run`.
> ++
> +Specify environment variables to set for every hook that is run.
> +
>   CONFIGURATION
>   -------------
>   include::config/hook.txt[]
> diff --git a/builtin/hook.c b/builtin/hook.c
> index 16324d4195..26f7050387 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -5,9 +5,11 @@
>   #include "hook.h"
>   #include "parse-options.h"
>   #include "strbuf.h"
> +#include "strvec.h"
>   
>   static const char * const builtin_hook_usage[] = {
>   	N_("git hook list <hookname>"),
> +	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
>   	NULL
>   };
>   
> @@ -84,6 +86,46 @@ static int list(int argc, const char **argv, const char *prefix)
>   	return 0;
>   }
>   
> +static int run(int argc, const char **argv, const char *prefix)
> +{
> +	struct strbuf hookname = STRBUF_INIT;
> +	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
> +	int rc = 0;
> +
> +	struct option run_options[] = {
> +		OPT_STRVEC('e', "env", &opt.env, N_("var"),
> +			   N_("environment variables for hook to use")),
> +		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
> +			   N_("argument to pass to hook")),
> +		OPT_END(),
> +	};
> +
> +	/*
> +	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
> +	 * to execute them. Hooks usually want to look at repository artifacts.
> +	 */
> +	if (!have_git_dir())
> +		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
> +			      builtin_hook_usage, run_options);
> +
> +	argc = parse_options(argc, argv, prefix, run_options,
> +			     builtin_hook_usage, 0);
> +
> +	if (argc < 1)
> +		usage_msg_opt(_("You must specify a hook event to run."),
> +			      builtin_hook_usage, run_options);
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +	opt.run_hookdir = should_run_hookdir;
> +
> +	rc = run_hooks(hookname.buf, &opt);
> +
> +	strbuf_release(&hookname);
> +	run_hooks_opt_clear(&opt);
> +
> +	return rc;
> +}
> +
>   int cmd_hook(int argc, const char **argv, const char *prefix)
>   {
>   	const char *run_hookdir = NULL;
> @@ -95,10 +137,10 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
>   	};
>   
>   	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> -			     builtin_hook_usage, 0);
> +			     builtin_hook_usage, PARSE_OPT_KEEP_UNKNOWN);
>   
>   	/* after the parse, we should have "<command> <hookname> <args...>" */
> -	if (argc < 1)
> +	if (argc < 2)
>   		usage_with_options(builtin_hook_usage, builtin_hook_options);
>   
>   
> @@ -120,6 +162,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
>   
>   	if (!strcmp(argv[0], "list"))
>   		return list(argc, argv, prefix);
> +	if (!strcmp(argv[0], "run"))
> +		return run(argc, argv, prefix);
>   
>   	usage_with_options(builtin_hook_usage, builtin_hook_options);
>   }
> diff --git a/hook.c b/hook.c
> index f4084e33c8..c4595a2324 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -3,6 +3,7 @@
>   #include "hook.h"
>   #include "config.h"
>   #include "run-command.h"
> +#include "prompt.h"
>   
>   void free_hook(struct hook *ptr)
>   {
> @@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
>   	return hookdir_unknown;
>   }
>   
> +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> +{
> +	struct strbuf prompt = STRBUF_INIT;
> +	/*
> +	 * If the path doesn't exist, don't bother adding the empty hook and
> +	 * don't bother checking the config or prompting the user.
> +	 */
> +	if (!path)
> +		return 0;
> +
> +	switch (cfg)
> +	{
> +		case hookdir_no:

Style nit: we normally use uppercase for constants and enums.

> +			return 0;
> +		case hookdir_unknown:
> +			fprintf(stderr,
> +				_("Unrecognized value for 'hook.runHookDir'. "
> +				  "Is there a typo? "));

What happens at the moment if core.hooksPath does not exist?

Best Wishes

Phillip

> +			/* FALLTHROUGH */
> +		case hookdir_warn:
> +			fprintf(stderr, _("Running legacy hook at '%s'\n"),
> +				path);
> +			return 1;
> +		case hookdir_interactive:
> +			do {
> +				/*
> +				 * TRANSLATORS: Make sure to include [Y] and [n]
> +				 * in your translation. Only English input is
> +				 * accepted. Default option is "yes".
> +				 */
> +				fprintf(stderr, _("Run '%s'? [Yn] "), path);
> +				git_read_line_interactively(&prompt);
> +				strbuf_tolower(&prompt);
> +				if (starts_with(prompt.buf, "n")) {
> +					strbuf_release(&prompt);
> +					return 0;
> +				} else if (starts_with(prompt.buf, "y")) {
> +					strbuf_release(&prompt);
> +					return 1;
> +				}
> +				/* otherwise, we didn't understand the input */
> +			} while (prompt.len); /* an empty reply means "Yes" */
> +			strbuf_release(&prompt);
> +			return 1;
> +		case hookdir_yes:
> +		default:
> +			return 1;
> +	}
> +}
> +
>   struct list_head* hook_list(const struct strbuf* hookname)
>   {
>   	struct strbuf hook_key = STRBUF_INIT;
> @@ -166,3 +217,64 @@ struct list_head* hook_list(const struct strbuf* hookname)
>   	strbuf_release(&hook_key);
>   	return hook_head;
>   }
> +
> +void run_hooks_opt_init(struct run_hooks_opt *o)
> +{
> +	strvec_init(&o->env);
> +	strvec_init(&o->args);
> +	o->run_hookdir = configured_hookdir_opt();
> +}
> +
> +void run_hooks_opt_clear(struct run_hooks_opt *o)
> +{
> +	strvec_clear(&o->env);
> +	strvec_clear(&o->args);
> +}
> +
> +int run_hooks(const char *hookname, struct run_hooks_opt *options)
> +{
> +	struct strbuf hookname_str = STRBUF_INIT;
> +	struct list_head *to_run, *pos = NULL, *tmp = NULL;
> +	int rc = 0;
> +
> +	if (!options)
> +		BUG("a struct run_hooks_opt must be provided to run_hooks");
> +
> +	strbuf_addstr(&hookname_str, hookname);
> +
> +	to_run = hook_list(&hookname_str);
> +
> +	list_for_each_safe(pos, tmp, to_run) {
> +		struct child_process hook_proc = CHILD_PROCESS_INIT;
> +		struct hook *hook = list_entry(pos, struct hook, list);
> +
> +		hook_proc.env = options->env.v;
> +		hook_proc.no_stdin = 1;
> +		hook_proc.stdout_to_stderr = 1;
> +		hook_proc.trace2_hook_name = hook->command.buf;
> +		hook_proc.use_shell = 1;
> +
> +		if (hook->from_hookdir) {
> +		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
> +			continue;
> +		    /*
> +		     * Commands from the config could be oneliners, but we know
> +		     * for certain that hookdir commands are not.
> +		     */
> +		    hook_proc.use_shell = 0;
> +		}
> +
> +		/* add command */
> +		strvec_push(&hook_proc.args, hook->command.buf);
> +
> +		/*
> +		 * add passed-in argv, without expanding - let the user get back
> +		 * exactly what they put in
> +		 */
> +		strvec_pushv(&hook_proc.args, options->args.v);
> +
> +		rc |= run_command(&hook_proc);
> +	}
> +
> +	return rc;
> +}
> diff --git a/hook.h b/hook.h
> index ca45d388d3..d1c3d71e82 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -1,6 +1,7 @@
>   #include "config.h"
>   #include "list.h"
>   #include "strbuf.h"
> +#include "strvec.h"
>   
>   struct hook
>   {
> @@ -36,6 +37,37 @@ enum hookdir_opt
>    */
>   enum hookdir_opt configured_hookdir_opt(void);
>   
> +struct run_hooks_opt
> +{
> +	/* Environment vars to be set for each hook */
> +	struct strvec env;
> +
> +	/* Args to be passed to each hook */
> +	struct strvec args;
> +
> +	/*
> +	 * How should the hookdir be handled?
> +	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
> +	 * to be overridden if the user can override it at the command line.
> +	 */
> +	enum hookdir_opt run_hookdir;
> +};
> +
> +#define RUN_HOOKS_OPT_INIT  {   		\
> +	.env = STRVEC_INIT, 				\
> +	.args = STRVEC_INIT, 			\
> +	.run_hookdir = configured_hookdir_opt()	\
> +}
> +
> +void run_hooks_opt_init(struct run_hooks_opt *o);
> +void run_hooks_opt_clear(struct run_hooks_opt *o);
> +
> +/*
> + * Runs all hooks associated to the 'hookname' event in order. Each hook will be
> + * passed 'env' and 'args'.
> + */
> +int run_hooks(const char *hookname, struct run_hooks_opt *options);
> +
>   /* Free memory associated with a 'struct hook' */
>   void free_hook(struct hook *ptr);
>   /* Empties the list at 'head', calling 'free_hook()' on each entry */
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index ebd3bc623f..5b3003d59b 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> -	test_i18ncmp expected actual
> +	test_i18ncmp expected actual &&
> +
> +	git hook run pre-commit 2>actual &&
> +	test_must_be_empty actual
>   '
>   
>   test_expect_success 'hook.runHookDir = warn is respected by list' '
> @@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> +	test_i18ncmp expected actual &&
> +
> +	cat >expected <<-EOF &&
> +	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
> +	"Legacy Hook"
> +	EOF
> +
> +	git hook run pre-commit 2>actual &&
>   	test_i18ncmp expected actual
>   '
>   
> @@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
>   	test_cmp expected actual
>   '
>   
> -test_expect_success 'hook.runHookDir = interactive is respected by list' '
> +test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
>   	setup_hookdir &&
>   
>   	test_config hook.runHookDir "interactive" &&
> @@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> -	test_i18ncmp expected actual
> +	test_i18ncmp expected actual &&
> +
> +	test_write_lines n | git hook run pre-commit 2>actual &&
> +	! grep "Legacy Hook" actual &&
> +
> +	test_write_lines y | git hook run pre-commit 2>actual &&
> +	grep "Legacy Hook" actual
> +'
> +
> +test_expect_success 'inline hook definitions execute oneliners' '
> +	test_config hook.pre-commit.command "echo \"Hello World\"" &&
> +
> +	echo "Hello World" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'inline hook definitions resolve paths' '
> +	write_script sample-hook.sh <<-EOF &&
> +	echo \"Sample Hook\"
> +	EOF
> +
> +	test_when_finished "rm sample-hook.sh" &&
> +
> +	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
> +
> +	echo \"Sample Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'hookdir hook included in git hook run' '
> +	setup_hookdir &&
> +
> +	echo \"Legacy Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'out-of-repo runs excluded' '
> +	setup_hooks &&
> +
> +	nongit test_must_fail git hook run pre-commit
>   '
>   
>   test_done
> 

* Re: [PATCH 08/17] hook: add 'run' subcommand
  2020-12-11 10:15           ` Phillip Wood
@ 2020-12-15 21:41             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-15 21:41 UTC (permalink / raw)
  To: phillip.wood; +Cc: git

On Fri, Dec 11, 2020 at 10:15:26AM +0000, Phillip Wood wrote:
> 
> Hi Emily
> 
> On 05/12/2020 01:45, Emily Shaffer wrote:
> > In order to enable hooks to be run as an external process, by a
> > standalone Git command, or by tools which wrap Git, provide an external
> > means to run all configured hook commands for a given hook event.
> > 
> > For now, the hook commands will run in config order, in series. As
> > alternate ordering or parallelism is supported in the future, we should
> > add knobs to use those to the command line as well.
> > 
> > As with the legacy hook implementation, all stdout generated by hook
> > commands is redirected to stderr. Piping from stdin is not yet
> > supported.
> > 
> > Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> > execution list. For now, there is no way to disable them.
> > 
> > Users may wish to provide hook commands like 'git config
> > hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> > contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> > first split by space or quotes into an argv_array, then expanded with
> > 'expand_user_path()'.
> 
> I'm a bit confused by this last paragraph, the docs below say we pass the
> string to the shell and that's what the implementation seems to do. If we're
> running a lot of hooks then maybe it would be worth using split_cmdline()
> and expand_user_path() rather than invoking the shell for each hook we run.

Yeah, I think you are right that the commit message is stale. I had some
trouble getting things to work correctly with split_cmdline() and
expand_user_path(), so I'd prefer to run with shell.
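
To illustrate why handing the string to the shell is convenient: the
shell itself does the word splitting and tilde expansion that the stale
commit message attributed to split_cmdline() and expand_user_path().
This is only a minimal sketch and not code from the series; the script
name linter.sh, the scratch HOME, and the helper run_hook_command are
all made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for a configured hook command; the name and
# location are assumptions for illustration only.
HOME=$(mktemp -d)
export HOME
cat >"$HOME/linter.sh" <<'EOF'
#!/bin/sh
echo "linter called with: $*"
EOF
chmod +x "$HOME/linter.sh"

# Hypothetical helper: run a hook command string by passing it to the
# shell unmodified, as the docs describe.
run_hook_command () {
	sh -c "$1"
}

# The child shell expands "~" and splits the arguments itself.
run_hook_command '~/linter.sh --pre-commit'
```

The cost is the one Phillip points out: one extra shell process per
hook command.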

> 
> I'm afraid I've only had time to skim the patch, there are a couple of minor
> comments below.

No problem. Thanks for having a look.

> > +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> > +{
> > +	struct strbuf prompt = STRBUF_INIT;
> > +	/*
> > +	 * If the path doesn't exist, don't bother adding the empty hook and
> > +	 * don't bother checking the config or prompting the user.
> > +	 */
> > +	if (!path)
> > +		return 0;
> > +
> > +	switch (cfg)
> > +	{
> > +		case hookdir_no:
> 
> Style nit: we normally use uppercase for constants and enums.

OK. Thanks - will fix where it's introduced and update subsequent
patches.

> 
> > +			return 0;
> > +		case hookdir_unknown:
> > +			fprintf(stderr,
> > +				_("Unrecognized value for 'hook.runHookDir'. "
> > +				  "Is there a typo? "));
> 
> What happens at the moment if core.hooksPath does not exist?

When core.hooksPath does not exist then $GIT_DIR/hooks/ is used instead.
My setup currently doesn't have $GIT_DIR/hooks/ and runs happily. That
bit of logic (core.hooksPath or $GIT_DIR/hooks) is done in
run-command.h:find_hook() so I don't worry about it manually here.

However, your comment caused me to investigate what happens when
core.hooksPath DOES exist - and I found a bug. Because the 'git hook'
builtin doesn't call the default configuration callback, I miss
core.hooksPath hooks during 'git hook list' - but not during hooks
invoked during regular Git process runs. Very confusing :) So thanks for
the hint.
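
For readers following along, the "core.hooksPath or $GIT_DIR/hooks" rule
can be sketched roughly as below. The helper legacy_hook_path is
hypothetical and is not how find_hook() is written; it only shows the
fallback order and skips details such as checking that the hook exists
and is executable.

```shell
#!/bin/sh
# Hypothetical helper mirroring the "core.hooksPath or $GIT_DIR/hooks"
# rule; it is not Git's implementation.
# $1: git dir, $2: value of core.hooksPath (empty if unset), $3: hook name
legacy_hook_path () {
	if test -n "$2"
	then
		printf '%s/%s\n' "$2" "$3"
	else
		printf '%s/hooks/%s\n' "$1" "$3"
	fi
}

# core.hooksPath unset: fall back to the hooks directory in the git dir.
legacy_hook_path .git "" pre-commit
# core.hooksPath set (the path here is an arbitrary example value).
legacy_hook_path .git /etc/git/hooks pre-commit
```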

 - Emily


* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (16 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
@ 2020-12-16  0:34         ` Josh Steadmon
  2020-12-16  0:56           ` Junio C Hamano
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  18 siblings, 1 reply; 170+ messages in thread
From: Josh Steadmon @ 2020-12-16  0:34 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, Junio C Hamano, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On 2020.12.04 17:45, Emily Shaffer wrote:
> Hi folks, and thanks for the patience - I ran into many, many last-mile
> challenges.
> 
> I haven't addressed many comments on the design doc yet - I was keen to get the
> "functionally complete" implementation and conversion to the list.
> 
> Next on my plate:
>  - Update the design doc to make sense with what's in the implementation.
>  - A blog post! How to set up new hooks, why they're neat, etc.
>  - We seem to have some Googlers interested in trying it out internally, so
>    I'm hoping we'll gather and collate feedback from that soon too.
>  - And of course addressing comments on this series.
> 
> Thanks!
>  - Emily

This approach looks good to me. I'll look forward to seeing the updated
design and the feedback from the internal tests.


* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
@ 2020-12-16  0:56           ` Junio C Hamano
  2020-12-16 20:16             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-12-16  0:56 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: Emily Shaffer, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Josh Steadmon <steadmon@google.com> writes:

> On 2020.12.04 17:45, Emily Shaffer wrote:
>> Hi folks, and thanks for the patience - I ran into many, many last-mile
>> challenges.
>> 
>> I haven't addressed many comments on the design doc yet - I was keen to get the
>> "functionally complete" implementation and conversion to the list.
>> 
>> Next on my plate:
>>  - Update the design doc to make sense with what's in the implementation.
>>  - A blog post! How to set up new hooks, why they're neat, etc.
>>  - We seem to have some Googlers interested in trying it out internally, so
>>    I'm hoping we'll gather and collate feedback from that soon too.
>>  - And of course addressing comments on this series.
>> 
>> Thanks!
>>  - Emily
>
> This approach looks good to me. I'll look forward to seeing the updated
> design and the feedback from the internal tests.

Thanks.

By the way, es/config-hooks does not seem to pass 5411 (at least)
even as a standalone topic, and has been kicked out of 'seen' for
some time.  Has anybody taken a look into the issue?




* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16  0:56           ` Junio C Hamano
@ 2020-12-16 20:16             ` Emily Shaffer
  2020-12-16 23:32               ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-16 20:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On Tue, Dec 15, 2020 at 04:56:18PM -0800, Junio C Hamano wrote:
> 
> Josh Steadmon <steadmon@google.com> writes:
> 
> > On 2020.12.04 17:45, Emily Shaffer wrote:
> >> Hi folks, and thanks for the patience - I ran into many, many last-mile
> >> challenges.
> >> 
> >> I haven't addressed many comments on the design doc yet - I was keen to get the
> >> "functionally complete" implementation and conversion to the list.
> >> 
> >> Next on my plate:
> >>  - Update the design doc to make sense with what's in the implementation.
> >>  - A blog post! How to set up new hooks, why they're neat, etc.
> >>  - We seem to have some Googlers interested in trying it out internally, so
> >>    I'm hoping we'll gather and collate feedback from that soon too.
> >>  - And of course addressing comments on this series.
> >> 
> >> Thanks!
> >>  - Emily
> >
> > This approach looks good to me. I'll look forward to seeing the updated
> > design and the feedback from the internal tests.
> 
> Thanks.
> 
> By the way, es/config-hooks does not seem to pass 5411 (at least)
> even as a standalone topic, and has been kicked out of 'seen' for
> some time.  Has anybody taken a look into the issue?

Yeah, I looked at it today. Looks like an issue with not paying
attention to master->main default config, since I added a new test to
the 5411 suite (which means it wouldn't have made a conflict for someone
to say "ah yes, s/master/main/g"). I am tracking down a couple of other CI
errors today and will send a reroll today or tomorrow.


* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16 20:16             ` Emily Shaffer
@ 2020-12-16 23:32               ` Junio C Hamano
  2020-12-18  2:07                 ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-12-16 23:32 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

>> By the way, es/config-hooks does not seem to pass 5411 (at least)
>> even as a standalone topic, and has been kicked out of 'seen' for
>> some time.  Has anybody taken a look into the issue?
>
> Yeah, I looked at it today. Looks like an issue with not paying
> attention to master->main default config, since I added a new test to
> the 5411 suite (which means it wouldn't have made a conflict for someone
> to say "ah yes, s/master/main/g"). I am tracking down a couple of other CI
> errors today and will send a reroll today or tomorrow.

Thanks.


* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16 23:32               ` Junio C Hamano
@ 2020-12-18  2:07                 ` Emily Shaffer
  2020-12-18  5:29                   ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-18  2:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On Wed, Dec 16, 2020 at 03:32:46PM -0800, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> >> By the way, es/config-hooks does not seem to pass 5411 (at least)
> >> even as a standalone topic, and has been kicked out of 'seen' for
> >> some time.  Has anybody taken a look into the issue?
> >
> > Yeah, I looked at it today. Looks like an issue with not paying
> > attention to master->main default config, since I added a new test to
> > the 5411 suite (which means it wouldn't have made a conflict for someone
> > to say "ah yes, s/master/main/g"). I am tracking down a couple of other CI
> > errors today and will send a reroll today or tomorrow.
> 
> Thanks.

I don't have a reroll today. I have been trying to get to the bottom of
a test which fails when built with clang but passes when built with gcc
(t6030-bisect-porcelain.sh after patch 12 of the part II series) and
have not made progress on that, let alone on the other tasks I wanted to
do before sending the next version.

Next week I will only work one day, so I'd anticipate a reroll sometime
the week following. Sorry for the wait - but I think even if I sent it
with the fix for this t5411 failure, it would still break 'seen' because
of whatever this clang vs. gcc problem is.

Hope you enjoy your holidays.

 - Emily


* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-18  2:07                 ` Emily Shaffer
@ 2020-12-18  5:29                   ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-12-18  5:29 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> I don't have a reroll today. I have been trying to get to the bottom of
> a test which fails when built with clang but passes when built with gcc
> (t6030-bisect-porcelain.sh after patch 12 of the part II series) and
> have not made progress on that, let alone on the other tasks I wanted to
> do before sending the next version.

Thanks for an interim report.  No need to rush.

> Hope you enjoy your holidays.

You too, and have fun.


* [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (17 preceding siblings ...)
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
@ 2020-12-22  0:02         ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
                             ` (20 more replies)
  18 siblings, 21 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v6:

 - Converted 'enum hookdir_opt' to UPPER_SNAKE
 - Coccinelle fix in the hook destructor
 - Fixed a bug where builtin/hook.c wasn't running the default git config setup
   and therefore missed hooks in core.hooksPath when it was set. (These hooks
   would still run except when invoked by 'git hook run' as the config was
   called by the processes which invoked the hook library.)

CI run: https://github.com/nasamuffin/git/actions/runs/436864964

Thanks!
 - Emily

Emily Shaffer (17):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: respect hook.runHookDir
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()
  hook: support passing stdin to hooks
  run-command: allow stdin for run_processes_parallel
  hook: allow parallel hook execution
  hook: allow specifying working directory for hooks
  run-command: add stdin callback for parallelization
  hook: provide stdin by string_list or callback
  run-command: allow capturing of collated output
  hooks: allow callers to capture output

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  19 +
 Documentation/git-hook.txt                    | 117 +++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 355 +++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/bugreport.c                           |   4 +-
 builtin/fetch.c                               |   1 +
 builtin/hook.c                                | 176 ++++++++
 builtin/submodule--helper.c                   |   2 +-
 command-list.txt                              |   1 +
 git.c                                         |   1 +
 hook.c                                        | 416 ++++++++++++++++++
 hook.h                                        | 154 +++++++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 run-command.c                                 |  85 +++-
 run-command.h                                 |  31 ++
 submodule.c                                   |   1 +
 t/helper/test-run-command.c                   |  46 +-
 t/t0061-run-command.sh                        |  37 ++
 t/t1360-config-based-hooks.sh                 | 256 +++++++++++
 24 files changed, 1717 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 01/17] doc: propose hooks managed by the config
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
  2021-02-01 22:11             ` Junio C Hamano
  2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
                             ` (19 subsequent siblings)
  20 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v6, checked for inconsistencies with implementation and added lots of
    caveats about whether 'git hook add' and 'git hook edit' will ever materialize.
    
    Hopefully this reflects reality now; please review accordingly.
    
    Since v4, addressed comments from Jonathan Tan about wording. However, I have
    not addressed AEvar's comments or done a full re-review of this document.
    I wanted to get the rest of the series out for initial review first.
    
     - Emily
    

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 355 ++++++++++++++++++
 2 files changed, 356 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 69dbe4bb0b..505d318da1 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..3217faba47
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,355 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hooks/hookname path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform multiple unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`.
+(These settings are described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events.
+Theoretically, the last line could be used to "un-skip" the hook command for
+`pre-commit` hooks, but this hasn't been scoped or implemented yet.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, run, reorder, and create hook commands via the
+command line. External tools should be able to view a list of hooks in the
+correct order to run. Modifier commands (`edit` and `add`) have not been
+implemented yet and may not be if manually editing the config proves usable
+enough.
+
+*`git hook list <hook-event>`*
+
+*`git hook run <hook-event> [-a <arg>]... [-e <env-var>]...`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`. This has not been designed or implemented yet and may not be if
+the config proves usable enough.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. The
+hook library provides a basic API to call all hooks in config order with more
+complex options passed via `struct run_hooks_opt`:
+
+*`int run_hooks(const char *hookname, struct run_hooks_opt *options)`*
+
+`struct run_hooks_opt` allows callers to set:
+
+- environment variables
+- command-line arguments
+- behavior for the hook command provided by `run-command.h:find_hook()` (see
+  below)
+- a method to provide stdin to each hook, either via a file containing stdin, a
+  `struct string_list` containing a list of lines to print, or a callback
+  function to allow the caller to populate stdin manually
+- a method to process stdout from each hook, e.g. for printing to sideband
+  during a network operation
+- parallelism
+- a custom working directory for hooks to execute in
+
+And this struct can be extended with more options as necessary in the future.
+
+The "legacy" hook provided by `run-command.h:find_hook()` - that is, the hook
+present in `.git/hooks/<hookname>` or
+`$(git config --get core.hooksPath)/<hookname>` - can be handled in a number of
+ways, providing an avenue to deprecate these "legacy" hooks if desired. The
+handling is based on a config `hook.runHookDir`, which is checked against a
+number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+By default, hook parallelism is chosen based on the semantics of each hook;
+callsites initialize their `struct run_hooks_opt` via one of two macros,
+`RUN_HOOKS_OPT_INIT_SYNC` or `RUN_HOOKS_OPT_INIT_ASYNC`. The default number of
+jobs can be configured in `hook.jobs`; this config applies across all hook
+events. If unset, the value of `online_cpus()` (equivalent to `nproc`) is used.
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. `hook.h:run_hooks()` is
+taught to include `run-command.h:find_hook()` at the end; calls to `find_hook()`
+are replaced with calls to `run_hooks()`. Users can opt-in to config-based hooks
+simply by creating some in their config; otherwise users should remain
+unaffected by the change.
+
+[[stage-2]]
+==== Stage 2
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-3]]
+==== Stage 3
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but makes it easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization with dependencies
+
+Currently hooks use a naive parallelization scheme or are run in series.  But if
+one hook depends on another's output, then users will want to specify those
+dependencies. If we decide to solve this problem, we may want to look to modern
+build systems for inspiration on how to manage dependencies and parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog
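
As a reading aid for the hook.runHookDir rule in the design doc above
(unrecognized values are discarded, the last recognized value wins, and
the default is "yes"), here is a rough sketch. The function
resolve_run_hookdir is hypothetical and only models the rule as the doc
words it; it is not taken from hook.c and the implementation may behave
differently.

```shell
#!/bin/sh
# Hypothetical model of the documented hook.runHookDir resolution.
# Arguments are the configured values in the order Git would read them
# (for example system, then global, then local).
resolve_run_hookdir () {
	result=yes
	for value in "$@"
	do
		case "$value" in
		no|interactive|warn|yes)
			result=$value ;;
		# anything else is discarded, keeping the earlier value
		esac
	done
	echo "$result"
}

# system = warn, global = junk: the doc says this resolves to "warn".
resolve_run_hookdir warn junk
# only "junk" configured: the doc says the default "yes" is used.
resolve_run_hookdir junk
```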



* [PATCH v7 02/17] hook: scaffolding for git-hook subcommand
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
                             ` (18 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 command-list.txt              |  1 +
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 8 files changed, 57 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index 3dcdb6bb5a..3608c35b73 100644
--- a/.gitignore
+++ b/.gitignore
@@ -76,6 +76,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 6fb86c5862..24cee44400 100644
--- a/Makefile
+++ b/Makefile
@@ -1101,6 +1101,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..8df1d36a7a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -163,6 +163,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/command-list.txt b/command-list.txt
index 9379b02e5e..75909bf602 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -103,6 +103,7 @@ git-grep                                mainporcelain           info
 git-gui                                 mainporcelain
 git-hash-object                         plumbingmanipulators
 git-help                                ancillaryinterrogators          complete
+git-hook                                mainporcelain
 git-http-backend                        synchingrepositories
 git-http-fetch                          synchelpers
 git-http-push                           synchelpers
diff --git a/git.c b/git.c
index a00a0a4d94..9d1768b8e8 100644
--- a/git.c
+++ b/git.c
@@ -525,6 +525,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 03/17] hook: add list command
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:10             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
                             ` (17 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the sample in the commit message to reflect reality better.
    
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).
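
For readers following the ordering rule, here is a minimal standalone
sketch. It is not the code in this patch: 'struct entry', 'struct
entry_list' and 'append_or_move()' are hypothetical names, and a
fixed-size array stands in for the patch's linked list. It assumes the
behavior this patch aims for: commands are listed in config order, and a
command that is configured again later moves to that later position and
takes the scope that mentioned it last.

```c
#include <assert.h>
#include <string.h>

#define MAX_HOOKS 16

/* Hypothetical stand-in for the patch's 'struct hook'. */
struct entry {
	const char *scope;   /* e.g. "global" or "local" */
	const char *command; /* the literal command to run */
};

struct entry_list {
	struct entry items[MAX_HOOKS];
	size_t nr;
};

/*
 * Append 'command'. If it is already listed, drop the earlier entry so
 * the command appears once, at its last configured position, labeled
 * with the scope that mentioned it last.
 */
void append_or_move(struct entry_list *l, const char *scope,
		    const char *command)
{
	size_t i, j = 0;

	/* Compact the list, keeping every entry with a different command. */
	for (i = 0; i < l->nr; i++)
		if (strcmp(l->items[i].command, command))
			l->items[j++] = l->items[i];
	l->nr = j;

	assert(l->nr < MAX_HOOKS);
	l->items[l->nr].scope = scope;
	l->items[l->nr].command = command;
	l->nr++;
}
```

Feeding it "/path/def" (global), "/path/ghi" (local) and then
"/path/def" again (local) leaves "/path/ghi" first and "/path/def"
second, both reported as local, which is what the 'reorders on duplicate
commands' test expects.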

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 24cee44400..d9f43dc8fe 100644
--- a/Makefile
+++ b/Makefile
@@ -904,6 +904,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+			list_del(pos);
+			/* we'll simply move the hook to the end */
+			to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, head);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 04/17] hook: include hookdir hook in list
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (2 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:20             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
                             ` (16 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
those hooks should not stop working for users who already have them.

Legacy hooks should be run directly, not in shell. We know that they are
a path to an executable, not a oneliner script - and running them
directly takes care of path quoting concerns for us for free.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.
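
A rough standalone sketch of how a list line gets its label in this
patch, for illustration only. 'struct entry' and 'format_entry()' are
hypothetical names and the struct is trimmed down from the patch's
'struct hook'. It assumes what the diff below does: a hook found in the
hookdir is labeled with the literal word "hookdir", and everything else
is labeled with its config scope.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the patch's 'struct hook'. */
struct entry {
	const char *scope;   /* config scope name, e.g. "local" */
	const char *command; /* literal command or absolute hook path */
	int from_hookdir;    /* nonzero for the legacy hookdir hook */
};

/*
 * Write "<label>: <command>" into 'out'. Returns 1 on success and 0 if
 * the line did not fit in 'len' bytes.
 */
int format_entry(char *out, size_t len, const struct entry *e)
{
	int n;

	assert(strlen(e->command) > 0);
	n = snprintf(out, len, "%s: %s",
		     e->from_hookdir ? "hookdir" : e->scope,
		     e->command);
	return n >= 0 && (size_t)n < len;
}
```

The scope stored on a hookdir entry is ignored for display purposes; only
the 'from_hookdir' flag decides the label.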

 builtin/hook.c                | 18 ++++++++++++++----
 hook.c                        | 15 +++++++++++++++
 hook.h                        |  1 +
 t/t1360-config-based-hooks.sh | 19 +++++++++++++++++++
 4 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..a0013ae4d7 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -42,10 +43,17 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
@@ -62,6 +70,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
+	git_config(git_default_config, NULL);
+
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
 
diff --git a/hook.c b/hook.c
index 937dc768c8..ffbdcfd987 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -100,6 +102,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..5750634c83 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..0f12af4659 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 05/17] hook: respect hook.runHookDir
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (3 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:35             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
                             ` (15 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list' to respect 'hook.runHookDir', which controls what
happens to hooks found in the hook directory: "yes" (the default),
"warn", "interactive", or "no". A '--run-hookdir' argument overrides the
config. Hookdir hooks are listed with a matching annotation, such as
"(will not run)".

Because they are at least as local as the local config, we'll run them
last - to keep the hook execution order from global to local.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.
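
As a reading aid, the precedence between '--run-hookdir' and
'hook.runHookDir' can be sketched in standalone C. The enum mirrors the
one added to hook.h below, but 'parse_hookdir_opt()',
'resolve_hookdir_opt()' and 'hookdir_annotation()' are hypothetical
helpers, not functions from the patch. One deliberate simplification:
the patch dies on an invalid '--run-hookdir' value, while this sketch
maps any unrecognized text to HOOKDIR_UNKNOWN.

```c
#include <assert.h>
#include <string.h>

enum hookdir_opt {
	HOOKDIR_NO,
	HOOKDIR_WARN,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_YES,
	HOOKDIR_UNKNOWN,
};

/* Map a textual setting to the enum; unrecognized text is UNKNOWN. */
enum hookdir_opt parse_hookdir_opt(const char *value)
{
	assert(value);
	if (!strcmp(value, "no"))
		return HOOKDIR_NO;
	if (!strcmp(value, "yes"))
		return HOOKDIR_YES;
	if (!strcmp(value, "warn"))
		return HOOKDIR_WARN;
	if (!strcmp(value, "interactive"))
		return HOOKDIR_INTERACTIVE;
	return HOOKDIR_UNKNOWN;
}

/* Argument beats config; with neither set, default to running the hook. */
enum hookdir_opt resolve_hookdir_opt(const char *arg, const char *config)
{
	if (arg)
		return parse_hookdir_opt(arg);
	if (config)
		return parse_hookdir_opt(config);
	return HOOKDIR_YES;
}

/* Suffix shown after a hookdir hook in the list output. */
const char *hookdir_annotation(enum hookdir_opt opt)
{
	switch (opt) {
	case HOOKDIR_NO:
		return " (will not run)";
	case HOOKDIR_INTERACTIVE:
		return " (will prompt)";
	case HOOKDIR_WARN:
	case HOOKDIR_UNKNOWN:
		return " (will warn)";
	default:
		return "";
	}
}
```

Note that an unrecognized config value is annotated like "warn", which
matches the switch statement in builtin/hook.c in this patch.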

 Documentation/config/hook.txt |  5 ++++
 builtin/hook.c                | 54 +++++++++++++++++++++++++++++++++--
 hook.c                        | 21 ++++++++++++++
 hook.h                        | 15 ++++++++++
 t/t1360-config-based-hooks.sh | 43 ++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] ("core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index a0013ae4d7..d087e6f5b0 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,6 +11,8 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
@@ -41,6 +43,26 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case HOOKDIR_NO:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case HOOKDIR_INTERACTIVE:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case HOOKDIR_WARN:
+		case HOOKDIR_UNKNOWN:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case HOOKDIR_YES:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
 		if (item) {
@@ -64,16 +86,42 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 	git_config(git_default_config, NULL);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = HOOKDIR_NO;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = HOOKDIR_YES;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = HOOKDIR_WARN;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = HOOKDIR_INTERACTIVE;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index ffbdcfd987..ed52e85159 100644
--- a/hook.c
+++ b/hook.c
@@ -97,6 +97,27 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return HOOKDIR_YES; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return HOOKDIR_NO;
+
+	if (!strcmp(key, "yes"))
+		return HOOKDIR_YES;
+
+	if (!strcmp(key, "warn"))
+		return HOOKDIR_WARN;
+
+	if (!strcmp(key, "interactive"))
+		return HOOKDIR_INTERACTIVE;
+
+	return HOOKDIR_UNKNOWN;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
diff --git a/hook.h b/hook.h
index 5750634c83..ccdf6272f2 100644
--- a/hook.h
+++ b/hook.h
@@ -21,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	HOOKDIR_NO,
+	HOOKDIR_WARN,
+	HOOKDIR_INTERACTIVE,
+	HOOKDIR_YES,
+	HOOKDIR_UNKNOWN,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 0f12af4659..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -104,4 +104,47 @@ test_expect_success 'git hook list shows hooks from the hookdir' '
 	test_cmp expected actual
 '
 
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 06/17] hook: implement hookcmd.<name>.skip
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (4 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:40             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
                             ` (14 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user wants a specific repo to skip execution of a hook which is set
at a global or system level, they can now do so by specifying 'skip' in
their repo config:

~/.gitconfig
  [hook.pre-commit]
    command = skippable-oneliner
    command = skippable-hookcmd

  [hookcmd.skippable-hookcmd]
    command = foo.sh

$GIT_DIR/config
  [hookcmd.skippable-oneliner]
    skip = true
  [hookcmd.skippable-hookcmd]
    skip = true

Later it may make sense to add an option like
"hookcmd.<name>.<hook-event>-skip" - but for simplicity, let's start
with a universal skip setting like this.
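
As a rough illustration of the resulting list semantics (not the code in
this patch), the sketch below models the hook list as a plain array. The
names resolve_hooks(), find_cmd() and remove_at() are hypothetical, and
the model ignores the distinction between a hookcmd name and the command
it expands to. A later duplicate moves an entry to the end, and a skipped
entry is dropped:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of 'cmd' in 'list', or -1 if absent. */
static int find_cmd(const char **list, size_t len, const char *cmd)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (!strcmp(list[i], cmd))
			return (int)i;
	return -1;
}

/* Hypothetical helper: drop the entry at 'idx', shifting later ones down. */
static size_t remove_at(const char **list, size_t len, size_t idx)
{
	memmove(&list[idx], &list[idx + 1], (len - idx - 1) * sizeof(*list));
	return len - 1;
}

/*
 * Walk 'configured' in config order. 'out' must have room for 'nr'
 * entries. Returns the number of hooks that would remain in the list.
 */
size_t resolve_hooks(const char **configured, size_t nr,
		     const char **skipped, size_t nr_skipped,
		     const char **out)
{
	size_t i, len = 0;

	for (i = 0; i < nr; i++) {
		int prior = find_cmd(out, len, configured[i]);

		/* an earlier entry for the same command is always removed */
		if (prior >= 0)
			len = remove_at(out, len, (size_t)prior);

		/* re-append at the end unless the command is marked skip */
		if (find_cmd(skipped, nr_skipped, configured[i]) < 0)
			out[len++] = configured[i];
	}
	return len;
}
```

With configured = { "a", "b", "a", "c" } and skipped = { "b" }, this
model leaves "a" followed by "c".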

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    In addition to being handy for turning off global hooks one project doesn't
    care about, this setting will be necessary much later for the 'proc-receive'
    hook, which can only cope with up to one hook being specified.
    
    New since v4.
    
    During the Google team's review club I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to
    exclude a given hook without this; however, I think I have some more
    work to do on it, so consider it RFC for now and tell me what you think
    :)
     - Emily
    

 hook.c                        | 37 +++++++++++++++++++++++++----------
 t/t1360-config-based-hooks.sh | 23 ++++++++++++++++++++++
 2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/hook.c b/hook.c
index ed52e85159..d262503725 100644
--- a/hook.c
+++ b/hook.c
@@ -12,23 +12,24 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void append_or_move_hook(struct list_head *head, const char *command)
+static struct hook* find_hook_by_command(struct list_head *head, const char *command)
 {
 	struct list_head *pos = NULL, *tmp = NULL;
-	struct hook *to_add = NULL;
+	struct hook *found = NULL;
 
-	/*
-	 * remove the prior entry with this command; we'll replace it at the
-	 * end.
-	 */
 	list_for_each_safe(pos, tmp, head) {
 		struct hook *it = list_entry(pos, struct hook, list);
 		if (!strcmp(it->command.buf, command)) {
 		    list_del(pos);
-		    /* we'll simply move the hook to the end */
-		    to_add = it;
+		    found = it;
 		}
 	}
+	return found;
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct hook *to_add = find_hook_by_command(head, command);
 
 	if (!to_add) {
 		/* adding a new hook, not moving an old one */
@@ -41,7 +42,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 	/* re-set the scope so we show where an override was specified */
 	to_add->origin = current_config_scope();
 
-	list_add_tail(&to_add->list, pos);
+	list_add_tail(&to_add->list, head);
 }
 
 static void remove_hook(struct list_head *to_remove)
@@ -73,8 +74,18 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	if (!strcmp(key, hook_key)) {
 		const char *command = value;
 		struct strbuf hookcmd_name = STRBUF_INIT;
+		int skip = 0;
+
+		/*
+		 * Check if we're removing that hook instead. Hookcmds are
+		 * removed by name, and inlined hooks are removed by command
+		 * content.
+		 */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.skip", command);
+		git_config_get_bool(hookcmd_name.buf, &skip);
 
 		/* Check if a hookcmd with that name exists. */
+		strbuf_reset(&hookcmd_name);
 		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
 		git_config_get_value(hookcmd_name.buf, &command);
 
@@ -89,7 +100,13 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 		 *   for each key+value, do_callback(key, value, cb_data)
 		 */
 
-		append_or_move_hook(head, command);
+		if (skip) {
+			struct hook *to_remove = find_hook_by_command(head, command);
+			if (to_remove)
+				remove_hook(&(to_remove->list));
+		} else {
+			append_or_move_hook(head, command);
+		}
 
 		strbuf_release(&hookcmd_name);
 	}
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 91127a50a4..ebd3bc623f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -132,6 +132,29 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 	test_i18ncmp expected actual
 '
 
+test_expect_success 'git hook list removes skipped hookcmd' '
+	setup_hookcmd &&
+	test_config hookcmd.abc.skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	no commands configured for hook '\''pre-commit'\''
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'git hook list removes skipped inlined hook' '
+	setup_hooks &&
+	test_config hookcmd."$ROOT/path/ghi".skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
 
 test_expect_success 'hook.runHookDir = interactive is respected by list' '
 	setup_hookdir &&
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 07/17] parse-options: parse into strvec
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (5 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
                             ` (13 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.
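
To show the intended behavior without pulling in Git's headers, here is
a standalone sketch. 'struct toy_vec' and the toy_*() functions are
stand-ins invented for this example; they are not Git's strvec API. The
callback logic follows the same three cases as the patch: clear on
'--no-<option>', fail on a missing value, otherwise append.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a growable string array; not Git's 'struct strvec'. */
struct toy_vec {
	char **v;
	size_t nr, alloc;
};

static void toy_vec_push(struct toy_vec *vec, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy;

	if (vec->nr == vec->alloc) {
		size_t alloc = vec->alloc ? vec->alloc * 2 : 4;
		char **grown = realloc(vec->v, alloc * sizeof(*grown));

		if (!grown)
			abort();
		vec->v = grown;
		vec->alloc = alloc;
	}
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	vec->v[vec->nr++] = copy;
}

static void toy_vec_clear(struct toy_vec *vec)
{
	size_t i;

	for (i = 0; i < vec->nr; i++)
		free(vec->v[i]);
	free(vec->v);
	vec->v = NULL;
	vec->nr = vec->alloc = 0;
}

/* Same shape as an option callback: 'unset' is set for --no-<option>. */
int toy_opt_collect(struct toy_vec *vec, const char *arg, int unset)
{
	if (unset) {
		toy_vec_clear(vec);
		return 0;
	}
	if (!arg)
		return -1;
	toy_vec_push(vec, arg);
	return 0;
}
```

Only the values are stored; the option name itself ('--arg' in the
example above) never reaches the vector.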

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, fixed one or two more places where I missed the argv_array->strvec
    rename.

 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4542d4d3f9..c2451dfb1b 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -207,6 +207,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 7030d8f3da..75cc8c7c96 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 08/17] hook: add 'run' subcommand
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (6 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  4:22             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
                             ` (12 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. When
alternate ordering or parallelism is supported in the future, we should
add command-line knobs for those as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GIT_DIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this,
config-defined hooks are run in a shell. (Since hooks in $GIT_DIR/hooks
can't be specified with included arguments or paths which need expansion
like this, they are run without a shell instead.)
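
The difference between the two cases can be pictured as two argv
layouts. The sketch below is an illustration only: build_hook_argv() is
a hypothetical helper, and the exact 'sh -c' wrapping shown here is an
assumption, since the patch itself only toggles 'use_shell' and leaves
the details to run-command.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: fill 'out' (capacity 'cap') with a NULL-terminated
 * argv for one hook. 'script' is scratch space for the "sh -c" text.
 * Returns the number of entries before the NULL, or -1 if they do not fit.
 */
int build_hook_argv(const char *command, int from_hookdir,
		    const char **args, size_t nr_args,
		    char *script, size_t script_len,
		    const char **out, size_t cap)
{
	size_t n = 0, i;
	/* hookdir: command + args + NULL; config: 4 leading entries instead */
	size_t needed = nr_args + (from_hookdir ? 2 : 5);

	if (cap < needed)
		return -1;

	if (from_hookdir) {
		/* run the file directly, so spaces in the path are harmless */
		out[n++] = command;
	} else {
		/* let the shell expand '~' and split the oneliner */
		int w = snprintf(script, script_len, "%s \"$@\"", command);

		if (w < 0 || (size_t)w >= script_len)
			return -1;
		out[n++] = "sh";
		out[n++] = "-c";
		out[n++] = script;
		out[n++] = command; /* becomes $0 inside the script */
	}

	for (i = 0; i < nr_args; i++)
		out[n++] = args[i];
	out[n] = NULL;
	return (int)n;
}
```

Under these assumptions a configured "~/linter.sh --pre-commit" is
handed to the shell as one script string, while a hookdir path is passed
through untouched as argv[0].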

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the docs, and did less local application of single
    quotes. In order for hookdir hooks to run successfully with a space in
    the path, though, they must not be run with 'sh -c'. So we can treat the
    hookdir hooks specially, and warn users via doc about special
    considerations for configured hooks with spaces in their path.

 Documentation/git-hook.txt    |  31 +++++++++-
 builtin/hook.c                |  48 ++++++++++++++-
 hook.c                        | 112 ++++++++++++++++++++++++++++++++++
 hook.h                        |  32 ++++++++++
 t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++-
 5 files changed, 281 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index f19875ed68..18a817d832 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,11 +9,12 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
 
 DESCRIPTION
 -----------
-You can list configured hooks with this command. Later, you will be able to run,
-add, and modify hooks with this command.
+You can list and run configured hooks with this command. Later, you will be able
+to add and modify hooks with this command.
 
 This command parses the default configuration files for sections `hook` and
 `hookcmd`. `hook` is used to describe the commands which will be run during a
@@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+
+Runs hooks configured for `<hook-name>`, in the same order displayed by `git
+hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
+containing special characters or spaces should be wrapped in single quotes:
+`command = '/my/path with spaces/script.sh' some args`.
+
+OPTIONS
+-------
+--run-hookdir::
+	Overrides the hook.runHookDir config. Must be 'yes', 'warn',
+	'interactive', or 'no'. Specifies how to handle hooks located in the Git
+	hook directory (core.hooksPath).
+
+-a::
+--arg::
+	Only valid for `run`.
++
+Specify arguments to pass to every hook that is run.
+
+-e::
+--env::
+	Only valid for `run`.
++
+Specify environment variables to set for every hook that is run.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index d087e6f5b0..07ba00e07a 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -84,6 +86,46 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	int rc = 0;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &opt.env, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	/*
+	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
+	 * to execute them. Hooks usually want to look at repository artifacts.
+	 */
+	if (!have_git_dir())
+		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
+			      builtin_hook_usage, run_options);
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("You must specify a hook event to run."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+	opt.run_hookdir = should_run_hookdir;
+
+	rc = run_hooks(hookname.buf, &opt);
+
+	strbuf_release(&hookname);
+	run_hooks_opt_clear(&opt);
+
+	return rc;
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	const char *run_hookdir = NULL;
@@ -95,10 +137,10 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 	};
 
 	argc = parse_options(argc, argv, prefix, builtin_hook_options,
-			     builtin_hook_usage, 0);
+			     builtin_hook_usage, PARSE_OPT_KEEP_UNKNOWN);
 
 	/* after the parse, we should have "<command> <hookname> <args...>" */
-	if (argc < 1)
+	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 	git_config(git_default_config, NULL);
@@ -122,6 +164,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[0], "list"))
 		return list(argc, argv, prefix);
+	if (!strcmp(argv[0], "run"))
+		return run(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index d262503725..5836bbb739 100644
--- a/hook.c
+++ b/hook.c
@@ -3,6 +3,7 @@
 #include "hook.h"
 #include "config.h"
 #include "run-command.h"
+#include "prompt.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return HOOKDIR_UNKNOWN;
 }
 
+static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
+{
+	struct strbuf prompt = STRBUF_INIT;
+	/*
+	 * If the path doesn't exist, don't bother adding the empty hook and
+	 * don't bother checking the config or prompting the user.
+	 */
+	if (!path)
+		return 0;
+
+	switch (cfg)
+	{
+		case HOOKDIR_NO:
+			return 0;
+		case HOOKDIR_UNKNOWN:
+			fprintf(stderr,
+				_("Unrecognized value for 'hook.runHookDir'. "
+				  "Is there a typo? "));
+			/* FALLTHROUGH */
+		case HOOKDIR_WARN:
+			fprintf(stderr, _("Running legacy hook at '%s'\n"),
+				path);
+			return 1;
+		case HOOKDIR_INTERACTIVE:
+			do {
+				/*
+				 * TRANSLATORS: Make sure to include [Y] and [n]
+				 * in your translation. Only English input is
+				 * accepted. Default option is "yes".
+				 */
+				fprintf(stderr, _("Run '%s'? [Yn] "), path);
+				git_read_line_interactively(&prompt);
+				strbuf_tolower(&prompt);
+				if (starts_with(prompt.buf, "n")) {
+					strbuf_release(&prompt);
+					return 0;
+				} else if (starts_with(prompt.buf, "y")) {
+					strbuf_release(&prompt);
+					return 1;
+				}
+				/* otherwise, we didn't understand the input */
+			} while (prompt.len); /* an empty reply means "Yes" */
+			strbuf_release(&prompt);
+			return 1;
+		case HOOKDIR_YES:
+		default:
+			return 1;
+	}
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
@@ -166,3 +217,64 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	strbuf_release(&hook_key);
 	return hook_head;
 }
+
+void run_hooks_opt_init(struct run_hooks_opt *o)
+{
+	strvec_init(&o->env);
+	strvec_init(&o->args);
+	o->run_hookdir = configured_hookdir_opt();
+}
+
+void run_hooks_opt_clear(struct run_hooks_opt *o)
+{
+	strvec_clear(&o->env);
+	strvec_clear(&o->args);
+}
+
+int run_hooks(const char *hookname, struct run_hooks_opt *options)
+{
+	struct strbuf hookname_str = STRBUF_INIT;
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	if (!options)
+		BUG("a struct run_hooks_opt must be provided to run_hooks");
+
+	strbuf_addstr(&hookname_str, hookname);
+
+	to_run = hook_list(&hookname_str);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		hook_proc.env = options->env.v;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		if (hook->from_hookdir) {
+		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
+			continue;
+		    /*
+		     * Commands from the config could be oneliners, but we know
+		     * for certain that hookdir commands are not.
+		     */
+		    hook_proc.use_shell = 0;
+		}
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		strvec_pushv(&hook_proc.args, options->args.v);
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index ccdf6272f2..259662968f 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -36,6 +37,37 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+struct run_hooks_opt
+{
+	/* Environment vars to be set for each hook */
+	struct strvec env;
+
+	/* Args to be passed to each hook */
+	struct strvec args;
+
+	/*
+	 * How should the hookdir be handled?
+	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
+	 * to be overridden if the user can override it at the command line.
+	 */
+	enum hookdir_opt run_hookdir;
+};
+
+#define RUN_HOOKS_OPT_INIT  {   		\
+	.env = STRVEC_INIT, 				\
+	.args = STRVEC_INIT, 			\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+void run_hooks_opt_init(struct run_hooks_opt *o);
+void run_hooks_opt_clear(struct run_hooks_opt *o);
+
+/*
+ * Runs all hooks associated to the 'hookname' event in order. Each hook will be
+ * passed 'env' and 'args'.
+ */
+int run_hooks(const char *hookname, struct run_hooks_opt *options);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebd3bc623f..5b3003d59b 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	git hook run pre-commit 2>actual &&
+	test_must_be_empty actual
 '
 
 test_expect_success 'hook.runHookDir = warn is respected by list' '
@@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
+	test_i18ncmp expected actual &&
+
+	cat >expected <<-EOF &&
+	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
+	"Legacy Hook"
+	EOF
+
+	git hook run pre-commit 2>actual &&
 	test_i18ncmp expected actual
 '
 
@@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
 	test_cmp expected actual
 '
 
-test_expect_success 'hook.runHookDir = interactive is respected by list' '
+test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
 	setup_hookdir &&
 
 	test_config hook.runHookDir "interactive" &&
@@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	test_write_lines n | git hook run pre-commit 2>actual &&
+	! grep "Legacy Hook" actual &&
+
+	test_write_lines y | git hook run pre-commit 2>actual &&
+	grep "Legacy Hook" actual
+'
+
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	write_script sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm sample-hook.sh" &&
+
+	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hookdir hook included in git hook run' '
+	setup_hookdir &&
+
+	echo \"Legacy Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'out-of-repo runs excluded' '
+	setup_hooks &&
+
+	nongit test_must_fail git hook run pre-commit
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 09/17] hook: replace find_hook() with hook_exists()
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (7 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  4:39             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 10/17] hook: support passing stdin to hooks Emily Shaffer
                             ` (11 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Many callers want to check whether some state could be modified by a
hook; that check should include the config-based hooks as well. Optimize
by checking the config directly. Since commands which execute hooks
might want to take args to replace 'hook.runHookDir', let
'hook_exists()' mirror the behavior of 'hook.runHookDir'.
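
The decision can be reduced to a small truth table. The sketch below
restates the condition from the diff as a pure function; the 'toy_'
names are stand-ins for this example, and the two int parameters replace
the config lookup and the find_hook() call.

```c
/* Stand-in for 'enum hookdir_opt'; values are illustrative only. */
enum toy_hookdir_opt {
	TOY_HOOKDIR_NO,
	TOY_HOOKDIR_WARN,
	TOY_HOOKDIR_INTERACTIVE,
	TOY_HOOKDIR_YES,
	TOY_HOOKDIR_UNKNOWN
};

/*
 * Returns 1 if a hook could run: either the config names a command for
 * the event, or a hookdir script exists and the mode allows running it.
 */
int toy_hook_exists(int has_config_hook, int has_hookdir_hook,
		    enum toy_hookdir_opt opt)
{
	int could_run_hookdir = has_hookdir_hook &&
		(opt == TOY_HOOKDIR_INTERACTIVE ||
		 opt == TOY_HOOKDIR_WARN ||
		 opt == TOY_HOOKDIR_YES);

	return has_config_hook || could_run_hookdir;
}
```

Note that in this restatement an unrecognized mode does not count a
hookdir script as runnable, which matches the condition as written in
the diff.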

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/bugreport.c |  4 ++--
 hook.c              | 15 +++++++++++++++
 hook.h              |  9 +++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index ad3cc9c02f..2fe65d8f1e 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -3,7 +3,7 @@
 #include "strbuf.h"
 #include "help.h"
 #include "compat/compiler.h"
-#include "run-command.h"
+#include "hook.h"
 
 
 static void get_system_info(struct strbuf *sys_info)
@@ -82,7 +82,7 @@ static void get_populated_hooks(struct strbuf *hook_info, int nongit)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(hook); i++)
-		if (find_hook(hook[i]))
+		if (hook_exists(hook[i], configured_hookdir_opt()))
 			strbuf_addf(hook_info, "%s\n", hook[i]);
 }
 
diff --git a/hook.c b/hook.c
index 5836bbb739..fbb69706d8 100644
--- a/hook.c
+++ b/hook.c
@@ -225,6 +225,21 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	o->run_hookdir = configured_hookdir_opt();
 }
 
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
+{
+	const char *value = NULL; /* throwaway */
+	struct strbuf hook_key = STRBUF_INIT;
+
+	int could_run_hookdir = (should_run_hookdir == HOOKDIR_INTERACTIVE ||
+				should_run_hookdir == HOOKDIR_WARN ||
+				should_run_hookdir == HOOKDIR_YES)
+				&& !!find_hook(hookname);
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || could_run_hookdir;
+}
+
 void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
diff --git a/hook.h b/hook.h
index 259662968f..762b6fadad 100644
--- a/hook.h
+++ b/hook.h
@@ -62,6 +62,15 @@ struct run_hooks_opt
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
+/*
+ * Returns 1 if any hooks are specified in the config or if a hook exists in the
+ * hookdir. Typically, invoke hook_exists() like:
+ *   hook_exists(hookname, configured_hookdir_opt());
+ * Like with run_hooks, if you take a --run-hookdir flag, reflect that
+ * user-specified behavior here instead.
+ */
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
+
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
  * passed 'env' and 'args'.
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 10/17] hook: support passing stdin to hooks
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (8 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
                             ` (10 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some hooks (such as post-rewrite) need to take input via stdin.
Previously, callers provided stdin to hooks by setting
run-command.h:child_process.in, which takes an FD. Callers would open the
file in question themselves before calling run_command(). However, since
we will now need to seek to the front of the file and read it again for
every hook which runs, hook.h:run_hooks() takes a path and handles FD
management itself. Since this file is opened for read only, it should
not prevent later parallel execution support.

On the frontend, this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) and reading it back from the
beginning to each hook. We'd want to support cases like insufficient
memory or storage for the file. While this may prove useful later, for
now the path of least resistance is to just ask the user to make this
interim file themselves.
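
The reopen-per-hook idea can be seen in isolation with plain stdio. This
is a sketch only: feed_each_hook() is a hypothetical name, and instead
of spawning a child it just counts the bytes each "hook" would receive.

```c
#include <stdio.h>

/*
 * Open 'path' afresh for each of 'nr_hooks' consumers and read it to
 * EOF, recording how many bytes each one saw in 'bytes_seen'.
 * Returns 0 on success, -1 if the file cannot be opened.
 */
int feed_each_hook(const char *path, size_t nr_hooks, size_t *bytes_seen)
{
	size_t i;

	for (i = 0; i < nr_hooks; i++) {
		char buf[4096];
		size_t got;
		/* a fresh handle starts at offset 0, so no seeking is needed */
		FILE *in = fopen(path, "rb");

		if (!in)
			return -1;
		bytes_seen[i] = 0;
		while ((got = fread(buf, 1, sizeof(buf), in)) > 0)
			bytes_seen[i] += got;
		fclose(in);
	}
	return 0;
}
```

Because every consumer gets its own read-only handle, each one sees the
whole file regardless of how much an earlier consumer read.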

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 11 +++++++++--
 builtin/hook.c                |  5 ++++-
 hook.c                        |  7 ++++++-
 hook.h                        |  9 +++++++--
 t/t1360-config-based-hooks.sh | 24 ++++++++++++++++++++++++
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 18a817d832..cce30a80d0 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,7 +9,8 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
-'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
+	<hook-name>
 
 DESCRIPTION
 -----------
@@ -65,7 +66,7 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -91,6 +92,12 @@ Specify arguments to pass to every hook that is run.
 +
 Specify environment variables to set for every hook that is run.
 
+--to-stdin::
+	Only valid for `run`.
++
+Specify a file which will be streamed into stdin for every hook that is run.
+Each hook will receive the entire file from beginning to EOF.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 07ba00e07a..be104f2938 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -9,7 +9,8 @@
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
-	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] "
+	   "[--to-stdin=<path>] <hookname>"),
 	NULL
 };
 
@@ -97,6 +98,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("environment variables for hook to use")),
 		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
 			   N_("argument to pass to hook")),
+		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
+			   N_("file to read into hooks' stdin")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index fbb69706d8..ce5c443206 100644
--- a/hook.c
+++ b/hook.c
@@ -263,8 +263,13 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
+		/* reopen the file for stdin; run_command closes it. */
+		if (options->path_to_stdin)
+			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
+		else
+			hook_proc.no_stdin = 1;
+
 		hook_proc.env = options->env.v;
-		hook_proc.no_stdin = 1;
 		hook_proc.stdout_to_stderr = 1;
 		hook_proc.trace2_hook_name = hook->command.buf;
 		hook_proc.use_shell = 1;
diff --git a/hook.h b/hook.h
index 762b6fadad..e22a6db832 100644
--- a/hook.h
+++ b/hook.h
@@ -51,11 +51,15 @@ struct run_hooks_opt
 	 * to be overridden if the user can override it at the command line.
 	 */
 	enum hookdir_opt run_hookdir;
+
+	/* Path to file which should be piped to stdin for each hook */
+	const char *path_to_stdin;
 };
 
 #define RUN_HOOKS_OPT_INIT  {   		\
-	.env = STRVEC_INIT, 				\
+	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -73,7 +77,8 @@ int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
 
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
- * passed 'env' and 'args'.
+ * passed 'env' and 'args'. The file at 'path_to_stdin' will be closed and
+ * reopened for each hook that runs.
  */
 int run_hooks(const char *hookname, struct run_hooks_opt *options);
 
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 5b3003d59b..c672269ee4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -229,4 +229,28 @@ test_expect_success 'out-of-repo runs excluded' '
 	nongit test_must_fail git hook run pre-commit
 '
 
+test_expect_success 'stdin to multiple hooks' '
+	git config --add hook.test.command "xargs -P1 -I% echo a%" &&
+	git config --add hook.test.command "xargs -P1 -I% echo b%" &&
+	test_when_finished "test_unconfig hook.test.command" &&
+
+	cat >input <<-EOF &&
+	1
+	2
+	3
+	EOF
+
+	cat >expected <<-EOF &&
+	a1
+	a2
+	a3
+	b1
+	b2
+	b3
+	EOF
+
+	git hook run --to-stdin=input test 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (9 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 10/17] hook: support passing stdin to hooks Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-02-01  5:38             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
                             ` (9 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

While it makes sense not to inherit stdin from the parent process to
avoid deadlocking, it's not necessary to completely ban stdin to
children. An informed user should be able to configure stdin safely. By
setting `some_child.process.no_stdin=1` before calling `get_next_task()`
we provide a reasonable default behavior but enable users to set up
stdin streaming for themselves during the callback.

`some_child.process.stdout_to_stderr`, however, remains unmodifiable by
`get_next_task()` - the rest of the run_processes_parallel() API depends
on child output in stderr.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 run-command.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
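
The ordering described in the commit message can be sketched outside of Git
as follows. This is only an illustration under stated assumptions: the
sketch_* names and the cut-down struct are hypothetical stand-ins, not part of
run-command.c. The default is set before the task callback runs, so the
callback may override it, while stdout_to_stderr is forced after the callback
returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few child_process fields that matter here. */
struct sketch_process {
	int in;                    /* -1 requests a pipe for stdin */
	unsigned no_stdin:1;
	unsigned stdout_to_stderr:1;
};

/* Hypothetical counterpart of get_next_task_fn; returns 0 when no task is left. */
typedef int (*sketch_next_task_fn)(struct sketch_process *cp, void *data);

static int sketch_start_one(struct sketch_process *cp,
			    sketch_next_task_fn get_next_task, void *data)
{
	assert(cp && get_next_task);

	/* Default first, so the callback below is free to change it. */
	cp->no_stdin = 1;

	if (!get_next_task(cp, data))
		return 0;

	/* Forced after the callback: the caller cannot opt out of this. */
	cp->stdout_to_stderr = 1;
	return 1;
}

/* A task that opts in to stdin and (in vain) tries to keep stdout separate. */
static int sketch_task_with_stdin(struct sketch_process *cp, void *data)
{
	(void)data;
	cp->no_stdin = 0;
	cp->in = -1;
	cp->stdout_to_stderr = 0;
	return 1;
}

/* A task that leaves every default alone. */
static int sketch_task_plain(struct sketch_process *cp, void *data)
{
	(void)cp;
	(void)data;
	return 1;
}
```

With sketch_task_with_stdin the process ends up with no_stdin cleared and a
pipe requested, whereas sketch_task_plain keeps the no-stdin default; in both
cases stdout_to_stderr is set.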

diff --git a/run-command.c b/run-command.c
index ea4d0fb4b1..80c8c97bc1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1683,6 +1683,9 @@ static int pp_start_one(struct parallel_processes *pp)
 	if (i == pp->max_processes)
 		BUG("bookkeeping is hard");
 
+	/* disallow by default, but allow users to set up stdin if they wish */
+	pp->children[i].process.no_stdin = 1;
+
 	code = pp->get_next_task(&pp->children[i].process,
 				 &pp->children[i].err,
 				 pp->data,
@@ -1694,7 +1697,6 @@ static int pp_start_one(struct parallel_processes *pp)
 	}
 	pp->children[i].process.err = -1;
 	pp->children[i].process.stdout_to_stderr = 1;
-	pp->children[i].process.no_stdin = 1;
 
 	if (start_command(&pp->children[i].process)) {
 		code = pp->start_failure(&pp->children[i].err,
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v7 12/17] hook: allow parallel hook execution
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (10 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-02-01  6:04             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 13/17] hook: allow specifying working directory for hooks Emily Shaffer
                             ` (8 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In many cases, there's no reason not to allow hooks to execute in
parallel. run_processes_parallel() is well-suited - it's a task queue
that runs its housekeeping in series, which means users don't
need to worry about thread safety on their callback data. True
multithreaded execution with the async_* functions isn't necessary here.
Synchronous hook execution can be achieved by only allowing 1 job to run
at a time.

Teach run_hooks() to use that function for simple hooks which don't
require stdin or capture of stderr.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Per AEvar's request - parallel hook execution on day zero.
    
    In most ways run_processes_parallel() worked great for me - but it didn't
    have great support for hooks where we pipe to and from. I had to add this
    support later in the series.
    
    Since I modified an existing and in-use library I'd appreciate a keen look on
    these patches.
    
     - Emily

 Documentation/config/hook.txt |   5 ++
 Documentation/git-hook.txt    |  14 +++-
 builtin/hook.c                |   6 +-
 hook.c                        | 142 ++++++++++++++++++++++++++--------
 hook.h                        |  28 ++++++-
 5 files changed, 157 insertions(+), 38 deletions(-)
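
As a rough illustration of the bookkeeping this patch relies on, the sketch
below walks a circular list with a cursor and ORs the exit codes together. It
assumes a simplified list node rather than Git's list.h, and every sketch_*
name is hypothetical. Because the task queue calls these callbacks in series,
the shared state needs no locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular list node; 'head' is a sentinel with no command. */
struct sketch_node {
	struct sketch_node *next;
	const char *command;
};

/* Hypothetical analogue of struct hook_cb_data. */
struct sketch_cb_data {
	int rc;                      /* OR of all results seen so far */
	struct sketch_node *head;    /* sentinel marking the end of the list */
	struct sketch_node *run_me;  /* next hook to hand out */
};

/* Hands out the next command, or NULL once the cursor is back at the sentinel. */
static const char *sketch_pick_next(struct sketch_cb_data *cb)
{
	const char *command;

	assert(cb && cb->head && cb->run_me);
	if (cb->run_me == cb->head)
		return NULL;
	command = cb->run_me->command;
	cb->run_me = cb->run_me->next;
	return command;
}

/* Records one finished hook; any nonzero result makes the overall rc nonzero. */
static void sketch_hook_finished(struct sketch_cb_data *cb, int result)
{
	assert(cb);
	cb->rc |= result;
}
```

The cursor starts at head->next, mirroring how cb_data.run_me is set to
to_run->next before the hooks are started.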

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 75312754ae..a423d13781 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -12,3 +12,8 @@ hook.runHookDir::
 	Controls how hooks contained in your hookdir are executed. Can be any of
 	"yes", "warn", "interactive", or "no". Defaults to "yes". See
 	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
+
+hook.jobs::
+	Specifies how many hooks can be run simultaneously during parallelized
+	hook execution. If unspecified, defaults to the number of processors on
+	the current system.
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index cce30a80d0..01cee4ad81 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git hook' list <hook-name>
 'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
-	<hook-name>
+	[(-j|--jobs) <n>] <hook-name>
 
 DESCRIPTION
 -----------
@@ -66,7 +66,7 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] [(-j|--jobs) <n>] `<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -98,6 +98,16 @@ Specify environment variables to set for every hook that is run.
 Specify a file which will be streamed into stdin for every hook that is run.
 Each hook will receive the entire file from beginning to EOF.
 
+-j::
+--jobs::
+	Only valid for `run`.
++
+Specify how many hooks to run simultaneously. If this flag is not specified, use
+the value of the `hook.jobs` config. If the config is not specified, use the
+number of CPUs on the current system. Some hooks may be ineligible for
+parallelization: for example, 'commit-msg' hooks are expected to modify the
+commit message body, so they cannot be run in parallel.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index be104f2938..7fbc84ab64 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -10,7 +10,7 @@
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
 	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...]"
-	   "[--to-stdin=<path>] <hookname>"),
+	   "[--to-stdin=<path>] [(-j|--jobs) <count>] <hookname>"),
 	NULL
 };
 
@@ -90,7 +90,7 @@ static int list(int argc, const char **argv, const char *prefix)
 static int run(int argc, const char **argv, const char *prefix)
 {
 	struct strbuf hookname = STRBUF_INIT;
-	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
 	int rc = 0;
 
 	struct option run_options[] = {
@@ -100,6 +100,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("argument to pass to hook")),
 		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
 			   N_("file to read into hooks' stdin")),
+		OPT_INTEGER('j', "jobs", &opt.jobs,
+			    N_("run up to <n> hooks simultaneously")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index ce5c443206..b190afa33b 100644
--- a/hook.c
+++ b/hook.c
@@ -136,6 +136,14 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return HOOKDIR_UNKNOWN;
 }
 
+int configured_hook_jobs(void)
+{
+	int n = online_cpus();
+	git_config_get_int("hook.jobs", &n);
+
+	return n;
+}
+
 static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
 {
 	struct strbuf prompt = STRBUF_INIT;
@@ -223,6 +231,7 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	strvec_init(&o->env);
 	strvec_init(&o->args);
 	o->run_hookdir = configured_hookdir_opt();
+	o->jobs = configured_hook_jobs();
 }
 
 int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
@@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 	strvec_clear(&o->args);
 }
 
+
+static int pick_next_hook(struct child_process *cp,
+			  struct strbuf *out,
+			  void *pp_cb,
+			  void **pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
+
+	if (hook_cb->head == hook_cb->run_me)
+		return 0;
+
+	cp->env = hook_cb->options->env.v;
+	cp->stdout_to_stderr = 1;
+	cp->trace2_hook_name = hook->command.buf;
+
+	/* reopen the file for stdin; run_command closes it. */
+	if (hook_cb->options->path_to_stdin) {
+		cp->no_stdin = 0;
+		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else {
+		cp->no_stdin = 1;
+	}
+
+	/*
+	 * Commands from the config could be oneliners, but we know
+	 * for certain that hookdir commands are not.
+	 */
+	if (hook->from_hookdir)
+		cp->use_shell = 0;
+	else
+		cp->use_shell = 1;
+
+	/* add command */
+	strvec_push(&cp->args, hook->command.buf);
+
+	/*
+	 * add passed-in argv, without expanding - let the user get back
+	 * exactly what they put in
+	 */
+	strvec_pushv(&cp->args, hook_cb->options->args.v);
+
+	/* Provide context for errors if necessary */
+	*pp_task_cb = hook;
+
+	/* Get the next entry ready */
+	hook_cb->run_me = hook_cb->run_me->next;
+
+	return 1;
+}
+
+static int notify_start_failure(struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cp)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+	struct hook *attempted = pp_task_cp;
+
+	/* |= rc in cb */
+	hook_cb->rc |= 1;
+
+	strbuf_addf(out, _("Couldn't start '%s', configured in '%s'\n"),
+		    attempted->command.buf,
+		    attempted->from_hookdir ? "hookdir"
+		    	: config_scope_name(attempted->origin));
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
+static int notify_hook_finished(int result,
+				struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	/* |= rc in cb */
+	hook_cb->rc |= result;
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
 int run_hooks(const char *hookname, struct run_hooks_opt *options)
 {
 	struct strbuf hookname_str = STRBUF_INIT;
 	struct list_head *to_run, *pos = NULL, *tmp = NULL;
-	int rc = 0;
+	struct hook_cb_data cb_data = { 0, NULL, NULL, options };
 
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
@@ -260,41 +354,23 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	to_run = hook_list(&hookname_str);
 
 	list_for_each_safe(pos, tmp, to_run) {
-		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
-		/* reopen the file for stdin; run_command closes it. */
-		if (options->path_to_stdin)
-			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
-		else
-			hook_proc.no_stdin = 1;
-
-		hook_proc.env = options->env.v;
-		hook_proc.stdout_to_stderr = 1;
-		hook_proc.trace2_hook_name = hook->command.buf;
-		hook_proc.use_shell = 1;
-
-		if (hook->from_hookdir) {
-		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
-			continue;
-		    /*
-		     * Commands from the config could be oneliners, but we know
-		     * for certain that hookdir commands are not.
-		     */
-		    hook_proc.use_shell = 0;
-		}
-
-		/* add command */
-		strvec_push(&hook_proc.args, hook->command.buf);
+		if (hook->from_hookdir &&
+		    !should_include_hookdir(hook->command.buf, options->run_hookdir))
+			    list_del(pos);
+	}
 
-		/*
-		 * add passed-in argv, without expanding - let the user get back
-		 * exactly what they put in
-		 */
-		strvec_pushv(&hook_proc.args, options->args.v);
+	cb_data.head = to_run;
+	cb_data.run_me = to_run->next;
 
-		rc |= run_command(&hook_proc);
-	}
+	run_processes_parallel_tr2(options->jobs,
+				   pick_next_hook,
+				   notify_start_failure,
+				   notify_hook_finished,
+				   &cb_data,
+				   "hook",
+				   hookname);
 
-	return rc;
+	return cb_data.rc;
 }
diff --git a/hook.h b/hook.h
index e22a6db832..0d973d090f 100644
--- a/hook.h
+++ b/hook.h
@@ -37,6 +37,9 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+/* Provides the number of jobs to use for parallel hook execution. */
+int configured_hook_jobs(void);
+
 struct run_hooks_opt
 {
 	/* Environment vars to be set for each hook */
@@ -54,15 +57,38 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+
+	/* Number of threads to parallelize across */
+	int jobs;
 };
 
-#define RUN_HOOKS_OPT_INIT  {   		\
+/*
+ * Callback provided to feed_pipe_fn and consume_sideband_fn.
+ */
+struct hook_cb_data {
+	int rc;
+	struct list_head *head;
+	struct list_head *run_me;
+	struct run_hooks_opt *options;
+};
+
+#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
 	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
+	.jobs = 1,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
+#define RUN_HOOKS_OPT_INIT_ASYNC {		\
+	.env = STRVEC_INIT, 			\
+	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
+	.jobs = configured_hook_jobs(),		\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v7 13/17] hook: allow specifying working directory for hooks
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (11 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
                             ` (7 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Hooks like "post-checkout" need to run in a different working directory
than the initial process. Pass that directory straight through to struct
child_process.

Because we can just run 'git -C <some-dir> hook run ...' it shouldn't be
necessary to pipe this option through the frontend. In fact, this
reduces the possibility of users running hooks which affect some part of
the filesystem outside of the repo in question.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Needed later for "post-checkout" conversion.

 hook.c | 1 +
 hook.h | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/hook.c b/hook.c
index b190afa33b..eea90ec1d0 100644
--- a/hook.c
+++ b/hook.c
@@ -271,6 +271,7 @@ static int pick_next_hook(struct child_process *cp,
 	cp->env = hook_cb->options->env.v;
 	cp->stdout_to_stderr = 1;
 	cp->trace2_hook_name = hook->command.buf;
+	cp->dir = hook_cb->options->dir;
 
 	/* reopen the file for stdin; run_command closes it. */
 	if (hook_cb->options->path_to_stdin) {
diff --git a/hook.h b/hook.h
index 0d973d090f..8a7542610c 100644
--- a/hook.h
+++ b/hook.h
@@ -60,6 +60,9 @@ struct run_hooks_opt
 
 	/* Number of threads to parallelize across */
 	int jobs;
+
+	/* Path to initial working directory for subprocess */
+	const char *dir;
 };
 
 /*
@@ -77,6 +80,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -85,6 +89,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v7 14/17] run-command: add stdin callback for parallelization
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (12 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 13/17] hook: allow specifying working directory for hooks Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-02-01  6:51             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
                             ` (6 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user of the run_processes_parallel() API wants to pipe a large
amount of information to stdin of each parallel command, that
information could exceed the buffer of the pipe allocated for that
process's stdin.  Generally this is solved by repeatedly writing to
child_process.in between calls to start_command() and finish_command();
run_processes_parallel() did not provide users an opportunity to access
child_process at that time.

Because the data might be extremely large (for example, a list of all
refs received during a push from a client) simply taking a string_list
or strbuf is not as scalable as using a callback; the rest of the
run_processes_parallel() API also uses callbacks, so making this feature
match the rest of the API reduces mental load on the user.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/fetch.c             |  1 +
 builtin/submodule--helper.c |  2 +-
 hook.c                      |  1 +
 run-command.c               | 54 +++++++++++++++++++++++++++++++++++--
 run-command.h               | 17 +++++++++++-
 submodule.c                 |  1 +
 t/helper/test-run-command.c | 31 ++++++++++++++++++---
 t/t0061-run-command.sh      | 30 +++++++++++++++++++++
 8 files changed, 129 insertions(+), 8 deletions(-)
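
The callback contract described in the commit message can be sketched in
isolation as below. This is a simplified illustration, not the code in
run-command.c: it uses a fixed char buffer instead of a strbuf, plain POSIX
write() instead of write_in_full(), and hypothetical sketch_* names. Each pass
asks the callback for one chunk, writes it to the child's stdin descriptor,
and reports whether the callback is finished so that the caller can close the
descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical counterpart of feed_pipe_fn: fill 'buf' with up to 'cap' bytes,
 * store the byte count in '*len', and return nonzero once there is no more.
 */
typedef int (*sketch_feed_fn)(char *buf, size_t cap, size_t *len, void *task);

/* One pass for one child: returns -1 on error, else the callback's "done". */
static int sketch_feed_once(int fd, sketch_feed_fn feed, void *task)
{
	char buf[256];
	size_t len = 0, off = 0;
	int done;

	assert(feed);
	done = feed(buf, sizeof(buf), &len, task);

	/* write() may be partial, so loop until the whole chunk is out. */
	while (off < len) {
		ssize_t n = write(fd, buf + off, len - off);
		if (n < 0)
			return -1;
		off += (size_t)n;
	}
	return done;
}

/* Example feeder modelled on test_stdin(): counts down, one line per call. */
static int sketch_countdown(char *buf, size_t cap, size_t *len, void *task)
{
	int *lines_remaining = task;

	*len = 0;
	if (*lines_remaining > 0) {
		int n = snprintf(buf, cap, "sample stdin %d\n",
				 --(*lines_remaining));
		if (n > 0 && (size_t)n < cap)
			*len = (size_t)n;
	}
	return *lines_remaining == 0;
}
```

A caller would invoke sketch_feed_once() once per scheduling pass and close
the descriptor when it returns nonzero, which is the role pp_buffer_stdin()
plays in the patch.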

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..5e153b5193 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,6 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
+						    NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c30896c897..bb623c1852 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure,
+				   update_clone_start_failure, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/hook.c b/hook.c
index eea90ec1d0..312ede1251 100644
--- a/hook.c
+++ b/hook.c
@@ -368,6 +368,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	run_processes_parallel_tr2(options->jobs,
 				   pick_next_hook,
 				   notify_start_failure,
+				   NULL,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/run-command.c b/run-command.c
index 80c8c97bc1..7b65c087f8 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1548,6 +1548,7 @@ struct parallel_processes {
 
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
+	feed_pipe_fn feed_pipe;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1575,6 +1576,13 @@ static int default_start_failure(struct strbuf *out,
 	return 0;
 }
 
+static int default_feed_pipe(struct strbuf *pipe,
+			     void *pp_cb,
+			     void *pp_task_cb)
+{
+	return 1;
+}
+
 static int default_task_finished(int result,
 				 struct strbuf *out,
 				 void *pp_cb,
@@ -1605,6 +1613,7 @@ static void pp_init(struct parallel_processes *pp,
 		    int n,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
+		    feed_pipe_fn feed_pipe,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1623,6 +1632,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->get_next_task = get_next_task;
 
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
+	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
 
 	pp->nr_processes = 0;
@@ -1715,6 +1725,37 @@ static int pp_start_one(struct parallel_processes *pp)
 	return 0;
 }
 
+static void pp_buffer_stdin(struct parallel_processes *pp)
+{
+	int i;
+	struct strbuf sb = STRBUF_INIT;
+
+	/* Buffer stdin for each pipe. */
+	for (i = 0; i < pp->max_processes; i++) {
+		if (pp->children[i].state == GIT_CP_WORKING &&
+		    pp->children[i].process.in > 0) {
+			int done;
+			strbuf_reset(&sb);
+			done = pp->feed_pipe(&sb, pp->data,
+					      pp->children[i].data);
+			if (sb.len) {
+				if (write_in_full(pp->children[i].process.in,
+					      sb.buf, sb.len) < 0) {
+					if (errno != EPIPE)
+						die_errno("write");
+					done = 1;
+				}
+			}
+			if (done) {
+				close(pp->children[i].process.in);
+				pp->children[i].process.in = 0;
+			}
+		}
+	}
+
+	strbuf_release(&sb);
+}
+
 static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 {
 	int i;
@@ -1779,6 +1820,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 		pp->nr_processes--;
 		pp->children[i].state = GIT_CP_FREE;
 		pp->pfd[i].fd = -1;
+		pp->children[i].process.in = 0;
 		child_process_init(&pp->children[i].process);
 
 		if (i != pp->output_owner) {
@@ -1812,6 +1854,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
+			   feed_pipe_fn feed_pipe,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1820,7 +1863,9 @@ int run_processes_parallel(int n,
 	int spawn_cap = 4;
 	struct parallel_processes pp;
 
-	pp_init(&pp, n, get_next_task, start_failure, task_finished, pp_cb);
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1837,6 +1882,7 @@ int run_processes_parallel(int n,
 		}
 		if (!pp.nr_processes)
 			break;
+		pp_buffer_stdin(&pp);
 		pp_buffer_stderr(&pp, output_timeout);
 		pp_output(&pp);
 		code = pp_collect_finished(&pp);
@@ -1848,11 +1894,15 @@ int run_processes_parallel(int n,
 	}
 
 	pp_cleanup(&pp);
+
+	sigchain_pop(SIGPIPE);
+
 	return 0;
 }
 
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
+			       feed_pipe_fn feed_pipe,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1862,7 +1912,7 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					task_finished, pp_cb);
+					feed_pipe, task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index 6472b38bde..e058c0e2c8 100644
--- a/run-command.h
+++ b/run-command.h
@@ -436,6 +436,20 @@ typedef int (*start_failure_fn)(struct strbuf *out,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * This callback is called repeatedly for every child process that asks
+ * start_command() to create a pipe by setting child_process.in < 0.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel, and
+ * pp_task_cb is the callback cookie as passed into get_next_task_fn.
+ * The contents of 'pipe' will be written to the child's stdin.
+ *
+ * Return nonzero to close the pipe.
+ */
+typedef int (*feed_pipe_fn)(struct strbuf *pipe,
+			    void *pp_cb,
+			    void *pp_task_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -470,10 +484,11 @@ typedef int (*task_finished_fn)(int result,
 int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
+			   feed_pipe_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index b3bb59f066..953f41818c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,6 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
+				   NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 7ae03dc712..9348184d30 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -32,8 +32,13 @@ static int parallel_next(struct child_process *cp,
 		return 0;
 
 	strvec_pushv(&cp->args, d->argv);
+	cp->in = d->in;
+	cp->no_stdin = d->no_stdin;
 	strbuf_addstr(err, "preloaded output of a child\n");
 	number_callbacks++;
+
+	*task_cb = xmalloc(sizeof(int));
+	*(int*)(*task_cb) = 2;
 	return 1;
 }
 
@@ -55,6 +60,17 @@ static int task_finished(int result,
 	return 1;
 }
 
+static int test_stdin(struct strbuf *pipe, void *cb, void *task_cb)
+{
+	int *lines_remaining = task_cb;
+
+	if (*lines_remaining)
+		strbuf_addf(pipe, "sample stdin %d\n", --(*lines_remaining));
+
+	return !(*lines_remaining);
+}
+
+
 struct testsuite {
 	struct string_list tests, failed;
 	int next;
@@ -185,7 +201,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_finished, &suite);
+				     test_stdin, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -413,15 +429,22 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, &proc));
+					    NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
+
+	if (!strcmp(argv[1], "run-command-stdin")) {
+		proc.in = -1;
+		proc.no_stdin = 0;
+		exit (run_processes_parallel(jobs, parallel_next, NULL,
+					     test_stdin, NULL, &proc));
+	}
 
 	fprintf(stderr, "check usage\n");
 	return 1;
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 7d599675e3..87759482ad 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,36 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+cat >expect <<-EOF
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+EOF
+
+test_expect_success 'run_command listens to stdin' '
+	write_script stdin-script <<-\EOF &&
+	echo "listening for stdin:"
+	while read line; do
+		echo "$line"
+	done
+	EOF
+	test-tool run-command run-command-stdin 2 ./stdin-script 2>actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 asking for a quick stop
-- 
2.28.0.rc0.142.g3c755180ce-goog


* [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (13 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-02-01  7:04             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 16/17] run-command: allow capturing of collated output Emily Shaffer
                             ` (5 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In cases where a hook requires only a small amount of information via
stdin, it should be simple for users to provide a string_list alone. But
in more complicated cases where the stdin is too large to hold in
memory, let's provide a callback which users can use to supply stdin
line by line instead.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 hook.c | 39 ++++++++++++++++++++++++++++++++++++++-
 hook.h | 25 +++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 1 deletion(-)
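
The string_list case can be pictured with the standalone sketch below. It is
an approximation under stated assumptions: a plain array replaces
string_list, a char buffer replaces the strbuf, and the sketch_* names are
hypothetical. Each hook lazily allocates its own index on first use, so every
hook receives the whole list from the start, one item per call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a string_list. */
struct sketch_list {
	const char **items;
	size_t nr;
};

/* Hypothetical stand-in for the per-hook state kept in feed_pipe_cb_data. */
struct sketch_hook {
	size_t *item_idx;   /* NULL until the first call for this hook */
};

/*
 * Writes the next item plus a newline into 'buf' and returns 0, or leaves
 * 'buf' empty and returns 1 once this hook has been given every item.
 */
static int sketch_pipe_from_list(char *buf, size_t cap,
				 const struct sketch_list *list,
				 struct sketch_hook *hook)
{
	assert(buf && cap && list && hook);
	buf[0] = '\0';

	/* Bootstrap the per-hook position on first use. */
	if (!hook->item_idx) {
		hook->item_idx = malloc(sizeof(*hook->item_idx));
		if (!hook->item_idx)
			abort();
		*hook->item_idx = 0;
	}

	if (*hook->item_idx < list->nr) {
		snprintf(buf, cap, "%s\n", list->items[*hook->item_idx]);
		(*hook->item_idx)++;
		return 0;
	}
	return 1;
}
```

Keeping the index with the hook rather than with the list is what lets
several hooks consume the same list independently; the caller frees the index
when the hook is released, as free_hook() does in the patch.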

diff --git a/hook.c b/hook.c
index 312ede1251..b63a34d0a6 100644
--- a/hook.c
+++ b/hook.c
@@ -9,6 +9,7 @@ void free_hook(struct hook *ptr)
 {
 	if (ptr) {
 		strbuf_release(&ptr->command);
+		free(ptr->feed_pipe_cb_data);
 		free(ptr);
 	}
 }
@@ -38,6 +39,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
 		to_add->from_hookdir = 0;
+		to_add->feed_pipe_cb_data = NULL;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -253,9 +255,32 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
 	strvec_clear(&o->args);
+	string_list_clear(&o->str_stdin, 0);
 }
 
 
+static int pipe_from_string_list(struct strbuf *pipe, void *pp_cb, void *pp_task_cb)
+{
+	int *item_idx;
+	struct hook *ctx = pp_task_cb;
+	struct string_list *to_pipe = &((struct hook_cb_data*)pp_cb)->options->str_stdin;
+
+	/* Bootstrap the state manager if necessary. */
+	if (!ctx->feed_pipe_cb_data) {
+		ctx->feed_pipe_cb_data = xmalloc(sizeof(unsigned int));
+		*(int*)ctx->feed_pipe_cb_data = 0;
+	}
+
+	item_idx = ctx->feed_pipe_cb_data;
+
+	if (*item_idx < to_pipe->nr) {
+		strbuf_addf(pipe, "%s\n", to_pipe->items[*item_idx].string);
+		(*item_idx)++;
+		return 0;
+	}
+	return 1;
+}
+
 static int pick_next_hook(struct child_process *cp,
 			  struct strbuf *out,
 			  void *pp_cb,
@@ -277,6 +302,10 @@ static int pick_next_hook(struct child_process *cp,
 	if (hook_cb->options->path_to_stdin) {
 		cp->no_stdin = 0;
 		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else if (hook_cb->options->feed_pipe) {
+		/* ask for start_command() to make a pipe for us */
+		cp->in = -1;
+		cp->no_stdin = 0;
 	} else {
 		cp->no_stdin = 1;
 	}
@@ -350,6 +379,14 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if ((options->path_to_stdin && options->str_stdin.nr) ||
+	    (options->path_to_stdin && options->feed_pipe) ||
+	    (options->str_stdin.nr && options->feed_pipe))
+		BUG("choose only one method to populate stdin");
+
+	if (options->str_stdin.nr)
+		options->feed_pipe = &pipe_from_string_list;
+
 	strbuf_addstr(&hookname_str, hookname);
 
 	to_run = hook_list(&hookname_str);
@@ -368,7 +405,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	run_processes_parallel_tr2(options->jobs,
 				   pick_next_hook,
 				   notify_start_failure,
-				   NULL,
+				   options->feed_pipe,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index 8a7542610c..0ac83fa7ca 100644
--- a/hook.h
+++ b/hook.h
@@ -2,6 +2,7 @@
 #include "list.h"
 #include "strbuf.h"
 #include "strvec.h"
+#include "run-command.h"
 
 struct hook
 {
@@ -14,6 +15,12 @@ struct hook
 	/* The literal command to run. */
 	struct strbuf command;
 	int from_hookdir;
+
+	/*
+	 * Use this to keep state for your feed_pipe_fn if you are using
+	 * run_hooks_opt.feed_pipe. Otherwise, do not touch it.
+	 */
+	void *feed_pipe_cb_data;
 };
 
 /*
@@ -57,12 +64,24 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+	/* Pipe each string to stdin, separated by newlines */
+	struct string_list str_stdin;
+	/*
+	 * Callback and state pointer to ask for more content to pipe to stdin.
+	 * Will be called repeatedly, for each hook. See
+	 * hook.c:pipe_from_string_list() for an example. Keep per-hook state in
+	 * hook.feed_pipe_cb_data (per process). Keep initialization context in
+	 * feed_pipe_ctx (shared by all processes).
+	 */
+	feed_pipe_fn feed_pipe;
+	void *feed_pipe_ctx;
 
 	/* Number of threads to parallelize across */
 	int jobs;
 
 	/* Path to initial working directory for subprocess */
 	const char *dir;
+
 };
 
 /*
@@ -81,6 +100,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -90,6 +112,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 16/17] run-command: allow capturing of collated output
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (14 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 17/17] hooks: allow callers to capture output Emily Shaffer
                             ` (4 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some callers, for example server-side hooks which wish to relay hook
output to clients across a transport, want to capture what would
normally print to stderr and do something else with it. Allow that via a
callback.

By calling the callback regardless of whether there's output available,
we allow clients to send e.g. a keepalive if necessary.

Because we expose a strbuf, not a fd or FILE*, there's no need to create
a temporary pipe or similar - we can just skip the print to stderr and
instead hand it to the caller.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Originally when writing this patch I attempted to use a pipe in memory -
    but managing its lifetime was actually pretty tricky, and I found I could
    achieve the same thing with less code by doing it this way. Critique welcome,
    including "no, you really need to do it with a pipe".
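
A minimal standalone sketch of the dispatch this patch adds at each output
site: if a consumer callback is set, the buffered output is handed to it;
otherwise it is written to the fallback stream (stderr in the patch). It
uses plain char buffers rather than Git's strbuf, and the names consume_fn,
capture, capture_output and emit_output are hypothetical, chosen for
illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type, modelled on consume_sideband_fn. */
typedef void (*consume_fn)(const char *output, size_t len, void *cb_data);

/* Hypothetical capture buffer a caller might pass as its cookie. */
struct capture {
	char data[256];
	size_t len;
	unsigned calls;
};

/* Example consumer: append output to the buffer and count every call. */
void capture_output(const char *output, size_t len, void *cb_data)
{
	struct capture *c = cb_data;
	size_t room = sizeof(c->data) - 1 - c->len;

	/* Truncate rather than overflow the fixed-size buffer. */
	if (len > room)
		len = room;
	memcpy(c->data + c->len, output, len);
	c->len += len;
	c->data[c->len] = '\0';
	c->calls++;
}

/*
 * Hand output to the consumer if one is set; otherwise write it to
 * 'fallback'.
 */
void emit_output(const char *output, size_t len,
		 consume_fn consume, void *cb_data, FILE *fallback)
{
	if (consume)
		consume(output, len, cb_data);
	else
		fwrite(output, 1, len, fallback);
}
```

A consumer written this way still gets a call when 'len' is zero, which is
the point at which a caller relaying output over a transport could send a
keepalive.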

 builtin/fetch.c             |  2 +-
 builtin/submodule--helper.c |  2 +-
 hook.c                      |  1 +
 run-command.c               | 33 +++++++++++++++++++++++++--------
 run-command.h               | 18 +++++++++++++++++-
 submodule.c                 |  2 +-
 t/helper/test-run-command.c | 25 ++++++++++++++++++++-----
 t/t0061-run-command.sh      |  7 +++++++
 8 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5e153b5193..6a634085d9 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,7 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
-						    NULL,
+						    NULL, NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index bb623c1852..8c543d33fd 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure, NULL,
+				   update_clone_start_failure, NULL, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/hook.c b/hook.c
index b63a34d0a6..1439322a29 100644
--- a/hook.c
+++ b/hook.c
@@ -406,6 +406,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
+				   NULL,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/run-command.c b/run-command.c
index 7b65c087f8..0dce6bec83 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1549,6 +1549,7 @@ struct parallel_processes {
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
 	feed_pipe_fn feed_pipe;
+	consume_sideband_fn consume_sideband;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1614,6 +1615,7 @@ static void pp_init(struct parallel_processes *pp,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
 		    feed_pipe_fn feed_pipe,
+		    consume_sideband_fn consume_sideband,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1634,6 +1636,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
 	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
+	pp->consume_sideband = consume_sideband;
 
 	pp->nr_processes = 0;
 	pp->output_owner = 0;
@@ -1670,7 +1673,10 @@ static void pp_cleanup(struct parallel_processes *pp)
 	 * When get_next_task added messages to the buffer in its last
 	 * iteration, the buffered output is non empty.
 	 */
-	strbuf_write(&pp->buffered_output, stderr);
+	if (pp->consume_sideband)
+		pp->consume_sideband(&pp->buffered_output, pp->data);
+	else
+		strbuf_write(&pp->buffered_output, stderr);
 	strbuf_release(&pp->buffered_output);
 
 	sigchain_pop_common();
@@ -1786,9 +1792,13 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 static void pp_output(struct parallel_processes *pp)
 {
 	int i = pp->output_owner;
+
 	if (pp->children[i].state == GIT_CP_WORKING &&
 	    pp->children[i].err.len) {
-		strbuf_write(&pp->children[i].err, stderr);
+		if (pp->consume_sideband)
+			pp->consume_sideband(&pp->children[i].err, pp->data);
+		else
+			strbuf_write(&pp->children[i].err, stderr);
 		strbuf_reset(&pp->children[i].err);
 	}
 }
@@ -1827,11 +1837,15 @@ static int pp_collect_finished(struct parallel_processes *pp)
 			strbuf_addbuf(&pp->buffered_output, &pp->children[i].err);
 			strbuf_reset(&pp->children[i].err);
 		} else {
-			strbuf_write(&pp->children[i].err, stderr);
+			/* Output errors, then all other finished child processes */
+			if (pp->consume_sideband) {
+				pp->consume_sideband(&pp->children[i].err, pp->data);
+				pp->consume_sideband(&pp->buffered_output, pp->data);
+			} else {
+				strbuf_write(&pp->children[i].err, stderr);
+				strbuf_write(&pp->buffered_output, stderr);
+			}
 			strbuf_reset(&pp->children[i].err);
-
-			/* Output all other finished child processes */
-			strbuf_write(&pp->buffered_output, stderr);
 			strbuf_reset(&pp->buffered_output);
 
 			/*
@@ -1855,6 +1869,7 @@ int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
 			   feed_pipe_fn feed_pipe,
+			   consume_sideband_fn consume_sideband,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1865,7 +1880,7 @@ int run_processes_parallel(int n,
 
 	sigchain_push(SIGPIPE, SIG_IGN);
 
-	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, consume_sideband, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1903,6 +1918,7 @@ int run_processes_parallel(int n,
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
 			       feed_pipe_fn feed_pipe,
+			       consume_sideband_fn consume_sideband,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1912,7 +1928,8 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					feed_pipe, task_finished, pp_cb);
+					feed_pipe, consume_sideband,
+					task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index e058c0e2c8..2ad8271f56 100644
--- a/run-command.h
+++ b/run-command.h
@@ -450,6 +450,20 @@ typedef int (*feed_pipe_fn)(struct strbuf *pipe,
 			    void *pp_cb,
 			    void *pp_task_cb);
 
+/**
+ * If this callback is provided, process output is not collated to stderr;
+ * instead, it is handed to this callback. consume_sideband_fn will be called
+ * repeatedly. When output is available, it will be contained in
+ * 'output'. But it will be called with an empty 'output' too, to allow for
+ * keepalives or similar operations if necessary.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel.
+ *
+ * Since this callback is provided with the collated output, no task cookie is
+ * provided.
+ */
+typedef void (*consume_sideband_fn)(struct strbuf *output, void *pp_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -485,10 +499,12 @@ int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
 			   feed_pipe_fn,
+			   consume_sideband_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       feed_pipe_fn, task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, consume_sideband_fn,
+			       task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index 953f41818c..215bff22d9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,7 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
-				   NULL,
+				   NULL, NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 9348184d30..d53db6d11c 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -51,6 +51,16 @@ static int no_job(struct child_process *cp,
 	return 0;
 }
 
+static void test_consume_sideband(struct strbuf *output, void *cb)
+{
+	FILE *sideband;
+
+	sideband = fopen("./sideband", "a");
+
+	strbuf_write(output, sideband);
+	fclose(sideband);
+}
+
 static int task_finished(int result,
 			 struct strbuf *err,
 			 void *pp_cb,
@@ -201,7 +211,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_stdin, test_finished, &suite);
+				     test_stdin, NULL, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -429,23 +439,28 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, NULL, &proc));
+					    NULL, NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-stdin")) {
 		proc.in = -1;
 		proc.no_stdin = 0;
 		exit (run_processes_parallel(jobs, parallel_next, NULL,
-					     test_stdin, NULL, &proc));
+					     test_stdin, NULL, NULL, &proc));
 	}
 
+	if (!strcmp(argv[1], "run-command-sideband"))
+		exit(run_processes_parallel(jobs, parallel_next, NULL, NULL,
+					    test_consume_sideband, NULL,
+					    &proc));
+
 	fprintf(stderr, "check usage\n");
 	return 1;
 }
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 87759482ad..e99f6c7f44 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,13 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command can divert output' '
+	test_when_finished rm sideband &&
+	test-tool run-command run-command-sideband 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
+	test_must_be_empty actual &&
+	test_cmp expect sideband
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 listening for stdin:
-- 
2.28.0.rc0.142.g3c755180ce-goog



* [PATCH v7 17/17] hooks: allow callers to capture output
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (15 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 16/17] run-command: allow capturing of collated output Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
                             ` (3 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some server-side hooks will require capturing output to send over
sideband instead of printing directly to stderr. Expose that capability.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    You can see this in practice in the conversions for some of the push hooks,
    like 'receive-pack'.

 hook.c |  2 +-
 hook.h | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hook.c b/hook.c
index 1439322a29..dc241f7ec5 100644
--- a/hook.c
+++ b/hook.c
@@ -406,7 +406,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
-				   NULL,
+				   options->consume_sideband,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index 0ac83fa7ca..1e92379bb8 100644
--- a/hook.h
+++ b/hook.h
@@ -76,6 +76,14 @@ struct run_hooks_opt
 	feed_pipe_fn feed_pipe;
 	void *feed_pipe_ctx;
 
+	/*
+	 * Populate this to capture output and prevent it from being printed to
+	 * stderr. This will be passed directly through to
+	 * run-command.c:run_processes_parallel(). See t/helper/test-run-command.c
+	 * for an example.
+	 */
+	consume_sideband_fn consume_sideband;
+
 	/* Number of threads to parallelize across */
 	int jobs;
 
@@ -103,6 +111,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -115,6 +124,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog



* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (16 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 17/17] hooks: allow callers to capture output Emily Shaffer
@ 2020-12-22  2:11           ` Junio C Hamano
  2020-12-28 18:34             ` Emily Shaffer
  2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
                             ` (2 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-12-22  2:11 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, James Ramsay, Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> Since v6:
>
>  - Converted 'enum hookdir_opt' to UPPER_SNAKE
>  - Coccinelle fix in the hook destructor
>  - Fixed a bug where builtin/hook.c wasn't running the default git config setup
>    and therefore missed hooks in core.hooksPath when it was set. (These hooks
>    would still run except when invoked by 'git hook run' as the config was
>    called by the processes which invoked the hook library.)

Thanks.  Queued both series (it probably is easier to think of these
as a single 34-patch series, as long as they both are in flight at
the same time).



* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
@ 2020-12-28 18:34             ` Emily Shaffer
  2020-12-28 22:50               ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-28 18:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, James Ramsay, Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

On Mon, Dec 21, 2020 at 06:11:05PM -0800, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > Since v6:
> >
> >  - Converted 'enum hookdir_opt' to UPPER_SNAKE
> >  - Coccinelle fix in the hook destructor
> >  - Fixed a bug where builtin/hook.c wasn't running the default git config setup
> >    and therefore missed hooks in core.hooksPath when it was set. (These hooks
> >    would still run except when invoked by 'git hook run' as the config was
> >    called by the processes which invoked the hook library.)
> 
> Thanks.  Queued both series (it probably is easier to think of these
> as a single 34-patch series, as long as they both are in flight at
> the same time).
> 

Do you want me to send them as a single thread for next version?


* [PATCH v3 18/17] doc: make git-hook.txt point of truth
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (17 preceding siblings ...)
  2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
@ 2020-12-28 22:37           ` Emily Shaffer
  2020-12-28 22:39             ` Emily Shaffer
  2021-01-29 23:59           ` [PATCH v7 00/17] propose config-based hooks (part I) Emily Shaffer
  2021-02-16 19:46           ` Josh Steadmon
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-28 22:37 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

By showing the list of all hooks in 'git help hook' for users to refer
to, 'git help hook' becomes a one-stop shop for hook authorship. Since
some may still have muscle memory for 'git help githooks', though,
reference the 'git hook' commands and otherwise don't remove content.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Sorry for the wonky subject. It seemed unnecessary to send the entirety
of the topic again when there were no changes. (I was able to push this
patch to my fork without -f, so indeed the rest is unchanged.)

I'd really prefer if 'git help githooks' opens the 'git-hook.txt' manpage,
but I couldn't figure out how to do that. This seemed like the next best thing
to me - users can still find information at the old manpage, but get a little
nudge towards the new manpage in case they didn't notice it.

Extremely open to other notes about the direction of these docs; I'm not
confident in anything except that it'd be annoying to have 'git help
githooks' remain unchanged.

 - Emily

 Documentation/git-hook.txt |   5 +
 Documentation/githooks.txt | 701 +------------------------------------
 2 files changed, 13 insertions(+), 693 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 01cee4ad81..cb8e383ec9 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -112,6 +112,11 @@ CONFIGURATION
 -------------
 include::config/hook.txt[]
 
+HOOKS
+-----
+
+include::native-hooks.txt[]
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index c450f7a27e..4d06369924 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -8,15 +8,20 @@ githooks - Hooks used by Git
 SYNOPSIS
 --------
 $GIT_DIR/hooks/* (or \`git config core.hooksPath`/*)
+[verse]
+'git hook' list <hook-name>
+'git hook' run <hook-name>
 
 
 DESCRIPTION
 -----------
 
-Hooks are programs you can place in a hooks directory to trigger
-actions at certain points in git's execution. Hooks that don't have
+Hooks are programs you can place in a hooks directory or specify in the config
+to trigger actions at certain points in Git's execution. Hooks that don't have
 the executable bit set are ignored.
 
+For information about specifying hooks in the config, check linkgit:git-hook[1].
+
 By default the hooks directory is `$GIT_DIR/hooks`, but that can be
 changed via the `core.hooksPath` configuration variable (see
 linkgit:git-config[1]).
@@ -42,697 +47,7 @@ The currently supported hooks are described below.
 HOOKS
 -----
 
-applypatch-msg
-~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-am[1].  It takes a single
-parameter, the name of the file that holds the proposed commit
-log message.  Exiting with a non-zero status causes `git am` to abort
-before applying the patch.
-
-The hook is allowed to edit the message file in place, and can
-be used to normalize the message into some project standard
-format. It can also be used to refuse the commit after inspecting
-the message file.
-
-The default 'applypatch-msg' hook, when enabled, runs the
-'commit-msg' hook, if the latter is enabled.
-
-Hooks run during 'applypatch-msg' will not be parallelized.
-
-pre-applypatch
-~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-am[1].  It takes no parameter, and is
-invoked after the patch is applied, but before a commit is made.
-
-If it exits with non-zero status, then the working tree will not be
-committed after applying the patch.
-
-It can be used to inspect the current working tree and refuse to
-make a commit if it does not pass certain test.
-
-The default 'pre-applypatch' hook, when enabled, runs the
-'pre-commit' hook, if the latter is enabled.
-
-Hooks run during 'pre-applypatch' will be run in parallel by default.
-
-post-applypatch
-~~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-am[1].  It takes no parameter,
-and is invoked after the patch is applied and a commit is made.
-
-This hook is meant primarily for notification, and cannot affect
-the outcome of `git am`.
-
-Hooks run during 'post-applypatch' will be run in parallel by default.
-
-pre-commit
-~~~~~~~~~~
-
-This hook is invoked by linkgit:git-commit[1], and can be bypassed
-with the `--no-verify` option.  It takes no parameters, and is
-invoked before obtaining the proposed commit log message and
-making a commit.  Exiting with a non-zero status from this script
-causes the `git commit` command to abort before creating a commit.
-
-The default 'pre-commit' hook, when enabled, catches introduction
-of lines with trailing whitespaces and aborts the commit when
-such a line is found.
-
-All the `git commit` hooks are invoked with the environment
-variable `GIT_EDITOR=:` if the command will not bring up an editor
-to modify the commit message.
-
-The default 'pre-commit' hook, when enabled--and with the
-`hooks.allownonascii` config option unset or set to false--prevents
-the use of non-ASCII filenames.
-
-Hooks executed during 'pre-commit' will not be parallelized.
-
-pre-merge-commit
-~~~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-merge[1], and can be bypassed
-with the `--no-verify` option.  It takes no parameters, and is
-invoked after the merge has been carried out successfully and before
-obtaining the proposed commit log message to
-make a commit.  Exiting with a non-zero status from this script
-causes the `git merge` command to abort before creating a commit.
-
-The default 'pre-merge-commit' hook, when enabled, runs the
-'pre-commit' hook, if the latter is enabled.
-
-This hook is invoked with the environment variable
-`GIT_EDITOR=:` if the command will not bring up an editor
-to modify the commit message.
-
-If the merge cannot be carried out automatically, the conflicts
-need to be resolved and the result committed separately (see
-linkgit:git-merge[1]). At that point, this hook will not be executed,
-but the 'pre-commit' hook will, if it is enabled.
-
-Hooks executed during 'pre-merge-commit' will not be parallelized.
-
-prepare-commit-msg
-~~~~~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-commit[1] right after preparing the
-default log message, and before the editor is started.
-
-It takes one to three parameters.  The first is the name of the file
-that contains the commit log message.  The second is the source of the commit
-message, and can be: `message` (if a `-m` or `-F` option was
-given); `template` (if a `-t` option was given or the
-configuration option `commit.template` is set); `merge` (if the
-commit is a merge or a `.git/MERGE_MSG` file exists); `squash`
-(if a `.git/SQUASH_MSG` file exists); or `commit`, followed by
-a commit SHA-1 (if a `-c`, `-C` or `--amend` option was given).
-
-If the exit status is non-zero, `git commit` will abort.
-
-The purpose of the hook is to edit the message file in place, and
-it is not suppressed by the `--no-verify` option.  A non-zero exit
-means a failure of the hook and aborts the commit.  It should not
-be used as replacement for pre-commit hook.
-
-The sample `prepare-commit-msg` hook that comes with Git removes the
-help message found in the commented portion of the commit template.
-
-Hooks executed during 'prepare-commit-msg' will not be parallelized.
-
-commit-msg
-~~~~~~~~~~
-
-This hook is invoked by linkgit:git-commit[1] and linkgit:git-merge[1], and can be
-bypassed with the `--no-verify` option.  It takes a single parameter,
-the name of the file that holds the proposed commit log message.
-Exiting with a non-zero status causes the command to abort.
-
-The hook is allowed to edit the message file in place, and can be used
-to normalize the message into some project standard format. It
-can also be used to refuse the commit after inspecting the message
-file.
-
-The default 'commit-msg' hook, when enabled, detects duplicate
-`Signed-off-by` trailers, and aborts the commit if one is found.
-
-Hooks executed during 'commit-msg' will not be parallelized.
-
-post-commit
-~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-commit[1]. It takes no parameters, and is
-invoked after a commit is made.
-
-This hook is meant primarily for notification, and cannot affect
-the outcome of `git commit`.
-
-Hooks executed during 'post-commit' will run in parallel by default.
-
-pre-rebase
-~~~~~~~~~~
-
-This hook is called by linkgit:git-rebase[1] and can be used to prevent a
-branch from getting rebased.  The hook may be called with one or
-two parameters.  The first parameter is the upstream from which
-the series was forked.  The second parameter is the branch being
-rebased, and is not set when rebasing the current branch.
-
-Hooks executed during 'pre-rebase' will run in parallel by default.
-
-post-checkout
-~~~~~~~~~~~~~
-
-This hook is invoked when a linkgit:git-checkout[1] or
-linkgit:git-switch[1] is run after having updated the
-worktree.  The hook is given three parameters: the ref of the previous HEAD,
-the ref of the new HEAD (which may or may not have changed), and a flag
-indicating whether the checkout was a branch checkout (changing branches,
-flag=1) or a file checkout (retrieving a file from the index, flag=0).
-This hook cannot affect the outcome of `git switch` or `git checkout`,
-other than that the hook's exit status becomes the exit status of
-these two commands.
-
-It is also run after linkgit:git-clone[1], unless the `--no-checkout` (`-n`) option is
-used. The first parameter given to the hook is the null-ref, the second the
-ref of the new HEAD and the flag is always 1. Likewise for `git worktree add`
-unless `--no-checkout` is used.
-
-This hook can be used to perform repository validity checks, auto-display
-differences from the previous HEAD if different, or set working dir metadata
-properties.
-
-Hooks executed during 'post-checkout' will not be parallelized.
-
-post-merge
-~~~~~~~~~~
-
-This hook is invoked by linkgit:git-merge[1], which happens when a `git pull`
-is done on a local repository.  The hook takes a single parameter, a status
-flag specifying whether or not the merge being done was a squash merge.
-This hook cannot affect the outcome of `git merge` and is not executed,
-if the merge failed due to conflicts.
-
-This hook can be used in conjunction with a corresponding pre-commit hook to
-save and restore any form of metadata associated with the working tree
-(e.g.: permissions/ownership, ACLS, etc).  See contrib/hooks/setgitperms.perl
-for an example of how to do this.
-
-Hooks executed during 'post-merge' will run in parallel by default.
-
-pre-push
-~~~~~~~~
-
-This hook is called by linkgit:git-push[1] and can be used to prevent
-a push from taking place.  The hook is called with two parameters
-which provide the name and location of the destination remote, if a
-named remote is not being used both values will be the same.
-
-Information about what is to be pushed is provided on the hook's standard
-input with lines of the form:
-
-  <local ref> SP <local sha1> SP <remote ref> SP <remote sha1> LF
-
-For instance, if the command +git push origin master:foreign+ were run the
-hook would receive a line like the following:
-
-  refs/heads/master 67890 refs/heads/foreign 12345
-
-although the full, 40-character SHA-1s would be supplied.  If the foreign ref
-does not yet exist the `<remote SHA-1>` will be 40 `0`.  If a ref is to be
-deleted, the `<local ref>` will be supplied as `(delete)` and the `<local
-SHA-1>` will be 40 `0`.  If the local commit was specified by something other
-than a name which could be expanded (such as `HEAD~`, or a SHA-1) it will be
-supplied as it was originally given.
-
-If this hook exits with a non-zero status, `git push` will abort without
-pushing anything.  Information about why the push is rejected may be sent
-to the user by writing to standard error.
-
-Hooks executed during 'pre-push' will run in parallel by default.
-
-[[pre-receive]]
-pre-receive
-~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1] when it reacts to
-`git push` and updates reference(s) in its repository.
-Just before starting to update refs on the remote repository, the
-pre-receive hook is invoked.  Its exit status determines the success
-or failure of the update.
-
-This hook executes once for the receive operation. It takes no
-arguments, but for each ref to be updated it receives on standard
-input a line of the format:
-
-  <old-value> SP <new-value> SP <ref-name> LF
-
-where `<old-value>` is the old object name stored in the ref,
-`<new-value>` is the new object name to be stored in the ref and
-`<ref-name>` is the full name of the ref.
-When creating a new ref, `<old-value>` is 40 `0`.
-
-If the hook exits with non-zero status, none of the refs will be
-updated. If the hook exits with zero, updating of individual refs can
-still be prevented by the <<update,'update'>> hook.
-
-Both standard output and standard error output are forwarded to
-`git send-pack` on the other end, so you can simply `echo` messages
-for the user.
-
-The number of push options given on the command line of
-`git push --push-option=...` can be read from the environment
-variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
-found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
-If it is negotiated to not use the push options phase, the
-environment variables will not be set. If the client selects
-to use push options, but doesn't transmit any, the count variable
-will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
-
-See the section on "Quarantine Environment" in
-linkgit:git-receive-pack[1] for some caveats.
-
-Hooks executed during 'pre-receive' will not be parallelized.
-
-[[update]]
-update
-~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1] when it reacts to
-`git push` and updates reference(s) in its repository.
-Just before updating the ref on the remote repository, the update hook
-is invoked.  Its exit status determines the success or failure of
-the ref update.
-
-The hook executes once for each ref to be updated, and takes
-three parameters:
-
- - the name of the ref being updated,
- - the old object name stored in the ref,
- - and the new object name to be stored in the ref.
-
-A zero exit from the update hook allows the ref to be updated.
-Exiting with a non-zero status prevents `git receive-pack`
-from updating that ref.
-
-This hook can be used to prevent 'forced' update on certain refs by
-making sure that the object name is a commit object that is a
-descendant of the commit object named by the old object name.
-That is, to enforce a "fast-forward only" policy.
-
-It could also be used to log the old..new status.  However, it
-does not know the entire set of branches, so it would end up
-firing one e-mail per ref when used naively, though.  The
-<<post-receive,'post-receive'>> hook is more suited to that.
-
-In an environment that restricts the users' access only to git
-commands over the wire, this hook can be used to implement access
-control without relying on filesystem ownership and group
-membership. See linkgit:git-shell[1] for how you might use the login
-shell to restrict the user's access to only git commands.
-
-Both standard output and standard error output are forwarded to
-`git send-pack` on the other end, so you can simply `echo` messages
-for the user.
-
-The default 'update' hook, when enabled--and with
-`hooks.allowunannotated` config option unset or set to false--prevents
-unannotated tags to be pushed.
-
-Hooks executed during 'update' are run in parallel by default.
-
-[[proc-receive]]
-proc-receive
-~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1].  If the server has
-set the multi-valued config variable `receive.procReceiveRefs`, and the
-commands sent to 'receive-pack' have matching reference names, these
-commands will be executed by this hook, instead of by the internal
-`execute_commands()` function.  This hook is responsible for updating
-the relevant references and reporting the results back to 'receive-pack'.
-
-This hook executes once for the receive operation.  It takes no
-arguments, but uses a pkt-line format protocol to communicate with
-'receive-pack' to read commands, push-options and send results.  In the
-following example for the protocol, the letter 'S' stands for
-'receive-pack' and the letter 'H' stands for this hook.
-
-    # Version and features negotiation.
-    S: PKT-LINE(version=1\0push-options atomic...)
-    S: flush-pkt
-    H: PKT-LINE(version=1\0push-options...)
-    H: flush-pkt
-
-    # Send commands from server to the hook.
-    S: PKT-LINE(<old-oid> <new-oid> <ref>)
-    S: ... ...
-    S: flush-pkt
-    # Send push-options only if the 'push-options' feature is enabled.
-    S: PKT-LINE(push-option)
-    S: ... ...
-    S: flush-pkt
-
-    # Receive result from the hook.
-    # OK, run this command successfully.
-    H: PKT-LINE(ok <ref>)
-    # NO, I reject it.
-    H: PKT-LINE(ng <ref> <reason>)
-    # Fall through, let 'receive-pack' to execute it.
-    H: PKT-LINE(ok <ref>)
-    H: PKT-LINE(option fall-through)
-    # OK, but has an alternate reference.  The alternate reference name
-    # and other status can be given in option directives.
-    H: PKT-LINE(ok <ref>)
-    H: PKT-LINE(option refname <refname>)
-    H: PKT-LINE(option old-oid <old-oid>)
-    H: PKT-LINE(option new-oid <new-oid>)
-    H: PKT-LINE(option forced-update)
-    H: ... ...
-    H: flush-pkt
-
-Each command for the 'proc-receive' hook may point to a pseudo-reference
-and always has a zero-old as its old-oid, while the 'proc-receive' hook
-may update an alternate reference and the alternate reference may exist
-already with a non-zero old-oid.  For this case, this hook will use
-"option" directives to report extended attributes for the reference given
-by the leading "ok" directive.
-
-The report of the commands of this hook should have the same order as
-the input.  The exit status of the 'proc-receive' hook only determines
-the success or failure of the group of commands sent to it, unless
-atomic push is in use.
-
-It is forbidden to specify more than one hook for 'proc-receive'. If a
-globally-configured 'proc-receive' must be overridden, use
-'hookcmd.<global-hook>.skip = true' to ignore it.
-
-[[post-receive]]
-post-receive
-~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1] when it reacts to
-`git push` and updates reference(s) in its repository.
-It executes on the remote repository once after all the refs have
-been updated.
-
-This hook executes once for the receive operation.  It takes no
-arguments, but gets the same information as the
-<<pre-receive,'pre-receive'>>
-hook does on its standard input.
-
-This hook does not affect the outcome of `git receive-pack`, as it
-is called after the real work is done.
-
-This supersedes the <<post-update,'post-update'>> hook in that it gets
-both old and new values of all the refs in addition to their
-names.
-
-Both standard output and standard error output are forwarded to
-`git send-pack` on the other end, so you can simply `echo` messages
-for the user.
-
-The default 'post-receive' hook is empty, but there is
-a sample script `post-receive-email` provided in the `contrib/hooks`
-directory in Git distribution, which implements sending commit
-emails.
-
-The number of push options given on the command line of
-`git push --push-option=...` can be read from the environment
-variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
-found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
-If it is negotiated to not use the push options phase, the
-environment variables will not be set. If the client selects
-to use push options, but doesn't transmit any, the count variable
-will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
-
-Hooks executed during 'post-receive' are run in parallel by default.
-
-[[post-update]]
-post-update
-~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1] when it reacts to
-`git push` and updates reference(s) in its repository.
-It executes on the remote repository once after all the refs have
-been updated.
-
-It takes a variable number of parameters, each of which is the
-name of ref that was actually updated.
-
-This hook is meant primarily for notification, and cannot affect
-the outcome of `git receive-pack`.
-
-The 'post-update' hook can tell what are the heads that were pushed,
-but it does not know what their original and updated values are,
-so it is a poor place to do log old..new. The
-<<post-receive,'post-receive'>> hook does get both original and
-updated values of the refs. You might consider it instead if you need
-them.
-
-When enabled, the default 'post-update' hook runs
-`git update-server-info` to keep the information used by dumb
-transports (e.g., HTTP) up to date.  If you are publishing
-a Git repository that is accessible via HTTP, you should
-probably enable this hook.
-
-Both standard output and standard error output are forwarded to
-`git send-pack` on the other end, so you can simply `echo` messages
-for the user.
-
-Hooks run during 'post-update' will be run in parallel by default.
-
-reference-transaction
-~~~~~~~~~~~~~~~~~~~~~
-
-This hook is invoked by any Git command that performs reference
-updates. It executes whenever a reference transaction is prepared,
-committed or aborted and may thus get called multiple times.
-
-The hook takes exactly one argument, which is the current state the
-given reference transaction is in:
-
-    - "prepared": All reference updates have been queued to the
-      transaction and references were locked on disk.
-
-    - "committed": The reference transaction was committed and all
-      references now have their respective new value.
-
-    - "aborted": The reference transaction was aborted, no changes
-      were performed and the locks have been released.
-
-For each reference update that was added to the transaction, the hook
-receives on standard input a line of the format:
-
-  <old-value> SP <new-value> SP <ref-name> LF
-
-The exit status of the hook is ignored for any state except for the
-"prepared" state. In the "prepared" state, a non-zero exit status will
-cause the transaction to be aborted. The hook will not be called with
-"aborted" state in that case.
-
-Hooks run during 'reference-transaction' will be run in parallel by default.
-
-push-to-checkout
-~~~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-receive-pack[1] when it reacts to
-`git push` and updates reference(s) in its repository, and when
-the push tries to update the branch that is currently checked out
-and the `receive.denyCurrentBranch` configuration variable is set to
-`updateInstead`.  Such a push by default is refused if the working
-tree and the index of the remote repository has any difference from
-the currently checked out commit; when both the working tree and the
-index match the current commit, they are updated to match the newly
-pushed tip of the branch.  This hook is to be used to override the
-default behaviour.
-
-The hook receives the commit with which the tip of the current
-branch is going to be updated.  It can exit with a non-zero status
-to refuse the push (when it does so, it must not modify the index or
-the working tree).  Or it can make any necessary changes to the
-working tree and to the index to bring them to the desired state
-when the tip of the current branch is updated to the new commit, and
-exit with a zero status.
-
-For example, the hook can simply run `git read-tree -u -m HEAD "$1"`
-in order to emulate `git fetch` that is run in the reverse direction
-with `git push`, as the two-tree form of `git read-tree -u -m` is
-essentially the same as `git switch` or `git checkout`
-that switches branches while
-keeping the local changes in the working tree that do not interfere
-with the difference between the branches.
-
-Hooks executed during 'push-to-checkout' will not be parallelized.
-
-pre-auto-gc
-~~~~~~~~~~~
-
-This hook is invoked by `git gc --auto` (see linkgit:git-gc[1]). It
-takes no parameter, and exiting with non-zero status from this script
-causes the `git gc --auto` to abort.
-
-Hooks run during 'pre-auto-gc' will be run in parallel by default.
-
-post-rewrite
-~~~~~~~~~~~~
-
-This hook is invoked by commands that rewrite commits
-(linkgit:git-commit[1] when called with `--amend` and
-linkgit:git-rebase[1]; however, full-history (re)writing tools like
-linkgit:git-fast-import[1] or
-https://github.com/newren/git-filter-repo[git-filter-repo] typically
-do not call it!).  Its first argument denotes the command it was
-invoked by: currently one of `amend` or `rebase`.  Further
-command-dependent arguments may be passed in the future.
-
-The hook receives a list of the rewritten commits on stdin, in the
-format
-
-  <old-sha1> SP <new-sha1> [ SP <extra-info> ] LF
-
-The 'extra-info' is again command-dependent.  If it is empty, the
-preceding SP is also omitted.  Currently, no commands pass any
-'extra-info'.
-
-The hook always runs after the automatic note copying (see
-"notes.rewrite.<command>" in linkgit:git-config[1]) has happened, and
-thus has access to these notes.
-
-Hooks run during 'post-rewrite' will be run in parallel by default.
-
-The following command-specific comments apply:
-
-rebase::
-	For the 'squash' and 'fixup' operation, all commits that were
-	squashed are listed as being rewritten to the squashed commit.
-	This means that there will be several lines sharing the same
-	'new-sha1'.
-+
-The commits are guaranteed to be listed in the order that they were
-processed by rebase.
-
-sendemail-validate
-~~~~~~~~~~~~~~~~~~
-
-This hook is invoked by linkgit:git-send-email[1].  It takes a single parameter,
-the name of the file that holds the e-mail to be sent.  Exiting with a
-non-zero status causes `git send-email` to abort before sending any
-e-mails.
-
-fsmonitor-watchman
-~~~~~~~~~~~~~~~~~~
-
-This hook is invoked when the configuration option `core.fsmonitor` is
-set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
-depending on the version of the hook to use.
-
-Version 1 takes two arguments, a version (1) and the time in elapsed
-nanoseconds since midnight, January 1, 1970.
-
-Version 2 takes two arguments, a version (2) and a token that is used
-for identifying changes since the token. For watchman this would be
-a clock id. This version must output to stdout the new token followed
-by a NUL before the list of files.
-
-The hook should output to stdout the list of all files in the working
-directory that may have changed since the requested time.  The logic
-should be inclusive so that it does not miss any potential changes.
-The paths should be relative to the root of the working directory
-and be separated by a single NUL.
-
-It is OK to include files which have not actually changed.  All changes
-including newly-created and deleted files should be included. When
-files are renamed, both the old and the new name should be included.
-
-Git will limit what files it checks for changes as well as which
-directories are checked for untracked files based on the path names
-given.
-
-An optimized way to tell git "all files have changed" is to return
-the filename `/`.
-
-The exit status determines whether git will use the data from the
-hook to limit its search.  On error, it will fall back to verifying
-all files and folders.
-
-p4-changelist
-~~~~~~~~~~~~~
-
-This hook is invoked by `git-p4 submit`.
-
-The `p4-changelist` hook is executed after the changelist
-message has been edited by the user. It can be bypassed with the
-`--no-verify` option. It takes a single parameter, the name
-of the file that holds the proposed changelist text. Exiting
-with a non-zero status causes the command to abort.
-
-The hook is allowed to edit the changelist file and can be used
-to normalize the text into some project standard format. It can
-also be used to refuse the Submit after inspect the message file.
-
-Run `git-p4 submit --help` for details.
-
-p4-prepare-changelist
-~~~~~~~~~~~~~~~~~~~~~
-
-This hook is invoked by `git-p4 submit`.
-
-The `p4-prepare-changelist` hook is executed right after preparing
-the default changelist message and before the editor is started.
-It takes one parameter, the name of the file that contains the
-changelist text. Exiting with a non-zero status from the script
-will abort the process.
-
-The purpose of the hook is to edit the message file in place,
-and it is not supressed by the `--no-verify` option. This hook
-is called even if `--prepare-p4-only` is set.
-
-Run `git-p4 submit --help` for details.
-
-p4-post-changelist
-~~~~~~~~~~~~~~~~~~
-
-This hook is invoked by `git-p4 submit`.
-
-The `p4-post-changelist` hook is invoked after the submit has
-successfully occurred in P4. It takes no parameters and is meant
-primarily for notification and cannot affect the outcome of the
-git p4 submit action.
-
-Run `git-p4 submit --help` for details.
-
-p4-pre-submit
-~~~~~~~~~~~~~
-
-This hook is invoked by `git-p4 submit`. It takes no parameters and nothing
-from standard input. Exiting with non-zero status from this script prevent
-`git-p4 submit` from launching. It can be bypassed with the `--no-verify`
-command line option. Run `git-p4 submit --help` for details.
-
-
-
-post-index-change
-~~~~~~~~~~~~~~~~~
-
-This hook is invoked when the index is written in read-cache.c
-do_write_locked_index.
-
-The first parameter passed to the hook is the indicator for the
-working directory being updated.  "1" meaning working directory
-was updated or "0" when the working directory was not updated.
-
-The second parameter passed to the hook is the indicator for whether
-or not the index was updated and the skip-worktree bit could have
-changed.  "1" meaning skip-worktree bits could have been updated
-and "0" meaning they were not.
-
-Only one parameter should be set to "1" when the hook runs.  The hook
-running passing "1", "1" should not be possible.
-
-Hooks run during 'post-index-change' will be run in parallel by default.
+include::native-hooks.txt[]
 
 GIT
 ---
-- 
2.29.2.490.gc7ae633391


* Re: [PATCH v3 18/17] doc: make git-hook.txt point of truth
  2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
@ 2020-12-28 22:39             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-28 22:39 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Junio C Hamano, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Josh Steadmon, Johannes Schindelin

On Mon, Dec 28, 2020 at 02:37:16PM -0800, Emily Shaffer wrote:

Argh. I am having awful Monday brain and this should have been
in-reply-to the other thread. I guess that's a point against
splitting big topics into multiple threads. :|

I'll resend it on the other topic. I'm sorry.

 - Emily

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-28 18:34             ` Emily Shaffer
@ 2020-12-28 22:50               ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-12-28 22:50 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, James Ramsay, Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> On Mon, Dec 21, 2020 at 06:11:05PM -0800, Junio C Hamano wrote:
>> 
>> Emily Shaffer <emilyshaffer@google.com> writes:
>> 
>> > Since v6:
>> >
>> >  - Converted 'enum hookdir_opt' to UPPER_SNAKE
>> >  - Coccinelle fix in the hook destructor
>> >  - Fixed a bug where builtin/hook.c wasn't running the default git config setup
>> >    and therefore missed hooks in core.hooksPath when it was set. (These hooks
>> >    would still run except when invoked by 'git hook run' as the config was
>> >    called by the processes which invoked the hook library.)
>> 
>> Thanks.  Queued both series (it probably is easier to think of these
>> as a single 34-patch series, as long as they both are in flight at
>> the same time).
>> 
>
> Do you want me to send them as a single thread for next version?

Unless we deliberately focus on stabilizing the early 17 patches
into a shape where they won't need updating while the later part of
the series is being worked on, I'd guess that your next resend would
contain updated versions of these 17 patches.  In that case, the only
effect of pretending that the patches belong to two separate series
is to invite mistakes on my part while queuing.  So either (1) a
single thread of all patches, or (2) just the early part, to really
make sure everybody is happy with it so that we can graduate it early
even while the remainder is still going through revisions, would be
preferable to the way they have been structured so far.

Thanks.

* Re: [PATCH v7 01/17] doc: propose hooks managed by the config
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
@ 2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
  2021-01-29 23:52               ` Emily Shaffer
  2021-02-01 22:11             ` Junio C Hamano
  1 sibling, 1 reply; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-23 15:38 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Tue, Dec 22 2020, Emily Shaffer wrote:

>     Since v4, addressed comments from Jonathan Tan about wording. However, I have
>     not addressed AEvar's comments or done a full re-review of this document.
>     I wanted to get the rest of the series out for initial review first.

As you note here and in a couple of other patch notes a lot of the
current state was based on my v4 commentary. I see we have parallel hook
execution by default now, woohoo!

I've been keeping an eye on this series, but have been kicking the can
down the road on reviewing it.

Skimming it now I think the state of it looks mostly good now when
viewing the end result, I think it's mainly got one big problem, but the
good news is that it's relatively easy to solve :)

Which is that I think it's really hard to follow along with it because
01/17 starts with a big design doc that's partially outdated, and
partially saying things that aren't in or should be in either a
user-facing doc or commit message.

And then individual patches (e.g. 12/17) either don't have tests
associated with them to test the feature they add, don't update/add
docs, or the docs are at the very beginning.

I think we should aim to mostly or entirely get rid of
Documentation/technical/config-based-hooks.txt, it was more of a "what
about this design?" document in the beginning.

In a series we'd apply most or all of it should really be in end-user
doc (and stuff like "Future work" can just be noted in commit messages
as we go along).

So long story short, I started trying to review this, but found myself
trying to reply to one patch and then grabbing docs from 01/17, or
(e.g. for the parallel stuff) not having tests and starting to come up
with them myself.

So I thought I'd send this E-Mail instead as prodding to maybe convince
you to re-roll it again to make it easier to follow along in a piecemeal
fashion.

As noted before I'm happy to help with this series if needed. I just
thought I'd send this first given that it's been a month since the last
submission, perhaps you've got some more local WIP changes by now...

* Re: [PATCH v7 01/17] doc: propose hooks managed by the config
  2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
@ 2021-01-29 23:52               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-01-29 23:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Sat, Jan 23, 2021 at 04:38:31PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Tue, Dec 22 2020, Emily Shaffer wrote:
> 
> >     Since v4, addressed comments from Jonathan Tan about wording. However, I have
> >     not addressed AEvar's comments or done a full re-review of this document.
> >     I wanted to get the rest of the series out for initial review first.
> 
> As you note here and in a couple of other patch notes a lot of the
> current state was based on my v4 commentary. I see we have parallel hook
> execution by default now, woohoo!
> 
> I've been keeping an eye on this series, but have been kicking the can
> down the road on reviewing it.
> 
> Skimming it now I think the state of it looks mostly good now when
> viewing the end result, I think it's mainly got one big problem, but the
> good news is that it's relatively easy to solve :)
> 
> Which is that I think it's really hard to follow along with it because
> 01/17 starts with a big design doc that's partially outdated, and
> partially saying things that aren't in or should be in either a
> user-facing doc or commit message.

Sure.

> 
> And then individual patches (e.g. 12/17) either don't have tests
> associated with them to test the feature they add, don't update/add
> docs, or the docs are at the very beginning.

Thanks, sure.

> 
> I think we should aim to mostly or entirely get rid of
> Documentation/technical/config-based-hooks.txt, it was more of a "what
> about this design?" document in the beginning.

I'm not 100% sure that I agree - there are a couple other design docs in
Documentation/technical which I still refer to from time to time, e.g.
for sparse checkout. But I *do* agree that there's a lot of info there
that needs to be in end-user docs.

> 
> In a series we'd apply most or all of it should really be in end-user
> doc (and stuff like "Future work" can just be noted in commit messages
> as we go along).
> 
> So long story short, I started trying to review this, but found myself
> trying to reply to one patch and then grabbing docs from 01/17, or
> (e.g. for the parallel stuff) not having tests and starting to come up
> with them myself.

Yeah. A related issue I could imagine, although not what you mentioned
here, is needing to do the same thing between part I and part II of the
series, as I often added some functionality late in part I and then used
it first late in part II. I don't think this is worth reordering for,
but probably better notes would be handy.

> 
> So I thought I'd send this E-Mail instead as prodding to maybe convince
> you to re-roll it again to make it easier to follow along in a piecemeal
> fashion.
> 
> As noted before I'm happy to help with this series if needed. I just
> thought I'd send this first given that it's been a month since the last
> submission, perhaps you've got some more local WIP changes by now...

The biggest help for me would be review focused on nits, code style,
missed free(), etc. - I still haven't gotten that kind of review yet,
and I feel the series as it is now is pretty much "feature complete".

In fact, and I'll mention this in reply to the cover letter in a moment,
we picked this series up as-is from Junio's branch in 'gitster/git' and
have been using it at Google for a couple of weeks now - primarily with
users running their legacy hooks living in hookdir, but also with a
subset of users encouraged to try out the config functionality.

Thanks for the bump.
 - Emily


* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (18 preceding siblings ...)
  2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
@ 2021-01-29 23:59           ` Emily Shaffer
  2021-02-16 19:46           ` Josh Steadmon
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-01-29 23:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Junio C Hamano, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Josh Steadmon, Johannes Schindelin

On Mon, Dec 21, 2020 at 04:02:03PM -0800, Emily Shaffer wrote:
> 
> Since v6:
> 
>  - Converted 'enum hookdir_opt' to UPPER_SNAKE
>  - Coccinelle fix in the hook destructor
>  - Fixed a bug where builtin/hook.c wasn't running the default git config setup
>    and therefore missed hooks in core.hooksPath when it was set. (These hooks
>    would still run except when invoked by 'git hook run' as the config was
>    called by the processes which invoked the hook library.)
> 
> CI run: https://github.com/nasamuffin/git/actions/runs/436864964

Some updates on this series...

Since Jan 21 we've been running this series as picked from
gitster/git:es/config-hooks on Googler machines, with a subset of users
asked to try out putting their hooks into config instead of hookdir. So
far we haven't heard of any crashes or similar bugs, although I did hear
about a couple of places where the user documentation is lacking. I feel
encouraged by that, and I'm hoping to improve the documentation in the
next week or so, pending $DAYJOB concerns.

We also addressed some of this series in our every-other-week review
club (me, Jonathan Tan, Jonathan Nieder, and Josh Steadmon; although in
this case I tried to be quiet :) ) and so I hope there will be some
comments from my three teammates coming to list sometime next week.

Since I feel pretty comfortable that it doesn't seem to explode
anywhere, I'm really keen to hear nitpicky reviews and try to push to
get this into 'next'; maybe I can barter my eyes on someone else's
neglected review? That sounds pretty mercenary but I think Junio is the
one who suggested it a few weeks ago... ;)

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 03/17] hook: add list command
  2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
@ 2021-01-31  3:10             ` Jonathan Tan
  2021-02-09 21:06               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  3:10 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> Teach 'git hook list <hookname>', which checks the known configs in
> order to create an ordered list of hooks to run on a given hook event.
> 
> Multiple commands can be specified for a given hook by providing
> multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> run in config order. If more properties need to be set on a given hook
> in the future, commands can also be specified by providing
> "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

I learned later that this isn't true - in patch 6, the commit message
and one of the tests therein describe being able to skip a previously
inline command by making a hookcmd section of the same name and just
specifying "skip = true" there (without any command).

Maybe just delete the "at minimum" part.

> +static int list(int argc, const char **argv, const char *prefix)
>  {
> -	struct option builtin_hook_options[] = {
> +	struct list_head *head, *pos;
> +	struct hook *item;

You asked for review on nits too so here's one: "item" should be
declared in the list_for_each block. (That also makes it easier to see
that we don't need to free it.)

> diff --git a/hook.c b/hook.c
> new file mode 100644
> index 0000000000..937dc768c8
> --- /dev/null
> +++ b/hook.c
> @@ -0,0 +1,115 @@
> +#include "cache.h"
> +
> +#include "hook.h"
> +#include "config.h"

Usually we put all the includes together without any intervening blank
lines.

> +static void append_or_move_hook(struct list_head *head, const char *command)
> +{
> +	struct list_head *pos = NULL, *tmp = NULL;
> +	struct hook *to_add = NULL;
> +
> +	/*
> +	 * remove the prior entry with this command; we'll replace it at the
> +	 * end.
> +	 */
> +	list_for_each_safe(pos, tmp, head) {
> +		struct hook *it = list_entry(pos, struct hook, list);
> +		if (!strcmp(it->command.buf, command)) {
> +		    list_del(pos);
> +		    /* we'll simply move the hook to the end */
> +		    to_add = it;

"break" here?

> +		}
> +	}
> +
> +	if (!to_add) {
> +		/* adding a new hook, not moving an old one */
> +		to_add = xmalloc(sizeof(struct hook));

Style is to write sizeof(*to_add), I think.
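
To make the last two nits concrete, here is a standalone sketch (not the
patch's code): `struct node`, `unlink_match()`, `append()` and
`append_or_move()` are made-up names, and a plain singly linked list stands
in for list.h. It assumes the list never holds duplicates, which is what
makes stopping at the first match safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct hook; the real code uses list.h. */
struct node {
	const char *command;
	struct node *next;
};

/*
 * Unlink and return the first node whose command matches, or NULL.
 * Returning at the first hit plays the role of the suggested "break".
 */
static struct node *unlink_match(struct node **head, const char *command)
{
	struct node **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (!strcmp((*pp)->command, command)) {
			struct node *found = *pp;
			*pp = found->next;
			found->next = NULL;
			return found;
		}
	}
	return NULL;
}

/* Link n in at the tail of the list. */
static void append(struct node **head, struct node *n)
{
	struct node **pp = head;

	while (*pp)
		pp = &(*pp)->next;
	n->next = NULL;
	*pp = n;
}

/*
 * Move an existing entry to the end, or allocate a new one.
 * The command pointer is stored, not copied; the caller keeps it alive.
 */
struct node *append_or_move(struct node **head, const char *command)
{
	struct node *n;

	assert(head && command);
	n = unlink_match(head, command);
	if (!n) {
		n = malloc(sizeof(*n)); /* size follows the variable's type */
		if (!n)
			return NULL;
		n->command = command;
	}
	append(head, n);
	return n;
}
```

Adding "a", then "b", then "a" again leaves the list as b, a, and the third
call hands back the node created by the first.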

[snip]

> +struct hook_config_cb
> +{
> +	struct strbuf *hookname;

struct declarations have "{" not on a line on its own.

Also, "hookname" could just be a char *?
	
> +	struct list_head *list;
> +};
> +
> +static int hook_config_lookup(const char *key, const char *value, void *cb_data)
> +{
> +	struct hook_config_cb *data = cb_data;
> +	const char *hook_key = data->hookname->buf;
> +	struct list_head *head = data->list;
> +
> +	if (!strcmp(key, hook_key)) {
> +		const char *command = value;
> +		struct strbuf hookcmd_name = STRBUF_INIT;
> +
> +		/* Check if a hookcmd with that name exists. */
> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> +		git_config_get_value(hookcmd_name.buf, &command);

So we don't really care whether git_config_get_value returns 0 or 1 as
long as it doesn't touch "command" if there is no such hookcmd. That
fits with how git_config_get_value() is documented, so that's great.
Perhaps better to document as:

  If a hookcmd.%s.command config exists, replace the command with the
  value of that config. (If not, do nothing - git_config_get_value() is
  documented to not overwrite the value argument in this case.)
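
A standalone sketch of that contract, for illustration only:
`lookup_value()` and the two-entry table are hypothetical stand-ins for
git_config_get_value() and the real config, and `resolve_command()` is a
made-up name. The only property it relies on is the one described above:
on a miss, the destination pointer is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical config contents, for illustration only. */
struct entry {
	const char *key;
	const char *value;
};

static const struct entry table[] = {
	{ "hookcmd.lint.command", "/usr/bin/lint --strict" },
	{ "hookcmd.fmt.command", "/usr/bin/fmt" },
};

/*
 * Return 0 and store the value in *dest when key exists;
 * return 1 and leave *dest untouched otherwise.
 */
int lookup_value(const char *key, const char **dest)
{
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!strcmp(table[i].key, key)) {
			*dest = table[i].value;
			return 0;
		}
	}
	return 1;
}

/*
 * If a hookcmd.<value>.command entry exists, use its value as the
 * command. If not, the value itself is the command.
 */
const char *resolve_command(const char *value)
{
	char key[256];
	const char *command = value;

	assert(value);
	snprintf(key, sizeof(key), "hookcmd.%s.command", value);
	lookup_value(key, &command); /* result ignored on purpose */
	return command;
}
```

With this table, "lint" resolves to the hookcmd's command, while a literal
path that has no matching hookcmd comes back unchanged.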

> +
> +		if (!command) {
> +			strbuf_release(&hookcmd_name);
> +			BUG("git_config_get_value overwrote a string it shouldn't have");
> +		}
> +
> +		/*
> +		 * TODO: implement an option-getting callback, e.g.
> +		 *   get configs by pattern hookcmd.$value.*
> +		 *   for each key+value, do_callback(key, value, cb_data)
> +		 */
> +
> +		append_or_move_hook(head, command);
> +
> +		strbuf_release(&hookcmd_name);
> +	}
> +
> +	return 0;
> +}
> +
> +struct list_head* hook_list(const struct strbuf* hookname)

"const char *hookname" should suffice?

Also, search for "* " and replace with " *" where applicable (also in
the .h file).

> +{
> +	struct strbuf hook_key = STRBUF_INIT;
> +	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
> +	struct hook_config_cb cb_data = { &hook_key, hook_head };
> +
> +	INIT_LIST_HEAD(hook_head);
> +
> +	if (!hookname)
> +		return NULL;
> +
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> +
> +	git_config(hook_config_lookup, (void*)&cb_data);

Do we need this void* cast?

> +
> +	strbuf_release(&hook_key);
> +	return hook_head;
> +}
> diff --git a/hook.h b/hook.h
> new file mode 100644
> index 0000000000..8ffc4f14b6
> --- /dev/null
> +++ b/hook.h
> @@ -0,0 +1,26 @@
> +#include "config.h"
> +#include "list.h"
> +#include "strbuf.h"
> +
> +struct hook
> +{
> +	struct list_head list;
> +	/*
> +	 * Config file which holds the hook.*.command definition.
> +	 * (This has nothing to do with the hookcmd.<name>.* configs.)
> +	 */
> +	enum config_scope origin;
> +	/* The literal command to run. */
> +	struct strbuf command;

"char *" would suffice?

The tests look fine.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 04/17] hook: include hookdir hook in list
  2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
@ 2021-01-31  3:20             ` Jonathan Tan
  2021-02-09 22:05               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  3:20 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> Historically, hooks are declared by placing an executable into
> $GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
> from the config are more featureful than hooks placed in the $HOOKDIR,
> those hooks should not stop working for users who already have them.

Maybe explicitly add that we're listing them in the list with a "hookdir:"
prefix.

> Legacy hooks should be run directly, not in shell. We know that they are
> a path to an executable, not a oneliner script - and running them
> directly takes care of path quoting concerns for us for free.

Not sure what this paragraph is doing here.

> diff --git a/builtin/hook.c b/builtin/hook.c
> index 4d36de52f8..a0013ae4d7 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
>  	struct list_head *head, *pos;
>  	struct hook *item;
>  	struct strbuf hookname = STRBUF_INIT;
> +	struct strbuf hookdir_annotation = STRBUF_INIT;

Right now this is never set? Maybe hold off on adding this until we set
something.

> @@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
>  
>  	git_config(hook_config_lookup, (void*)&cb_data);
>  
> +	if (have_git_dir())
> +		legacy_hook_path = find_hook(hookname->buf);
> +
> +	/* Unconditionally add legacy hook, but annotate it. */
> +	if (legacy_hook_path) {
> +		struct hook *legacy_hook;
> +
> +		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));

Both find_hook() and absolute_path() use static buffers to hold their
return values, which makes me a bit nervous. Perhaps put them all under
the same "if (have_git_dir())" so that it's clearer that we're not
supposed to insert code arbitrarily between their invocation and their
usage.
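
For anyone unfamiliar with the hazard, a generic sketch: `join_path()` is a
hypothetical helper that, like the functions mentioned above, returns a
pointer into a static buffer, and `join_path_dup()` shows the usual defence
of copying the result out right away. It is not meant to mirror how
find_hook() or absolute_path() are implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pointer into one static buffer; each call overwrites the last. */
const char *join_path(const char *dir, const char *name)
{
	static char buf[256];

	assert(dir && name);
	snprintf(buf, sizeof(buf), "%s/%s", dir, name);
	return buf;
}

/* Copy the result out immediately; the caller owns and frees the copy. */
char *join_path_dup(const char *dir, const char *name)
{
	const char *tmp = join_path(dir, name);
	char *copy = malloc(strlen(tmp) + 1);

	if (copy)
		strcpy(copy, tmp);
	return copy;
}
```

A pointer kept from an earlier join_path() call silently changes meaning
after the next call, which is why keeping the call and its use adjacent (or
copying) matters.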

> diff --git a/hook.h b/hook.h
> index 8ffc4f14b6..5750634c83 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -12,6 +12,7 @@ struct hook
>  	enum config_scope origin;
>  	/* The literal command to run. */
>  	struct strbuf command;
> +	int from_hookdir;

unsigned from_hookdir : 1?

The tests look good.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 05/17] hook: respect hook.runHookDir
  2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
@ 2021-01-31  3:35             ` Jonathan Tan
  2021-02-09 22:31               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  3:35 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> Include hooks specified in the hook directory in the list of hooks to
> run. These hooks do need to be treated differently from config-specified
> ones - they do not need to run in a shell, and later on may be disabled
> or warned about based on a config setting.
> 
> Because they are at least as local as the local config, we'll run them
> last - to keep the hook execution order from global to local.

This commit message doesn't seem to match the code change. Firstly,
we're teaching hook.runHookDir, not respecting it (since it did not
exist before this commit), and it's about showing it in "list" and not
about running it at all.

Perhaps just "hook: teach hook.runHookDir" as the subject and as the
body:

  For now, this just affects the output of "git hook list". In the
  future, this will affect the behavior of "git hook run" and when Git
  runs hooks before or after its operations.

> +	switch (should_run_hookdir) {
> +		case HOOKDIR_NO:
> +			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
> +			break;
> +		case HOOKDIR_INTERACTIVE:
> +			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
> +			break;
> +		case HOOKDIR_WARN:
> +		case HOOKDIR_UNKNOWN:

Hmm...UNKNOWN is the same as WARN? This doesn't agree with what is said
in patch 1:

  +In case this list is expanded in the future, if a value for `hook.runHookDir` is
  +given which Git does not recognize, Git should discard that config entry. For
  +example, if "warn" was specified at system level and "junk" was specified at
  +global level, Git would resolve the value to "warn"; if the only time the config
  +was set was to "junk", Git would use the default value of "yes".

But having said that, I would prefer if Git just errored out in this
case.
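
To compare the two behaviors side by side, a rough standalone sketch. The
enumerator names come from the series, but their order, the function names
(`parse_hookdir_opt()`, `resolve_lenient()`, `resolve_strict()`) and the
idea of passing the settings in as an array are assumptions made for this
example; the accepted strings are the four listed in patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enumerator order here is arbitrary; only the names come from the series. */
enum hookdir_opt {
	HOOKDIR_NO,
	HOOKDIR_WARN,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_YES,
	HOOKDIR_UNKNOWN
};

/* Map one config string to the enum; unrecognized text is HOOKDIR_UNKNOWN. */
enum hookdir_opt parse_hookdir_opt(const char *value)
{
	assert(value);
	if (!strcmp(value, "no"))
		return HOOKDIR_NO;
	if (!strcmp(value, "warn"))
		return HOOKDIR_WARN;
	if (!strcmp(value, "interactive"))
		return HOOKDIR_INTERACTIVE;
	if (!strcmp(value, "yes"))
		return HOOKDIR_YES;
	return HOOKDIR_UNKNOWN;
}

/*
 * Behavior described in patch 1: values[] holds every setting seen,
 * least specific scope first. Unrecognized entries are discarded, the
 * last recognized one wins, and the default is "yes".
 */
enum hookdir_opt resolve_lenient(const char **values, size_t nr)
{
	enum hookdir_opt result = HOOKDIR_YES;
	size_t i;

	for (i = 0; i < nr; i++) {
		enum hookdir_opt opt = parse_hookdir_opt(values[i]);
		if (opt != HOOKDIR_UNKNOWN)
			result = opt;
	}
	return result;
}

/*
 * Stricter alternative: return -1 as soon as any entry is unrecognized
 * so the caller can report it and stop. *out is written only on success.
 */
int resolve_strict(const char **values, size_t nr, enum hookdir_opt *out)
{
	enum hookdir_opt result = HOOKDIR_YES;
	size_t i;

	for (i = 0; i < nr; i++) {
		result = parse_hookdir_opt(values[i]);
		if (result == HOOKDIR_UNKNOWN)
			return -1;
	}
	*out = result;
	return 0;
}
```

With "warn" at system level and "junk" at global level, the lenient
resolver yields WARN and the strict one fails; neither treats an unknown
value as if it were "warn".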

The rest looks good.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 06/17] hook: implement hookcmd.<name>.skip
  2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2021-01-31  3:40             ` Jonathan Tan
  2021-02-09 22:57               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  3:40 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> If a user wants a specific repo to skip execution of a hook which is set
> at a global or system level, they can now do so by specifying 'skip' in
> their repo config:

Usually the present tense describes the situation before the commit, so
maybe s/they can now do so/they will be able to do so/.

> -static void append_or_move_hook(struct list_head *head, const char *command)
> +static struct hook* find_hook_by_command(struct list_head *head, const char *command)

"* " -> " *"

[snip tests]

For the tests, I thought of the case in which we skip a hookcmd that was
never specified as a hook, but that's probably not very useful.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 08/17] hook: add 'run' subcommand
  2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2021-01-31  4:22             ` Jonathan Tan
  2021-02-11 22:44               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  4:22 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will run in config order, in series. As
> alternate ordering or parallelism is supported in the future, we should
> add knobs to use those to the command line as well.
> 
> As with the legacy hook implementation, all stdout generated by hook
> commands is redirected to stderr. Piping from stdin is not yet
> supported.
> 
> Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> execution list. For now, there is no way to disable them.

Not true anymore now that we have hook.runHookDir :-)

> @@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
>  `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
>  This output is human-readable and the format is subject to change over time.
>  
> +run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
> +
> +Runs hooks configured for `<hook-name>`, in the same order displayed by `git
> +hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
> +containing special characters or spaces should be wrapped in single quotes:
> +`command = '/my/path with spaces/script.sh' some args`.

I learned recently that this may not work the way I expect [1], so you
might want to specifically call this out for someone who knows how
run-command and running-with-shell work.

[1] https://lore.kernel.org/git/YAs9pTBsdskC8CPN@coredump.intra.peff.net/

> @@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
>  	return HOOKDIR_UNKNOWN;
>  }
>  
> +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> +{
> +	struct strbuf prompt = STRBUF_INIT;
> +	/*
> +	 * If the path doesn't exist, don't bother adding the empty hook and
> +	 * don't bother checking the config or prompting the user.
> +	 */
> +	if (!path)
> +		return 0;
> +
> +	switch (cfg)
> +	{
> +		case HOOKDIR_NO:
> +			return 0;
> +		case HOOKDIR_UNKNOWN:
> +			fprintf(stderr,
> +				_("Unrecognized value for 'hook.runHookDir'. "
> +				  "Is there a typo? "));
> +			/* FALLTHROUGH */

Same comment (about UNKNOWN and defaulting to WARN instead of YES) as in
one of the previous patches.

> +		case HOOKDIR_WARN:
> +			fprintf(stderr, _("Running legacy hook at '%s'\n"),
> +				path);
> +			return 1;
> +		case HOOKDIR_INTERACTIVE:
> +			do {
> +				/*
> +				 * TRANSLATORS: Make sure to include [Y] and [n]
> +				 * in your translation. Only English input is
> +				 * accepted. Default option is "yes".
> +				 */
> +				fprintf(stderr, _("Run '%s'? [Yn] "), path);
> +				git_read_line_interactively(&prompt);
> +				strbuf_tolower(&prompt);
> +				if (starts_with(prompt.buf, "n")) {
> +					strbuf_release(&prompt);
> +					return 0;
> +				} else if (starts_with(prompt.buf, "y")) {
> +					strbuf_release(&prompt);
> +					return 1;
> +				}
> +				/* otherwise, we didn't understand the input */
> +			} while (prompt.len); /* an empty reply means "Yes" */
> +			strbuf_release(&prompt);
> +			return 1;
> +		case HOOKDIR_YES:
> +		default:
> +			return 1;
> +	}
> +}

[snip]

> +int run_hooks(const char *hookname, struct run_hooks_opt *options)
> +{
> +	struct strbuf hookname_str = STRBUF_INIT;
> +	struct list_head *to_run, *pos = NULL, *tmp = NULL;
> +	int rc = 0;
> +
> +	if (!options)
> +		BUG("a struct run_hooks_opt must be provided to run_hooks");
> +
> +	strbuf_addstr(&hookname_str, hookname);
> +
> +	to_run = hook_list(&hookname_str);
> +
> +	list_for_each_safe(pos, tmp, to_run) {
> +		struct child_process hook_proc = CHILD_PROCESS_INIT;
> +		struct hook *hook = list_entry(pos, struct hook, list);
> +
> +		hook_proc.env = options->env.v;
> +		hook_proc.no_stdin = 1;
> +		hook_proc.stdout_to_stderr = 1;
> +		hook_proc.trace2_hook_name = hook->command.buf;
> +		hook_proc.use_shell = 1;

I think this is based on run_hook_ve() in run-command.c - could we
refactor that to avoid duplication of code?

> +
> +		if (hook->from_hookdir) {
> +		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
> +			continue;
> +		    /*
> +		     * Commands from the config could be oneliners, but we know
> +		     * for certain that hookdir commands are not.
> +		     */
> +		    hook_proc.use_shell = 0;
> +		}
> +
> +		/* add command */
> +		strvec_push(&hook_proc.args, hook->command.buf);
> +
> +		/*
> +		 * add passed-in argv, without expanding - let the user get back
> +		 * exactly what they put in
> +		 */
> +		strvec_pushv(&hook_proc.args, options->args.v);
> +
> +		rc |= run_command(&hook_proc);
> +	}
> +
> +	return rc;
> +}

[snip]

> +struct run_hooks_opt
> +{
> +	/* Environment vars to be set for each hook */
> +	struct strvec env;
> +
> +	/* Args to be passed to each hook */
> +	struct strvec args;
> +
> +	/*
> +	 * How should the hookdir be handled?
> +	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
> +	 * to be overridden if the user can override it at the command line.
> +	 */
> +	enum hookdir_opt run_hookdir;
> +};
> +
> +#define RUN_HOOKS_OPT_INIT  {   		\
> +	.env = STRVEC_INIT, 				\
> +	.args = STRVEC_INIT, 			\
> +	.run_hookdir = configured_hookdir_opt()	\
> +}

I don't think we have function invocations in our declarations like
this. Maybe stick to just using run_hooks_opt_init().
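
Roughly what I have in mind, as a simplified sketch: the struct here uses
bare pointers where the real one has strvecs, the enum is cut down to two
values, and `configured_hookdir_opt()` is stubbed out, so treat all of it
as illustrative rather than as the series' actual definitions.

```c
#include <assert.h>
#include <stddef.h>

enum hookdir_opt { HOOKDIR_NO, HOOKDIR_YES };

/* Stub for the real function, which reads hook.runHookDir. */
enum hookdir_opt configured_hookdir_opt(void)
{
	return HOOKDIR_YES;
}

/* Simplified: the real struct holds two strvecs, not bare pointers. */
struct run_hooks_opt {
	const char **env;
	const char **args;
	enum hookdir_opt run_hookdir;
};

/* All runtime work lives here, none of it in an initializer macro. */
void run_hooks_opt_init(struct run_hooks_opt *o)
{
	assert(o);
	o->env = NULL;
	o->args = NULL;
	o->run_hookdir = configured_hookdir_opt();
}
```

Callers then declare the struct and call run_hooks_opt_init() on it, so
the config read is an ordinary, visible function call.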

[snip tests]

The tests look good.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 09/17] hook: replace find_hook() with hook_exists()
  2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2021-01-31  4:39             ` Jonathan Tan
  2021-02-12 22:15               ` Emily Shaffer
  2021-02-18 22:23               ` Emily Shaffer
  0 siblings, 2 replies; 170+ messages in thread
From: Jonathan Tan @ 2021-01-31  4:39 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> Add a helper to easily determine whether any hooks exist for a given
> hook event.
> 
> Many callers want to check whether some state could be modified by a
> hook; that check should include the config-based hooks as well. Optimize
> by checking the config directly. Since commands which execute hooks
> might want to take args to replace 'hook.runHookDir', let
> 'hook_exists()' mirror the behavior of 'hook.runHookDir'.

The text makes sense, but the title might better be "introduce
hook_exists()" instead of "replace", since find_hook() is still around.

Also maybe briefly mention the future plans - e.g. in the future, no
code will use find_hook() except <whatever the hook-internal functions
are>, because all of them will use hook_exists() and run_hook().

> +/*
> + * Returns 1 if any hooks are specified in the config or if a hook exists in the
> + * hookdir. Typically, invoke hook_exists() like:
> + *   hook_exists(hookname, configured_hookdir_opt());
> + * Like with run_hooks, if you take a --run-hookdir flag, reflect that
> + * user-specified behavior here instead.
> + */
> +int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);

I wonder if enum hookdir_opt should support an "unspecified" value instead, in
which case hook_exists() will automatically read the config (instead of
relying on the caller to call configured_hookdir_opt()), but I see that
this patch set is version 7 and perhaps this design point has already
been discussed.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel
  2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
@ 2021-02-01  5:38             ` Jonathan Tan
  2021-02-19 20:23               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-01  5:38 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> diff --git a/run-command.c b/run-command.c
> index ea4d0fb4b1..80c8c97bc1 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1683,6 +1683,9 @@ static int pp_start_one(struct parallel_processes *pp)
>  	if (i == pp->max_processes)
>  		BUG("bookkeeping is hard");
>  
> +	/* disallow by default, but allow users to set up stdin if they wish */
> +	pp->children[i].process.no_stdin = 1;
> +

This makes sense. May be worth a more detailed comment, e.g.:

  By default, do not inherit stdin from the parent process. (Otherwise,
  all children would share it!) Users may override this by having the
  get_next_task function assign 0 to no_stdin and an appropriate file
  descriptor to in.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 12/17] hook: allow parallel hook execution
  2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
@ 2021-02-01  6:04             ` Jonathan Tan
  2021-02-22 21:46               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-01  6:04 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> In many cases, there's no reason not to allow hooks to execute in
> parallel. run_processes_parallel() is well-suited - it's a task queue
> that runs its housekeeping in series, which means users don't
> need to worry about thread safety on their callback data. True
> multithreaded execution with the async_* functions isn't necessary here.
> Synchronous hook execution can be achieved by only allowing 1 job to run
> at a time.
> 
> Teach run_hooks() to use that function for simple hooks which don't
> require stdin or capture of stderr.

Which hooks would be run in parallel, and which hooks in series? I don't
see code that distinguishes between them.

> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> 
> Notes:
>     Per AEvar's request - parallel hook execution on day zero.
>     
>     In most ways run_processes_parallel() worked great for me - but it didn't
>     have great support for hooks where we pipe to and from. I had to add this
>     support later in the series.
>     
>     Since I modified an existing and in-use library I'd appreciate a keen look on
>     these patches.

What is the existing and in-use library that you're modifying?

> @@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
>  	strvec_clear(&o->args);
>  }
>  
> +
> +static int pick_next_hook(struct child_process *cp,
> +			  struct strbuf *out,
> +			  void *pp_cb,
> +			  void **pp_task_cb)
> +{
> +	struct hook_cb_data *hook_cb = pp_cb;
> +
> +	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
> +
> +	if (hook_cb->head == hook_cb->run_me)
> +		return 0;
> +
> +	cp->env = hook_cb->options->env.v;
> +	cp->stdout_to_stderr = 1;
> +	cp->trace2_hook_name = hook->command.buf;
> +
> +	/* reopen the file for stdin; run_command closes it. */
> +	if (hook_cb->options->path_to_stdin) {
> +		cp->no_stdin = 0;
> +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> +	} else {
> +		cp->no_stdin = 1;
> +	}
> +
> +	/*
> +	 * Commands from the config could be oneliners, but we know
> +	 * for certain that hookdir commands are not.
> +	 */
> +	if (hook->from_hookdir)
> +		cp->use_shell = 0;
> +	else
> +		cp->use_shell = 1;
> +
> +	/* add command */
> +	strvec_push(&cp->args, hook->command.buf);
> +
> +	/*
> +	 * add passed-in argv, without expanding - let the user get back
> +	 * exactly what they put in
> +	 */
> +	strvec_pushv(&cp->args, hook_cb->options->args.v);

I just skimmed over this setup-process-for-hook part - it would have
been much clearer if it was refactored into its own function before this
patch (or better yet, written as its own function in the first place).
As it is, there are some unnecessary rewritings - e.g. setting stdin
after env, and the use_shell setup.

> diff --git a/hook.h b/hook.h

[snip]

> +/*
> + * Callback provided to feed_pipe_fn and consume_sideband_fn.
> + */
> +struct hook_cb_data {
> +	int rc;
> +	struct list_head *head;
> +	struct list_head *run_me;
> +	struct run_hooks_opt *options;
> +};

Could this be in hook.c instead?

Also, I think it's clearer if run_me was a struct hook, and set to NULL
when iteration reaches the end. If you disagree, I think it needs some
documentation (e.g. "the embedded linked list part of the hook that must
be run next; if equal to head, then iteration has ended" or something
like that).
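
A sketch of the NULL-terminated variant, under simplifying assumptions:
the `struct hook` below is a singly linked stand-in rather than the
list.h-based one, and this `pick_next_hook()` has a reduced signature
compared with the real get_next_task callback.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; the real struct embeds a struct list_head. */
struct hook {
	const char *command;
	struct hook *next;
};

struct hook_cb_data {
	int rc;
	/* Next hook to start; NULL once every hook has been handed out. */
	struct hook *run_me;
};

/* Return the next hook to run and advance, or NULL when none are left. */
struct hook *pick_next_hook(struct hook_cb_data *cb)
{
	struct hook *h;

	assert(cb);
	h = cb->run_me;
	if (h)
		cb->run_me = h->next;
	return h;
}
```

The end-of-iteration check becomes a plain NULL test, with no need to
compare against the list head or to call list_entry() on a sentinel.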

> +#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
>  	.env = STRVEC_INIT, 			\
>  	.args = STRVEC_INIT, 			\
>  	.path_to_stdin = NULL,			\
> +	.jobs = 1,				\
>  	.run_hookdir = configured_hookdir_opt()	\
>  }

This is not used anywhere.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 14/17] run-command: add stdin callback for parallelization
  2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
@ 2021-02-01  6:51             ` Jonathan Tan
  2021-02-22 23:38               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-01  6:51 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> If a user of the run_processes_parallel() API wants to pipe a large
> amount of information to stdin of each parallel command, that
> information could exceed the buffer of the pipe allocated for that
> process's stdin.  Generally this is solved by repeatedly writing to
> child_process.in between calls to start_command() and finish_command();
> run_processes_parallel() did not provide users an opportunity to access
> child_process at that time.

[snip]

> diff --git a/run-command.h b/run-command.h
> index 6472b38bde..e058c0e2c8 100644
> --- a/run-command.h
> +++ b/run-command.h
> @@ -436,6 +436,20 @@ typedef int (*start_failure_fn)(struct strbuf *out,
>  				void *pp_cb,
>  				void *pp_task_cb);
>  
> +/**
> + * This callback is called repeatedly on every child process who requests
> + * start_command() to create a pipe by setting child_process.in < 0.
> + *
> + * pp_cb is the callback cookie as passed into run_processes_parallel, and
> + * pp_task_cb is the callback cookie as passed into get_next_task_fn.
> + * The contents of 'send' will be read into the pipe and passed to the pipe.
> + *
> + * Return nonzero to close the pipe.
> + */
> +typedef int (*feed_pipe_fn)(struct strbuf *pipe,
> +			    void *pp_cb,
> +			    void *pp_task_cb);
> +

As you mention above in the commit message, I think the clearest API to
support what we need is to just have a callback (that has access to
child_process) that is executed between process start and finish.

As it is, I think this callback is too specific in that it takes a
struct strbuf. I think that this struct strbuf will just end up being
unnecessary copying much of the time, when the user could have just
written to the fd directly.
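
For illustration, a POSIX-only sketch of a callback that is handed the fd
instead of a strbuf. `feed_fd_fn`, `feed_line()` and `write_fully()` are
hypothetical names invented for this example, not part of run-command.h;
the "return nonzero to close the pipe" convention is borrowed from the
doc comment quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Write all len bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on any other error.
 */
int write_fully(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical callback shape: gets the child's stdin fd directly. */
typedef int (*feed_fd_fn)(int child_stdin, void *task_cb);

/*
 * Example callback: task_cb is a NUL-terminated string sent in one go.
 * Returns 1 when there is nothing more to send, -1 on write failure.
 */
int feed_line(int child_stdin, void *task_cb)
{
	const char *line = task_cb;

	assert(line);
	if (write_fully(child_stdin, line, strlen(line)) < 0)
		return -1;
	return 1;
}
```

The data goes from the caller's buffer straight to the pipe, with no
intermediate copy into a strbuf.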

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2021-02-01  7:04             ` Jonathan Tan
  2021-02-23 19:52               ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-01  7:04 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> In cases where a hook requires only a small amount of information via
> stdin, it should be simple for users to provide a string_list alone. But
> in more complicated cases where the stdin is too large to hold in
> memory, let's provide a callback the users can populate line after line
> with instead.

[snip]

> diff --git a/hook.h b/hook.h
> index 8a7542610c..0ac83fa7ca 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -2,6 +2,7 @@
>  #include "list.h"
>  #include "strbuf.h"
>  #include "strvec.h"
> +#include "run-command.h"
>  
>  struct hook
>  {
> @@ -14,6 +15,12 @@ struct hook
>  	/* The literal command to run. */
>  	struct strbuf command;
>  	int from_hookdir;
> +
> +	/*
> +	 * Use this to keep state for your feed_pipe_fn if you are using
> +	 * run_hooks_opt.feed_pipe. Otherwise, do not touch it.
> +	 */
> +	void *feed_pipe_cb_data;

When would we need per-hook state? I see in patch 14 that you feed each
running process its input little by little (in pp_buffer_stdin()), perhaps
so that each hook can make progress at roughly the same pace, but I don't
think we can expect all hooks to work the same, so I don't think it's
worth complicating the design for all that.

>  };
>  
>  /*
> @@ -57,12 +64,24 @@ struct run_hooks_opt
>  
>  	/* Path to file which should be piped to stdin for each hook */
>  	const char *path_to_stdin;
> +	/* Pipe each string to stdin, separated by newlines */
> +	struct string_list str_stdin;
> +	/*
> +	 * Callback and state pointer to ask for more content to pipe to stdin.
> +	 * Will be called repeatedly, for each hook. See
> +	 * hook.c:pipe_from_stdin() for an example. Keep per-hook state in
> +	 * hook.feed_pipe_cb_data (per process). Keep initialization context in
> +	 * feed_pipe_ctx (shared by all processes).
> +	 */
> +	feed_pipe_fn feed_pipe;
> +	void *feed_pipe_ctx;

Instead of 3 fields, I think 2 suffice - the function and the data
(called "ctx" here). We can supply a function that treats the data as a
string_list.
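
Something along these lines, as a sketch only: `feed_fn`, `struct
run_opts`, `struct string_feed` and `feed_from_strings()` are invented for
this example, and a plain array of C strings stands in for string_list.
The callback signature is deliberately simpler than the feed_pipe_fn in
patch 14.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical callback: write the next chunk into out (capacity cap).
 * Returns 0 if a chunk was produced, nonzero when input is exhausted.
 */
typedef int (*feed_fn)(char *out, size_t cap, void *ctx);

/* Only two fields: the function and its data. */
struct run_opts {
	feed_fn feed;
	void *ctx;
};

/* Context for the ready-made "list of strings" feeder. */
struct string_feed {
	const char **items;
	size_t nr;
	size_t next;
};

/* Supplied feeder: treats ctx as a string_feed, one line per call. */
int feed_from_strings(char *out, size_t cap, void *ctx)
{
	struct string_feed *sf = ctx;

	assert(out && cap && sf);
	if (sf->next >= sf->nr) {
		out[0] = '\0';
		return 1;
	}
	snprintf(out, cap, "%s\n", sf->items[sf->next++]);
	return 0;
}
```

A caller with only a short list of lines sets feed to feed_from_strings
and ctx to its list; a caller with large input supplies its own function
and context through the same two fields.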

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v7 01/17] doc: propose hooks managed by the config
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
  2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
@ 2021-02-01 22:11             ` Junio C Hamano
  2021-03-10 19:30               ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2021-02-01 22:11 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> +ways, providing an avenue to deprecate these "legacy" hooks if desired. The
> +handling is based on a config `hook.runHookDir`, which is checked against a
> +number of cases:

Don't we want to also warn when the setting "no" or something
similar prevents the legacy hook from running, to help users
who wonder why their hook scripts are not running?  I.e.

> +- "no": the legacy hook will not be run

+- "warn-no": Git will print a warning to stderr before ignoring the
+  legacy hook

> +- "interactive": Git will prompt the user before running the legacy hook
> +- "warn": Git will print a warning to stderr before running the legacy hook
> +- "yes" (default): Git will silently run the legacy hook

> +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> +given which Git does not recognize, Git should discard that config entry. For
> +example, if "warn" was specified at system level and "junk" was specified at
> +global level, Git would resolve the value to "warn"; if the only time the config
> +was set was to "junk", Git would use the default value of "yes".

Hmph, instead of complaining "value 'junk' is not recognized" and
erroring out?  Why?

> +[[stage-3]]
> +==== Stage 3
> +
> +`.git/hooks` is removed from the template and the hook directory is considered
> +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> +not changed, and `find_hook()` is not removed.

Presumably, we'll have documentation somewhere that instructs users
(who were taught by Slashdot and other sites to add certain scripts
under their .git/hooks/) how to do the equivalent without adding
scripts in .git/hooks/ directory and instead using the config
mechanism (e.g. "when told to add script X in .git/hooks/, read such
an instruction as if telling you to do Y instead") by the time this
happens?  It probably makes sense to do so as part of stage-2, at
which point the users are _ready_ to migrate.

> +[[security]]
> +=== Security and repo config
> +
> +Part of the motivation behind this refactor is to mitigate hooks as an attack
> +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> +however, as the design stands, users can still provide hooks in the repo-level
> +config, which is included when a repo is zipped and sent elsewhere.  The
> +security of the repo-level config is still under discussion; this design
> +generally assumes the repo-level config is secure, which is not true yet. The
> +goal is to avoid an overcomplicated design to work around a problem which has
> +ceased to exist.

I doubt we want to claim anything about security as part of this
series.  As you say in the paragraph, .git/config and .git/hooks/
are equally (un)protected and if we decide to punt .git/config
security, then not moving away from .git/hooks would not hurt
security-wise, either (in other words, security is not a viable
motivation behind this series).

And if we stop advertising 'security merit' that does not exist,
what remains?  Isn't the biggest selling point that an identical set
of hook configuration can be shared among multiple repositories, and
it allows more than one hook script to be triggered by a single
"hook event"?  There may be other good things with which we can
sell the new mechanism to our users, and we do stress them, which
is done in the motivation section.  So...

> +.Comparison of alternatives
> +|===
> +|Feature |Config-based hooks |Hook directories |Status quo

Sorry, but I did not find this table particularly convincing.

The only thing I sense is a hand-wavy desire that "we could make it
better than everybody else if we work on it in this area", which can
apply equally for other approaches---they could enhance what they
already have (e.g. "discoverability & documentation").

As a list of "these are the points we aspire to do better than other
people", I think it is an excellent idea to have a table like this
here in the documentation.  But that is not a "comparison".

> +[[execution-ordering]]
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
> +
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.

OK.

Have we decided what we do for hooks whose interface is to feed
their input from their standard input?  The current system, I think,
just feeds the single hook by writing into a pipe to it, but if we
were to drive multiple hooks, we'd need to write the same thing to
each of these hook programs?  

Do we have a plan to deal with hooks whose outcome is not just
"yes/no", e.g. "proc-receive" hook that munges the list of refs to
be updated and the new values for them, or "applypatch-msg" that
munges the incoming proposed commit log message?  Does the second
hook work on the result of the first hook?  Do the two hooks work on
the vanilla state and their output have to agree with each other?

> +[[parallelization]]
> +=== Parallelization with dependencies
> +
> +Currently hooks use a naive parallelization scheme or are run in series.  But if
> +one hook depends on another's output, then users will want to specify those
> +dependencies.

An untold assumption here is that readers all know the same answer
to the questions I asked earlier about having more than one hook
whose outcome is not just yes/no, and that answer is "the outcome of
the first hook is passed along to the second hook (as if it were the
input given by Git directly, had the first hook not existed)".  It
should be spelled out somewhere before the execution ordering
section, I think.
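
One way to picture that answer is a chain in which each hook's output
becomes the next hook's input. The following is only a minimal sketch
under stated assumptions: the hooks are hypothetical in-process
functions (all demo_* names are made up for illustration) rather than
the child processes Git would spawn, and nothing here reflects a
decision made in this series.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_MSG_MAX 256

/*
 * Hypothetical hook signature: read 'in', write the (possibly munged)
 * result to 'out', return 0 to accept or nonzero to reject.
 */
typedef int (*demo_hook_fn)(const char *in, char *out, size_t outsz);

/* A hook that munges its input by appending a trailer line. */
static int demo_add_trailer(const char *in, char *out, size_t outsz)
{
	int n = snprintf(out, outsz, "%s\nReviewed: yes", in);
	return (n < 0 || (size_t)n >= outsz) ? 1 : 0;
}

/* A plain yes/no hook: rejects empty input, passes anything else through. */
static int demo_reject_empty(const char *in, char *out, size_t outsz)
{
	if (!*in)
		return 1;
	snprintf(out, outsz, "%s", in);
	return 0;
}

/*
 * Run the hooks in order. The first hook sees what Git would have fed it
 * directly; every later hook sees the previous hook's output.
 */
static int demo_run_chain(const demo_hook_fn *hooks, size_t n,
			  const char *input, char *result, size_t resultsz)
{
	char cur[DEMO_MSG_MAX], next[DEMO_MSG_MAX];
	size_t i;

	if (strlen(input) >= sizeof(cur))
		return 1;
	strcpy(cur, input);

	for (i = 0; i < n; i++) {
		if (hooks[i](cur, next, sizeof(next)))
			return 1; /* any rejection stops the chain */
		strcpy(cur, next);
	}

	if (strlen(cur) >= resultsz)
		return 1;
	strcpy(result, cur);
	return 0;
}
```

The alternative raised above, where every hook works on the vanilla
state and the outputs must agree, would instead pass 'input' unchanged
to each hook and compare the results afterwards.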


* Re: [PATCH v7 03/17] hook: add list command
  2021-01-31  3:10             ` Jonathan Tan
@ 2021-02-09 21:06               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-09 21:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 07:10:11PM -0800, Jonathan Tan wrote:
> 
> > Teach 'git hook list <hookname>', which checks the known configs in
> > order to create an ordered list of hooks to run on a given hook event.
> > 
> > Multiple commands can be specified for a given hook by providing
> > multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> > run in config order. If more properties need to be set on a given hook
> > in the future, commands can also be specified by providing
> > "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> > <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> > "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> 
> I learned later that this isn't true - in patch 6, the commit message
> and one of the tests therein describe being able to skip a previously
> inline command by making a hookcmd section of the same name and just
> specifying "skip = true" there (without any command).
> 
> Maybe just delete the "at minimum" part.

Thanks, nice catch. I dropped "at minimum" and s/must/should/. Since I
received feedback during the dogfood internally that there's no
documentation on "hookcmd.<thing>.skip" whatsoever, I'll try to make
a more detailed overview of hookcmd acting as an alias in the
user-facing docs in that commit.

> 
> > +static int list(int argc, const char **argv, const char *prefix)
> >  {
> > -	struct option builtin_hook_options[] = {
> > +	struct list_head *head, *pos;
> > +	struct hook *item;
> 
> You asked for review on nits too so here's one: "item" should be
> declared in the list_for_each block. (That also makes it easier to see
> that we don't need to free it.)

Ok, done.

> 
> > diff --git a/hook.c b/hook.c
> > new file mode 100644
> > index 0000000000..937dc768c8
> > --- /dev/null
> > +++ b/hook.c
> > @@ -0,0 +1,115 @@
> > +#include "cache.h"
> > +
> > +#include "hook.h"
> > +#include "config.h"
> 
> Usually we put all the includes together without any intervening blank
> lines.

Done.

> 
> > +static void append_or_move_hook(struct list_head *head, const char *command)
> > +{
> > +	struct list_head *pos = NULL, *tmp = NULL;
> > +	struct hook *to_add = NULL;
> > +
> > +	/*
> > +	 * remove the prior entry with this command; we'll replace it at the
> > +	 * end.
> > +	 */
> > +	list_for_each_safe(pos, tmp, head) {
> > +		struct hook *it = list_entry(pos, struct hook, list);
> > +		if (!strcmp(it->command.buf, command)) {
> > +		    list_del(pos);
> > +		    /* we'll simply move the hook to the end */
> > +		    to_add = it;
> 
> "break" here?

I think it's safe to do so, but I think I left it out in case duplicates
do make it into the list somehow. But if they're always being inserted
via "append_or_move_hook()" that should not be an issue, so I'll add the
break.

In fact, fixing this exposed a bug. Later, I add the hook back using
'pos' instead of 'head':
  list_add_tail(&to_add->list, pos);
When I'm guaranteed to iterate to the end of the list, that works fine,
because in this implementation the last element of the list has "->next"
set back to "head". But when 'pos' isn't sure to be at the end of the
list, it breaks the list. Whoops.. :)
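
To make the failure mode concrete, here is a minimal sketch, assuming a
simplified stand-in for the circular doubly linked list in list.h (the
list_add_before/list_unlink helpers and struct demo_hook are
hypothetical names, not the real API). Inserting "before head" is what
appends in a circular list, so the anchor has to be 'head' once the
loop can exit early.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head->prev = head;
}

/* Insert 'node' immediately before 'pos'; with pos == head this appends. */
static void list_add_before(struct list_head *node, struct list_head *pos)
{
	node->prev = pos->prev;
	node->next = pos;
	pos->prev->next = node;
	pos->prev = node;
}

/* Detach 'node' from its list and leave it pointing at itself. */
static void list_unlink(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node->prev = node;
}

/* Hypothetical hook entry; the real struct hook carries more fields. */
struct demo_hook {
	struct list_head list; /* first member, so the cast below is valid */
	const char *command;
};

/* Move the entry matching 'command' to the end; return 1 if one was found. */
static int move_to_end(struct list_head *head, const char *command)
{
	struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct demo_hook *it = (struct demo_hook *)pos;

		if (!strcmp(it->command, command)) {
			list_unlink(pos);
			/*
			 * Anchor on 'head', not 'pos': once we stop early,
			 * 'pos' is the node we just removed, not the end
			 * of the list.
			 */
			list_add_before(pos, head);
			return 1;
		}
	}
	return 0;
}
```

With a list holding "a", "b", "c", calling move_to_end(&head, "a")
leaves the order "b", "c", "a".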


> 
> > +		}
> > +	}
> > +
> > +	if (!to_add) {
> > +		/* adding a new hook, not moving an old one */
> > +		to_add = xmalloc(sizeof(struct hook));
> 
> Style is to write sizeof(*to_add), I think.

Sure.

> 
> [snip]
> 
> > +struct hook_config_cb
> > +{
> > +	struct strbuf *hookname;
> 
> struct declarations have "{" not on a line on its own.

Sure, I can change it. Although, 'git grep -E "struct \w+$"' tells me
there are a few other offenders :) But the count (60ish minus
doc/relnotes references) is much lower than 'git grep -E "struct \w+
\{$"' (800+ matches) so I'm definitely doing it wrong :D

> 
> Also, "hookname" could just be a char *?

Hrmph. I think my C++ background is showing - to me, calling ".buf"
(which is cheap) a few times is a small price to pay to get length
info and so on for free. "hookname" can come from the user - "git
hook (list|run) repo-special-magic-hook" - so I worry about leaving it as a raw
char* with no size info associated. Am I being too paranoid?

> 	
> > +	struct list_head *list;
> > +};
> > +
> > +static int hook_config_lookup(const char *key, const char *value, void *cb_data)
> > +{
> > +	struct hook_config_cb *data = cb_data;
> > +	const char *hook_key = data->hookname->buf;
> > +	struct list_head *head = data->list;
> > +
> > +	if (!strcmp(key, hook_key)) {
> > +		const char *command = value;
> > +		struct strbuf hookcmd_name = STRBUF_INIT;
> > +
> > +		/* Check if a hookcmd with that name exists. */
> > +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> > +		git_config_get_value(hookcmd_name.buf, &command);
> 
> So we don't really care whether git_config_get_value returns 0 or 1 as
> long as it doesn't touch "command" if there is no such hookcmd. That
> fits with how git_config_get_value() is documented, so that's great.
> Perhaps better to document as:
> 
>   If a hookcmd.%s.command config exists, replace the command with the
>   value of that config. (If not, do nothing - git_config_get_value() is
>   documented to not overwrite the value argument in this case.)

OK. Thanks, done.

> 
> > +
> > +		if (!command) {
> > +			strbuf_release(&hookcmd_name);
> > +			BUG("git_config_get_value overwrote a string it shouldn't have");
> > +		}
> > +
> > +		/*
> > +		 * TODO: implement an option-getting callback, e.g.
> > +		 *   get configs by pattern hookcmd.$value.*
> > +		 *   for each key+value, do_callback(key, value, cb_data)
> > +		 */
> > +
> > +		append_or_move_hook(head, command);
> > +
> > +		strbuf_release(&hookcmd_name);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +struct list_head* hook_list(const struct strbuf* hookname)
> 
> "const char *hookname" should suffice?
> 
> Also, search for "* " and replace with " *" where applicable (also in
> the .h file).

See above.

> 
> > +{
> > +	struct strbuf hook_key = STRBUF_INIT;
> > +	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
> > +	struct hook_config_cb cb_data = { &hook_key, hook_head };
> > +
> > +	INIT_LIST_HEAD(hook_head);
> > +
> > +	if (!hookname)
> > +		return NULL;
> > +
> > +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> > +
> > +	git_config(hook_config_lookup, (void*)&cb_data);
> 
> Do we need this void* cast?

Nope. Dropped, thanks.

> 
> > +
> > +	strbuf_release(&hook_key);
> > +	return hook_head;
> > +}
> > diff --git a/hook.h b/hook.h
> > new file mode 100644
> > index 0000000000..8ffc4f14b6
> > --- /dev/null
> > +++ b/hook.h
> > @@ -0,0 +1,26 @@
> > +#include "config.h"
> > +#include "list.h"
> > +#include "strbuf.h"
> > +
> > +struct hook
> > +{
> > +	struct list_head list;
> > +	/*
> > +	 * Config file which holds the hook.*.command definition.
> > +	 * (This has nothing to do with the hookcmd.<name>.* configs.)
> > +	 */
> > +	enum config_scope origin;
> > +	/* The literal command to run. */
> > +	struct strbuf command;
> 
> "char *" would suffice?

Since 'command' comes directly from user input and is executed, I'm
nervous about using a char* instead of a strbuf even more so than for
the hookname.

> 
> The tests look fine.

Thanks for the nitpicky review, it was great! :)

 - Emily


* Re: [PATCH v7 04/17] hook: include hookdir hook in list
  2021-01-31  3:20             ` Jonathan Tan
@ 2021-02-09 22:05               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-09 22:05 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 07:20:22PM -0800, Jonathan Tan wrote:
> 
> > Historically, hooks are declared by placing an executable into
> > $GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
> > from the config are more featureful than hooks placed in the $HOOKDIR,
> > those hooks should not stop working for users who already have them.
> 
> Maybe explicitly add that we're listing them in the list with a "hookdir:"
> prefix.

Sure.

> 
> > Legacy hooks should be run directly, not in shell. We know that they are
> > a path to an executable, not a oneliner script - and running them
> > directly takes care of path quoting concerns for us for free.
> 
> Not sure what this paragraph is doing here.

Yep, this is an artifact of the review process (explaining why I didn't
do something weird, which I did in an earlier version, but now it
doesn't make sense to mention it at all). Deleted.

> 
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index 4d36de52f8..a0013ae4d7 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
> >  	struct list_head *head, *pos;
> >  	struct hook *item;
> >  	struct strbuf hookname = STRBUF_INIT;
> > +	struct strbuf hookdir_annotation = STRBUF_INIT;
> 
> Right now this is never set? Maybe hold off on adding this until we set
> something.

Yeah, that makes sense. Will do.

> 
> > @@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
> >  
> >  	git_config(hook_config_lookup, (void*)&cb_data);
> >  
> > +	if (have_git_dir())
> > +		legacy_hook_path = find_hook(hookname->buf);
> > +
> > +	/* Unconditionally add legacy hook, but annotate it. */
> > +	if (legacy_hook_path) {
> > +		struct hook *legacy_hook;
> > +
> > +		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
> 
> Both find_hook() and absolute_path() use static buffers to hold their
> return values, which makes me a bit nervous. Perhaps put them all under
> the same "if (have_git_dir())" so that it's clearer that we're not
> supposed to insert code arbitrarily between their invocation and their
> usage.

Oh, that's a cool way to indicate that. Thanks, I did that, and learned
something new!
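
For anyone else who has not run into this before, the hazard with
static return buffers looks roughly like the sketch below. It assumes a
hypothetical helper, demo_hook_path(), that behaves like the functions
mentioned above only in the sense that it returns a pointer into its
own static storage; the path it builds is made up.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a pointer into a static buffer. */
static const char *demo_hook_path(const char *name)
{
	static char buf[256];

	snprintf(buf, sizeof(buf), "/repo/.git/hooks/%s", name);
	return buf;
}

/*
 * Copy the result when it has to outlive the next call to
 * demo_hook_path(). The caller owns the returned memory.
 */
static char *demo_hook_path_dup(const char *name)
{
	const char *p = demo_hook_path(name);
	size_t len = strlen(p) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, p, len);
	return copy;
}
```

Any pointer obtained from demo_hook_path() is silently rewritten by the
next call, which is why keeping the call and its use adjacent (or
copying the result) matters.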

> 
> > diff --git a/hook.h b/hook.h
> > index 8ffc4f14b6..5750634c83 100644
> > --- a/hook.h
> > +++ b/hook.h
> > @@ -12,6 +12,7 @@ struct hook
> >  	enum config_scope origin;
> >  	/* The literal command to run. */
> >  	struct strbuf command;
> > +	int from_hookdir;
> 
> unsigned from_hookdir : 1?

Sure. It doesn't make a difference now but I see that would be nice for
futureproofing.

> 
> The tests look good.

Thanks.


* Re: [PATCH v7 05/17] hook: respect hook.runHookDir
  2021-01-31  3:35             ` Jonathan Tan
@ 2021-02-09 22:31               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-09 22:31 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 07:35:03PM -0800, Jonathan Tan wrote:
> 
> > Include hooks specified in the hook directory in the list of hooks to
> > run. These hooks do need to be treated differently from config-specified
> > ones - they do not need to run in a shell, and later on may be disabled
> > or warned about based on a config setting.
> > 
> > Because they are at least as local as the local config, we'll run them
> > last - to keep the hook execution order from global to local.
> 
> This commit message doesn't seem to match the code change. Firstly,
> we're teaching hook.runHookDir, not respecting it (since it did not
> exist before this commit), and it's about showing it in "list" and not
> about running it at all.

Yeah, thanks for noticing. Now that I'm rereading it, it looks like this
is the old commit message for "include hookdir hook in list". Yikes.
> 
> Perhaps just "hook: teach hook.runHookDir" as the subject and as the
> body:
> 
>   For now, this just affects the output of "git hook list". In the
>   future, this will affect the behavior of "git hook run" and when Git
>   runs hooks before or after its operations.
> 
> > +	switch (should_run_hookdir) {
> > +		case HOOKDIR_NO:
> > +			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
> > +			break;
> > +		case HOOKDIR_INTERACTIVE:
> > +			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
> > +			break;
> > +		case HOOKDIR_WARN:
> > +		case HOOKDIR_UNKNOWN:
> 
> Hmm...UNKNOWN is the same as WARN? This doesn't agree with what is said
> in patch 1:
> 
>   +In case this list is expanded in the future, if a value for `hook.runHookDir` is
>   +given which Git does not recognize, Git should discard that config entry. For
>   +example, if "warn" was specified at system level and "junk" was specified at
>   +global level, Git would resolve the value to "warn"; if the only time the config
>   +was set was to "junk", Git would use the default value of "yes".
> 
> But having said that, I would prefer if Git just errored out in this
> case.

Eh, I think I'd still like it to be tolerant. Thanks for keeping me
honest :) I added a test to enforce the tolerant behavior, too.
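
The tolerant behavior described in the design doc can be sketched as
follows. This is not the code from the series: the demo_* names are
hypothetical, the values are assumed to arrive in system, global, local
order, and exact lowercase matching is an assumption.

```c
#include <string.h>

enum demo_hookdir_opt {
	DEMO_HOOKDIR_NO,
	DEMO_HOOKDIR_WARN,
	DEMO_HOOKDIR_INTERACTIVE,
	DEMO_HOOKDIR_YES,
	DEMO_HOOKDIR_UNKNOWN
};

/* Map one config value to an option; anything unrecognized is UNKNOWN. */
static enum demo_hookdir_opt demo_parse_one(const char *value)
{
	if (!strcmp(value, "no"))
		return DEMO_HOOKDIR_NO;
	if (!strcmp(value, "warn"))
		return DEMO_HOOKDIR_WARN;
	if (!strcmp(value, "interactive"))
		return DEMO_HOOKDIR_INTERACTIVE;
	if (!strcmp(value, "yes"))
		return DEMO_HOOKDIR_YES;
	return DEMO_HOOKDIR_UNKNOWN;
}

/*
 * Resolve the effective setting from values listed in config order.
 * Unrecognized entries are discarded, the last recognized entry wins,
 * and "yes" is the default when nothing recognizable was set.
 */
static enum demo_hookdir_opt demo_resolve_run_hookdir(const char *const *values,
						      size_t n)
{
	enum demo_hookdir_opt result = DEMO_HOOKDIR_YES;
	size_t i;

	for (i = 0; i < n; i++) {
		enum demo_hookdir_opt parsed = demo_parse_one(values[i]);

		if (parsed != DEMO_HOOKDIR_UNKNOWN)
			result = parsed;
	}
	return result;
}
```

With { "warn", "junk" } this resolves to "warn", and with only
{ "junk" } it falls back to "yes", matching the two examples in the
design doc text quoted above.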

> 
> The rest looks good.

Thanks.


* Re: [PATCH v7 06/17] hook: implement hookcmd.<name>.skip
  2021-01-31  3:40             ` Jonathan Tan
@ 2021-02-09 22:57               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-09 22:57 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 07:40:30PM -0800, Jonathan Tan wrote:
> 
> > If a user wants a specific repo to skip execution of a hook which is set
> > at a global or system level, they can now do so by specifying 'skip' in
> > their repo config:
> 
> Usually the present tense describes the situation before the commit, so
> maybe s/they can now do so/they will be able to do so/.

Sure.

> 
> > -static void append_or_move_hook(struct list_head *head, const char *command)
> > +static struct hook* find_hook_by_command(struct list_head *head, const char *command)
> 
> "* " -> " *"

Thanks.

> 
> [snip tests]
> 
> For the tests, I thought of the case in which we skip a hookcmd that was
> never specified as a hook, but that's probably not very useful.

Ah, it might be useful to make sure we don't choke trying to remove
something that isn't there - I'll add one.


By the way, I got feedback from Googlers using config hooks that "skip"
isn't actually documented anywhere public-facing. For v8 I've added a
section on it to Documentation/git-hook.txt as well as to
Documentation/config/hook.txt.

 - Emily


* Re: [PATCH v7 08/17] hook: add 'run' subcommand
  2021-01-31  4:22             ` Jonathan Tan
@ 2021-02-11 22:44               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-11 22:44 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 08:22:54PM -0800, Jonathan Tan wrote:
> 
> > In order to enable hooks to be run as an external process, by a
> > standalone Git command, or by tools which wrap Git, provide an external
> > means to run all configured hook commands for a given hook event.
> > 
> > For now, the hook commands will run in config order, in series. As
> > alternate ordering or parallelism is supported in the future, we should
> > add knobs to use those to the command line as well.
> > 
> > As with the legacy hook implementation, all stdout generated by hook
> > commands is redirected to stderr. Piping from stdin is not yet
> > supported.
> > 
> > Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> > execution list. For now, there is no way to disable them.
> 
> Not true anymore now that we have hook.runHookDir :-)

Updated.

> 
> > @@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
> >  `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> >  This output is human-readable and the format is subject to change over time.
> >  
> > +run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
> > +
> > +Runs hooks configured for `<hook-name>`, in the same order displayed by `git
> > +hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
> > +containing special characters or spaces should be wrapped in single quotes:
> > +`command = '/my/path with spaces/script.sh' some args`.
> 
> I learned recently that this may not work the way I expect [1], so you
> might want to specifically call this out for someone who knows how
> run-command and running-with-shell works.

I think it might be good enough to say "may be prepended" instead - the
quoting advice (wrap your paths) still holds.

> 
> [1] https://lore.kernel.org/git/YAs9pTBsdskC8CPN@coredump.intra.peff.net/
> 
> > @@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
> >  	return HOOKDIR_UNKNOWN;
> >  }
> >  
> > +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> > +{
> > +	struct strbuf prompt = STRBUF_INIT;
> > +	/*
> > +	 * If the path doesn't exist, don't bother adding the empty hook and
> > +	 * don't bother checking the config or prompting the user.
> > +	 */
> > +	if (!path)
> > +		return 0;
> > +
> > +	switch (cfg)
> > +	{
> > +		case HOOKDIR_NO:
> > +			return 0;
> > +		case HOOKDIR_UNKNOWN:
> > +			fprintf(stderr,
> > +				_("Unrecognized value for 'hook.runHookDir'. "
> > +				  "Is there a typo? "));
> > +			/* FALLTHROUGH */
> 
> Same comment (about UNKNOWN and defaulting to WARN instead of YES) as in
> one of the previous patches.

Like in the previous patch, I opted to make it match the design doc
(UNKNOWN matches default, aka YES). I left the typo warning, though, as
it might be useful for someone trying to debug why "itneractive" isn't
working as they expect.

> 
> > +		case HOOKDIR_WARN:
> > +			fprintf(stderr, _("Running legacy hook at '%s'\n"),
> > +				path);
> > +			return 1;
> > +		case HOOKDIR_INTERACTIVE:
> > +			do {
> > +				/*
> > +				 * TRANSLATORS: Make sure to include [Y] and [n]
> > +				 * in your translation. Only English input is
> > +				 * accepted. Default option is "yes".
> > +				 */
> > +				fprintf(stderr, _("Run '%s'? [Yn] "), path);
> > +				git_read_line_interactively(&prompt);
> > +				strbuf_tolower(&prompt);
> > +				if (starts_with(prompt.buf, "n")) {
> > +					strbuf_release(&prompt);
> > +					return 0;
> > +				} else if (starts_with(prompt.buf, "y")) {
> > +					strbuf_release(&prompt);
> > +					return 1;
> > +				}
> > +				/* otherwise, we didn't understand the input */
> > +			} while (prompt.len); /* an empty reply means "Yes" */
> > +			strbuf_release(&prompt);
> > +			return 1;
> > +		case HOOKDIR_YES:
> > +		default:
> > +			return 1;
> > +	}
> > +}
> 
> [snip]
> 
> > +int run_hooks(const char *hookname, struct run_hooks_opt *options)
> > +{
> > +	struct strbuf hookname_str = STRBUF_INIT;
> > +	struct list_head *to_run, *pos = NULL, *tmp = NULL;
> > +	int rc = 0;
> > +
> > +	if (!options)
> > +		BUG("a struct run_hooks_opt must be provided to run_hooks");
> > +
> > +	strbuf_addstr(&hookname_str, hookname);
> > +
> > +	to_run = hook_list(&hookname_str);
> > +
> > +	list_for_each_safe(pos, tmp, to_run) {
> > +		struct child_process hook_proc = CHILD_PROCESS_INIT;
> > +		struct hook *hook = list_entry(pos, struct hook, list);
> > +
> > +		hook_proc.env = options->env.v;
> > +		hook_proc.no_stdin = 1;
> > +		hook_proc.stdout_to_stderr = 1;
> > +		hook_proc.trace2_hook_name = hook->command.buf;
> > +		hook_proc.use_shell = 1;
> 
> I think this is based on run_hook_ve() in run-command.c - could we
> refactor that to avoid duplication of code?

Hm. At the end of part II of this series "run_hook_ve()" is deleted
entirely; the implementation of "run_hooks" diverges significantly as
the series progresses (supporting stdin/stdout, etc) so I'd rather not
try to keep one of them based on the other, as I think it'll be more
complicated than it seems in this patch.

> 
> > +
> > +		if (hook->from_hookdir) {
> > +		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
> > +			continue;
> > +		    /*
> > +		     * Commands from the config could be oneliners, but we know
> > +		     * for certain that hookdir commands are not.
> > +		     */
> > +		    hook_proc.use_shell = 0;
> > +		}
> > +
> > +		/* add command */
> > +		strvec_push(&hook_proc.args, hook->command.buf);
> > +
> > +		/*
> > +		 * add passed-in argv, without expanding - let the user get back
> > +		 * exactly what they put in
> > +		 */
> > +		strvec_pushv(&hook_proc.args, options->args.v);
> > +
> > +		rc |= run_command(&hook_proc);
> > +	}
> > +
> > +	return rc;
> > +}
> 
> [snip]
> 
> > +struct run_hooks_opt
> > +{
> > +	/* Environment vars to be set for each hook */
> > +	struct strvec env;
> > +
> > +	/* Args to be passed to each hook */
> > +	struct strvec args;
> > +
> > +	/*
> > +	 * How should the hookdir be handled?
> > +	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
> > +	 * to be overridden if the user can override it at the command line.
> > +	 */
> > +	enum hookdir_opt run_hookdir;
> > +};
> > +
> > +#define RUN_HOOKS_OPT_INIT  {   		\
> > +	.env = STRVEC_INIT, 				\
> > +	.args = STRVEC_INIT, 			\
> > +	.run_hookdir = configured_hookdir_opt()	\
> > +}
> 
> I don't think we have function invocations in our declarations like
> this. Maybe stick to just using run_hooks_opt_init().

Sure.

> 
> [snip tests]
> 
> The tests look good.

Thanks for the detailed review.

 - Emily


* Re: [PATCH v7 09/17] hook: replace find_hook() with hook_exists()
  2021-01-31  4:39             ` Jonathan Tan
@ 2021-02-12 22:15               ` Emily Shaffer
  2021-02-18 22:23               ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-12 22:15 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 08:39:28PM -0800, Jonathan Tan wrote:
> 
> > Add a helper to easily determine whether any hooks exist for a given
> > hook event.
> > 
> > Many callers want to check whether some state could be modified by a
> > hook; that check should include the config-based hooks as well. Optimize
> > by checking the config directly. Since commands which execute hooks
> > might want to take args to replace 'hook.runHookDir', let
> > 'hook_exists()' mirror the behavior of 'hook.runHookDir'.
> 
> The text makes sense, but the title might better be "introduce
> hook_exists()" instead of "replace", since find_hook() is still around.
> 
> Also maybe briefly mention the future plans - e.g. in the future, no
> code will use find_hook() except <whatever the hook-internal functions
> are>, because all of them will use hook_exists() and run_hook().
> 
> > +/*
> > + * Returns 1 if any hooks are specified in the config or if a hook exists in the
> > + * hookdir. Typically, invoke hook_exists() like:
> > + *   hook_exists(hookname, configured_hookdir_opt());
> > + * Like with run_hooks, if you take a --run-hookdir flag, reflect that
> > + * user-specified behavior here instead.
> > + */
> > +int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
> 
> I wonder if enum hookdir_opt should support a "unspecified" instead, in
> which case hook_exists() will automatically read the config (instead of
> relying on the caller to call configured_hookdir_opt()), but I see that
> this patch set is version 7 and perhaps this design point has already
> been discussed.

No, I don't think it has been. So you mean something like:

  enum hookdir_opt
  {
  	HOOKDIR_NO,
  	HOOKDIR_WARN,
  	HOOKDIR_INTERACTIVE,
  	HOOKDIR_YES,
  	HOOKDIR_USE_CFG,
  	HOOKDIR_UNKNOWN,
  };

(name subject to quibbling) and then reimagining
configured_hookdir_opt() to something like:

  enum hookdir_opt resolve_hookdir_opt(enum hookdir_opt o)
  {
  	if (o != HOOKDIR_USE_CFG)
		return o;
	/* former contents of configured_hookdir_opt here */
  }

I like that, if nobody has complaints.
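
Spelled out a little further, the idea might look like the sketch
below. It is only an illustration: demo_configured_value stands in for
whatever hook.runHookDir resolves to, and demo_configured_hookdir_opt()
and demo_should_consult_hookdir() are hypothetical helpers, not
functions from the series.

```c
enum hookdir_opt {
	HOOKDIR_NO,
	HOOKDIR_WARN,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_YES,
	HOOKDIR_USE_CFG,
	HOOKDIR_UNKNOWN
};

/* Stand-in for the value read from hook.runHookDir. */
static enum hookdir_opt demo_configured_value = HOOKDIR_YES;

static enum hookdir_opt demo_configured_hookdir_opt(void)
{
	return demo_configured_value;
}

/* Callers pass HOOKDIR_USE_CFG unless the user overrode the setting. */
static enum hookdir_opt resolve_hookdir_opt(enum hookdir_opt o)
{
	if (o != HOOKDIR_USE_CFG)
		return o;
	return demo_configured_hookdir_opt();
}

/* Hypothetical caller: is the hookdir worth looking at for this request? */
static int demo_should_consult_hookdir(enum hookdir_opt o)
{
	return resolve_hookdir_opt(o) != HOOKDIR_NO;
}
```

A caller with no command-line override would then pass HOOKDIR_USE_CFG
and never need to read the config itself.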

 - Emily


* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (19 preceding siblings ...)
  2021-01-29 23:59           ` [PATCH v7 00/17] propose config-based hooks (part I) Emily Shaffer
@ 2021-02-16 19:46           ` Josh Steadmon
  2021-02-16 22:47             ` Junio C Hamano
  20 siblings, 1 reply; 170+ messages in thread
From: Josh Steadmon @ 2021-02-16 19:46 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, Junio C Hamano, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On 2020.12.21 16:02, Emily Shaffer wrote:
> Since v6:
> 
>  - Converted 'enum hookdir_opt' to UPPER_SNAKE
>  - Coccinelle fix in the hook destructor
>  - Fixed a bug where builtin/hook.c wasn't running the default git config setup
>    and therefore missed hooks in core.hooksPath when it was set. (These hooks
>    would still run except when invoked by 'git hook run' as the config was
>    called by the processes which invoked the hook library.)
> 
> CI run: https://github.com/nasamuffin/git/actions/runs/436864964
> 
> Thanks!
>  - Emily
> 
> Emily Shaffer (17):
>   doc: propose hooks managed by the config
>   hook: scaffolding for git-hook subcommand
>   hook: add list command
>   hook: include hookdir hook in list
>   hook: respect hook.runHookDir
>   hook: implement hookcmd.<name>.skip
>   parse-options: parse into strvec
>   hook: add 'run' subcommand
>   hook: replace find_hook() with hook_exists()
>   hook: support passing stdin to hooks
>   run-command: allow stdin for run_processes_parallel
>   hook: allow parallel hook execution
>   hook: allow specifying working directory for hooks
>   run-command: add stdin callback for parallelization
>   hook: provide stdin by string_list or callback
>   run-command: allow capturing of collated output
>   hooks: allow callers to capture output
> 
>  .gitignore                                    |   1 +
>  Documentation/Makefile                        |   1 +
>  Documentation/config/hook.txt                 |  19 +
>  Documentation/git-hook.txt                    | 117 +++++
>  Documentation/technical/api-parse-options.txt |   5 +
>  .../technical/config-based-hooks.txt          | 355 +++++++++++++++
>  Makefile                                      |   2 +
>  builtin.h                                     |   1 +
>  builtin/bugreport.c                           |   4 +-
>  builtin/fetch.c                               |   1 +
>  builtin/hook.c                                | 176 ++++++++
>  builtin/submodule--helper.c                   |   2 +-
>  command-list.txt                              |   1 +
>  git.c                                         |   1 +
>  hook.c                                        | 416 ++++++++++++++++++
>  hook.h                                        | 154 +++++++
>  parse-options-cb.c                            |  16 +
>  parse-options.h                               |   4 +
>  run-command.c                                 |  85 +++-
>  run-command.h                                 |  31 ++
>  submodule.c                                   |   1 +
>  t/helper/test-run-command.c                   |  46 +-
>  t/t0061-run-command.sh                        |  37 ++
>  t/t1360-config-based-hooks.sh                 | 256 +++++++++++
>  24 files changed, 1717 insertions(+), 15 deletions(-)
>  create mode 100644 Documentation/config/hook.txt
>  create mode 100644 Documentation/git-hook.txt
>  create mode 100644 Documentation/technical/config-based-hooks.txt
>  create mode 100644 builtin/hook.c
>  create mode 100644 hook.c
>  create mode 100644 hook.h
>  create mode 100755 t/t1360-config-based-hooks.sh
> 
> -- 
> 2.28.0.rc0.142.g3c755180ce-goog
> 

Sorry for the delayed reply. I am happy with this series as-is. Thanks
for all your work on it!

Reviewed-by: Josh Steadmon <steadmon@google.com>

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-02-16 19:46           ` Josh Steadmon
@ 2021-02-16 22:47             ` Junio C Hamano
  2021-02-17 21:21               ` Josh Steadmon
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2021-02-16 22:47 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: Emily Shaffer, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Josh Steadmon <steadmon@google.com> writes:

>> Emily Shaffer (17):
>>   doc: propose hooks managed by the config
>>   hook: scaffolding for git-hook subcommand
>>   hook: add list command
>>   hook: include hookdir hook in list
>>   hook: respect hook.runHookDir
>>   hook: implement hookcmd.<name>.skip
>>   parse-options: parse into strvec
>>   hook: add 'run' subcommand
>>   hook: replace find_hook() with hook_exists()
>>   hook: support passing stdin to hooks
>>   run-command: allow stdin for run_processes_parallel
>>   hook: allow parallel hook execution
>>   hook: allow specifying working directory for hooks
>>   run-command: add stdin callback for parallelization
>>   hook: provide stdin by string_list or callback
>>   run-command: allow capturing of collated output
>>   hooks: allow callers to capture output
>> 
> Sorry for the delayed reply. I am happy with this series as-is. Thanks
> for all your work on it!
>
> Reviewed-by: Josh Steadmon <steadmon@google.com>

The topic branch has a lot more commits than these 17; I am
wondering if the reviewed-by applies only to the bottom 17, or as
the whole?  I recall that the upper half was expecting at least some
documentation updates.

Thanks.

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-02-16 22:47             ` Junio C Hamano
@ 2021-02-17 21:21               ` Josh Steadmon
  2021-02-17 23:07                 ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Josh Steadmon @ 2021-02-17 21:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Emily Shaffer, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On 2021.02.16 14:47, Junio C Hamano wrote:
> Josh Steadmon <steadmon@google.com> writes:
> 
> >> Emily Shaffer (17):
> >>   doc: propose hooks managed by the config
> >>   hook: scaffolding for git-hook subcommand
> >>   hook: add list command
> >>   hook: include hookdir hook in list
> >>   hook: respect hook.runHookDir
> >>   hook: implement hookcmd.<name>.skip
> >>   parse-options: parse into strvec
> >>   hook: add 'run' subcommand
> >>   hook: replace find_hook() with hook_exists()
> >>   hook: support passing stdin to hooks
> >>   run-command: allow stdin for run_processes_parallel
> >>   hook: allow parallel hook execution
> >>   hook: allow specifying working directory for hooks
> >>   run-command: add stdin callback for parallelization
> >>   hook: provide stdin by string_list or callback
> >>   run-command: allow capturing of collated output
> >>   hooks: allow callers to capture output
> >> 
> > Sorry for the delayed reply. I am happy with this series as-is. Thanks
> > for all your work on it!
> >
> > Reviewed-by: Josh Steadmon <steadmon@google.com>
> 
> The topic branch has a lot more commits than these 17; I am
> wondering if the reviewed-by applies only to the bottom 17, or as
> the whole?  I recall that the upper half was expecting at least some
> documentation updates.
> 
> Thanks.

Just to these 17, sorry for being unclear.

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-02-17 21:21               ` Josh Steadmon
@ 2021-02-17 23:07                 ` Junio C Hamano
  2021-02-25 19:50                   ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2021-02-17 23:07 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: Emily Shaffer, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Josh Steadmon <steadmon@google.com> writes:

>> The topic branch has a lot more commits than these 17; I am
>> wondering if the reviewed-by applies only to the bottom 17, or as
>> the whole?  I recall that the upper half was expecting at least some
>> documentation updates.
>> 
>> Thanks.
>
> Just to these 17, sorry for being unclear.

Thanks for reading them through.

I am tempted to say we should merge this "mechanism" part down to
'next', hoping that the "rewrite existing ones using the new
mechanism" part can follow soon.


* Re: [PATCH v7 09/17] hook: replace find_hook() with hook_exists()
  2021-01-31  4:39             ` Jonathan Tan
  2021-02-12 22:15               ` Emily Shaffer
@ 2021-02-18 22:23               ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-18 22:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sat, Jan 30, 2021 at 08:39:28PM -0800, Jonathan Tan wrote:
> 
> > Add a helper to easily determine whether any hooks exist for a given
> > hook event.
> > 
> > Many callers want to check whether some state could be modified by a
> > hook; that check should include the config-based hooks as well. Optimize
> > by checking the config directly. Since commands which execute hooks
> > might want to take args to replace 'hook.runHookDir', let
> > 'hook_exists()' mirror the behavior of 'hook.runHookDir'.
> 
> The text makes sense, but the title might better be "introduce
> hook_exists()" instead of "replace", since find_hook() is still around.

Yep. Also, I removed the one find_hook() call that this commit did
replace (in builtin/bugreport.c). I'm not sure why that change was here
and not later on in part II.

> 
> Also maybe briefly mention the future plans - e.g. in the future, no
> code will use find_hook() except <whatever the hook-internal functions
> are>, because all of them will use hook_exists() and run_hook().

Nice, good point - I'll do that.


> 
> > +/*
> > + * Returns 1 if any hooks are specified in the config or if a hook exists in the
> > + * hookdir. Typically, invoke hook_exists() like:
> > + *   hook_exists(hookname, configured_hookdir_opt());
> > + * Like with run_hooks, if you take a --run-hookdir flag, reflect that
> > + * user-specified behavior here instead.
> > + */
> > +int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
> 
> I wonder if enum hookdir_opt should support a "unspecified" instead, in
> which case hook_exists() will automatically read the config (instead of
> relying on the caller to call configured_hookdir_opt()), but I see that
> this patch set is version 7 and perhaps this design point has already
> been discussed.

Nope, and I think that's a pretty neat idea. I've done so.
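
As a rough illustration of the "unspecified" idea, here is a minimal
sketch. The enum values, the configured_value variable, and the
hook_exists_sketch() signature are all hypothetical stand-ins, not the
names or types used in the series; the point is only that the callee,
not the caller, falls back to the configured setting.

```c
#include <assert.h>

/* Hypothetical stand-in for the series' enum; the value names are invented. */
enum hookdir_opt {
	HOOKDIR_USE_CONFIG = 0,	/* "unspecified": defer to hook.runHookDir */
	HOOKDIR_NO,
	HOOKDIR_YES
};

/* Stand-in for the hook.runHookDir lookup; real code would read the config. */
static enum hookdir_opt configured_value = HOOKDIR_YES;

static enum hookdir_opt configured_hookdir_opt(void)
{
	return configured_value;
}

/* Map "unspecified" to the configured value; pass anything else through. */
static enum hookdir_opt resolve_hookdir_opt(enum hookdir_opt requested)
{
	if (requested == HOOKDIR_USE_CONFIG)
		return configured_hookdir_opt();
	return requested;
}

/*
 * Toy version of hook_exists(): the two int arguments replace the real
 * config and hookdir lookups so the sketch stays self-contained.
 */
static int hook_exists_sketch(int nr_config_hooks, int hookdir_has_hook,
			      enum hookdir_opt should_run_hookdir)
{
	enum hookdir_opt opt = resolve_hookdir_opt(should_run_hookdir);

	assert(opt != HOOKDIR_USE_CONFIG);
	if (nr_config_hooks > 0)
		return 1;
	return hookdir_has_hook && opt == HOOKDIR_YES;
}
```

With this shape, a caller that has no --run-hookdir flag of its own can
pass the "unspecified" value and never needs to call
configured_hookdir_opt() itself.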

* Re: [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel
  2021-02-01  5:38             ` Jonathan Tan
@ 2021-02-19 20:23               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-19 20:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sun, Jan 31, 2021 at 09:38:08PM -0800, Jonathan Tan wrote:
> 
> > diff --git a/run-command.c b/run-command.c
> > index ea4d0fb4b1..80c8c97bc1 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -1683,6 +1683,9 @@ static int pp_start_one(struct parallel_processes *pp)
> >  	if (i == pp->max_processes)
> >  		BUG("bookkeeping is hard");
> >  
> > +	/* disallow by default, but allow users to set up stdin if they wish */
> > +	pp->children[i].process.no_stdin = 1;
> > +
> 
> This makes sense. May be worth a more detailed comment, e.g.:
> 
>   By default, do not inherit stdin from the parent process. (If not, all
>   children would share it!) Users may overwrite this by having the
>   get_next_task function assign 0 to no_stdin and an appropriate integer
>   to in.

Thanks, took it slightly modified:

 /*
  * By default, do not inherit stdin from the parent process - otherwise,
  * all children would share stdin! Users may overwrite this to provide
  * something to the child's stdin by having their 'get_next_task'
  * callback assign 0 to .no_stdin and an appropriate integer to .in.
  */

 - Emily
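
For readers following along, the default-then-override flow described in
that comment might look roughly like the sketch below. The struct is a
mock holding only the two fields the comment mentions, and
start_one_sketch() and both task callbacks are hypothetical names, not
functions from run-command.c.

```c
#include <assert.h>
#include <stddef.h>

/* Mock with only the two fields discussed; not git's struct child_process. */
struct child_process_sketch {
	int in;
	unsigned no_stdin : 1;
};

typedef int (*get_next_task_sketch_fn)(struct child_process_sketch *cp,
				       void *pp_cb);

/* Scheduler side: apply the safe default, then let the callback adjust it. */
static int start_one_sketch(struct child_process_sketch *cp,
			    get_next_task_sketch_fn get_next_task,
			    void *pp_cb)
{
	cp->in = 0;
	cp->no_stdin = 1;	/* children do not inherit the parent's stdin */
	return get_next_task(cp, pp_cb);
}

/* A task that keeps the default: the child gets no stdin. */
static int task_without_stdin(struct child_process_sketch *cp, void *pp_cb)
{
	(void)cp;
	(void)pp_cb;
	return 1;
}

/* A task that opts in; pp_cb is assumed to point at an open descriptor. */
static int task_with_stdin(struct child_process_sketch *cp, void *pp_cb)
{
	int *fd = pp_cb;

	assert(fd && *fd >= 0);
	cp->no_stdin = 0;
	cp->in = *fd;
	return 1;
}
```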

* Re: [PATCH v7 12/17] hook: allow parallel hook execution
  2021-02-01  6:04             ` Jonathan Tan
@ 2021-02-22 21:46               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-02-22 21:46 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sun, Jan 31, 2021 at 10:04:22PM -0800, Jonathan Tan wrote:
> 
> > In many cases, there's no reason not to allow hooks to execute in
> > parallel. run_processes_parallel() is well-suited - it's a task queue
> > that runs its housekeeping in series, which means users don't
> > need to worry about thread safety on their callback data. True
> > multithreaded execution with the async_* functions isn't necessary here.
> > Synchronous hook execution can be achieved by only allowing 1 job to run
> > at a time.
> > 
> > Teach run_hooks() to use that function for simple hooks which don't
> > require stdin or capture of stderr.
> 
> Which hooks would be run in parallel, and which hooks in series? I don't
> see code that distinguishes between them.

It's up to the caller, who can set run_hooks_opt.jobs. In part II of
this series I made a guess at which ones should run in parallel or in
series and specified it in Documentation/githooks.txt.

> 
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> > 
> > Notes:
> >     Per AEvar's request - parallel hook execution on day zero.
> >     
> >     In most ways run_processes_parallel() worked great for me - but it didn't
> >     have great support for hooks where we pipe to and from. I had to add this
> >     support later in the series.
> >     
> >     Since I modified an existing and in-use library I'd appreciate a keen look on
> >     these patches.
> 
> What is the existing and in-use library that you're modifying?

Hm, this note wasn't super specific. From this point onwards in the
series I make changes to the run-command.h:run_processes_parallel()
library, although not in this commit itself. I think I meant "from here
on out, help me look at run-command.h".

I'll try to make the note a little better next series, sorry for the
confusion :) :)

> 
> > @@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
> >  	strvec_clear(&o->args);
> >  }
> >  
> > +
> > +static int pick_next_hook(struct child_process *cp,
> > +			  struct strbuf *out,
> > +			  void *pp_cb,
> > +			  void **pp_task_cb)
> > +{
> > +	struct hook_cb_data *hook_cb = pp_cb;
> > +
> > +	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
> > +
> > +	if (hook_cb->head == hook_cb->run_me)
> > +		return 0;
> > +
> > +	cp->env = hook_cb->options->env.v;
> > +	cp->stdout_to_stderr = 1;
> > +	cp->trace2_hook_name = hook->command.buf;
> > +
> > +	/* reopen the file for stdin; run_command closes it. */
> > +	if (hook_cb->options->path_to_stdin) {
> > +		cp->no_stdin = 0;
> > +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> > +	} else {
> > +		cp->no_stdin = 1;
> > +	}
> > +
> > +	/*
> > +	 * Commands from the config could be oneliners, but we know
> > +	 * for certain that hookdir commands are not.
> > +	 */
> > +	if (hook->from_hookdir)
> > +		cp->use_shell = 0;
> > +	else
> > +		cp->use_shell = 1;
> > +
> > +	/* add command */
> > +	strvec_push(&cp->args, hook->command.buf);
> > +
> > +	/*
> > +	 * add passed-in argv, without expanding - let the user get back
> > +	 * exactly what they put in
> > +	 */
> > +	strvec_pushv(&cp->args, hook_cb->options->args.v);
> 
> I just skimmed over this setup-process-for-hook part - it would have
> been much clearer if it was refactored into its own function before this
> patch (or better yet, written as its own function in the first place).
> As it is, there are some unnecessary rewritings - e.g. setting stdin
> after env, and the use_shell setup.

Yeah, that makes sense. Will see if I can change it for next round :)

> 
> > diff --git a/hook.h b/hook.h
> 
> [snip]
> 
> > +/*
> > + * Callback provided to feed_pipe_fn and consume_sideband_fn.
> > + */
> > +struct hook_cb_data {
> > +	int rc;
> > +	struct list_head *head;
> > +	struct list_head *run_me;
> > +	struct run_hooks_opt *options;
> > +};
> 
> Could this be in hook.c instead?

It ends up being needed publicly by
https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
(receive-pack: convert receive hooks to hook.h), which writes its own
stdin provider callback. (In a later commit, a "void* options" gets
added to this struct.)

At that point it's needed because the run-command callback structure can
provide one context pointer for the overall work queue, and one context
pointer for the individual task; this one is the "overall work queue"
pointer.

From hook.h's perspective, the entire hook_cb_data is needed for
pick_next_hook; but run-command.h:run_processes_parallel() doesn't have
a way to tease out a smaller amount of the context pointer for various
callbacks. If we wanted to obfuscate "hook_cb_data" we'd need to add
another indirection and call back to hook.h first, who could then tease
out the client-provided context and then call the client callback, but
to me it sounds unnecessarily complex.

> 
> Also, I think it's clearer if run_me was a struct hook, and set to NULL
> when iteration reaches the end. If you disagree, I think it needs some
> documentation (e.g. "the embedded linked list part of the hook that must
> be run next; if equal to head, then iteration has ended" or something
> like that).

Yeah, I don't see a huge reason not to do that, sure.
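
A minimal sketch of that suggestion, assuming a plain singly linked list
in place of git's list.h and using invented "_sketch" names throughout:
run_me points at the next hook to start and becomes NULL once iteration
has ended, so no comparison against the list head is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified hook entry; the real struct hook embeds a list_head instead. */
struct hook_sketch {
	const char *command;
	struct hook_sketch *next;
};

struct hook_cb_data_sketch {
	struct hook_sketch *run_me;	/* next hook to start; NULL when done */
};

/* Return the hook to start now and advance, or NULL when none are left. */
static struct hook_sketch *pick_next_hook_sketch(struct hook_cb_data_sketch *cb)
{
	struct hook_sketch *hook;

	assert(cb);
	hook = cb->run_me;
	if (!hook)
		return NULL;
	cb->run_me = hook->next;
	return hook;
}
```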

> 
> > +#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
> >  	.env = STRVEC_INIT, 			\
> >  	.args = STRVEC_INIT, 			\
> >  	.path_to_stdin = NULL,			\
> > +	.jobs = 1,				\
> >  	.run_hookdir = configured_hookdir_opt()	\
> >  }
> 
> This is not used anywhere.

It is used in part II by hooks which are not able to be parallelized.

 - Emily

* Re: [PATCH v7 14/17] run-command: add stdin callback for parallelization
  2021-02-01  6:51             ` Jonathan Tan
@ 2021-02-22 23:38               ` Emily Shaffer
  2021-02-23 19:33                 ` Jonathan Tan
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2021-02-22 23:38 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sun, Jan 31, 2021 at 10:51:53PM -0800, Jonathan Tan wrote:
> 
> > If a user of the run_processes_parallel() API wants to pipe a large
> > amount of information to stdin of each parallel command, that
> > information could exceed the buffer of the pipe allocated for that
> > process's stdin.  Generally this is solved by repeatedly writing to
> > child_process.in between calls to start_command() and finish_command();
> > run_processes_parallel() did not provide users an opportunity to access
> > child_process at that time.
> 
> [snip]
> 
> > diff --git a/run-command.h b/run-command.h
> > index 6472b38bde..e058c0e2c8 100644
> > --- a/run-command.h
> > +++ b/run-command.h
> > @@ -436,6 +436,20 @@ typedef int (*start_failure_fn)(struct strbuf *out,
> >  				void *pp_cb,
> >  				void *pp_task_cb);
> >  
> > +/**
> > + * This callback is called repeatedly on every child process who requests
> > + * start_command() to create a pipe by setting child_process.in < 0.
> > + *
> > + * pp_cb is the callback cookie as passed into run_processes_parallel, and
> > + * pp_task_cb is the callback cookie as passed into get_next_task_fn.
> > + * The contents of 'pipe' will be written to the child's stdin pipe.
> > + *
> > + * Return nonzero to close the pipe.
> > + */
> > +typedef int (*feed_pipe_fn)(struct strbuf *pipe,
> > +			    void *pp_cb,
> > +			    void *pp_task_cb);
> > +
> 
> As you mention above in the commit message, I think the clearest API to
> support what we need is to just have a callback (that has access to
> child_process) that is executed between process start and finish.
> 
> As it is, I think this callback is too specific in that it takes a
> struct strbuf. I think that this struct strbuf will just end up being
> unnecessary copying much of the time, when the user could have just
> written to the fd directly.

Since the rest of the run_processes_parallel() API passes strings around
with strbufs, I'd prefer to leave it as-is to match the general API
expectations and style.

 - Emily

* Re: [PATCH v7 14/17] run-command: add stdin callback for parallelization
  2021-02-22 23:38               ` Emily Shaffer
@ 2021-02-23 19:33                 ` Jonathan Tan
  2021-03-10 18:24                   ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-23 19:33 UTC (permalink / raw)
  To: emilyshaffer; +Cc: jonathantanmy, git

> > > +/**
> > > + * This callback is called repeatedly on every child process who requests
> > > + * start_command() to create a pipe by setting child_process.in < 0.
> > > + *
> > > + * pp_cb is the callback cookie as passed into run_processes_parallel, and
> > > + * pp_task_cb is the callback cookie as passed into get_next_task_fn.
> > > + * The contents of 'pipe' will be written to the child's stdin pipe.
> > > + *
> > > + * Return nonzero to close the pipe.
> > > + */
> > > +typedef int (*feed_pipe_fn)(struct strbuf *pipe,
> > > +			    void *pp_cb,
> > > +			    void *pp_task_cb);
> > > +
> > 
> > As you mention above in the commit message, I think the clearest API to
> > support what we need is to just have a callback (that has access to
> > child_process) that is executed between process start and finish.
> > 
> > As it is, I think this callback is too specific in that it takes a
> > struct strbuf. I think that this struct strbuf will just end up being
> > unnecessary copying much of the time, when the user could have just
> > written to the fd directly.
> 
> Since the rest of the run_processes_parallel() API passes strings around
> with strbufs, I'd prefer to leave it as-is to match the general API
> expectations and style.
> 
>  - Emily

By the rest of the run_processes_parallel() API, do you mean
get_next_task_fn, start_failure_fn, and task_finished_fn? If yes, I
think that it makes sense for them to be strbuf, because buffering is
needed to avoid outputs from individual child processes interleaving,
but that's not true here.

Having said that, this is an internal API so we could just leave it
as-is and then refactor it if we ever need something more flexible.
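
For comparison, the more flexible shape alluded to here could look
something like the sketch below: a callback that is handed the child's
stdin descriptor and writes to it directly, with no intermediate buffer.
The typedef and function names are hypothetical, and the "nonzero closes
the pipe, negative means error" convention is an assumption made for the
sketch rather than anything proposed in the patches.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback shape: receives the fd instead of a strbuf. */
typedef int (*feed_fd_sketch_fn)(int fd, void *pp_cb, void *pp_task_cb);

/*
 * Write the whole message straight to the descriptor. Returns 1 once
 * everything has been sent (so the pipe can be closed) or -1 on a short
 * or failed write. pp_task_cb is assumed to be a NUL-terminated string.
 */
static int feed_message_sketch(int fd, void *pp_cb, void *pp_task_cb)
{
	const char *msg = pp_task_cb;
	size_t len;

	(void)pp_cb;
	assert(msg);
	len = strlen(msg);
	if (write(fd, msg, len) != (ssize_t)len)
		return -1;
	return 1;
}
```

The trade-off is that a direct write can block if the child is slow to
read, which a buffered, scheduler-driven design can sidestep.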

* Re: [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2021-02-01  7:04             ` Jonathan Tan
@ 2021-02-23 19:52               ` Emily Shaffer
  2021-02-25 20:56                 ` Jonathan Tan
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2021-02-23 19:52 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Sun, Jan 31, 2021 at 11:04:48PM -0800, Jonathan Tan wrote:
> 
> > In cases where a hook requires only a small amount of information via
> > stdin, it should be simple for users to provide a string_list alone. But
> > in more complicated cases where the stdin is too large to hold in
> > memory, let's provide a callback the users can populate line after line
> > with instead.
> 
> [snip]
> 
> > diff --git a/hook.h b/hook.h
> > index 8a7542610c..0ac83fa7ca 100644
> > --- a/hook.h
> > +++ b/hook.h
> > @@ -2,6 +2,7 @@
> >  #include "list.h"
> >  #include "strbuf.h"
> >  #include "strvec.h"
> > +#include "run-command.h"
> >  
> >  struct hook
> >  {
> > @@ -14,6 +15,12 @@ struct hook
> >  	/* The literal command to run. */
> >  	struct strbuf command;
> >  	int from_hookdir;
> > +
> > +	/*
> > +	 * Use this to keep state for your feed_pipe_fn if you are using
> > +	 * run_hooks_opt.feed_pipe. Otherwise, do not touch it.
> > +	 */
> > +	void *feed_pipe_cb_data;
> 
> When would we need per-hook state? I see in patch 14 that you give each
> running process little by little (in pp_buffer_stdin()), perhaps so that
> each hook can make progress at roughly the same pace, but I don't think
> we can expect all hooks to work the same, so I don't think it's worth
> complicating the design for all that.

I agree that this is a complicated way of doing it, and if you have a
better design I'd be really excited to hear it.

It seemed like this was what was necessary for hooks like
https://lore.kernel.org/git/20201222000435.1529768-15-emilyshaffer@google.com
where the hook and the invoking process talk back and forth, or like
https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
which generates stdin on the fly for hooks which cannot be parallelized
(and so won't run at the same pace).

The former example - proc-receive - does have a constraint that multiple
hooks can't be specified, so we could theoretically keep the old
implementation and just pick up the single hook's location from the new
hook library. But the latter example still makes me think this much
complexity is needed.

> 
> >  };
> >  
> >  /*
> > @@ -57,12 +64,24 @@ struct run_hooks_opt
> >  
> >  	/* Path to file which should be piped to stdin for each hook */
> >  	const char *path_to_stdin;
> > +	/* Pipe each string to stdin, separated by newlines */
> > +	struct string_list str_stdin;
> > +	/*
> > +	 * Callback and state pointer to ask for more content to pipe to stdin.
> > +	 * Will be called repeatedly, for each hook. See
> > +	 * hook.c:pipe_from_stdin() for an example. Keep per-hook state in
> > +	 * hook.feed_pipe_cb_data (per process). Keep initialization context in
> > +	 * feed_pipe_ctx (shared by all processes).
> > +	 */
> > +	feed_pipe_fn feed_pipe;
> > +	void *feed_pipe_ctx;
> 
> Instead of 3 fields, I think 2 suffice - the function and the data
> (called "ctx" here). We can supply a function that treats the data as a
> string_list.

Nice catch, sure.

 - Emily
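
To make the two-field idea concrete, here is a minimal sketch in which
the caller fills in only a function pointer and a context pointer, and a
library-supplied function interprets the context as a list of lines.
Every name ending in "_sketch" is invented for illustration, the
fixed-size buffer stands in for struct strbuf, and the array-plus-cursor
struct stands in for a string_list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed-size stand-in for git's struct strbuf. */
struct buf_sketch {
	char data[128];
	size_t len;
};

/* Stand-in for a string_list plus a read cursor. */
struct lines_ctx_sketch {
	const char **items;
	size_t nr;
	size_t next;
};

typedef int (*feed_sketch_fn)(struct buf_sketch *out, void *ctx);

/* The only two members a caller would need to set. */
struct stdin_opt_sketch {
	feed_sketch_fn feed_pipe;
	void *feed_pipe_ctx;
};

/* Emit one line per call, newline-terminated; nonzero means "close". */
static int feed_from_lines_sketch(struct buf_sketch *out, void *ctx)
{
	struct lines_ctx_sketch *lines = ctx;
	size_t n;

	assert(lines);
	out->len = 0;
	if (lines->next >= lines->nr)
		return 1;
	n = strlen(lines->items[lines->next]);
	if (n + 1 > sizeof(out->data))
		return 1;	/* too long for this toy buffer */
	memcpy(out->data, lines->items[lines->next], n);
	out->data[n] = '\n';
	out->len = n + 1;
	lines->next++;
	return 0;
}
```

Note that the cursor lives in the shared context here, so the sketch as
written only serves a single hook; feeding several hooks from the same
list is where a per-hook pointer such as feed_pipe_cb_data comes in.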

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-02-17 23:07                 ` Junio C Hamano
@ 2021-02-25 19:50                   ` Junio C Hamano
  2021-03-01 21:51                     ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2021-02-25 19:50 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Junio C Hamano <gitster@pobox.com> writes:

> Josh Steadmon <steadmon@google.com> writes:
>
>>> The topic branch has a lot more commits than these 17; I am
>>> wondering if the reviewed-by applies only to the bottom 17, or as
>>> the whole?  I recall that the upper half was expecting at least some
>>> documentation updates.
>>> 
>>> Thanks.
>>
>> Just to these 17, sorry for being unclear.
>
> Thanks for reading them through.
>
> I am tempted to say we should merge this "mechanism" part down to
> 'next', hoping that the "rewrite existing ones using the new
> mechanism" part can follow soon.

I said this on Feb 17th, but since then I think I saw you answer
"I'll do that" in responses to JTan's reviews in the past few days
(e.g. <YC7o2rUQOEdiMdqh@google.com>).  Would I regret if I merge the
topic down to 'next' today?

Thanks.

* Re: [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2021-02-23 19:52               ` Emily Shaffer
@ 2021-02-25 20:56                 ` Jonathan Tan
  2021-03-02  1:47                   ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2021-02-25 20:56 UTC (permalink / raw)
  To: emilyshaffer; +Cc: jonathantanmy, git

> > When would we need per-hook state? I see in patch 14 that you give each
> > running process little by little (in pp_buffer_stdin()), perhaps so that
> > each hook can make progress at roughly the same pace, but I don't think
> > we can expect all hooks to work the same, so I don't think it's worth
> > complicating the design for all that.
> 
> I agree that this is a complicated way of doing it, and if you have a
> better design I'd be really excited to hear it.
> 
> It seemed like this was what was necessary for hooks like
> https://lore.kernel.org/git/20201222000435.1529768-15-emilyshaffer@google.com
> where the hook and the invoking process talk back and forth, or like
> https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
> which generates stdin on the fly for hooks which cannot be parallelized
> (and so won't run at the same pace).
> 
> The former example - proc-receive - does have a constraint that multiple
> hooks can't be specified, so we could theoretically keep the old
> implementation and just pick up the single hook's location from the new
> hook library. But the latter example still makes me think this much
> complexity is needed.

Ah, I see. From your explanation, in these 2 cases, only one hook
executes at a time (in the former case, because there is only one hook,
and in the latter case, you said that the hooks cannot be parallelized).
So it seems to me that the global state (in struct run_hooks_opt) would
be sufficient to keep track of what's going on. (The feed_pipe_fn
function can use pp_cb to keep track of the last executing pp_task_cb
and then compare it against the new pp_task_cb, I think, to keep track
of when a new hook has started.)

Even in the case of multiple hooks run in series (as opposed to a single
hook), I would think that the reason they can't be run in parallel is
because the nature of execution of a hook depends on what happened
during the execution of the previous hook, which seems to me to be even
more reason to centralize the state in struct run_hooks_opt.

Having said that, if my suggestion of not having per-hook state makes
certain patches more complicated, then that might be reason enough to
have per-hook state. In that case, you should write "per-hook state,
though strictly not necessary, makes <case> simpler" (or something like
that) in the commit message.
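
The parenthetical above (use pp_cb to remember the last pp_task_cb and
notice when it changes) could be sketched as follows. This is only an
illustration under the stated assumption that one hook runs at a time
and that each hook has a distinct, non-NULL task cookie; the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Shared state, reachable through pp_cb; no per-hook allocation. */
struct run_state_sketch {
	void *last_task;	/* pp_task_cb seen on the previous call */
	int hooks_started;	/* how many distinct hooks have been fed */
	size_t offset;		/* progress through the source for this hook */
};

/*
 * Called once per feed opportunity. When the task cookie differs from
 * the one seen last time, a new hook has started, so progress is reset.
 */
static int feed_tracking_sketch(void *pp_cb, void *pp_task_cb)
{
	struct run_state_sketch *state = pp_cb;

	assert(state && pp_task_cb);
	if (state->last_task != pp_task_cb) {
		state->last_task = pp_task_cb;
		state->hooks_started++;
		state->offset = 0;
	}
	state->offset++;	/* pretend one unit of input was sent */
	return 0;
}
```

If two hooks were fed alternately, the cookie would change on every
call and the offset would keep resetting, which is why this approach
depends on serial execution.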

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-02-25 19:50                   ` Junio C Hamano
@ 2021-03-01 21:51                     ` Emily Shaffer
  2021-03-01 22:19                       ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2021-03-01 21:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On Thu, Feb 25, 2021 at 11:50:11AM -0800, Junio C Hamano wrote:
> 
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Josh Steadmon <steadmon@google.com> writes:
> >
> >>> The topic branch has a lot more commits than these 17; I am
> >>> wondering if the reviewed-by applies only to the bottom 17, or as
> >>> the whole?  I recall that the upper half was expecting at least some
> >>> documentation updates.
> >>> 
> >>> Thanks.
> >>
> >> Just to these 17, sorry for being unclear.
> >
> > Thanks for reading them through.
> >
> > I am tempted to say we should merge this "mechanism" part down to
> > 'next', hoping that the "rewrite existing ones using the new
> > mechanism" part can follow soon.
> 
> I said this on Feb 17th, but since then I think I saw you answer
> "I'll do that" in responses to JTan's reviews in the past few days
> (e.g. <YC7o2rUQOEdiMdqh@google.com>).  Would I regret if I merge the
> topic down to 'next' today?

Bah, I'm sorry I missed this - I had a broken mutt config and wasn't
seeing replies, my own fault. Argh.

I have some pretty significant changes from JTan's reviews, so I'd
prefer if you would wait since it would be tricky to turn them into a
patch commit now. But if you'd rather merge it and see a patch instead,
that is fine with me.

 - Emily

* Re: [PATCH v7 00/17] propose config-based hooks (part I)
  2021-03-01 21:51                     ` Emily Shaffer
@ 2021-03-01 22:19                       ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2021-03-01 22:19 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

>> I said this on Feb 17th, but since then I think I saw you answer
>> "I'll do that" in responses to JTan's reviews in the past few days
>> (e.g. <YC7o2rUQOEdiMdqh@google.com>).  Would I regret if I merge the
>> topic down to 'next' today?
>
> Bah, I'm sorry I missed this - I had a broken mutt config and wasn't
> seeing replies, my own fault. Argh.
>
> I have some pretty significant changes from JTan's reviews, so I'd
> prefer if you would wait since it would be tricky to turn them into a
> patch commit now. But if you'd rather merge it and see a patch instead,
> that is fine with me.

OK, I still have it outside 'next', I think.

* Re: [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2021-02-25 20:56                 ` Jonathan Tan
@ 2021-03-02  1:47                   ` Emily Shaffer
  2021-03-02 23:33                     ` Jonathan Tan
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2021-03-02  1:47 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Thu, Feb 25, 2021 at 12:56:11PM -0800, Jonathan Tan wrote:
> 
> > > When would we need per-hook state? I see in patch 14 that you give each
> > > running process little by little (in pp_buffer_stdin()), perhaps so that
> > > each hook can make progress at roughly the same pace, but I don't think
> > > we can expect all hooks to work the same, so I don't think it's worth
> > > complicating the design for all that.
> > 
> > I agree that this is a complicated way of doing it, and if you have a
> > better design I'd be really excited to hear it.
> > 
> > It seemed like this was what was necessary for hooks like
> > https://lore.kernel.org/git/20201222000435.1529768-15-emilyshaffer@google.com
> > where the hook and the invoking process talk back and forth, or like
> > https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
> > which generates stdin on the fly for hooks which cannot be parallelized
> > (and so won't run at the same pace).
> > 
> > The former example - proc-receive - does have a constraint that multiple
> > hooks can't be specified, so we could theoretically keep the old
> > implementation and just pick up the single hook's location from the new
> > hook library. But the latter example still makes me think this much
> > complexity is needed.
> 
> Ah, I see. From your explanation, in these 2 cases, only one hook
> executes at a time (in the former case, because there is only one hook,
> and in the latter case, you said that the hooks cannot be parallelized).
> So it seems to me that the global state (in struct run_hooks_opt) would
> be sufficient to keep track of what's going on. (The feed_pipe_fn
> function can use pp_cb to keep track of the last executing pp_task_cb
> and then compare it against the new pp_task_cb, I think, to keep track
> of when a new hook has started.)
> 
> Even in the case of multiple hooks run in series (as opposed to a single
> hook), I would think that the reason they can't be run in parallel is
> because the nature of execution of a hook depends on what happened
> during the execution of the previous hook, which seems to me to be even
> more reason to centralize the state in struct run_hooks_opt.
> 
> Having said that, if my suggestion of not having per-hook state makes
> certain patches more complicated, then that might be reason enough to
> have per-hook state. In that case, you should write "per-hook state,
> though strictly not necessary, makes <case> simpler" (or something like
> that) in the commit message.

Jonathan and I discussed this a little more offline and agreed to leave
the implementation as is.

Jonathan had suggested "have one callback invocation apply to all hooks
that are running now", either by having the callback iterate over the
task queue or by having the run-command lib take the result from the
callback and have *that* iterate over the task queue. The idea being,
one pointer to one copy of source material is easier to handle than
many.

I suggested that the callback's implementation of the second version of
that, where the library takes care of the "and do it for each task in
progress" part, would be pretty much identical to the callback's
implementation as it is in this patch, except that as it is here the
context pointer is per-task and as Jonathan suggests the context pointer
is per-entire-hook-invocation - so there isn't much complexity
difference between the two, from the user's perspective.

We also talked about cases where N=# of hooks > M=# of jobs, that is,
where some hooks must wait for other hooks to finish executing before
they could start. In this case, users' callback implementations would
need to be able to start over from the beginning of the source material,
and a long-running hook would block other short-running hooks from
beginning (because the long-running hook would be confused by hearing
the source material to its stdin again).

Hopefully this diagram illustrates better based on my understanding of
Jonathan's suggestion:

 A B <- "hey everyone, stdin 'foo'"
 A B <- "hey everyone, stdin 'bar'"
 A B <- "hey everyone, stdin 'baz'"
   B
   B
   B
   B
 C   <- "hey everyone, stdin 'foo'"
 C   <- "hey everyone, stdin 'bar'"
 C   <- "hey everyone, stdin 'baz'"
 C

Anyway, since the complexity is probably about the same to the end user
and using per-hook context means we don't have to wait like this, we
agreed to stick with the implementation as is.
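
To put numbers on the diagram, here is a rough sketch that models both approaches as a toy scheduler. It is not code from the series: `start_shared`, `start_per_hook`, `MAX_JOBS`, and the idea of measuring hook runtime in abstract "ticks" are all assumptions made for illustration. With durations A=3, B=7, C=4 and two jobs, it reproduces the picture above: C waits until tick 7 with a shared context, but can start at tick 3 with per-hook context.

```c
#include <stddef.h>

#define MAX_JOBS 16

/*
 * Shared context: the source material is broadcast to every running hook,
 * so a new pass can only begin once the whole previous batch has finished.
 * Returns the tick at which hook `idx` starts. Requires jobs > 0.
 */
static unsigned start_shared(const unsigned *dur, size_t idx, size_t jobs)
{
	unsigned t = 0;
	size_t b, i;

	for (b = 0; b < idx / jobs; b++) {
		unsigned longest = 0;

		/* The next batch waits for the slowest hook in this one. */
		for (i = b * jobs; i < (b + 1) * jobs; i++)
			if (dur[i] > longest)
				longest = dur[i];
		t += longest;
	}
	return t;
}

/*
 * Per-hook context: each hook has its own position in the source material,
 * so it can start as soon as any job slot is free.
 * Requires 0 < jobs <= MAX_JOBS.
 */
static unsigned start_per_hook(const unsigned *dur, size_t idx, size_t jobs)
{
	unsigned free_at[MAX_JOBS] = { 0 };
	unsigned start = 0;
	size_t i, s, best;

	for (i = 0; i <= idx; i++) {
		/* Pick the slot that becomes free earliest. */
		best = 0;
		for (s = 1; s < jobs; s++)
			if (free_at[s] < free_at[best])
				best = s;
		start = free_at[best];
		free_at[best] = start + dur[i];
	}
	return start;
}
```

Calling both functions with `dur = {3, 7, 4}`, `idx = 2`, and `jobs = 2` gives the start tick of hook C under each model.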

  - Emily


* Re: [PATCH v7 15/17] hook: provide stdin by string_list or callback
  2021-03-02  1:47                   ` Emily Shaffer
@ 2021-03-02 23:33                     ` Jonathan Tan
  0 siblings, 0 replies; 170+ messages in thread
From: Jonathan Tan @ 2021-03-02 23:33 UTC (permalink / raw)
  To: emilyshaffer; +Cc: jonathantanmy, git

> Jonathan and I discussed this a little more offline and agreed to leave
> the implementation as is.
> 
> Jonathan had suggested "have one callback invocation apply to all hooks
> that are running now", either by having the callback iterate over the
> task queue or by having the run-command lib take the result from the
> callback and have *that* iterate over the task queue. The idea being,
> one pointer to one copy of source material is easier to handle than
> many.
> 
> I suggested that the callback's implementation of the second version of
> that, where the library takes care of the "and do it for each task in
> progress" part, would be pretty much identical to the callback's
> implementation as it is in this patch, except that as it is here the
> context pointer is per-task and as Jonathan suggests the context pointer
> is per-entire-hook-invocation - so there isn't much complexity
> difference between the two, from the user's perspective.
> 
> We also talked about cases where N=# of hooks > M=# of jobs, that is,
> where some hooks must wait for other hooks to finish executing before
> they could start. In this case, users' callback implementations would
> need to be able to start over from the beginning of the source material,
> and a long-running hook would block other short-running hooks from
> beginning (because the long-running hook would be confused by hearing
> the source material to its stdin again).

Yes - this (number of hooks greater than number of jobs allowed to run
in parallel) was the case in which my suggestion of not having
hook-specific state would not work. The case we were talking about is
when there's a large amount of dynamically-generated data to be
transmitted to the hooks' stdins and I was thinking that it would be
best anyway if the callback looped over all hooks as data was generated,
but it would not be possible to only do a single pass if the number of
hooks is greater than the number of jobs.


* Re: [PATCH v7 14/17] run-command: add stdin callback for parallelization
  2021-02-23 19:33                 ` Jonathan Tan
@ 2021-03-10 18:24                   ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-03-10 18:24 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Feb 23, 2021 at 11:33:24AM -0800, Jonathan Tan wrote:
> 
> > > > +/**
> > > > + * This callback is called repeatedly on every child process who requests
> > > > + * start_command() to create a pipe by setting child_process.in < 0.
> > > > + *
> > > > + * pp_cb is the callback cookie as passed into run_processes_parallel, and
> > > > + * pp_task_cb is the callback cookie as passed into get_next_task_fn.
> > > > + * The contents of 'send' will be read into the pipe and passed to the pipe.
> > > > + *
> > > > + * Return nonzero to close the pipe.
> > > > + */
> > > > +typedef int (*feed_pipe_fn)(struct strbuf *pipe,
> > > > +			    void *pp_cb,
> > > > +			    void *pp_task_cb);
> > > > +
> > > 
> > > As you mention above in the commit message, I think the clearest API to
> > > support what we need is to just have a callback (that has access to
> > > child_process) that is executed between process start and finish.
> > > 
> > > As it is, I think this callback is too specific in that it takes a
> > > struct strbuf. I think that this struct strbuf will just end up being
> > > unnecessary copying much of the time, when the user could have just
> > > written to the fd directly.
> > 
> > Since the rest of the run_processes_parallel() API passes strings around
> > with strbufs, I'd prefer to leave it as-is to match the general API
> > expectations and style.
> > 
> >  - Emily
> 
> By the rest of the run_processes_parallel() API, do you mean
> get_next_task_fn, start_failure_fn, and task_finished_fn? If yes, I
> think that it makes sense for them to be strbuf, because buffering is
> needed to avoid outputs from individual child processes interleaving,
> but that's not true here.
> 
> Having said that, this is an internal API so we could just leave it
> as-is and then refactor it if we ever need something more flexible.

Yeah, with that in mind I'll leave it as it is. I don't like the idea of
directly exposing the child's pipe via callback; to me it feels like bad
object-oriented design, but that maybe doesn't apply here :)

Anyway, like you say, we can change it later if someone doesn't like
this, no problem.
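
For readers following along, a callback matching the quoted feed_pipe_fn signature might look roughly like the sketch below. Only the three-argument shape and the "return nonzero to close the pipe" rule come from the quoted patch. `struct strbuf` here is a tiny fixed-size stand-in, not Git's real strbuf, and `sketch_addstr`, `struct line_feed`, and `feed_lines` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for Git's strbuf, just enough for this sketch (hypothetical). */
struct strbuf {
	char buf[256];
	size_t len;
};

/* Append s to sb, truncating rather than overflowing the fixed buffer. */
static void sketch_addstr(struct strbuf *sb, const char *s)
{
	size_t n = strlen(s);

	if (n > sizeof(sb->buf) - 1 - sb->len)
		n = sizeof(sb->buf) - 1 - sb->len;
	memcpy(sb->buf + sb->len, s, n);
	sb->len += n;
	sb->buf[sb->len] = '\0';
}

/* Per-task cookie: which line this child should receive next (hypothetical). */
struct line_feed {
	const char *const *lines;
	size_t nr, next;
};

/* Same shape as the feed_pipe_fn typedef quoted above. */
static int feed_lines(struct strbuf *pipe, void *pp_cb, void *pp_task_cb)
{
	struct line_feed *feed = pp_task_cb;

	(void)pp_cb; /* the global cookie is unused in this sketch */
	if (feed->next >= feed->nr)
		return 1; /* nothing left: ask the caller to close the pipe */
	sketch_addstr(pipe, feed->lines[feed->next++]);
	sketch_addstr(pipe, "\n");
	return 0;
}
```

Because the position lives in the per-task cookie, each child gets its own pass over the lines regardless of when it was started.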


* Re: [PATCH v7 01/17] doc: propose hooks managed by the config
  2021-02-01 22:11             ` Junio C Hamano
@ 2021-03-10 19:30               ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2021-03-10 19:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Feb 01, 2021 at 02:11:53PM -0800, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > +ways, providing an avenue to deprecate these "legacy" hooks if desired. The
> > +handling is based on a config `hook.runHookDir`, which is checked against a
> > +number of cases:
> 
> Don't we want to also warn when the setting "no" or something
> similar prevents the legacy hook from running, to help users
> who wonder why their hook scripts are not running?  I.e.
> 
> > +- "no": the legacy hook will not be run
> 
> +- "warn-no": Git will print a warning to stderr before ignoring the
> +  legacy hook

Yeah, I think you are right. Jonathan N suggested such an option as
"error" (as opposed to warning); I'll add it here plus to the enum and
take your description verbatim. Thanks.

> 
> > +- "interactive": Git will prompt the user before running the legacy hook
> > +- "warn": Git will print a warning to stderr before running the legacy hook
> > +- "yes" (default): Git will silently run the legacy hook
> 
> > +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> > +given which Git does not recognize, Git should discard that config entry. For
> > +example, if "warn" was specified at system level and "junk" was specified at
> > +global level, Git would resolve the value to "warn"; if the only time the config
> > +was set was to "junk", Git would use the default value of "yes".
> 
> Hmph, instead of complaining "value 'junk' is not recognized" and
> erroring out?  Why?

I think for the exact case you're describing above - where I forgot some
useful combination of "run/don't run" and "warn/don't warn" (or, one
could foresee, "first" e.g. to run the legacy hook early on instead of
late, or "only" e.g. config hooks don't exist, etc etc), we add this
flag, and some enterprise (like Jonathan N's team) distributes the new
config to folks' system configs before 100% of machines are using the
newer Git executable. It's a long shot :) but I'd rather be flexible
than inflexible.

I will update the design doc anyway - I ended up implementing it as
"complain about 'junk' and then do default behavior".

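A minimal sketch of that "complain, then fall back to the default" behavior, assuming the value names discussed in this thread ("no", "error", "interactive", "warn", "yes"). The enum, its member names, the function name, and the warning text are all hypothetical and are not taken from the patch:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical enum; the real series may name these differently. */
enum hookdir_opt {
	HOOKDIR_NO,
	HOOKDIR_ERROR,
	HOOKDIR_INTERACTIVE,
	HOOKDIR_WARN,
	HOOKDIR_YES
};

/*
 * Map a hook.runHookDir value to an enum. An unset value silently yields
 * the default ("yes"); an unrecognized value is reported on stderr first.
 */
static enum hookdir_opt parse_run_hookdir(const char *value)
{
	if (!value)
		return HOOKDIR_YES;
	if (!strcmp(value, "no"))
		return HOOKDIR_NO;
	if (!strcmp(value, "error"))
		return HOOKDIR_ERROR;
	if (!strcmp(value, "interactive"))
		return HOOKDIR_INTERACTIVE;
	if (!strcmp(value, "warn"))
		return HOOKDIR_WARN;
	if (!strcmp(value, "yes"))
		return HOOKDIR_YES;

	fprintf(stderr,
		"warning: unrecognized hook.runHookDir value '%s'; using 'yes'\n",
		value);
	return HOOKDIR_YES;
}
```

With this shape, an older Git that sees a value introduced by a newer release keeps running legacy hooks instead of erroring out.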
> 
> > +[[stage-3]]
> > +==== Stage 3
> > +
> > +`.git/hooks` is removed from the template and the hook directory is considered
> > +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> > +not changed, and `find_hook()` is not removed.
> 
> Presumably, we'll have documentation somewhere that instructs users
> (who were taught by slashdot and other site to add certain scripts
> under their .git/hooks/) how to do the equivalent without adding
> scripts in .git/hooks/ directory and instead using the config
> mechanism (e.g. "when told to add script X in .git/hooks/, read such
> an instruction as if telling you to do Y instead") by the time this
> happens?  It probably makes sense to do so as part of stage-2, at
> which point the users are _ready_ to migrate.

Hm. Where should we distribute such documentation? git-hook.txt?
githooks.txt? I think it's a good idea, sure.

> 
> > +[[security]]
> > +=== Security and repo config
> > +
> > +Part of the motivation behind this refactor is to mitigate hooks as an attack
> > +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> > +however, as the design stands, users can still provide hooks in the repo-level
> > +config, which is included when a repo is zipped and sent elsewhere.  The
> > +security of the repo-level config is still under discussion; this design
> > +generally assumes the repo-level config is secure, which is not true yet. The
> > +goal is to avoid an overcomplicated design to work around a problem which has
> > +ceased to exist.
> 
> I doubt we want to claim anything about security as part of this
> series.  As you say in the paragraph, .git/config and .git/hooks/
> are equally (un)protected and if we decide to punt .git/config
> security, then not moving away from .git/hooks would not hurt
> security-wise, either (in other words, security is not a viable
> motivation behind this series).

Yeah, I think you are right that despite my best efforts to spin it that
way, it just isn't a security win :) Ah well...

> 
> And if we stop advertising 'security merit' that does not exist,
> what remains?  Isn't the biggest selling point that an identical set
> of hook configuration can be shared among multiple repositories, and
> it allows more than one hook script to be triggered by a single
> "hook event"?  There may be other good things we should be able to
> sell the new mechanism to our users, and we do stress on them, which
> is done in the motivation section.  So...

Ok.

I would like to keep this header and make it more clear that we didn't
fully succeed in the wish to make zip attacks impossible, but I will
remove it from the motivations section at the top. I also added a little
more clarity (I hope) on the other pieces in the motivation section.

> 
> > +.Comparison of alternatives
> > +|===
> > +|Feature |Config-based hooks |Hook directories |Status quo
> 
> Sorry, but I did not find this table particularly convincing.
> 
> The only thing I sense is a hand-wavy desire that "we could make it
> better than everybody else if we work on it in this area", which can
> apply equally for other approaches---they could enhance what they
> already have (e.g. "discoverability & documentation").
> 
> As a list of "these are the points we aspire to do better than other
> people", I think it is an excellent idea to have a table like this
> here in the documentation.  But that is not a "comparison".

Ok. I'll see what I can do to frame it better.

> 
> > +[[execution-ordering]]
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> > +
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> 
> OK.
> 
> Have we decided what we do for hooks whose interface is to feed
> their input from their standard input?  The current system, I think,
> just feeds the single hook by writing into a pipe to it, but if we
> were to drive multiple hooks, we'd need to write the same thing to
> each of these hook programs?  

Yep. There are a few ways to do it:
1. Provide a file to be used as stdin. This way we just hook up that
file's fd to each hook process's stdin fd. (Example:
https://lore.kernel.org/git/20201222000435.1529768-11-emilyshaffer@google.com
in builtin/am.c:run_post_rewrite_hook())
2. Provide a string_list in the run_hooks_opt struct; hook.c treats each
entry in the string_list as a line to stdin, separated by a newline, and
"replays" the list to each hook in order. (Example:
https://lore.kernel.org/git/20201222000435.1529768-12-emilyshaffer@google.com)
3. Set up your own more complicated version of (2) by writing your own
callback to provide "next line of stdin" based on a context pointer. The
context pointer is per-hook-task. (Example:
https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com)
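
As a rough illustration of option (2), the sketch below turns a list of entries into the text that would be replayed to each hook's stdin. It uses a plain array of C strings rather than Git's string_list, `join_lines` is a hypothetical helper, and ending the last entry with a newline as well is an assumption of this sketch:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Join nr entries into one malloc'd, NUL-terminated buffer, one entry per
 * line. Returns NULL on allocation failure; the caller frees the result.
 */
static char *join_lines(const char *const *items, size_t nr)
{
	size_t len = 0, i;
	char *buf, *p;

	for (i = 0; i < nr; i++)
		len += strlen(items[i]) + 1; /* +1 for the '\n' */

	buf = malloc(len + 1); /* +1 for the terminating NUL */
	if (!buf)
		return NULL;

	p = buf;
	for (i = 0; i < nr; i++) {
		size_t n = strlen(items[i]);

		memcpy(p, items[i], n);
		p += n;
		*p++ = '\n';
	}
	*p = '\0';
	return buf;
}
```

The same buffer can then be written to every hook in turn, which is what makes this simpler than the per-task callback in option (3).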

> 
> Do we have a plan to deal with hooks whose outcome is not just
> "yes/no", e.g. "proc-receive" hook that munges the list of refs to
> be updated and the new values for them, or "applypatch-msg" that
> munges the incoming proposed commit log message?  Does the second
> hook work on the result of the first hook?  Do the two hooks work on
> the vanilla state and their output have to agree with each other?

The only hook I found this way was 'proc-receive' itself - and in the
end, because it's somewhat interactive with two-way communication with
the caller, it didn't seem like there was a good way to reason about
multiple hooks, like you say. So in the 'proc-receive' case, we disallow
multiple hooks. See
https://lore.kernel.org/git/20201222000435.1529768-15-emilyshaffer@google.com.
> 
> > +[[parallelization]]
> > +=== Parallelization with dependencies
> > +
> > +Currently hooks use a naive parallelization scheme or are run in series.  But if
> > +one hook depends on another's output, then users will want to specify those
> > +dependencies.
> 
> An untold assumption here is that readers already know the answer to
> the questions I asked earlier about having more than one hook whose
> outcome is not just yes/no, and that the answer is "the outcome of the
> first hook is passed along (as if it were the input given by Git
> directly, if the first hook did not exist) to the second hook".  It
> should be spelled out somewhere before the execution ordering
> section, I think.

I'll make an explicit callout about that case, thanks.

 - Emily


end of thread, other threads:[~2021-03-10 19:31 UTC | newest]

Thread overview: 170+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
2020-05-22 10:13   ` Phillip Wood
2020-06-09 20:26     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
2020-05-22 10:27   ` Phillip Wood
2020-06-09 21:49     ` Emily Shaffer
2020-08-17 13:36       ` Phillip Wood
2020-05-24 23:00   ` Johannes Schindelin
2020-05-27 23:37     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
2020-05-24 23:00   ` Johannes Schindelin
2020-05-25  0:29     ` Johannes Schindelin
2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
2020-07-29 19:33     ` Junio C Hamano
2020-07-30 23:41       ` Junio C Hamano
2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
2020-09-23 22:59       ` Jonathan Tan
2020-09-24 21:54         ` Emily Shaffer
2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
2020-10-22  0:58         ` Emily Shaffer
2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
2020-10-29 15:38             ` Emily Shaffer
2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-05 23:24       ` Jonathan Nieder
2020-10-06 19:06         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
2020-09-11 13:27       ` Phillip Wood
2020-09-11 16:51         ` Emily Shaffer
2020-09-23 23:04       ` Jonathan Tan
2020-10-06 20:46         ` Emily Shaffer
2020-09-27 19:23       ` Martin Ågren
2020-10-06 20:20         ` Emily Shaffer
2020-10-05 23:27       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
2020-09-28 19:29       ` Josh Steadmon
2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
2020-10-05 23:30       ` Jonathan Nieder
2020-10-06  4:49         ` Junio C Hamano
2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
2020-09-11 13:30       ` Phillip Wood
2020-09-28 19:29       ` Josh Steadmon
2020-10-05 23:39       ` Jonathan Nieder
2020-10-06 22:57         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
2020-09-09 20:32       ` Junio C Hamano
2020-09-10 19:08         ` Emily Shaffer
2020-09-23 23:20       ` Jonathan Tan
2020-10-05 23:42       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
2020-09-10 13:50       ` Phillip Wood
2020-09-10 22:21         ` Junio C Hamano
2020-09-23 23:47       ` Jonathan Tan
2020-10-05 21:27         ` Emily Shaffer
2020-10-05 23:48           ` Jonathan Nieder
2020-10-06 19:08             ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
2020-09-10 14:16       ` Phillip Wood
2020-09-11 13:20         ` Phillip Wood
2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
2020-10-16 17:29           ` Junio C Hamano
2020-10-21 23:37           ` Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
2020-12-11 10:15           ` Phillip Wood
2020-12-15 21:41             ` Emily Shaffer
2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2020-12-08 21:09           ` SZEDER Gábor
2020-12-08 22:11             ` Emily Shaffer
2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
2020-12-16  0:56           ` Junio C Hamano
2020-12-16 20:16             ` Emily Shaffer
2020-12-16 23:32               ` Junio C Hamano
2020-12-18  2:07                 ` Emily Shaffer
2020-12-18  5:29                   ` Junio C Hamano
2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
2021-01-29 23:52               ` Emily Shaffer
2021-02-01 22:11             ` Junio C Hamano
2021-03-10 19:30               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
2021-01-31  3:10             ` Jonathan Tan
2021-02-09 21:06               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
2021-01-31  3:20             ` Jonathan Tan
2021-02-09 22:05               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
2021-01-31  3:35             ` Jonathan Tan
2021-02-09 22:31               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2021-01-31  3:40             ` Jonathan Tan
2021-02-09 22:57               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
2021-01-31  4:22             ` Jonathan Tan
2021-02-11 22:44               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2021-01-31  4:39             ` Jonathan Tan
2021-02-12 22:15               ` Emily Shaffer
2021-02-18 22:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2021-02-01  5:38             ` Jonathan Tan
2021-02-19 20:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
2021-02-01  6:04             ` Jonathan Tan
2021-02-22 21:46               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2021-02-01  6:51             ` Jonathan Tan
2021-02-22 23:38               ` Emily Shaffer
2021-02-23 19:33                 ` Jonathan Tan
2021-03-10 18:24                   ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2021-02-01  7:04             ` Jonathan Tan
2021-02-23 19:52               ` Emily Shaffer
2021-02-25 20:56                 ` Jonathan Tan
2021-03-02  1:47                   ` Emily Shaffer
2021-03-02 23:33                     ` Jonathan Tan
2020-12-22  0:02           ` [PATCH v7 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
2020-12-28 18:34             ` Emily Shaffer
2020-12-28 22:50               ` Junio C Hamano
2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
2020-12-28 22:39             ` Emily Shaffer
2021-01-29 23:59           ` [PATCH v7 00/17] propose config-based hooks (part I) Emily Shaffer
2021-02-16 19:46           ` Josh Steadmon
2021-02-16 22:47             ` Junio C Hamano
2021-02-17 21:21               ` Josh Steadmon
2021-02-17 23:07                 ` Junio C Hamano
2021-02-25 19:50                   ` Junio C Hamano
2021-03-01 21:51                     ` Emily Shaffer
2021-03-01 22:19                       ` Junio C Hamano
