git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v7 12/17] hook: allow parallel hook execution
Date: Mon, 22 Feb 2021 13:46:51 -0800	[thread overview]
Message-ID: <YDQmS2qN/lrm547L@google.com> (raw)
In-Reply-To: <20210201060422.1313603-1-jonathantanmy@google.com>

On Sun, Jan 31, 2021 at 10:04:22PM -0800, Jonathan Tan wrote:
> 
> > In many cases, there's no reason not to allow hooks to execute in
> > parallel. run_processes_parallel() is well-suited - it's a task queue
> > that runs its housekeeping in series, which means users don't
> > need to worry about thread safety on their callback data. True
> > multithreaded execution with the async_* functions isn't necessary here.
> > Synchronous hook execution can be achieved by only allowing 1 job to run
> > at a time.
> > 
> > Teach run_hooks() to use that function for simple hooks which don't
> > require stdin or capture of stderr.
> 
> Which hooks would be run in parallel, and which hooks in series? I don't
> see code that distinguishes between them.

It's up to the caller, who can set run_hooks_opt.jobs. In part II of
this series I made a guess at which ones should run in parallel or in
series and specified it in Documentation/githooks.txt.

> 
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> > 
> > Notes:
> >     Per AEvar's request - parallel hook execution on day zero.
> >     
> >     In most ways run_processes_parallel() worked great for me - but it didn't
> >     have great support for hooks where we pipe to and from. I had to add this
> >     support later in the series.
> >     
> >     Since I modified an existing and in-use library I'd appreciate a keen look on
> >     these patches.
> 
> What is the existing and in-use library that you're modifying?

Hm, this note wasn't super specific. From this point onwards in the
series I make changes to the run-command.h:run_processes_parallel()
library, although not in this commit itself. I think I meant "from here
on out, help me look at run-command.h".

I'll try to make the note a little better next series, sorry for the
confusion :) :)

> 
> > @@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
> >  	strvec_clear(&o->args);
> >  }
> >  
> > +
> > +static int pick_next_hook(struct child_process *cp,
> > +			  struct strbuf *out,
> > +			  void *pp_cb,
> > +			  void **pp_task_cb)
> > +{
> > +	struct hook_cb_data *hook_cb = pp_cb;
> > +
> > +	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
> > +
> > +	if (hook_cb->head == hook_cb->run_me)
> > +		return 0;
> > +
> > +	cp->env = hook_cb->options->env.v;
> > +	cp->stdout_to_stderr = 1;
> > +	cp->trace2_hook_name = hook->command.buf;
> > +
> > +	/* reopen the file for stdin; run_command closes it. */
> > +	if (hook_cb->options->path_to_stdin) {
> > +		cp->no_stdin = 0;
> > +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> > +	} else {
> > +		cp->no_stdin = 1;
> > +	}
> > +
> > +	/*
> > +	 * Commands from the config could be oneliners, but we know
> > +	 * for certain that hookdir commands are not.
> > +	 */
> > +	if (hook->from_hookdir)
> > +		cp->use_shell = 0;
> > +	else
> > +		cp->use_shell = 1;
> > +
> > +	/* add command */
> > +	strvec_push(&cp->args, hook->command.buf);
> > +
> > +	/*
> > +	 * add passed-in argv, without expanding - let the user get back
> > +	 * exactly what they put in
> > +	 */
> > +	strvec_pushv(&cp->args, hook_cb->options->args.v);
> 
> I just skimmed over this setup-process-for-hook part - it would have
> been much clearer if it was refactored into its own function before this
> patch (or better yet, written as its own function in the first place).
> As it is, there are some unnecessary rewritings - e.g. setting stdin
> after env, and the use_shell setup.

Yeah, that makes sense. Will see if I can change it for next round :)

> 
> > diff --git a/hook.h b/hook.h
> 
> [snip]
> 
> > +/*
> > + * Callback provided to feed_pipe_fn and consume_sideband_fn.
> > + */
> > +struct hook_cb_data {
> > +	int rc;
> > +	struct list_head *head;
> > +	struct list_head *run_me;
> > +	struct run_hooks_opt *options;
> > +};
> 
> Could this be in hook.c instead?

It ends up being needed publicly by
https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
(receive-pack: convert receive hooks to hook.h), which writes its own
stdin provider callback. (In a later commit, a "void* options" gets
added to this struct.)

At that point it's needed because the run-command callback structure can
provide one context pointer for the overall work queue, and one context
pointer for the individual task; this one is the "overall work queue"
pointer.

From hook.h's perspective, the entire hook_cb_data is needed for
pick_next_hook; but run-command.h:run_processes_parallel() doesn't have
a way to tease out a smaller amount of the context pointer for various
callbacks. If we wanted to obfuscate "hook_cb_data" we'd need to add
another indirection and call back to hook.h first, who could then tease
out the client-provided context and then call the client callback, but
to me it sounds unnecessarily complex.

> 
> Also, I think it's clearer if run_me was a struct hook, and set to NULL
> when iteration reaches the end. If you disagree, I think it needs some
> documentation (e.g. "the embedded linked list part of the hook that must
> be run next; if equal to head, then iteration has ended" or something
> like that).

Yeah, I don't see a huge reason not to do that, sure.

> 
> > +#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
> >  	.env = STRVEC_INIT, 			\
> >  	.args = STRVEC_INIT, 			\
> >  	.path_to_stdin = NULL,			\
> > +	.jobs = 1,				\
> >  	.run_hookdir = configured_hookdir_opt()	\
> >  }
> 
> This is not used anywhere.

It is used in part II by hooks which are not able to be parallelized.

 - Emily

  reply	other threads:[~2021-02-22 21:47 UTC|newest]

Thread overview: 170+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
2020-05-22 10:13   ` Phillip Wood
2020-06-09 20:26     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
2020-05-22 10:27   ` Phillip Wood
2020-06-09 21:49     ` Emily Shaffer
2020-08-17 13:36       ` Phillip Wood
2020-05-24 23:00   ` Johannes Schindelin
2020-05-27 23:37     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
2020-05-24 23:00   ` Johannes Schindelin
2020-05-25  0:29     ` Johannes Schindelin
2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
2020-07-29 19:33     ` Junio C Hamano
2020-07-30 23:41       ` Junio C Hamano
2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
2020-09-23 22:59       ` Jonathan Tan
2020-09-24 21:54         ` Emily Shaffer
2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
2020-10-22  0:58         ` Emily Shaffer
2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
2020-10-29 15:38             ` Emily Shaffer
2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-05 23:24       ` Jonathan Nieder
2020-10-06 19:06         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
2020-09-11 13:27       ` Phillip Wood
2020-09-11 16:51         ` Emily Shaffer
2020-09-23 23:04       ` Jonathan Tan
2020-10-06 20:46         ` Emily Shaffer
2020-09-27 19:23       ` Martin Ågren
2020-10-06 20:20         ` Emily Shaffer
2020-10-05 23:27       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
2020-09-28 19:29       ` Josh Steadmon
2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
2020-10-05 23:30       ` Jonathan Nieder
2020-10-06  4:49         ` Junio C Hamano
2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
2020-09-11 13:30       ` Phillip Wood
2020-09-28 19:29       ` Josh Steadmon
2020-10-05 23:39       ` Jonathan Nieder
2020-10-06 22:57         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
2020-09-09 20:32       ` Junio C Hamano
2020-09-10 19:08         ` Emily Shaffer
2020-09-23 23:20       ` Jonathan Tan
2020-10-05 23:42       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
2020-09-10 13:50       ` Phillip Wood
2020-09-10 22:21         ` Junio C Hamano
2020-09-23 23:47       ` Jonathan Tan
2020-10-05 21:27         ` Emily Shaffer
2020-10-05 23:48           ` Jonathan Nieder
2020-10-06 19:08             ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
2020-09-10 14:16       ` Phillip Wood
2020-09-11 13:20         ` Phillip Wood
2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
2020-10-16 17:29           ` Junio C Hamano
2020-10-21 23:37           ` Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
2020-12-11 10:15           ` Phillip Wood
2020-12-15 21:41             ` Emily Shaffer
2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2020-12-08 21:09           ` SZEDER Gábor
2020-12-08 22:11             ` Emily Shaffer
2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
2020-12-16  0:56           ` Junio C Hamano
2020-12-16 20:16             ` Emily Shaffer
2020-12-16 23:32               ` Junio C Hamano
2020-12-18  2:07                 ` Emily Shaffer
2020-12-18  5:29                   ` Junio C Hamano
2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
2021-01-29 23:52               ` Emily Shaffer
2021-02-01 22:11             ` Junio C Hamano
2021-03-10 19:30               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
2021-01-31  3:10             ` Jonathan Tan
2021-02-09 21:06               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
2021-01-31  3:20             ` Jonathan Tan
2021-02-09 22:05               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
2021-01-31  3:35             ` Jonathan Tan
2021-02-09 22:31               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2021-01-31  3:40             ` Jonathan Tan
2021-02-09 22:57               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
2021-01-31  4:22             ` Jonathan Tan
2021-02-11 22:44               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2021-01-31  4:39             ` Jonathan Tan
2021-02-12 22:15               ` Emily Shaffer
2021-02-18 22:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2021-02-01  5:38             ` Jonathan Tan
2021-02-19 20:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
2021-02-01  6:04             ` Jonathan Tan
2021-02-22 21:46               ` Emily Shaffer [this message]
2020-12-22  0:02           ` [PATCH v7 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2021-02-01  6:51             ` Jonathan Tan
2021-02-22 23:38               ` Emily Shaffer
2021-02-23 19:33                 ` Jonathan Tan
2021-03-10 18:24                   ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2021-02-01  7:04             ` Jonathan Tan
2021-02-23 19:52               ` Emily Shaffer
2021-02-25 20:56                 ` Jonathan Tan
2021-03-02  1:47                   ` Emily Shaffer
2021-03-02 23:33                     ` Jonathan Tan
2020-12-22  0:02           ` [PATCH v7 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
2020-12-28 18:34             ` Emily Shaffer
2020-12-28 22:50               ` Junio C Hamano
2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
2020-12-28 22:39             ` Emily Shaffer
2021-01-29 23:59           ` [PATCH v7 00/17] propose config-based hooks (part I) Emily Shaffer
2021-02-16 19:46           ` Josh Steadmon
2021-02-16 22:47             ` Junio C Hamano
2021-02-17 21:21               ` Josh Steadmon
2021-02-17 23:07                 ` Junio C Hamano
2021-02-25 19:50                   ` Junio C Hamano
2021-03-01 21:51                     ` Emily Shaffer
2021-03-01 22:19                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDQmS2qN/lrm547L@google.com \
    --to=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).