All of lore.kernel.org
 help / color / mirror / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Phillip Wood" <phillip.wood123@gmail.com>,
	"Anthony Sottile" <asottile@umich.edu>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: git 2.36.0 regression: pre-commit hooks no longer have stdout/stderr as tty
Date: Wed, 20 Apr 2022 10:09:38 -0700	[thread overview]
Message-ID: <CAJoAoZm7p32Hn=TLQeWUqp_nMjo_TQ2whR4F=cXk4c6PV1M5bA@mail.gmail.com> (raw)
In-Reply-To: <xmqqr15rr9k6.fsf@gitster.g>

On Wed, Apr 20, 2022 at 9:42 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
> >> It *is* true that run-command.c:pp_start_one() sets child_process:err=-1
> >> for the child and run-command.c:run_hook_ve() didn't do that; that -1
> >> means that start_command() will create a new fd for the child's stderr.
> >> Since run_hook_ve() didn't care about the child's stderr before, I
> >> wonder if that is why? Could it be that now that we're processing the
> >> child's stderr, the child no longer thinks stderr is in tty, because the
> >> parent is consuming its output?
> >
> > Exactly, stderr is redirected to a pipe so that we can buffer the
> > output from each process and then write it to the real stdout when the
> > process has finished to avoid the output from different processes
> > getting mixed together. Ideally in this case we'd see that stdout is a
> > tty and create a pty rather than a pipe when buffering the output from
> > the process.
>
> Ah, thanks, and sigh.  That means this was an unintended regression
> caused by use of parallel infrastructure, mixed with a bit of "the
> original problem report wrote hook properly so that when it is not
> connected to a terminal (such as in this new implementation) it
> refrains to do terminal-y things like coloring, so everything is
> working as intended" ;-).
>
> IIRC, the parallel subprocess stuff was invented to spawn multiple
> tasks we internally need (like "checkout these submodules") that are
> not interactive (hence does not need access to stdin) en masse, and
> the output buffering is there to avoid interleaving the output that
> would make it unreadable.
>
> Use of the parallel subprocess API means that we inherently cannot
> give access to the standard input to the hooks.  The users of the
> original run_hooks_ve() API would be OK with that, because it did
> .no_stdin=1 before the problematic hooks API rewrite, but I wonder
> what our plans should be for hooks that want to go interactive.
> They could open /dev/tty themselves (and that would have been the
> only way to go interactive even in the old world order, so it is
> perfectly acceptable to keep it that way with .no_stdin=1), but if
> they run in parallel, the end-user would not know whom they are
> typing to (and which output lines are the prompts they are expected
> to respond to).
>
> In the longer term, there are multiple possible action items.
>
>  * We probably would want to design a bit better anti-interleaving
>    machinery than "buffer everything and show only after the process
>    exists", if we want to keep using the parallel subprocess API.
>    And that would help the original "do this thing in multiple
>    submodules at the same time" use case, too.

I've noticed this too, but for very noisy things which are
parallelized, I'm not sure a better user experience is possible. I
suppose we could pick the "first" job in the task queue and print that
output as it comes in, so that users are aware that *something* is
happening?

[job 0 starts]
[job 1 starts]
job 0 says 0-foo
[job 1 says 1-foo, but it's buffered]
job 0 says 0-bar
[job 1 says 1-bar, but it's buffered]
[job 0 finishes]
[we replay the buffer from job 1 so far:]
job 1 says 1-foo
job 1 says 1-bar
job 1 says 1-baz
[job 1 finishes]

I think it could be possible, but then job 1 still will never learn
that it's a tty, because it's being buffered to prevent interleaving,
even if we have the illusion of non-buffering.

>
>  * We should teach hooks API to make it _optional_ to use the
>    parallel subprocess API.  If we are not spawning hooks in
>    parallel today, there is no reason to incur this regression by
>    using the parallel subprocess API---this was a needress bug, and
>    I am angry.

To counter, I think that having hooks invoked via two different
mechanisms depending on how many are provided or whether they are
parallelized is a mess to debug and maintain. I still stand by the
decision to use the parallel subprocess API, which I think was
reasonable to expect to do the same thing when jobs=1, and I think we
should continue to do so. It simplifies the hook code significantly.

>
>  * the hooks API should learn a mechanism for multiple hooks to
>    coordinate their executions.  Perhaps they indicate their
>    preference if they are OK to be run in parallel, and those that
>    want isolation will be run one-at-a-time before or after others
>    run in parallel, or something.

There is such a mechanism for hooks overall, but not yet for
individual hooks. I know we discussed it at length[1] before, and
decided it would be okay to figure this out later on. I suppose "later
on" may have come :)

>
>  * The hooks API should learn a mechanism for us to tell what
>    execution environment they are in.  Ideally, the hooks, if it is
>    sane to run under the parallel subprocess API, shouldn't have
>    been learning if they are talking to an interactive human user by
>    looking at isatty(), but we should have been explicitly telling
>    them that they are, perhaps by exporting an environment
>    variable.  There may probably be more clue hooks writers want
>    other than "am I talking to human user?" that we would want to
>    enumerate before going this route.

Hm. I was going to mention that Ævar and I discussed the possibility
of setting an environment variable for hook child processes, telling
them which hook they are being run as - e.g.
"GIT_HOOK=prepare-commit-msg" - but I suppose that relying on that
alone doesn't tell us anything about whether the parent is being run
in tty. I agree it could be very useful to simply pass
GIT_PARENT_ISATTY to hooks (and I suppose other child processes).
Could we simply do that from start_command() or something else deep in
run-command.h machinery? Then Anthony's use case becomes

if [-t 1|| GIT_PARENT_ISATTY]
 ...

and no need to examine Git version.

 - Emily

1: https://lore.kernel.org/git/20210527000856.695702-2-emilyshaffer%40google.com
under "Parallelization with dependencies" (and preceding
conversations)

  reply	other threads:[~2022-04-20 17:09 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-19 18:59 git 2.36.0 regression: pre-commit hooks no longer have stdout/stderr as tty Anthony Sottile
2022-04-19 23:37 ` Emily Shaffer
2022-04-19 23:52   ` Anthony Sottile
2022-04-20  9:00   ` Phillip Wood
2022-04-20 12:25     ` Ævar Arnfjörð Bjarmason
2022-04-20 16:22       ` Emily Shaffer
2022-04-20 16:42     ` Junio C Hamano
2022-04-20 17:09       ` Emily Shaffer [this message]
2022-04-20 17:25         ` Junio C Hamano
2022-04-20 17:41           ` Emily Shaffer
2022-04-21 12:03             ` Ævar Arnfjörð Bjarmason
2022-04-21 17:24               ` Junio C Hamano
2022-04-21 18:40                 ` Junio C Hamano
2022-04-20  4:23 ` Junio C Hamano
2022-04-21 12:25 ` [PATCH 0/6] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Ævar Arnfjörð Bjarmason
2022-04-21 12:25   ` [PATCH 1/6] run-command API: replace run_processes_parallel_tr2() with opts struct Ævar Arnfjörð Bjarmason
2022-04-23  4:24     ` Junio C Hamano
2022-04-28 23:16     ` Emily Shaffer
2022-04-29 16:44       ` Junio C Hamano
2022-04-21 12:25   ` [PATCH 2/6] run-command tests: test stdout of run_command_parallel() Ævar Arnfjörð Bjarmason
2022-04-23  4:24     ` Junio C Hamano
2022-04-21 12:25   ` [PATCH 3/6] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-04-23  3:54     ` Junio C Hamano
2022-04-28 23:26     ` Emily Shaffer
2022-04-21 12:25   ` [PATCH 4/6] hook tests: fix redirection logic error in 96e7225b310 Ævar Arnfjörð Bjarmason
2022-04-23  3:54     ` Junio C Hamano
2022-04-21 12:25   ` [PATCH 5/6] hook API: don't redundantly re-set "no_stdin" and "stdout_to_stderr" Ævar Arnfjörð Bjarmason
2022-04-29 22:54     ` Junio C Hamano
2022-04-21 12:25   ` [PATCH 6/6] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-04-28 23:31     ` Emily Shaffer
2022-04-29 23:09     ` Junio C Hamano
2022-04-21 17:35   ` [PATCH 0/6] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Junio C Hamano
2022-04-21 18:50     ` Ævar Arnfjörð Bjarmason
2022-05-18 20:05   ` [PATCH v2 0/8] " Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 1/8] run-command tests: change if/if/... to if/else if/else Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 2/8] run-command API: use "opts" struct for run_processes_parallel{,_tr2}() Ævar Arnfjörð Bjarmason
2022-05-18 21:45       ` Junio C Hamano
2022-05-25 13:18       ` Emily Shaffer
2022-05-18 20:05     ` [PATCH v2 3/8] run-command tests: test stdout of run_command_parallel() Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 4/8] run-command.c: add an initializer for "struct parallel_processes" Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 5/8] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-05-18 21:51       ` Junio C Hamano
2022-05-26 17:18       ` Emily Shaffer
2022-05-27 16:08         ` Junio C Hamano
2022-05-18 20:05     ` [PATCH v2 6/8] hook tests: fix redirection logic error in 96e7225b310 Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 7/8] hook API: don't redundantly re-set "no_stdin" and "stdout_to_stderr" Ævar Arnfjörð Bjarmason
2022-05-18 20:05     ` [PATCH v2 8/8] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-05-18 21:53       ` Junio C Hamano
2022-05-26 17:23       ` Emily Shaffer
2022-05-26 18:23         ` Ævar Arnfjörð Bjarmason
2022-05-26 18:54           ` Emily Shaffer
2022-05-25 11:30     ` [PATCH v2 0/8] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Johannes Schindelin
2022-05-25 13:00       ` Ævar Arnfjörð Bjarmason
2022-05-25 16:57       ` Junio C Hamano
2022-05-26  1:10         ` Junio C Hamano
2022-05-26 10:16           ` Ævar Arnfjörð Bjarmason
2022-05-26 16:36             ` Junio C Hamano
2022-05-26 19:13               ` Ævar Arnfjörð Bjarmason
2022-05-26 19:56                 ` Junio C Hamano
2022-05-27  9:14     ` [PATCH v3 0/2] " Ævar Arnfjörð Bjarmason
2022-05-27  9:14       ` [PATCH v3 1/2] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-05-27 16:58         ` Junio C Hamano
2022-05-27  9:14       ` [PATCH v3 2/2] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-05-27 17:17       ` [PATCH v3 0/2] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Junio C Hamano
2022-05-31 17:32       ` [PATCH v4 " Ævar Arnfjörð Bjarmason
2022-05-31 17:32         ` [PATCH v4 1/2] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-06-01 16:49           ` Johannes Schindelin
2022-06-01 17:09             ` Junio C Hamano
2022-05-31 17:32         ` [PATCH v4 2/2] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-06-01 16:50           ` Johannes Schindelin
2022-06-01 16:53         ` [PATCH v4 0/2] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Johannes Schindelin
2022-06-02 14:07         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
2022-06-02 14:07           ` [PATCH v5 1/2] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-06-02 14:07           ` [PATCH v5 2/2] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-06-02 20:05           ` [PATCH v5 0/2] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Junio C Hamano
2022-06-03  8:51           ` Phillip Wood
2022-06-03  9:20             ` Ævar Arnfjörð Bjarmason
2022-06-03 13:21               ` Phillip Wood
2022-06-07  8:49                 ` Ævar Arnfjörð Bjarmason
2022-06-07  8:48           ` [PATCH v6 " Ævar Arnfjörð Bjarmason
2022-06-07  8:48             ` [PATCH v6 1/2] run-command: add an "ungroup" option to run_process_parallel() Ævar Arnfjörð Bjarmason
2022-06-17  0:07               ` Emily Shaffer
2022-06-07  8:48             ` [PATCH v6 2/2] hook API: fix v2.36.0 regression: hooks should be connected to a TTY Ævar Arnfjörð Bjarmason
2022-06-07 17:08               ` Junio C Hamano
2022-06-07 17:02             ` [PATCH v6 0/2] hook API: connect hooks to the TTY again, fixes a v2.36.0 regression Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJoAoZm7p32Hn=TLQeWUqp_nMjo_TQ2whR4F=cXk4c6PV1M5bA@mail.gmail.com' \
    --to=emilyshaffer@google.com \
    --cc=asottile@umich.edu \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.