* [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
@ 2022-01-24 18:56 Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
                   ` (12 more replies)
  0 siblings, 13 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin


Background
==========

Recent patches intended to help readers figure out CI failures much quicker
than before. Unfortunately, they haven't been entirely positive for me. For
example, they broke the branch protections in Microsoft's fork of Git, where
we require Pull Requests to pass a certain set of Checks (which are
identified by their names) and therefore caused follow-up work.

Using CI and in general making it easier for new contributors is an area I'm
passionate about, and one I'd like to see improved.


The current situation
=====================

Let me walk you through the current experience when a PR build fails: I get
a notification mail that only says that a certain job failed. There's no
indication of which test failed (or was it the build?). I can click on a
link and it takes me to the workflow run. Once there, all it says is "Process
completed with exit code 1", or even "code 2". Sure, I can click on one of
the failed jobs. It even expands the failed step's log (collapsing the other
steps). And what do I see there?

Let's look at an example of a failed linux-clang (ubuntu-latest) job
[https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:

[...]
Test Summary Report
-------------------
t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
  Failed test:  49
  Non-zero exit status: 1
t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
  TODO passed:   45, 47
Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
Result: FAIL
make[1]: *** [Makefile:53: prove] Error 1
make[1]: Leaving directory '/home/runner/work/git/git/t'
make: *** [Makefile:3018: test] Error 2


That's it. I count myself lucky not to be a new contributor being faced with
something like this.

Now, since I have been active in the Git project for a couple of days or so, I can
make sense of the "TODO passed" label and know that for the purpose of
fixing the build failures, I need to ignore this, and that I need to focus
on the "Failed test" part instead.

I also know that I do not have to get myself an ubuntu-latest box just to
reproduce the error, I do not even have to check out the code and run it
just to learn what that "49" means.

I know, and I do not expect any new contributor, not even most seasoned
contributors to know, that I have to patiently collapse the "Run
ci/run-build-and-tests.sh" job's log, and instead expand the "Run
ci/print-test-failures.sh" job log (which did not fail and hence does not
draw any attention to it).

I know, and again: I do not expect many others to know this, that I then
have to click into the "Search logs" box (not the regular web browser's
search via Ctrl+F!) and type in "not ok" to find the log of the failed test
case (and this might still be a "known broken" one that is marked via
test_expect_failure and once again needs to be ignored).

To be excessively clear: This is not a great experience!


Improved output
===============

Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
even showed flaky tests, and trends e.g. how long the test cases ran. When I
ported Git's CI over to GitHub workflows (to make CI more accessible to new
contributors), I knew full well that we would leave this very nice UI
behind, and I had hoped that we would get something similar back via new,
community-contributed GitHub Actions that can be used in GitHub workflows.
However, most likely because we use a home-grown test framework implemented
in opinionated POSIX shell scripts, that did not happen.

So I had a look at what standards exist e.g. when testing PowerShell
modules, in the way of marking up their test output in GitHub workflows, and
I was not disappointed: GitHub workflows support "grouping" of output lines,
i.e. marking sections of the output as a group that is then collapsed by
default and can be expanded. And it is this feature I decided to use in this
patch series, along with GitHub workflows' commands to display errors or
notices that are also shown on the summary page of the workflow run. Now, in
addition to "Process completed with exit code" on the summary page, we also
read something like:

⊗ linux-gcc (ubuntu-latest)
   failed: t9800.20 submit from detached head


Even better, this message is a link, and following that, the reader is
presented with something like this
[https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:

⏵ Run ci/run-build-and-tests.sh
⏵ CI setup
  + ln -s /home/runner/none/.prove t/.prove
  + run_tests=t
  + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
  + group Build make
  + set +x
⏵ Build
⏵ Run tests
  === Failed test: t9800-git-p4-basic ===
⏵ ok: t9800.1 start p4d
⏵ ok: t9800.2 add p4 files
⏵ ok: t9800.3 basic git p4 clone
⏵ ok: t9800.4 depot typo error
⏵ ok: t9800.5 git p4 clone @all
⏵ ok: t9800.6 git p4 sync uninitialized repo
⏵ ok: t9800.7 git p4 sync new branch
⏵ ok: t9800.8 clone two dirs
⏵ ok: t9800.9 clone two dirs, @all
⏵ ok: t9800.10 clone two dirs, @all, conflicting files
⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
⏵ ok: t9800.12 clone using non-numeric revision ranges
⏵ ok: t9800.13 clone with date range, excluding some changes
⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
⏵ ok: t9800.15 exit gracefully for p4 server errors
⏵ ok: t9800.16 clone --bare should make a bare repository
⏵ ok: t9800.17 initial import time from top change time
⏵ ok: t9800.18 unresolvable host in P4PORT should display error
⏵ ok: t9800.19 run hook p4-pre-submit before submit
  Error: failed: t9800.20 submit from detached head
⏵ failure: t9800.20 submit from detached head 
  Error: failed: t9800.21 submit from worktree
⏵ failure: t9800.21 submit from worktree 
  === Failed test: t9801-git-p4-branch ===
  [...]


The "Failed test:" lines are colored in yellow to give a better visual clue
about the logs' structure, the "Error:" label is colored in red to draw the
attention to the important part of the log, and the "⏵" characters indicate
that part of the log is collapsed and can be expanded by clicking on it.
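
For readers unfamiliar with the markup behind this output: GitHub workflows
treat specially formatted output lines as workflow commands. Below is a minimal
sketch of how one test case's log could be wrapped. The helper
`emit_test_case` and its arguments are made up for this illustration and are
not part of the patch series; only the `::error::`, `::group::` and
`::endgroup::` commands are GitHub's.

```shell
# Hypothetical helper, for illustration only: print one test case's log
# as a collapsible group, preceded by an error annotation on failure.
emit_test_case () {
	result=$1 name=$2 log=$3
	if test failure = "$result"
	then
		# `::error::` lines are also shown on the run's summary page
		echo "::error::failed: $name"
	fi
	echo "::group::$result: $name"
	printf '%s\n' "$log"
	echo "::endgroup::"
}

emit_test_case ok "t9800.19 run hook p4-pre-submit before submit" "ok 19"
emit_test_case failure "t9800.20 submit from detached head" "not ok 20"
```

Outside of a GitHub workflow run, these lines are printed verbatim, which
makes such a helper easy to try out locally.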

To drill down, the reader merely needs to expand the (failed) test case's
log by clicking on it, and then study the log. If needed (e.g. when the test
case relies on side effects from previous test cases), the logs of preceding
test cases can be expanded as well. In this example, when expanding
t9800.20, it looks like this (for ease of reading, I cut a few chunks of
lines, indicated by "[...]"):

[...]
⏵ ok: t9800.19 run hook p4-pre-submit before submit
  Error: failed: t9800.20 submit from detached head
⏷ failure: t9800.20 submit from detached head 
      test_when_finished cleanup_git &&
      git p4 clone --dest="$git" //depot &&
        (
          cd "$git" &&
          git checkout p4/master &&
          >detached_head_test &&
          git add detached_head_test &&
          git commit -m "add detached_head" &&
          git config git-p4.skipSubmitEdit true &&
          git p4 submit &&
            git p4 rebase &&
            git log p4/master | grep detached_head
        )
    [...]
    Depot paths: //depot/
    Import destination: refs/remotes/p4/master
    
    Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
    Perforce db files in '.' will be created if missing...
    
    Traceback (most recent call last):
      File "/home/runner/work/git/git/git-p4", line 4455, in <module>
        main()
      File "/home/runner/work/git/git/git-p4", line 4449, in main
        if not cmd.run(args):
      File "/home/runner/work/git/git/git-p4", line 2590, in run
        rebase.rebase()
      File "/home/runner/work/git/git/git-p4", line 4121, in rebase
        if len(read_pipe("git diff-index HEAD --")) > 0:
      File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
        retcode, out, err = read_pipe_full(c, *k, **kw)
      File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
        p = subprocess.Popen(
      File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
    error: last command exited with $?=1
    + cleanup_git
    + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + nr_tries_left=60
    + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + test 1 -ne 1
    + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + nr_tries_left=60
    + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + exit 1
    + eval_ret=1
    + :
    not ok 20 - submit from detached head
    #    
    #        test_when_finished cleanup_git &&
    #        git p4 clone --dest="$git" //depot &&
    #        (
    #            cd "$git" &&
    #            git checkout p4/master &&
    #            >detached_head_test &&
    #            git add detached_head_test &&
    #            git commit -m "add detached_head" &&
    #            git config git-p4.skipSubmitEdit true &&
    #            git p4 submit &&
    #            git p4 rebase &&
    #            git log p4/master | grep detached_head
    #        )
    #    
  Error: failed: t9800.21 submit from worktree
  [...]


Is this the best UI we can have for test failures in CI runs? I hope we can
do better. Having said that, this patch series presents a pretty good start,
and offers a basis for future improvements.

Johannes Schindelin (9):
  ci: fix code style
  ci/run-build-and-tests: take a more high-level view
  ci: make it easier to find failed tests' logs in the GitHub workflow
  ci/run-build-and-tests: add some structure to the GitHub workflow
    output
  tests: refactor --write-junit-xml code
  test(junit): avoid line feeds in XML attributes
  ci: optionally mark up output in the GitHub workflow
  ci: use `--github-workflow-markup` in the GitHub workflow
  ci: call `finalize_test_case_output` a little later

 .github/workflows/main.yml           |  12 ---
 ci/lib.sh                            |  81 ++++++++++++++--
 ci/run-build-and-tests.sh            |  11 ++-
 ci/run-test-slice.sh                 |   5 +-
 t/test-lib-functions.sh              |   4 +-
 t/test-lib-github-workflow-markup.sh |  50 ++++++++++
 t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
 t/test-lib.sh                        | 128 ++++----------------------
 8 files changed, 287 insertions(+), 136 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh
 create mode 100644 t/test-lib-junit.sh


base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1117
-- 
gitgitgadget


* [PATCH 1/9] ci: fix code style
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In b92cb86ea14 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31), a function was introduced with a code style
that is different from the surrounding code: it added the opening curly
brace on its own line, when all the existing functions in the same file
cuddle that brace on the same line as the function name.

Let's make the code style consistent again.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 9d28ab50fb4..ebb502640fa 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -69,8 +69,7 @@ skip_good_tree () {
 	exit 0
 }
 
-check_unignored_build_artifacts ()
-{
+check_unignored_build_artifacts () {
 	! git ls-files --other --exclude-standard --error-unmatch \
 		-- ':/*' 2>/dev/null ||
 	{
-- 
gitgitgadget



* [PATCH 2/9] ci/run-build-and-tests: take a more high-level view
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-24 23:22   ` Eric Sunshine
  2022-01-24 18:56 ` [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the web UI of GitHub workflows, failed runs are presented with the
job step that failed auto-expanded. In the current setup, this is not
helpful at all because that shows only the output of `prove`, which says
which test failed, but not in what way.

What would help the reader understand what went wrong is the verbose
test output of the failed test.

The logs of the failed runs do contain that verbose test output, but it
is shown in the _next_ step (which is marked as succeeding, and is
therefore _not_ auto-expanded). Anyone not intimately familiar with this
would completely miss the verbose test output, being left mostly
puzzled with the test failures.

We are about to show the failed test cases' output in the _same_ step,
so that the user has a much easier time figuring out what went wrong.

But first, we must partially revert the change that tried to improve the
CI runs by combining the `Makefile` targets to build into a single
`make` invocation. That might have sounded like a good idea at the time,
but it does make it rather impossible for the CI script to determine
whether the _build_ failed, or the _tests_. If the tests were run at
all, that is.

So let's go back to calling `make` for the build, and call `make test`
separately so that we can easily detect that _that_ invocation failed,
and react appropriately.
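
The benefit of two separate invocations can be sketched as follows. This is an
illustration only: `build`, `run_test_suite` and `run_ci` are hypothetical
stand-ins (for `make`, `make test` and the CI script, respectively) so that
the sketch runs without a Git checkout, and the return codes are chosen
arbitrarily.

```shell
# Stand-ins for `make` and `make test`, so that the sketch runs anywhere.
build () { true; }
run_test_suite () { false; }

run_ci () {
	# Each phase is checked on its own, so the caller can tell
	# whether the build or the test suite failed.
	build || {
		echo "build failed"
		return 2
	}
	run_test_suite || {
		echo "tests failed"
		return 1
	}
	echo "all good"
}

run_ci || echo "exit code: $?"
```

With the stand-ins as defined here, this prints "tests failed" followed by
"exit code: 1". With a single combined invocation, both kinds of failure
would be reported the same way.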

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/run-build-and-tests.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 280dda7d285..b70373c172f 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-export MAKE_TARGETS="all test"
+run_tests=t
 
 case "$jobname" in
 linux-gcc)
@@ -41,14 +41,18 @@ pedantic)
 	# Don't run the tests; we only care about whether Git can be
 	# built.
 	export DEVOPTS=pedantic
-	export MAKE_TARGETS=all
+	run_tests=
 	;;
 esac
 
 # Any new "test" targets should not go after this "make", but should
 # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
 # start running tests.
-make $MAKE_TARGETS
+make
+if test -n "$run_tests"
+then
+	make test
+fi
 check_unignored_build_artifacts
 
 save_good_tree
-- 
gitgitgadget



* [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-25 23:48   ` Ævar Arnfjörð Bjarmason
  2022-01-24 18:56 ` [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

You currently have to know a lot of implementation details when
investigating test failures in the CI runs. The first step is easy: the
failed job is marked quite clearly, but when opening it, the failed step
is expanded, which in our case is the one running
`ci/run-build-and-tests.sh`. This step, most notably, only offers a
high-level view of what went wrong: it prints the output of `prove`
which merely tells the reader which test script failed.

The actually interesting part is in the detailed log of said failed
test script. But that log is shown in the CI run's step that runs
`ci/print-test-failures.sh`. And that step is _not_ expanded in the web
UI by default.

Let's help the reader by showing the failed tests' detailed logs in the
step that is expanded automatically, i.e. directly after the test suite
failed.

This also helps the situation where the _build_ failed and the
`print-test-failures` step was executed under the assumption that the
_test suite_ failed, and consequently failed to find any failed tests.

An alternative way to implement this patch would be to source
`ci/print-test-failures.sh` in the `handle_failed_tests` function to
show these logs. However, over the course of the next few commits, we
want to introduce some grouping which would be harder to achieve that
way (for example, we do want a leaner, and colored, preamble for each
failed test script, and it would be trickier to accommodate the lack of
nested groupings in GitHub workflows' output).
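
The loop in `handle_failed_tests` derives each test script's name from the
path of its `.exit` file via two parameter expansions. A minimal sketch of
just that step, wrapped in a helper function whose name is made up for this
illustration:

```shell
# Illustration only: derive a test script's name from the path of its
# `.exit` file, the way a loop over t/test-results/*.exit could.
test_name_from_exit_file () {
	name="${1%.exit}"	# strip the `.exit` suffix
	name="${name##*/}"	# strip the leading directories
	printf '%s\n' "$name"
}

test_name_from_exit_file t/test-results/t9800-git-p4-basic.exit
```

This prints `t9800-git-p4-basic`, which is the name used to locate both the
`.out` file and the trash directory of the failed test script.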

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/main.yml | 12 ------------
 ci/lib.sh                  | 23 +++++++++++++++++++++++
 ci/run-build-and-tests.sh  |  3 ++-
 ci/run-test-slice.sh       |  3 ++-
 4 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index c35200defb9..3fa88b78b6d 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -119,10 +119,6 @@ jobs:
     - name: test
       shell: bash
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -204,10 +200,6 @@ jobs:
       env:
         NO_SVN_TESTS: 1
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -261,8 +253,6 @@ jobs:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -292,8 +282,6 @@ jobs:
     - uses: actions/checkout@v1
     - run: ci/install-docker-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v1
diff --git a/ci/lib.sh b/ci/lib.sh
index ebb502640fa..2b2c0932320 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -78,6 +78,10 @@ check_unignored_build_artifacts () {
 	}
 }
 
+handle_failed_tests () {
+	return 1
+}
+
 # GitHub Action doesn't set TERM, which is required by tput
 export TERM=${TERM:-dumb}
 
@@ -123,6 +127,25 @@ then
 	CI_JOB_ID="$GITHUB_RUN_ID"
 	CC="${CC:-gcc}"
 	DONT_SKIP_TAGS=t
+	handle_failed_tests () {
+		mkdir -p t/failed-test-artifacts
+		echo "FAILED_TEST_ARTIFACTS=t/failed-test-artifacts" >>$GITHUB_ENV
+
+		for test_exit in t/test-results/*.exit
+		do
+			test 0 != "$(cat "$test_exit")" || continue
+
+			test_name="${test_exit%.exit}"
+			test_name="${test_name##*/}"
+			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
+			cat "t/test-results/$test_name.out"
+
+			trash_dir="t/trash directory.$test_name"
+			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
+			tar czf t/failed-test-artifacts/"$test_name".trash.tar.gz "$trash_dir"
+		done
+		return 1
+	}
 
 	cache_dir="$HOME/none"
 
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index b70373c172f..e49f9eaa8c0 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -51,7 +51,8 @@ esac
 make
 if test -n "$run_tests"
 then
-	make test
+	make test ||
+	handle_failed_tests
 fi
 check_unignored_build_artifacts
 
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index f8c2c3106a2..63358c23e11 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -12,6 +12,7 @@ esac
 
 make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
-	tr '\n' ' ')"
+	tr '\n' ' ')" ||
+handle_failed_tests
 
 check_unignored_build_artifacts
-- 
gitgitgadget



* [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-02-23 12:13   ` Phillip Wood
  2022-01-24 18:56 ` [PATCH 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The current output of Git's GitHub workflow can be quite confusing,
especially for contributors new to the project.

To make it more helpful, let's introduce some collapsible grouping.
Initially, readers will see the high-level view of what actually
happened (did the build fail, or the test suite?). To drill down, the
respective group can be expanded.

Note: sadly, workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details),
therefore we take pains to end any previous group before starting a new
one.
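
One way to guarantee that groups never nest is to remember whether a group is
currently open. The following sketch is a variation for illustration, not the
code of this patch: here, `begin_group` closes any open group itself, and the
variable name `group_open` is an assumption of the sketch.

```shell
# Sketch only: a flag remembers whether a group is open, so that a new
# group can close the previous one instead of nesting inside it.
group_open=

begin_group () {
	end_group
	group_open=t
	echo "::group::$1"
}

end_group () {
	test -n "$group_open" || return 0
	group_open=
	echo "::endgroup::"
}

begin_group "CI setup"
echo "setting up"
begin_group "Build"	# implicitly ends "CI setup" first
echo "building"
end_group
end_group		# no group is open: does nothing
```

Because `end_group` is a no-op when no group is open, it is safe to call it
defensively, e.g. from an `EXIT` trap.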

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh                 | 55 ++++++++++++++++++++++++++++++++++-----
 ci/run-build-and-tests.sh |  4 +--
 ci/run-test-slice.sh      |  2 +-
 3 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2b2c0932320..4ed8f40ab02 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,49 @@
 # Library of functions shared by all CI scripts
 
+if test true != "$GITHUB_ACTIONS"
+then
+	begin_group () { :; }
+	end_group () { :; }
+
+	group () {
+		shift
+		"$@"
+	}
+	set -x
+else
+	begin_group () {
+		need_to_end_group=t
+		echo "::group::$1" >&2
+		set -x
+	}
+
+	end_group () {
+		test -n "$need_to_end_group" || return 0
+		set +x
+		need_to_end_group=
+		echo '::endgroup::' >&2
+	}
+	trap end_group EXIT
+
+	group () {
+		set +x
+		begin_group "$1"
+		shift
+		"$@"
+		res=$?
+		end_group
+		return $res
+	}
+
+	begin_group "CI setup"
+fi
+
+# Set 'exit on error' for all CI scripts to let the caller know that
+# something went wrong.
+# Set tracing executed commands, primarily setting environment variables
+# and installing dependencies.
+set -e
+
 skip_branch_tip_with_tag () {
 	# Sometimes, a branch is pushed at the same time the tag that points
 	# at the same commit as the tip of the branch is pushed, and building
@@ -88,12 +132,6 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-# Set 'exit on error' for all CI scripts to let the caller know that
-# something went wrong.
-# Set tracing executed commands, primarily setting environment variables
-# and installing dependencies.
-set -ex
-
 if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
 then
 	CI_TYPE=azure-pipelines
@@ -138,7 +176,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			cat "t/test-results/$test_name.out"
+			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -234,3 +272,6 @@ linux-leaks)
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
+
+end_group
+set -x
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index e49f9eaa8c0..5516f45f7fe 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -48,10 +48,10 @@ esac
 # Any new "test" targets should not go after this "make", but should
 # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
 # start running tests.
-make
+group Build make
 if test -n "$run_tests"
 then
-	make test ||
+	group "Run tests" make test ||
 	handle_failed_tests
 fi
 check_unignored_build_artifacts
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index 63358c23e11..a3c67956a8d 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-make --quiet -C t T="$(cd t &&
+group "Run tests" make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
 	tr '\n' ' ')" ||
 handle_failed_tests
-- 
gitgitgadget



* [PATCH 5/9] tests: refactor --write-junit-xml code
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-26  0:10   ` Ævar Arnfjörð Bjarmason
  2022-01-24 18:56 ` [PATCH 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.

Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.

This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.

This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.
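
The abstraction boils down to no-op functions that an output format can
redefine. A minimal sketch of that pattern follows; the `demo_` helpers and
the arguments passed to the hooks are made up for this illustration and do
not match the real functions' signatures.

```shell
# Default hooks do nothing ...
start_test_case_output () { :; }
finalize_test_case_output () { :; }

# ... until an output format redefines them. In the real series this
# happens by sourcing a file; here the override is inlined, and the
# `demo_` names are made up for this sketch.
demo_enable_format () {
	start_test_case_output () { echo "begin $1"; }
	finalize_test_case_output () { echo "end $1: $2"; }
}

demo_run_test_case () {
	start_test_case_output "$1"
	finalize_test_case_output "$1" ok
}

demo_run_test_case first	# prints nothing
demo_enable_format
demo_run_test_case second	# prints "begin second" and "end second: ok"
```

The caller (`demo_run_test_case` here) never needs to know which output
format, if any, is active.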

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-junit.sh | 126 ++++++++++++++++++++++++++++++++++++++++++++
 t/test-lib.sh       | 124 ++++++-------------------------------------
 2 files changed, 142 insertions(+), 108 deletions(-)
 create mode 100644 t/test-lib-junit.sh

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
new file mode 100644
index 00000000000..9d55d74d764
--- /dev/null
+++ b/t/test-lib-junit.sh
@@ -0,0 +1,126 @@
+# Library of functions to format test scripts' output in JUnit XML
+# format, to support Git's test suite result to be presented in an
+# easily digestible way on Azure Pipelines.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when the user asks
+# for JUnit XML; these functions will then override (empty) functions
+# that are called at the appropriate times during the test runs.
+
+start_test_output () {
+	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
+	mkdir -p "$junit_xml_dir"
+	junit_xml_base=${1##*/}
+	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
+	junit_attrs="name=\"${junit_xml_base%.sh}\""
+	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
+		date +%Y-%m-%dT%H:%M:%S)\""
+	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
+	junit_suite_start=$(test-tool date getnanos)
+	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+	then
+		GIT_TEST_TEE_OFFSET=0
+	fi
+}
+
+start_test_case_output () {
+	junit_start=$(test-tool date getnanos)
+}
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	ok)
+		set "$*"
+		;;
+	failure)
+		junit_insert="<failure message=\"not ok $test_count -"
+		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode \
+			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+			   then
+				test-tool path-utils skip-n-bytes \
+					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+			   else
+				printf '%s\n' "$@" | sed 1d
+			   fi)")"
+		junit_insert="$junit_insert</failure>"
+		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+		then
+			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
+				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
+		fi
+		set "$1" "      $junit_insert"
+		;;
+	fixed)
+		set "$* (breakage fixed)"
+		;;
+	broken)
+		set "$* (known breakage)"
+		;;
+	skip)
+		message="$(xml_attr_encode "$skipped_reason")"
+		set "$1" "      <skipped message=\"$message\" />"
+		;;
+	esac
+
+	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	shift
+	junit_attrs="$junit_attrs classname=\"$this_test\""
+	junit_attrs="$junit_attrs time=\"$(test-tool \
+		date getnanos $junit_start)\""
+	write_junit_xml "$(printf '%s\n' \
+		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
+	junit_have_testcase=t
+}
+
+finalize_test_output () {
+	if test -n "$junit_xml_path"
+	then
+		test -n "$junit_have_testcase" || {
+			junit_start=$(test-tool date getnanos)
+			write_junit_xml_testcase "all tests skipped"
+		}
+
+		# adjust the overall time
+		junit_time=$(test-tool date getnanos $junit_suite_start)
+		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
+			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
+			-e '/^ *<\/testsuite/d' \
+			<"$junit_xml_path" >"$junit_xml_path.new"
+		mv "$junit_xml_path.new" "$junit_xml_path"
+
+		write_junit_xml "  </testsuite>" "</testsuites>"
+		write_junit_xml=
+	fi
+}
+
+write_junit_xml () {
+	case "$1" in
+	--truncate)
+		>"$junit_xml_path"
+		junit_have_testcase=
+		shift
+		;;
+	esac
+	printf '%s\n' "$@" >>"$junit_xml_path"
+}
+
+xml_attr_encode () {
+	printf '%s\n' "$@" | test-tool xml-encode
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0f7a137c7d8..e13e1cb9124 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -107,6 +107,12 @@ mark_option_requires_arg () {
 	store_arg_to=$2
 }
 
+# These functions can be overridden e.g. to output JUnit XML
+start_test_output () { :; }
+start_test_case_output () { :; }
+finalize_test_case_output () { :; }
+finalize_test_output () { :; }
+
 parse_option () {
 	local opt="$1"
 
@@ -166,7 +172,7 @@ parse_option () {
 		tee=t
 		;;
 	--write-junit-xml)
-		write_junit_xml=t
+		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
 	--stress)
 		stress=t ;;
@@ -613,7 +619,7 @@ exec 6<&0
 exec 7>&2
 
 _error_exit () {
-	finalize_junit_xml
+	finalize_test_output
 	GIT_EXIT_OK=t
 	exit 1
 }
@@ -723,35 +729,13 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$*"
-	fi
+	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
 }
 
 test_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
-		junit_insert="$junit_insert $(xml_attr_encode \
-			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-			   then
-				test-tool path-utils skip-n-bytes \
-					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
-			   else
-				printf '%s\n' "$@" | sed 1d
-			   fi)")"
-		junit_insert="$junit_insert</failure>"
-		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-		then
-			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
-				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
-		fi
-		write_junit_xml_testcase "$1" "      $junit_insert"
-	fi
+	finalize_test_case_output failure "$@"
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
@@ -760,19 +744,13 @@ test_failure_ () {
 }
 
 test_known_broken_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (breakage fixed)"
-	fi
+	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
 }
 
 test_known_broken_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (known breakage)"
-	fi
+	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
 }
@@ -1049,10 +1027,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	if test -n "$write_junit_xml"
-	then
-		junit_start=$(test-tool date getnanos)
-	fi
+	start_test_case_output
 }
 
 test_finish_ () {
@@ -1103,12 +1078,7 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		if test -n "$write_junit_xml"
-		then
-			message="$(xml_attr_encode "$skipped_reason")"
-			write_junit_xml_testcase "$1" \
-				"      <skipped message=\"$message\" />"
-		fi
+		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
@@ -1124,53 +1094,6 @@ test_at_end_hook_ () {
 	:
 }
 
-write_junit_xml () {
-	case "$1" in
-	--truncate)
-		>"$junit_xml_path"
-		junit_have_testcase=
-		shift
-		;;
-	esac
-	printf '%s\n' "$@" >>"$junit_xml_path"
-}
-
-xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
-}
-
-write_junit_xml_testcase () {
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
-	shift
-	junit_attrs="$junit_attrs classname=\"$this_test\""
-	junit_attrs="$junit_attrs time=\"$(test-tool \
-		date getnanos $junit_start)\""
-	write_junit_xml "$(printf '%s\n' \
-		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
-	junit_have_testcase=t
-}
-
-finalize_junit_xml () {
-	if test -n "$write_junit_xml" && test -n "$junit_xml_path"
-	then
-		test -n "$junit_have_testcase" || {
-			junit_start=$(test-tool date getnanos)
-			write_junit_xml_testcase "all tests skipped"
-		}
-
-		# adjust the overall time
-		junit_time=$(test-tool date getnanos $junit_suite_start)
-		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
-			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
-			-e '/^ *<\/testsuite/d' \
-			<"$junit_xml_path" >"$junit_xml_path.new"
-		mv "$junit_xml_path.new" "$junit_xml_path"
-
-		write_junit_xml "  </testsuite>" "</testsuites>"
-		write_junit_xml=
-	fi
-}
-
 test_atexit_cleanup=:
 test_atexit_handler () {
 	# In a succeeding test script 'test_atexit_handler' is invoked
@@ -1193,7 +1116,7 @@ test_done () {
 	# removed, so the commands can access pidfiles and socket files.
 	test_atexit_handler
 
-	finalize_junit_xml
+	finalize_test_output
 
 	if test -z "$HARNESS_ACTIVE"
 	then
@@ -1484,22 +1407,7 @@ fi
 # in subprocesses like git equals our $PWD (for pathname comparisons).
 cd -P "$TRASH_DIRECTORY" || exit 1
 
-if test -n "$write_junit_xml"
-then
-	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
-	mkdir -p "$junit_xml_dir"
-	junit_xml_base=${0##*/}
-	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
-	junit_attrs="name=\"${junit_xml_base%.sh}\""
-	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
-		date +%Y-%m-%dT%H:%M:%S)\""
-	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
-	junit_suite_start=$(test-tool date getnanos)
-	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-	then
-		GIT_TEST_TEE_OFFSET=0
-	fi
-fi
+start_test_output "$0"
 
 # Convenience
 # A regexp to match 5 and 35 hexdigits
-- 
gitgitgadget



* [PATCH 6/9] test(junit): avoid line feeds in XML attributes
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the test case's output, we do want newline characters, but in the XML
attributes we do not want them.

However, the `xml_attr_encode` function always adds a Line Feed at the
end (which is then encoded as `&#x0a;`), even for XML attributes.

This seems not to faze Azure Pipelines' XML parser, but it still is
incorrect, so let's fix it.
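
To illustrate the difference without `test-tool`, here is a minimal sketch
that only counts the bytes each `printf` form hands to the encoder. The
`count_bytes` helper is hypothetical and not part of the test library.

```shell
# Hypothetical helper (not part of t/test-lib-junit.sh): count the bytes
# arriving on stdin, stripping the padding some `wc` versions add.
count_bytes () {
	wc -c | tr -d ' '
}

# Default mode: a Line Feed is appended after the argument.
with_lf=$(printf '%s\n' "skip reason" | count_bytes)

# `--no-lf` mode: the argument is passed through as-is.
without_lf=$(printf '%s' "skip reason" | count_bytes)

echo "with LF: $with_lf bytes, without LF: $without_lf bytes"
```

The one-byte difference is the Line Feed that would otherwise end up
encoded inside the attribute value.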

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-junit.sh | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
index 9d55d74d764..c959183c7e2 100644
--- a/t/test-lib-junit.sh
+++ b/t/test-lib-junit.sh
@@ -50,7 +50,7 @@ finalize_test_case_output () {
 		;;
 	failure)
 		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode --no-lf "$1")\">"
 		junit_insert="$junit_insert $(xml_attr_encode \
 			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
 			   then
@@ -74,12 +74,12 @@ finalize_test_case_output () {
 		set "$* (known breakage)"
 		;;
 	skip)
-		message="$(xml_attr_encode "$skipped_reason")"
+		message="$(xml_attr_encode --no-lf "$skipped_reason")"
 		set "$1" "      <skipped message=\"$message\" />"
 		;;
 	esac
 
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	junit_attrs="name=\"$(xml_attr_encode --no-lf "$this_test.$test_count $1")\""
 	shift
 	junit_attrs="$junit_attrs classname=\"$this_test\""
 	junit_attrs="$junit_attrs time=\"$(test-tool \
@@ -122,5 +122,11 @@ write_junit_xml () {
 }
 
 xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
+	if test "x$1" = "x--no-lf"
+	then
+		shift
+		printf '%s' "$*" | test-tool xml-encode
+	else
+		printf '%s\n' "$@" | test-tool xml-encode
+	fi
 }
-- 
gitgitgadget



* [PATCH 7/9] ci: optionally mark up output in the GitHub workflow
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

A couple of commands exist to spruce up the output in GitHub workflows:
https://docs.github.com/en/actions/learn-github-actions/workflow-commands-for-github-actions

In addition to the `::group::<label>`/`::endgroup::` commands (which we
already use to structure the output of the build step better), we also
use `::error::`/`::notice::` to draw attention to test failures and
to test cases that were expected to fail but didn't.
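
As a rough sketch of what such markup looks like on standard output (the
helper names below are made up for illustration; only the `::group::`,
`::endgroup::`, `::error::` and `::notice::` command syntax is taken from
the documentation linked above):

```shell
# Hypothetical helpers; they merely print workflow commands to stdout.
begin_group () {
	printf '::group::%s\n' "$1"
}

end_group () {
	printf '::endgroup::\n'
}

annotate () {
	# $1 is the level ("error" or "notice"), $2 the message
	printf '::%s::%s\n' "$1" "$2"
}

# Emit an annotation followed by a collapsible group holding the log.
report_failure () {
	annotate error "failed: $1"
	begin_group "failure: $1"
	printf '%s\n' "$2"
	end_group
}

report_failure "t0000.1 example" "+ test a = b"
```

The annotation is what surfaces in the run summary, while the group keeps
the verbose log collapsed until the reader asks for it.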

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-functions.sh              |  4 +--
 t/test-lib-github-workflow-markup.sh | 50 ++++++++++++++++++++++++++++
 t/test-lib.sh                        |  5 ++-
 3 files changed, 56 insertions(+), 3 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index c3d38aaccbd..b5fe5f66085 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -719,7 +719,7 @@ test_verify_prereq () {
 }
 
 test_expect_failure () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-failure"
@@ -739,7 +739,7 @@ test_expect_failure () {
 }
 
 test_expect_success () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-success"
diff --git a/t/test-lib-github-workflow-markup.sh b/t/test-lib-github-workflow-markup.sh
new file mode 100644
index 00000000000..d8dc969df4a
--- /dev/null
+++ b/t/test-lib-github-workflow-markup.sh
@@ -0,0 +1,50 @@
+# Library of functions to mark up test scripts' output suitable for
+# pretty-printing it in GitHub workflows.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when run in GitHub
+# workflows; these functions will then override (empty) functions
+# that are called at the appropriate times during the test runs.
+
+start_test_output () {
+	test -n "$GIT_TEST_TEE_OUTPUT_FILE" ||
+	die "--github-workflow-markup requires --verbose-log"
+	github_markup_output="${GIT_TEST_TEE_OUTPUT_FILE%.out}.markup"
+	>$github_markup_output
+	GIT_TEST_TEE_OFFSET=0
+}
+
+# No need to override start_test_case_output
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	failure)
+		echo >>$github_markup_output "::error::failed: $this_test.$test_count $1"
+		;;
+	fixed)
+		echo >>$github_markup_output "::notice::fixed: $this_test.$test_count $1"
+		;;
+	esac
+	echo >>$github_markup_output "::group::$test_case_result: $this_test.$test_count $*"
+	test-tool >>$github_markup_output path-utils skip-n-bytes \
+		"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+	echo >>$github_markup_output "::endgroup::"
+}
+
+# No need to override finalize_test_output
diff --git a/t/test-lib.sh b/t/test-lib.sh
index e13e1cb9124..076bee58c19 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -174,6 +174,9 @@ parse_option () {
 	--write-junit-xml)
 		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
+	--github-workflow-markup)
+		. "$TEST_DIRECTORY/test-lib-github-workflow-markup.sh"
+		;;
 	--stress)
 		stress=t ;;
 	--stress=*)
@@ -1027,7 +1030,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	start_test_case_output
+	start_test_case_output "$@"
 }
 
 test_finish_ () {
-- 
gitgitgadget



* [PATCH 8/9] ci: use `--github-workflow-markup` in the GitHub workflow
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-24 18:56 ` [PATCH 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This makes the output easier to digest.

Note: since workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details), we need
to remove the explicit grouping that would span the entirety of each
failed test script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 4ed8f40ab02..72efdb556ed 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -176,7 +176,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
+			cat "t/test-results/$test_name.markup"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -188,7 +188,7 @@ then
 	cache_dir="$HOME/none"
 
 	export GIT_PROVE_OPTS="--timer --jobs 10"
-	export GIT_TEST_OPTS="--verbose-log -x"
+	export GIT_TEST_OPTS="--verbose-log -x --github-workflow-markup"
 	MAKEFLAGS="$MAKEFLAGS --jobs=10"
 	test windows != "$CI_OS_NAME" ||
 	GIT_TEST_OPTS="--no-chain-lint --no-bin-wrappers $GIT_TEST_OPTS"
-- 
gitgitgadget



* [PATCH 9/9] ci: call `finalize_test_case_output` a little later
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (7 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
@ 2022-01-24 18:56 ` Johannes Schindelin via GitGitGadget
  2022-01-26  0:25 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-01-24 18:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

We used to call that function already before printing the final verdict.
However, now that we added grouping to the GitHub workflow output, we
will want to include even that part in the collapsible group for that
test case.
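
To see why the ordering matters, consider this simplified sketch. It
assumes that each group is cut from the verbose log starting at a byte
offset, and that the offset is advanced after every group; `tail -c` is
used here as a stand-in for `test-tool path-utils skip-n-bytes`, and all
names are hypothetical.

```shell
log=$(mktemp)
markup=$(mktemp)
offset=0

# Hypothetical hook: append everything logged since the last call to the
# markup file, wrapped in a group, then remember how far we got.
emit_group () {
	{
		echo "::group::$1"
		tail -c +"$((offset + 1))" "$log"
		echo "::endgroup::"
	} >>"$markup"
	offset=$(wc -c <"$log" | tr -d ' ')
}

# The verdict is logged first, the hook runs afterwards, so the verdict
# lands inside the group of the test case it belongs to.
echo "+ some traced command" >>"$log"
echo "ok 1 - first case" >>"$log"
emit_group "ok: demo.1 first case"

echo "+ another traced command" >>"$log"
echo "not ok 2 - second case" >>"$log"
emit_group "failure: demo.2 second case"

cat "$markup"
```

If the hook ran before the verdict was written, each verdict line would
instead show up at the top of the following test case's group.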

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 076bee58c19..1e683ad879b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -732,30 +732,31 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
+	finalize_test_case_output ok "$@"
 }
 
 test_failure_ () {
-	finalize_test_case_output failure "$@"
+	failure_label=$1
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
 	printf '%s\n' "$*" | sed -e 's/^/#	/'
 	test "$immediate" = "" || _error_exit
+	finalize_test_case_output failure "$failure_label" "$@"
 }
 
 test_known_broken_ok_ () {
-	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
+	finalize_test_case_output fixed "$@"
 }
 
 test_known_broken_failure_ () {
-	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
+	finalize_test_case_output broken "$@"
 }
 
 test_debug () {
@@ -1081,10 +1082,10 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
+		finalize_test_case_output skip "$@"
 		;;
 	*)
 		false
-- 
gitgitgadget


* Re: [PATCH 2/9] ci/run-build-and-tests: take a more high-level view
  2022-01-24 18:56 ` [PATCH 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
@ 2022-01-24 23:22   ` Eric Sunshine
  2022-01-25 14:34     ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Eric Sunshine @ 2022-01-24 23:22 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: Git List, Johannes Schindelin

On Mon, Jan 24, 2022 at 3:02 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> In the web UI of GitHub workflows, failed runs are presented with the
> job step that failed auto-expanded. In the current setup, this is not
> helpful at all because that shows only the output of `prove`, which says
> which test failed, but not in what way.
>
> What would help the reader understand what went wrong is the verbose
> test output of the failed test.
>
> The logs of the failed runs do contain that verbose test output, but it
> is shown in the _next_ step (which is marked as succeeding, and is
> therefore _not_ auto-expanded). Anyone not intimately familiar with this
> would completely miss the verbose test output, being left mostly
> puzzled with the test failures.
>
> We are about to show the failed test cases' output in the _same_ step,
> so that the user has a much easier time to figure out what was going
> wrong.
>
> But first, we must partially revert the change that tried to improve the
> CI runs by combining the `Makefile` targets to build into a single
> `make` invocation. That might have sounded like a good idea at the time,
> but it does make it rather impossible for the CI script to determine
> whether the _build_ failed, or the _tests_. If the tests were run at
> all, that is.
>
> So let's go back to calling `make` for the build, and call `make test`
> separately so that we can easily detect that _that_ invocation failed,
> and react appropriately.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> @@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
> -export MAKE_TARGETS="all test"
> +run_tests=t
>
>  case "$jobname" in
>  linux-gcc)
> @@ -41,14 +41,18 @@ pedantic)
>         # Don't run the tests; we only care about whether Git can be
>         # built.
>         export DEVOPTS=pedantic
> -       export MAKE_TARGETS=all
> +       run_tests=
>         ;;
>  esac
>
>  # Any new "test" targets should not go after this "make", but should
>  # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
>  # start running tests.
> -make $MAKE_TARGETS

The comment talking about MAKE_TARGETS seems out of date now that
MAKE_TARGETS has been removed from this script.

> +make
> +if test -n "$run_tests"
> +then
> +       make test
> +fi
>  check_unignored_build_artifacts

This changes behavior, doesn't it? With the original "make all test",
if the `all` target failed, then the `test` target would not be
invoked. However, with the revised code, `make test` is invoked even
if `make all` fails. Is that behavior change significant? Do we care
about it?


* Re: [PATCH 2/9] ci/run-build-and-tests: take a more high-level view
  2022-01-24 23:22   ` Eric Sunshine
@ 2022-01-25 14:34     ` Johannes Schindelin
  0 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-01-25 14:34 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Johannes Schindelin via GitGitGadget, Git List

Hi Eric,

On Mon, 24 Jan 2022, Eric Sunshine wrote:

> On Mon, Jan 24, 2022 at 3:02 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > In the web UI of GitHub workflows, failed runs are presented with the
> > job step that failed auto-expanded. In the current setup, this is not
> > helpful at all because that shows only the output of `prove`, which says
> > which test failed, but not in what way.
> >
> > What would help the reader understand what went wrong is the verbose
> > test output of the failed test.
> >
> > The logs of the failed runs do contain that verbose test output, but it
> > is shown in the _next_ step (which is marked as succeeding, and is
> > therefore _not_ auto-expanded). Anyone not intimately familiar with this
> > would completely miss the verbose test output, being left mostly
> > puzzled with the test failures.
> >
> > We are about to show the failed test cases' output in the _same_ step,
> > so that the user has a much easier time to figure out what was going
> > wrong.
> >
> > But first, we must partially revert the change that tried to improve the
> > CI runs by combining the `Makefile` targets to build into a single
> > `make` invocation. That might have sounded like a good idea at the time,
> > but it does make it rather impossible for the CI script to determine
> > whether the _build_ failed, or the _tests_. If the tests were run at
> > all, that is.
> >
> > So let's go back to calling `make` for the build, and call `make test`
> > separately so that we can easily detect that _that_ invocation failed,
> > and react appropriately.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> > @@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
> > -export MAKE_TARGETS="all test"
> > +run_tests=t
> >
> >  case "$jobname" in
> >  linux-gcc)
> > @@ -41,14 +41,18 @@ pedantic)
> >         # Don't run the tests; we only care about whether Git can be
> >         # built.
> >         export DEVOPTS=pedantic
> > -       export MAKE_TARGETS=all
> > +       run_tests=
> >         ;;
> >  esac
> >
> >  # Any new "test" targets should not go after this "make", but should
> >  # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
> >  # start running tests.
> > -make $MAKE_TARGETS
>
> The comment talking about MAKE_TARGETS seems out of date now that
> MAKE_TARGETS has been removed from this script.

Good catch!

> > +make
> > +if test -n "$run_tests"
> > +then
> > +       make test
> > +fi
> >  check_unignored_build_artifacts
>
> This changes behavior, doesn't it? With the original "make all test",
> if the `all` target failed, then the `test` target would not be
> invoked. However, with the revised code, `make test` is invoked even
> if `make all` fails. Is that behavior change significant? Do we care
> about it?

That is actually not the case. Compare to what 25715419bf4 (CI: don't run
"make test" twice in one job, 2021-11-23) did: it removed code that _also_
did not specifically prevent `make test` from running when `make all`
failed.

The clue to the riddle is this line in `ci/lib.sh`:

	set -ex

The `-e` part lets the script fail whenever any command fails (unless it
is part of an `if`/`while` condition, or properly chained with `||`).

This line is actually touched by the "ci/run-build-and-tests: add some
structure to the GitHub workflow output" patch in this patch series, which
breaks it apart into the `set -e` and the `set -x` part (so that the
latter can be called later in GitHub workflows, to unclutter the output a
bit).
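
For illustration, a small self-contained sketch of that behavior, with
`false` standing in for a failing `make` (the child shells here only mimic
a CI script; they are not the actual `ci/` code):

```shell
# Under `set -e`, the failing "build" stops the script before the
# "test" step is reached.
status=0
output=$(sh -c '
	set -e
	echo "build"
	false
	echo "test"
') || status=$?
echo "output: $output, exit status: $status"

# A failing command used as an `if` condition does not abort the script.
guarded=$(sh -c '
	set -e
	if false
	then
		echo "unexpected"
	fi
	echo "reached"
')
echo "$guarded"
```

In the first case only "build" is printed and the child shell exits with a
non-zero status, which is the equivalent of `make test` never being
invoked after a failed `make`.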

Ciao,
Dscho


* Re: [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow
  2022-01-24 18:56 ` [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-01-25 23:48   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-25 23:48 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Johannes Schindelin, SZEDER Gábor


On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> You currently have to know a lot of implementation details when
> investigating test failures in the CI runs. The first step is easy: the
> failed job is marked quite clearly, but when opening it, the failed step
> is expanded, which in our case is the one running
> `ci/run-build-and-tests.sh`. This step, most notably, only offers a
> high-level view of what went wrong: it prints the output of `prove`
> which merely tells the reader which test script failed.
>
> The actually interesting part is in the detailed log of said failed
> test script. But that log is shown in the CI run's step that runs
> `ci/print-test-failures.sh`. And that step is _not_ expanded in the web
> UI by default.
>
> Let's help the reader by showing the failed tests' detailed logs in the
> step that is expanded automatically, i.e. directly after the test suite
> failed.
>
> This also helps the situation where the _build_ failed and the
> `print-test-failures` step was executed under the assumption that the
> _test suite_ failed, and consequently failed to find any failed tests.
>
> An alternative way to implement this patch would be to source
> `ci/print-test-failures.sh` in the `handle_test_failures` function to
> show these logs. However, over the course of the next few commits, we
> want to introduce some grouping which would be harder to achieve that
> way (for example, we do want a leaner, and colored, preamble for each
> failed test script, and it would be trickier to accommodate the lack of
> nested groupings in GitHub workflows' output).

Is it really better to have the first thing you see in a failing job be
this level of detail?

To take the "before" demo job from your CL, if you click on a failing
job you'll currently end up with ~1600 lines of "prove" setup and
output, culminating in (the browser auto-scrolls to the end):

    [...]
    Test Summary Report
    -------------------
    t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
      Failed test:  49
      Non-zero exit status: 1
    t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
      TODO passed:   45, 47
    Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
    Result: FAIL

Is it ideal? No. But I think that folding the ci/print-test-failures.sh
output into that step makes it much worse. Now I'll be sent to the very
bottom of ~16000 lines (yes, that's an extra zero at the end) of output,
ending in:

    [...]
    + test_cmp expect sparse-checkout-out
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u expect sparse-checkout-out
    + test_cmp full-checkout-err sparse-checkout-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-checkout-err
    + test_cmp full-checkout-err sparse-index-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-index-err
    
    ok 53 - checkout behaves oddly with df-conflict-2
    # failed 1 among 53 test(s)
    1..53

Now you'll need to scroll up or search just to see what test failed.

Usually when these fail I might only look at the failing test name (at
that point already knowing why it failed). I think it's a feature that
we only expand the verbose output later.

I realize that:

1) This isn't the exact output you emit in the post-image here, since you're not
actually using ci/print-test-failures.sh, but from eyeballing the script
it seems to do basically the same thing, i.e. it'll emit the full *.out
file.

B.t.w., why isn't this using ci/print-test-failures.sh? Your "an
alternative way" paragraph doesn't really explain it. Sure, it'll be
further tweaked later, but in the meantime do we have to re-invent
ci/print-test-failures.sh? Anyway...

2) The end-state at the end of this series looks somewhat different, but I think
that end-state shares the UX problem noted above, and to some extent
makes it worse.

That one has 28 thousand lines of output!

Now I know it's elided so you're only supposed to see a few screenfuls
of it, but at least in my browser that end-state is *very slow*, much
slower than the version that shows me the full ~16 thousand lines at
once.

Presumably it's doing some very expensive JavaScript/CSS behind the
scenes.

I mean so slow that when I press page up and down I can see 3-8 lines of
that folded output appear at once, then the next 3-8 lines etc. The
current output meanwhile (and this more verbose one) is
near-instant. This is in Firefox 91.4, if it matters.


* Re: [PATCH 5/9] tests: refactor --write-junit-xml code
  2022-01-24 18:56 ` [PATCH 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
@ 2022-01-26  0:10   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-26  0:10 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin


On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The code writing JUnit XML is interspersed directly with all the code in
> `t/test-lib.sh`, and it is therefore not only ill-separated, but
> introducing yet another output format would make the situation even
> worse.
>
> Let's introduce an abstraction layer by hiding the JUnit XML code behind
> four new functions that are supposed to be called before and after each
> test and test case.
>
> This is not just an academic exercise, refactoring for refactoring's
> sake. We _actually_ want to introduce such a new output format, to
> make it substantially easier to diagnose test failures in our GitHub
> workflow, therefore we do need this refactoring.

I'm a bit confused about the need to patch this JUnit code & to
generalize test-lib.sh to emit things in three machine-readable formats
(TAP, JUnit, and now this Markdown format).
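
For reference, the abstraction described in the quoted commit message
amounts to roughly the following shape. This is only a sketch under
assumptions: of the four hooks, only finalize_test_case_output is named
in this thread (in the subject of 9/9); the other three names, the
"output_format" variable and all function bodies are hypothetical:

```shell
# Hypothetical selector; the real series uses command-line options.
output_format=${output_format:-markup}

# No-op defaults; a format overrides only the hooks it needs.
start_test_output () { :; }
start_test_case_output () { :; }
finalize_test_case_output () { :; }
finalize_test_output () { :; }

if test "$output_format" = markup
then
	start_test_output () {
		# $1: name of the test script
		printf '=== %s ===\n' "$1"
	}
	finalize_test_case_output () {
		# $1: result ("ok" or "failure"), $2: test case title
		printf '%s: %s\n' "$1" "$2"
	}
fi

# The test library would call the hooks at fixed points, e.g.:
start_test_output "t0000-example"
start_test_case_output "first case"
finalize_test_case_output ok "first case"
finalize_test_output "t0000-example"
```

The point of such a layout is that the core of the test library never
mentions a concrete format; whether that is worth carrying for a format
that isn't used in-tree is the question I'm raising here.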

In
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2112201834050.347@tvgsbejvaqbjf.bet/
you replied to my patch to remove this dead code with wanting to keep
it, because:
    
    The reason is that there are still some things that Azure Pipelines can do
    that GitHub workflows cannot, for example:
    
    - present the logs of failed tests in an intuitive manner,
    
    - re-run _only_ failed jobs.

Which is fair enough, but in this series we're further patching it, but
it's still not used anywhere in-tree at the end of it, or am I missing
something?

This series is seeking to address 1/2 of the points you mentioned, and
presumably the latter is a question of us juggling around our GitHub CI
job definitions.

Then in 6/9 you note:

    This seems not to faze Azure Pipelines' XML parser, but it still is
    incorrect, so let's fix it.

So this is running on Azure somehow, but not via the previously in-tree
azure-pipeline.yml removed in your 6081d3898fe (ci: retire the Azure
Pipelines definition, 2020-04-11)?


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (8 preceding siblings ...)
  2022-01-24 18:56 ` [PATCH 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
@ 2022-01-26  0:25 ` Ævar Arnfjörð Bjarmason
  2022-01-27 16:31 ` CI "grouping" within jobs v.s. lighter split-out jobs (was: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful) Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-26  0:25 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin


On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> Background
> ==========
>
> Recent patches intended to help readers figure out CI failures much quicker
> than before. Unfortunately, they haven't been entirely positive for me. For
> example, they broke the branch protections in Microsoft's fork of Git, where
> we require Pull Requests to pass a certain set of Checks (which are
> identified by their names) and therefore caused follow-up work.

This seems to be a reference to my df7375d7728 (CI: use shorter names
that fit in UX tooltips, 2021-11-23) merged as part of ab/ci-updates,
and I understand from this summary that you had some custom job
somewhere that scraped the job names which broke.

That's unfortunate, I do think being able to actually read the tooltips
in the GitHub UI was a worthwhile trade-off in the end though.

But I'm entirely confused about what any of that has to do with this
series, which is about changing how the job output itself is presented
and summarized, and not about the job names, and making them fit in
tooltips.

Later in the summary you note: 

> Using CI and in general making it easier for new contributors is an area I'm
> passionate about, and one I'd like to see improved.
> [...]
> ⊗ linux-gcc (ubuntu-latest)
>    failed: t9800.20 submit from detached head

Which has one of the new and shorter jobnames, but in a part of the UX
where the length didn't matter, and I can't find a way where it does.


* CI "grouping" within jobs v.s. lighter split-out jobs (was: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful)
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (9 preceding siblings ...)
  2022-01-26  0:25 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Ævar Arnfjörð Bjarmason
@ 2022-01-27 16:31 ` Ævar Arnfjörð Bjarmason
  2022-02-19 23:46 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  12 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 16:31 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Johannes Schindelin, SZEDER Gábor, Matheus Tavares,
	Taylor Blau, Lars Schneider,
	Đoàn Trần Công Danh, brian m. carlson,
	Carlo Marcelo Arenas Belón


[CC-ing some people who've been interested in CI architecture]

On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> [...]
> The current situation
> =====================
>
> Let me walk you through the current experience when a PR build fails: I get
> a notification mail that only says that a certain job failed. There's no
> indication of which test failed (or was it the build?). I can click on a
> link at it takes me to the workflow run. Once there, all it says is "Process
> completed with exit code 1", or even "code 2". Sure, I can click on one of
> the failed jobs. It even expands the failed step's log (collapsing the other
> steps). And what do I see there?
>
> Let's look at an example of a failed linux-clang (ubuntu-latest) job
> [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:
>
> [...]
> Test Summary Report
> -------------------
> t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
>   Failed test:  49
>   Non-zero exit status: 1
> t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
>   TODO passed:   45, 47
> Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
> Result: FAIL
> make[1]: *** [Makefile:53: prove] Error 1
> make[1]: Leaving directory '/home/runner/work/git/git/t'
> make: *** [Makefile:3018: test] Error 2
>

Firstly I very much applaud any effort to move the CI UX forward. I know
we haven't seen eye-to-eye on some of the trade-offs there, but I think
something like this series is a step in the right direction. I.e. trying
harder to summarize the output for the user, and making use of some CI
platform-specific features.

I sent a reply in this thread purely on some implementation concerns
related to that in
https://lore.kernel.org/git/220126.86sftbfjl4.gmgdl@evledraar.gmail.com/,
but let's leave that aside for now...

> [...]
> So I had a look at what standards exist e.g. when testing PowerShell
> modules, in the way of marking up their test output in GitHub workflows, and
> I was not disappointed: GitHub workflows support "grouping" of output lines,
> i.e. marking sections of the output as a group that is then collapsed by
> default and can be expanded. And it is this feature I decided to use in this
> patch series, along with GitHub workflows' commands to display errors or
> notices that are also shown on the summary page of the workflow run. Now, in
> addition to "Process completed with exit code" on the summary page, we also
> read something like:
>
> ⊗ linux-gcc (ubuntu-latest)
>    failed: t9800.20 submit from detached head
>
> Even better, this message is a link, and following that, the reader is
> presented with something like this
> [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:

This series is doing several different things, at least:

 1) "Grouping" the ci/ output, i.e. "make" from "make test"
 2) Doing likewise for t/test-lib.sh
 3) In doing that for t/test-lib.sh, also "signalling" the GitHub CI,
    to e.g. get the "submit from detached head" output you quote just
    a few lines above
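
For those who haven't used these features: both the grouping and the
"signalling" are done by printing specially formatted lines to stdout.
A minimal sketch, assuming GitHub's "::group::", "::endgroup::" and
"::error::" workflow commands; the "group <title> <command>" calling
convention mirrors the "group Build make" trace quoted from the cover
letter, while the function bodies and report_failure are my own
illustration, not the code in this series:

```shell
# Run a command inside a collapsible group in the GitHub workflow log.
# Usage: group <title> <command> [<args>...]
group () {
	title=$1
	shift
	printf '::group::%s\n' "$title"
	ret=0
	"$@" || ret=$?
	printf '::endgroup::\n'
	return $ret
}

# Emit an error annotation; GitHub also shows these on the summary page.
report_failure () {
	printf '::error::failed: %s\n' "$1"
}

group "Build" echo "compiling..."
report_failure "t9800.20 submit from detached head"
```

Note that the group is always closed, even when the wrapped command
fails, and that the command's exit status is passed through.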

I'd like to focus on just #1 here.

Where I was going with that in my last CI series was to make a start at
eventually being able to run simply "make" at the top-level
"step". I.e. to have a recipe that looks like:

    - run: make
    - run: make test

I feel strongly that that's where we should be heading, and the #1 part
of this series is basically trying to emulate what you'd get for free if
we simply did that.

I.e. if you run single commands at the "step" level (in GitHub CI
nomenclature) you'll get what you're doing with groupings in this series
for free, and without any special code in ci/*, better yet if you then
do want grouping *within* that step you're free to do so without having
clobbered your one-level of grouping already on distinguishing "make"
from "make test".

IOW our CI now looks like this (pseudocode):

     - job:
       - step1:
         - use ci/lib.sh to set env vars
         - run a script like ci/run-build-and-tests.sh
       - step2:
         - use ci/lib.sh to set env vars
         - run a script like print-test-failures.sh

But should instead look like:

     - job:
       - step1:
         - set variables in $GITHUB_ENV using ci/lib.sh
       - step2:
         - make
       - step3:
         - make test
       - step4:
         - run a script like print-test-failures.sh

Well, we can quibble about "step4", but again, let's focus on #1 here,
that's more #2-#3 territory.

I had some WIP code to do that which I polished up, here's what e.g. a
build failure looks like in your implementation (again, just focusing on
how "make" and "make test" are divided out, not the rest):

    https://github.com/dscho/git/runs/4840190622?check_suite_focus=true#step:4:62

I.e. you've made "build" an expandable group at the same level as a
single failed test, and still all under the opaque
ci/run-build-and-tests.sh script.

And here's mine. This is using a semi-recent version of my patches that
happened to have a failure, not quite what I've got now, but close
enough for this E-Mail:

    https://github.com/avar/git/runs/4956260395?check_suite_focus=true#step:7:1

Now, notice two things. One, we've made "make" and "make test" top-level
steps, but more importantly, if you expand that "make test" step on yours
you'll get the full "make test" output.

And two, it's got something you don't have at all, which is that we're
now making use of the GitHub CI feature of having pre-declared an
environment for "make test", which the CI knows about (you need to click
to expand it):

    https://github.com/avar/git/runs/4956260395?check_suite_focus=true#step:7:4

Right now that's something we hardly make use of at all, but with my
changes the environment is the *only* special sauce we specify before
the step, i.e. GIT_PROVE_OPTS=.. DEFAULT_TEST_TARGET=... etc.
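
The mechanism behind that is small. A sketch, assuming GitHub's
documented behavior that NAME=value lines appended to the file named by
$GITHUB_ENV become environment variables for subsequent steps of the
same job; set_ci_var and the variable names and values below are
placeholders, not what my branch's ci/lib.sh actually contains:

```shell
# Make NAME=value visible to later steps of the same job by appending it
# to the file named by $GITHUB_ENV. Outside of GitHub Actions (variable
# unset or empty) just print the assignment. Single-line values only.
set_ci_var () {
	if test -n "${GITHUB_ENV-}"
	then
		printf '%s=%s\n' "$1" "$2" >>"$GITHUB_ENV"
	else
		printf '%s=%s\n' "$1" "$2"
	fi
}

# Placeholder names and values, for illustration only.
set_ci_var EXAMPLE_TEST_OPTS "--verbose"
set_ci_var EXAMPLE_TEST_TARGET "example"
```

A later "run: make test" step then needs no wrapper script at all, since
the variables are already part of its declared environment.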

I think I've run out of my ML quota for now, but here's the branch that implements it:

    https://github.com/git/git/compare/master...avar:avar/ci-unroll-make-commands-to-ci-recipe

That's "282 additions and 493 deletions.", much of what was required to
do this was to eject the remaining support for the dead Travis and Azure
CI's that we don't run, i.e. to entirely remove any sort of state
management or job control from ci/lib.sh, and have it *only* be tasked
with setting variables for subsequent steps to use.

That makes it much simpler, my end-state of it is ~170 lines v.s. your
~270 (but to be fair some of that's deleted Travis code):

    https://github.com/avar/git/blob/avar/ci-unroll-make-commands-to-ci-recipe/ci/lib.sh
    https://github.com/gitgitgadget/git/blob/pr-1117/dscho/use-grouping-in-ci-v1/ci/lib.sh

And much of the rest is just gone, e.g. ci/run-build-and-tests.sh isn't
there anymore, instead you simply run "make" or "make test" (or the
equivalent on Windows, which also works):

    https://github.com/avar/git/tree/avar/ci-unroll-make-commands-to-ci-recipe/ci
    https://github.com/gitgitgadget/git/tree/pr-1117/dscho/use-grouping-in-ci-v1/ci

Anyway, I hope we can find some sort of joint way forward with this,
because I think your #1 at least is going in the opposite direction we
should be going to achieve much the same ends you'd like to achieve.

We can really just do this in a much simpler way once we stop treating
ci/lib.sh and friends as monolithic ball-of-mud entry points.

But I'd really like us not to go in this direction of using markup to
"sub-divide" the "steps" within a given job, when we can relatively
easily just ... divide the steps.

As shown above that UI plays much more naturally into the CI's native
features & how it likes to arrange & present things.

And again, all of this is *only* discussing the "step #1" noted
above. Using "grouping" for presenting the test failures themselves or
sending summaries to the CI "Summary" is a different matter.

Thanks!





* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (10 preceding siblings ...)
  2022-01-27 16:31 ` CI "grouping" within jobs v.s. lighter split-out jobs (was: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful) Ævar Arnfjörð Bjarmason
@ 2022-02-19 23:46 ` Johannes Schindelin
  2022-02-20  2:44   ` Junio C Hamano
  2022-02-20 12:47   ` Ævar Arnfjörð Bjarmason
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  12 siblings, 2 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-19 23:46 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git


Hi Junio,

I notice that you did not take this into `seen` yet. I find that a little
sad because it would potentially have helped others to figure out the
failure in the latest `seen`:
https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162

Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.

Ciao,
Dscho


On Mon, 24 Jan 2022, Johannes Schindelin via GitGitGadget wrote:

>
> Background
> ==========
>
> Recent patches intended to help readers figure out CI failures much quicker
> than before. Unfortunately, they haven't been entirely positive for me. For
> example, they broke the branch protections in Microsoft's fork of Git, where
> we require Pull Requests to pass a certain set of Checks (which are
> identified by their names) and therefore caused follow-up work.
>
> Using CI and in general making it easier for new contributors is an area I'm
> passionate about, and one I'd like to see improved.
>
>
> The current situation
> =====================
>
> Let me walk you through the current experience when a PR build fails: I get
> a notification mail that only says that a certain job failed. There's no
> indication of which test failed (or was it the build?). I can click on a
> link at it takes me to the workflow run. Once there, all it says is "Process
> completed with exit code 1", or even "code 2". Sure, I can click on one of
> the failed jobs. It even expands the failed step's log (collapsing the other
> steps). And what do I see there?
>
> Let's look at an example of a failed linux-clang (ubuntu-latest) job
> [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:
>
> [...]
> Test Summary Report
> -------------------
> t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
>   Failed test:  49
>   Non-zero exit status: 1
> t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
>   TODO passed:   45, 47
> Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
> Result: FAIL
> make[1]: *** [Makefile:53: prove] Error 1
> make[1]: Leaving directory '/home/runner/work/git/git/t'
> make: *** [Makefile:3018: test] Error 2
>
>
> That's it. I count myself lucky not to be a new contributor being faced with
> something like this.
>
> Now, since I am active in the Git project for a couple of days or so, I can
> make sense of the "TODO passed" label and know that for the purpose of
> fixing the build failures, I need to ignore this, and that I need to focus
> on the "Failed test" part instead.
>
> I also know that I do not have to get myself an ubuntu-latest box just to
> reproduce the error, I do not even have to check out the code and run it
> just to learn what that "49" means.
>
> I know, and I do not expect any new contributor, not even most seasoned
> contributors to know, that I have to patiently collapse the "Run
> ci/run-build-and-tests.sh" job's log, and instead expand the "Run
> ci/print-test-failures.sh" job log (which did not fail and hence does not
> draw any attention to it).
>
> I know, and again: I do not expect many others to know this, that I then
> have to click into the "Search logs" box (not the regular web browser's
> search via Ctrl+F!) and type in "not ok" to find the log of the failed test
> case (and this might still be a "known broken" one that is marked via
> test_expect_failure and once again needs to be ignored).
>
> To be excessively clear: This is not a great experience!
>
>
> Improved output
> ===============
>
> Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
> even showed flaky tests, and trends e.g. how long the test cases ran. When I
> ported Git's CI over to GitHub workflows (to make CI more accessible to new
> contributors), I knew fully well that we would leave this very nice UI
> behind, and I had hoped that we would get something similar back via new,
> community-contributed GitHub Actions that can be used in GitHub workflows.
> However, most likely because we use a home-grown test framework implemented
> in opinionated POSIX shells scripts, that did not happen.
>
> So I had a look at what standards exist e.g. when testing PowerShell
> modules, in the way of marking up their test output in GitHub workflows, and
> I was not disappointed: GitHub workflows support "grouping" of output lines,
> i.e. marking sections of the output as a group that is then collapsed by
> default and can be expanded. And it is this feature I decided to use in this
> patch series, along with GitHub workflows' commands to display errors or
> notices that are also shown on the summary page of the workflow run. Now, in
> addition to "Process completed with exit code" on the summary page, we also
> read something like:
>
> ⊗ linux-gcc (ubuntu-latest)
>    failed: t9800.20 submit from detached head
>
>
> Even better, this message is a link, and following that, the reader is
> presented with something like this
> [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:
>
> ⏵ Run ci/run-build-and-tests.sh
> ⏵ CI setup
>   + ln -s /home/runner/none/.prove t/.prove
>   + run_tests=t
>   + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   + group Build make
>   + set +x
> ⏵ Build
> ⏵ Run tests
>   === Failed test: t9800-git-p4-basic ===
> ⏵ ok: t9800.1 start p4d
> ⏵ ok: t9800.2 add p4 files
> ⏵ ok: t9800.3 basic git p4 clone
> ⏵ ok: t9800.4 depot typo error
> ⏵ ok: t9800.5 git p4 clone @all
> ⏵ ok: t9800.6 git p4 sync uninitialized repo
> ⏵ ok: t9800.7 git p4 sync new branch
> ⏵ ok: t9800.8 clone two dirs
> ⏵ ok: t9800.9 clone two dirs, @all
> ⏵ ok: t9800.10 clone two dirs, @all, conflicting files
> ⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
> ⏵ ok: t9800.12 clone using non-numeric revision ranges
> ⏵ ok: t9800.13 clone with date range, excluding some changes
> ⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
> ⏵ ok: t9800.15 exit gracefully for p4 server errors
> ⏵ ok: t9800.16 clone --bare should make a bare repository
> ⏵ ok: t9800.17 initial import time from top change time
> ⏵ ok: t9800.18 unresolvable host in P4PORT should display error
> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>   Error: failed: t9800.20 submit from detached head
> ⏵ failure: t9800.20 submit from detached head
>   Error: failed: t9800.21 submit from worktree
> ⏵ failure: t9800.21 submit from worktree
>   === Failed test: t9801-git-p4-branch ===
>   [...]
>
>
> The "Failed test:" lines are colored in yellow to give a better visual clue
> about the logs' structure, the "Error:" label is colored in red to draw the
> attention to the important part of the log, and the "⏵" characters indicate
> that part of the log is collapsed and can be expanded by clicking on it.
>
> To drill down, the reader merely needs to expand the (failed) test case's
> log by clicking on it, and then study the log. If needed (e.g. when the test
> case relies on side effects from previous test cases), the logs of preceding
> test cases can be expanded as well. In this example, when expanding
> t9800.20, it looks like this (for ease of reading, I cut a few chunks of
> lines, indicated by "[...]"):
>
> [...]
> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>   Error: failed: t9800.20 submit from detached head
> ⏷ failure: t9800.20 submit from detached head
>       test_when_finished cleanup_git &&
>       git p4 clone --dest="$git" //depot &&
>         (
>           cd "$git" &&
>           git checkout p4/master &&
>           >detached_head_test &&
>           git add detached_head_test &&
>           git commit -m "add detached_head" &&
>           git config git-p4.skipSubmitEdit true &&
>           git p4 submit &&
>             git p4 rebase &&
>             git log p4/master | grep detached_head
>         )
>     [...]
>     Depot paths: //depot/
>     Import destination: refs/remotes/p4/master
>
>     Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
>     Perforce db files in '.' will be created if missing...
>
>     Traceback (most recent call last):
>       File "/home/runner/work/git/git/git-p4", line 4455, in <module>
>         main()
>       File "/home/runner/work/git/git/git-p4", line 4449, in main
>         if not cmd.run(args):
>       File "/home/runner/work/git/git/git-p4", line 2590, in run
>         rebase.rebase()
>       File "/home/runner/work/git/git/git-p4", line 4121, in rebase
>         if len(read_pipe("git diff-index HEAD --")) > 0:
>       File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
>         retcode, out, err = read_pipe_full(c, *k, **kw)
>       File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
>         p = subprocess.Popen(
>       File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
>         self._execute_child(args, executable, preexec_fn, close_fds,
>       File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
>         raise child_exception_type(errno_num, err_msg, err_filename)
>     FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
>     error: last command exited with $?=1
>     + cleanup_git
>     + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + nr_tries_left=60
>     + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + test 1 -ne 1
>     + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + nr_tries_left=60
>     + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + exit 1
>     + eval_ret=1
>     + :
>     not ok 20 - submit from detached head
>     #
>     #        test_when_finished cleanup_git &&
>     #        git p4 clone --dest="$git" //depot &&
>     #        (
>     #            cd "$git" &&
>     #            git checkout p4/master &&
>     #            >detached_head_test &&
>     #            git add detached_head_test &&
>     #            git commit -m "add detached_head" &&
>     #            git config git-p4.skipSubmitEdit true &&
>     #            git p4 submit &&
>     #            git p4 rebase &&
>     #            git log p4/master | grep detached_head
>     #        )
>     #
>   Error: failed: t9800.21 submit from worktree
>   [...]
>
>
> Is this the best UI we can have for test failures in CI runs? I hope we can
> do better. Having said that, this patch series presents a pretty good start,
> and offers a basis for future improvements.
>
> Johannes Schindelin (9):
>   ci: fix code style
>   ci/run-build-and-tests: take a more high-level view
>   ci: make it easier to find failed tests' logs in the GitHub workflow
>   ci/run-build-and-tests: add some structure to the GitHub workflow
>     output
>   tests: refactor --write-junit-xml code
>   test(junit): avoid line feeds in XML attributes
>   ci: optionally mark up output in the GitHub workflow
>   ci: use `--github-workflow-markup` in the GitHub workflow
>   ci: call `finalize_test_case_output` a little later
>
>  .github/workflows/main.yml           |  12 ---
>  ci/lib.sh                            |  81 ++++++++++++++--
>  ci/run-build-and-tests.sh            |  11 ++-
>  ci/run-test-slice.sh                 |   5 +-
>  t/test-lib-functions.sh              |   4 +-
>  t/test-lib-github-workflow-markup.sh |  50 ++++++++++
>  t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
>  t/test-lib.sh                        | 128 ++++----------------------
>  8 files changed, 287 insertions(+), 136 deletions(-)
>  create mode 100644 t/test-lib-github-workflow-markup.sh
>  create mode 100644 t/test-lib-junit.sh
>
>
> base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1117
> --
> gitgitgadget
>
>


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-19 23:46 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin
@ 2022-02-20  2:44   ` Junio C Hamano
  2022-02-20 15:25     ` Johannes Schindelin
  2022-02-20 12:47   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-02-20  2:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Schindelin via GitGitGadget, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I notice that you did not take this into `seen` yet. I find that a little
> sad because it would potentially have helped others to figure out the
> failure in the latest `seen`:
> https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
>
> Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.

I saw the thread, I saw a few patches were commented on, and a few
were left unanswered, but one was replied by the original submitter
with a "Good catch!", making me expect the topic to be discussed or
rerolled to become ready relatively soon.

But nothing happened, so I even forgot to take a look myself by
picking it up in 'seen'.  It does sound sad that the topic was left
hanging there for 3 weeks or so in that state without any reroll or
response.



* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-19 23:46 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin
  2022-02-20  2:44   ` Junio C Hamano
@ 2022-02-20 12:47   ` Ævar Arnfjörð Bjarmason
  2022-02-22 10:30     ` Johannes Schindelin
  1 sibling, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-20 12:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Junio C Hamano


On Sun, Feb 20 2022, Johannes Schindelin wrote:

> Hi Junio,
>
> I notice that you did not take this into `seen` yet. I find that a little
> sad because it would potentially have helped others to figure out the
> failure in the latest `seen`:
> https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
>
> Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.

I left some feedback on your submission ~3 weeks ago that you haven't
responded to:
https://lore.kernel.org/git/220127.86ilu5cdnf.gmgdl@evledraar.gmail.com/

I think you should really reply to that before this moves forward,
i.e. these aren't trivial concerns. I think to get from our current "X" to
your aims of "Y" your way of doing that (for part of this series) is
really an overly complex way of getting there that we can do much
simpler, and the simpler way integrates much better with the GitHub CI
UI.

The feedback I left is on the part of this that's not directly relevant
to what you're pointing out here (which is the grouping of the per-test
failure output), but if your series is picked-up as-is we'd need to undo
rather big parts of it to get to what I consider a better state for the
"grouping" of the "make" v.s. "make test" etc. output.

I can just submit my version of that & we can hash out what direction
makes sense there, how does that sound? I've been running with it for
about a month, and really think that part of the failure output is much
better.

Here's an example of that part:
https://github.com/avar/git/runs/5259000590?check_suite_focus=true

I.e. note how we'll now just have a "make" and "make test" step, and we
failed there on the "make".

So we'd get to the point of simply invoking those build steps as 1:1
mapped CI steps, as opposed to "improving" ci/run-build-and-tests.sh to
emulate that (I've just git rm'd it in my version).

>
> On Mon, 24 Jan 2022, Johannes Schindelin via GitGitGadget wrote:
>
>>
>> Background
>> ==========
>>
>> Recent patches intended to help readers figure out CI failures much quicker
>> than before. Unfortunately, they haven't been entirely positive for me. For
>> example, they broke the branch protections in Microsoft's fork of Git, where
>> we require Pull Requests to pass a certain set of Checks (which are
>> identified by their names) and therefore caused follow-up work.
>>
>> Using CI and in general making it easier for new contributors is an area I'm
>> passionate about, and one I'd like to see improved.
>>
>>
>> The current situation
>> =====================
>>
>> Let me walk you through the current experience when a PR build fails: I get
>> a notification mail that only says that a certain job failed. There's no
>> indication of which test failed (or was it the build?). I can click on a
>> link and it takes me to the workflow run. Once there, all it says is "Process
>> completed with exit code 1", or even "code 2". Sure, I can click on one of
>> the failed jobs. It even expands the failed step's log (collapsing the other
>> steps). And what do I see there?
>>
>> Let's look at an example of a failed linux-clang (ubuntu-latest) job
>> [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:
>>
>> [...]
>> Test Summary Report
>> -------------------
>> t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
>>   Failed test:  49
>>   Non-zero exit status: 1
>> t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
>>   TODO passed:   45, 47
>> Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
>> Result: FAIL
>> make[1]: *** [Makefile:53: prove] Error 1
>> make[1]: Leaving directory '/home/runner/work/git/git/t'
>> make: *** [Makefile:3018: test] Error 2
>>
>>
>> That's it. I count myself lucky not to be a new contributor being faced with
>> something like this.
>>
>> Now, since I am active in the Git project for a couple of days or so, I can
>> make sense of the "TODO passed" label and know that for the purpose of
>> fixing the build failures, I need to ignore this, and that I need to focus
>> on the "Failed test" part instead.
>>
>> I also know that I do not have to get myself an ubuntu-latest box just to
>> reproduce the error, I do not even have to check out the code and run it
>> just to learn what that "49" means.
>>
>> I know, and I do not expect any new contributor, not even most seasoned
>> contributors to know, that I have to patiently collapse the "Run
>> ci/run-build-and-tests.sh" job's log, and instead expand the "Run
>> ci/print-test-failures.sh" job log (which did not fail and hence does not
>> draw any attention to it).
>>
>> I know, and again: I do not expect many others to know this, that I then
>> have to click into the "Search logs" box (not the regular web browser's
>> search via Ctrl+F!) and type in "not ok" to find the log of the failed test
>> case (and this might still be a "known broken" one that is marked via
>> test_expect_failure and once again needs to be ignored).
>>
>> To be excessively clear: This is not a great experience!
>>
>>
>> Improved output
>> ===============
>>
>> Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
>> even showed flaky tests, and trends e.g. how long the test cases ran. When I
>> ported Git's CI over to GitHub workflows (to make CI more accessible to new
>> contributors), I knew fully well that we would leave this very nice UI
>> behind, and I had hoped that we would get something similar back via new,
>> community-contributed GitHub Actions that can be used in GitHub workflows.
>> However, most likely because we use a home-grown test framework implemented
>> in opinionated POSIX shell scripts, that did not happen.
>>
>> So I had a look at what standards exist e.g. when testing PowerShell
>> modules, in the way of marking up their test output in GitHub workflows, and
>> I was not disappointed: GitHub workflows support "grouping" of output lines,
>> i.e. marking sections of the output as a group that is then collapsed by
>> default and can be expanded. And it is this feature I decided to use in this
>> patch series, along with GitHub workflows' commands to display errors or
>> notices that are also shown on the summary page of the workflow run. Now, in
>> addition to "Process completed with exit code" on the summary page, we also
>> read something like:
>>
>> ⊗ linux-gcc (ubuntu-latest)
>>    failed: t9800.20 submit from detached head
>>
>>
>> Even better, this message is a link, and following that, the reader is
>> presented with something like this
>> [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:
>>
>> ⏵ Run ci/run-build-and-tests.sh
>> ⏵ CI setup
>>   + ln -s /home/runner/none/.prove t/.prove
>>   + run_tests=t
>>   + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>>   + group Build make
>>   + set +x
>> ⏵ Build
>> ⏵ Run tests
>>   === Failed test: t9800-git-p4-basic ===
>> ⏵ ok: t9800.1 start p4d
>> ⏵ ok: t9800.2 add p4 files
>> ⏵ ok: t9800.3 basic git p4 clone
>> ⏵ ok: t9800.4 depot typo error
>> ⏵ ok: t9800.5 git p4 clone @all
>> ⏵ ok: t9800.6 git p4 sync uninitialized repo
>> ⏵ ok: t9800.7 git p4 sync new branch
>> ⏵ ok: t9800.8 clone two dirs
>> ⏵ ok: t9800.9 clone two dirs, @all
>> ⏵ ok: t9800.10 clone two dirs, @all, conflicting files
>> ⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
>> ⏵ ok: t9800.12 clone using non-numeric revision ranges
>> ⏵ ok: t9800.13 clone with date range, excluding some changes
>> ⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
>> ⏵ ok: t9800.15 exit gracefully for p4 server errors
>> ⏵ ok: t9800.16 clone --bare should make a bare repository
>> ⏵ ok: t9800.17 initial import time from top change time
>> ⏵ ok: t9800.18 unresolvable host in P4PORT should display error
>> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>>   Error: failed: t9800.20 submit from detached head
>> ⏵ failure: t9800.20 submit from detached head
>>   Error: failed: t9800.21 submit from worktree
>> ⏵ failure: t9800.21 submit from worktree
>>   === Failed test: t9801-git-p4-branch ===
>>   [...]
>>
>>
>> The "Failed test:" lines are colored in yellow to give a better visual clue
>> about the logs' structure, the "Error:" label is colored in red to draw the
>> attention to the important part of the log, and the "⏵" characters indicate
>> that part of the log is collapsed and can be expanded by clicking on it.
>>
>> To drill down, the reader merely needs to expand the (failed) test case's
>> log by clicking on it, and then study the log. If needed (e.g. when the test
>> case relies on side effects from previous test cases), the logs of preceding
>> test cases can be expanded as well. In this example, when expanding
>> t9800.20, it looks like this (for ease of reading, I cut a few chunks of
>> lines, indicated by "[...]"):
>>
>> [...]
>> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>>   Error: failed: t9800.20 submit from detached head
>> ⏷ failure: t9800.20 submit from detached head
>>       test_when_finished cleanup_git &&
>>       git p4 clone --dest="$git" //depot &&
>>         (
>>           cd "$git" &&
>>           git checkout p4/master &&
>>           >detached_head_test &&
>>           git add detached_head_test &&
>>           git commit -m "add detached_head" &&
>>           git config git-p4.skipSubmitEdit true &&
>>           git p4 submit &&
>>             git p4 rebase &&
>>             git log p4/master | grep detached_head
>>         )
>>     [...]
>>     Depot paths: //depot/
>>     Import destination: refs/remotes/p4/master
>>
>>     Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
>>     Perforce db files in '.' will be created if missing...
>>
>>     Traceback (most recent call last):
>>       File "/home/runner/work/git/git/git-p4", line 4455, in <module>
>>         main()
>>       File "/home/runner/work/git/git/git-p4", line 4449, in main
>>         if not cmd.run(args):
>>       File "/home/runner/work/git/git/git-p4", line 2590, in run
>>         rebase.rebase()
>>       File "/home/runner/work/git/git/git-p4", line 4121, in rebase
>>         if len(read_pipe("git diff-index HEAD --")) > 0:
>>       File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
>>         retcode, out, err = read_pipe_full(c, *k, **kw)
>>       File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
>>         p = subprocess.Popen(
>>       File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
>>         self._execute_child(args, executable, preexec_fn, close_fds,
>>       File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
>>         raise child_exception_type(errno_num, err_msg, err_filename)
>>     FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
>>     error: last command exited with $?=1
>>     + cleanup_git
>>     + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + nr_tries_left=60
>>     + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + test 1 -ne 1
>>     + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + nr_tries_left=60
>>     + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>>     + exit 1
>>     + eval_ret=1
>>     + :
>>     not ok 20 - submit from detached head
>>     #
>>     #        test_when_finished cleanup_git &&
>>     #        git p4 clone --dest="$git" //depot &&
>>     #        (
>>     #            cd "$git" &&
>>     #            git checkout p4/master &&
>>     #            >detached_head_test &&
>>     #            git add detached_head_test &&
>>     #            git commit -m "add detached_head" &&
>>     #            git config git-p4.skipSubmitEdit true &&
>>     #            git p4 submit &&
>>     #            git p4 rebase &&
>>     #            git log p4/master | grep detached_head
>>     #        )
>>     #
>>   Error: failed: t9800.21 submit from worktree
>>   [...]
>>
>>
>> Is this the best UI we can have for test failures in CI runs? I hope we can
>> do better. Having said that, this patch series presents a pretty good start,
>> and offers a basis for future improvements.
>>
>> Johannes Schindelin (9):
>>   ci: fix code style
>>   ci/run-build-and-tests: take a more high-level view
>>   ci: make it easier to find failed tests' logs in the GitHub workflow
>>   ci/run-build-and-tests: add some structure to the GitHub workflow
>>     output
>>   tests: refactor --write-junit-xml code
>>   test(junit): avoid line feeds in XML attributes
>>   ci: optionally mark up output in the GitHub workflow
>>   ci: use `--github-workflow-markup` in the GitHub workflow
>>   ci: call `finalize_test_case_output` a little later
>>
>>  .github/workflows/main.yml           |  12 ---
>>  ci/lib.sh                            |  81 ++++++++++++++--
>>  ci/run-build-and-tests.sh            |  11 ++-
>>  ci/run-test-slice.sh                 |   5 +-
>>  t/test-lib-functions.sh              |   4 +-
>>  t/test-lib-github-workflow-markup.sh |  50 ++++++++++
>>  t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
>>  t/test-lib.sh                        | 128 ++++----------------------
>>  8 files changed, 287 insertions(+), 136 deletions(-)
>>  create mode 100644 t/test-lib-github-workflow-markup.sh
>>  create mode 100644 t/test-lib-junit.sh
>>
>>
>> base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a
>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v1
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v1
>> Pull-Request: https://github.com/gitgitgadget/git/pull/1117
>> --
>> gitgitgadget
>>
>>


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-20  2:44   ` Junio C Hamano
@ 2022-02-20 15:25     ` Johannes Schindelin
  2022-02-21  8:09       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-20 15:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git

Hi Junio,

On Sat, 19 Feb 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I notice that you did not take this into `seen` yet. I find that a little
> > sad because it would potentially have helped others to figure out the
> > failure in the latest `seen`:
> > https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
> >
> > Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.
>
> I saw the thread, I saw a few patches were commented on, and a few
> were left unanswered, but one was replied by the original submitter
> with a "Good catch!", making me expect the topic to be discussed or
> rerolled to become ready relatively soon.

Yes, I have local changes, but I had really hoped that this patch series
would get a chance to prove its point by example, i.e. by offering the
improved output for the failures in `seen`. I hoped that because I think
that those improvements speak for themselves when you see them.

Ciao,
Dscho

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-20 15:25     ` Johannes Schindelin
@ 2022-02-21  8:09       ` Ævar Arnfjörð Bjarmason
  2022-02-22 10:26         ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-21  8:09 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git


On Sun, Feb 20 2022, Johannes Schindelin wrote:

> Hi Junio,
>
> On Sat, 19 Feb 2022, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > I notice that you did not take this into `seen` yet. I find that a little
>> > sad because it would potentially have helped others to figure out the
>> > failure in the latest `seen`:
>> > https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
>> >
>> > Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.
>>
>> I saw the thread, I saw a few patches were commented on, and a few
>> were left unanswered, but one was replied by the original submitter
>> with a "Good catch!", making me expect the topic to be discussed or
>> rerolled to become ready relatively soon.
>
> Yes, I have local changes, but I had really hoped that this patch series
> would get a chance to prove its point by example, i.e. by offering the
> improved output for the failures in `seen`. I hoped that because I think
> that those improvements speak for themselves when you see them.

I think it's a good idea to get wider exposure in "seen", "next" etc. for
topics where the bottleneck is lack of feedback due to lack of wider
exposure.

But in this case I've pointed out both direction & UX issues to you that
you haven't addressed: both what I sent a reminder of yesterday in [1]
and, more relevant to what you're discussing here, a reply [2] where I
looked at & tested your new output vs. the old, and found that on test
failures it:

 * Replaced summary output with a much more verbose version.

 * Turned the GitHub UI from usable (but sometimes hard to find the needle
   in the haystack) to *extremely slow*, seemingly because the browser was
   asked to make sense of ~30k lines of output, with some of it hidden
   dynamically by JavaScript.

1. https://lore.kernel.org/git/220220.86bkz1d7hm.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220126.86sftbfjl4.gmgdl@evledraar.gmail.com/

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-21  8:09       ` Ævar Arnfjörð Bjarmason
@ 2022-02-22 10:26         ` Johannes Schindelin
  0 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-22 10:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git

Hi Ævar,

On Mon, 21 Feb 2022, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Feb 20 2022, Johannes Schindelin wrote:
>
> > On Sat, 19 Feb 2022, Junio C Hamano wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >>
> >> > I notice that you did not take this into `seen` yet. I find that a little
> >> > sad because it would potentially have helped others to figure out the
> >> > failure in the latest `seen`:
> >> > https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
> >> >
> >> > Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.
> >>
> >> I saw the thread, I saw a few patches were commented on, and a few
> >> were left unanswered, but one was replied by the original submitter
> >> with a "Good catch!", making me expect the topic to be discussed or
> >> rerolled to become ready relatively soon.
> >
> > Yes, I have local changes, but I had really hoped that this patch series
> > would get a chance to prove its point by example, i.e. by offering the
> > improved output for the failures in `seen`. I hoped that because I think
> > that those improvements speak for themselves when you see them.
>
> I think it's a good idea to get wider exposure in "seen", "next" etc. for
> topics where the bottleneck is lack of feedback due to lack of wider
> exposure.

Having this in `seen` will give the patch series a chance to show in real
life how it improves the process of analyzing regressions.

Ciao,
Johannes

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-20 12:47   ` Ævar Arnfjörð Bjarmason
@ 2022-02-22 10:30     ` Johannes Schindelin
  2022-02-22 13:31       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-22 10:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Junio C Hamano

Hi Ævar,

On Sun, 20 Feb 2022, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Feb 20 2022, Johannes Schindelin wrote:
>
> > I notice that you did not take this into `seen` yet. I find that a little
> > sad because it would potentially have helped others to figure out the
> > failure in the latest `seen`:
> > https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
> >
> > Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.
>
> I left some feedback on your submission ~3 weeks ago that you haven't
> responded to:
> https://lore.kernel.org/git/220127.86ilu5cdnf.gmgdl@evledraar.gmail.com/

You answered my goal of making it easier to figure out regressions by
doubling down on hiding the logs even better. That's not feedback, that's
just ignoring the goal.

You answered my refactor of the Azure Pipelines support with the question
"why?" that I had answered already a long time ago. That's not feedback,
that's ignoring the answers I already provided.

I don't know how to respond to that, therefore I didn't.

Ciao,
Johannes

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-22 10:30     ` Johannes Schindelin
@ 2022-02-22 13:31       ` Ævar Arnfjörð Bjarmason
  2022-02-23 12:07         ` Phillip Wood
  0 siblings, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-22 13:31 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Junio C Hamano


On Tue, Feb 22 2022, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Sun, 20 Feb 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> On Sun, Feb 20 2022, Johannes Schindelin wrote:
>>
>> > I notice that you did not take this into `seen` yet. I find that a little
>> > sad because it would potentially have helped others to figure out the
>> > failure in the latest `seen`:
>> > https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162
>> >
>> > Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3.
>>
>> I left some feedback on your submission ~3 weeks ago that you haven't
>> responded to:
>> https://lore.kernel.org/git/220127.86ilu5cdnf.gmgdl@evledraar.gmail.com/
>
> You answered my goal of making it easier to figure out regressions by
> doubling down on hiding the logs even better. That's not feedback, that's
> just ignoring the goal.

I think it's clear to anyone reading my feedback that that's either a
gross misreading of the feedback I provided or an intentional
misrepresentation.

I don't mention the second one of those lightly, but I think after some
months of that pattern now when commenting on various patches of yours
it's not an unfair claim.

I.e. you generally seem to latch onto some very narrow interpretation or
comment in some larger feedback pointing out various issues to you, and
use that as a reason not to respond to or address any of the rest.

So just to make the point about one of those mentioned in my [1] with
some further details (I won't go into the whole thing to avoid repeating
myself):

I opened both of:

    https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true
    https://github.com/dscho/git/runs/4840190622?check_suite_focus=true

Just now in Firefox 91.5.0esr-1. Both having been opened before, so
they're in cache, and I've got a current 40MB/s real downlink speed etc.

The former fully loads in around 5100ms, with your series here that's
just short of 18000ms.

So your CI changes are making the common case of just looking at a CI
failure more than **3x as slow as before**.

That's according to the "performance" timeline, and not some abstract
"some JS was still running in the background". It lines up with the time
that the scroll bar on the side of the screen stops moving, and the
viewport does the "zoom to end" thing in GitHub CI's UI, focusing on the
error reported. It was really slow before, but it's SLOOOOOW now.

All of which (and I'm no webdev expert) seems to have to do with the
browser engine struggling to keep up with a much larger set of log data
being thrown at it, which, despite the eliding you're adding, can be seen
in the ~1.7k lines of output growing to beyond ~33k now.
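
As a quick check of the figures quoted above, here is a minimal sketch
that computes the two ratios. The "ratio" helper is hypothetical and not
part of either series, and the line counts are the approximate values
given in this message rather than exact measurements:

```shell
# Hypothetical helper: print new/old rounded to one decimal place.
ratio () {
	awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f\n", new / old }'
}

ratio 18000 5100	# page load time in ms, new vs. old: prints 3.5
ratio 33000 1700	# approximate log lines, new vs. old: prints 19.4
```

So the "more than 3x" load time goes along with roughly a 19x increase
in the number of log lines, going by the approximate counts above.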

Once it loads the end result after all of that (re your "doubling down
on hiding the logs even better") is that I need to (and I've got a
sizable vertically mounted screen) scroll through around 6 pages of
output, each of which takes around 3 seconds of Firefox churning on more
than 100% CPU before it shows me the next page.

And even *if* it was instant the names of the failing tests are now
spread across several pages of output, whereas in the "prove" output we
have a quick overall summary separated from the
"ci/print-test-failures.sh" output.

Does that mean the current output is perfect and can't be improved? No,
I also think it sucks. I just think that the current implementation
you've proposed for improving it is making it worse overall.

Which doesn't mean that it couldn't be addressed, fixed, or that the
core idea of using that "group" syntax to aggregate that output into
sections is bad. I think we should use it, just not as it's currently
implemented.

If that's "not feedback" I don't know what is. It's all relevant, and
while I'm elaborating further here, [1] (sent almost a month ago) notes
the same issues.

> You answered my refactor of the Azure Pipelines support with the question
> "why?" that I had answered already a long time ago. That's not feedback,
> that's ignoring the answers I already provided.

I think the gap between that answer and what I was asking you is clear
from the parallel follow-up discussion at [2].

But even your answer there of just wanting to keep it in place doesn't
really answer that question for this series. You're not just keeping
that stale code in place, but actively changing it.

I.e. even if you run with all that, how are others meant to test and
review the changes being proposed here?

I.e. is resurrecting Azure CI required to test this series, or should
reviewers ignore those parts and just hope it all works etc?

> I don't know how to respond to that, therefore I didn't.

Whatever differences in direction we have for this CI feature, or
troubles understanding one another, I don't think it's a good way to
proceed to send an update [3], after 3 weeks of not replying to that
feedback, asking why Junio didn't pick up your patches, in a way that's
indistinguishable from nothing having been said about your patches at
all.

I.e. we're not the only people talking here, there's presumably others
who'll read these threads and will want to comment on the direction of
any CI changes.

Knowing from you that you read outstanding feedback and didn't
understand it, or read it and summarized but ultimately decided to
change nothing etc. makes for a much better flow on the ML than just
ignoring that feedback entirely.

1. https://lore.kernel.org/git/220126.86sftbfjl4.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220222.86y2236ndp.gmgdl@evledraar.gmail.com/
3. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202200043590.26495@tvgsbejvaqbjf.bet/

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-22 13:31       ` Ævar Arnfjörð Bjarmason
@ 2022-02-23 12:07         ` Phillip Wood
  2022-02-25 12:39           ` Ævar Arnfjörð Bjarmason
  2022-02-25 14:10           ` Johannes Schindelin
  0 siblings, 2 replies; 98+ messages in thread
From: Phillip Wood @ 2022-02-23 12:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Junio C Hamano

On 22/02/2022 13:31, Ævar Arnfjörð Bjarmason wrote:
> [...]
> So just to make the point about one of those mentioned in my [1] with
> some further details (I won't go into the whole thing to avoid repeating
> myself):
> 
> I opened both of:
> 
>      https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true
>      https://github.com/dscho/git/runs/4840190622?check_suite_focus=true
> 
> Just now in Firefox 91.5.0esr-1. Both having been opened before, so
> they're in cache, and I've got a current 40MB/s real downlink speed etc.
> 
> The former fully loads in around 5100ms, with your series here that's
> just short of 18000ms.
> 
> So your CI changes are making the common case of just looking at a CI
> failure more than **3x as slow as before**.

I don't think that is the most useful comparison between the two. When I
am investigating a test failure the time that matters to me is the time
it takes to display the output of the failing test case. With the first
link above the initial page load is faster but to get to the output of
the failing test case I have to click on "Run ci/print-test-failures.sh",
then wait for that to load and then search for "not ok" to actually get
to the information I'm after. With the second link the initial page load
does feel slower but then I'm presented with the test failures nicely
highlighted in red; all I have to do is click on one and I've got the
information I'm after. Overall that is much faster and easier to use.
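
For anyone with the raw logs at hand, the "search for not ok" step can
be sketched in shell. This is only an illustration: "list_failures" is
a hypothetical helper, and it assumes that entries for tests marked
with test_expect_failure carry a "# TODO known breakage" suffix:

```shell
# Hypothetical helper: print the "not ok" lines from the given test logs,
# dropping the ones that are flagged as known breakages.
list_failures () {
	grep '^not ok ' "$@" | grep -v '# TODO known breakage' || :
}

# Example invocation from the top of a Git checkout after a failed run:
# list_failures t/test-results/*.out
```

The trailing "|| :" keeps the helper from reporting an error when
nothing matches, which matters in scripts that run under "set -e".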

Best Wishes

Phillip

* Re: [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output
  2022-01-24 18:56 ` [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
@ 2022-02-23 12:13   ` Phillip Wood
  2022-02-25 13:40     ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Phillip Wood @ 2022-02-23 12:13 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git; +Cc: Johannes Schindelin

Hi Dscho

On 24/01/2022 18:56, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> The current output of Git's GitHub workflow can be quite confusing,
> especially for contributors new to the project.
> 
> To make it more helpful, let's introduce some collapsible grouping.
> Initially, readers will see the high-level view of what actually
> happened (did the build fail, or the test suite?). To drill down, the
> respective group can be expanded.
> 
> Note: sadly, workflow output currently cannot contain any nested groups
> (see https://github.com/actions/runner/issues/802 for details),
> therefore we take pains to ensure to end any previous group before
> starting a new one.

Thanks for working on this, I find it makes it much easier to get to the
information I need when a test fails. I'm not familiar with GitHub's
grouping, but I noticed something below I didn't understand.

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>   ci/lib.sh                 | 55 ++++++++++++++++++++++++++++++++++-----
>   ci/run-build-and-tests.sh |  4 +--
>   ci/run-test-slice.sh      |  2 +-
>   3 files changed, 51 insertions(+), 10 deletions(-)
> 
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 2b2c0932320..4ed8f40ab02 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -1,5 +1,49 @@
>   # Library of functions shared by all CI scripts
>   
> +if test true != "$GITHUB_ACTIONS"
> +then
> +	begin_group () { :; }
> +	end_group () { :; }
> +
> +	group () {
> +		shift
> +		"$@"
> +	}
> +	set -x
> +else
> +	begin_group () {
> +		need_to_end_group=t
> +		echo "::group::$1" >&2
> +		set -x
> +	}
> +
> +	end_group () {
> +		test -n "$need_to_end_group" || return 0
> +		set +x
> +		need_to_end_group=
> +		echo '::endgroup::' >&2
> +	}
> +	trap end_group EXIT
> +
> +	group () {
> +		set +x
> +		begin_group "$1"
> +		shift
> +		"$@"
> +		res=$?
> +		end_group
> +		return $res
> +	}
> +
> +	begin_group "CI setup"
> +fi
> +
> +# Set 'exit on error' for all CI scripts to let the caller know that
> +# something went wrong.
> +# Set tracing executed commands, primarily setting environment variables
> +# and installing dependencies.
> +set -e

The comment is moved unchanged but the set command has lost the "-x". We
now have several "set -x" commands in the functions above and another
after the final "end_group" lower down. Does the comment need updating, as
we are no longer enabling the tracing of executed commands here?
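
For readers who, like me, are new to the grouping, here is a minimal
sketch of the helper pattern in the hunk above. It is simplified and not
the patch itself: tracing is left out entirely (in the patch, "set -x" is
switched on inside begin_group and off again inside end_group), and the
"|| res=$?" form is a deviation from the patch so that the sketch also
behaves under "set -e":

```shell
need_to_end_group=

begin_group () {
	need_to_end_group=t
	echo "::group::$1" >&2
}

end_group () {
	# Groups cannot nest, so only close one if it is actually open.
	test -n "$need_to_end_group" || return 0
	need_to_end_group=
	echo '::endgroup::' >&2
}

group () {
	begin_group "$1"
	shift
	# Run the command, remember its exit status, close the group,
	# then hand the status back to the caller.
	res=0
	"$@" || res=$?
	end_group
	return $res
}

group "Build" echo "compiling"
```

The markers go to stderr as in the patch, and the wrapped command's exit
status is what the caller of "group" sees.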

Best Wishes

Phillip

>   skip_branch_tip_with_tag () {
>   	# Sometimes, a branch is pushed at the same time the tag that points
>   	# at the same commit as the tip of the branch is pushed, and building
> @@ -88,12 +132,6 @@ export TERM=${TERM:-dumb}
>   # Clear MAKEFLAGS that may come from the outside world.
>   export MAKEFLAGS=
>   
> -# Set 'exit on error' for all CI scripts to let the caller know that
> -# something went wrong.
> -# Set tracing executed commands, primarily setting environment variables
> -# and installing dependencies.
> -set -ex
> -
>   if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
>   then
>   	CI_TYPE=azure-pipelines
> @@ -138,7 +176,7 @@ then
>   			test_name="${test_exit%.exit}"
>   			test_name="${test_name##*/}"
>   			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
> -			cat "t/test-results/$test_name.out"
> +			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
>   
>   			trash_dir="t/trash directory.$test_name"
>   			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
> @@ -234,3 +272,6 @@ linux-leaks)
>   esac
>   
>   MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
> +
> +end_group
> +set -x
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index e49f9eaa8c0..5516f45f7fe 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -48,10 +48,10 @@ esac
>   # Any new "test" targets should not go after this "make", but should
>   # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
>   # start running tests.
> -make
> +group Build make
>   if test -n "$run_tests"
>   then
> -	make test ||
> +	group "Run tests" make test ||
>   	handle_failed_tests
>   fi
>   check_unignored_build_artifacts
> diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
> index 63358c23e11..a3c67956a8d 100755
> --- a/ci/run-test-slice.sh
> +++ b/ci/run-test-slice.sh
> @@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
>   *) ln -s "$cache_dir/.prove" t/.prove;;
>   esac
>   
> -make --quiet -C t T="$(cd t &&
> +group "Run tests" make --quiet -C t T="$(cd t &&
>   	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
>   	tr '\n' ' ')" ||
>   handle_failed_tests



* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-23 12:07         ` Phillip Wood
@ 2022-02-25 12:39           ` Ævar Arnfjörð Bjarmason
  2022-02-25 14:10           ` Johannes Schindelin
  1 sibling, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-25 12:39 UTC (permalink / raw)
  To: phillip.wood
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Junio C Hamano


On Wed, Feb 23 2022, Phillip Wood wrote:

> On 22/02/2022 13:31, Ævar Arnfjörð Bjarmason wrote:
>> [...]
>> So just to make the point about one of those mentioned in my [1] with
>> some further details (I won't go into the whole thing to avoid repeating
>> myself):
>> I opened both of:
>>      https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true
>>      https://github.com/dscho/git/runs/4840190622?check_suite_focus=true
>> Just now in Firefox 91.5.0esr-1. Both having been opened before, so
>> they're in cache, and I've got a current 40MB/s real downlink speed etc.
>> The former fully loads in around 5100ms, with your series here that's
>> just short of 18000ms.
>> So your CI changes are making the common case of just looking at a CI
>> failure more than **3x as slow as before**.
>
> I don't think that is the most useful comparison between the two.[...]

I'm not saying that it's the most useful comparison between the two, but
that there's a major performance regression introduced in this series
that so far isn't addressed or noted.

> [...]When
> I am investigating a test failure the time that matters to me is the
> time it takes to display the output of the failing test case. With the
> first link above the initial page load is faster but to get to the
> output of the failing test case I have to click on "Run
> ci/print_test_failures.sh" then wait for that to load and then search
> for "not ok" to actually get to the information I'm after. With the
> second link the initial page load does feel slower but then I'm
> presented with the test failures nicely highlighted in red, all I
> have to do is click on one and I've got the information I'm
> after. Overall that is much faster and easier to use.

Whether you think the regression is worth the end result is a subjective
judgement. I don't think it is, but I don't think you or anyone else is
wrong if they don't agree.

If you think it's OK to spend ~20s instead of ~5s on rendering the new
output that's something that clearly depends on how much you value the
new output, and how much you're willing to wait.

What I am saying, and what I hope you'll agree with, is that it's
something that should be addressed in some way by this series.

One way to do that would be to note the performance regression in a
commit message, and argue that despite the slowdown it's worth it.


* Re: [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output
  2022-02-23 12:13   ` Phillip Wood
@ 2022-02-25 13:40     ` Johannes Schindelin
  0 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-25 13:40 UTC (permalink / raw)
  To: phillip.wood; +Cc: Johannes Schindelin via GitGitGadget, git

Hi Phillip,

On Wed, 23 Feb 2022, Phillip Wood wrote:

> On 24/01/2022 18:56, Johannes Schindelin via GitGitGadget wrote:
>
> > +# Set 'exit on error' for all CI scripts to let the caller know that
> > +# something went wrong.
> > +# Set tracing executed commands, primarily setting environment variables
> > +# and installing dependencies.
> > +set -e
>
> The comment is moved unchanged but the set command has lost the "-x". We now
> have several "set -x" commands in the functions above and one below
> "end_group" lower down. Does the comment need updating as we are not enabling
> the tracing of executed commands here anymore?

Oh yes, the comment needs to be updated. Thank you for pointing that out.

Ciao,
Dscho


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-23 12:07         ` Phillip Wood
  2022-02-25 12:39           ` Ævar Arnfjörð Bjarmason
@ 2022-02-25 14:10           ` Johannes Schindelin
  2022-02-25 18:16             ` Junio C Hamano
  2022-03-02 10:58             ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Phillip Wood
  1 sibling, 2 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-02-25 14:10 UTC (permalink / raw)
  To: phillip.wood
  Cc: Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Junio C Hamano


Hi Phillip,

On Wed, 23 Feb 2022, Phillip Wood wrote:

> On 22/02/2022 13:31, Ævar Arnfjörð Bjarmason wrote:
> > [...]
> > So just to make the point about one of those mentioned in my [1] with
> > some further details (I won't go into the whole thing to avoid repeating
> > myself):
> >
> > I opened both of:
> >
> > https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true
> > https://github.com/dscho/git/runs/4840190622?check_suite_focus=true
> >
> > Just now in Firefox 91.5.0esr-1. Both having been opened before, so
> > they're in cache, and I've got a current 40MB/s real downlink speed etc.
> >
> > The former fully loads in around 5100ms, with your series here that's
> > just short of 18000ms.
> >
> > So your CI changes are making the common case of just looking at a CI
> > failure more than **3x as slow as before**.
>
> I don't think that is the most useful comparison between the two. When I
> am investigating a test failure the time that matters to me is the time
> it takes to display the output of the failing test case.

Thank you for expressing this so clearly. I will adopt a variation of this
phrasing in my commit message, if you don't mind?

> With the first link above the initial page load is faster but to get to
> the output of the failing test case I have to click on "Run
> ci/print_test_failures.sh" then wait for that to load and then search
> for "not ok" to actually get to the information I'm after.

And that's only because you are familiar with what you have to do.

Any new contributor would be stuck with the information presented on the
initial load, without any indication that more information _is_ available,
just hidden away in the next step's log (which is marked as "succeeding",
therefore misleading the inclined reader into thinking that this cannot
potentially contain any information pertinent to the _failure_ that needs
to be investigated).

> With the second link the initial page load does feel slower but then I'm
> presented with the test failures nicely highlighted in red, all I have
> to do is click on one and I've got the information I'm after.
>
> Overall that is much faster and easier to use.

Thank you for your comment. I really started to doubt myself, getting the
idea that it's just a case of me holding this thing wrong.

For what it's worth, I did make a grave mistake by using that particular
`seen` CI failure with all of those failing p4 tests, which obviously
resulted in an incredibly large amount of logs. Obviously that _must_ be
slow to load. I just did not have the time to fabricate a CI failure.

However, I do agree with you that the large amount of logs would have to
be looked at _anyway_, whether it is shown upon loading the job's logs or
only when expanding the `print-test-failures` step's logs. The amount of
the logs is a constant, after all, I did not change anything there (nor
would I).

So a better example might be my concrete use case yesterday: the CI build
of `seen` failed. Here is the link to the regular output:

	https://github.com/git/git/actions/runs/1890665968

On that page, you see the following:


	Annotations
	8 errors and 1 warning

	ⓧ win test (3)
	  Process completed with exit code 2.

	ⓧ win test (6)
	  Process completed with exit code 2.

	ⓧ win test (2)
	  Process completed with exit code 2.

	ⓧ win+VS test (3)
	  Process completed with exit code 2.

	ⓧ win+VS test (6)
	  Process completed with exit code 2.

	ⓧ win+VS test (2)
	  Process completed with exit code 2.

	ⓧ osx-gcc (macos-latest)
	  Process completed with exit code 2.

	ⓧ osx-clang (macos-latest)
	  Process completed with exit code 2.

	⚠ CI: .github#L1
	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856

So I merged my branch into `seen` and pushed it. The corresponding run can
be seen here:

	https://github.com/dscho/git/actions/runs/1892982393

On that page, you see the following:

	Annotations
	50 errors and 1 warning

	ⓧ win test (3)
	  failed: t7527.1 explicit daemon start and stop

	ⓧ win test (3)
	  failed: t7527.2 implicit daemon start

	ⓧ win test (3)
	  failed: t7527.3 implicit daemon stop (delete .git)

	ⓧ win test (3)
	  failed: t7527.4 implicit daemon stop (rename .git)

	ⓧ win test (3)
	  failed: t7527.5 implicit daemon stop (rename GIT~1)

	ⓧ win test (3)
	  failed: t7527.6 implicit daemon stop (rename GIT~2)

	ⓧ win test (3)
	  failed: t7527.8 cannot start multiple daemons

	ⓧ win test (3)
	  failed: t7527.10 update-index implicitly starts daemon

	ⓧ win test (3)
	  failed: t7527.11 status implicitly starts daemon

	ⓧ win test (3)
	  failed: t7527.12 edit some files

	ⓧ win test (2)
	  failed: t0012.81 fsmonitor--daemon can handle -h

	ⓧ win test (2)
	  Process completed with exit code 1.

	ⓧ win test (6)
	  failed: t7519.2 run fsmonitor-daemon in bare repo

	ⓧ win test (6)
	  failed: t7519.3 run fsmonitor-daemon in virtual repo

	ⓧ win test (6)
	  Process completed with exit code 1.

	ⓧ win+VS test (3)
	  failed: t7527.1 explicit daemon start and stop

	ⓧ win+VS test (3)
	  failed: t7527.2 implicit daemon start

	ⓧ win+VS test (3)
	  failed: t7527.3 implicit daemon stop (delete .git)

	ⓧ win+VS test (3)
	  failed: t7527.4 implicit daemon stop (rename .git)

	ⓧ win+VS test (3)
	  failed: t7527.5 implicit daemon stop (rename GIT~1)

	ⓧ win+VS test (3)
	  failed: t7527.6 implicit daemon stop (rename GIT~2)

	ⓧ win+VS test (3)
	  failed: t7527.8 cannot start multiple daemons

	ⓧ win+VS test (3)
	  failed: t7527.10 update-index implicitly starts daemon

	ⓧ win+VS test (3)
	  failed: t7527.11 status implicitly starts daemon

	ⓧ win+VS test (3)
	  failed: t7527.12 edit some files

	ⓧ win+VS test (2)
	  failed: t0012.81 fsmonitor--daemon can handle -h

	ⓧ win+VS test (2)
	  Process completed with exit code 1.

	ⓧ win+VS test (6)
	  failed: t7519.2 run fsmonitor-daemon in bare repo

	ⓧ win+VS test (6)
	  failed: t7519.3 run fsmonitor-daemon in virtual repo

	ⓧ win+VS test (6)
	  Process completed with exit code 1.

	ⓧ osx-clang (macos-latest)
	  failed: t0012.81 fsmonitor--daemon can handle -h

	ⓧ osx-clang (macos-latest)
	  failed: t7519.2 run fsmonitor-daemon in bare repo

	ⓧ osx-clang (macos-latest)
	  failed: t7527.1 explicit daemon start and stop

	ⓧ osx-clang (macos-latest)
	  failed: t7527.2 implicit daemon start

	ⓧ osx-clang (macos-latest)
	  failed: t7527.3 implicit daemon stop (delete .git)

	ⓧ osx-clang (macos-latest)
	  failed: t7527.4 implicit daemon stop (rename .git)

	ⓧ osx-clang (macos-latest)
	  failed: t7527.7 MacOS event spelling (rename .GIT)

	ⓧ osx-clang (macos-latest)
	  failed: t7527.8 cannot start multiple daemons

	ⓧ osx-clang (macos-latest)
	  failed: t7527.10 update-index implicitly starts daemon

	ⓧ osx-clang (macos-latest)
	  failed: t7527.11 status implicitly starts daemon

	ⓧ osx-gcc (macos-latest)
	  failed: t0012.81 fsmonitor--daemon can handle -h

	ⓧ osx-gcc (macos-latest)
	  failed: t7519.2 run fsmonitor-daemon in bare repo

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.1 explicit daemon start and stop

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.2 implicit daemon start

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.3 implicit daemon stop (delete .git)

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.4 implicit daemon stop (rename .git)

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.7 MacOS event spelling (rename .GIT)

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.8 cannot start multiple daemons

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.10 update-index implicitly starts daemon

	ⓧ osx-gcc (macos-latest)
	  failed: t7527.11 status implicitly starts daemon

	⚠ CI: .github#L1
	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856

In my mind, this is already an improvement. (Even if this is a _lot_ of
output, and a lot of individual errors, given that all of them are fixed
with a single, small patch to adjust an option usage string, but that's
not the fault of my patch series, but of the suggestion to put the check
for the option usage string linting into the `parse_options()` machinery
instead of into the static analysis job.)

Since there are still plenty of failures, the page admittedly does load
relatively slowly. But that's not the time I was trying to optimize for.
My time comes at quite a premium these days, so if the computer has to
work a little harder while I can do something else, as long as it saves
_me_ time, I'll take that time. Every time.

Ciao,
Dscho


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-25 14:10           ` Johannes Schindelin
@ 2022-02-25 18:16             ` Junio C Hamano
  2022-02-26 18:43               ` Junio C Hamano
                                 ` (2 more replies)
  2022-03-02 10:58             ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Phillip Wood
  1 sibling, 3 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-02-25 18:16 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> So I merged my branch into `seen` and pushed it. The corresponding run can
> be seen here:
>
> 	https://github.com/dscho/git/actions/runs/1892982393

I visited this page (while logged in to GitHub---I am saying this
for others who may not know the output is shown differently for
visitors that are not logged in and for logged-in users).

> On that page, you see the following:
>
> 	Annotations
> 	50 errors and 1 warning
>
> 	ⓧ win test (3)
> 	  failed: t7527.1 explicit daemon start and stop
> ...
>
> 	⚠ CI: .github#L1
> 	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856
>
> In my mind, this is already an improvement. (Even if this is a _lot_ of
> output, and a lot of individual errors, given that all of them are fixed
> with a single, small patch to adjust an option usage string, but that's
> not the fault of my patch series, but of the suggestion to put the check
> for the option usage string linting into the `parse_options()` machinery
> instead of into the static analysis job.)

It is not obvious what aspect in the new output _you_ found "an
improvement" to your readers, because you didn't spell it out.  That
makes "in my mind, this is already an improvement" a claim that is
unnecessarily weaker than it really is.

Let me tell my experience:

 - Clicking on macos+clang in the map-looking thing, it did show and
   scroll down automatically to show the last failure link ready to
   be clicked after a few seconds, which was nice, but made me
   scroll back to see the first failure, which could have been
   better.

 - Clicking on win+VS test (2), the failed <test> part was
   automatically opened, and a circle spun for several dozen
   seconds to make me wait, but after that, nothing happened.  It
   was somewhat hard to know if I were expected to do something to
   view the first error and when the UI is ready to let me do so, or
   if I were just expected to wait a bit longer for it to all happen
   automatically.

In either case, the presentation to fold all the pieces that finished
successfully made it usable, as that saved human time to scan to
where failures are shown.

I personally do not care about the initial latency when viewing the
output from a CI run that may have happened a few dozen minutes
ago (I do not sit in front of GitHub CI UI and wait until it
finishes). As long as it is made clear when I can start interacting
with it, I can just open the page and let it load while I am working
on something else.

Thanks.


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-25 18:16             ` Junio C Hamano
@ 2022-02-26 18:43               ` Junio C Hamano
  2022-03-01  2:59                 ` Junio C Hamano
  2022-03-01 10:10                 ` Johannes Schindelin
  2022-03-01 10:20               ` Johannes Schindelin
  2022-03-04  7:38               ` win+VS environment has "cut" but not "paste"? Junio C Hamano
  2 siblings, 2 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-02-26 18:43 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Junio C Hamano <gitster@pobox.com> writes:

> Let me tell my experience:
>
>  - Clicking on macos+clang in the map-looking thing, it did show and
>    scroll down automatically to show the last failure link ready to
>    be clicked after a few seconds, which was nice, but made me
>    scroll back to see the first failure, which could have been
>    better.
>
>  - Clicking on win+VS test (2), the failed <test> part was
>    automatically opened, and a circle spun for several dozen
>    seconds to make me wait, but after that, nothing happened.  It
>    was somewhat hard to know if I were expected to do something to
>    view the first error and when the UI is ready to let me do so, or
>    if I were just expected to wait a bit longer for it to all happen
>    automatically.
>
> In either case, the presentation to fold all the pieces that finished
> successfully made it usable, as that saved human time to scan to
> where failures are shown.
>
> I personally do not care about the initial latency when viewing the
> output from a CI run that may have happened a few dozen minutes
> ago (I do not sit in front of GitHub CI UI and wait until it
> finishes). As long as it is made clear when I can start interacting
> with it, I can just open the page and let it load while I am working
> on something else.

FWIW, CI run on "seen" uses this series.

When I highlight a failure at CI, I often give a URL like this:

https://github.com/git/git/runs/5343133021?check_suite_focus=true#step:4:5520

I notice that this "hide by default" forces the recipient of the URL
to click the line after the line with a red highlight before they
can view the breakage.

For example, a URL to show a similar breakage from the old run
(without this series) looks like this:

https://github.com/git/git/runs/5341052811?check_suite_focus=true#step:5:3968

This directly jumps to the error and the recipient of the URL does
not have to do anything special, which I have been using as a
convenient way to give developers a starting point.

I haven't compared the implementation of this one and Ævar's series
that aims for a different goal, so I do not yet have an opinion on
which one should come first (if we want to achieve both of what each
of them wants to achieve, that is).

Thanks.



* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-26 18:43               ` Junio C Hamano
@ 2022-03-01  2:59                 ` Junio C Hamano
  2022-03-01  6:35                   ` Junio C Hamano
  2022-03-01 10:18                   ` Johannes Schindelin
  2022-03-01 10:10                 ` Johannes Schindelin
  1 sibling, 2 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-01  2:59 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Junio C Hamano <gitster@pobox.com> writes:

> FWIW, CI run on "seen" uses this series.

Another "early impression".  I had to open this one today,

    https://github.com/git/git/runs/5367854000?check_suite_focus=true

which was a jarring experience.  It correctly painted the fourth
circle "Run ci/run-build-and-tests.sh" in red with X in it, and
after waiting for a while (which I already said that I do not mind
at all), showed a bunch of lines, and then auto-scrolled down to the
end of that section.

It _looked_ like that it was now ready for me to interact with it,
so I started to scroll up to the beginning of that section, but I
had to stare at blank space for several minutes before lines are
shown to occupy that space.  During the repainting, unlike the
initial delay-wait that lets me know that it is not ready by showing
the spinning circle, there was no indication that it wants me to
wait until it fills the blank space with lines.  Not very pleasant.

I do not think it is so bad as to say that it is less pleasant than
opening the large "print test failures" section and looking for "not
ok", which was what the original CI UI we had before this series
required.  But at least with the old one, once the UI became ready for
me to interact with, I didn't have to wait for (for the lack of a
better phrase) such UI hiccups.  Responses to looking for the next
instance of "not ok" were predictable.

Thanks.


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01  2:59                 ` Junio C Hamano
@ 2022-03-01  6:35                   ` Junio C Hamano
  2022-03-01 10:18                   ` Johannes Schindelin
  1 sibling, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-01  6:35 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> FWIW, CI run on "seen" uses this series.
>
> Another "early impression".  I had to open this one today,
>
>     https://github.com/git/git/runs/5367854000?check_suite_focus=true
>
> which was a jarring experience.  It correctly painted the fourth
> circle "Run ci/run-build-and-tests.sh" in red with X in it, and
> after waiting for a while (which I already said that I do not mind
> at all), showed a bunch of lines, and then auto-scrolled down to the
> end of that section.
>
> It _looked_ like that it was now ready for me to interact with it,
> so I started to scroll up to the beginning of that section, but I
> had to stare at blank space for several minutes before lines are

Nah, that was several seconds, not minutes.  Even though I am on
Chromebooks, they are not _that_ slow ;-)

> shown to occupy that space.  During the repainting, unlike the
> initial delay-wait that lets me know that it is not ready by showing
> the spinning circle, there was no indication that it wants me to
> wait until it fills the blank space with lines.  Not very pleasant.
>
> I do not think it is so bad as to say that it is less pleasant than
> opening the large "print test failures" section and looking for "not
> ok", which was what the original CI UI we had before this series
> required.  But at least with the old one, once the UI became ready for
> me to interact with, I didn't have to wait for (for the lack of a
> better phrase) such UI hiccups.  Responses to looking for the next
> instance of "not ok" were predictable.
>
> Thanks.


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-26 18:43               ` Junio C Hamano
  2022-03-01  2:59                 ` Junio C Hamano
@ 2022-03-01 10:10                 ` Johannes Schindelin
  2022-03-01 16:57                   ` Junio C Hamano
  1 sibling, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-01 10:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Hi Junio,

On Sat, 26 Feb 2022, Junio C Hamano wrote:

> When I highlight a failure at CI, I often give a URL like this:
>
> https://github.com/git/git/runs/5343133021?check_suite_focus=true#step:4:5520
>
> I notice that this "hide by default" forces the recipient of the URL
> to click the line after the line with a red highlight before they
> can view the breakage.

Yes, that's because line 5520 is the header of that group. If you direct
the reader to
https://github.com/git/git/runs/5343133021?check_suite_focus=true#step:4:5674
instead, it will get expanded.

If it did not get expanded, that would be a bug, but obviously not in my
patch series.
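
For readers unfamiliar with the mechanism: GitHub Actions folds every log
line printed between a `::group::<title>` and an `::endgroup::` workflow
command under the line that carries the title, which is why a link to the
title line lands on a collapsed group while a link to a line inside it
expands the group. A minimal sketch of a wrapper in the spirit of the
`group` calls quoted in this thread follows; the function body is an
illustration written under that assumption, not the code from ci/lib.sh:

```shell
# Illustrative only: a simplified stand-in for a `group` helper.
# Usage: group <title> <command> [<args>...]
group () {
	title=$1
	shift
	# Everything printed until "::endgroup::" is folded under <title>.
	printf '::group::%s\n' "$title"
	res=0
	# "|| res=$?" keeps a failing command from aborting under "set -e".
	"$@" || res=$?
	printf '::endgroup::\n'
	# Hand the wrapped command's exit status back to the caller.
	return $res
}

group "Build" echo "compiling..."
```

The wrapper returns the wrapped command's exit status, so a caller can
still write `group "Run tests" make test || handle_failed_tests`.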

Ciao,
Dscho


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01  2:59                 ` Junio C Hamano
  2022-03-01  6:35                   ` Junio C Hamano
@ 2022-03-01 10:18                   ` Johannes Schindelin
  2022-03-01 16:52                     ` Junio C Hamano
  1 sibling, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-01 10:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Hi Junio,

On Mon, 28 Feb 2022, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
> > FWIW, CI run on "seen" uses this series.
>
> Another "early impression".  I had to open this one today,
>
>     https://github.com/git/git/runs/5367854000?check_suite_focus=true
>
> which was a jarring experience.  It correctly painted the fourth
> circle "Run ci/run-build-and-tests.sh" in red with X in it, and
> after waiting for a while (which I already said that I do not mind
> at all), showed a bunch of lines, and then auto-scrolled down to the
> end of that section.
>
> It _looked_ like that it was now ready for me to interact with it,
> so I started to scroll up to the beginning of that section, but I
> had to stare at blank space for several minutes before lines are
> shown to occupy that space.  During the repainting, unlike the
> initial delay-wait that lets me know that it is not ready by showing
> the spinning circle, there was no indication that it wants me to
> wait until it fills the blank space with lines.  Not very pleasant.
>
> I do not think it is so bad as to say that it is less pleasant than
> opening the large "print test failures" section and looking for "not
> ok", which was what the original CI UI we had before this series
> required.  But at least with the old one, once the UI became ready for
> me to interact with, I didn't have to wait for (for the lack of a
> better phrase) such UI hiccups.  Responses to looking for the next
> instance of "not ok" were predictable.

Let me again state my goal clearly, because some readers seem to be
confused and believe that I want to improve the developer experience of
veterans of the Git mailing list who are more than capable of finding
their way through the build failures.

My goal is instead to make new contributors' lives easier.

It is a pretty high bar we set, expecting a new contributor faced with a
build failure to figure out how to fix the breakage. It's as if we
_wanted_ to instill impostor syndrome in them, which is probably not
intentional.

In that respect, a relatively responsive page that utterly fails to direct
the reader to the culprit is far, far worse than a slightly bumpy page
that _does_ lead the reader in the right direction.

In any case, thank you for integrating the patches into `seen` so that the
impact of the patches can be "seen".

Ciao,
Dscho


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-25 18:16             ` Junio C Hamano
  2022-02-26 18:43               ` Junio C Hamano
@ 2022-03-01 10:20               ` Johannes Schindelin
  2022-03-04  7:38               ` win+VS environment has "cut" but not "paste"? Junio C Hamano
  2 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-01 10:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git


Hi Junio,

On Fri, 25 Feb 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > On that page, you see the following:
> >
> > 	Annotations
> > 	50 errors and 1 warning
> >
> > 	ⓧ win test (3)
> > 	  failed: t7527.1 explicit daemon start and stop
> > ...
> >
> > 	⚠ CI: .github#L1
> > 	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856
> >
> > In my mind, this is already an improvement. (Even if this is a _lot_ of
> > output, and a lot of individual errors, given that all of them are fixed
> > with a single, small patch to adjust an option usage string, but that's
> > not the fault of my patch series, but of the suggestion to put the check
> > for the option usage string linting into the `parse_options()` machinery
> > instead of into the static analysis job.)
>
> It is not obvious what aspect in the new output _you_ found "an
> improvement" to your readers, because you didn't spell it out.

True.

The difference is those added lines that clearly indicate not only the
failed job, but also the precise test case that failed. All in the summary
page, without requiring any clicking or scrolling.
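
Lines like these can be produced with GitHub's `::error::<message>`
workflow command, which turns a log line into an annotation on the summary
page. A rough sketch, assuming TAP-style test output on stdin in which
failures start with "not ok"; the function name `annotate_failures` and
the exact message format are made up for illustration and are not taken
from the patch series:

```shell
# Hypothetical helper: reads TAP-style test output on stdin and prints
# one "::error::" workflow command per failed test case.
# $1 is a label for the test script (e.g. "t7527").
annotate_failures () {
	script=$1
	# "not ok 3 - title" becomes "::error::failed: <script>.3 title"
	sed -n "s/^not ok \([0-9][0-9]*\) - \(.*\)/::error::failed: $script.\1 \2/p"
}

printf '%s\n' \
	'ok 1 - setup' \
	'not ok 2 - implicit daemon start' |
annotate_failures t7527
```

Passing test cases produce no output, so only failures become annotations.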

Thank you,
Dscho


* [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
                   ` (11 preceding siblings ...)
  2022-02-19 23:46 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin
@ 2022-03-01 10:24 ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
                     ` (12 more replies)
  12 siblings, 13 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin

Changes since v1:

 * In the patch that removed MAKE_TARGETS, a stale comment about that
   variable is also removed.
 * The comment about set -x has been adjusted because it no longer applies
   as-is.
 * The commit message of "ci: make it easier to find failed tests' logs in
   the GitHub workflow" has been adjusted to motivate the improvement
   better.


Background
==========

Recent patches intended to help readers figure out CI failures much quicker
than before. Unfortunately, they haven't been entirely positive for me. For
example, they broke the branch protections in Microsoft's fork of Git, where
we require Pull Requests to pass a certain set of Checks (which are
identified by their names) and therefore caused follow-up work.

Using CI and in general making it easier for new contributors is an area I'm
passionate about, and one I'd like to see improved.


The current situation
=====================

Let me walk you through the current experience when a PR build fails: I get
a notification mail that only says that a certain job failed. There's no
indication of which test failed (or was it the build?). I can click on a
link and it takes me to the workflow run. Once there, all it says is "Process
completed with exit code 1", or even "code 2". Sure, I can click on one of
the failed jobs. It even expands the failed step's log (collapsing the other
steps). And what do I see there?

Let's look at an example of a failed linux-clang (ubuntu-latest) job
[https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:

[...]
Test Summary Report
-------------------
t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
  Failed test:  49
  Non-zero exit status: 1
t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
  TODO passed:   45, 47
Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
Result: FAIL
make[1]: *** [Makefile:53: prove] Error 1
make[1]: Leaving directory '/home/runner/work/git/git/t'
make: *** [Makefile:3018: test] Error 2


That's it. I count myself lucky not to be a new contributor being faced with
something like this.

Now, since I have been active in the Git project for a couple of days or so,
I can make sense of the "TODO passed" label and know that for the purpose of
fixing the build failures, I need to ignore this, and that I need to focus on
the "Failed test" part instead.

I also know that I do not have to get myself an ubuntu-latest box just to
reproduce the error, I do not even have to check out the code and run it
just to learn what that "49" means.

I know, and I do not expect any new contributor, not even most seasoned
contributors to know, that I have to patiently collapse the "Run
ci/run-build-and-tests.sh" job's log, and instead expand the "Run
ci/print-test-failures.sh" job log (which did not fail and hence does not
draw any attention to it).

I know, and again: I do not expect many others to know this, that I then
have to click into the "Search logs" box (not the regular web browser's
search via Ctrl+F!) and type in "not ok" to find the log of the failed test
case (and this might still be a "known broken" one that is marked via
test_expect_failure and once again needs to be ignored).

To be excessively clear: This is not a great experience!


Improved output
===============

Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
even showed flaky tests, and trends e.g. how long the test cases ran. When I
ported Git's CI over to GitHub workflows (to make CI more accessible to new
contributors), I knew fully well that we would leave this very nice UI
behind, and I had hoped that we would get something similar back via new,
community-contributed GitHub Actions that can be used in GitHub workflows.
However, most likely because we use a home-grown test framework implemented
in opinionated POSIX shell scripts, that did not happen.

So I had a look at what standards exist e.g. when testing PowerShell
modules, in the way of marking up their test output in GitHub workflows, and
I was not disappointed: GitHub workflows support "grouping" of output lines,
i.e. marking sections of the output as a group that is then collapsed by
default and can be expanded. And it is this feature I decided to use in this
patch series, along with GitHub workflows' commands to display errors or
notices that are also shown on the summary page of the workflow run. Now, in
addition to "Process completed with exit code" on the summary page, we also
read something like:

⊗ linux-gcc (ubuntu-latest)
   failed: t9800.20 submit from detached head


Even better, this message is a link, and following that, the reader is
presented with something like this
[https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:

⏵ Run ci/run-build-and-tests.sh
⏵ CI setup
  + ln -s /home/runner/none/.prove t/.prove
  + run_tests=t
  + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
  + group Build make
  + set +x
⏵ Build
⏵ Run tests
  === Failed test: t9800-git-p4-basic ===
⏵ ok: t9800.1 start p4d
⏵ ok: t9800.2 add p4 files
⏵ ok: t9800.3 basic git p4 clone
⏵ ok: t9800.4 depot typo error
⏵ ok: t9800.5 git p4 clone @all
⏵ ok: t9800.6 git p4 sync uninitialized repo
⏵ ok: t9800.7 git p4 sync new branch
⏵ ok: t9800.8 clone two dirs
⏵ ok: t9800.9 clone two dirs, @all
⏵ ok: t9800.10 clone two dirs, @all, conflicting files
⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
⏵ ok: t9800.12 clone using non-numeric revision ranges
⏵ ok: t9800.13 clone with date range, excluding some changes
⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
⏵ ok: t9800.15 exit gracefully for p4 server errors
⏵ ok: t9800.16 clone --bare should make a bare repository
⏵ ok: t9800.17 initial import time from top change time
⏵ ok: t9800.18 unresolvable host in P4PORT should display error
⏵ ok: t9800.19 run hook p4-pre-submit before submit
  Error: failed: t9800.20 submit from detached head
⏵ failure: t9800.20 submit from detached head 
  Error: failed: t9800.21 submit from worktree
⏵ failure: t9800.21 submit from worktree 
  === Failed test: t9801-git-p4-branch ===
  [...]


The "Failed test:" lines are colored in yellow to give a better visual clue
about the logs' structure, the "Error:" label is colored in red to draw
attention to the important part of the log, and the "⏵" characters indicate
that part of the log is collapsed and can be expanded by clicking on it.
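
For readers unfamiliar with the markup behind this rendering, here is a
minimal sketch of the workflow commands involved. It is not taken from the
patches: `emit_failure_report` is a hypothetical helper, and the exact
message strings are an assumption inferred from the rendered output above.

```shell
# Sketch only: emit_failure_report is a hypothetical helper, not part of
# this series. It prints an error annotation followed by a collapsible
# group that contains the log of the failed test case.
emit_failure_report () {
	test_id=$1
	title=$2
	log=$3
	# "::error::" creates an annotation that also shows up on the
	# summary page of the workflow run.
	printf '::error::failed: %s %s\n' "$test_id" "$title"
	# "::group::" ... "::endgroup::" marks a section of the output
	# that is collapsed by default and can be expanded.
	printf '::group::failure: %s %s\n' "$test_id" "$title"
	printf '%s\n' "$log"
	printf '::endgroup::\n'
}

emit_failure_report t9800.20 "submit from detached head" \
	"not ok 20 - submit from detached head"
```

Outside of a GitHub workflow run these lines are just plain text, which is
presumably why the markup in the series is opt-in.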

To drill down, the reader merely needs to expand the (failed) test case's
log by clicking on it, and then study the log. If needed (e.g. when the test
case relies on side effects from previous test cases), the logs of preceding
test cases can be expanded as well. In this example, when expanding
t9800.20, it looks like this (for ease of reading, I cut a few chunks of
lines, indicated by "[...]"):

[...]
⏵ ok: t9800.19 run hook p4-pre-submit before submit
  Error: failed: t9800.20 submit from detached head
⏷ failure: t9800.20 submit from detached head 
      test_when_finished cleanup_git &&
      git p4 clone --dest="$git" //depot &&
        (
          cd "$git" &&
          git checkout p4/master &&
          >detached_head_test &&
          git add detached_head_test &&
          git commit -m "add detached_head" &&
          git config git-p4.skipSubmitEdit true &&
          git p4 submit &&
            git p4 rebase &&
            git log p4/master | grep detached_head
        )
    [...]
    Depot paths: //depot/
    Import destination: refs/remotes/p4/master
    
    Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
    Perforce db files in '.' will be created if missing...
    
    Traceback (most recent call last):
      File "/home/runner/work/git/git/git-p4", line 4455, in <module>
        main()
      File "/home/runner/work/git/git/git-p4", line 4449, in main
        if not cmd.run(args):
      File "/home/runner/work/git/git/git-p4", line 2590, in run
        rebase.rebase()
      File "/home/runner/work/git/git/git-p4", line 4121, in rebase
        if len(read_pipe("git diff-index HEAD --")) > 0:
      File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
        retcode, out, err = read_pipe_full(c, *k, **kw)
      File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
        p = subprocess.Popen(
      File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
    error: last command exited with $?=1
    + cleanup_git
    + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + nr_tries_left=60
    + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + test 1 -ne 1
    + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + nr_tries_left=60
    + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
    + exit 1
    + eval_ret=1
    + :
    not ok 20 - submit from detached head
    #    
    #        test_when_finished cleanup_git &&
    #        git p4 clone --dest="$git" //depot &&
    #        (
    #            cd "$git" &&
    #            git checkout p4/master &&
    #            >detached_head_test &&
    #            git add detached_head_test &&
    #            git commit -m "add detached_head" &&
    #            git config git-p4.skipSubmitEdit true &&
    #            git p4 submit &&
    #            git p4 rebase &&
    #            git log p4/master | grep detached_head
    #        )
    #    
  Error: failed: t9800.21 submit from worktree
  [...]


Is this the best UI we can have for test failures in CI runs? I hope we can
do better. Having said that, this patch series presents a pretty good start,
and offers a basis for future improvements.

Johannes Schindelin (9):
  ci: fix code style
  ci/run-build-and-tests: take a more high-level view
  ci: make it easier to find failed tests' logs in the GitHub workflow
  ci/run-build-and-tests: add some structure to the GitHub workflow
    output
  tests: refactor --write-junit-xml code
  test(junit): avoid line feeds in XML attributes
  ci: optionally mark up output in the GitHub workflow
  ci: use `--github-workflow-markup` in the GitHub workflow
  ci: call `finalize_test_case_output` a little later

 .github/workflows/main.yml           |  12 ---
 ci/lib.sh                            |  82 +++++++++++++++--
 ci/run-build-and-tests.sh            |  14 +--
 ci/run-test-slice.sh                 |   5 +-
 t/test-lib-functions.sh              |   4 +-
 t/test-lib-github-workflow-markup.sh |  50 ++++++++++
 t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
 t/test-lib.sh                        | 128 ++++----------------------
 8 files changed, 288 insertions(+), 139 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh
 create mode 100644 t/test-lib-junit.sh


base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1117

Range-diff vs v1:

  1:  db08b07c37a =  1:  db08b07c37a ci: fix code style
  2:  d2ff51bb5da !  2:  42ff3e170bf ci/run-build-and-tests: take a more high-level view
     @@ ci/run-build-and-tests.sh: pedantic)
       	;;
       esac
       
     - # Any new "test" targets should not go after this "make", but should
     - # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
     - # start running tests.
     +-# Any new "test" targets should not go after this "make", but should
     +-# adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
     +-# start running tests.
      -make $MAKE_TARGETS
      +make
      +if test -n "$run_tests"
  3:  98891b0d3f7 !  3:  bbbe1623257 ci: make it easier to find failed tests' logs in the GitHub workflow
     @@ Metadata
       ## Commit message ##
          ci: make it easier to find failed tests' logs in the GitHub workflow
      
     +    When investigating a test failure, the time that matters most is the
     +    time it takes from becoming aware of the failure to displaying the output
     +    of the failing test case.
     +
          You currently have to know a lot of implementation details when
          investigating test failures in the CI runs. The first step is easy: the
          failed job is marked quite clearly, but when opening it, the failed step
     @@ Commit message
          The actually interesting part is in the detailed log of said failed
          test script. But that log is shown in the CI run's step that runs
          `ci/print-test-failures.sh`. And that step is _not_ expanded in the web
     -    UI by default.
     +    UI by default. It is even marked as "successful", which makes it very
     +    easy to miss that there is useful information hidden in there.
      
          Let's help the reader by showing the failed tests' detailed logs in the
          step that is expanded automatically, i.e. directly after the test suite
  4:  9333ba781b8 !  4:  f72254a9ac6 ci/run-build-and-tests: add some structure to the GitHub workflow output
     @@ ci/lib.sh
      +
      +# Set 'exit on error' for all CI scripts to let the caller know that
      +# something went wrong.
     -+# Set tracing executed commands, primarily setting environment variables
     -+# and installing dependencies.
     ++#
     ++# We already enabled tracing executed commands earlier. This helps by showing
     ++# how environment variables are set and dependencies are installed.
      +set -e
      +
       skip_branch_tip_with_tag () {
     @@ ci/lib.sh: linux-leaks)
      +set -x
      
       ## ci/run-build-and-tests.sh ##
     -@@ ci/run-build-and-tests.sh: esac
     - # Any new "test" targets should not go after this "make", but should
     - # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
     - # start running tests.
     +@@ ci/run-build-and-tests.sh: pedantic)
     + 	;;
     + esac
     + 
      -make
      +group Build make
       if test -n "$run_tests"
  5:  94dcbe1bc43 =  5:  9eda6574313 tests: refactor --write-junit-xml code
  6:  41230100091 =  6:  c8b240af749 test(junit): avoid line feeds in XML attributes
  7:  98b32630fcd =  7:  15f199e810e ci: optionally mark up output in the GitHub workflow
  8:  1a6bd1846bc =  8:  91ea54f36c5 ci: use `--github-workflow-markup` in the GitHub workflow
  9:  992b1575889 =  9:  be2a83f5da3 ci: call `finalize_test_case_output` a little later

-- 
gitgitgadget


* [PATCH v2 1/9] ci: fix code style
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In b92cb86ea14 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31), a function was introduced with a code style
that is different from the surrounding code: it added the opening curly
brace on its own line, when all the existing functions in the same file
cuddle that brace on the same line as the function name.

Let's make the code style consistent again.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 9d28ab50fb4..ebb502640fa 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -69,8 +69,7 @@ skip_good_tree () {
 	exit 0
 }
 
-check_unignored_build_artifacts ()
-{
+check_unignored_build_artifacts () {
 	! git ls-files --other --exclude-standard --error-unmatch \
 		-- ':/*' 2>/dev/null ||
 	{
-- 
gitgitgadget



* [PATCH v2 2/9] ci/run-build-and-tests: take a more high-level view
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the web UI of GitHub workflows, failed runs are presented with the
job step that failed auto-expanded. In the current setup, this is not
helpful at all because that shows only the output of `prove`, which says
which test failed, but not in what way.

What would help the reader understand what went wrong is the verbose
test output of the failed test.

The logs of the failed runs do contain that verbose test output, but it
is shown in the _next_ step (which is marked as succeeding, and is
therefore _not_ auto-expanded). Anyone not intimately familiar with this
would completely miss the verbose test output, being left mostly
puzzled with the test failures.

We are about to show the failed test cases' output in the _same_ step,
so that the user has a much easier time figuring out what went wrong.

But first, we must partially revert the change that tried to improve the
CI runs by combining the `Makefile` targets to build into a single
`make` invocation. That might have sounded like a good idea at the time,
but it does make it rather impossible for the CI script to determine
whether the _build_ failed, or the _tests_. If the tests were run at
all, that is.

So let's go back to calling `make` for the build, and call `make test`
separately so that we can easily detect that _that_ invocation failed,
and react appropriately.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/run-build-and-tests.sh | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
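
To illustrate why two separate invocations help, here is a minimal sketch
(not part of the patch). `run_build_then_tests` and its return codes are
hypothetical, and the two arguments stand in for `make` and `make test`.

```shell
# Sketch only: build_cmd and test_cmd are hypothetical stand-ins for
# `make` and `make test`. Running them separately lets the caller tell
# a build failure apart from a test failure.
run_build_then_tests () {
	build_cmd=$1
	test_cmd=$2
	if ! $build_cmd
	then
		echo "build failed"
		return 2
	fi
	if ! $test_cmd
	then
		echo "tests failed"
		return 1
	fi
	echo "ok"
}

# `true` and `false` simulate a successful build and a failing test suite.
run_build_then_tests true false || :
```

With a single `make all test` call, both failure modes would surface as
the same non-zero exit status of one command.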

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 280dda7d285..2818b3046ae 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-export MAKE_TARGETS="all test"
+run_tests=t
 
 case "$jobname" in
 linux-gcc)
@@ -41,14 +41,15 @@ pedantic)
 	# Don't run the tests; we only care about whether Git can be
 	# built.
 	export DEVOPTS=pedantic
-	export MAKE_TARGETS=all
+	run_tests=
 	;;
 esac
 
-# Any new "test" targets should not go after this "make", but should
-# adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
-# start running tests.
-make $MAKE_TARGETS
+make
+if test -n "$run_tests"
+then
+	make test
+fi
 check_unignored_build_artifacts
 
 save_good_tree
-- 
gitgitgadget



* [PATCH v2 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When investigating a test failure, the time that matters most is the
time it takes from becoming aware of the failure to displaying the output
of the failing test case.

You currently have to know a lot of implementation details when
investigating test failures in the CI runs. The first step is easy: the
failed job is marked quite clearly, but when opening it, the failed step
is expanded, which in our case is the one running
`ci/run-build-and-tests.sh`. This step, most notably, only offers a
high-level view of what went wrong: it prints the output of `prove`
which merely tells the reader which test script failed.

The actually interesting part is in the detailed log of said failed
test script. But that log is shown in the CI run's step that runs
`ci/print-test-failures.sh`. And that step is _not_ expanded in the web
UI by default. It is even marked as "successful", which makes it very
easy to miss that there is useful information hidden in there.

Let's help the reader by showing the failed tests' detailed logs in the
step that is expanded automatically, i.e. directly after the test suite
failed.

This also helps the situation where the _build_ failed and the
`print-test-failures` step was executed under the assumption that the
_test suite_ failed, and consequently failed to find any failed tests.

An alternative way to implement this patch would be to source
`ci/print-test-failures.sh` in the `handle_test_failures` function to
show these logs. However, over the course of the next few commits, we
want to introduce some grouping which would be harder to achieve that
way (for example, we do want a leaner, and colored, preamble for each
failed test script, and it would be trickier to accommodate the lack of
nested groupings in GitHub workflows' output).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/main.yml | 12 ------------
 ci/lib.sh                  | 23 +++++++++++++++++++++++
 ci/run-build-and-tests.sh  |  3 ++-
 ci/run-test-slice.sh       |  3 ++-
 4 files changed, 27 insertions(+), 14 deletions(-)
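
The loop added to `ci/lib.sh` below derives each test name from its
`.exit` file via two parameter expansions. As a standalone sketch of that
step (not part of the patch): `list_failed_tests` is a hypothetical
helper, and the extra `test -f` guard is an addition for the case where
no `.exit` files exist.

```shell
# Sketch only: list_failed_tests is a hypothetical helper. The layout
# (<script>.exit holding the script's exit code) follows the loop in
# this patch.
list_failed_tests () {
	results_dir=$1
	for test_exit in "$results_dir"/*.exit
	do
		# Skip the unexpanded pattern when there are no .exit files.
		test -f "$test_exit" || continue
		# Skip test scripts that exited with 0.
		test 0 != "$(cat "$test_exit")" || continue

		test_name="${test_exit%.exit}"	# strip the ".exit" suffix
		test_name="${test_name##*/}"	# strip leading directories
		echo "$test_name"
	done
}

demo_dir=$(mktemp -d)
echo 0 >"$demo_dir/t0001-init.exit"
echo 1 >"$demo_dir/t9800-git-p4-basic.exit"
list_failed_tests "$demo_dir"
rm -rf "$demo_dir"
```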

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index c35200defb9..3fa88b78b6d 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -119,10 +119,6 @@ jobs:
     - name: test
       shell: bash
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -204,10 +200,6 @@ jobs:
       env:
         NO_SVN_TESTS: 1
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -261,8 +253,6 @@ jobs:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -292,8 +282,6 @@ jobs:
     - uses: actions/checkout@v1
     - run: ci/install-docker-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v1
diff --git a/ci/lib.sh b/ci/lib.sh
index ebb502640fa..2b2c0932320 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -78,6 +78,10 @@ check_unignored_build_artifacts () {
 	}
 }
 
+handle_failed_tests () {
+	return 1
+}
+
 # GitHub Action doesn't set TERM, which is required by tput
 export TERM=${TERM:-dumb}
 
@@ -123,6 +127,25 @@ then
 	CI_JOB_ID="$GITHUB_RUN_ID"
 	CC="${CC:-gcc}"
 	DONT_SKIP_TAGS=t
+	handle_failed_tests () {
+		mkdir -p t/failed-test-artifacts
+		echo "FAILED_TEST_ARTIFACTS=t/failed-test-artifacts" >>$GITHUB_ENV
+
+		for test_exit in t/test-results/*.exit
+		do
+			test 0 != "$(cat "$test_exit")" || continue
+
+			test_name="${test_exit%.exit}"
+			test_name="${test_name##*/}"
+			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
+			cat "t/test-results/$test_name.out"
+
+			trash_dir="t/trash directory.$test_name"
+			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
+			tar czf t/failed-test-artifacts/"$test_name".trash.tar.gz "$trash_dir"
+		done
+		return 1
+	}
 
 	cache_dir="$HOME/none"
 
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 2818b3046ae..1ede75e5556 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -48,7 +48,8 @@ esac
 make
 if test -n "$run_tests"
 then
-	make test
+	make test ||
+	handle_failed_tests
 fi
 check_unignored_build_artifacts
 
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index f8c2c3106a2..63358c23e11 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -12,6 +12,7 @@ esac
 
 make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
-	tr '\n' ' ')"
+	tr '\n' ' ')" ||
+handle_failed_tests
 
 check_unignored_build_artifacts
-- 
gitgitgadget



* [PATCH v2 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The current output of Git's GitHub workflow can be quite confusing,
especially for contributors new to the project.

To make it more helpful, let's introduce some collapsible grouping.
Initially, readers will see the high-level view of what actually
happened (did the build fail, or the test suite?). To drill down, the
respective group can be expanded.

Note: sadly, workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details),
therefore we take pains to end any previous group before
starting a new one.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh                 | 56 ++++++++++++++++++++++++++++++++++-----
 ci/run-build-and-tests.sh |  4 +--
 ci/run-test-slice.sh      |  2 +-
 3 files changed, 52 insertions(+), 10 deletions(-)
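
The `group` helper has to hand the wrapped command's exit status back to
the caller, otherwise `group "Run tests" make test || handle_failed_tests`
would never reach the handler. The following sketch is a condensed copy of
the GitHub-specific helpers from this patch, with the `set -x` toggling
and the `EXIT` trap left out for brevity, to show that behavior in
isolation.

```shell
# Sketch only: condensed from the helpers in this patch, without the
# `set -x` toggling and without the EXIT trap.
begin_group () {
	need_to_end_group=t
	echo "::group::$1" >&2
}

end_group () {
	# Do nothing unless a group is currently open; this avoids
	# emitting a stray "::endgroup::".
	test -n "$need_to_end_group" || return 0
	need_to_end_group=
	echo '::endgroup::' >&2
}

group () {
	begin_group "$1"
	shift
	"$@"
	res=$?		# remember the status before end_group overwrites $?
	end_group
	return $res
}

# `false` simulates a failing `make test`.
group "Run tests" false || echo "handle_failed_tests would run here"
```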

diff --git a/ci/lib.sh b/ci/lib.sh
index 2b2c0932320..2a1b22db12a 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,50 @@
 # Library of functions shared by all CI scripts
 
+if test true != "$GITHUB_ACTIONS"
+then
+	begin_group () { :; }
+	end_group () { :; }
+
+	group () {
+		shift
+		"$@"
+	}
+	set -x
+else
+	begin_group () {
+		need_to_end_group=t
+		echo "::group::$1" >&2
+		set -x
+	}
+
+	end_group () {
+		test -n "$need_to_end_group" || return 0
+		set +x
+		need_to_end_group=
+		echo '::endgroup::' >&2
+	}
+	trap end_group EXIT
+
+	group () {
+		set +x
+		begin_group "$1"
+		shift
+		"$@"
+		res=$?
+		end_group
+		return $res
+	}
+
+	begin_group "CI setup"
+fi
+
+# Set 'exit on error' for all CI scripts to let the caller know that
+# something went wrong.
+#
+# We already enabled tracing executed commands earlier. This helps by showing
+# how environment variables are set and dependencies are installed.
+set -e
+
 skip_branch_tip_with_tag () {
 	# Sometimes, a branch is pushed at the same time the tag that points
 	# at the same commit as the tip of the branch is pushed, and building
@@ -88,12 +133,6 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-# Set 'exit on error' for all CI scripts to let the caller know that
-# something went wrong.
-# Set tracing executed commands, primarily setting environment variables
-# and installing dependencies.
-set -ex
-
 if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
 then
 	CI_TYPE=azure-pipelines
@@ -138,7 +177,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			cat "t/test-results/$test_name.out"
+			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -234,3 +273,6 @@ linux-leaks)
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
+
+end_group
+set -x
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 1ede75e5556..7abfa00adc0 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -45,10 +45,10 @@ pedantic)
 	;;
 esac
 
-make
+group Build make
 if test -n "$run_tests"
 then
-	make test ||
+	group "Run tests" make test ||
 	handle_failed_tests
 fi
 check_unignored_build_artifacts
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index 63358c23e11..a3c67956a8d 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-make --quiet -C t T="$(cd t &&
+group "Run tests" make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
 	tr '\n' ' ')" ||
 handle_failed_tests
-- 
gitgitgadget



* [PATCH v2 5/9] tests: refactor --write-junit-xml code
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.

Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.

This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.

This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-junit.sh | 126 ++++++++++++++++++++++++++++++++++++++++++++
 t/test-lib.sh       | 124 ++++++-------------------------------------
 2 files changed, 142 insertions(+), 108 deletions(-)
 create mode 100644 t/test-lib-junit.sh

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
new file mode 100644
index 00000000000..9d55d74d764
--- /dev/null
+++ b/t/test-lib-junit.sh
@@ -0,0 +1,126 @@
+# Library of functions to format test scripts' output in JUnit XML
+# format, to support Git's test suite result to be presented in an
+# easily digestible way on Azure Pipelines.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when the user asks
+# for JUnit XML; these functions will then override (empty) functions
+# that are called at the appropriate times during the test runs.
+
+start_test_output () {
+	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
+	mkdir -p "$junit_xml_dir"
+	junit_xml_base=${1##*/}
+	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
+	junit_attrs="name=\"${junit_xml_base%.sh}\""
+	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
+		date +%Y-%m-%dT%H:%M:%S)\""
+	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
+	junit_suite_start=$(test-tool date getnanos)
+	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+	then
+		GIT_TEST_TEE_OFFSET=0
+	fi
+}
+
+start_test_case_output () {
+	junit_start=$(test-tool date getnanos)
+}
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	ok)
+		set "$*"
+		;;
+	failure)
+		junit_insert="<failure message=\"not ok $test_count -"
+		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode \
+			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+			   then
+				test-tool path-utils skip-n-bytes \
+					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+			   else
+				printf '%s\n' "$@" | sed 1d
+			   fi)")"
+		junit_insert="$junit_insert</failure>"
+		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+		then
+			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
+				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
+		fi
+		set "$1" "      $junit_insert"
+		;;
+	fixed)
+		set "$* (breakage fixed)"
+		;;
+	broken)
+		set "$* (known breakage)"
+		;;
+	skip)
+		message="$(xml_attr_encode "$skipped_reason")"
+		set "$1" "      <skipped message=\"$message\" />"
+		;;
+	esac
+
+	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	shift
+	junit_attrs="$junit_attrs classname=\"$this_test\""
+	junit_attrs="$junit_attrs time=\"$(test-tool \
+		date getnanos $junit_start)\""
+	write_junit_xml "$(printf '%s\n' \
+		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
+	junit_have_testcase=t
+}
+
+finalize_test_output () {
+	if test -n "$junit_xml_path"
+	then
+		test -n "$junit_have_testcase" || {
+			junit_start=$(test-tool date getnanos)
+			write_junit_xml_testcase "all tests skipped"
+		}
+
+		# adjust the overall time
+		junit_time=$(test-tool date getnanos $junit_suite_start)
+		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
+			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
+			-e '/^ *<\/testsuite/d' \
+			<"$junit_xml_path" >"$junit_xml_path.new"
+		mv "$junit_xml_path.new" "$junit_xml_path"
+
+		write_junit_xml "  </testsuite>" "</testsuites>"
+		write_junit_xml=
+	fi
+}
+
+write_junit_xml () {
+	case "$1" in
+	--truncate)
+		>"$junit_xml_path"
+		junit_have_testcase=
+		shift
+		;;
+	esac
+	printf '%s\n' "$@" >>"$junit_xml_path"
+}
+
+xml_attr_encode () {
+	printf '%s\n' "$@" | test-tool xml-encode
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0f7a137c7d8..e13e1cb9124 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -107,6 +107,12 @@ mark_option_requires_arg () {
 	store_arg_to=$2
 }
 
+# These functions can be overridden e.g. to output JUnit XML
+start_test_output () { :; }
+start_test_case_output () { :; }
+finalize_test_case_output () { :; }
+finalize_test_output () { :; }
+
 parse_option () {
 	local opt="$1"
 
@@ -166,7 +172,7 @@ parse_option () {
 		tee=t
 		;;
 	--write-junit-xml)
-		write_junit_xml=t
+		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
 	--stress)
 		stress=t ;;
@@ -613,7 +619,7 @@ exec 6<&0
 exec 7>&2
 
 _error_exit () {
-	finalize_junit_xml
+	finalize_test_output
 	GIT_EXIT_OK=t
 	exit 1
 }
@@ -723,35 +729,13 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$*"
-	fi
+	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
 }
 
 test_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
-		junit_insert="$junit_insert $(xml_attr_encode \
-			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-			   then
-				test-tool path-utils skip-n-bytes \
-					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
-			   else
-				printf '%s\n' "$@" | sed 1d
-			   fi)")"
-		junit_insert="$junit_insert</failure>"
-		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-		then
-			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
-				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
-		fi
-		write_junit_xml_testcase "$1" "      $junit_insert"
-	fi
+	finalize_test_case_output failure "$@"
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
@@ -760,19 +744,13 @@ test_failure_ () {
 }
 
 test_known_broken_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (breakage fixed)"
-	fi
+	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
 }
 
 test_known_broken_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (known breakage)"
-	fi
+	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
 }
@@ -1049,10 +1027,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	if test -n "$write_junit_xml"
-	then
-		junit_start=$(test-tool date getnanos)
-	fi
+	start_test_case_output
 }
 
 test_finish_ () {
@@ -1103,12 +1078,7 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		if test -n "$write_junit_xml"
-		then
-			message="$(xml_attr_encode "$skipped_reason")"
-			write_junit_xml_testcase "$1" \
-				"      <skipped message=\"$message\" />"
-		fi
+		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
@@ -1124,53 +1094,6 @@ test_at_end_hook_ () {
 	:
 }
 
-write_junit_xml () {
-	case "$1" in
-	--truncate)
-		>"$junit_xml_path"
-		junit_have_testcase=
-		shift
-		;;
-	esac
-	printf '%s\n' "$@" >>"$junit_xml_path"
-}
-
-xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
-}
-
-write_junit_xml_testcase () {
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
-	shift
-	junit_attrs="$junit_attrs classname=\"$this_test\""
-	junit_attrs="$junit_attrs time=\"$(test-tool \
-		date getnanos $junit_start)\""
-	write_junit_xml "$(printf '%s\n' \
-		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
-	junit_have_testcase=t
-}
-
-finalize_junit_xml () {
-	if test -n "$write_junit_xml" && test -n "$junit_xml_path"
-	then
-		test -n "$junit_have_testcase" || {
-			junit_start=$(test-tool date getnanos)
-			write_junit_xml_testcase "all tests skipped"
-		}
-
-		# adjust the overall time
-		junit_time=$(test-tool date getnanos $junit_suite_start)
-		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
-			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
-			-e '/^ *<\/testsuite/d' \
-			<"$junit_xml_path" >"$junit_xml_path.new"
-		mv "$junit_xml_path.new" "$junit_xml_path"
-
-		write_junit_xml "  </testsuite>" "</testsuites>"
-		write_junit_xml=
-	fi
-}
-
 test_atexit_cleanup=:
 test_atexit_handler () {
 	# In a succeeding test script 'test_atexit_handler' is invoked
@@ -1193,7 +1116,7 @@ test_done () {
 	# removed, so the commands can access pidfiles and socket files.
 	test_atexit_handler
 
-	finalize_junit_xml
+	finalize_test_output
 
 	if test -z "$HARNESS_ACTIVE"
 	then
@@ -1484,22 +1407,7 @@ fi
 # in subprocesses like git equals our $PWD (for pathname comparisons).
 cd -P "$TRASH_DIRECTORY" || exit 1
 
-if test -n "$write_junit_xml"
-then
-	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
-	mkdir -p "$junit_xml_dir"
-	junit_xml_base=${0##*/}
-	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
-	junit_attrs="name=\"${junit_xml_base%.sh}\""
-	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
-		date +%Y-%m-%dT%H:%M:%S)\""
-	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
-	junit_suite_start=$(test-tool date getnanos)
-	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-	then
-		GIT_TEST_TEE_OFFSET=0
-	fi
-fi
+start_test_output "$0"
 
 # Convenience
 # A regexp to match 5 and 35 hexdigits
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v2 6/9] test(junit): avoid line feeds in XML attributes
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the test case's output, we do want newline characters, but in the XML
attributes we do not want them.

However, the `xml_attr_encode` function always adds a Line Feed at the
end (which is then encoded as `&#x0a;`), even for XML attributes.

This seems not to faze Azure Pipelines' XML parser, but it still is
incorrect, so let's fix it.
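
The difference between the two `printf` forms can be seen without
`test-tool`. The following is only a sketch: `count_bytes`, `with_lf`
and `without_lf` are hypothetical helpers that mirror the two branches
of `xml_attr_encode` minus the encoding step. Command substitution
would normally strip a trailing newline, but once the Line Feed has
been encoded as `&#x0a;` it is no longer a newline and ends up in the
attribute value.

```shell
# Hypothetical helper: count the bytes a command writes to stdout.
# The arithmetic expansion strips the padding some `wc` versions add.
count_bytes () {
	echo $(( $("$@" | wc -c) ))
}

# Mirrors the default branch: one line per argument, each ending in LF.
with_lf () { printf '%s\n' "$@"; }

# Mirrors the --no-lf branch: arguments joined by spaces, no final LF.
without_lf () { printf '%s' "$*"; }

count_bytes with_lf "skipped"		# 8: seven letters plus the Line Feed
count_bytes without_lf "skipped"	# 7: the letters only
```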

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-junit.sh | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
index 9d55d74d764..c959183c7e2 100644
--- a/t/test-lib-junit.sh
+++ b/t/test-lib-junit.sh
@@ -50,7 +50,7 @@ finalize_test_case_output () {
 		;;
 	failure)
 		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode --no-lf "$1")\">"
 		junit_insert="$junit_insert $(xml_attr_encode \
 			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
 			   then
@@ -74,12 +74,12 @@ finalize_test_case_output () {
 		set "$* (known breakage)"
 		;;
 	skip)
-		message="$(xml_attr_encode "$skipped_reason")"
+		message="$(xml_attr_encode --no-lf "$skipped_reason")"
 		set "$1" "      <skipped message=\"$message\" />"
 		;;
 	esac
 
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	junit_attrs="name=\"$(xml_attr_encode --no-lf "$this_test.$test_count $1")\""
 	shift
 	junit_attrs="$junit_attrs classname=\"$this_test\""
 	junit_attrs="$junit_attrs time=\"$(test-tool \
@@ -122,5 +122,11 @@ write_junit_xml () {
 }
 
 xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
+	if test "x$1" = "x--no-lf"
+	then
+		shift
+		printf '%s' "$*" | test-tool xml-encode
+	else
+		printf '%s\n' "$@" | test-tool xml-encode
+	fi
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v2 7/9] ci: optionally mark up output in the GitHub workflow
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

A couple of commands exist to spruce up the output in GitHub workflows:
https://docs.github.com/en/actions/learn-github-actions/workflow-commands-for-github-actions

In addition to the `::group::<label>`/`::endgroup::` commands (which we
already use to structure the output of the build step better), we also
use `::error::`/`::notice::` to draw attention to test failures and
to test cases that were expected to fail but didn't.
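
As a rough sketch of what such marked-up output looks like, the
following prints the workflow commands for two test cases. The helper
`markup_test_case` and its three-argument interface are hypothetical;
only the `::error::`, `::notice::`, `::group::` and `::endgroup::`
commands themselves are GitHub's.

```shell
# Hypothetical helper: print the markup for one finished test case.
#   $1 = result (ok, failure, fixed, ...)
#   $2 = test title
#   $3 = log text to show inside the collapsible group
markup_test_case () {
	case "$1" in
	failure)
		printf '%s\n' "::error::failed: $2"
		;;
	fixed)
		printf '%s\n' "::notice::fixed: $2"
		;;
	esac
	printf '%s\n' "::group::$1: $2"
	printf '%s\n' "$3"
	printf '%s\n' "::endgroup::"
}

markup_test_case ok "t0000.1 setup" "ok 1 - setup"
markup_test_case failure "t0000.2 demo" "not ok 2 - demo"
```

Every test case gets its own collapsible group; only failures and
fixed known breakages additionally produce an annotation line.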

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-functions.sh              |  4 +--
 t/test-lib-github-workflow-markup.sh | 50 ++++++++++++++++++++++++++++
 t/test-lib.sh                        |  5 ++-
 3 files changed, 56 insertions(+), 3 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index c3d38aaccbd..b5fe5f66085 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -719,7 +719,7 @@ test_verify_prereq () {
 }
 
 test_expect_failure () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-failure"
@@ -739,7 +739,7 @@ test_expect_failure () {
 }
 
 test_expect_success () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-success"
diff --git a/t/test-lib-github-workflow-markup.sh b/t/test-lib-github-workflow-markup.sh
new file mode 100644
index 00000000000..d8dc969df4a
--- /dev/null
+++ b/t/test-lib-github-workflow-markup.sh
@@ -0,0 +1,50 @@
+# Library of functions to mark up test scripts' output suitable for
+# pretty-printing it in GitHub workflows.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when run in GitHub
+# workflows; these functions will then override (empty) functions
+# that are are called at the appropriate times during the test runs.
+
+start_test_output () {
+	test -n "$GIT_TEST_TEE_OUTPUT_FILE" ||
+	die "--github-workflow-markup requires --verbose-log"
+	github_markup_output="${GIT_TEST_TEE_OUTPUT_FILE%.out}.markup"
+	>$github_markup_output
+	GIT_TEST_TEE_OFFSET=0
+}
+
+# No need to override start_test_case_output
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	failure)
+		echo >>$github_markup_output "::error::failed: $this_test.$test_count $1"
+		;;
+	fixed)
+		echo >>$github_markup_output "::notice::fixed: $this_test.$test_count $1"
+		;;
+	esac
+	echo >>$github_markup_output "::group::$test_case_result: $this_test.$test_count $*"
+	test-tool >>$github_markup_output path-utils skip-n-bytes \
+		"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+	echo >>$github_markup_output "::endgroup::"
+}
+
+# No need to override finalize_test_output
diff --git a/t/test-lib.sh b/t/test-lib.sh
index e13e1cb9124..076bee58c19 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -174,6 +174,9 @@ parse_option () {
 	--write-junit-xml)
 		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
+	--github-workflow-markup)
+		. "$TEST_DIRECTORY/test-lib-github-workflow-markup.sh"
+		;;
 	--stress)
 		stress=t ;;
 	--stress=*)
@@ -1027,7 +1030,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	start_test_case_output
+	start_test_case_output "$@"
 }
 
 test_finish_ () {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v2 8/9] ci: use `--github-workflow-markup` in the GitHub workflow
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 10:24   ` [PATCH v2 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This makes the output easier to digest.

Note: since workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details), we need
to remove the explicit grouping that would span the entirety of each
failed test script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2a1b22db12a..7cc000e075c 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -177,7 +177,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
+			cat "t/test-results/$test_name.markup"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -189,7 +189,7 @@ then
 	cache_dir="$HOME/none"
 
 	export GIT_PROVE_OPTS="--timer --jobs 10"
-	export GIT_TEST_OPTS="--verbose-log -x"
+	export GIT_TEST_OPTS="--verbose-log -x --github-workflow-markup"
 	MAKEFLAGS="$MAKEFLAGS --jobs=10"
 	test windows != "$CI_OS_NAME" ||
 	GIT_TEST_OPTS="--no-chain-lint --no-bin-wrappers $GIT_TEST_OPTS"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v2 9/9] ci: call `finalize_test_case_output` a little later
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (7 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
@ 2022-03-01 10:24   ` Johannes Schindelin via GitGitGadget
  2022-03-01 19:07   ` [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful Junio C Hamano
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-03-01 10:24 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

We used to call that function before printing the final verdict.
However, now that we added grouping to the GitHub workflow output, we
will want to include even that part in the collapsible group for that
test case.
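
Why the ordering matters can be sketched as follows. This is an
illustration under assumptions: `log`, `offset` and `log_since_offset`
are hypothetical stand-ins for the tee'd output file,
`GIT_TEST_TEE_OFFSET` and `test-tool path-utils skip-n-bytes`, which is
assumed here to print everything after the first N bytes of a file.

```shell
log=$(mktemp)
offset=0	# hypothetical stand-in for GIT_TEST_TEE_OFFSET

# Print everything in the log after the first $offset bytes.
log_since_offset () {
	tail -c +$((offset + 1)) "$log"
}

echo "expecting success of 0000.1 'demo'" >>"$log"

# Hook called before the verdict: the group misses the "ok" line.
early=$(log_since_offset)

echo "ok 1 - demo" >>"$log"

# Hook called after the verdict: the group includes it.
late=$(log_since_offset)

printf 'early:\n%s\n\nlate:\n%s\n' "$early" "$late"
rm -f "$log"
```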

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 076bee58c19..1e683ad879b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -732,30 +732,31 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
+	finalize_test_case_output ok "$@"
 }
 
 test_failure_ () {
-	finalize_test_case_output failure "$@"
+	failure_label=$1
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
 	printf '%s\n' "$*" | sed -e 's/^/#	/'
 	test "$immediate" = "" || _error_exit
+	finalize_test_case_output failure "$failure_label" "$@"
 }
 
 test_known_broken_ok_ () {
-	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
+	finalize_test_case_output fixed "$@"
 }
 
 test_known_broken_failure_ () {
-	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
+	finalize_test_case_output broken "$@"
 }
 
 test_debug () {
@@ -1081,10 +1082,10 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
+		finalize_test_case_output skip "$@"
 		;;
 	*)
 		false
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:18                   ` Johannes Schindelin
@ 2022-03-01 16:52                     ` Junio C Hamano
  0 siblings, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-01 16:52 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Let me again state my goal clearly, because some readers seem to be
> confused and believe that I want to improve the developer experience of
> veterans of the Git mailing list who are more than capable of finding
> their way through the build failures.
>
> My goal is instead to make new contributors' lives easier.

I understand who your primary target audiences are, and making them
happy without robbing too much from others is a good thing to do.
Instead of saying "Nah, this is not targeted at me" and ignoring the
topic, I have been reporting my dogfood experience exactly to help
that process---to notice if this variant worsens end-user experience
and see if that is within the reasonable pros-and-cons tolerance,
especially because they will start noticing the same hiccup as new
contributors stay longer and gain experience.

> In any case, thank you for integrating the patches into `seen` so that the
> impact of the patches can be "seen".

No, do not thank me.  Thank yourself for writing it, and thank
others for trying it and telling you their experiences to help you
make it better.

Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:10                 ` Johannes Schindelin
@ 2022-03-01 16:57                   ` Junio C Hamano
  0 siblings, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-01 16:57 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Yes, that's because line 5520 is the header of that group. If you direct
> the reader to
> https://github.com/git/git/runs/5343133021?check_suite_focus=true#step:4:5674
> instead, it will get expanded.

Ah, that's a good trick to know.  I presume 5674 can be any random
line in the block that is hidden by default that I want the viewer
to expand, right?

> If it would not get expanded, that would be a bug, but obviously not in my
> patch series.

Well, the choice to use a mechanism that has such a bug is made by
this patch series.  If "allowing users to easily exchange a URL that
points to the exact place where the tests broke" were one of the goals,
it does not matter whose fault it is---the aggregate result is that
the new UI would have failed that goal.

Thankfully there does not seem to be such a bug in the UI mechanism
used by this series, so we are in good shape.

Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (8 preceding siblings ...)
  2022-03-01 10:24   ` [PATCH v2 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
@ 2022-03-01 19:07   ` Junio C Hamano
  2022-03-02 12:22   ` Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-01 19:07 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Changes since v1:
>
>  * In the patch that removed MAKE_TARGETS, a stale comment about that
>    variable is also removed.
>  * The comment about set -x has been adjusted because it no longer applies
>    as-is.
>  * The commit message of "ci: make it easier to find failed tests' logs in
>    the GitHub workflow" has been adjusted to motivate the improvement
>    better.

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-02-25 14:10           ` Johannes Schindelin
  2022-02-25 18:16             ` Junio C Hamano
@ 2022-03-02 10:58             ` Phillip Wood
  2022-03-07 16:07               ` Johannes Schindelin
  1 sibling, 1 reply; 98+ messages in thread
From: Phillip Wood @ 2022-03-02 10:58 UTC (permalink / raw)
  To: Johannes Schindelin, phillip.wood
  Cc: Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Junio C Hamano

Hi Dscho

On 25/02/2022 14:10, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Wed, 23 Feb 2022, Phillip Wood wrote:
> 
>> On 22/02/2022 13:31, Ævar Arnfjörð Bjarmason wrote:
>>> [...]
>>> So just to make the point about one of those mentioned in my [1] with
>>> some further details (I won't go into the whole thing to avoid repeating
>>> myself):
>>>
>>> I opened both of:
>>>
>>> https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true
>>> https://github.com/dscho/git/runs/4840190622?check_suite_focus=true
>>>
>>> Just now in Firefox 91.5.0esr-1. Both having been opened before, so
>>> they're in cache, and I've got a current 40MB/s real downlink speed etc.
>>>
>>> The former fully loads in around 5100ms, with your series here that's
>>> just short of 18000ms.
>>>
>>> So your CI changes are making the common case of just looking at a CI
>>> failure more than **3x as slow as before**.
>>
>> I don't think that is the most useful comparison between the two. When I
>> am investigating a test failure the time that matters to me is the time
>> it takes to display the output of the failing test case.
> 
> Thank you for expressing this so clearly. I will adopt a variation of this
> phrasing in my commit message, if you don't mind?

That's fine

>> With the first link above the initial page load is faster but to get to
>> the output of the failing test case I have to click on "Run
>> ci/print_test_failures.sh" then wait for that to load and then search
>> for "not ok" to actually get to the information I'm after.
> 
> And that's only because you are familiar with what you have to do.
> 
> Any new contributor would be stuck with the information presented on the
> initial load, without any indication that more information _is_ available,
> just hidden away in the next step's log (which is marked as "succeeding",
> therefore misleading the inclined reader into thinking that this cannot
> potentially contain any information pertinent to the _failure_ that needs
> to be investigated).

Yes, it took me a while to realize how to get to the test output when I
first started looking at the CI results.

One thing I forgot to mention was that when you expand a failing test it 
shows the test script twice before the output e.g.

Error: failed: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
failure: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
   				git config core.fsmonitor true &&
   				git fsmonitor--daemon start &&
   				git update-index --fsmonitor
   			
   expecting success of 7527.35 'Matrix[uc:false][fsm:true] enable fsmonitor':
   				git config core.fsmonitor true &&
   				git fsmonitor--daemon start &&
   				git update-index --fsmonitor

  ++ git config core.fsmonitor true
  ++ git fsmonitor--daemon start 			
  ...

I don't know how easy it would be to fix that so that we only show
"expecting success of ..." without the test being printed first.

Best Wishes

Phillip


>> With the second link the initial page load does feel slower but then I'm
>> presented with the test failures nicely highlighted in red, all I have
>> to do is click on one and I've got the information I'm after.
>>
>> Overall that is much faster and easier to use.
> 
> Thank you for your comment. I really started to doubt myself, getting the
> idea that it's just a case of me holding this thing wrong.
> 
> For what it's worth, I did make a grave mistake by using that particular
> `seen` CI failure with all of those failing p4 tests, which obviously
> resulted in an incredibly large amount of logs. Obviously that _must_ be
> slow to load. I just did not have the time to fabricate a CI failure.
> 
> However, I do agree with you that the large amount of logs would have to
> be looked at _anyway_, whether it is shown upon loading the job's logs or
> only when expanding the `print-test-failures` step's logs. The amount of
> the logs is a constant, after all, I did not change anything there (nor
> would I).
> 
> So a better example might be my concrete use case yesterday: the CI build
> of `seen` failed. Here is the link to the regular output:
> 
> 	https://github.com/git/git/actions/runs/1890665968
> 
> On that page, you see the following:
> 
> 
> 	Annotations
> 	8 errors and 1 warning
> 
> 	ⓧ win test (3)
> 	  Process completed with exit code 2.
> 
> 	ⓧ win test (6)
> 	  Process completed with exit code 2.
> 
> 	ⓧ win test (2)
> 	  Process completed with exit code 2.
> 
> 	ⓧ win+VS test (3)
> 	  Process completed with exit code 2.
> 
> 	ⓧ win+VS test (6)
> 	  Process completed with exit code 2.
> 
> 	ⓧ win+VS test (2)
> 	  Process completed with exit code 2.
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  Process completed with exit code 2.
> 
> 	ⓧ osx-clang (macos-latest)
> 	  Process completed with exit code 2.
> 
> 	⚠ CI: .github#L1
> 	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856
> 
> So I merged my branch into `seen` and pushed it. The corresponding run can
> be seen here:
> 
> 	https://github.com/dscho/git/actions/runs/1892982393
> 
> On that page, you see the following:
> 
> 	Annotations
> 	50 errors and 1 warning
> 
> 	ⓧ win test (3)
> 	  failed: t7527.1 explicit daemon start and stop
> 
> 	ⓧ win test (3)
> 	  failed: t7527.2 implicit daemon start
> 
> 	ⓧ win test (3)
> 	  failed: t7527.3 implicit daemon stop (delete .git)
> 
> 	ⓧ win test (3)
> 	  failed: t7527.4 implicit daemon stop (rename .git)
> 
> 	ⓧ win test (3)
> 	  failed: t7527.5 implicit daemon stop (rename GIT~1)
> 
> 	ⓧ win test (3)
> 	  failed: t7527.6 implicit daemon stop (rename GIT~2)
> 
> 	ⓧ win test (3)
> 	  failed: t7527.8 cannot start multiple daemons
> 
> 	ⓧ win test (3)
> 	  failed: t7527.10 update-index implicitly starts daemon
> 
> 	ⓧ win test (3)
> 	  failed: t7527.11 status implicitly starts daemon
> 
> 	ⓧ win test (3)
> 	  failed: t7527.12 edit some files
> 
> 	ⓧ win test (2)
> 	  failed: t0012.81 fsmonitor--daemon can handle -h
> 
> 	ⓧ win test (2)
> 	  Process completed with exit code 1.
> 
> 	ⓧ win test (6)
> 	  failed: t7519.2 run fsmonitor-daemon in bare repo
> 
> 	ⓧ win test (6)
> 	  failed: t7519.3 run fsmonitor-daemon in virtual repo
> 
> 	ⓧ win test (6)
> 	  Process completed with exit code 1.
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.1 explicit daemon start and stop
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.2 implicit daemon start
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.3 implicit daemon stop (delete .git)
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.4 implicit daemon stop (rename .git)
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.5 implicit daemon stop (rename GIT~1)
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.6 implicit daemon stop (rename GIT~2)
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.8 cannot start multiple daemons
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.10 update-index implicitly starts daemon
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.11 status implicitly starts daemon
> 
> 	ⓧ win+VS test (3)
> 	  failed: t7527.12 edit some files
> 
> 	ⓧ win+VS test (2)
> 	  failed: t0012.81 fsmonitor--daemon can handle -h
> 
> 	ⓧ win+VS test (2)
> 	  Process completed with exit code 1.
> 
> 	ⓧ win+VS test (6)
> 	  failed: t7519.2 run fsmonitor-daemon in bare repo
> 
> 	ⓧ win+VS test (6)
> 	  failed: t7519.3 run fsmonitor-daemon in virtual repo
> 
> 	ⓧ win+VS test (6)
> 	  Process completed with exit code 1.
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t0012.81 fsmonitor--daemon can handle -h
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7519.2 run fsmonitor-daemon in bare repo
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.1 explicit daemon start and stop
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.2 implicit daemon start
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.3 implicit daemon stop (delete .git)
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.4 implicit daemon stop (rename .git)
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.7 MacOS event spelling (rename .GIT)
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.8 cannot start multiple daemons
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.10 update-index implicitly starts daemon
> 
> 	ⓧ osx-clang (macos-latest)
> 	  failed: t7527.11 status implicitly starts daemon
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t0012.81 fsmonitor--daemon can handle -h
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7519.2 run fsmonitor-daemon in bare repo
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.1 explicit daemon start and stop
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.2 implicit daemon start
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.3 implicit daemon stop (delete .git)
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.4 implicit daemon stop (rename .git)
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.7 MacOS event spelling (rename .GIT)
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.8 cannot start multiple daemons
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.10 update-index implicitly starts daemon
> 
> 	ⓧ osx-gcc (macos-latest)
> 	  failed: t7527.11 status implicitly starts daemon
> 
> 	⚠ CI: .github#L1
> 	  windows-latest workflows now use windows-2022. For more details, see https://github.com/actions/virtual-environments/issues/4856
> 
> In my mind, this is already an improvement. (Even if this is a _lot_ of
> output, and a lot of individual errors, given that all of them are fixed
> with a single, small patch to adjust an option usage string, but that's
> not the fault of my patch series, but of the suggestion to put the check
> for the option usage string linting into the `parse_options()` machinery
> instead of into the static analysis job.)
> 
> Since there are still plenty of failures, the page admittedly does load
> relatively slowly. But that's not the time I was trying to optimize for.
> My time comes at quite a premium these days, so if the computer has to
> work a little harder while I can do something else, as long as it saves
> _me_ time, I'll take that time. Every time.
> 
> Ciao,
> Dscho


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (9 preceding siblings ...)
  2022-03-01 19:07   ` [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful Junio C Hamano
@ 2022-03-02 12:22   ` Ævar Arnfjörð Bjarmason
  2022-03-07 15:57     ` Johannes Schindelin
  2022-03-25  0:48   ` Victoria Dye
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
  12 siblings, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-02 12:22 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Eric Sunshine, Johannes Schindelin


On Tue, Mar 01 2022, Johannes Schindelin via GitGitGadget wrote:

> Changes since v1:
>
>  * In the patch that removed MAKE_TARGETS, a stale comment about that
>    variable is also removed.
>  * The comment about set -x has been adjusted because it no longer applies
>    as-is.
>  * The commit message of "ci: make it easier to find failed tests' logs in
>    the GitHub workflow" has been adjusted to motivate the improvement
>    better.

Just briefly: Much of the feedback I had on v1 is still unanswered, or
in the case of the performance issues I think just saying that this
output is aimed at non-long-time-contributors doesn't really justify the
large observed slowdowns.

I've been clicking around on the various "seen" links you posted, and
while some of the output is improved it's *much* slower, to the point
where I don't find those gains worth it.

As for the early part of this series, which teaches the "ci/" scripts
this GitHub markdown: that's unnecessary in my alternate approach in
[1]. This series is obviously a relatively shorter way to get there,
but knowing that you *can* do it in a simpler way now, it's odd that the
v2 doesn't discuss why getting from "A" to that "B" is still desirable
with this method.

As for the later parts:

>   5:  94dcbe1bc43 =  5:  9eda6574313 tests: refactor --write-junit-xml code
>   6:  41230100091 =  6:  c8b240af749 test(junit): avoid line feeds in XML attributes
>   7:  98b32630fcd =  7:  15f199e810e ci: optionally mark up output in the GitHub workflow
>   8:  1a6bd1846bc =  8:  91ea54f36c5 ci: use `--github-workflow-markup` in the GitHub workflow
>   9:  992b1575889 =  9:  be2a83f5da3 ci: call `finalize_test_case_output` a little later

I really don't see (and I pointed this out to you before) why teaching
test-lib.sh N output formats makes sense when we already have a
machine-readable output format with TAP *and* are in fact running code
with "prove" that parses that output format into nicely arranged
objects/structures.

So as a demo for how that can work, here's a quick POC I hacked up a
while ago to use "prove" output plugins for this sort of thing.

First we intentionally break a test:

	$ git -P diff
	diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
	index 76052cb5620..f085188baf9 100755
	--- a/t/t0002-gitfile.sh
	+++ b/t/t0002-gitfile.sh
	@@ -44,7 +44,7 @@ test_expect_success 'check hash-object' '
	 
	 test_expect_success 'check cat-file' '
	        git cat-file blob $SHA >actual &&
	-       test_cmp bar actual
	+       test_cmp /dev/null actual
	 '
	 
	 test_expect_success 'check update-index' '
	@@ -67,7 +67,7 @@ test_expect_success 'check commit-tree' '
	 
	 test_expect_success 'check rev-list' '
	        git update-ref "HEAD" "$SHA" &&
	-       test "$SHA" = "$(git rev-list HEAD)"
	+       test "$SHA" = "$(xgit rev-list HEAD)"
	 '

And here's the resulting output, which is the same as the existing
"prove" summary output, but then followed by nicely formatted details
about just the failing tests:

	$ PERL5LIB=$PWD prove --formatter Fmt t0002-gitfile.sh :: -vx
	 
	 test_expect_success 'setup_git_dir twice in subdir' '
	t0002-gitfile.sh .. Dubious, test returned 1 (wstat 256, 0x100)
	Failed 2/14 subtests 
	
	Test Summary Report
	-------------------
	t0002-gitfile.sh (Wstat: 256 Tests: 14 Failed: 2)
	  Failed tests:  6, 10
	  Non-zero exit status: 1
	Files=1, Tests=14,  0 wallclock secs ( 0.02 usr  0.00 sys +  0.13 cusr  0.05 csys =  0.20 CPU)
	Result: FAIL
	Failed in t0002-gitfile.sh#6:
	 ==> test   => not ok 6 - check cat-file
	 ==> source => #
	 ==> source => #        git cat-file blob $SHA >actual &&
	 ==> source => #        test_cmp /dev/null actual
	 ==> source => #
	 ==> trace  => + git cat-file blob 257cc5642cb1a054f08cc83f2d943e56fd3ebe99
	 ==> trace  => + test_cmp /dev/null actual
	 ==> trace  => + test 2 -ne 2
	 ==> trace  => + eval diff -u "$@"
	 ==> trace  => + diff -u /dev/null actual
	 ==> output => --- /dev/null    2022-01-25 04:40:50.187529644 +0000
	 ==> output => +++ actual       2022-03-02 12:29:28.217960535 +0000
	 ==> output => @@ -0,0 +1 @@
	 ==> output => +foo
	 ==> output => error: last command exited with $?=1
	 ==> output => 
	Failed in t0002-gitfile.sh#10:
	 ==> test   => not ok 10 - check rev-list
	 ==> source => #
	 ==> source => #        git update-ref "HEAD" "$SHA" &&
	 ==> source => #        test "$SHA" = "$(xgit rev-list HEAD)"
	 ==> source => #
	 ==> trace  => + git update-ref HEAD f2e10ff57e8c01fe514c650d9de97e913257ba0c
	 ==> trace  => + xgit rev-list HEAD
	 ==> trace  => + test f2e10ff57e8c01fe514c650d9de97e913257ba0c = 
	 ==> output => t0002-gitfile.sh: 5: eval: xgit: not found
	 ==> output => error: last command exited with $?=1
	 ==> output => 

Now, some of that is on top of some output changes to test-lib.sh I had
locally, but the "test", "source" etc. there is not a hardcoded part of
the output: it corresponds to the individual object types the
TAP::Parser emits (some of which I re-labeled, e.g. "comment" =>
"source").

The code to do that is below, with brief (unindented) comments:

	package Git::TAP::Formatter::Session;
	use v5.18.2;
	use strict;
	use warnings;
	use base 'TAP::Formatter::Console::ParallelSession';
	
	our %STATE;
	sub result {
		my $self = shift;
		my $result = shift;

So here we just have the TAP object for the current tests (as in "ok",
"not ok" etc.), and save all of those away (both for successful and
non-successful) tests in %STATE for later:
	
		my $res = $self->SUPER::result($result);
		my $test_name = $self->name;
	
		# An AoO of test numbers and their output lines
		$STATE{$test_name} ||= [{lines => []}];
	
		push @{$STATE{$test_name}->[-1]->{lines}} => $result;
	
		# When we see a new test add a new AoA for its output. We do
		# end up with the "plan" type as part of the last test, and
		# might want to split it up
		if ($result->type eq 'test') {
			push @{$STATE{$test_name}} => {};
		}
	
		return $res;
	}

This "Fmt" package is a "prove" plugin. It's just hooking into the code
that emits the current summary. For now it still shows the "Test Summary
Report" that your CL notes issues with, but this shows how easy it is to
change or amend it (you can override another accessor here to fully
suppress the current output, I just didn't do that):

	package Fmt;
	use strict;
	use warnings;
	use List::MoreUtils qw(firstidx);
	use base 'TAP::Formatter::Console';
	
	sub open_test {
		my $self = shift;
	
		my $session = $self->SUPER::open_test(@_);
		use Data::Dumper;
		#warn "session is = " . Dumper $session;
		return bless $session => 'Git::TAP::Formatter::Session';
	}
	
	sub summary {
		my $self = shift;
		$self->SUPER::summary(@_);
	
		## This state machine needs to go past the "ok" line and grab
		## the comments emitted by e.g. "say_color_tap_comment_lines" in
		## test_ok_()
		for my $test (sort keys %STATE) {
			for (my $i = 1; $i <= $#{$STATE{$test}}; $i++) {
				my @lines = @{$STATE{$test}->[$i]->{lines}};
				my $break = firstidx { $_->raw eq '' } @lines;
				my @source = splice @lines, 0, $break;
	
				splice @lines, 0, 1; # Splice out the '' item
				push @{$STATE{$test}->[$i - 1]->{lines}} => @source;
	
				$STATE{$test}->[$i]->{lines} = \@lines;
	
				# Since we parsed out the source already,
				# let's make it easily machine-readable, and
				# parse the rest.
				$STATE{$test}->[$i]->{source} = \@source;
				my @trace = map { $_->{is_trace} = 1; $_ } grep { $_->raw =~ /^\+ / } @lines;
				$STATE{$test}->[$i]->{trace} = \@trace if @trace;
			}
		}
	
		my $aggregator = $_[0];
		for my $failed ($aggregator->failed) {
			my ($parser) = $aggregator->parsers($failed);
			for my $i ($parser->failed) {
				my $idx = $i - 1;
				my @lines = @{$STATE{$failed}->[$idx]->{lines}};
				my ($test) = grep { $_->is_test } @lines;
	
				say "Failed in $failed#$i:";
				say join "\n", map {
					s/^/ ==> /gr;
				} map {
					sprintf("%-6s => %s",
					     $_->{is_trace} ? "trace" :
					     $_->is_unknown ? "output" :
					     $_->is_comment ? "source" :
					     $_->type,
					     $_->raw,
				     );
				} sort {
					# the "is_trace" may be undef
					no warnings qw(uninitialized);
					# The "[not ]ok" line first...
					$b->is_test <=> $a->is_test ||
					# Then "comment" (i.e. test source)
					$b->is_comment <=> $a->is_comment ||
					# Then the "+ " trace of execution
					$b->{is_trace} <=> $a->{is_trace}
				} @lines;
			}
		}
	}
	
	1;


1. https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@gmail.com/
2. https://lore.kernel.org/git/220222.86y2236ndp.gmgdl@evledraar.gmail.com/


* win+VS environment has "cut" but not "paste"?
  2022-02-25 18:16             ` Junio C Hamano
  2022-02-26 18:43               ` Junio C Hamano
  2022-03-01 10:20               ` Johannes Schindelin
@ 2022-03-04  7:38               ` Junio C Hamano
  2022-03-04  9:04                 ` Ævar Arnfjörð Bjarmason
  2022-03-07 15:48                 ` Johannes Schindelin
  2 siblings, 2 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-04  7:38 UTC (permalink / raw)
  To: git, Johannes Schindelin; +Cc: Jacob Keller

GitHub CI seems to fail due to lack of "paste" for win+VS job.  This
was somewhat unexpected, as our test scripts seem to make liberal
use of "cut" that goes together with it.

    https://github.com/git/git/runs/5415486631?check_suite_focus=true#step:5:6199

The particular failure at the URL comes from the use of "paste" in
5ea4f3a5 (name-rev: use generation numbers if available,
2022-02-28), but it hardly is the first use of the command.  There
is one use of it in t/aggregate-results.sh in 'master/main' already.

We could rewrite the tests that use "paste" but looking at the use
of the tool in the test (and the aggregate thing), rewriting them
due to lack of a tool, whose source should be freely available from
where "cut" was taken from, does not sound like too attractive a
direction to go in, but I do not know how much work is involved in
adding it (and in general, any basic tool with similar complexity
that we may find missing in the future) to the win+VS environment.

Thoughts?

Thanks.




* Re: win+VS environment has "cut" but not "paste"?
  2022-03-04  7:38               ` win+VS environment has "cut" but not "paste"? Junio C Hamano
@ 2022-03-04  9:04                 ` Ævar Arnfjörð Bjarmason
  2022-03-07 15:51                   ` Johannes Schindelin
  2022-03-07 15:48                 ` Johannes Schindelin
  1 sibling, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-04  9:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin, Jacob Keller


On Thu, Mar 03 2022, Junio C Hamano wrote:

> GitHub CI seems to fail due to lack of "paste" for win+VS job.  This
> was somewhat unexpected, as our test scripts seem to make liberal
> use of "cut" that goes together with it.
>
>     https://github.com/git/git/runs/5415486631?check_suite_focus=true#step:5:6199
>
> The particular failure at the URL comes from the use of "paste" in
> 5ea4f3a5 (name-rev: use generation numbers if available,
> 2022-02-28), but it hardly is the first use of the command.  There
> is one use of it in t/aggregate-results.sh in 'master/main' already.

I think it's the first use we actually run: t/aggregate-results.sh is
only run on "DEFAULT_TEST_TARGET=test make -C t", but we use
"DEFAULT_TEST_TARGET=prove".

Re your upthread:

> I personally do not care about the initial latency when viewing the
> output from CI run that may have happened a few dozens of minutes
> ago (I do not sit in front of GitHub CI UI and wait until it
> finishes). 

I think this URL is a good example of what I noted in [1]. Your link
loads relatively quickly, but I then saw a "linux-TEST-vars" failure and
clicked on it, wanting to see why that fails.

It opens relatively quickly, but no failure can be seen. It stalls with
a spinner next to "t/run-build-and-test.sh", and stalled like that for
75[2] seconds before finally loading past line ~3.5k to line ~70k
showing the relevant failure in t6120*.sh.

I really don't think it's a reasonable claim to say that only "veterans"
of git development[3] are likely to find the workflow of seeing a CI
failure right away useful, or wanting to browse through the N different
"job" failures without having to pre-open them, go find something else
to do, then come back to it etc.

I also noted in [1] that it takes a lot more CPU now, so even if that is
your workflow for looking at CI you'll need a fairly performant machine
if you have a few job failures (which isn't atypical), as each tab will
be pegging a CPU core at ~100% for a while.

I have a fairly normally spec'd quad-core laptop that I almost never hear,
and this new CI UI is pretty reliable in making it sound as though it's
about to take flight.

1. https://lore.kernel.org/git/220222.86tucr6kz5.gmgdl@evledraar.gmail.com/
2. I reported large N seconds, but nothing so bad before. For some reason
   this one's particularly bad, but in [1] it was the same CPU use with ~20s
   etc (but that one was 1/2 the amount of lines)
3. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2203011111150.11118@tvgsbejvaqbjf.bet/


* Re: win+VS environment has "cut" but not "paste"?
  2022-03-04  7:38               ` win+VS environment has "cut" but not "paste"? Junio C Hamano
  2022-03-04  9:04                 ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 15:48                 ` Johannes Schindelin
  2022-03-07 16:58                   ` Junio C Hamano
  1 sibling, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-07 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jacob Keller

Hi Junio,

On Thu, 3 Mar 2022, Junio C Hamano wrote:

> GitHub CI seems to fail due to lack of "paste" for win+VS job.  This
> was somewhat unexpected, as our test scripts seem to make liberal
> use of "cut" that goes together with it.
>
>     https://github.com/git/git/runs/5415486631?check_suite_focus=true#step:5:6199
>
> The particular failure at the URL comes from the use of "paste" in
> 5ea4f3a5 (name-rev: use generation numbers if available,
> 2022-02-28), but it hardly is the first use of the command.  There
> is one use of it in t/aggregate-results.sh in 'master/main' already.
>
> We could rewrite the tests that use "paste" but looking at the use
> of the tool in the test (and the aggregate thing), rewriting them
> due to lack of a tool, whose source should be freely available from
> where "cut" was taken from, does not sound like too attractive a
> direction to go in, but I do not know how much work is involved in
> adding it (and in general, any basic tool with similar complexity
> that we may find missing in the future) to the win+VS environment.

I added it:
https://github.com/git-for-windows/git-sdk-64/commit/e3ade7eef2503149dfefe630037c2fd6d24f2c14

It will take ~35 minutes (at time of writing) for
https://dev.azure.com/Git-for-Windows/git/_build/results?buildId=95542&view=results
to make the corresponding artifact available to the
`setup-git-for-windows-sdk` Action we use.

Ciao,
Dscho


* Re: win+VS environment has "cut" but not "paste"?
  2022-03-04  9:04                 ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 15:51                   ` Johannes Schindelin
  2022-03-07 17:05                     ` Junio C Hamano
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-07 15:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, git, Jacob Keller


Hi Ævar,

On Fri, 4 Mar 2022, Ævar Arnfjörð Bjarmason wrote:

> I really don't think it's a reasonable claim to say that only "veterans"
> of git development[3] are likely to find the workflow of seeing a CI
> failure right away useful, or wanting to browse through the N different
> "job" failures without having to pre-open them, go find something else
> to do, then come back to it etc.

I said that the current output is only useful to veterans. The output that
hides the detailed log, under a separate job that is marked as
non-failing.

That's still as true as when I said it. :-)

Ciao,
Johannes


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-02 12:22   ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 15:57     ` Johannes Schindelin
  2022-03-07 16:05       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-07 15:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine


Hi Ævar,

On Wed, 2 Mar 2022, Ævar Arnfjörð Bjarmason wrote:

>
> On Tue, Mar 01 2022, Johannes Schindelin via GitGitGadget wrote:
>
> > Changes since v1:
> >
> >  * In the patch that removed MAKE_TARGETS, a stale comment about that
> >    variable is also removed.
> >  * The comment about set -x has been adjusted because it no longer applies
> >    as-is.
> >  * The commit message of "ci: make it easier to find failed tests' logs in
> >    the GitHub workflow" has been adjusted to motivate the improvement
> >    better.
>
> Just briefly: Much of the feedback I had on v1 is still unanswered,

Yes, because I got the sense that your feedback ignores the goal of making
it easier to diagnose test failures.

> or in the case of the performance issues I think just saying that this
> output is aimed at non-long-time-contributors doesn't really justify the
> large observed slowdowns.

What good is a quickly-loading web site when it is less than useful?

I'd much rather have a slow-loading web site that gets me to where I need
to be than a fast-loading one that leaves me as puzzled as before.

Ciao,
Johannes


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 15:57     ` Johannes Schindelin
@ 2022-03-07 16:05       ` Ævar Arnfjörð Bjarmason
  2022-03-07 17:36         ` Junio C Hamano
  2022-03-09 13:20         ` Johannes Schindelin
  0 siblings, 2 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-07 16:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	SZEDER Gábor


On Mon, Mar 07 2022, Johannes Schindelin wrote:

> On Wed, 2 Mar 2022, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Tue, Mar 01 2022, Johannes Schindelin via GitGitGadget wrote:
>>
>> > Changes since v1:
>> >
>> >  * In the patch that removed MAKE_TARGETS, a stale comment about that
>> >    variable is also removed.
>> >  * The comment about set -x has been adjusted because it no longer applies
>> >    as-is.
>> >  * The commit message of "ci: make it easier to find failed tests' logs in
>> >    the GitHub workflow" has been adjusted to motivate the improvement
>> >    better.
>>
>> Just briefly: Much of the feedback I had on v1 is still unanswered,
>
> Yes, because I got the sense that your feedback ignores the goal of making
> it easier to diagnose test failures.

I think that's a rather strange conclusion given that I've submitted a
parallel series that makes some of those failures easier to diagnose
than the same changes in this series. I.e. the failures in build
vs. test phases, not the individual test format output (but those are
orthogonal).

I think it's a fair summary of our differences that we're just placing
different values on UX responsiveness. I'm pretty sure there's some
amount of UX slowdown you'd consider unacceptable, no matter how much
the output was improved.

Clearly we just use it differently.

>> or in the case of the performance issues I think just saying that this
>> output is aimed at non-long-time-contributors doesn't really justify the
>> large observed slowdowns.
>
> What good is a quickly-loading web site when it is less than useful?

For all the flaws in the current output there are cases now where you
can click on a failure, see a summary of the 1-2 tests that failed, and
even find your way to the right place in the rather verbose raw log
output in 1/4 or 1/2 the time it takes the initial page on the new UX to
load.

> I'd much rather have a slow-loading web site that gets me to where I need
> to be than a fast-loading one that leaves me as puzzled as before.

I think it's clear that we're going to disagree on this point, but I'd
still think that:

 * In a re-roll, you should amend these patches to clearly note that's a
   UX trade-off you're making, perhaps with rough before/after timings
   similar to the ones I've posted.

   I.e. now those patches say nothing about the UX change resulting in
   UX that's *much* slower than before. Clearly noting that trade-off
   for reviewers is not the same as saying the trade-off can't be made.

 * I don't see why the changes here can't be made configurable (and
   perhaps you'd argue they should be on by default) via the ci-config
   phase.


* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-02 10:58             ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Phillip Wood
@ 2022-03-07 16:07               ` Johannes Schindelin
  2022-03-07 17:11                 ` Junio C Hamano
  2022-03-07 17:12                 ` Phillip Wood
  0 siblings, 2 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-07 16:07 UTC (permalink / raw)
  To: phillip.wood
  Cc: Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Junio C Hamano

Hi Phillip,

On Wed, 2 Mar 2022, Phillip Wood wrote:

> On 25/02/2022 14:10, Johannes Schindelin wrote:
>
> > On Wed, 23 Feb 2022, Phillip Wood wrote:
> >
> > > With the first link above the initial page load is faster but to get
> > > to the output of the failing test case I have to click on "Run
> > > ci/print_test_failures.sh" then wait for that to load and then
> > > search for "not ok" to actually get to the information I'm after.
> >
> > And that's only because you are familiar with what you have to do.
> >
> > Any new contributor would be stuck with the information presented on the
> > initial load, without any indication that more information _is_ available,
> > just hidden away in the next step's log (which is marked as "succeeding",
> > therefore misleading the inclined reader into thinking that this cannot
> > potentially contain any information pertinent to the _failure_ that needs
> > to be investigated).
>
> Yes it took me a while to realize how to get to the test output when I first
> started looking at the CI results.

Thank you for saying that. Since nobody else said it as clearly as you, I
really started to doubt myself here.

> One thing I forgot to mention was that when you expand a failing test it shows
> the test script twice before the output e.g.
>
> Error: failed: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
> failure: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>   				git config core.fsmonitor true &&
>   				git fsmonitor--daemon start &&
>   				git update-index --fsmonitor
>
>   expecting success of 7527.35 'Matrix[uc:false][fsm:true] enable fsmonitor':
>   				git config core.fsmonitor true &&
>   				git fsmonitor--daemon start &&
>   				git update-index --fsmonitor
>
>  ++ git config core.fsmonitor true
>  ++ git fsmonitor--daemon start
>  ...
>
> I don't know how easy it would be to fix that so that we only show "expecting
> success of ..." without the test being printed first

It's not _super_ easy: right now, the patch series does not touch the code
that prints the latter message. In fact, that latter message is generated
_before_ the test fails, and redirected via `tee` into
`GIT_TEST_TEE_OUTPUT_FILE`. Then, once the verdict is clear that this test
case failed, the first message is printed (the one that is colored in the
output via `::error::`), and the output from the
`GIT_TEST_TEE_OUTPUT_FILE` file is pasted, starting at the offset marking
the start of the test case.
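
As a rough illustration of that mechanism, here is a minimal sketch. It
is hypothetical and not the actual test-lib.sh code: the function names
`start_test_case`, `report_failure` and `demo` are made up for this
example, and a temporary file stands in for `GIT_TEST_TEE_OUTPUT_FILE`.
It records the size of the log when a test case starts and, on failure,
prints the `::error::` line before replaying the log from that offset:

```shell
#!/bin/sh
# Hypothetical sketch only; none of these names exist in test-lib.sh.

# Remember the current size of the log, i.e. where the next test
# case's output will start.
start_test_case () {
	offset=$(wc -c <"$1")
}

# Print the error annotation first, then replay the log from the
# recorded offset onwards.
report_failure () {
	printf '::error::failed: %s\n' "$2"
	tail -c "+$((offset + 1))" "$1"
}

demo () {
	log=$(mktemp) || return 1
	echo "output of an earlier, passing test case" >>"$log"
	start_test_case "$log"
	echo "expecting success of 7527.35 'enable fsmonitor':" >>"$log"
	echo "++ git config core.fsmonitor true" >>"$log"
	report_failure "$log" "t7527.35 enable fsmonitor"
	rm -f "$log"
}

demo
```

Because the log is replayed verbatim from the offset, anything written
there before the verdict is known (such as the `expecting success`
header) necessarily shows up again after the error line.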

The easiest workaround would probably be to add a flag that suppresses the
header `expecting success` in case we're running with the
`--github-workflow-markup` option.

I'll put it on my ever-growing TODO list, but maybe in the interest of
heeding the mantra "the perfect is the enemy of the good", this can be
done on top of this series rather than blocking it?

Thanks,
Dscho


* Re: win+VS environment has "cut" but not "paste"?
  2022-03-07 15:48                 ` Johannes Schindelin
@ 2022-03-07 16:58                   ` Junio C Hamano
  0 siblings, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-07 16:58 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jacob Keller

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> GitHub CI seems to fail due to lack of "paste" for win+VS job.  This
>> was somewhat unexpected, as our test scripts seem to make liberal
>> use of "cut" that goes together with it.
>> ...
> I added it:
> https://github.com/git-for-windows/git-sdk-64/commit/e3ade7eef2503149dfefe630037c2fd6d24f2c14

Thanks.


* Re: win+VS environment has "cut" but not "paste"?
  2022-03-07 15:51                   ` Johannes Schindelin
@ 2022-03-07 17:05                     ` Junio C Hamano
  2022-03-09 13:02                       ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-03-07 17:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, git, Jacob Keller

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I said that the current output is only useful to veterans. The output that
> hides the detailed log, under a separate job that is marked as
> non-failing.
>
> That's still as true as when I said it. :-)

I think getting rid of the separate "print failures" CI step and
making it more discoverable how to reveal the details of failing
test step is a usability improvement.

I am not Ævar, but I think what was questioned was whether the
improvement justifies multiple dozens of seconds of extra waiting
time, which is a usability dis-improvement.  I do not have a good
answer to that question.

I am probably close to being a veteran who knows when to brew my tea
in my work cycle, and waiting for an extra minute or two while
browsing CI output is not a problem for me.

But new "non-veteran" users may not share that.  That is something a
bit worrying about the new UI.

Thanks.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 16:07               ` Johannes Schindelin
@ 2022-03-07 17:11                 ` Junio C Hamano
  2022-03-09 11:44                   ` Ævar Arnfjörð Bjarmason
  2022-03-07 17:12                 ` Phillip Wood
  1 sibling, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-03-07 17:11 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: phillip.wood, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> One thing I forgot to mention was that when you expand a failing test it shows
>> the test script twice before the output e.g.
>>
>> Error: failed: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>> failure: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>>   				git config core.fsmonitor true &&
>>   				git fsmonitor--daemon start &&
>>   				git update-index --fsmonitor
>>
>>   expecting success of 7527.35 'Matrix[uc:false][fsm:true] enable fsmonitor':
>>   				git config core.fsmonitor true &&
>>   				git fsmonitor--daemon start &&
>>   				git update-index --fsmonitor
>>
>>  ++ git config core.fsmonitor true
>>  ++ git fsmonitor--daemon start
>>  ...
>>
>> I don't know how easy it would be to fix that so that we only show "expecting
>> success of ..." without the test being printed first
>
> It's not _super_ easy: right now, the patch series does not touch the code

In other words, it is not a new issue introduced by this series, right?

> The easiest workaround would probably be to add a flag that suppresses the
> header `expecting success` in case we're running with the
> `--github-workflow-markup` option.

If that is the case, let's leave it outside the series.

If we do not have to hide the solution behind any option specific to
"--github-workflow-markup", then even users (like me) who regularly
run "cd t && sh ./t1234-a-particular-test.sh -i -v" would benefit if
we no longer have to look at the duplicated test script in the
output.

Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 16:07               ` Johannes Schindelin
  2022-03-07 17:11                 ` Junio C Hamano
@ 2022-03-07 17:12                 ` Phillip Wood
  1 sibling, 0 replies; 98+ messages in thread
From: Phillip Wood @ 2022-03-07 17:12 UTC (permalink / raw)
  To: Johannes Schindelin, phillip.wood
  Cc: Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Junio C Hamano

Hi Dscho
On 07/03/2022 16:07, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Wed, 2 Mar 2022, Phillip Wood wrote:
> 
>> On 25/02/2022 14:10, Johannes Schindelin wrote:
>>
>>> On Wed, 23 Feb 2022, Phillip Wood wrote:
>>>
>>>> With the first link above the initial page load is faster but to get
>>>> to the output of the failing test case I have to click on "Run
>>>> ci/print_test_failures.sh" then wait for that to load and then
>>>> search for "not ok" to actually get to the information I'm after.
>>>
>>> And that's only because you are familiar with what you have to do.
>>>
>>> Any new contributor would be stuck with the information presented on the
>>> initial load, without any indication that more information _is_ available,
>>> just hidden away in the next step's log (which is marked as "succeeding",
>>> therefore misleading the inclined reader into thinking that this cannot
>>> potentially contain any information pertinent to the _failure_ that needs
>>> to be investigated).
>>
>> Yes it took me a while to realize how to get to the test output when I first
>> started looking at the CI results.
> 
> Thank you for saying that. Since nobody else said it as clearly as you, I
> really started to doubt myself here.
> 
>> One thing I forgot to mention was that when you expand a failing test it shows
>> the test script twice before the output e.g.
>>
>> Error: failed: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>> failure: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>>    				git config core.fsmonitor true &&
>>    				git fsmonitor--daemon start &&
>>    				git update-index --fsmonitor
>>
>>    expecting success of 7527.35 'Matrix[uc:false][fsm:true] enable fsmonitor':
>>    				git config core.fsmonitor true &&
>>    				git fsmonitor--daemon start &&
>>    				git update-index --fsmonitor
>>
>>   ++ git config core.fsmonitor true
>>   ++ git fsmonitor--daemon start
>>   ...
>>
>> I don't know how easy it would be to fix that so that we only show "expecting
>> success of ..." without the test being printed first
> 
> It's not _super_ easy: right now, the patch series does not touch the code
> that prints the latter message. In fact, that latter message is generated
> _before_ the test fails, and redirected via `tee` into
> `GIT_TEST_TEE_OUTPUT_FILE`. Then, once the verdict is clear that this test
> case failed, the first message is printed (the one that is colored in the
> output via `::error::`), and the output from the
> `GIT_TEST_TEE_OUTPUT_FILE` file is pasted, starting at the offset marking
> the start of the test case.
> 
> The easiest workaround would probably be to add a flag that suppresses the
> header `expecting success` in case we're running with the
> `--github-workflow-markup` option.
> 
> I'll put it on my ever-growing TODO list, but maybe in the interest of
> heeding the mantra "the perfect is the enemy of the good", this can be
> done on top of this series rather than blocking it?

Sure, I mentioned it in case it was easy to fix but I don't think it 
should stop these patches moving forward

Best Wishes

Phillip

> Thanks,
> Dscho


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 16:05       ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 17:36         ` Junio C Hamano
  2022-03-09 10:56           ` Ævar Arnfjörð Bjarmason
  2022-03-09 13:20         ` Johannes Schindelin
  1 sibling, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-03-07 17:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Eric Sunshine, SZEDER Gábor

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I think that's a rather strange conclusion given that I've submitted a
> parallel series that makes some of those failures easier to diagnose
> than the same changes in this series. I.e. the failures in build
> v.s. test phases, not the individual test format output (but those are
> orthogonal).

If you have a counter-proposal that you feel is solid enough, I do
not mind dropping the topic in question and replacing it with the
counter-proposal to let people see how it fares for a few days.  If
it allows others to view the output easily if you revert the merge
of this topic into 'seen' and replace with the counter-proposal and
push it to your own repository, that would be an even better way to
highlight the differences of two approaches, as that would allow us
to see the same failures side-by-side.

Am I correct to understand that one of the common goals here is
to eliminate the need to discover how to get to the first failure
output, without making that output 10x slower to load?

> I think it's clear that we're going to disagree on this point, but I'd
> still think that:
>
>  * In a re-roll, you should amend these patches to clearly note that's a
>    UX trade-off you're making, perhaps with rough before/after timings
>    similar to the ones I've posted.
>
>    I.e. now those patches say nothing about the UX change resulting in
>    UX that's *much* slower than before. Clearly noting that trade-off
>    for reviewers is not the same as saying the trade-off can't be made.

Whether we perform counter-proposal comparison or not, the above is
a reasonable thing to ask.

>  * I don't see why the changes here can't be made configurable (and
>    perhaps you'd argue they should be on by default) via the ci-config
>    phase.

I do not know if such a knob is feasible, though.

Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 17:36         ` Junio C Hamano
@ 2022-03-09 10:56           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-09 10:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Eric Sunshine, SZEDER Gábor


On Mon, Mar 07 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> I think that's a rather strange conclusion given that I've submitted a
>> parallel series that makes some of those failures easier to diagnose
>> than the same changes in this series. I.e. the failures in build
>> v.s. test phases, not the individual test format output (but those are
>> orthogonal).
>
> If you have a counter-proposal that you feel is solid enough, I do
> not mind dropping the topic in question and replacing it with the
> counter-proposal to let people see how it fares for a few days.  If
> it allows others to view the output easily if you revert the merge
> of this topic into 'seen' and replace with the counter-proposal and
> push it to your own repository, that would be an even better way to
> highlight the differences of two approaches, as that would allow us
> to see the same failures side-by-side.

I proposed [1] as a counter-proposal to *part of* this series, which has
various UX comparisons.

Note that it doesn't aim to change the test failure output
significantly, but to structurally simplify .github/workflows/main.yml
and ci/ for a significant line-count reduction, and improve part of the
output that we get for "make" v.s. "make test", and to make it obvious
what the parameters of each step are.

That partially overlaps with half of Johannes's series here; I think
it makes sense to split up those two concerns & address this more
incrementally.

I.e. my series shows that you can get what the first half of this series
achieves by adding GitHub-specific output to ci/*, but without any such
CI-backend-specific output, and the resulting presentation in the UX is
better.

It'll still apply to "master" with this topic ejected. There was a minor
comment (needing commit message rephrasing) from SZEDER Gábor on it; I
could re-roll it if you're interested.

> Am I correct to understand that one of the common goals here is
> to eliminate the need to discover how to get to the first failure
> output, without making that output 10x slower to load?

That's definitely an eventual common goal, I have a POC for how to do
that with an alternate approach that doesn't suffer from that slowdown,
and shows you much more targeted failure output (only the specific tests
that failed) at [2].

I just think it makes sense to split up the concerns of how we arrange
.github/workflows/main.yml & ci/* and how doing that differently can
improve the CI UX, v.s. the mostly orthogonal concern of how test-lib.sh
+ prove can best summarize their failure output.

>> I think it's clear that we're going to disagree on this point, but I'd
>> still think that:
>>
>>  * In a re-roll, you should amend these patches to clearly note that's a
>>    UX trade-off you're making, perhaps with rough before/after timings
>>    similar to the ones I've posted.
>>
>>    I.e. now those patches say nothing about the UX change resulting in
>>    UX that's *much* slower than before. Clearly noting that trade-off
>>    for reviewers is not the same as saying the trade-off can't be made.
>
> Whether we perform counter-proposal comparison or not, the above is
> a reasonable thing to ask.
>
>>  * I don't see why the changes here can't be made configurable (and
>>    perhaps you'd argue they should be on by default) via the ci-config
>>    phase.
>
> I do not know if such a knob is feasible, though.

It would be rather trivial. Basically a matter of adding an "if:
needs.ci-config.outputs.style == 'basic'" to ci/print-test-failures.sh,
and a corresponding flag passed down to ci/lib.sh to have it invoke
test-lib.sh with --github-workflow-markup or not.

I.e. this series detects it's running in GitHub CI and optionally tells
test-lib.sh to emit different output, so to get the old output you'd
just need to not opt-in to that.

I think we can have our cake and eat it too though, so I don't think
there's any point in such a knob per-se. The only reason I'm suggesting
it is because Johannes doesn't otherwise seem to want to address the
significant CI UX slowdowns in this series.

I do think the approach taken by this series is inherently limited in
solving that problem though, in a way that the approach outlined in [2]
isn't.

I.e. the problem is that we're throwing say ~30k-90k lines of raw CI
output at some GitHub JavaScript to format and present. Your browser
needs to download all the output, and then the browser needs to spin at
100% CPU to present it to you.

The reason for that is that anything that tweaks the test-lib.sh output
is something you need to consume *in its entirety*. I.e. when you have a
sequence like:

    ok 1 test one
    ok 2 test two
    not ok 3 test three
    1..3

You don't know until the third test that you've had a failure. Short of
teaching test-lib.sh even more complexity (e.g. pre-buffering its
"passing" output) a UX layer needs to present all of it, and won't
benefit from a parsed representation of it.

Which is why I think adding other output formatters to test-lib.sh is a
dead end approach to this problem.

I mean, sure we could start parsing the new output emitted here, but
that doesn't make sense either.

We already run a full TAP parser over the output of the entire test
suite, which we can easily use to only print details about those tests
that failed[2]. We could also still have the full & raw output, but that
could be in some other tab (or "stage", just like
"ci/print-test-failures.sh" is now).
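
As a rough illustration of the idea (and only that: this is not the parser that prove uses, and `failed_tests` is a hypothetical helper), once the TAP stream has been consumed in full, picking out just the failures is a small filter. The sketch assumes plain `ok`/`not ok` result lines and treats `# TODO` entries as expected failures:

```python
import re

# Matches TAP result lines such as "ok 1 test one" or "not ok 3 - test three".
TAP_RESULT = re.compile(r"^(not ok|ok) (\d+)(?: - | )?(.*)$")


def failed_tests(tap_output):
    """Return (number, description) for each unexpected failure."""
    failures = []
    for line in tap_output.splitlines():
        match = TAP_RESULT.match(line)
        if not match or match.group(1) == "ok":
            continue  # plan lines, diagnostics and passing tests
        description = match.group(3)
        if re.search(r"#\s*TODO", description, re.IGNORECASE):
            continue  # known breakage, not a real failure
        failures.append((int(match.group(2)), description))
    return failures


sample = """\
ok 1 test one
ok 2 test two
not ok 3 test three
1..3
"""
for number, description in failed_tests(sample):
    print(f"failed: {number} {description}")
```

The point is that the whole stream still has to be read, but only the parsed failures need to be presented.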

To be fair that isn't quite as trivial in terms of patch count. In
particular we have edge cases currently where the TAP output isn't valid
due to bugs in test-lib.sh and/or tests, and can't combine it with the
--verbose output. The demo at [2] is on top of some patches I've got
locally to fix all of that.

But I really think it's worth it to move a bit slower there & get it
right rather than heading down the wrong direction of having another
not-quite-machine-readable output target.

I.e. once it's machine readable & parsed we can present it however we
like, and can do *much better* output such as correlating trace output
to the test source, and e.g. showing what specific line we failed
on. Right now all of that needs to be eyeballed & inferred from the
"--verbose -x" output (which is often non-obvious & a pain, especially
when the "test_expect_success" block calls helper functions).

1. https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@gmail.com/
2. https://lore.kernel.org/git/220302.86mti87cj2.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 17:11                 ` Junio C Hamano
@ 2022-03-09 11:44                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-09 11:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, phillip.wood,
	Johannes Schindelin via GitGitGadget, git


On Mon, Mar 07 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>>> One thing I forgot to mention was that when you expand a failing test it shows
>>> the test script twice before the output e.g.
>>>
>>> Error: failed: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>>> failure: t7527.35 Matrix[uc:false][fsm:true] enable fsmonitor
>>>   				git config core.fsmonitor true &&
>>>   				git fsmonitor--daemon start &&
>>>   				git update-index --fsmonitor
>>>
>>>   expecting success of 7527.35 'Matrix[uc:false][fsm:true] enable fsmonitor':
>>>   				git config core.fsmonitor true &&
>>>   				git fsmonitor--daemon start &&
>>>   				git update-index --fsmonitor
>>>
>>>  ++ git config core.fsmonitor true
>>>  ++ git fsmonitor--daemon start
>>>  ...
>>>
>>> I don't know how easy it would be to fix that so that we only show "expecting
>>> success of ..." without the test being printed first
>>
>> It's not _super_ easy: right now, the patch series does not touch the code
>
> In other words, it is not a new issue introduced by this series, right?

It is a new issue in this series, specifically how
"finalize_test_case_output" interacts with "test_{ok,failure}_" and
friends.

>> The easiest workaround would probably be to add a flag that suppresses the
>> header `expecting success` in case we're running with the
>> `--github-workflow-markup` option.
>
> If that is the case, let's leave it outside the series.
>
> If we do not have to hide the solution behind any option specific to
> "--github-workflow-markup", then even users (like me) who regularly
> run "cd t && sh ./t1234-a-particular-test.sh -i -v" would benefit if
> we no longer have to look at the duplicated test script in the
> output.

Unless you invoke it with --github-workflow-markup you won't see the
duplication.

I had some comments about inherent limitations in the approach in this
series vis-a-vis parsing markup after the fact[1]. But that really
doesn't seem to apply here. We're just printing the test source into the
*.markup file twice for no particular reason, aren't we?

*tests locally*

Hrm, so first this is a bug:
    
    $ ./t0002-gitfile.sh  --github-workflow-markup
    FATAL: Unexpected exit with code 1
    FATAL: Unexpected exit with code 1
    
Seems it wants --tee but doesn't declare it, this works:

    $ rm -rf test-results/; ./t0002-gitfile.sh  --github-workflow-markup --tee; cat test-results/t0002-gitfile.markup

Isn't this a matter of making finalize_test_case_output not print the
full $* (including the test source) for failures?

1. https://lore.kernel.org/git/220309.861qzbnymn.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: win+VS environment has "cut" but not "paste"?
  2022-03-07 17:05                     ` Junio C Hamano
@ 2022-03-09 13:02                       ` Johannes Schindelin
  2022-03-10 15:23                         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-09 13:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git, Jacob Keller

Hi Junio,

On Mon, 7 Mar 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I said that the current output is only useful to veterans. The output that
> > hides the detailed log, under a separate job that is marked as
> > non-failing.
> >
> > That's still as true as when I said it. :-)
>
> I think getting rid of the separate "print failures" CI step and
> making it more discoverable how to reveal the details of failing
> test step is a usability improvement.

I'm so glad you're saying that! I was starting to doubt whether my caring
about getting rid of that `print failures` step was maybe misguided.

> I am not Ævar, but I think what was questioned was whether the
> improvement justifies multiple dozens of seconds of extra waiting
> time, which is a usability dis-improvement.  I do not have a good
> answer to that question.

It is definitely worrisome that we have to pay such a price. And if there
was a good answer how to improve that (without sacrificing the
discoverability of the command trace corresponding to the test failure), I
would be more than just eager to hear it.

I did consider generating an HTML-formatted report that would then be
attached as a build artifact. But that would hide the relevant information
even worse than a "print failures" step.

Maybe I should just get rid of the grouping? But that _really_ helped me
when I investigated the recent "usage string updates" vs "FSMonitor"
problem, because the test case boundaries were so much clearer.

Plus, as far as I saw, GitHub workflow logs always scroll to the end of
the logs of the failing step, which would not help _at all_ here.

So I dunno.

> I am probably close to being a veteran who knows when to brew my tea
> in my work cycle, and waiting for an extra minute or two while
> browsing CI output is not a problem for me.

:-)

> But new "non-veteran" users may not share that.  That is something a
> bit worrying about the new UI.

Indeed. My goal, after all, is to improve the experience of contributors,
not to make it harder.

Still, given that you currently have to click quite a few times until you
get to where you need to be, I have my doubts that what this patch series
does is actually making things slower, measured in terms of the total time
from seeing a failed build to being able to diagnose the cause by
inspecting the command trace.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-07 16:05       ` Ævar Arnfjörð Bjarmason
  2022-03-07 17:36         ` Junio C Hamano
@ 2022-03-09 13:20         ` Johannes Schindelin
  2022-03-09 19:39           ` Junio C Hamano
  2022-03-09 19:47           ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-03-09 13:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	SZEDER Gábor

Hi Ævar,

On Mon, 7 Mar 2022, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Mar 07 2022, Johannes Schindelin wrote:
>
> > On Wed, 2 Mar 2022, Ævar Arnfjörð Bjarmason wrote:
> >
> >>
> >> On Tue, Mar 01 2022, Johannes Schindelin via GitGitGadget wrote:
> >>
> >> > Changes since v1:
> >> >
> >> >  * In the patch that removed MAKE_TARGETS, a stale comment about that
> >> >    variable is also removed.
> >> >  * The comment about set -x has been adjusted because it no longer applies
> >> >    as-is.
> >> >  * The commit message of "ci: make it easier to find failed tests' logs in
> >> >    the GitHub workflow" has been adjusted to motivate the improvement
> >> >    better.
> >>
> >> Just briefly: Much of the feedback I had on v1 is still unanswered,
> >
> > Yes, because I got the sense that your feedback ignores the goal of making
> > it easier to diagnose test failures.
>
> I think that's a rather strange conclusion given that I've submitted a
> parallel series that makes some of those failures easier to diagnose
> than the same changes in this series. I.e. the failures in build
> v.s. test phases, not the individual test format output (but those are
> orthogonal).

I do not know what your parallel series implements, as I did not have the
time to read it yet (and it contains about double the number of patches of
my series, hopefully not intended to make it impossible for me to spare
the time to even take a glimpse at it).

Also: I thought we had the rule of trying to be mindful of other
contributors and avoid interfering with patch series that are in flight?
It could be viewed as unnecessarily adversarial.

> I think it's a fair summary of our differences that we're just placing
> different values on UX responsiveness. I'm pretty sure there's some
> amount of UX slowdown you'd consider unacceptable, no matter how much
> the output was improved.
>
> Clearly we just use it differently.

I would gladly trade my convenience in return for making it easier for
others to diagnose why their PR builds fail.

At the moment, the way our CI/PR builds present test failures very likely
makes every new contributor feel stupid for not knowing how to proceed.
But they are not stupid. The failure is not theirs. The fault lies
squarely with us, for making it so frigging hard.

> >> or in the case of the performance issues I think just saying that this
> >> output is aimed at non-long-time-contributors doesn't really justify the
> >> large observed slowdowns.
> >
> > What good is a quickly-loading web site when it is less than useful?
>
> For all the flaws in the current output there are cases now where you
> can click on a failure, see a summary of the 1-2 tests that failed, and
> even find your way through the right place in the rather verbose raw log
> output in 1/4 or 1/2 the time it takes the initial page on the new UX to
> load.

I wonder where the hard data is that backs up these numbers.

I do not have hard data, either, except for one: apart from you and Junio,
I have yet to talk to any contributor who said "oh yeah, I found the logs
right away" rather than things like "when I finally figured out that the
logs were in `print-test-failures`, I had a chance to make sense of the
failures" or even "zOMG _that_ is where the logs are?". And let me tell
you that I heard this from a lot of people. Way more than a mere two.
Far, far more.

> > I'd much rather have a slow-loading web site that gets me to where I need
> > to be than a fast-loading one that leaves me as puzzled as before.
>
> I think it's clear that we're going to disagree on this point, but I'd
> still think that:
>
>  * In a re-roll, you should amend these patches to clearly note that's a
>    UX trade-off you're making, perhaps with rough before/after timings
>    similar to the ones I've posted.
>
>    I.e. now those patches say nothing about the UX change resulting in
>    UX that's *much* slower than before. Clearly noting that trade-off
>    for reviewers is not the same as saying the trade-off can't be made.

Sure, I can do that.

>  * I don't see why the changes here can't be made configurable (and
>    perhaps you'd argue they should be on by default) via the ci-config
>    phase.

I do see why. If my goal is to unhide the logs by default, so that new
contributors can find them more easily, I will not hide that new behavior
behind a hard-to-find config option, an option that new contributors are
even less likely to find. That would be highly counterproductive. If your
goal is to help new contributors, I am certain that you will agree.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-09 13:20         ` Johannes Schindelin
@ 2022-03-09 19:39           ` Junio C Hamano
  2022-03-09 19:47           ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-03-09 19:39 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	SZEDER Gábor

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I do not have hard data, either, except for one: apart from you and Junio,
> I have yet to talk to any contributor who said "oh yeah, I found the logs
> right away" rather than things like "when I finally figured out that the
> logs were in `print-test-failures`, I had a chance to make sense of the
> failures" or even "zOMG _that_ is where the logs are?". And let me tell
> you that I heard this from a lot of people. Way more than a mere two.
> Far, far more.

Stop counting me there.  I didn't find the logs right away, and I
already said that it is a good idea to eliminate the need to open
the thing other than the one that opens by default.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-09 13:20         ` Johannes Schindelin
  2022-03-09 19:39           ` Junio C Hamano
@ 2022-03-09 19:47           ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-09 19:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	SZEDER Gábor


On Wed, Mar 09 2022, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 7 Mar 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, Mar 07 2022, Johannes Schindelin wrote:
>>
>> > On Wed, 2 Mar 2022, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >>
>> >> On Tue, Mar 01 2022, Johannes Schindelin via GitGitGadget wrote:
>> >>
>> >> > Changes since v1:
>> >> >
>> >> >  * In the patch that removed MAKE_TARGETS, a stale comment about that
>> >> >    variable is also removed.
>> >> >  * The comment about set -x has been adjusted because it no longer applies
>> >> >    as-is.
>> >> >  * The commit message of "ci: make it easier to find failed tests' logs in
>> >> >    the GitHub workflow" has been adjusted to motivate the improvement
>> >> >    better.
>> >>
>> >> Just briefly: Much of the feedback I had on v1 is still unanswered,
>> >
>> > Yes, because I got the sense that your feedback ignores the goal of making
>> > it easier to diagnose test failures.
>>
>> I think that's a rather strange conclusion given that I've submitted a
>> parallel series that makes some of those failures easier to diagnose
>> than the same changes in this series. I.e. the failures in build
>> v.s. test phases, not the individual test format output (but those are
>> orthogonal).
>
> I do not know what your parallel series implements, as I did not have the
> time to read it yet (and it contains about double the number of patches of
> my series, hopefully not intended to make it impossible for me to spare
> the time to even take a glimpse at it).

No, I'm not arranging patches in such a way as to make them harder for
you to review specifically. I thought those changes made sense as a
logical progression.

> Also: I thought we had the rule of trying to be mindful of other
> contributors and avoid interfering with patch series that are in flight?
> It could be viewed as unnecessarily adversarial.

You don't need to look at the whole thing, but in
https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@gmail.com/
scroll down to "The following compares CI output" and compare:

  master: https://github.com/avar/git/runs/5274251909?check_suite_focus=true
  this: https://github.com/avar/git/runs/5274464670?check_suite_focus=true
  js: https://github.com/avar/git/runs/5272239403?check_suite_focus=true

I.e. for the build v.s. test "grouping" you're doing early in your
series we can get the same with a significantly negative instead of
positive diffstat to .github & ci/, and it frees up the "nested groups"
that you note as a limitation in your 4/9 with another potential group
level (your 4/9:
https://lore.kernel.org/git/9333ba781b8240f704e739b00d274f8c3d887e39.1643050574.git.gitgitgadget@gmail.com/)

So it's not meant to be adversarial, but to help out this topic at large
by showing that a constraint you ran up against is something we don't
need to be limited by, and it makes that part of the CI output better.

I think posting working code to demonstrate that we can indeed do that
is the most productive thing to do, talk being cheap & all that.

So yes, I agree that in general it's better to avoid conflicting topics
etc., but in a case where a topic proposes to add a significant amount
of code & complexity to get to some desired end-state, it makes sense to
demonstrate with a patch or patches that we can get to the same
end-state in some simpler way.

>> I think it's a fair summary of our differences that we're just placing
>> different values on UX responsiveness. I'm pretty sure there's some
>> amount of UX slowdown you'd consider unacceptable, no matter how much
>> the output was improved.
>>
>> Clearly we just use it differently.
>
> I would gladly trade my convenience in return for making it easier for
> others to diagnose why their PR builds fail.
>
> At the moment, the way our CI/PR builds present test failures very likely
> makes every new contributor feel stupid for not knowing how to proceed.
> But they are not stupid. The failure is not theirs. The fault lies
> squarely with us, for making it so frigging hard.

I agree with you that it's relatively bad & could be improved. I don't
have much issue with the end result we're left with once the page
actually loads at the end of this series, just the practicalities of how
slow the resulting UX is.

>> >> or in the case of the performance issues I think just saying that this
>> >> output is aimed at non-long-time-contributors doesn't really justify the
>> >> large observed slowdowns.
>> >
>> > What good is a quickly-loading web site when it is less than useful?
>>
>> For all the flaws in the current output there are cases now where you
>> can click on a failure, see a summary of the 1-2 tests that failed, and
>> even find your way through the right place in the rather verbose raw log
>> output in 1/4 or 1/2 the time it takes the initial page on the new UX to
>> load.
>
> I wonder where the hard data is that backs up these numbers.

I've posted some in several replies to this series,
e.g. https://lore.kernel.org/git/220222.86tucr6kz5.gmgdl@evledraar.gmail.com/;
Have you tried to reproduce some of those?

I.e. the "hard data" is that it usually takes me 10-20 seconds to go from a
CI link to the summary output & opening the "raw dump" view now, and the
same page is taking about a minute to just load with the new UX.

> [...]
>> > I'd much rather have a slow-loading web site that gets me to where I need
>> > to be than a fast-loading one that leaves me as puzzled as before.
>>
>> I think it's clear that we're going to disagree on this point, but I'd
>> still think that:
>>
>>  * In a re-roll, you should amend these patches to clearly note that's a
>>    UX trade-off you're making, perhaps with rough before/after timings
>>    similar to the ones I've posted.
>>
>>    I.e. now those patches say nothing about the UX change resulting in
>>    UX that's *much* slower than before. Clearly noting that trade-off
>>    for reviewers is not the same as saying the trade-off can't be made.
>
> Sure, I can do that.
>
>>  * I don't see why the changes here can't be made configurable (and
>>    perhaps you'd argue they should be on by default) via the ci-config
>>    phase.
>
> I do see why. If my goal is to unhide the logs by default, so that new
> contributors can find them more easily, I will not hide that new behavior
> behind a hard-to-find config option, an option that new contributors are
> even less likely to find. That would be highly counterproductive. If your
> goal is to help new contributors, I am certain that you will agree.

I meant that they could be on by default, but to relatively easily give
us an out to A/B test the performance of new fancy rendering v.s. the
dumb raw dump we have now.

If that's done via CI config it's a rather trivial matter of
e.g. re-pushing "master" somewhere, whereas if it needs a patch/revert
on top...

* Re: win+VS environment has "cut" but not "paste"?
  2022-03-09 13:02                       ` Johannes Schindelin
@ 2022-03-10 15:23                         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-10 15:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git, Jacob Keller


On Wed, Mar 09 2022, Johannes Schindelin wrote:

> Hi Junio,
>
> On Mon, 7 Mar 2022, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > I said that the current output is only useful to veterans. The output that
>> > hides the detailed log, under a separate job that is marked as
>> > non-failing.
>> >
>> > That's still as true as when I said it. :-)
>>
>> I think getting rid of the separate "print failures" CI step and
>> making it more discoverable how to reveal the details of failing
>> test step is a usability improvement.
>
> I'm so glad you're saying that! I was starting to doubt whether my caring
> about getting rid of that `print failures` step was maybe misguided.

I don't think anyone's been maintaining that getting rid of it wouldn't
be ideal. I for one have just been commenting on issues in the proposed
implementation.

I think we might still want to retain some such steps in the future,
i.e. if we have a failure, have subsequent steps that run on failure()
and emit varying levels of verbosity / raw logs etc., or even try
re-running the test in different ways (e.g. narrow it down with --run).

But the failure step you see when something fails should ideally have
the failure plus the relevant error, just as we do with compile errors.

>> I am not Ævar, but I think what was questioned was the improvement
>> justifies multi dozens of seconds extra waiting time, which is a
>> usability dis-improvement.  I do not have a good answer to that
>> question.
>
> It is definitely worrisome that we have to pay such a price. And if there
> was a good answer how to improve that (without sacrificing the
> discoverability of the command trace corresponding to the test failure), I
> would be more than just eager to hear it.

Isn't the answer to that what I suggested in [1]? I.e. the performance
problem is that we include N lines of the output that *didn't fail*,
and that's what slows down showing the relevant output that *did*
fail.

I.e. if say t3070-wildmatch.sh fails in a couple of tests we'd show a
*lot* of lines between the relevant failing tests, let's just elide the
non-failing ones and show the output for the failing ones specifically.

*Sometimes* (but very rarely) it's relevant to still look at the full
output, since the failure might be due to an earlier silent failure in a
previous test (or the state it left behind), but I think that's rare
enough that the right thing to do is just to stick that in a subsequent
"verbose dump" step or whatever.
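
A minimal sketch of that eliding, assuming the verbose output keeps the
TAP-style "ok N - ..." / "not ok N - ..." result line at the end of each
test case's trace (as in the examples quoted in this thread). The name
only_failures is hypothetical; nothing like it exists in t/ or ci/:

```shell
#!/bin/sh
# Hypothetical helper (not part of git's test suite): read verbose,
# TAP-like test output on stdin and keep only the chunks that belong
# to failing test cases.
only_failures () {
	awk '
		# A passing test or a known breakage: drop the buffered trace.
		/^ok [0-9]+ / || /^not ok [0-9]+ .*# TODO known breakage/ {
			buf = ""
			tail = 0
			next
		}
		# A real failure: flush the buffered trace plus the result line.
		/^not ok [0-9]+ / {
			printf "%s%s\n", buf, $0
			buf = ""
			tail = 1
			next
		}
		# The "#"-prefixed test body that follows a "not ok" line.
		tail && /^#/ {
			print
			next
		}
		# Anything else is trace output of the next test case.
		{
			tail = 0
			buf = buf $0 "\n"
		}
	'
}
```

It would be used as a filter, e.g. "only_failures <some-test.out", where
the file name stands for wherever the verbose output was saved. The full
output would then still belong in a later "verbose dump" step.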

>> But new "non-veteran" users may not share that.  That is something a
>> bit worrying about the new UI.
>
> Indeed. My goal, after all, is to improve the experience of contributors,
> not to make it harder.
>
> Still, given that you currently have to click quite a few times until you
> get to where you need to be, I have my doubts that what this patch series
> does is actually making things slower, measured in terms of the total time
> from seeing a failed build to being able to diagnose the cause by
> inspecting the command trace.

Yes, but wouldn't the "Test Summary Report" in [1] be the best of both
worlds (with some minor changes to adapt it to the GitHub "grouping"
output, perhaps)?

Then you'd always see the specifics of the failing test at the end, if
you had N number of failures you might have a lot of output above that,
but even that we could always tweak with some smart heuristic. I.e. show
verbose "not ok" output if failures <10, if 10..100 elide some for the
raw log, if >100 just print "this is completely screwed" or whatever :)
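
A rough sketch of such a heuristic, using the thresholds mentioned above.
The function name and the three level names are made up for illustration;
a caller would decide what each level actually prints:

```shell
#!/bin/sh
# Hypothetical helper: map the number of failed test cases to how much
# detail the CI log should show for them.
failure_detail_level () {
	failures=$1
	if test "$failures" -lt 10
	then
		# Few failures: show the verbose "not ok" output for each.
		echo verbose
	elif test "$failures" -le 100
	then
		# Many failures: elide some, leave the rest to the raw log.
		echo elided
	else
		# Everything is broken: a one-line summary is enough.
		echo summary
	fi
}
```

E.g. "failure_detail_level 3" prints "verbose" and
"failure_detail_level 250" prints "summary".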

1. https://lore.kernel.org/git/220302.86mti87cj2.gmgdl@evledraar.gmail.com/

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (10 preceding siblings ...)
  2022-03-02 12:22   ` Ævar Arnfjörð Bjarmason
@ 2022-03-25  0:48   ` Victoria Dye
  2022-03-25  9:02     ` Ævar Arnfjörð Bjarmason
  2022-05-21 21:42     ` Johannes Schindelin
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
  12 siblings, 2 replies; 98+ messages in thread
From: Victoria Dye @ 2022-03-25  0:48 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin

Johannes Schindelin via GitGitGadget wrote:
> Changes since v1:
> 
>  * In the patch that removed MAKE_TARGETS, a stale comment about that
>    variable is also removed.
>  * The comment about set -x has been adjusted because it no longer applies
>    as-is.
>  * The commit message of "ci: make it easier to find failed tests' logs in
>    the GitHub workflow" has been adjusted to motivate the improvement
>    better.
> 
> 
> Background
> ==========
> 
> Recent patches intended to help readers figure out CI failures much quicker
> than before. Unfortunately, they haven't been entirely positive for me. For
> example, they broke the branch protections in Microsoft's fork of Git, where
> we require Pull Requests to pass a certain set of Checks (which are
> identified by their names) and therefore caused follow-up work.
> 
> Using CI and in general making it easier for new contributors is an area I'm
> passionate about, and one I'd like to see improved.
> 
> 
> The current situation
> =====================
> 
> Let me walk you through the current experience when a PR build fails: I get
> a notification mail that only says that a certain job failed. There's no
> indication of which test failed (or was it the build?). I can click on a
> link and it takes me to the workflow run. Once there, all it says is "Process
> completed with exit code 1", or even "code 2". Sure, I can click on one of
> the failed jobs. It even expands the failed step's log (collapsing the other
> steps). And what do I see there?
> 
> Let's look at an example of a failed linux-clang (ubuntu-latest) job
> [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:
> 
> [...]
> Test Summary Report
> -------------------
> t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
>   Failed test:  49
>   Non-zero exit status: 1
> t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
>   TODO passed:   45, 47
> Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
> Result: FAIL
> make[1]: *** [Makefile:53: prove] Error 1
> make[1]: Leaving directory '/home/runner/work/git/git/t'
> make: *** [Makefile:3018: test] Error 2
> 
> 
> That's it. I count myself lucky not to be a new contributor being faced with
> something like this.
> 
> Now, since I am active in the Git project for a couple of days or so, I can
> make sense of the "TODO passed" label and know that for the purpose of
> fixing the build failures, I need to ignore this, and that I need to focus
> on the "Failed test" part instead.
> 
> I also know that I do not have to get myself an ubuntu-latest box just to
> reproduce the error, I do not even have to check out the code and run it
> just to learn what that "49" means.
> 
> I know, and I do not expect any new contributor, not even most seasoned
> contributors to know, that I have to patiently collapse the "Run
> ci/run-build-and-tests.sh" job's log, and instead expand the "Run
> ci/print-test-failures.sh" job log (which did not fail and hence does not
> draw any attention to it).
> 

You can count me as one of the ones that, until this series, had no idea
this information existed in the CI logs. And it definitely would have been
helpful in figuring out some of the OS-specific bugs I've run into in the
past. :) 

> I know, and again: I do not expect many others to know this, that I then
> have to click into the "Search logs" box (not the regular web browser's
> search via Ctrl+F!) and type in "not ok" to find the log of the failed test
> case (and this might still be a "known broken" one that is marked via
> test_expect_failure and once again needs to be ignored).
> 
> To be excessively clear: This is not a great experience!
> 
> 
> Improved output
> ===============
> 
> Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
> even showed flaky tests, and trends e.g. how long the test cases ran. When I
> ported Git's CI over to GitHub workflows (to make CI more accessible to new
> contributors), I knew fully well that we would leave this very nice UI
> behind, and I had hoped that we would get something similar back via new,
> community-contributed GitHub Actions that can be used in GitHub workflows.
> However, most likely because we use a home-grown test framework implemented
> in opinionated POSIX shell scripts, that did not happen.
> 
> So I had a look at what standards exist e.g. when testing PowerShell
> modules, in the way of marking up their test output in GitHub workflows, and
> I was not disappointed: GitHub workflows support "grouping" of output lines,
> i.e. marking sections of the output as a group that is then collapsed by
> default and can be expanded. And it is this feature I decided to use in this
> patch series, along with GitHub workflows' commands to display errors or
> notices that are also shown on the summary page of the workflow run. Now, in
> addition to "Process completed with exit code" on the summary page, we also
> read something like:
> 
> ⊗ linux-gcc (ubuntu-latest)
>    failed: t9800.20 submit from detached head
> 
> 
> Even better, this message is a link, and following that, the reader is
> presented with something like this
> [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:
> 
> ⏵ Run ci/run-build-and-tests.sh
> ⏵ CI setup
>   + ln -s /home/runner/none/.prove t/.prove
>   + run_tests=t
>   + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   + group Build make
>   + set +x
> ⏵ Build
> ⏵ Run tests
>   === Failed test: t9800-git-p4-basic ===
> ⏵ ok: t9800.1 start p4d
> ⏵ ok: t9800.2 add p4 files
> ⏵ ok: t9800.3 basic git p4 clone
> ⏵ ok: t9800.4 depot typo error
> ⏵ ok: t9800.5 git p4 clone @all
> ⏵ ok: t9800.6 git p4 sync uninitialized repo
> ⏵ ok: t9800.7 git p4 sync new branch
> ⏵ ok: t9800.8 clone two dirs
> ⏵ ok: t9800.9 clone two dirs, @all
> ⏵ ok: t9800.10 clone two dirs, @all, conflicting files
> ⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
> ⏵ ok: t9800.12 clone using non-numeric revision ranges
> ⏵ ok: t9800.13 clone with date range, excluding some changes
> ⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
> ⏵ ok: t9800.15 exit gracefully for p4 server errors
> ⏵ ok: t9800.16 clone --bare should make a bare repository
> ⏵ ok: t9800.17 initial import time from top change time
> ⏵ ok: t9800.18 unresolvable host in P4PORT should display error
> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>   Error: failed: t9800.20 submit from detached head
> ⏵ failure: t9800.20 submit from detached head 
>   Error: failed: t9800.21 submit from worktree
> ⏵ failure: t9800.21 submit from worktree 
>   === Failed test: t9801-git-p4-branch ===
>   [...]
> 
> 
> The "Failed test:" lines are colored in yellow to give a better visual clue
> about the logs' structure, the "Error:" label is colored in red to draw the
> attention to the important part of the log, and the "⏵" characters indicate
> that part of the log is collapsed and can be expanded by clicking on it.
> 
> To drill down, the reader merely needs to expand the (failed) test case's
> log by clicking on it, and then study the log. If needed (e.g. when the test
> case relies on side effects from previous test cases), the logs of preceding
> test cases can be expanded as well. In this example, when expanding
> t9800.20, it looks like this (for ease of reading, I cut a few chunks of
> lines, indicated by "[...]"):
> 
> [...]
> ⏵ ok: t9800.19 run hook p4-pre-submit before submit
>   Error: failed: t9800.20 submit from detached head
> ⏷ failure: t9800.20 submit from detached head 
>       test_when_finished cleanup_git &&
>       git p4 clone --dest="$git" //depot &&
>         (
>           cd "$git" &&
>           git checkout p4/master &&
>           >detached_head_test &&
>           git add detached_head_test &&
>           git commit -m "add detached_head" &&
>           git config git-p4.skipSubmitEdit true &&
>           git p4 submit &&
>             git p4 rebase &&
>             git log p4/master | grep detached_head
>         )
>     [...]
>     Depot paths: //depot/
>     Import destination: refs/remotes/p4/master
>     
>     Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
>     Perforce db files in '.' will be created if missing...
>     
>     Traceback (most recent call last):
>       File "/home/runner/work/git/git/git-p4", line 4455, in <module>
>         main()
>       File "/home/runner/work/git/git/git-p4", line 4449, in main
>         if not cmd.run(args):
>       File "/home/runner/work/git/git/git-p4", line 2590, in run
>         rebase.rebase()
>       File "/home/runner/work/git/git/git-p4", line 4121, in rebase
>         if len(read_pipe("git diff-index HEAD --")) > 0:
>       File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
>         retcode, out, err = read_pipe_full(c, *k, **kw)
>       File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
>         p = subprocess.Popen(
>       File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
>         self._execute_child(args, executable, preexec_fn, close_fds,
>       File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
>         raise child_exception_type(errno_num, err_msg, err_filename)
>     FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
>     error: last command exited with $?=1
>     + cleanup_git
>     + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + nr_tries_left=60
>     + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + test 1 -ne 1
>     + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + nr_tries_left=60
>     + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
>     + exit 1
>     + eval_ret=1
>     + :
>     not ok 20 - submit from detached head
>     #    
>     #        test_when_finished cleanup_git &&
>     #        git p4 clone --dest="$git" //depot &&
>     #        (
>     #            cd "$git" &&
>     #            git checkout p4/master &&
>     #            >detached_head_test &&
>     #            git add detached_head_test &&
>     #            git commit -m "add detached_head" &&
>     #            git config git-p4.skipSubmitEdit true &&
>     #            git p4 submit &&
>     #            git p4 rebase &&
>     #            git log p4/master | grep detached_head
>     #        )
>     #    
>   Error: failed: t9800.21 submit from worktree
>   [...]
> 
> 
> Is this the best UI we can have for test failures in CI runs? I hope we can
> do better. Having said that, this patch series presents a pretty good start,
> and offers a basis for future improvements.
> 

I think these are really valuable improvements over our current state, but I
also understand the concerns about performance elsewhere in this thread
(it's really slow to load for me as well, and scrolling/expanding the log
groups can be a bit glitchy in my browser). That said, I think there are a
couple ways you could improve the load time without sacrificing the (very
helpful) improvements you've made to error log visibility. I experimented a
bit (example result [1]) and came up with some things that may help:

* group errors by test file, rather than by test case (to reduce
  parsing/rendering time for lots of groups).
* print the verbose logs only for the failed test cases (to massively cut
  down on the size of the log, particularly when there's only a couple
  failures in a test file with a lot of passing tests).
* skip printing the full text of the test in 'finalize_test_case_output'
  when creating the group, i.e., use '$1' instead of '$*' (in both passing
  and failing tests, this information is already printed via some other
  means).

If you wanted to make sure a user could still access the full failure logs
(i.e., including the "ok" test results), you could print a link to the
artifacts page as well - that way, all of the information we currently
provide to users can still be found somewhere.

[1] https://github.com/vdye/git/runs/5666973267
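
To illustrate the first two points, here is a minimal sketch of grouping
per test script. It is not the code from my experiment branch: the function
name and its arguments are hypothetical, and it assumes the verbose output
of the failed test cases has already been collected into a file. The
"::error::", "::group::" and "::endgroup::" lines are the GitHub workflow
commands this series already relies on:

```shell
#!/bin/sh
# Hypothetical helper: emit one collapsible group per failed test script.
# $1 is the test script name, $2 is a file holding the verbose output of
# that script's failed test cases only.
print_failed_script_group () {
	script=$1 log=$2
	# Shown on the summary page of the workflow run.
	echo "::error::failed: $script"
	# One group per script instead of one per test case.
	echo "::group::$script"
	cat "$log"
	echo "::endgroup::"
}
```

With one group per failed script, the number of groups the log viewer has
to render depends on how many scripts failed, not on how many test cases
ran.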

> Johannes Schindelin (9):
>   ci: fix code style
>   ci/run-build-and-tests: take a more high-level view
>   ci: make it easier to find failed tests' logs in the GitHub workflow
>   ci/run-build-and-tests: add some structure to the GitHub workflow
>     output
>   tests: refactor --write-junit-xml code
>   test(junit): avoid line feeds in XML attributes
>   ci: optionally mark up output in the GitHub workflow
>   ci: use `--github-workflow-markup` in the GitHub workflow
>   ci: call `finalize_test_case_output` a little later
> 

The organization of these commits makes the series a bit confusing to read,
mainly due to the JUnit changes in the middle. Patches 5-6 don't appear to
be dependent on patches 1-4, so they could be moved to the beginning (after
patch 1). With that change, I think this series would flow more smoothly:
"Cleanup/non-functional change" -> "JUnit XML improvements" -> "Log UX
improvements".

>  .github/workflows/main.yml           |  12 ---
>  ci/lib.sh                            |  82 +++++++++++++++--
>  ci/run-build-and-tests.sh            |  14 +--
>  ci/run-test-slice.sh                 |   5 +-
>  t/test-lib-functions.sh              |   4 +-
>  t/test-lib-github-workflow-markup.sh |  50 ++++++++++
>  t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
>  t/test-lib.sh                        | 128 ++++----------------------
>  8 files changed, 288 insertions(+), 139 deletions(-)
>  create mode 100644 t/test-lib-github-workflow-markup.sh
>  create mode 100644 t/test-lib-junit.sh
> 
> 
> base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1117
> 
> Range-diff vs v1:
> 
>   1:  db08b07c37a =  1:  db08b07c37a ci: fix code style
>   2:  d2ff51bb5da !  2:  42ff3e170bf ci/run-build-and-tests: take a more high-level view
>      @@ ci/run-build-and-tests.sh: pedantic)
>        	;;
>        esac
>        
>      - # Any new "test" targets should not go after this "make", but should
>      - # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
>      - # start running tests.
>      +-# Any new "test" targets should not go after this "make", but should
>      +-# adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
>      +-# start running tests.
>       -make $MAKE_TARGETS
>       +make
>       +if test -n "$run_tests"
>   3:  98891b0d3f7 !  3:  bbbe1623257 ci: make it easier to find failed tests' logs in the GitHub workflow
>      @@ Metadata
>        ## Commit message ##
>           ci: make it easier to find failed tests' logs in the GitHub workflow
>       
>      +    When investigating a test failure, the time that matters most is the
>      +    time it takes from getting aware of the failure to displaying the output
>      +    of the failing test case.
>      +
>           You currently have to know a lot of implementation details when
>           investigating test failures in the CI runs. The first step is easy: the
>           failed job is marked quite clearly, but when opening it, the failed step
>      @@ Commit message
>           The actually interesting part is in the detailed log of said failed
>           test script. But that log is shown in the CI run's step that runs
>           `ci/print-test-failures.sh`. And that step is _not_ expanded in the web
>      -    UI by default.
>      +    UI by default. It is even marked as "successful", which makes it very
>      +    easy to miss that there is useful information hidden in there.
>       
>           Let's help the reader by showing the failed tests' detailed logs in the
>           step that is expanded automatically, i.e. directly after the test suite
>   4:  9333ba781b8 !  4:  f72254a9ac6 ci/run-build-and-tests: add some structure to the GitHub workflow output
>      @@ ci/lib.sh
>       +
>       +# Set 'exit on error' for all CI scripts to let the caller know that
>       +# something went wrong.
>      -+# Set tracing executed commands, primarily setting environment variables
>      -+# and installing dependencies.
>      ++#
>      ++# We already enabled tracing executed commands earlier. This helps by showing
>      ++# how # environment variables are set and and dependencies are installed.
>       +set -e
>       +
>        skip_branch_tip_with_tag () {
>      @@ ci/lib.sh: linux-leaks)
>       +set -x
>       
>        ## ci/run-build-and-tests.sh ##
>      -@@ ci/run-build-and-tests.sh: esac
>      - # Any new "test" targets should not go after this "make", but should
>      - # adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
>      - # start running tests.
>      +@@ ci/run-build-and-tests.sh: pedantic)
>      + 	;;
>      + esac
>      + 
>       -make
>       +group Build make
>        if test -n "$run_tests"
>   5:  94dcbe1bc43 =  5:  9eda6574313 tests: refactor --write-junit-xml code
>   6:  41230100091 =  6:  c8b240af749 test(junit): avoid line feeds in XML attributes
>   7:  98b32630fcd =  7:  15f199e810e ci: optionally mark up output in the GitHub workflow
>   8:  1a6bd1846bc =  8:  91ea54f36c5 ci: use `--github-workflow-markup` in the GitHub workflow
>   9:  992b1575889 =  9:  be2a83f5da3 ci: call `finalize_test_case_output` a little later
> 


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-25  0:48   ` Victoria Dye
@ 2022-03-25  9:02     ` Ævar Arnfjörð Bjarmason
  2022-03-25 18:38       ` Victoria Dye
  2022-05-21 21:42     ` Johannes Schindelin
  1 sibling, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-25  9:02 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Johannes Schindelin


On Thu, Mar 24 2022, Victoria Dye wrote:

> Johannes Schindelin via GitGitGadget wrote:
> [...]
>> Is this the best UI we can have for test failures in CI runs? I hope we can
>> do better. Having said that, this patch series presents a pretty good start,
>> and offers a basis for future improvements.
>> 
>
> I think these are really valuable improvements over our current state, but I
> also understand the concerns about performance elsewhere in this thread
> (it's really slow to load for me as well, and scrolling/expanding the log
> groups can be a bit glitchy in my browser). That said, I think there are a
> couple ways you could improve the load time without sacrificing the (very
> helpful) improvements you've made to error log visibility. I experimented a
> bit (example result [1]) and came up with some things that may help:
>
> * group errors by test file, rather than by test case (to reduce
>   parsing/rendering time for lots of groups).
> * print the verbose logs only for the failed test cases (to massively cut
>   down on the size of the log, particularly when there's only a couple
>   failures in a test file with a lot of passing tests).
> * skip printing the full text of the test in 'finalize_test_case_output'
>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>   and failing tests, this information is already printed via some other
>   means).
>
> If you wanted to make sure a user could still access the full failure logs
> (i.e., including the "ok" test results), you could print a link to the
> artifacts page as well - that way, all of the information we currently
> provide to users can still be found somewhere.
>
> [1] https://github.com/vdye/git/runs/5666973267

Thanks a lot for trying to address those concerns.

I took a look at this and it definitely performs better, although in
this case the overall output is ~3k lines.

I'd be curious to see how it performs on some of the cases discussed in
earlier threads of >~50k lines, although it looks like in this case that
would require failures to be really widespread in the test suite.

I just looked at this briefly, but looking at the branch I see you
removed the "checking known breakage of[...]" etc. from the non-GitHub
markup output; I didn't spot how that was related/needed.

>> Johannes Schindelin (9):
>>   ci: fix code style
>>   ci/run-build-and-tests: take a more high-level view
>>   ci: make it easier to find failed tests' logs in the GitHub workflow
>>   ci/run-build-and-tests: add some structure to the GitHub workflow
>>     output
>>   tests: refactor --write-junit-xml code
>>   test(junit): avoid line feeds in XML attributes
>>   ci: optionally mark up output in the GitHub workflow
>>   ci: use `--github-workflow-markup` in the GitHub workflow
>>   ci: call `finalize_test_case_output` a little later
>> 
>
> The organization of these commits makes the series a bit confusing to read,
> mainly due to the JUnit changes in the middle. Patches 5-6 don't appear to
> be dependent on patches 1-4, so they could be moved to the beginning (after
> patch 1). With that change, I think this series would flow more smoothly:
> "Cleanup/non-functional change" -> "JUnit XML improvements" -> "Log UX
> improvements".

Have you had a chance to look at the approach taken by my suggested
alternative to the early part of this series?
https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@gmail.com/

I.e. to not build up ci/lib.sh to know to group the "build" etc. within
the "run-build-and-tests" step, but instead just to pull those to the
top-level by running separate build & test steps.


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-25  9:02     ` Ævar Arnfjörð Bjarmason
@ 2022-03-25 18:38       ` Victoria Dye
  0 siblings, 0 replies; 98+ messages in thread
From: Victoria Dye @ 2022-03-25 18:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Mar 24 2022, Victoria Dye wrote:
> 
>> Johannes Schindelin via GitGitGadget wrote:
>> [...]
>>> Is this the best UI we can have for test failures in CI runs? I hope we can
>>> do better. Having said that, this patch series presents a pretty good start,
>>> and offers a basis for future improvements.
>>>
>>
>> I think these are really valuable improvements over our current state, but I
>> also understand the concerns about performance elsewhere in this thread
>> (it's really slow to load for me as well, and scrolling/expanding the log
>> groups can be a bit glitchy in my browser). That said, I think there are a
>> couple ways you could improve the load time without sacrificing the (very
>> helpful) improvements you've made to error log visibility. I experimented a
>> bit (example result [1]) and came up with some things that may help:
>>
>> * group errors by test file, rather than by test case (to reduce
>>   parsing/rendering time for lots of groups).
>> * print the verbose logs only for the failed test cases (to massively cut
>>   down on the size of the log, particularly when there's only a couple
>>   failures in a test file with a lot of passing tests).
>> * skip printing the full text of the test in 'finalize_test_case_output'
>>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>>   and failing tests, this information is already printed via some other
>>   means).
>>
>> If you wanted to make sure a user could still access the full failure logs
>> (i.e., including the "ok" test results), you could print a link to the
>> artifacts page as well - that way, all of the information we currently
>> provide to users can still be found somewhere.
>>
>> [1] https://github.com/vdye/git/runs/5666973267
> 
> Thanks a lot for trying to address those concerns.
> 
> I took a look at this and it definitely performs better, although in
> this case the overall output is ~3k lines.
> 
> I'd be curious to see how it performs on some of the cases discussed in
> earlier threads of >~50k lines, although it looks like in this case that
> would require failures to be really widespread in the test suite.
> 

Unfortunately, I don't have a direct comparison to that (the longest I found
elsewhere in the thread was ~33k lines [1], but those failures came from
strange interactions on the 'shears/seen' branch of Git for Windows that I
couldn't easily replicate). If it helps, though, here's a 1:1 comparison of
my "experiment" branch's forced test failures with and without the
optimizations I tried (without optimization, the total log is ~28k lines):

without optimization: https://github.com/vdye/git/runs/5696305589
with optimization:    https://github.com/vdye/git/runs/5666973267

So it's definitely faster - it still takes a couple seconds to load, but not
so long that my browser struggles with it (which was my main issue with the
original approach).

[1] https://github.com/dscho/git/runs/4840190622

> I just looked at this briefly, but looking at the branch I see you
> removed the "checking known breakage of[...]" etc. from the non-GitHub
> markdown output, I didn't spot how that was related/needed.
> 

It was mostly just another attempt to cut down on extraneous output (since,
if a test fails, the test definition is printed after the failure, so we
would end up with the same information twice). 

That said, if that were to be incorporated here, it'd need to be smarter
than what I tried - my change removed it entirely from the '.out' logs, and
it means that any test that *does* pass wouldn't have its test definition
logged anywhere (I think). The ideal situation would be the extraneous test
definition is only removed from the '.markup' files, but that's probably a
change better saved for a future patch/series.

>>> Johannes Schindelin (9):
>>>   ci: fix code style
>>>   ci/run-build-and-tests: take a more high-level view
>>>   ci: make it easier to find failed tests' logs in the GitHub workflow
>>>   ci/run-build-and-tests: add some structure to the GitHub workflow
>>>     output
>>>   tests: refactor --write-junit-xml code
>>>   test(junit): avoid line feeds in XML attributes
>>>   ci: optionally mark up output in the GitHub workflow
>>>   ci: use `--github-workflow-markup` in the GitHub workflow
>>>   ci: call `finalize_test_case_output` a little later
>>>
>>
>> The organization of these commits makes the series a bit confusing to read,
>> mainly due to the JUnit changes in the middle. Patches 5-6 don't appear to
>> be dependent on patches 1-4, so they could be moved to the beginning (after
>> patch 1). With that change, I think this series would flow more smoothly:
>> "Cleanup/non-functional change" -> "JUnit XML improvements" -> "Log UX
>> improvements".
> 
> Have you had a change to look at the approach my suggestion of an
> alternate approach to the early part of this series takes?:
> https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@gmail.com/
> 
> I.e. to not build up ci/lib.sh to know to group the "build" etc. within
> the "run-build-and-test" step, but instead just to pull those to the
> top-level by running separate build & test steps.
> 

I looked at it a while ago, but I actually had a similar issue following
that series as I did this one; it's difficult to tell what's cleanup, what's
refactoring unrelated to this series, and what's an explicit difference in
approach compared with this series. 

Revisiting it now, I did the same thing I did with dscho's series: ran your
branch with some forced test failures and looked at the results [2]. Based
on that, there are a couple of helpful things I see in your series that
contribute to the same overarching goal as dscho's:

* Separating build & test into different steps.
    * This makes it more immediately obvious to a user whether the issue was
      a compiler error or a test failure. Since test failures can only ever
      happen if the compilation passes, this doesn't create (another)
      situation where the relevant failure information is in a different
      step than the auto-expanded failing one.
* Separating 'lib.sh --build' and 'make' into different steps. 
    * I was initially unsure of the value of this (conceptually, wouldn't
      they both be part of "build"?), but I eventually understood it to be
      "setup the environment for [build|test]" followed by "run the
      [build|test]". Since the main thing dscho's series is addressing is
      information visibility, I like that this similarly "unburies" the
      environment configuration at the beginning of build/test.

Those changes are great (and they probably have some positive impact on load
times). But as far as I can tell, nothing else in your series directly
addresses the main problem dscho is fixing here, which is that the verbose
failure logs are effectively hidden from the user (unless you know exactly
where to look). As a result, it doesn't really fit as a "replacement" to
this one for me. Honestly, my ideal "final form" of all of this may be a
combination of both series, having the CI steps:

- setup build environment
- run build (make)
- setup test environment
- run test (make test) & print failure logs

You can still pull the 'make' executions out of 'run-build-and-test.sh', but
I think the "& print failure logs" part of the 'test' step (i.e., the added
'|| handle_failed_tests') is the critical piece. Although it would slow
things down to some extent (and, of course, it's subjective where the "too
slow" line is), it would make relevant failure information a whole lot more
accessible. That's the real "value-add" of this series for me, if only
because I know it would have helped me a bunch of times in the past - I
absolutely believe it would similarly help new contributors in the future.
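
As a rough sketch of what a "run test & print failure logs" step could look
like: this is not the code from either series. The function bodies, the
"results_dir" argument and the "<name>.exit" / "<name>.out" file layout are
assumptions made purely for illustration; only the name
'handle_failed_tests' and the '|| handle_failed_tests' pattern come from the
discussion above.

```shell
#!/bin/sh
# Hypothetical sketch: "results_dir" and the "<name>.exit" / "<name>.out"
# file layout are assumptions made for this example.

handle_failed_tests () {
	results_dir=$1
	for exit_file in "$results_dir"/*.exit
	do
		# Skip the unexpanded glob when there are no results at all.
		test -f "$exit_file" || continue
		# Only tests with a non-zero recorded exit code are printed.
		test "$(cat "$exit_file")" != 0 || continue
		name=$(basename "$exit_file" .exit)
		echo "=== Failed test: $name ==="
		cat "$results_dir/$name.out"
	done
	# Keep the step failing after the logs were printed.
	return 1
}

run_test_step () {
	# First argument: results directory; the rest is the test command,
	# e.g. `make test`.
	results_dir=$1
	shift
	"$@" || handle_failed_tests "$results_dir"
}
```

Under these assumptions, calling `run_test_step t/test-results make test`
would print the failed tests' logs in the same step that fails, which is the
step that gets expanded automatically.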

[2] https://github.com/vdye/git/runs/5695895629


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-03-25  0:48   ` Victoria Dye
  2022-03-25  9:02     ` Ævar Arnfjörð Bjarmason
@ 2022-05-21 21:42     ` Johannes Schindelin
  2022-05-21 23:05       ` Junio C Hamano
  1 sibling, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-05-21 21:42 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Hi Victoria,

[sorry for the long delay, I was being pulled in too many directions,
but am incredibly grateful for your fantastic investigation.]

On Thu, 24 Mar 2022, Victoria Dye wrote:

> Johannes Schindelin via GitGitGadget wrote:
> > Changes since v1:
> >
> >  * In the patch that removed MAKE_TARGETS, a stale comment about that
> >    variable is also removed.
> >  * The comment about set -x has been adjusted because it no longer applies
> >    as-is.
> >  * The commit message of "ci: make it easier to find failed tests' logs in
> >    the GitHub workflow" has been adjusted to motivate the improvement
> >    better.
> >
> >
> > Background
> > ==========
> >
> > Recent patches intended to help readers figure out CI failures much quicker
> > than before. Unfortunately, they haven't been entirely positive for me. For
> > example, they broke the branch protections in Microsoft's fork of Git, where
> > we require Pull Requests to pass a certain set of Checks (which are
> > identified by their names) and therefore caused follow-up work.
> >
> > Using CI and in general making it easier for new contributors is an area I'm
> > passionate about, and one I'd like to see improved.
> >
> >
> > The current situation
> > =====================
> >
> > Let me walk you through the current experience when a PR build fails: I get
> > a notification mail that only says that a certain job failed. There's no
> > indication of which test failed (or was it the build?). I can click on a
> > link at it takes me to the workflow run. Once there, all it says is "Process
> > completed with exit code 1", or even "code 2". Sure, I can click on one of
> > the failed jobs. It even expands the failed step's log (collapsing the other
> > steps). And what do I see there?
> >
> > Let's look at an example of a failed linux-clang (ubuntu-latest) job
> > [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:
> >
> > [...]
> > Test Summary Report
> > -------------------
> > t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
> >   Failed test:  49
> >   Non-zero exit status: 1
> > t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
> >   TODO passed:   45, 47
> > Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
> > Result: FAIL
> > make[1]: *** [Makefile:53: prove] Error 1
> > make[1]: Leaving directory '/home/runner/work/git/git/t'
> > make: *** [Makefile:3018: test] Error 2
> >
> >
> > That's it. I count myself lucky not to be a new contributor being faced with
> > something like this.
> >
> > Now, since I am active in the Git project for a couple of days or so, I can
> > make sense of the "TODO passed" label and know that for the purpose of
> > fixing the build failures, I need to ignore this, and that I need to focus
> > on the "Failed test" part instead.
> >
> > I also know that I do not have to get myself an ubuntu-latest box just to
> > reproduce the error, I do not even have to check out the code and run it
> > just to learn what that "49" means.
> >
> > I know, and I do not expect any new contributor, not even most seasoned
> > contributors to know, that I have to patiently collapse the "Run
> > ci/run-build-and-tests.sh" job's log, and instead expand the "Run
> > ci/print-test-failures.sh" job log (which did not fail and hence does not
> > draw any attention to it).
> >
>
> You can count me as one of the ones that, until this series, had no idea
> this information existed in the CI logs. And it definitely would have been
> helpful in figuring out some of the OS-specific bugs I've run into in the
> past. :)

Excellent. I think I have enough accounts to consider it a universal truth
that our CI output _needs_ something like this here patch series.

> > I know, and again: I do not expect many others to know this, that I then
> > have to click into the "Search logs" box (not the regular web browser's
> > search via Ctrl+F!) and type in "not ok" to find the log of the failed test
> > case (and this might still be a "known broken" one that is marked via
> > test_expect_failure and once again needs to be ignored).
> >
> > To be excessively clear: This is not a great experience!
> >
> >
> > Improved output
> > ===============
> >
> > Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
> > even showed flaky tests, and trends e.g. how long the test cases ran. When I
> > ported Git's CI over to GitHub workflows (to make CI more accessible to new
> > contributors), I knew fully well that we would leave this very nice UI
> > behind, and I had hoped that we would get something similar back via new,
> > community-contributed GitHub Actions that can be used in GitHub workflows.
> > However, most likely because we use a home-grown test framework implemented
> > in opinionated POSIX shells scripts, that did not happen.
> >
> > So I had a look at what standards exist e.g. when testing PowerShell
> > modules, in the way of marking up their test output in GitHub workflows, and
> > I was not disappointed: GitHub workflows support "grouping" of output lines,
> > i.e. marking sections of the output as a group that is then collapsed by
> > default and can be expanded. And it is this feature I decided to use in this
> > patch series, along with GitHub workflows' commands to display errors or
> > notices that are also shown on the summary page of the workflow run. Now, in
> > addition to "Process completed with exit code" on the summary page, we also
> > read something like:
> >
> > ⊗ linux-gcc (ubuntu-latest)
> >    failed: t9800.20 submit from detached head
> >
> >
> > Even better, this message is a link, and following that, the reader is
> > presented with something like this
> > [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]:
> >
> > ⏵ Run ci/run-build-and-tests.sh
> > ⏵ CI setup
> >   + ln -s /home/runner/none/.prove t/.prove
> >   + run_tests=t
> >   + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> >   + group Build make
> >   + set +x
> > ⏵ Build
> > ⏵ Run tests
> >   === Failed test: t9800-git-p4-basic ===
> > ⏵ ok: t9800.1 start p4d
> > ⏵ ok: t9800.2 add p4 files
> > ⏵ ok: t9800.3 basic git p4 clone
> > ⏵ ok: t9800.4 depot typo error
> > ⏵ ok: t9800.5 git p4 clone @all
> > ⏵ ok: t9800.6 git p4 sync uninitialized repo
> > ⏵ ok: t9800.7 git p4 sync new branch
> > ⏵ ok: t9800.8 clone two dirs
> > ⏵ ok: t9800.9 clone two dirs, @all
> > ⏵ ok: t9800.10 clone two dirs, @all, conflicting files
> > ⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit
> > ⏵ ok: t9800.12 clone using non-numeric revision ranges
> > ⏵ ok: t9800.13 clone with date range, excluding some changes
> > ⏵ ok: t9800.14 exit when p4 fails to produce marshaled output
> > ⏵ ok: t9800.15 exit gracefully for p4 server errors
> > ⏵ ok: t9800.16 clone --bare should make a bare repository
> > ⏵ ok: t9800.17 initial import time from top change time
> > ⏵ ok: t9800.18 unresolvable host in P4PORT should display error
> > ⏵ ok: t9800.19 run hook p4-pre-submit before submit
> >   Error: failed: t9800.20 submit from detached head
> > ⏵ failure: t9800.20 submit from detached head
> >   Error: failed: t9800.21 submit from worktree
> > ⏵ failure: t9800.21 submit from worktree
> >   === Failed test: t9801-git-p4-branch ===
> >   [...]
> >
> >
> > The "Failed test:" lines are colored in yellow to give a better visual clue
> > about the logs' structure, the "Error:" label is colored in red to draw the
> > attention to the important part of the log, and the "⏵" characters indicate
> > that part of the log is collapsed and can be expanded by clicking on it.
> >
> > To drill down, the reader merely needs to expand the (failed) test case's
> > log by clicking on it, and then study the log. If needed (e.g. when the test
> > case relies on side effects from previous test cases), the logs of preceding
> > test cases can be expanded as well. In this example, when expanding
> > t9800.20, it looks like this (for ease of reading, I cut a few chunks of
> > lines, indicated by "[...]"):
> >
> > [...]
> > ⏵ ok: t9800.19 run hook p4-pre-submit before submit
> >   Error: failed: t9800.20 submit from detached head
> > ⏷ failure: t9800.20 submit from detached head
> >       test_when_finished cleanup_git &&
> >       git p4 clone --dest="$git" //depot &&
> >         (
> >           cd "$git" &&
> >           git checkout p4/master &&
> >           >detached_head_test &&
> >           git add detached_head_test &&
> >           git commit -m "add detached_head" &&
> >           git config git-p4.skipSubmitEdit true &&
> >           git p4 submit &&
> >             git p4 rebase &&
> >             git log p4/master | grep detached_head
> >         )
> >     [...]
> >     Depot paths: //depot/
> >     Import destination: refs/remotes/p4/master
> >
> >     Importing revision 9 (100%)Perforce db files in '.' will be created if missing...
> >     Perforce db files in '.' will be created if missing...
> >
> >     Traceback (most recent call last):
> >       File "/home/runner/work/git/git/git-p4", line 4455, in <module>
> >         main()
> >       File "/home/runner/work/git/git/git-p4", line 4449, in main
> >         if not cmd.run(args):
> >       File "/home/runner/work/git/git/git-p4", line 2590, in run
> >         rebase.rebase()
> >       File "/home/runner/work/git/git/git-p4", line 4121, in rebase
> >         if len(read_pipe("git diff-index HEAD --")) > 0:
> >       File "/home/runner/work/git/git/git-p4", line 297, in read_pipe
> >         retcode, out, err = read_pipe_full(c, *k, **kw)
> >       File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full
> >         p = subprocess.Popen(
> >       File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
> >         self._execute_child(args, executable, preexec_fn, close_fds,
> >       File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
> >         raise child_exception_type(errno_num, err_msg, err_filename)
> >     FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --'
> >     error: last command exited with $?=1
> >     + cleanup_git
> >     + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + nr_tries_left=60
> >     + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + test 1 -ne 1
> >     + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + nr_tries_left=60
> >     + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git
> >     + exit 1
> >     + eval_ret=1
> >     + :
> >     not ok 20 - submit from detached head
> >     #
> >     #        test_when_finished cleanup_git &&
> >     #        git p4 clone --dest="$git" //depot &&
> >     #        (
> >     #            cd "$git" &&
> >     #            git checkout p4/master &&
> >     #            >detached_head_test &&
> >     #            git add detached_head_test &&
> >     #            git commit -m "add detached_head" &&
> >     #            git config git-p4.skipSubmitEdit true &&
> >     #            git p4 submit &&
> >     #            git p4 rebase &&
> >     #            git log p4/master | grep detached_head
> >     #        )
> >     #
> >   Error: failed: t9800.21 submit from worktree
> >   [...]
> >
> >
> > Is this the best UI we can have for test failures in CI runs? I hope we can
> > do better. Having said that, this patch series presents a pretty good start,
> > and offers a basis for future improvements.
> >
>
> I think these are really valuable improvements over our current state, but I
> also understand the concerns about performance elsewhere in this thread
> (it's really slow to load for me as well, and scrolling/expanding the log
> groups can be a bit glitchy in my browser). That said, I think there are a
> couple ways you could improve the load time without sacrificing the (very
> helpful) improvements you've made to error log visibility. I experimented a
> bit (example result [1]) and came up with some things that may help:
>
> * group errors by test file, rather than by test case (to reduce
>   parsing/rendering time for lots of groups).

I really would like to avoid that, based on my past experience with
diagnosing test failures. It is definitely helpful if the structure lets
the reader expand individual test cases.

> * print the verbose logs only for the failed test cases (to massively cut
>   down on the size of the log, particularly when there's only a couple
>   failures in a test file with a lot of passing tests).

That's an amazingly simple trick to improve the speed by a ton, indeed.
Thank you for this splendid idea!

> * skip printing the full text of the test in 'finalize_test_case_output'
>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>   and failing tests, this information is already printed via some other
>   means).
>
> If you wanted to make sure a user could still access the full failure logs
> (i.e., including the "ok" test results), you could print a link to the
> artifacts page as well - that way, all of the information we currently
> provide to users can still be found somewhere.

That's a good point, I added that hint to the output (the link is
unfortunately not available at the time we print that advice).

>
> [1] https://github.com/vdye/git/runs/5666973267
>
> > Johannes Schindelin (9):
> >   ci: fix code style
> >   ci/run-build-and-tests: take a more high-level view
> >   ci: make it easier to find failed tests' logs in the GitHub workflow
> >   ci/run-build-and-tests: add some structure to the GitHub workflow
> >     output
> >   tests: refactor --write-junit-xml code
> >   test(junit): avoid line feeds in XML attributes
> >   ci: optionally mark up output in the GitHub workflow
> >   ci: use `--github-workflow-markup` in the GitHub workflow
> >   ci: call `finalize_test_case_output` a little later
> >
>
> The organization of these commits makes the series a bit confusing to read,
> mainly due to the JUnit changes in the middle. Patches 5-6 don't appear to
> be dependent on patches 1-4, so they could be moved to the beginning (after
> patch 1). With that change, I think this series would flow more smoothly:
> "Cleanup/non-functional change" -> "JUnit XML improvements" -> "Log UX
> improvements".

Great feedback! I changed the order as suggested.

Again, thank you so much for helping me improve the user experience of
Git's CI/PR builds.

Ciao,
Dscho


* [PATCH v3 00/12] ci: make Git's GitHub workflow output much more helpful
  2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
                     ` (11 preceding siblings ...)
  2022-03-25  0:48   ` Victoria Dye
@ 2022-05-21 22:18   ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 01/12] ci: fix code style Johannes Schindelin via GitGitGadget
                       ` (11 more replies)
  12 siblings, 12 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin

Changes since v2:

 * Logs for successful test cases are no longer shown, which improves the
   time to load pages (thanks Victoria!).
 * The preamble for each test case is no longer shown twice (thanks
   Victoria!).
 * We now explicitly mention where the full logs can be found.
 * Some patches were reordered to make the story line of this patch series
   more coherent.
 * Rebased onto main to resolve merge conflicts with
   ab/test-tap-fix-for-immediate.

I cannot thank Victoria enough for the thorough investigation; it was
exactly what I had hoped for, and if I had not been pulled in too many
directions at once, I would have incorporated her suggestions and provided a
new iteration much earlier.

It might not be all bad that this iteration had to wait a little longer,
though: In the meantime, the errors on the summary page are now deep-linked
into the part of the logs where the corresponding error message was
generated (just click on the job name above the error message).

Note: I tried to add another patch that would turn GCC's compile errors into
GitHub workflow commands
[https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions]
that would list the error messages on the summary page. However, that would
have required piping the output of make through a sed call, which in turn
would have required set -o pipefail (which is not supported by all the
shells that are used in our CI). I even dabbled with using process
substitution, but that made things even worse: the sed process would
continue outputting after make was finished and after the ::endgroup::
command, meaning that the output was garbled. I'll probably continue
investigating at some stage, but for now I'll call my time-boxed experiment
a wash.
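
To illustrate the exit status problem mentioned above, here is a minimal
sketch. It is not the abandoned patch: 'fake_make', 'annotate_errors', the
sample error line and the sed expression are all made up for this
illustration. Only the `::error::` prefix is a GitHub workflow command.

```shell
#!/bin/sh
# Stand-in for a failing `make`: prints one GCC-style error line and fails.
# (The function name and the message are made up for this illustration.)
fake_make () {
	echo "hello.c:3:1: error: hypothetical compile error"
	return 2
}

# Prefix lines that look like compiler errors with the `::error::`
# workflow command; all other lines pass through unchanged.
annotate_errors () {
	sed 's/^\(.*: error: .*\)$/::error::\1/'
}

# Without `set -o pipefail`, `$?` reflects the last command in the
# pipeline (`sed`), so the failure of `fake_make` is lost here.
fake_make | annotate_errors
echo "pipeline exit status: $?"
```

In a shell without `pipefail`, the status reported on the last line is that
of `sed`, which is why a plain `make | sed` would let a failed build pass
unnoticed.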

Changes since v1:

 * In the patch that removed MAKE_TARGETS, a stale comment about that
   variable is also removed.
 * The comment about set -x has been adjusted because it no longer applies
   as-is.
 * The commit message of "ci: make it easier to find failed tests' logs in
   the GitHub workflow" has been adjusted to motivate the improvement
   better.


Background
==========

Using CI and in general making it easier for new contributors is an area I'm
passionate about, and one I'd like to see improved.


The current situation
=====================

Let me walk you through the current experience when a PR build fails: I get
a notification mail that only says that a certain job failed. There's no
indication of which test failed (or was it the build?). I can click on a
link and it takes me to the workflow run. Once there, all it says is "Process
completed with exit code 1", or even "code 2". Sure, I can click on one of
the failed jobs. It even expands the failed step's log (collapsing the other
steps). And what do I see there?

Let's look at an example of a failed linux-clang (ubuntu-latest) job
[https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]:

[...]
Test Summary Report
-------------------
t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
  Failed test:  49
  Non-zero exit status: 1
t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
  TODO passed:   45, 47
Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
Result: FAIL
make[1]: *** [Makefile:53: prove] Error 1
make[1]: Leaving directory '/home/runner/work/git/git/t'
make: *** [Makefile:3018: test] Error 2


That's it. I count myself lucky not to be a new contributor being faced with
something like this.

Now, since I have been active in the Git project for a couple of days or so, I can
make sense of the "TODO passed" label and know that for the purpose of
fixing the build failures, I need to ignore this, and that I need to focus
on the "Failed test" part instead.

I also know that I do not have to get myself an ubuntu-latest box just to
reproduce the error, I do not even have to check out the code and run it
just to learn what that "49" means.

I know, and I do not expect any new contributor, not even most seasoned
contributors to know, that I have to patiently collapse the "Run
ci/run-build-and-tests.sh" job's log, and instead expand the "Run
ci/print-test-failures.sh" job log (which did not fail and hence does not
draw any attention to it).

I know, and again: I do not expect many others to know this, that I then
have to click into the "Search logs" box (not the regular web browser's
search via Ctrl+F!) and type in "not ok" to find the log of the failed test
case (and this might still be a "known broken" one that is marked via
test_expect_failure and once again needs to be ignored).

To be excessively clear: This is not a great experience!


Improved output
===============

Our previous Azure Pipelines-based CI builds had a much nicer UI, one that
even showed flaky tests, and trends e.g. how long the test cases ran. When I
ported Git's CI over to GitHub workflows (to make CI more accessible to new
contributors), I knew full well that we would leave this very nice UI
behind, and I had hoped that we would get something similar back via new,
community-contributed GitHub Actions that can be used in GitHub workflows.
However, most likely because we use a home-grown test framework implemented
in opinionated POSIX shell scripts, that did not happen.

So I had a look at what standards exist e.g. when testing PowerShell
modules, in the way of marking up their test output in GitHub workflows, and
I was not disappointed: GitHub workflows support "grouping" of output lines,
i.e. marking sections of the output as a group that is then collapsed by
default and can be expanded. And it is this feature I've decided to use in
this patch series, along with GitHub workflows' commands to display errors
or notices that are also shown on the summary page of the workflow run. Now,
in addition to "Process completed with exit code" on the summary page, we
also read something like:

⊗ linux-clang (ubuntu-latest)
   failed: t3400.22 rebase --apply -q is quiet


Even better, this message is a link, and following that, the reader is
presented with something like this
[https://github.com/dscho/git/runs/6539591442?check_suite_focus=true#step:4:2954]:

[...]
=== Failed test: t3420-rebase-autostash ===
The full logs are in the artifacts attached to this run.
Error: failed: t3420.12 rebase --apply: --quit
⏵ failure: t3420.12 rebase --apply: --quit 
Error: failed: t3420.13 rebase --apply: non-conflicting rebase, conflicting stash
⏵ failure: t3420.13 rebase --apply: non-conflicting rebase, conflicting stash 
Error: failed: t3420.14 rebase --apply: check output with conflicting stash
⏵ failure: t3420.14 rebase --apply: check output with conflicting stash 
Error: failed: t3420.23 rebase --merge: --quit
⏵ failure: t3420.23 rebase --merge: --quit 
Error: failed: t3420.24 rebase --merge: non-conflicting rebase, conflicting stash
⏵ failure: t3420.24 rebase --merge: non-conflicting rebase, conflicting stash 
Error: failed: t3420.25 rebase --merge: check output with conflicting stash
⏵ failure: t3420.25 rebase --merge: check output with conflicting stash 
Error: failed: t3420.34 rebase --interactive: --quit
⏵ failure: t3420.34 rebase --interactive: --quit 
Error: failed: t3420.35 rebase --interactive: non-conflicting rebase, conflicting stash
⏵ failure: t3420.35 rebase --interactive: non-conflicting rebase, conflicting stash 
Error: failed: t3420.36 rebase --interactive: check output with conflicting stash
⏵ failure: t3420.36 rebase --interactive: check output with conflicting stash 
Error: failed: t3420.39 autostash is saved on editor failure with conflict
⏵ failure: t3420.39 autostash is saved on editor failure with conflict 
[...]


The "Failed test:" lines are colored in yellow to give a better visual clue
about the logs' structure, the "Error:" label is colored in red to draw the
attention to the important part of the log, and the "⏵" characters indicate
that part of the log is collapsed and can be expanded by clicking on it.
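
For readers unfamiliar with the workflow commands behind this markup, here
is a minimal sketch of how such output can be produced. It is not the code
from this patch series: the helper names 'begin_group', 'end_group' and
'report_failure' and the log file argument are hypothetical. Only the
`::group::`, `::endgroup::` and `::error::` strings are GitHub workflow
commands.

```shell
#!/bin/sh
# Hypothetical helpers; only the `::group::`, `::endgroup::` and `::error::`
# strings are GitHub workflow commands, everything else is made up.

begin_group () {
	# Everything printed until the matching `::endgroup::` is shown
	# as one collapsible section titled "$*".
	echo "::group::$*"
}

end_group () {
	echo "::endgroup::"
}

# Print an error annotation, then the test case's log as a collapsed group.
report_failure () {
	title=$1
	log_file=$2
	echo "::error::failed: $title"
	begin_group "failure: $title"
	cat "$log_file"
	end_group
}
```

A call such as `report_failure "t3420.12 rebase --apply: --quit" some.log`
would emit one "Error:" line followed by one collapsible group, matching the
shape of the excerpt above.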

To drill down, the reader merely needs to expand the test case's log by
clicking on it, and then study the log. If needed (e.g. when the test case
relies on side effects from previous test cases), the logs of preceding test
cases can be expanded as well. In case the full logs are needed, including
those of the successful test cases, they are included in the artifacts that
are attached to the CI/PR run.

Is this the best UI we can have for test failures in CI runs? I hope we can
do better. Having said that, this patch series presents a pretty good start,
and offers a basis for future improvements.

Johannes Schindelin (11):
  ci: fix code style
  tests: refactor --write-junit-xml code
  test(junit): avoid line feeds in XML attributes
  ci/run-build-and-tests: take a more high-level view
  ci: make it easier to find failed tests' logs in the GitHub workflow
  ci/run-build-and-tests: add some structure to the GitHub workflow
    output
  ci: optionally mark up output in the GitHub workflow
  ci(github): skip the logs of the successful test cases
  ci: use `--github-workflow-markup` in the GitHub workflow
  ci(github): mention where the full logs can be found
  ci: call `finalize_test_case_output` a little later

Victoria Dye (1):
  ci(github): avoid printing test case preamble twice

 .github/workflows/main.yml           |  12 ---
 ci/lib.sh                            |  83 +++++++++++++++--
 ci/run-build-and-tests.sh            |  14 +--
 ci/run-test-slice.sh                 |   5 +-
 t/test-lib-functions.sh              |   6 +-
 t/test-lib-github-workflow-markup.sh |  56 ++++++++++++
 t/test-lib-junit.sh                  | 132 +++++++++++++++++++++++++++
 t/test-lib.sh                        | 128 ++++----------------------
 8 files changed, 297 insertions(+), 139 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh
 create mode 100644 t/test-lib-junit.sh


base-commit: f9b95943b68b6b8ca5a6072f50a08411c6449b55
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1117

Range-diff vs v2:

  1:  db08b07c37a =  1:  68793fcec62 ci: fix code style
  5:  9eda6574313 =  2:  cbf476e4e98 tests: refactor --write-junit-xml code
  6:  c8b240af749 =  3:  30ccd602108 test(junit): avoid line feeds in XML attributes
  2:  42ff3e170bf =  4:  8f5b112bd08 ci/run-build-and-tests: take a more high-level view
  3:  bbbe1623257 !  5:  417f702a245 ci: make it easier to find failed tests' logs in the GitHub workflow
     @@ ci/lib.sh: check_unignored_build_artifacts () {
       
      @@ ci/lib.sh: then
       	CI_JOB_ID="$GITHUB_RUN_ID"
     - 	CC="${CC:-gcc}"
     + 	CC="${CC_PACKAGE:-${CC:-gcc}}"
       	DONT_SKIP_TAGS=t
      +	handle_failed_tests () {
      +		mkdir -p t/failed-test-artifacts
  4:  f72254a9ac6 =  6:  7d2284314ef ci/run-build-and-tests: add some structure to the GitHub workflow output
  7:  15f199e810e =  7:  98059b94a88 ci: optionally mark up output in the GitHub workflow
  -:  ----------- >  8:  d3db5252fb8 ci(github): skip the logs of the successful test cases
  -:  ----------- >  9:  51573ef6c54 ci(github): avoid printing test case preamble twice
  8:  91ea54f36c5 = 10:  7f921ffef12 ci: use `--github-workflow-markup` in the GitHub workflow
  -:  ----------- > 11:  370b08d3a11 ci(github): mention where the full logs can be found
  9:  be2a83f5da3 ! 12:  fe355a6f03b ci: call `finalize_test_case_output` a little later
     @@ t/test-lib.sh: trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
       	test_failure=$(($test_failure + 1))
       	say_color error "not ok $test_count - $1"
       	shift
     - 	printf '%s\n' "$*" | sed -e 's/^/#	/'
     - 	test "$immediate" = "" || _error_exit
     +@@ t/test-lib.sh: test_failure_ () {
     + 		say_color error "1..$test_count"
     + 		_error_exit
     + 	fi
      +	finalize_test_case_output failure "$failure_label" "$@"
       }
       

-- 
gitgitgadget


* [PATCH v3 01/12] ci: fix code style
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 02/12] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In b92cb86ea14 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31), a function was introduced with a code style
that is different from the surrounding code: it added the opening curly
brace on its own line, when all the existing functions in the same file
cuddle that brace on the same line as the function name.

Let's make the code style consistent again.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 86e37da9bc5..d718f4e386d 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -69,8 +69,7 @@ skip_good_tree () {
 	exit 0
 }
 
-check_unignored_build_artifacts ()
-{
+check_unignored_build_artifacts () {
 	! git ls-files --other --exclude-standard --error-unmatch \
 		-- ':/*' 2>/dev/null ||
 	{
-- 
gitgitgadget



* [PATCH v3 02/12] tests: refactor --write-junit-xml code
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 01/12] ci: fix code style Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 03/12] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.

Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.

This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.

This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
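The abstraction described above can be sketched in isolation. This is only
a minimal illustration of the override pattern, not the code in this patch:
`enable_demo_format` and `demo_test_ok` are hypothetical names standing in
for sourcing an output-format file and for a caller such as `test_ok_`.

```shell
# Default hooks: no-ops, so the library can call them unconditionally.
start_test_case_output () { :; }
finalize_test_case_output () { :; }

# Hypothetical stand-in for sourcing an output-format file: redefining
# a function replaces the no-op version defined above.
enable_demo_format () {
	finalize_test_case_output () {
		printf '<testcase result="%s" name="%s"/>\n' "$1" "$2"
	}
}

# Hypothetical caller, mimicking how a test result function would
# invoke the hook before printing its own TAP-style line.
demo_test_ok () {
	finalize_test_case_output ok "$@"
	printf 'ok - %s\n' "$*"
}
```

Calling `demo_test_ok "first test"` prints only the `ok` line until
`enable_demo_format` has been called; afterwards the extra markup line is
printed first. The caller never needs to test a flag such as
`$write_junit_xml`.
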
 t/test-lib-junit.sh | 126 ++++++++++++++++++++++++++++++++++++++++++++
 t/test-lib.sh       | 124 ++++++-------------------------------------
 2 files changed, 142 insertions(+), 108 deletions(-)
 create mode 100644 t/test-lib-junit.sh

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
new file mode 100644
index 00000000000..9d55d74d764
--- /dev/null
+++ b/t/test-lib-junit.sh
@@ -0,0 +1,126 @@
+# Library of functions to format test scripts' output in JUnit XML
+# format, to support Git's test suite result to be presented in an
+# easily digestible way on Azure Pipelines.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when the user asks
+# for JUnit XML; these functions will then override (empty) functions
+# that are called at the appropriate times during the test runs.
+
+start_test_output () {
+	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
+	mkdir -p "$junit_xml_dir"
+	junit_xml_base=${1##*/}
+	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
+	junit_attrs="name=\"${junit_xml_base%.sh}\""
+	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
+		date +%Y-%m-%dT%H:%M:%S)\""
+	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
+	junit_suite_start=$(test-tool date getnanos)
+	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+	then
+		GIT_TEST_TEE_OFFSET=0
+	fi
+}
+
+start_test_case_output () {
+	junit_start=$(test-tool date getnanos)
+}
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	ok)
+		set "$*"
+		;;
+	failure)
+		junit_insert="<failure message=\"not ok $test_count -"
+		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode \
+			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+			   then
+				test-tool path-utils skip-n-bytes \
+					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+			   else
+				printf '%s\n' "$@" | sed 1d
+			   fi)")"
+		junit_insert="$junit_insert</failure>"
+		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
+		then
+			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
+				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
+		fi
+		set "$1" "      $junit_insert"
+		;;
+	fixed)
+		set "$* (breakage fixed)"
+		;;
+	broken)
+		set "$* (known breakage)"
+		;;
+	skip)
+		message="$(xml_attr_encode "$skipped_reason")"
+		set "$1" "      <skipped message=\"$message\" />"
+		;;
+	esac
+
+	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	shift
+	junit_attrs="$junit_attrs classname=\"$this_test\""
+	junit_attrs="$junit_attrs time=\"$(test-tool \
+		date getnanos $junit_start)\""
+	write_junit_xml "$(printf '%s\n' \
+		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
+	junit_have_testcase=t
+}
+
+finalize_test_output () {
+	if test -n "$junit_xml_path"
+	then
+		test -n "$junit_have_testcase" || {
+			junit_start=$(test-tool date getnanos)
+			write_junit_xml_testcase "all tests skipped"
+		}
+
+		# adjust the overall time
+		junit_time=$(test-tool date getnanos $junit_suite_start)
+		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
+			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
+			-e '/^ *<\/testsuite/d' \
+			<"$junit_xml_path" >"$junit_xml_path.new"
+		mv "$junit_xml_path.new" "$junit_xml_path"
+
+		write_junit_xml "  </testsuite>" "</testsuites>"
+		write_junit_xml=
+	fi
+}
+
+write_junit_xml () {
+	case "$1" in
+	--truncate)
+		>"$junit_xml_path"
+		junit_have_testcase=
+		shift
+		;;
+	esac
+	printf '%s\n' "$@" >>"$junit_xml_path"
+}
+
+xml_attr_encode () {
+	printf '%s\n' "$@" | test-tool xml-encode
+}
diff --git a/t/test-lib.sh b/t/test-lib.sh
index f09e8f3efce..bdb11e28eea 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -137,6 +137,12 @@ mark_option_requires_arg () {
 	store_arg_to=$2
 }
 
+# These functions can be overridden e.g. to output JUnit XML
+start_test_output () { :; }
+start_test_case_output () { :; }
+finalize_test_case_output () { :; }
+finalize_test_output () { :; }
+
 parse_option () {
 	local opt="$1"
 
@@ -196,7 +202,7 @@ parse_option () {
 		tee=t
 		;;
 	--write-junit-xml)
-		write_junit_xml=t
+		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
 	--stress)
 		stress=t ;;
@@ -664,7 +670,7 @@ exec 6<&0
 exec 7>&2
 
 _error_exit () {
-	finalize_junit_xml
+	finalize_test_output
 	GIT_EXIT_OK=t
 	exit 1
 }
@@ -774,35 +780,13 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$*"
-	fi
+	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
 }
 
 test_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
-		junit_insert="$junit_insert $(xml_attr_encode \
-			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-			   then
-				test-tool path-utils skip-n-bytes \
-					"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
-			   else
-				printf '%s\n' "$@" | sed 1d
-			   fi)")"
-		junit_insert="$junit_insert</failure>"
-		if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-		then
-			junit_insert="$junit_insert<system-err>$(xml_attr_encode \
-				"$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>"
-		fi
-		write_junit_xml_testcase "$1" "      $junit_insert"
-	fi
+	finalize_test_case_output failure "$@"
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
@@ -815,19 +799,13 @@ test_failure_ () {
 }
 
 test_known_broken_ok_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (breakage fixed)"
-	fi
+	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
 }
 
 test_known_broken_failure_ () {
-	if test -n "$write_junit_xml"
-	then
-		write_junit_xml_testcase "$* (known breakage)"
-	fi
+	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
 }
@@ -1104,10 +1082,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	if test -n "$write_junit_xml"
-	then
-		junit_start=$(test-tool date getnanos)
-	fi
+	start_test_case_output
 }
 
 test_finish_ () {
@@ -1158,12 +1133,7 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		if test -n "$write_junit_xml"
-		then
-			message="$(xml_attr_encode "$skipped_reason")"
-			write_junit_xml_testcase "$1" \
-				"      <skipped message=\"$message\" />"
-		fi
+		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
@@ -1179,53 +1149,6 @@ test_at_end_hook_ () {
 	:
 }
 
-write_junit_xml () {
-	case "$1" in
-	--truncate)
-		>"$junit_xml_path"
-		junit_have_testcase=
-		shift
-		;;
-	esac
-	printf '%s\n' "$@" >>"$junit_xml_path"
-}
-
-xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
-}
-
-write_junit_xml_testcase () {
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
-	shift
-	junit_attrs="$junit_attrs classname=\"$this_test\""
-	junit_attrs="$junit_attrs time=\"$(test-tool \
-		date getnanos $junit_start)\""
-	write_junit_xml "$(printf '%s\n' \
-		"    <testcase $junit_attrs>" "$@" "    </testcase>")"
-	junit_have_testcase=t
-}
-
-finalize_junit_xml () {
-	if test -n "$write_junit_xml" && test -n "$junit_xml_path"
-	then
-		test -n "$junit_have_testcase" || {
-			junit_start=$(test-tool date getnanos)
-			write_junit_xml_testcase "all tests skipped"
-		}
-
-		# adjust the overall time
-		junit_time=$(test-tool date getnanos $junit_suite_start)
-		sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \
-			-e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \
-			-e '/^ *<\/testsuite/d' \
-			<"$junit_xml_path" >"$junit_xml_path.new"
-		mv "$junit_xml_path.new" "$junit_xml_path"
-
-		write_junit_xml "  </testsuite>" "</testsuites>"
-		write_junit_xml=
-	fi
-}
-
 test_atexit_cleanup=:
 test_atexit_handler () {
 	# In a succeeding test script 'test_atexit_handler' is invoked
@@ -1248,7 +1171,7 @@ test_done () {
 	# removed, so the commands can access pidfiles and socket files.
 	test_atexit_handler
 
-	finalize_junit_xml
+	finalize_test_output
 
 	if test -z "$HARNESS_ACTIVE"
 	then
@@ -1539,22 +1462,7 @@ fi
 # in subprocesses like git equals our $PWD (for pathname comparisons).
 cd -P "$TRASH_DIRECTORY" || exit 1
 
-if test -n "$write_junit_xml"
-then
-	junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out"
-	mkdir -p "$junit_xml_dir"
-	junit_xml_base=${0##*/}
-	junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml"
-	junit_attrs="name=\"${junit_xml_base%.sh}\""
-	junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \
-		date +%Y-%m-%dT%H:%M:%S)\""
-	write_junit_xml --truncate "<testsuites>" "  <testsuite $junit_attrs>"
-	junit_suite_start=$(test-tool date getnanos)
-	if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
-	then
-		GIT_TEST_TEE_OFFSET=0
-	fi
-fi
+start_test_output "$0"
 
 # Convenience
 # A regexp to match 5 and 35 hexdigits
-- 
gitgitgadget



* [PATCH v3 03/12] test(junit): avoid line feeds in XML attributes
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 01/12] ci: fix code style Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 02/12] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 04/12] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the test case's output, we do want newline characters, but in the XML
attributes we do not want them.

However, the `xml_attr_encode` function always adds a Line Feed at the
end (which is then encoded as `&#x0a;`), even for XML attributes.

This seems not to faze Azure Pipelines' XML parser, but it still is
incorrect, so let's fix it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
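To see where the stray Line Feed comes from, it is enough to count the
bytes that each `printf` form would hand to the encoder. This is a sketch
only: `demo_bytes_with_lf` and `demo_bytes_without_lf` are hypothetical
helpers, and `wc -c` stands in for `test-tool xml-encode`.

```shell
# Bytes produced by the original form: one line per argument, each
# terminated by a Line Feed.
demo_bytes_with_lf () {
	printf '%s\n' "$@" | wc -c | tr -d ' '
}

# Bytes produced by the --no-lf form: the arguments are joined with
# spaces and no Line Feed is appended.
demo_bytes_without_lf () {
	printf '%s' "$*" | wc -c | tr -d ' '
}
```

For the seven-character argument `skip me`, the first helper reports 8
bytes and the second reports 7; the difference is the Line Feed that would
otherwise end up inside the attribute value.
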
 t/test-lib-junit.sh | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/t/test-lib-junit.sh b/t/test-lib-junit.sh
index 9d55d74d764..c959183c7e2 100644
--- a/t/test-lib-junit.sh
+++ b/t/test-lib-junit.sh
@@ -50,7 +50,7 @@ finalize_test_case_output () {
 		;;
 	failure)
 		junit_insert="<failure message=\"not ok $test_count -"
-		junit_insert="$junit_insert $(xml_attr_encode "$1")\">"
+		junit_insert="$junit_insert $(xml_attr_encode --no-lf "$1")\">"
 		junit_insert="$junit_insert $(xml_attr_encode \
 			"$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE"
 			   then
@@ -74,12 +74,12 @@ finalize_test_case_output () {
 		set "$* (known breakage)"
 		;;
 	skip)
-		message="$(xml_attr_encode "$skipped_reason")"
+		message="$(xml_attr_encode --no-lf "$skipped_reason")"
 		set "$1" "      <skipped message=\"$message\" />"
 		;;
 	esac
 
-	junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\""
+	junit_attrs="name=\"$(xml_attr_encode --no-lf "$this_test.$test_count $1")\""
 	shift
 	junit_attrs="$junit_attrs classname=\"$this_test\""
 	junit_attrs="$junit_attrs time=\"$(test-tool \
@@ -122,5 +122,11 @@ write_junit_xml () {
 }
 
 xml_attr_encode () {
-	printf '%s\n' "$@" | test-tool xml-encode
+	if test "x$1" = "x--no-lf"
+	then
+		shift
+		printf '%s' "$*" | test-tool xml-encode
+	else
+		printf '%s\n' "$@" | test-tool xml-encode
+	fi
 }
-- 
gitgitgadget



* [PATCH v3 04/12] ci/run-build-and-tests: take a more high-level view
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 03/12] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 05/12] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the web UI of GitHub workflows, failed runs are presented with the
job step that failed auto-expanded. In the current setup, this is not
helpful at all because that shows only the output of `prove`, which says
which test failed, but not in what way.

What would help the reader understand what went wrong is the verbose
test output of the failed test.

The logs of the failed runs do contain that verbose test output, but it
is shown in the _next_ step (which is marked as succeeding, and is
therefore _not_ auto-expanded). Anyone not intimately familiar with this
would completely miss the verbose test output, being left mostly
puzzled by the test failures.

We are about to show the failed test cases' output in the _same_ step,
so that the user has a much easier time figuring out what went
wrong.

But first, we must partially revert the change that tried to improve the
CI runs by combining the `Makefile` targets to build into a single
`make` invocation. That might have sounded like a good idea at the time,
but it does make it rather impossible for the CI script to determine
whether the _build_ failed, or the _tests_. If the tests were run at
all, that is.

So let's go back to calling `make` for the build, and call `make test`
separately so that we can easily detect that _that_ invocation failed,
and react appropriately.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
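The reasoning above can be illustrated with a small sketch. It assumes
hypothetical stand-ins (`demo_build`, `demo_test`) for `make` and
`make test`, and is not the script changed by this patch.

```shell
# Hypothetical stand-ins for `make` and `make test`; each returns the
# status stored in the corresponding variable (0 when unset).
demo_build () { return "${build_status:-0}"; }
demo_test () { return "${test_status:-0}"; }

# With two separate invocations, the caller knows which phase failed.
demo_ci_run () {
	if ! demo_build
	then
		echo "build failed"
		return 1
	fi
	if test -n "$run_tests" && ! demo_test
	then
		echo "tests failed"
		return 1
	fi
	echo "success"
}
```

With a single combined invocation, both failure modes would surface as the
same non-zero exit code, and the caller could not tell them apart.
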
 ci/run-build-and-tests.sh | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 280dda7d285..2818b3046ae 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-export MAKE_TARGETS="all test"
+run_tests=t
 
 case "$jobname" in
 linux-gcc)
@@ -41,14 +41,15 @@ pedantic)
 	# Don't run the tests; we only care about whether Git can be
 	# built.
 	export DEVOPTS=pedantic
-	export MAKE_TARGETS=all
+	run_tests=
 	;;
 esac
 
-# Any new "test" targets should not go after this "make", but should
-# adjust $MAKE_TARGETS. Otherwise compilation-only targets above will
-# start running tests.
-make $MAKE_TARGETS
+make
+if test -n "$run_tests"
+then
+	make test
+fi
 check_unignored_build_artifacts
 
 save_good_tree
-- 
gitgitgadget



* [PATCH v3 05/12] ci: make it easier to find failed tests' logs in the GitHub workflow
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 04/12] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 06/12] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
                       ` (6 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When investigating a test failure, the time that matters most is the
time it takes from becoming aware of the failure to displaying the output
of the failing test case.

You currently have to know a lot of implementation details when
investigating test failures in the CI runs. The first step is easy: the
failed job is marked quite clearly, but when opening it, the failed step
is expanded, which in our case is the one running
`ci/run-build-and-tests.sh`. This step, most notably, only offers a
high-level view of what went wrong: it prints the output of `prove`
which merely tells the reader which test script failed.

The actually interesting part is in the detailed log of said failed
test script. But that log is shown in the CI run's step that runs
`ci/print-test-failures.sh`. And that step is _not_ expanded in the web
UI by default. It is even marked as "successful", which makes it very
easy to miss that there is useful information hidden in there.

Let's help the reader by showing the failed tests' detailed logs in the
step that is expanded automatically, i.e. directly after the test suite
failed.

This also helps the situation where the _build_ failed and the
`print-test-failures` step was executed under the assumption that the
_test suite_ failed, and consequently failed to find any failed tests.

An alternative way to implement this patch would be to source
`ci/print-test-failures.sh` in the `handle_failed_tests` function to
show these logs. However, over the course of the next few commits, we
want to introduce some grouping which would be harder to achieve that
way (for example, we do want a leaner, and colored, preamble for each
failed test script, and it would be trickier to accommodate the lack of
nested groupings in GitHub workflows' output).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
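How the failed test scripts are picked out of the results directory can be
sketched on its own. This assumes a directory of `<name>.exit` files, each
holding an exit code; `demo_list_failed_tests` is a hypothetical helper and
only prints names instead of showing logs.

```shell
# List the names of test scripts whose recorded exit code is non-zero.
# $1 is a directory of <name>.exit files (a stand-in for t/test-results).
demo_list_failed_tests () {
	for test_exit in "$1"/*.exit
	do
		# Skip the unexpanded pattern when no .exit file exists.
		test -f "$test_exit" || continue
		test 0 != "$(cat "$test_exit")" || continue

		test_name="${test_exit%.exit}"	# drop the ".exit" suffix
		test_name="${test_name##*/}"	# drop the leading directories
		printf '%s\n' "$test_name"
	done
}
```

The two parameter expansions turn a path such as
`t/test-results/t3420-rebase-autostash.exit` into the bare test name, which
is then used to locate the matching `.out` file and trash directory.
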
 .github/workflows/main.yml | 12 ------------
 ci/lib.sh                  | 23 +++++++++++++++++++++++
 ci/run-build-and-tests.sh  |  3 ++-
 ci/run-test-slice.sh       |  3 ++-
 4 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index c35200defb9..3fa88b78b6d 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -119,10 +119,6 @@ jobs:
     - name: test
       shell: bash
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -204,10 +200,6 @@ jobs:
       env:
         NO_SVN_TESTS: 1
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -261,8 +253,6 @@ jobs:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -292,8 +282,6 @@ jobs:
     - uses: actions/checkout@v1
     - run: ci/install-docker-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v1
diff --git a/ci/lib.sh b/ci/lib.sh
index d718f4e386d..65f5188a550 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -78,6 +78,10 @@ check_unignored_build_artifacts () {
 	}
 }
 
+handle_failed_tests () {
+	return 1
+}
+
 # GitHub Action doesn't set TERM, which is required by tput
 export TERM=${TERM:-dumb}
 
@@ -123,6 +127,25 @@ then
 	CI_JOB_ID="$GITHUB_RUN_ID"
 	CC="${CC_PACKAGE:-${CC:-gcc}}"
 	DONT_SKIP_TAGS=t
+	handle_failed_tests () {
+		mkdir -p t/failed-test-artifacts
+		echo "FAILED_TEST_ARTIFACTS=t/failed-test-artifacts" >>$GITHUB_ENV
+
+		for test_exit in t/test-results/*.exit
+		do
+			test 0 != "$(cat "$test_exit")" || continue
+
+			test_name="${test_exit%.exit}"
+			test_name="${test_name##*/}"
+			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
+			cat "t/test-results/$test_name.out"
+
+			trash_dir="t/trash directory.$test_name"
+			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
+			tar czf t/failed-test-artifacts/"$test_name".trash.tar.gz "$trash_dir"
+		done
+		return 1
+	}
 
 	cache_dir="$HOME/none"
 
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 2818b3046ae..1ede75e5556 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -48,7 +48,8 @@ esac
 make
 if test -n "$run_tests"
 then
-	make test
+	make test ||
+	handle_failed_tests
 fi
 check_unignored_build_artifacts
 
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index f8c2c3106a2..63358c23e11 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -12,6 +12,7 @@ esac
 
 make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
-	tr '\n' ' ')"
+	tr '\n' ' ')" ||
+handle_failed_tests
 
 check_unignored_build_artifacts
-- 
gitgitgadget



* [PATCH v3 06/12] ci/run-build-and-tests: add some structure to the GitHub workflow output
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 05/12] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 07/12] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The current output of Git's GitHub workflow can be quite confusing,
especially for contributors new to the project.

To make it more helpful, let's introduce some collapsible grouping.
Initially, readers will see the high-level view of what actually
happened (did the build fail, or the test suite?). To drill down, the
respective group can be expanded.

Note: sadly, workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details),
therefore we take pains to end any previous group before starting a new
one.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
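The constraint about nested groups can be sketched as follows. This is a
variation for illustration, not the helpers added below: the hypothetical
`demo_begin_group` closes a still-open group by itself, and the markers go
to stdout rather than stderr to keep the sketch easy to inspect.

```shell
# Start a group, first closing any group that is still open, because
# workflow output cannot contain nested groups.
demo_begin_group () {
	demo_end_group
	demo_group_open=t
	echo "::group::$1"
}

# End the current group; do nothing if no group is open.
demo_end_group () {
	test -n "$demo_group_open" || return 0
	demo_group_open=
	echo "::endgroup::"
}
```

Calling `demo_begin_group Build` followed by
`demo_begin_group "Run tests"` emits an `::endgroup::` marker between the
two `::group::` markers, so the groups are siblings rather than nested.
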
 ci/lib.sh                 | 56 ++++++++++++++++++++++++++++++++++-----
 ci/run-build-and-tests.sh |  4 +--
 ci/run-test-slice.sh      |  2 +-
 3 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 65f5188a550..f8cb79e44f0 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -1,5 +1,50 @@
 # Library of functions shared by all CI scripts
 
+if test true != "$GITHUB_ACTIONS"
+then
+	begin_group () { :; }
+	end_group () { :; }
+
+	group () {
+		shift
+		"$@"
+	}
+	set -x
+else
+	begin_group () {
+		need_to_end_group=t
+		echo "::group::$1" >&2
+		set -x
+	}
+
+	end_group () {
+		test -n "$need_to_end_group" || return 0
+		set +x
+		need_to_end_group=
+		echo '::endgroup::' >&2
+	}
+	trap end_group EXIT
+
+	group () {
+		set +x
+		begin_group "$1"
+		shift
+		"$@"
+		res=$?
+		end_group
+		return $res
+	}
+
+	begin_group "CI setup"
+fi
+
+# Set 'exit on error' for all CI scripts to let the caller know that
+# something went wrong.
+#
+# We already enabled tracing executed commands earlier. This helps by showing
+# how environment variables are set and dependencies are installed.
+set -e
+
 skip_branch_tip_with_tag () {
 	# Sometimes, a branch is pushed at the same time the tag that points
 	# at the same commit as the tip of the branch is pushed, and building
@@ -88,12 +133,6 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-# Set 'exit on error' for all CI scripts to let the caller know that
-# something went wrong.
-# Set tracing executed commands, primarily setting environment variables
-# and installing dependencies.
-set -ex
-
 if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
 then
 	CI_TYPE=azure-pipelines
@@ -138,7 +177,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			cat "t/test-results/$test_name.out"
+			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -233,3 +272,6 @@ linux-leaks)
 esac
 
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
+
+end_group
+set -x
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 1ede75e5556..7abfa00adc0 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -45,10 +45,10 @@ pedantic)
 	;;
 esac
 
-make
+group Build make
 if test -n "$run_tests"
 then
-	make test ||
+	group "Run tests" make test ||
 	handle_failed_tests
 fi
 check_unignored_build_artifacts
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index 63358c23e11..a3c67956a8d 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-make --quiet -C t T="$(cd t &&
+group "Run tests" make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
 	tr '\n' ' ')" ||
 handle_failed_tests
-- 
gitgitgadget



* [PATCH v3 07/12] ci: optionally mark up output in the GitHub workflow
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 06/12] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 08/12] ci(github): skip the logs of the successful test cases Johannes Schindelin via GitGitGadget
                       ` (4 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

A couple of commands exist to spruce up the output in GitHub workflows:
https://docs.github.com/en/actions/learn-github-actions/workflow-commands-for-github-actions

In addition to the `::group::<label>`/`::endgroup::` commands (which we
already use to structure the output of the build step better), we also
use `::error::`/`::notice::` to draw attention to test failures and
to test cases that were expected to fail but didn't.
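
To illustrate how these workflow commands combine, here is a rough sketch.
The `annotate_result` helper and the placeholder log line are hypothetical
and only mimic the shape of `finalize_test_case_output` in this patch:

```shell
# Hypothetical helper, not part of the patch: emit an annotation for
# noteworthy results, then wrap the (placeholder) log in a group.
annotate_result () {
	result=$1
	label=$2
	case "$result" in
	failure)
		echo "::error::failed: $label"
		;;
	fixed)
		echo "::notice::fixed: $label"
		;;
	esac
	echo "::group::$result: $label"
	echo "(log excerpt for $label would go here)"
	echo "::endgroup::"
}

annotation_demo=$(annotate_result failure "t0000.3 example")
printf '%s\n' "$annotation_demo"
```

The annotation line comes before the group so that it stays visible even
while the group itself is collapsed.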

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-functions.sh              |  4 +--
 t/test-lib-github-workflow-markup.sh | 50 ++++++++++++++++++++++++++++
 t/test-lib.sh                        |  5 ++-
 3 files changed, 56 insertions(+), 3 deletions(-)
 create mode 100644 t/test-lib-github-workflow-markup.sh

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 93c03380d44..af4831a54c6 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -795,7 +795,7 @@ test_verify_prereq () {
 }
 
 test_expect_failure () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-failure"
@@ -815,7 +815,7 @@ test_expect_failure () {
 }
 
 test_expect_success () {
-	test_start_
+	test_start_ "$@"
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
 	BUG "not 2 or 3 parameters to test-expect-success"
diff --git a/t/test-lib-github-workflow-markup.sh b/t/test-lib-github-workflow-markup.sh
new file mode 100644
index 00000000000..d8dc969df4a
--- /dev/null
+++ b/t/test-lib-github-workflow-markup.sh
@@ -0,0 +1,50 @@
+# Library of functions to mark up test scripts' output suitable for
+# pretty-printing it in GitHub workflows.
+#
+# Copyright (c) 2022 Johannes Schindelin
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/ .
+#
+# The idea is for `test-lib.sh` to source this file when run in GitHub
+# workflows; these functions will then override (empty) functions
+# that are called at the appropriate times during the test runs.
+
+start_test_output () {
+	test -n "$GIT_TEST_TEE_OUTPUT_FILE" ||
+	die "--github-workflow-markup requires --verbose-log"
+	github_markup_output="${GIT_TEST_TEE_OUTPUT_FILE%.out}.markup"
+	>$github_markup_output
+	GIT_TEST_TEE_OFFSET=0
+}
+
+# No need to override start_test_case_output
+
+finalize_test_case_output () {
+	test_case_result=$1
+	shift
+	case "$test_case_result" in
+	failure)
+		echo >>$github_markup_output "::error::failed: $this_test.$test_count $1"
+		;;
+	fixed)
+		echo >>$github_markup_output "::notice::fixed: $this_test.$test_count $1"
+		;;
+	esac
+	echo >>$github_markup_output "::group::$test_case_result: $this_test.$test_count $*"
+	test-tool >>$github_markup_output path-utils skip-n-bytes \
+		"$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET
+	echo >>$github_markup_output "::endgroup::"
+}
+
+# No need to override finalize_test_output
diff --git a/t/test-lib.sh b/t/test-lib.sh
index bdb11e28eea..29640d107ca 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -204,6 +204,9 @@ parse_option () {
 	--write-junit-xml)
 		. "$TEST_DIRECTORY/test-lib-junit.sh"
 		;;
+	--github-workflow-markup)
+		. "$TEST_DIRECTORY/test-lib-github-workflow-markup.sh"
+		;;
 	--stress)
 		stress=t ;;
 	--stress=*)
@@ -1082,7 +1085,7 @@ test_start_ () {
 	test_count=$(($test_count+1))
 	maybe_setup_verbose
 	maybe_setup_valgrind
-	start_test_case_output
+	start_test_case_output "$@"
 }
 
 test_finish_ () {
-- 
gitgitgadget



* [PATCH v3 08/12] ci(github): skip the logs of the successful test cases
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 07/12] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-24 10:47       ` Ævar Arnfjörð Bjarmason
  2022-05-21 22:18     ` [PATCH v3 09/12] ci(github): avoid printing test case preamble twice Victoria Dye via GitGitGadget
                       ` (3 subsequent siblings)
  11 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In most instances, looking at the log of failed test cases is enough to
identify the problem.

In some (rare?) instances, a previous test case that was marked as
successful actually has information pertaining to a later test case that
fails.

To allow the page to load relatively quickly, let's only show the logs
of the failed test cases. The full logs are available for
download as artifacts, should a deeper investigation become necessary.
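
A rough sketch of the idea, under stated assumptions: `log` stands in for
`$GIT_TEST_TEE_OUTPUT_FILE`, `tail -c` approximates what
`test-tool path-utils skip-n-bytes` does, and the offset bookkeeping in the
hypothetical `emit_case` helper is illustrative rather than the test
library's actual code:

```shell
# Sketch only: keep a byte offset into the verbose log and copy the slice
# belonging to a test case into the markup file unless that case passed.
log=$(mktemp)
offset=0

emit_case () {
	# $1: result of the test case, $2: its title
	size=$(wc -c <"$log")
	if test ok != "$1"
	then
		echo "::group::$1: $2"
		# Print everything after the first $offset bytes.
		tail -c +$((offset + 1)) "$log"
		echo "::endgroup::"
	fi
	# The next test case's output starts where this one ended.
	offset=$((size))
}

echo "output of passing case" >>"$log"
emit_case ok "first" >"$log.markup"
echo "output of failing case" >>"$log"
emit_case failure "second" >>"$log.markup"

markup=$(cat "$log.markup")
rm -f "$log" "$log.markup"
printf '%s\n' "$markup"
```

Only the failing case ends up in the markup; the passing case's output
remains in the full log, which is what gets uploaded as an artifact.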

Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-github-workflow-markup.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/test-lib-github-workflow-markup.sh b/t/test-lib-github-workflow-markup.sh
index d8dc969df4a..1ef0fd5ba87 100644
--- a/t/test-lib-github-workflow-markup.sh
+++ b/t/test-lib-github-workflow-markup.sh
@@ -40,6 +40,10 @@ finalize_test_case_output () {
 	fixed)
 		echo >>$github_markup_output "::notice::fixed: $this_test.$test_count $1"
 		;;
+	ok)
+		# Exit without printing the "ok" tests
+		return
+		;;
 	esac
 	echo >>$github_markup_output "::group::$test_case_result: $this_test.$test_count $*"
 	test-tool >>$github_markup_output path-utils skip-n-bytes \
-- 
gitgitgadget



* [PATCH v3 09/12] ci(github): avoid printing test case preamble twice
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (7 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 08/12] ci(github): skip the logs of the successful test cases Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Victoria Dye via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 10/12] ci: use `--github-workflow-markup` in the GitHub workflow Johannes Schindelin via GitGitGadget
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 98+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin, Victoria Dye

From: Victoria Dye <vdye@github.com>

We want to mark up the test case preamble when presenting test output in
Git's GitHub workflow. Let's suppress the non-marked-up version in that
case. Any information it would contain is included in the marked-up
variant already.
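
The guard used for this is the usual `test -n "$var" || command` idiom. A
self-contained sketch follows; `say` and `announce` here are simplified
stand-ins, not the test library's own functions:

```shell
# Simplified stand-ins for illustration only.
say () {
	echo "$*"
}

announce () {
	# Print the plain preamble unless the marked-up variant replaces it.
	test -n "$test_skip_test_preamble" ||
	say "expecting success of $1"
}

test_skip_test_preamble=
plain=$(announce "t0000.1")

test_skip_test_preamble=t
marked_up=$(announce "t0000.1")

printf 'plain: [%s]\nmarked up: [%s]\n' "$plain" "$marked_up"
```

Because the variable is empty by default, test runs without the markup
option behave exactly as before.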

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib-functions.sh              | 2 ++
 t/test-lib-github-workflow-markup.sh | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index af4831a54c6..89a5e146b7a 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -803,6 +803,7 @@ test_expect_failure () {
 	export test_prereq
 	if ! test_skip "$@"
 	then
+		test -n "$test_skip_test_preamble" ||
 		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $2"
 		if test_run_ "$2" expecting_failure
 		then
@@ -823,6 +824,7 @@ test_expect_success () {
 	export test_prereq
 	if ! test_skip "$@"
 	then
+		test -n "$test_skip_test_preamble" ||
 		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $2"
 		if test_run_ "$2"
 		then
diff --git a/t/test-lib-github-workflow-markup.sh b/t/test-lib-github-workflow-markup.sh
index 1ef0fd5ba87..9c5339c577a 100644
--- a/t/test-lib-github-workflow-markup.sh
+++ b/t/test-lib-github-workflow-markup.sh
@@ -20,6 +20,8 @@
 # workflows; these functions will then override (empty) functions
 # that are called at the appropriate times during the test runs.
 
+test_skip_test_preamble=t
+
 start_test_output () {
 	test -n "$GIT_TEST_TEE_OUTPUT_FILE" ||
 	die "--github-workflow-markup requires --verbose-log"
-- 
gitgitgadget



* [PATCH v3 10/12] ci: use `--github-workflow-markup` in the GitHub workflow
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (8 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 09/12] ci(github): avoid printing test case preamble twice Victoria Dye via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 11/12] ci(github): mention where the full logs can be found Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 12/12] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This makes the output easier to digest.

Note: since workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details), we need
to remove the explicit grouping that would span the entirety of each
failed test script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index f8cb79e44f0..de6532ee8cd 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -177,7 +177,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
-			group "Failed test: $test_name" cat "t/test-results/$test_name.out"
+			cat "t/test-results/$test_name.markup"
 
 			trash_dir="t/trash directory.$test_name"
 			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
@@ -189,7 +189,7 @@ then
 	cache_dir="$HOME/none"
 
 	export GIT_PROVE_OPTS="--timer --jobs 10"
-	export GIT_TEST_OPTS="--verbose-log -x"
+	export GIT_TEST_OPTS="--verbose-log -x --github-workflow-markup"
 	MAKEFLAGS="$MAKEFLAGS --jobs=10"
 	test windows != "$CI_OS_NAME" ||
 	GIT_TEST_OPTS="--no-chain-lint --no-bin-wrappers $GIT_TEST_OPTS"
-- 
gitgitgadget



* [PATCH v3 11/12] ci(github): mention where the full logs can be found
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (9 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 10/12] ci: use `--github-workflow-markup` in the GitHub workflow Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  2022-05-21 22:18     ` [PATCH v3 12/12] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The full logs are contained in the `failed-tests-*.zip` artifacts that
are attached to the failed CI run. Since this is not immediately
obvious to the well-disposed reader, let's mention it explicitly.

Suggested-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/lib.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ci/lib.sh b/ci/lib.sh
index de6532ee8cd..2f6d9d26e40 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -177,6 +177,7 @@ then
 			test_name="${test_exit%.exit}"
 			test_name="${test_name##*/}"
 			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
+			echo "The full logs are in the artifacts attached to this run."
 			cat "t/test-results/$test_name.markup"
 
 			trash_dir="t/trash directory.$test_name"
-- 
gitgitgadget



* [PATCH v3 12/12] ci: call `finalize_test_case_output` a little later
  2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
                       ` (10 preceding siblings ...)
  2022-05-21 22:18     ` [PATCH v3 11/12] ci(github): mention where the full logs can be found Johannes Schindelin via GitGitGadget
@ 2022-05-21 22:18     ` Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 22:18 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Victoria Dye, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

We used to call that function already before printing the final verdict.
However, now that we added grouping to the GitHub workflow output, we
will want to include even that part in the collapsible group for that
test case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/test-lib.sh | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 29640d107ca..9e410a5bb70 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -783,13 +783,13 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP
 # the test_expect_* functions instead.
 
 test_ok_ () {
-	finalize_test_case_output ok "$@"
 	test_success=$(($test_success + 1))
 	say_color "" "ok $test_count - $@"
+	finalize_test_case_output ok "$@"
 }
 
 test_failure_ () {
-	finalize_test_case_output failure "$@"
+	failure_label=$1
 	test_failure=$(($test_failure + 1))
 	say_color error "not ok $test_count - $1"
 	shift
@@ -799,18 +799,19 @@ test_failure_ () {
 		say_color error "1..$test_count"
 		_error_exit
 	fi
+	finalize_test_case_output failure "$failure_label" "$@"
 }
 
 test_known_broken_ok_ () {
-	finalize_test_case_output fixed "$@"
 	test_fixed=$(($test_fixed+1))
 	say_color error "ok $test_count - $@ # TODO known breakage vanished"
+	finalize_test_case_output fixed "$@"
 }
 
 test_known_broken_failure_ () {
-	finalize_test_case_output broken "$@"
 	test_broken=$(($test_broken+1))
 	say_color warn "not ok $test_count - $@ # TODO known breakage"
+	finalize_test_case_output broken "$@"
 }
 
 test_debug () {
@@ -1136,10 +1137,10 @@ test_skip () {
 
 	case "$to_skip" in
 	t)
-		finalize_test_case_output skip "$@"
 
 		say_color skip "ok $test_count # skip $1 ($skipped_reason)"
 		: true
+		finalize_test_case_output skip "$@"
 		;;
 	*)
 		false
-- 
gitgitgadget


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-21 21:42     ` Johannes Schindelin
@ 2022-05-21 23:05       ` Junio C Hamano
  2022-05-22 18:48         ` Johannes Schindelin
                           ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Junio C Hamano @ 2022-05-21 23:05 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Victoria Dye, Johannes Schindelin via GitGitGadget, git,
	Eric Sunshine, Derrick Stolee, Emily Shaffer

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> * print the verbose logs only for the failed test cases (to massively cut
>>   down on the size of the log, particularly when there's only a couple
>>   failures in a test file with a lot of passing tests).
>
> That's an amazingly simple trick to improve the speed by a ton, indeed.
> Thank you for this splendid idea!
>
>> * skip printing the full text of the test in 'finalize_test_case_output'
>>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>>   and failing tests, this information is already printed via some other
>>   means).
>>
>> If you wanted to make sure a user could still access the full failure logs
>> (i.e., including the "ok" test results), you could print a link to the
>> artifacts page as well - that way, all of the information we currently
>> provide to users can still be found somewhere.
>
> That's a good point, I added that hint to the output (the link is
> unfortunately not available at the time we print that advice).

https://github.com/git/git/runs/6539786128 shows that all in-flight
topics merged to 'seen', except for the ds/bundle-uri-more, passes
the linux-leaks job.  The ds/bundle-uri-more topic introduces some
leaks to commands that happen to be used in tests that are marked as
leak-checker clean, making the job fail.

Which makes a great guinea pig for the CI output improvement topic.

So, I created two variants of 'seen' with this linux-leaks breakage.
One is with the js/ci-github-workflow-markup topic on this thread.
The other one is with the ab/ci-github-workflow-markup topic (which
uses a preliminary clean-up ab/ci-setup-simplify topic as its base).
They should show the identical test results and failures.

And here are their outputs:

 - https://github.com/git/git/runs/6539835065
 - https://github.com/git/git/runs/6539900608

If I recall correctly, the selling point of the ab/* variant over
js/* variant was that it would give quicker UI response compared to
the former, but other than that, both variants' UI are supposed to
be as newbie friendly as the other.

When I tried the former, it reacted so poorly to my attempt to
scroll (with mouse scroll wheel, if it makes a difference) that
sometimes I was staring at a blank dark-gray space for a few seconds
waiting for it to be filled by something, which was a rather jarring
experience.  When I tried the latter, it didn't show anything to
help diagnosing the details of the breakage in "run make test" step
and the user needed to know "print test failures" needs to be looked
at, which I am not sure is an inherent limitation of the approach.
After the single extra click, navigating the test output to find the
failed steps among many others that succeeded was not a very pleasant
experience.

Those who are interested in a UX experiment may want to visit these
two outputs to see how usable each of these is for themselves.

Thanks.





* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-21 23:05       ` Junio C Hamano
@ 2022-05-22 18:48         ` Johannes Schindelin
  2022-05-22 19:10           ` Junio C Hamano
  2022-05-22 23:27         ` Junio C Hamano
  2022-05-23  9:05         ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-05-22 18:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer

Hi Junio,

On Sat, 21 May 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> * print the verbose logs only for the failed test cases (to massively cut
> >>   down on the size of the log, particularly when there's only a couple
> >>   failures in a test file with a lot of passing tests).
> >
> > That's an amazingly simple trick to improve the speed by a ton, indeed.
> > Thank you for this splendid idea!
> >
> >> * skip printing the full text of the test in 'finalize_test_case_output'
> >>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
> >>   and failing tests, this information is already printed via some other
> >>   means).
> >>
> >> If you wanted to make sure a user could still access the full failure logs
> >> (i.e., including the "ok" test results), you could print a link to the
> >> artifacts page as well - that way, all of the information we currently
> >> provide to users can still be found somewhere.
> >
> > That's a good point, I added that hint to the output (the link is
> > unfortunately not available at the time we print that advice).
>
> https://github.com/git/git/runs/6539786128 shows that all in-flight
> topics merged to 'seen', except for the ds/bundle-uri-more, passes
> the linux-leaks job.  The ds/bundle-uri-more topic introduces some
> leaks to commands that happen to be used in tests that are marked as
> leak-checker clean, making the job fail.
>
> Which makes a great guinea pig for the CI output improvement topic.
>
> So, I created two variants of 'seen' with this linux-leaks breakage.
> One is with the js/ci-github-workflow-markup topic on this thread.
> The other one is with the ab/ci-github-workflow-markup topic (which
> uses a preliminary clean-up ab/ci-setup-simplify topic as its base).
> They should show the identical test results and failures.
>
> And here are their outputs:
>
>  - https://github.com/git/git/runs/6539835065

I see that this is still with the previous iteration, and therefore
exposes the same speed (or slowness) as was investigated so wonderfully by
Victoria.

So I really do not understand why you pointed to that run, given that it
still contains all the successful test cases' logs, which contributes in a
major way to said slowness.

Maybe you meant to refer to https://github.com/git/git/runs/6540394142
instead, which at least for me loads much faster _and_ makes the output as
helpful as my intention was?

Ciao,
Dscho

>  - https://github.com/git/git/runs/6539900608
>
> If I recall correctly, the selling point of the ab/* variant over
> js/* variant was that it would give quicker UI response compared to
> the former, but other than that, both variants' UI are supposed to
> be as newbie friendly as the other.
>
> When I tried the former, it reacted so poorly to my attempt to
> scroll (with mouse scroll wheel, if it makes a difference) that
> sometimes I was staring at a blank dark-gray space for a few seconds
> waiting for it to be filled by something, which was a rather jarring
> experience.  When I tried the latter, it didn't show anything to
> help diagnosing the details of the breakage in "run make test" step
> and the user needed to know "print test failures" needs to be looked
> at, which I am not sure is an inherent limitation of the approach.
> After the single extra click, navigating the test output to find the
> failed steps among many others that succeeded was not a very pleasant
> experience.
>
> Those who are interested in a UX experiment may want to visit these
> two outputs to see how usable each of these is for themselves.
>
> Thanks.
>
>
>
>
>


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-22 18:48         ` Johannes Schindelin
@ 2022-05-22 19:10           ` Junio C Hamano
  2022-05-23 12:58             ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-05-22 19:10 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I see that this is still with the previous iteration, and therefore
> exposes the same speed (or slowness) as was investigated so wonderfully by
> Victoria.
>
> So I really do not understand why you pointed to that run, given that it

Simply because your updated version came to my tree a lot after I
prepared two trees that are otherwise identical for comparison to
write the message you are responding to.  If the new round is much
improved than the previous one, that is a very good news.

I do not appreciate that you have to always talk back to others in
such an aggressive tone, and I do not think it is only to me, by the
way.

You could have said the same thing in a lot more cordial way,
e.g. "There is a newer version than those being compared---could you
look at this run instead for comparison, even though admittedly
there probably are changes in other topics in flight so the exact
failures may be different?"



* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-21 23:05       ` Junio C Hamano
  2022-05-22 18:48         ` Johannes Schindelin
@ 2022-05-22 23:27         ` Junio C Hamano
  2022-05-23 18:55           ` Junio C Hamano
  2022-05-23  9:05         ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-05-22 23:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Johannes Schindelin
  Cc: Victoria Dye, Johannes Schindelin via GitGitGadget, git,
	Eric Sunshine, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> Which makes a great guinea pig for the CI output improvement topic.
>
> So, I created two variants of 'seen' with this linux-leaks breakage.
> One is with the js/ci-github-workflow-markup topic on this thread.
> The other one is with the ab/ci-github-workflow-markup topic (which
> uses a preliminary clean-up ab/ci-setup-simplify topic as its base).
> They should show the identical test results and failures.

The two runs to look at have been updated.

 - The one with Ævar's change was missing the primary "workflow
   markup" topic (it only had preliminary clean-up topic), so it is
   not a fair feature-to-feature comparison to begin with.

 - The other one with Johannes's change was done with the version
   before the latest round from yesterday, which has improvements.

With all the other in-flight topics (including the one that shows
failures in linux-leaks job) merged to the same base in the same
order, I prepared two variants of 'seen' that resulted in these
logs:

 - https://github.com/git/git/runs/6546816978
 - https://github.com/git/git/runs/6546750379

One is with both of the required topics from Ævar (with a fix-up [*]),
and the other is with the latest from Johannes's series.

I do not want to taint other folks' eyes with my observations, so I'd
send my impression in a separate message as a response to this
message after waiting for some time.

Thanks.

[Footnote]

* 76253615 (ci: optionally mark up output in the GitHub workflow,
  2022-04-21) added references to ci/print-test-failures.sh and
  ci/print-test-failures-github.sh to the workflow file, while the
  latter script does not exist, but it appears that these references
  want to run the same script, so I've made a stupid and obvious
  fix-up today before pushing the result of merging all out.

  This prevented "make test || ci/print-test-failures.sh" from
  running correctly [*], ever since 76253615 (ci: optionally mark up
  output in the GitHub workflow, 2022-04-21) was queued, and it
  seems that nobody noticed nor complained.  Sigh.


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-21 23:05       ` Junio C Hamano
  2022-05-22 18:48         ` Johannes Schindelin
  2022-05-22 23:27         ` Junio C Hamano
@ 2022-05-23  9:05         ` Ævar Arnfjörð Bjarmason
  2022-05-23 18:41           ` Johannes Schindelin
  2 siblings, 1 reply; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-23  9:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer


On Sat, May 21 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>>> * print the verbose logs only for the failed test cases (to massively cut
>>>   down on the size of the log, particularly when there's only a couple
>>>   failures in a test file with a lot of passing tests).
>>
>> That's an amazingly simple trick to improve the speed by a ton, indeed.
>> Thank you for this splendid idea!
>>
>>> * skip printing the full text of the test in 'finalize_test_case_output'
>>>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>>>   and failing tests, this information is already printed via some other
>>>   means).
>>>
>>> If you wanted to make sure a user could still access the full failure logs
>>> (i.e., including the "ok" test results), you could print a link to the
>>> artifacts page as well - that way, all of the information we currently
>>> provide to users can still be found somewhere.
>>
>> That's a good point, I added that hint to the output (the link is
>> unfortunately not available at the time we print that advice).
>
> https://github.com/git/git/runs/6539786128 shows that all in-flight
> topics merged to 'seen', except for the ds/bundle-uri-more, passes
> the linux-leaks job.  The ds/bundle-uri-more topic introduces some
> leaks to commands that happen to be used in tests that are marked as
> leak-checker clean, making the job fail.
>
> Which makes a great guinea pig for the CI output improvement topic.
>
> So, I created two variants of 'seen' with this linux-leaks breakage.
> One is with the js/ci-github-workflow-markup topic on this thread.
> The other one is with the ab/ci-github-workflow-markup topic (which
> uses a preliminary clean-up ab/ci-setup-simplify topic as its base).
> They should show the identical test results and failures.
>
> And here are their outputs:
>
>  - https://github.com/git/git/runs/6539835065
>  - https://github.com/git/git/runs/6539900608
>
> If I recall correctly, the selling point of the ab/* variant over
> js/* variant was that it would give quicker UI response compared to
> the former, but other than that, both variants' UI are supposed to
> be as newbie friendly as the other.

...

> When I tried the former, it reacted too poorly to my attempt to
> scroll (with mouse scroll wheel, if it makes a difference) that
> sometimes I was staring at a blank dark-gray space for a few seconds
> waiting for it to be filled by something, which was a bit of a jarring
> experience.  When I tried the latter, it didn't show anything to
> help diagnosing the details of the breakage in "run make test" step
> and the user needed to know "print test failures" needs to be looked
> at, which I am not sure is an inherent limitation of the approach.
> After the single extra click, navigating the test output to find the
> failed steps among many others that succeeded was not a very pleasant
> experience.
>
> Those who are interested in the UX experiment may want to visit these
> two outputs to see for themselves how usable each of them is.

Re selling point & feature comparison: The point of the ab/* variant was
to re-roll Johannes's series onto a "base" topic that made much of his
series unnecessary, because the building up of features to emit GitHub
markup can be replaced by unrolling things like "make" and "make test"
to the top-level.

That has its own UX benefits, e.g. you can see at a glance what command
was run and what the environment was, and "make" and "make test" are now
split up from one monolithic "build and test" step.

But the primary intention was not to provide a prettier UX, but to show
that this arrangement made sense. I was hoping that Johannes would reply
with some variant of "ah, I see what you mean, that does make things
simpler!" and run with it, but alas...

So small UX wrinkles like the extra click you pointed out are in there.
That one would be easy to solve: it happens because we "focus" on the
last step with a non-zero exit code, so we'd just have to arrange for
the "print" step to be that step.

Anyway, the v3 cover letter of Johannes's series claims that the re-roll
"improves the time to load pages".

I ran both the Firefox and Chrome debugger with performance benchmarks
against:

    https://github.com/git/git/runs/6540394142

And:

    https://github.com/avar/git/runs/6551581584?check_suite_focus=true

The former is what Johannes noted as the correct v3 in
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2205222045130.352@tvgsbejvaqbjf.bet/,
the latter is the current "seen" with ab/ci-github-workflow-markup
reverted, i.e. just my "base" changes.

In Chrome/Firefox the time to load the page (as in the spinner stops,
and we "focus" on the right content) is:

    JS: ~60s / ~80s
    Æ: ~25s / ~18s

This is with Chrome Version 101.0.4951.54 (Official Build) (64-bit) and
Firefox 91.8.0esr (64-bit), both on a Debian Linux x86_64 Dell laptop.

The case of Chrome is quite revealing (since its developer tools seem to
show a better summary). It shows that the "Æ" version spent ~200ms on
"scripting", ~1ms on "rendering", and ~20k ms "idle".

For "JS" that's ~30k ms on "scripting", 15k ms on "rendering", then 7k
ms on "painting" (which is ~0ms in the other one). 7k ms are spent on
"idle".
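
As a quick sanity check of the Chrome breakdown, the per-phase figures
can simply be added up. The following is only arithmetic on the
approximate numbers quoted above; "sum_ms" is a hypothetical helper
introduced for illustration:

```shell
#!/bin/sh
# Sum per-phase timings (in milliseconds) passed as arguments.
# "sum_ms" is a hypothetical helper, not part of any CI script.
sum_ms () {
	total=0
	for ms in "$@"
	do
		total=$((total + ms))
	done
	echo "$total"
}

# Approximate figures from the Chrome summary quoted above.
js_total=$(sum_ms 30000 15000 7000 7000)	# scripting, rendering, painting, idle
ae_total=$(sum_ms 200 1 20000)			# scripting, rendering, idle

echo "JS: ${js_total} ms"
echo "Æ: ${ae_total} ms"
```

This gives 59000 ms for "JS", in line with the ~60s page load, and
20201 ms for "Æ", somewhat below the ~25s measured (the summary quoted
above does not say where the remainder goes).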

So these are basically the same performance results as I reported in
earlier iterations.

I think a v4 of this series really deserves a much less terse cover
letter. I.e. there are specific reports about major slowdowns in the
UX. Any re-roll should really be re-testing those with the same/similar
software and reporting before/after results.

Clearly the primary goal of improving the CI UX should not be to
optimize the rendering of the results as a goal in itself, but in this
case it becomes *so slow* that it precludes certain major use-cases,
such as seeing a failure and being able to view it in some timely
fashion.


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-22 19:10           ` Junio C Hamano
@ 2022-05-23 12:58             ` Johannes Schindelin
  0 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-05-23 12:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer

Hi Junio,

On Sun, 22 May 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I see that this is still with the previous iteration, and therefore
> > exposes the same speed (or slowness) as was investigated so wonderfully by
> > Victoria.
> >
> > So I really do not understand why you pointed to that run, given that it
>
> Simply because your updated version came to my tree a lot after I
> prepared two trees that are otherwise identical for comparison to
> write the message you are responding to.

Oh sorry, I only noticed that your mail came in after I sent the new
iteration, and I incorrectly thought that you had put the patch series
into the Stalled/To-Drop pile, so I assumed that your mail was in response
to my new iteration.

I had missed that you replied to v2 instead of to v3.

> I do not appreciate that you have to always talk back to others in
> such an aggressive tone

I apologize for that. As you might have guessed, it was not my intention
to be aggressive in any way. I merely meant to express my puzzlement, and
your explanation resolved that very nicely.

Thank you,
Dscho

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-23  9:05         ` Ævar Arnfjörð Bjarmason
@ 2022-05-23 18:41           ` Johannes Schindelin
  2022-05-24  8:40             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 98+ messages in thread
From: Johannes Schindelin @ 2022-05-23 18:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer

Hi Ævar,

On Mon, 23 May 2022, Ævar Arnfjörð Bjarmason wrote:

> Re selling point & feature comparison: The point of the ab/* variant was
> to re-roll Johannes's series onto a "base" topic that made much of his
> series unnecessary, because the building up of features to emit GitHub
> markup can be replaced by unrolling things like "make" and "make test"
> to the top-level.
>
> That has its own UX benefits, e.g. you can see at a glance what command
> was run and what the environment was, and "make" and "make test" are now
> split up from one monolithic "build and test" step.
>
> But the primary intention was not to provide a prettier UX, but to show
> that this arrangement made sense. I was hoping that Johannes would reply
> with some variant of "ah, I see what you mean, that does make things
> simpler!" and run with it, but alas...

I believe that we share the goal to make the Git project more welcoming
and easier to navigate for new contributors.

The patch series you wanted me to look at claims to make the CI/PR
definitions/scripts simpler. As it matters more to contributors how to
investigate test failures, i.e. what information they are provided about
the failures, I disagree that that patch series needs to be connected to
my patch series in any way.

Further, the result does not look like a simplification to me. For
example, I consider it an absolute no-go to remove the remnants of Azure
Pipelines support. As I had hinted, and as you saw on the git-security
list, I require this support for embargoed releases. That’s what I did
when working on the patches that made it into v2.35.2. In my book,
removing such vital (if dormant) code is not a simplification, but a
Chesterton’s Fence. While we do not need to use Azure Pipelines for our
regular CI, we definitely need it for embargoed releases. “Simply revert
it back” is not an excuse for removing something that should not be
removed in the first place.

As another example where I have a different concept of what constitutes
“simple”: In Git for Windows’ fork, we carry a patch that integrates the
`git-subtree` tests into the CI builds. This patch touches two places,
`ci/run-build-and-tests.sh` and `ci/run-test-slice.sh`. These changes
would be inherited by any CI definition that uses the scripts in `ci/`.
With the proposed patches, there are four places to patch, and they are
all limited to the GitHub workflow definition. Since you asked me for my
assessment: this is de-DRYing the code, making it more cumbersome instead
of simpler.

In other words, I have fundamental objections about the approach and about
tying it to the patches that improve the output of Git’s CI/PR runs.

> In Chrome/Firefox the time to load the page (as in the spinner stops,
> and we "focus" on the right content) is:
>
>     JS: ~60s / ~80s
>     Æ: ~25s / ~18s

My focus is on the experience of occasional and new contributors who need
to investigate test failures in the CI/PR runs. In this thread, we already
discussed the balance between speed of loading the page on the one hand
and how well the reader is guided toward the relevant parts on the other
hand. I disagree with you that the former should be prioritized over the
latter; on the contrary, guiding the readers along a path to success is
much more important than optimizing for a quick page load.

Most contributors who chimed in seemed to not mind a longer page load time
anyway, as long as the result would help them identify quickly what causes
the test failures. Besides, the page load times are only likely to become
better anyway, as GitHub engineers continuously improve Actions.

Ciao,
Dscho

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-22 23:27         ` Junio C Hamano
@ 2022-05-23 18:55           ` Junio C Hamano
  2022-05-23 19:21             ` Johannes Schindelin
  0 siblings, 1 reply; 98+ messages in thread
From: Junio C Hamano @ 2022-05-23 18:55 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Victoria Dye, Johannes Schindelin via GitGitGadget, git,
	Eric Sunshine, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> I do not want to taint other folks' eyes with my observations, so I'd
> send my impression in a separate message as a response to this
> message after waiting for some time.

Between the previous and the latest iteration of Johannes's topic, the
test output got a lot shorter by discarding the "ok" output and keeping
only the failures and skips.  Because the readers are mostly
interested in seeing failures (they can download the full log if
they want to), this design decision probably makes sense to me.
The same "while scrolling, the user has to stare into the gray void
for several seconds" is still there and needs a bit of getting used
to (I do not know if it is a browser's problem, or something the
output can help with to give a better user experience---the lines in
the folded part may probably not be "counted" correctly or something
silly like that).

The ones with the topic from Ævar last night, as I've mentioned
already, lacked the main part of the logic, and it wouldn't have
worked correctly because there was a show-stopper bug in one of the
steps in it.  With that fixed, the "extra click" I complained about last
night seems to be gone.  I guess the same "discard the test steps
that successfully ran" trick would give us the same "shorter"
output.  I observe the same "staring into the gray void while
scrolling" when it comes to the print-test-failures output, just as
in the output from Johannes's topic.

Common to both approaches, folding output from each test piece
to one line (typically "ok" but sometimes "failed" heading) may be
the source of UI responsiveness irritation I have been observing.
With the removal of all "ok" pieces, I wonder whether it may make
sense not to fold anything and instead give a flat "here are the
traces of all failed and skipped tests".

In any case, either implementation seems to give us a good
improvement over what is in 'master'.

Thanks.

* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-23 18:55           ` Junio C Hamano
@ 2022-05-23 19:21             ` Johannes Schindelin
  0 siblings, 0 replies; 98+ messages in thread
From: Johannes Schindelin @ 2022-05-23 19:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer

Hi Junio,

On Mon, 23 May 2022, Junio C Hamano wrote:

> [...] the test output got a lot shorter by discarding the "ok" output
> and keeping only the failures and skips.  Because the readers are mostly
> interested in seeing failures (they can download the full log if
> they want to), this design decision probably makes sense to me.

For the record, Victoria suggested grouping by file rather than by failed
test case.

However, I do speak from a lot of experience diagnosing test failures in
CI/PR runs when I say: it is frequently very helpful to have a look at one
failed test case at a time. I'd much rather suffer a minor lag while
scrolling than having to find the boundaries manually, in particular when
`test_expect_failure` test cases are present (which are reported as
"broken" in the current iteration instead of "failed").

Besides, the scroll issue is probably similar between both approaches to
grouping (and may be independent of the grouping, as you pointed out by
reporting similar issues in the current `print-test-failures` step), and
is something I hope the Actions engineers are working on.

> Common to both approaches, folding output from each test piece
> to one line (typically "ok" but sometimes "failed" heading) may be
> the source of UI responsiveness irritation I have been observing,
> but I wonder, with the removal of all "ok" pieces, it may make sense
> not to fold anything and instead give a flat "here are the traces of
> all failed and skipped tests".

As I mentioned above, I'd rather keep the grouping by failed test case.

Obviously, the ideal way to decide would be to set up some A/B testing
with real people, but I have no way to set up anything like that.

> In any case, either implementation seems to give us a good improvement
> over what is in 'master'.

There are two things I would like to add:

- In the current iteration's summary page, you will see the failed test
  cases' titles in the errors, and they are clickable (and will get you to
  the corresponding part of the logs). I find this very convenient.

- The addition of the suggestion to look at the run's artifacts for the
  full logs might not look like a big deal, but I bet that it will help in
  particular new contributors. This was yet another great suggestion by
  Victoria.
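
The grouping and the clickable errors discussed here rely on GitHub
Actions workflow commands ("::group::", "::endgroup::" and "::error::")
printed to the job's log. Below is a minimal sketch of that idea, not
the code of this patch series; "emit_test_case" and its arguments are
hypothetical names chosen for illustration:

```shell
#!/bin/sh
# emit_test_case <status> <title> <log-file>
#
# Hypothetical helper: print nothing for passing test cases, and wrap
# the log of any other test case in a collapsible group. Failures are
# additionally reported as errors so that they show up as annotations.
emit_test_case () {
	status=$1 title=$2 log=$3
	case "$status" in
	ok)
		# Successful test cases are skipped to keep the log short.
		return 0
		;;
	failed)
		echo "::error::failed: $title"
		;;
	esac
	echo "::group::$status: $title"
	cat "$log"
	echo "::endgroup::"
}
```

Calling, say, emit_test_case failed "some title" path/to/log prints one
error line followed by the log wrapped in a group, while the "ok" case
prints nothing at all.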

Thanks,
Dscho


* Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful
  2022-05-23 18:41           ` Johannes Schindelin
@ 2022-05-24  8:40             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-24  8:40 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Victoria Dye,
	Johannes Schindelin via GitGitGadget, git, Eric Sunshine,
	Derrick Stolee, Emily Shaffer


On Mon, May 23 2022, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 23 May 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> Re selling point & feature comparison: The point of the ab/* variant was
>> to re-roll Johannes's series onto a "base" topic that made much of his
>> series unnecessary, because the building up of features to emit GitHub
>> markup can be replaced by unrolling things like "make" and "make test"
>> to the top-level.
>>
>> That has its own UX benefits, e.g. you can see at a glance what command
>> was run and what the environment was, and "make" and "make test" are now
>> split up from one monolithic "build and test" step.
>>
>> But the primary intention was not to provide a prettier UX, but to show
>> that this arrangement made sense. I was hoping that Johannes would reply
>> with some variant of "ah, I see what you mean, that does make things
>> simpler!" and run with it, but alas...
>
> I believe that we share the goal to make the Git project more welcoming
> and easier to navigate for new contributors.

Yes, definitely.

> The patch series you wanted me to look at claims to make the CI/PR
> definitions/scripts simpler. As it matters more to contributors how to
> investigate test failures, i.e. what information they are provided about
> the failures, I disagree that that patch series needs to be connected to
> my patch series in any way.

Our two sets of patches change different parts of the CI UX, so no. The
set of patches I've been proposing isn't just making CI/PR
definitions/scripts simpler, although it also does that.

So e.g. in your patches you need to massage the CI output to split the
"build" step from the "test" step. As you can see in an earlier RFC
re-roll of them on top of my topic, that's something you'd get for free:
https://lore.kernel.org/git/RFC-cover-v5-00.10-00000000000-20220421T183001Z-avarab@gmail.com/

> Further, the result does not look like a simplification to me. For
> example, I consider it an absolute no-go to remove the remnants of Azure
> Pipelines support. As I had hinted, and as you saw on the git-security
> list, I require this support for embargoed releases. That’s what I did
> when working on the patches that made it into v2.35.2. In my book,
> removing such vital (if dormant) code is not a simplification, but a
> Chesterton’s Fence. While we do not need to use Azure Pipelines for our
> regular CI, we definitely need it for embargoed releases. “Simply revert
> it back” is not an excuse for removing something that should not be
> removed in the first place.

Can you please reply to this 3 month old and still-waiting-on-your-reply
E-Mail on this topic so we can figure out a way forward with this:
https://lore.kernel.org/git/220222.86y2236ndp.gmgdl@evledraar.gmail.com/

> As another example where I have a different concept of what constitutes
> “simple”: In Git for Windows’ fork, we carry a patch that integrates the
> `git-subtree` tests into the CI builds. This patch touches two places,
> `ci/run-build-and-tests.sh` and `ci/run-test-slice.sh`. These changes
> would be inherited by any CI definition that uses the scripts in `ci/`.
> With the proposed patches, there are four places to patch, and they are
> all limited to the GitHub workflow definition. Since you asked me for my
> assessment: this is de-DRYing the code, making it more cumbersome instead
> of simpler.

No, you'd still have two places to patch:

 1. The top-level Makefile to have "make test" run those subtree tests
    depending on some flag, i.e. the same as your
    ci/run-build-and-tests.sh.

 2. ci/run-test-slice.sh as before (which is only needed for the
    Windows-specific tests).

Because we'd be having the Makefile drive the logic you could also run
such a "make test" locally, which is something we should have
anyway. E.g. when I build my own git I run the subtree tests, and would
like to eventually make "run contrib tests too" some configurable
option.

So it is exactly the DRY principle. By avoiding making things needlessly
CI-specific we can just control this behavior with flags, both in and
outside CI.
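
A minimal sketch of such flag-controlled behavior, written here as a
shell wrapper rather than as Makefile rules. It assumes a hypothetical
RUN_CONTRIB_TESTS knob (not an existing Git build option) and that
contrib/subtree's Makefile provides a "test" target; MAKE can be
overridden, e.g. with "echo", for a dry run:

```shell
#!/bin/sh
# run_tests: run the regular test suite and, only when the
# hypothetical RUN_CONTRIB_TESTS knob is set to a non-empty value,
# the contrib/subtree tests as well. The same function could be
# called locally and from any CI definition.
run_tests () {
	${MAKE:-make} test || return 1
	if test -n "$RUN_CONTRIB_TESTS"
	then
		${MAKE:-make} -C contrib/subtree test || return 1
	fi
}

# Example invocations (not executed here):
#   run_tests                         # regular tests only
#   RUN_CONTRIB_TESTS=1 run_tests     # also run the subtree tests
```

The point of the design is that the decision lives in one place and is
driven by a flag, so nothing about it is specific to one CI system.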

> In other words, I have fundamental objections about the approach and about
> tying it to the patches that improve the output of Git’s CI/PR runs.

I would too if after my series you needed to patch every place we run
"make test" or whatever to run your subtree tests, but as noted above
that's not the case. So hopefully this addresses that.

More generally: I noted a while ago that if you pointed out issues like
that I'd be happy to address them for you.  Based on this I see
d08496f2c40 (ci: run `contrib/subtree` tests in CI builds, 2021-08-05),
and that would be easy to generalize.

>> In Chrome/Firefox the time to load the page (as in the spinner stops,
>> and we "focus" on the right content) is:
>>
>>     JS: ~60s / ~80s
>>     Æ: ~25s / ~18s
>
> My focus is on the experience of occasional and new contributors who need
> to investigate test failures in the CI/PR runs. In this thread, we already
> discussed the balance between speed of loading the page on the one hand
> and how well the reader is guided toward the relevant parts on the other
> hand.

First, your re-roll claims that it "improves the time to load pages",
but based on the sort of testing I'd done before when I reported the
severe slowness introduced by this topic I can't reproduce that.

So how exactly are you testing the performance of these load times, and
can you share the numbers you have for master, your previous iteration
and this re-roll?

> I disagree with you that the former should be prioritized over the
> latter; on the contrary, guiding the readers along a path to success is
> much more important than optimizing for a quick page load.

I think a better UX is certainly worth some cost to load times, so I'm
not trying to be difficult in saying that this costs us some
milliseconds so it's a no-go.

But really, this is making it so slow that it's borderline unusable.

The main way I use this interface is that I'll get an E-Mail with a
failure report, or see the "X" in the UX and click through to the
failure, then see the logs etc, and hopefully be able to see from that
what's wrong, or how I could begin to reproduce it.

Right now that's fast enough that I'll do that all in one browser
click-through session, but what if I have to wait *more than a minute*
vs. the current 10-20 seconds (which is already quite bad)?

Your latest series also seems to be either buggy or to trigger some bug
in GitHub Actions: even after that minute you'll see almost nothing on
your screen. So a user who doesn't know the UX would end up waiting
much longer than that.

You seemingly need to know that it's done when it shows you that blank
screen, and trigger a re-render by scrolling up or down, which will show
you your actual failures.

That's not an issue I saw in any iteration of this before this v3.

> Most contributors who chimed in seemed to not mind a longer page load time
> anyway, as long as the result would help them identify quickly what causes
> the test failures.

Wasn't much of that discussion a follow-up to your initial demos of this
topic?

I don't think those were as slow as what I'm pointing out above, which I
think is just because those failures happened to involve far fewer
lines of log. The slowness seems to be correlated with how many lines
we're dealing with in total.

> Besides, the page load times are only likely to become
> better anyway, as GitHub engineers continuously improve Actions.

Sure, and if this were all magically made better by GH engineers these
concerns would be addressed.

But right now that isn't the case, and we don't know if/when that would
happen, so we need to review these proposed changes on the basis of how
they'd change the current GitHub CI UX overall.

* Re: [PATCH v3 08/12] ci(github): skip the logs of the successful test cases
  2022-05-21 22:18     ` [PATCH v3 08/12] ci(github): skip the logs of the successful test cases Johannes Schindelin via GitGitGadget
@ 2022-05-24 10:47       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 98+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-24 10:47 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Eric Sunshine, Phillip Wood, Victoria Dye, Johannes Schindelin


On Sat, May 21 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> [...]
> Co-authored-by: Victoria Dye <vdye@github.com>

Missing SOB here for Victoria.

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

Thread overview: 98+ messages
2022-01-24 18:56 [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin via GitGitGadget
2022-01-24 18:56 ` [PATCH 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
2022-01-24 18:56 ` [PATCH 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
2022-01-24 23:22   ` Eric Sunshine
2022-01-25 14:34     ` Johannes Schindelin
2022-01-24 18:56 ` [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-01-25 23:48   ` Ævar Arnfjörð Bjarmason
2022-01-24 18:56 ` [PATCH 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
2022-02-23 12:13   ` Phillip Wood
2022-02-25 13:40     ` Johannes Schindelin
2022-01-24 18:56 ` [PATCH 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
2022-01-26  0:10   ` Ævar Arnfjörð Bjarmason
2022-01-24 18:56 ` [PATCH 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
2022-01-24 18:56 ` [PATCH 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-01-24 18:56 ` [PATCH 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
2022-01-24 18:56 ` [PATCH 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
2022-01-26  0:25 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Ævar Arnfjörð Bjarmason
2022-01-27 16:31 ` CI "grouping" within jobs v.s. lighter split-out jobs (was: [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful) Ævar Arnfjörð Bjarmason
2022-02-19 23:46 ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Johannes Schindelin
2022-02-20  2:44   ` Junio C Hamano
2022-02-20 15:25     ` Johannes Schindelin
2022-02-21  8:09       ` Ævar Arnfjörð Bjarmason
2022-02-22 10:26         ` Johannes Schindelin
2022-02-20 12:47   ` Ævar Arnfjörð Bjarmason
2022-02-22 10:30     ` Johannes Schindelin
2022-02-22 13:31       ` Ævar Arnfjörð Bjarmason
2022-02-23 12:07         ` Phillip Wood
2022-02-25 12:39           ` Ævar Arnfjörð Bjarmason
2022-02-25 14:10           ` Johannes Schindelin
2022-02-25 18:16             ` Junio C Hamano
2022-02-26 18:43               ` Junio C Hamano
2022-03-01  2:59                 ` Junio C Hamano
2022-03-01  6:35                   ` Junio C Hamano
2022-03-01 10:18                   ` Johannes Schindelin
2022-03-01 16:52                     ` Junio C Hamano
2022-03-01 10:10                 ` Johannes Schindelin
2022-03-01 16:57                   ` Junio C Hamano
2022-03-01 10:20               ` Johannes Schindelin
2022-03-04  7:38               ` win+VS environment has "cut" but not "paste"? Junio C Hamano
2022-03-04  9:04                 ` Ævar Arnfjörð Bjarmason
2022-03-07 15:51                   ` Johannes Schindelin
2022-03-07 17:05                     ` Junio C Hamano
2022-03-09 13:02                       ` Johannes Schindelin
2022-03-10 15:23                         ` Ævar Arnfjörð Bjarmason
2022-03-07 15:48                 ` Johannes Schindelin
2022-03-07 16:58                   ` Junio C Hamano
2022-03-02 10:58             ` [PATCH 0/9] ci: make Git's GitHub workflow output much more helpful Phillip Wood
2022-03-07 16:07               ` Johannes Schindelin
2022-03-07 17:11                 ` Junio C Hamano
2022-03-09 11:44                   ` Ævar Arnfjörð Bjarmason
2022-03-07 17:12                 ` Phillip Wood
2022-03-01 10:24 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 1/9] ci: fix code style Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 2/9] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 4/9] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 5/9] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 6/9] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 7/9] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 8/9] ci: use `--github-workflow-markup` " Johannes Schindelin via GitGitGadget
2022-03-01 10:24   ` [PATCH v2 9/9] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget
2022-03-01 19:07   ` [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful Junio C Hamano
2022-03-02 12:22   ` Ævar Arnfjörð Bjarmason
2022-03-07 15:57     ` Johannes Schindelin
2022-03-07 16:05       ` Ævar Arnfjörð Bjarmason
2022-03-07 17:36         ` Junio C Hamano
2022-03-09 10:56           ` Ævar Arnfjörð Bjarmason
2022-03-09 13:20         ` Johannes Schindelin
2022-03-09 19:39           ` Junio C Hamano
2022-03-09 19:47           ` Ævar Arnfjörð Bjarmason
2022-03-25  0:48   ` Victoria Dye
2022-03-25  9:02     ` Ævar Arnfjörð Bjarmason
2022-03-25 18:38       ` Victoria Dye
2022-05-21 21:42     ` Johannes Schindelin
2022-05-21 23:05       ` Junio C Hamano
2022-05-22 18:48         ` Johannes Schindelin
2022-05-22 19:10           ` Junio C Hamano
2022-05-23 12:58             ` Johannes Schindelin
2022-05-22 23:27         ` Junio C Hamano
2022-05-23 18:55           ` Junio C Hamano
2022-05-23 19:21             ` Johannes Schindelin
2022-05-23  9:05         ` Ævar Arnfjörð Bjarmason
2022-05-23 18:41           ` Johannes Schindelin
2022-05-24  8:40             ` Ævar Arnfjörð Bjarmason
2022-05-21 22:18   ` [PATCH v3 00/12] " Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 01/12] ci: fix code style Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 02/12] tests: refactor --write-junit-xml code Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 03/12] test(junit): avoid line feeds in XML attributes Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 04/12] ci/run-build-and-tests: take a more high-level view Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 05/12] ci: make it easier to find failed tests' logs in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 06/12] ci/run-build-and-tests: add some structure to the GitHub workflow output Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 07/12] ci: optionally mark up output in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 08/12] ci(github): skip the logs of the successful test cases Johannes Schindelin via GitGitGadget
2022-05-24 10:47       ` Ævar Arnfjörð Bjarmason
2022-05-21 22:18     ` [PATCH v3 09/12] ci(github): avoid printing test case preamble twice Victoria Dye via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 10/12] ci: use `--github-workflow-markup` in the GitHub workflow Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 11/12] ci(github): mention where the full logs can be found Johannes Schindelin via GitGitGadget
2022-05-21 22:18     ` [PATCH v3 12/12] ci: call `finalize_test_case_output` a little later Johannes Schindelin via GitGitGadget