* [BUG] question mark in GIT_SKIP_TESTS is broken in master @ 2021-06-15 20:55 Andrei Rybak 2021-06-15 23:28 ` [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) Andrei Rybak 0 siblings, 1 reply; 17+ messages in thread From: Andrei Rybak @ 2021-06-15 20:55 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git Hi, I've taken an example of how to use GIT_SKIP_TESTS directly from t/README: $ GIT_SKIP_TESTS='t[0-4]??? t91?? t9200.8' make and noticed that t0000 is still being run. I've written a script to bisect the issue: #/bin/sh ( cd t GIT_SKIP_TESTS='t[0-4]??? t91?? t9200.8' make | \ head | grep 'SKIP skip all tests in t0000' ) and it bisected to edc23840b0 (test-lib: bring $remove_trash out of retirement, 2021-05-10) which was merged to master in 2019256717 ("Merge branch 'ab/test-lib-updates'", 2021-06-14). Note, that spelling out the tests number in full works, i.e. $ GIT_SKIP_TESTS=t0000 make won't run t0000. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) 2021-06-15 20:55 [BUG] question mark in GIT_SKIP_TESTS is broken in master Andrei Rybak @ 2021-06-15 23:28 ` Andrei Rybak 2021-06-16 3:28 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Andrei Rybak @ 2021-06-15 23:28 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git On 15/06/2021 22:55, Andrei Rybak wrote: > Hi, > > I've taken an example of how to use GIT_SKIP_TESTS directly from > t/README: > > $ GIT_SKIP_TESTS='t[0-4]??? t91?? t9200.8' make > > and noticed that t0000 is still being run. I've written a script to > bisect the issue: > > > #/bin/sh > > ( > cd t > GIT_SKIP_TESTS='t[0-4]??? t91?? t9200.8' make | \ > head | grep 'SKIP skip all tests in t0000' > ) > > > and it bisected to edc23840b0 (test-lib: bring $remove_trash out of > retirement, 2021-05-10) which was merged to master in 2019256717 ("Merge > branch 'ab/test-lib-updates'", 2021-06-14). > > Note, that spelling out the tests number in full works, i.e. > > $ GIT_SKIP_TESTS=t0000 make > > won't run t0000. I sent this originally with incorrect subject. Question marks, e.g. $ GIT_SKIP_TESTS='t000?' make work. It is the range expressions, like [0-9] that are broken: $ GIT_SKIP_TESTS='t[0-9]000' make ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) 2021-06-15 23:28 ` [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) Andrei Rybak @ 2021-06-16 3:28 ` Junio C Hamano 2021-06-16 4:12 ` Junio C Hamano 2021-06-16 8:24 ` [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs Ævar Arnfjörð Bjarmason 0 siblings, 2 replies; 17+ messages in thread From: Junio C Hamano @ 2021-06-16 3:28 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, Andrei Rybak; +Cc: git Andrei Rybak <rybak.a.v@gmail.com> writes: > I sent this originally with incorrect subject. Question marks, e.g. > > $ GIT_SKIP_TESTS='t000?' make > > work. It is the range expressions, like [0-9] that are broken: > > $ GIT_SKIP_TESTS='t[0-9]000' make This reproduces for me, and $ GIT_SKIP_TESTS=t?000' sh t0000-basic.sh $ GIT_SKIP_TESTS=t0?00' sh t0000-basic.sh $ GIT_SKIP_TESTS=t00?0' sh t0000-basic.sh tells me that "Question marks work" above is not exactly correct. It is an exception that this $ GIT_SKIP_TESTS=t000?' sh t0000-basic.sh happens to skip. Interestingly enough, edc23840b0 (test-lib: bring $remove_trash out of retirement, 2021-05-10) cleanly reverts without being depended on by anything else in the series. Ævar? Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) 2021-06-16 3:28 ` Junio C Hamano @ 2021-06-16 4:12 ` Junio C Hamano 2021-06-16 8:29 ` Jeff King 2021-06-16 8:24 ` [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2021-06-16 4:12 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Andrei Rybak, git Junio C Hamano <gitster@pobox.com> writes: > Interestingly enough, edc23840b0 (test-lib: bring $remove_trash out > of retirement, 2021-05-10) cleanly reverts without being depended on > by anything else in the series. > > Ævar? With the crude debugging aid patch (attached at the end) applied, running $ GIT_SKIP_TESTS='t?000' sh -x t0000-basic.sh -v will show something interesting in the trace. ++ this_test=t0000 ++ _s_k_i_p_='t?000' ++ match_pattern_list t0000 t5000 The variable $GIT_SKIP_TESTS on this line: if match_pattern_list "$this_test" $GIT_SKIP_TESTS globs to t5000. We don't quote the variable because we want them separated at $IFS boundaries, but we didn't want the glob specials in its value to take any effect. Sigh. The reason why edc23840b0 appears to break this is probably because we are still in $TEST_DIRECTORY when this match_pattern_list is executed; before that change, we've created $TRASH_DIRECTORY and chdir'd there already, and when we check "do we want to skip all?", there is nothing for the glob to match. That also explains why GIT_SKIP_TESTS="t000?" appears to work. There is no such filesystem entity directly in $TEST_DIRECTORY. $ echo t000? t00?0 t0?00 t?000 t000? t00?0 t0200 t5000 diff --git i/t/test-lib.sh w/t/test-lib.sh index 54938c6427..8ee0540532 100644 --- i/t/test-lib.sh +++ w/t/test-lib.sh @@ -1346,13 +1346,17 @@ fi remove_trash= this_test=${0##*/} this_test=${this_test%%-*} +_s_k_i_p_=$GIT_SKIP_TESTS + if match_pattern_list "$this_test" $GIT_SKIP_TESTS then say_color info >&3 "skipping test $this_test altogether" skip_all="skip all tests in $this_test" test_done fi +exit + # Last-minute variable setup HOME="$TRASH_DIRECTORY" GNUPGHOME="$HOME/gnupg-home-not-used" ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) 2021-06-16 4:12 ` Junio C Hamano @ 2021-06-16 8:29 ` Jeff King 2021-06-16 9:19 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2021-06-16 8:29 UTC (permalink / raw) To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Andrei Rybak, git On Wed, Jun 16, 2021 at 01:12:09PM +0900, Junio C Hamano wrote: > The variable $GIT_SKIP_TESTS on this line: > > if match_pattern_list "$this_test" $GIT_SKIP_TESTS > > globs to t5000. We don't quote the variable because we want them > separated at $IFS boundaries, but we didn't want the glob specials > in its value to take any effect. Sigh. Good find. It's surprisingly hard to do field-splitting without pathname globbing in pure shell. I couldn't find a way without using "set -f". That's in POSIX, but it feels funny tweaking a global that can effect how other code runs. We can at least constraint it to a subshell close to the point of use: diff --git a/t/test-lib.sh b/t/test-lib.sh index 54938c6427..093104d04f 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -732,14 +732,15 @@ match_pattern_list () { arg="$1" shift test -z "$*" && return 1 - for pattern_ + (set -f + for pattern_ in $* do case "$arg" in $pattern_) - return 0 + exit 0 esac done - return 1 + exit 1) } match_test_selector_list () { @@ -848,7 +849,7 @@ maybe_teardown_verbose () { last_verbose=t maybe_setup_verbose () { test -z "$verbose_only" && return - if match_pattern_list $test_count $verbose_only + if match_pattern_list $test_count "$verbose_only" then exec 4>&2 3>&1 # Emit a delimiting blank line when going from @@ -878,7 +879,7 @@ maybe_setup_valgrind () { return fi GIT_VALGRIND_ENABLED= - if match_pattern_list $test_count $valgrind_only + if match_pattern_list $test_count "$valgrind_only" then GIT_VALGRIND_ENABLED=t fi @@ -1006,7 +1007,7 @@ test_finish_ () { test_skip () { to_skip= skipped_reason= - if match_pattern_list $this_test.$test_count $GIT_SKIP_TESTS + if match_pattern_list $this_test.$test_count "$GIT_SKIP_TESTS" then to_skip=t skipped_reason="GIT_SKIP_TESTS" @@ -1346,7 +1347,7 @@ fi remove_trash= this_test=${0##*/} this_test=${this_test%%-*} -if match_pattern_list "$this_test" $GIT_SKIP_TESTS +if match_pattern_list "$this_test" "$GIT_SKIP_TESTS" then say_color info >&3 "skipping test $this_test altogether" skip_all="skip all tests in $this_test" If that isn't portable for some reason, I think we could fall back on splitting with an external tool. You can't feed the result as a function argument (you run into the same problem!) but you can use "read" to split on newlines, like: echo "$GIT_SKIP_TESTS" | tr ' ' '\n' | while read pattern do echo "got $pattern" done That does put the shell code on the right-hand side of a pipe, which means it's constrained in terms of setting variables, etc. But that would be acceptable for our use here. I dunno. Maybe somebody else can come up with something more clever (or maybe I am just missing something obvious). -Peff ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) 2021-06-16 8:29 ` Jeff King @ 2021-06-16 9:19 ` Junio C Hamano 0 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2021-06-16 9:19 UTC (permalink / raw) To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Andrei Rybak, git Jeff King <peff@peff.net> writes: > It's surprisingly hard to do field-splitting without pathname globbing > in pure shell. I couldn't find a way without using "set -f". That's in > POSIX, but it feels funny tweaking a global that can effect how other > code runs. We can at least constraint it to a subshell close to the > point of use: > ... > - for pattern_ > + (set -f > + for pattern_ in $* > do > case "$arg" in > $pattern_) > - return 0 > + exit 0 > esac > done > - return 1 > + exit 1) > } Nice. "set -f" is what I wanted to find myself but couldn't, when I wrote the message you are responding to. > -if match_pattern_list "$this_test" $GIT_SKIP_TESTS > +if match_pattern_list "$this_test" "$GIT_SKIP_TESTS" OK. 'for pattern_ in $*' that flattens $* allows us to quote it here, passing it as a single argument without globbing. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 3:28 ` Junio C Hamano 2021-06-16 4:12 ` Junio C Hamano @ 2021-06-16 8:24 ` Ævar Arnfjörð Bjarmason 2021-06-16 8:36 ` Jeff King 1 sibling, 1 reply; 17+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-06-16 8:24 UTC (permalink / raw) To: git Cc: Junio C Hamano, Andrei Rybak, Đoàn Trần Công Danh, Ævar Arnfjörð Bjarmason My edc23840b0b (test-lib: bring $remove_trash out of retirement, 2021-05-10) caused a regression where we'd fail to handle GIT_SKIP_TESTS variable content like 't????', since we were calling match_pattern_list() from the t/ directory not the trash directory. We'd thus glob match in the directory containing our tests, whereas before we'd do the match in our newly setup (and empty, aside from the .git) trash directory. This bug reveals a more general problem though, which I'm attempting to solve here: We have other users of match_pattern_list() that have always been broken, namely the --verbose-only=* and --valgrind-only=* options added in ff09af3fb8f (test-lib: verbose mode for only tests matching a pattern, 2013-06-23) and 5dfc368f5ec (test-lib: valgrind for only tests matching a pattern, 2013-06-23). Those options will try to match an argument like --verbose-only=1* against a test number like "10", but since we run the match in our own trash directory an earlier test creating a file like "1one.txt" will break that option. We cannot simply quote the $GIT_SKIP_TESTS" being passed to match_pattern_list(), since we are relying on the $IFS semantics. Let's instead setup a .test-lib-trash subdirectory under the trash directory, and an "empty-dir" directory under that. Then let's run the match_pattern_list() in a sub-shell in that directory. This fixes the regression in my edc23840b0b, as well as the bugs we've had in the --{verbose,valgrind}-only=* options. I have tests for this, but they're a pain to support without my in-flight lib-subtest.sh changes, I'll submit them once those land. This has the added benefit of providing a new clean work area for tests in general, so functions in test-lib-functions.sh can use it for their own scratch files, instead of potentially tripping up the test logic itself. The drawback is that any tests that cares about the complete cleanliness of its test are might need to be adjusted, as one ls-files test here does. I think it's still worth it because that'll be one special case (and a dotfile), as opposed to every test potentially tripping over the likes of "expected.config" on a glob match of "*". 1. https://lore.kernel.org/git/1d003cac-83fa-0b63-f60e-55513ac45cf9@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- On Wed, Jun 16 2021, Junio C Hamano wrote: > Interestingly enough, edc23840b0 (test-lib: bring $remove_trash out > of retirement, 2021-05-10) cleanly reverts without being depended on > by anything else in the series. I think reverting edc23840b0 might be the best short-term fix to avoid dealing with the breakage + get a quick fix down. I can confirm that nothing else in the series relies on it. This patch is an attempt at a more general solution to the problem, which depending on what you think you might want to take instead. It fixes not only the immediate regression, but as noted other existing bugs in match_pattern_list() usage. t/t3000-ls-files-others.sh | 8 ++++++-- t/test-lib-functions.sh | 8 +++++--- t/test-lib.sh | 40 +++++++++++++++++++++++--------------- 3 files changed, 35 insertions(+), 21 deletions(-) diff --git a/t/t3000-ls-files-others.sh b/t/t3000-ls-files-others.sh index 740ce56eab5..8c62da502df 100755 --- a/t/t3000-ls-files-others.sh +++ b/t/t3000-ls-files-others.sh @@ -37,6 +37,7 @@ test_expect_success 'setup: expected output' ' cat >expected1 <<-\EOF && expected1 expected2 + expected2.noempty expected3 output path0 @@ -47,7 +48,10 @@ test_expect_success 'setup: expected output' ' sed -e "s|path2/file2|path2/|" <expected1 >expected2 && cp expected2 expected3 && - echo path4/ >>expected2 + echo path4/ >>expected2 && + + echo .test-lib-trash/ >expected2.noempty && + cat expected2 >>expected2.noempty ' test_expect_success 'ls-files --others' ' @@ -57,7 +61,7 @@ test_expect_success 'ls-files --others' ' test_expect_success 'ls-files --others --directory' ' git ls-files --others --directory >output && - test_cmp expected2 output + test_cmp expected2.noempty output ' test_expect_success '--no-empty-directory hides empty directory' ' diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index f0448daa74b..33697b0df81 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1040,10 +1040,12 @@ test_cmp_config () { GD="-C $1" && shift fi && - printf "%s\n" "$1" >expect.config && + printf "%s\n" "$1" >"$TRASH_DIRECTORY_TEST_LIB"/expect.config && shift && - git $GD config "$@" >actual.config && - test_cmp expect.config actual.config + git $GD config "$@" >"$TRASH_DIRECTORY_TEST_LIB"/actual.config && + test_cmp \ + "$TRASH_DIRECTORY_TEST_LIB"/expect.config \ + "$TRASH_DIRECTORY_TEST_LIB"/actual.config } # test_cmp_bin - helper to compare binary files diff --git a/t/test-lib.sh b/t/test-lib.sh index 54938c64279..7f284f56b1d 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -260,6 +260,8 @@ case "$TRASH_DIRECTORY" in /*) ;; # absolute path is good *) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$TRASH_DIRECTORY" ;; esac +TRASH_DIRECTORY_TEST_LIB="$TRASH_DIRECTORY/.test-lib-trash" +TRASH_DIRECTORY_TEST_LIB_EMPTY="$TRASH_DIRECTORY_TEST_LIB/empty-dir" # If --stress was passed, run this test repeatedly in several parallel loops. if test "$GIT_TEST_STRESS_STARTED" = "done" @@ -848,7 +850,8 @@ maybe_teardown_verbose () { last_verbose=t maybe_setup_verbose () { test -z "$verbose_only" && return - if match_pattern_list $test_count $verbose_only + if (cd "$TRASH_DIRECTORY_TEST_LIB_EMPTY" && + match_pattern_list $test_count $verbose_only) then exec 4>&2 3>&1 # Emit a delimiting blank line when going from @@ -878,7 +881,8 @@ maybe_setup_valgrind () { return fi GIT_VALGRIND_ENABLED= - if match_pattern_list $test_count $valgrind_only + if (cd "$TRASH_DIRECTORY_TEST_LIB_EMPTY" && + match_pattern_list $test_count $valgrind_only) then GIT_VALGRIND_ENABLED=t fi @@ -1006,7 +1010,8 @@ test_finish_ () { test_skip () { to_skip= skipped_reason= - if match_pattern_list $this_test.$test_count $GIT_SKIP_TESTS + if (cd "$TRASH_DIRECTORY_TEST_LIB_EMPTY" && + match_pattern_list $this_test.$test_count $GIT_SKIP_TESTS) then to_skip=t skipped_reason="GIT_SKIP_TESTS" @@ -1177,7 +1182,7 @@ test_done () { esac fi - if test -z "$debug" && test -n "$remove_trash" + if test -z "$debug" then test -d "$TRASH_DIRECTORY" || error "Tests passed but trash directory already removed before test cleanup; aborting" @@ -1342,11 +1347,22 @@ then exit 1 fi +# Test repository +rm -fr "$TRASH_DIRECTORY" || { + GIT_EXIT_OK=t + echo >&5 "FATAL: Cannot prepare test area" + exit 1 +} + +# Set up an early work area for the test code itself +mkdir -p "$TRASH_DIRECTORY_TEST_LIB_EMPTY" >&3 2>&4 || + error "cannot create test-lib.sh trash directory" + # Are we running this test at all? -remove_trash= this_test=${0##*/} this_test=${this_test%%-*} -if match_pattern_list "$this_test" $GIT_SKIP_TESTS +if (cd "$TRASH_DIRECTORY_TEST_LIB_EMPTY" && + match_pattern_list "$this_test" $GIT_SKIP_TESTS) then say_color info >&3 "skipping test $this_test altogether" skip_all="skip all tests in $this_test" @@ -1358,20 +1374,12 @@ HOME="$TRASH_DIRECTORY" GNUPGHOME="$HOME/gnupg-home-not-used" export HOME GNUPGHOME -# Test repository -rm -fr "$TRASH_DIRECTORY" || { - GIT_EXIT_OK=t - echo >&5 "FATAL: Cannot prepare test area" - exit 1 -} - -remove_trash=t +# "We have the $TRASH_DIRECTORY already, but let's create a +# $TRASH_DIRECTORY/.git if test -z "$TEST_NO_CREATE_REPO" then git init "$TRASH_DIRECTORY" >&3 2>&4 || error "cannot run git init" -else - mkdir -p "$TRASH_DIRECTORY" fi # Use -P to resolve symlinks in our working directory so that the cwd -- 2.32.0.555.g0268d380f7b ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 8:24 ` [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs Ævar Arnfjörð Bjarmason @ 2021-06-16 8:36 ` Jeff King 2021-06-16 9:22 ` Junio C Hamano 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason 0 siblings, 2 replies; 17+ messages in thread From: Jeff King @ 2021-06-16 8:36 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16, 2021 at 10:24:13AM +0200, Ævar Arnfjörð Bjarmason wrote: > Those options will try to match an argument like --verbose-only=1* > against a test number like "10", but since we run the match in our own > trash directory an earlier test creating a file like "1one.txt" will > break that option. > > We cannot simply quote the $GIT_SKIP_TESTS" being passed to > match_pattern_list(), since we are relying on the $IFS semantics. > > Let's instead setup a .test-lib-trash subdirectory under the trash > directory, and an "empty-dir" directory under that. Then let's run the > match_pattern_list() in a sub-shell in that directory. Looks like my email just crossed with this one. Your "cd to an empty directory" is a fun version of my "maybe somebody can think of something clever" statement. :) As a general solution, it does fail if the globs may contain things that look like absolute paths, but that is quite unlikely for our use case here. Still, I kind of like the "set -f" version because it doesn't need the extra directory which could cause problems with "ls-files -o", etc, as you mentioned. You could also create the empty directory on the fly, though if "set -f" works portably, that seems less complicated to me. Whatever the expansion mechanism, I do think it's worth having callers quote "$GIT_SKIP_TESTS" and then performing the expansion within match_pattern_list. Then the nasty mechanics are all in that one place. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 8:36 ` Jeff King @ 2021-06-16 9:22 ` Junio C Hamano 2021-06-16 10:23 ` Jeff King 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2021-06-16 9:22 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, git, Andrei Rybak, Đoàn Trần Công Danh Jeff King <peff@peff.net> writes: > ... Still, I kind of like the "set -f" version because it doesn't need > the extra directory which could cause problems with "ls-files -o", etc, > as you mentioned. You could also create the empty directory on the fly, > though if "set -f" works portably, that seems less complicated to me. FWIW, I share that. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 9:22 ` Junio C Hamano @ 2021-06-16 10:23 ` Jeff King 2021-06-16 10:24 ` Jeff King 2021-06-16 11:38 ` Ævar Arnfjörð Bjarmason 0 siblings, 2 replies; 17+ messages in thread From: Jeff King @ 2021-06-16 10:23 UTC (permalink / raw) To: Junio C Hamano Cc: Ævar Arnfjörð Bjarmason, git, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16, 2021 at 06:22:10PM +0900, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > ... Still, I kind of like the "set -f" version because it doesn't need > > the extra directory which could cause problems with "ls-files -o", etc, > > as you mentioned. You could also create the empty directory on the fly, > > though if "set -f" works portably, that seems less complicated to me. > > FWIW, I share that. Here it is with some cosmetic cleanups and a commit message. I don't mean to preempt further discussion if Ævar prefers another route, but I want to make sure we didn't stall out. I'm a little curious whether the bug could be triggered before the recent move to running match_pattern_list outside of the trash directory (i.e., whether there is some test that creates a sufficiently confusing-looking file in the filesystem; t0000 would be a likely candidate). But not curious enough to comb through the test suite looking for candidates. :) -- >8 -- Subject: [PATCH] test-lib: avoid accidental globbing in match_pattern_list() We have a custom match_pattern_list() function which we use for matching test names (like "t1234") against glob-like patterns (like "t1???") for $GIT_SKIP_TESTS, --verbose-only, etc. Those patterns may have multiple whitespace-separated elements (e.g., "t0* t1234 t5?78"). The callers of match_pattern_list thus pass the strings unquoted, so that the shell does the usual field-splitting into separate arguments. But this also means the shell will do the usual globbing for each argument, which can result in us seeing an expansion based on what's in the filesystem, rather than the real pattern. For example, if I have the path "t5000" in the filesystem, and you feed the pattern "t?0000", that _should_ match the string "t0000", but it won't after the shell has expanded it to "t5000". This has been a bug ever since that function was introduced. But it didn't usually trigger since we typically use the function inside the trash directory, which has a very limited set of files that are unlikely to match. It became a lot easier to trigger after edc23840b0 (test-lib: bring $remove_trash out of retirement, 2021-05-10), because now we match $GIT_SKIP_TESTS before even entering the trash directory. So the t5000 example above can be seen with: GIT_SKIP_TESTS=t?000 ./t0000-basic.sh which should skip all tests but doesn't. We can fix this by using "set -f" to ask the shell not to glob (which is in POSIX, so should hopefully be portable enough). We only want to do this in a subshell (to avoid polluting the rest of the script), which means we need to get the whole string intact into the match_pattern_list function by quoting it. Arguably this is a good idea anyway, since it makes it much more obvious that we intend to split, and it's not simply sloppy scripting. Diagnosed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> --- t/test-lib.sh | 34 ++++++++++++++++++++++------------ 1 file changed, 22 insertions(+), 12 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index 54938c6427..a7badbf1dd 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -732,14 +732,24 @@ match_pattern_list () { arg="$1" shift test -z "$*" && return 1 - for pattern_ - do - case "$arg" in - $pattern_) - return 0 - esac - done - return 1 + # We need to use "$*" to get field-splitting, but we want to + # disable globbing, since we are matching against an arbitrary + # $arg, not what's in the filesystem. Using "set -f" accomplishes + # that, but we must do it in a subshell to avoid impacting the + # rest of the script. The exit value of the subshell becomes + # the function's return value. + ( + set -f + for pattern_ in $* + do + case "$arg" in + $pattern_) + exit 0 + ;; + esac + done + exit 1 + ) } match_test_selector_list () { @@ -848,7 +858,7 @@ maybe_teardown_verbose () { last_verbose=t maybe_setup_verbose () { test -z "$verbose_only" && return - if match_pattern_list $test_count $verbose_only + if match_pattern_list $test_count "$verbose_only" then exec 4>&2 3>&1 # Emit a delimiting blank line when going from @@ -878,7 +888,7 @@ maybe_setup_valgrind () { return fi GIT_VALGRIND_ENABLED= - if match_pattern_list $test_count $valgrind_only + if match_pattern_list $test_count "$valgrind_only" then GIT_VALGRIND_ENABLED=t fi @@ -1006,7 +1016,7 @@ test_finish_ () { test_skip () { to_skip= skipped_reason= - if match_pattern_list $this_test.$test_count $GIT_SKIP_TESTS + if match_pattern_list $this_test.$test_count "$GIT_SKIP_TESTS" then to_skip=t skipped_reason="GIT_SKIP_TESTS" @@ -1346,7 +1356,7 @@ fi remove_trash= this_test=${0##*/} this_test=${this_test%%-*} -if match_pattern_list "$this_test" $GIT_SKIP_TESTS +if match_pattern_list "$this_test" "$GIT_SKIP_TESTS" then say_color info >&3 "skipping test $this_test altogether" skip_all="skip all tests in $this_test" -- 2.32.0.559.g998c91e3d7 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 10:23 ` Jeff King @ 2021-06-16 10:24 ` Jeff King 2021-06-16 11:38 ` Ævar Arnfjörð Bjarmason 1 sibling, 0 replies; 17+ messages in thread From: Jeff King @ 2021-06-16 10:24 UTC (permalink / raw) To: Junio C Hamano Cc: Ævar Arnfjörð Bjarmason, git, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16, 2021 at 06:23:07AM -0400, Jeff King wrote: > But this also means the shell will do the usual globbing for each > argument, which can result in us seeing an expansion based on what's in > the filesystem, rather than the real pattern. For example, if I have the > path "t5000" in the filesystem, and you feed the pattern "t?0000", that > _should_ match the string "t0000", but it won't after the shell has > expanded it to "t5000". Whoops, that pattern should be "t?000", of course. It's correct in the runnable example later on. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 10:23 ` Jeff King 2021-06-16 10:24 ` Jeff King @ 2021-06-16 11:38 ` Ævar Arnfjörð Bjarmason 2021-06-16 11:50 ` Jeff King 1 sibling, 1 reply; 17+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-06-16 11:38 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, git, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16 2021, Jeff King wrote: > On Wed, Jun 16, 2021 at 06:22:10PM +0900, Junio C Hamano wrote: > >> Jeff King <peff@peff.net> writes: >> >> > ... Still, I kind of like the "set -f" version because it doesn't need >> > the extra directory which could cause problems with "ls-files -o", etc, >> > as you mentioned. You could also create the empty directory on the fly, >> > though if "set -f" works portably, that seems less complicated to me. >> >> FWIW, I share that. > > Here it is with some cosmetic cleanups and a commit message. I don't > mean to preempt further discussion if Ævar prefers another route, but I > want to make sure we didn't stall out. I mildly prefer mine as noted in https://lore.kernel.org/git/87pmwmxd6f.fsf@evledraar.gmail.com/; but I'd mainly prefer just to fix the immediate breakage on master, so whatever variant of reverting, taking yours or mine Junio prefers I'm fine with. Just inline comments on the patch: > @@ -732,14 +732,24 @@ match_pattern_list () { > arg="$1" > shift > test -z "$*" && return 1 > - for pattern_ > - do > - case "$arg" in > - $pattern_) > - return 0 > - esac > - done > - return 1 > + # We need to use "$*" to get field-splitting, but we want to > + # disable globbing, since we are matching against an arbitrary > + # $arg, not what's in the filesystem. Using "set -f" accomplishes > + # that, but we must do it in a subshell to avoid impacting the > + # rest of the script. The exit value of the subshell becomes > + # the function's return value. > + ( > + set -f > + for pattern_ in $* > + do > + case "$arg" in > + $pattern_) > + exit 0 > + ;; > + esac > + done > + exit 1 > + ) > } Why not just start with a ret=1, set ret=0 if we have a match and break from the loop, and then do a "set +f" afterwards? I.e. is there an actual need for the subshell here. I'm mildly paranoid about a new "set -<flag>" in the codebase for vague fears of portability (as noted in my linked message), but whatever shell supports "set -<flag>" surely supports the inverse with "set +<flag>", no? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 11:38 ` Ævar Arnfjörð Bjarmason @ 2021-06-16 11:50 ` Jeff King 2021-06-17 0:29 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2021-06-16 11:50 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Junio C Hamano, git, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16, 2021 at 01:38:55PM +0200, Ævar Arnfjörð Bjarmason wrote: > > + # We need to use "$*" to get field-splitting, but we want to > > + # disable globbing, since we are matching against an arbitrary > > + # $arg, not what's in the filesystem. Using "set -f" accomplishes > > + # that, but we must do it in a subshell to avoid impacting the > > + # rest of the script. The exit value of the subshell becomes > > + # the function's return value. > > + ( > > + set -f > > + for pattern_ in $* > > + do > > + case "$arg" in > > + $pattern_) > > + exit 0 > > + ;; > > + esac > > + done > > + exit 1 > > + ) > > } > > Why not just start with a ret=1, set ret=0 if we have a match and break > from the loop, and then do a "set +f" afterwards? I.e. is there an > actual need for the subshell here. My thought was that the subshell takes us back to the original state, regardless of what it was. As opposed to "set +f" which takes us back to a particular state. But it is unlikely that we'd have done a global "set -f" before calling this, so maybe that is being overly conservative. > I'm mildly paranoid about a new "set -<flag>" in the codebase for vague > fears of portability (as noted in my linked message), but whatever shell > supports "set -<flag>" surely supports the inverse with "set +<flag>", > no? Yes, I think we can assume that if it supports one, it supports the other. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 11:50 ` Jeff King @ 2021-06-17 0:29 ` Junio C Hamano 0 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2021-06-17 0:29 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, git, Andrei Rybak, Đoàn Trần Công Danh Jeff King <peff@peff.net> writes: > My thought was that the subshell takes us back to the original state, > regardless of what it was. As opposed to "set +f" which takes us back to > a particular state. But it is unlikely that we'd have done a global "set > -f" before calling this, so maybe that is being overly conservative. Overly conservative, yes, but if it is not too much overhead, I think it is a good practice anyway. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 8:36 ` Jeff King 2021-06-16 9:22 ` Junio C Hamano @ 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason 2021-06-16 11:43 ` Jeff King 2021-06-17 0:36 ` Junio C Hamano 1 sibling, 2 replies; 17+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-06-16 9:49 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16 2021, Jeff King wrote: > On Wed, Jun 16, 2021 at 10:24:13AM +0200, Ævar Arnfjörð Bjarmason wrote: > >> Those options will try to match an argument like --verbose-only=1* >> against a test number like "10", but since we run the match in our own >> trash directory an earlier test creating a file like "1one.txt" will >> break that option. >> >> We cannot simply quote the $GIT_SKIP_TESTS" being passed to >> match_pattern_list(), since we are relying on the $IFS semantics. >> >> Let's instead setup a .test-lib-trash subdirectory under the trash >> directory, and an "empty-dir" directory under that. Then let's run the >> match_pattern_list() in a sub-shell in that directory. > > Looks like my email just crossed with this one. Your "cd to an empty > directory" is a fun version of my "maybe somebody can think of something > clever" statement. :) Yeah, I sent it before seeing yours. > As a general solution, it does fail if the globs may contain things that > look like absolute paths, but that is quite unlikely for our use case > here.[...] Not just unlikely, impossible. Yes it's possible in the general case, but in the match_pattern_list we are always normalizing e.g. ./t1234-some-test.sh t t1234, and the other match cases are test numbers etc. It doesn't need to deal with the general case. > [...] Still, I kind of like the "set -f" version because it doesn't need > the extra directory which could cause problems with "ls-files -o", etc, > as you mentioned. You could also create the empty directory on the fly, > though if "set -f" works portably, that seems less complicated to me. Yeah, in isolation I'd agree, but given the desire (with existing code and other in-flight code) to have a scratch are for the test library code itself simply creating such a directory seems like a good thing to have in general, and once we have it we can use the subshell trick, or just "set -f" I suppose and use the scratch dir for other stuff. I am a bit wary of this being our first ever use of "set -f", but maybe it's sufficiently portable. > Whatever the expansion mechanism, I do think it's worth having callers > quote "$GIT_SKIP_TESTS" and then performing the expansion within > match_pattern_list. Then the nasty mechanics are all in that one place. I think it's rather clean to not quote it, i.e. to have the loop get a list of item and then things to match, it would also make it easier to e.g. port it to a native C program. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason @ 2021-06-16 11:43 ` Jeff King 2021-06-17 0:36 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Jeff King @ 2021-06-16 11:43 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Andrei Rybak, Đoàn Trần Công Danh On Wed, Jun 16, 2021 at 11:49:20AM +0200, Ævar Arnfjörð Bjarmason wrote: > > [...] Still, I kind of like the "set -f" version because it doesn't need > > the extra directory which could cause problems with "ls-files -o", etc, > > as you mentioned. You could also create the empty directory on the fly, > > though if "set -f" works portably, that seems less complicated to me. > > Yeah, in isolation I'd agree, but given the desire (with existing code > and other in-flight code) to have a scratch are for the test library > code itself simply creating such a directory seems like a good thing to > have in general, and once we have it we can use the subshell trick, or > just "set -f" I suppose and use the scratch dir for other stuff. I don't have much of an opinion on other uses of the scratch dir, not having seen them. I agree if we do have one and are paying the cost for it already, then using it here isn't a big deal. > I am a bit wary of this being our first ever use of "set -f", but maybe > it's sufficiently portable. Yep, me too. The nice thing about it is that we can swap out the solution pretty easily if it turns out not to be. I don't know how good our coverage of obscure shells is in CI, though (e.g., on your gcc build-farm stuff, are we mostly using system shells?). > > Whatever the expansion mechanism, I do think it's worth having callers > > quote "$GIT_SKIP_TESTS" and then performing the expansion within > > match_pattern_list. Then the nasty mechanics are all in that one place. > > I think it's rather clean to not quote it, i.e. to have the loop get a > list of item and then things to match, it would also make it easier to > e.g. port it to a native C program. My main complaint is that it's a real gotcha for the callers. They have to remember to do this cd-to-scratch (or whatever technique we use). That's cumbersome, and if they forget, their call is wrong in a really subtle way that basic testing isn't likely to uncover. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason 2021-06-16 11:43 ` Jeff King @ 2021-06-17 0:36 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2021-06-17 0:36 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jeff King, git, Andrei Rybak, Đoàn Trần Công Danh Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> Whatever the expansion mechanism, I do think it's worth having callers >> quote "$GIT_SKIP_TESTS" and then performing the expansion within >> match_pattern_list. Then the nasty mechanics are all in that one place. > > I think it's rather clean to not quote it, i.e. to have the loop get a > list of item and then things to match, it would also make it easier to > e.g. port it to a native C program. Sorry, I am not sure how it can work without quoting $GIT_SKIP_TESTS and exposing the details of how we disable globbing when the caller calls match_pattern_list. A list of item in $GIT_SKIP_TESTS would not be able to pass t?000 in it intact to the loop that processes each of the item on the list. for pattern in $GIT_SKIP_TESTS do attempt to match with $pattern done will see t5000 in $pattern due to globbing. It's not like "$@" trick can be applied to any plain variable. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2021-06-17 0:36 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-06-15 20:55 [BUG] question mark in GIT_SKIP_TESTS is broken in master Andrei Rybak 2021-06-15 23:28 ` [BUG] range expressions in GIT_SKIP_TESTS are broken in master (was [BUG] question mark in GIT_SKIP_TESTS is broken in master) Andrei Rybak 2021-06-16 3:28 ` Junio C Hamano 2021-06-16 4:12 ` Junio C Hamano 2021-06-16 8:29 ` Jeff King 2021-06-16 9:19 ` Junio C Hamano 2021-06-16 8:24 ` [PATCH] test-lib: fix "$remove_trash" regression and match_pattern_list() bugs Ævar Arnfjörð Bjarmason 2021-06-16 8:36 ` Jeff King 2021-06-16 9:22 ` Junio C Hamano 2021-06-16 10:23 ` Jeff King 2021-06-16 10:24 ` Jeff King 2021-06-16 11:38 ` Ævar Arnfjörð Bjarmason 2021-06-16 11:50 ` Jeff King 2021-06-17 0:29 ` Junio C Hamano 2021-06-16 9:49 ` Ævar Arnfjörð Bjarmason 2021-06-16 11:43 ` Jeff King 2021-06-17 0:36 ` Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.