All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP
@ 2022-03-10 15:03 Nico Boehr
  2022-03-18  8:46 ` Marc Hartmayer
  2022-03-22 11:05 ` Thomas Huth
  0 siblings, 2 replies; 4+ messages in thread
From: Nico Boehr @ 2022-03-10 15:03 UTC (permalink / raw)
  To: kvm; +Cc: mhartmay, frankja, thuth, pbonzini

When we have crashes, timeouts or aborts, there is currently no indication for
this in the TAP output. When all reports() up to this point succeeded, this
might result in a TAP file looking completely fine even though things went
terribly wrong.

For example, when I set the timeout for the diag288 test on s390x to 1 second,
it fails because it takes quite long, which is properly indicated in the
normal output:

$ ./run_tests.sh diag288
FAIL diag288 (timeout; duration=1s)

But, when I enable TAP output, I get this:

$ ./run_tests.sh -t diag288
TAP version 13
ok 1 - diag288: diag288: privileged: Program interrupt: expected(2) == received(2)
ok 2 - diag288: diag288: specification: uneven: Program interrupt: expected(6) == received(6)
ok 3 - diag288: diag288: specification: unsupported action: Program interrupt: expected(6) == received(6)
ok 4 - diag288: diag288: specification: unsupported function: Program interrupt: expected(6) == received(6)
ok 5 - diag288: diag288: specification: no init: Program interrupt: expected(6) == received(6)
ok 6 - diag288: diag288: specification: min timer: Program interrupt: expected(6) == received(6)
1..6

Which looks like a completely fine TAP file, but actually we ran into a timeout
and didn't even run all tests.

With this patch, we get an additional line at the end which properly shows
something went wrong:

not ok 7 - diag288: timeout; duration=1s

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
---
 scripts/runtime.bash | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/scripts/runtime.bash b/scripts/runtime.bash
index 6d5fced94246..b41b3d444e27 100644
--- a/scripts/runtime.bash
+++ b/scripts/runtime.bash
@@ -163,9 +163,19 @@ function run()
         print_result "SKIP" $testname "$summary"
     elif [ $ret -eq 124 ]; then
         print_result "FAIL" $testname "" "timeout; duration=$timeout"
+        if [[ $tap_output != "no" ]]; then
+            echo "not ok TEST_NUMBER - ${testname}: timeout; duration=$timeout" >&3
+        fi
     elif [ $ret -gt 127 ]; then
-        print_result "FAIL" $testname "" "terminated on SIG$(kill -l $(($ret - 128)))"
+        signame="SIG"$(kill -l $(($ret - 128)))
+        print_result "FAIL" $testname "" "terminated on $signame"
+        if [[ $tap_output != "no" ]]; then
+            echo "not ok TEST_NUMBER - ${testname}: terminated on $signame" >&3
+        fi
     else
+        if [ $ret -eq 127 ] && [[ $tap_output != "no" ]]; then
+            echo "not ok TEST_NUMBER - ${testname}: aborted" >&3
+        fi
         print_result "FAIL" $testname "$summary"
     fi
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP
  2022-03-10 15:03 [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP Nico Boehr
@ 2022-03-18  8:46 ` Marc Hartmayer
  2022-03-18  9:56   ` Nico Boehr
  2022-03-22 11:05 ` Thomas Huth
  1 sibling, 1 reply; 4+ messages in thread
From: Marc Hartmayer @ 2022-03-18  8:46 UTC (permalink / raw)
  To: Nico Boehr, kvm; +Cc: frankja, thuth, pbonzini

Nico Boehr <nrb@linux.ibm.com> writes:

> When we have crashes, timeouts or aborts, there is currently no indication for
> this in the TAP output.
> When all reports() up to this point succeeded, this
> might result in a TAP file looking completely fine even though things went
> terribly wrong.
>
> For example, when I set the timeout for the diag288 test on s390x to 1 second,
> it fails because it takes quite long, which is properly indicated in the
> normal output:
>
> $ ./run_tests.sh diag288
> FAIL diag288 (timeout; duration=1s)
>
> But, when I enable TAP output, I get this:
>
> $ ./run_tests.sh -t diag288
> TAP version 13
> ok 1 - diag288: diag288: privileged: Program interrupt: expected(2) == received(2)
> ok 2 - diag288: diag288: specification: uneven: Program interrupt: expected(6) == received(6)
> ok 3 - diag288: diag288: specification: unsupported action: Program interrupt: expected(6) == received(6)
> ok 4 - diag288: diag288: specification: unsupported function: Program interrupt: expected(6) == received(6)
> ok 5 - diag288: diag288: specification: no init: Program interrupt: expected(6) == received(6)
> ok 6 - diag288: diag288: specification: min timer: Program interrupt: expected(6) == received(6)
> 1..6
>
> Which looks like a completely fine TAP file, but actually we ran into a timeout
> and didn't even run all tests.
>
> With this patch, we get an additional line at the end which properly shows
> something went wrong:
>
> not ok 7 - diag288: timeout; duration=1s

This results from the fact that the TAP13 test result is generated by
the function `RUNTIME_log_stdout` and not by `print_result` (see commit
6e1d3752d7ca ("tap13: list testcases individually")). In
`RUNTIME_log_stdout` we don’t have access to the QEMU command exit code.
So what you’re doing here is to workaround that fact… I’m not sure how
to fix this properly without refactoring the whole code :/ Maybe Paolo
knows a better fix.

Some small nits below… (I don’t make any comments regarding quoting
since it was already suboptimal in the code).

>
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>  scripts/runtime.bash | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/runtime.bash b/scripts/runtime.bash
> index 6d5fced94246..b41b3d444e27 100644
> --- a/scripts/runtime.bash
> +++ b/scripts/runtime.bash
> @@ -163,9 +163,19 @@ function run()
>          print_result "SKIP" $testname "$summary"
>      elif [ $ret -eq 124 ]; then
>          print_result "FAIL" $testname "" "timeout; duration=$timeout"
> +        if [[ $tap_output != "no" ]]; then
> +            echo "not ok TEST_NUMBER - ${testname}: timeout; duration=$timeout" >&3
> +        fi
>      elif [ $ret -gt 127 ]; then
> -        print_result "FAIL" $testname "" "terminated on SIG$(kill -l $(($ret - 128)))"
> +        signame="SIG"$(kill -l $(($ret - 128)))
> +        print_result "FAIL" $testname "" "terminated on $signame"
> +        if [[ $tap_output != "no" ]]; then
> +            echo "not ok TEST_NUMBER - ${testname}: terminated on $signame" >&3
> +        fi
>      else
> +        if [ $ret -eq 127 ] && [[ $tap_output != "no" ]]; then
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           This is a new case, no? If so please add a separate
           patch creating another `elif` branch.

> +            echo "not ok TEST_NUMBER - ${testname}: aborted" >&3
> +        fi
>          print_result "FAIL" $testname "$summary"
>      fi
>  
> -- 
> 2.31.1

Otherwise, this patch fixes the problem you’ve mentioned - although it
leads to even more fragmented code. But I can't think of a better (easy)
fix right now either.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP
  2022-03-18  8:46 ` Marc Hartmayer
@ 2022-03-18  9:56   ` Nico Boehr
  0 siblings, 0 replies; 4+ messages in thread
From: Nico Boehr @ 2022-03-18  9:56 UTC (permalink / raw)
  To: Marc Hartmayer, kvm; +Cc: frankja, thuth, pbonzini

On Fri, 2022-03-18 at 09:46 +0100, Marc Hartmayer wrote:
> Nico Boehr <nrb@linux.ibm.com> writes:
[...]
> > Which looks like a completely fine TAP file, but actually we ran
> > into a timeout
> > and didn't even run all tests.
> > 
> > With this patch, we get an additional line at the end which
> > properly shows
> > something went wrong:
> > 
> > not ok 7 - diag288: timeout; duration=1s
> 
> This results from the fact that the TAP13 test result is generated by
> the function `RUNTIME_log_stdout` and not by `print_result` (see
> commit
> 6e1d3752d7ca ("tap13: list testcases individually")). In
> `RUNTIME_log_stdout` we don’t have access to the QEMU command exit
> code.

Basically yes. If we had that we could do all the TAP special handling
there.
> 

[...]
> > diff --git a/scripts/runtime.bash b/scripts/runtime.bash
> > index 6d5fced94246..b41b3d444e27 100644
> > --- a/scripts/runtime.bash
> > +++ b/scripts/runtime.bash
> > @@ -163,9 +163,19 @@ function run()
> >          print_result "SKIP" $testname "$summary"
> >      elif [ $ret -eq 124 ]; then
> >          print_result "FAIL" $testname "" "timeout;
> > duration=$timeout"
> > +        if [[ $tap_output != "no" ]]; then
> > +            echo "not ok TEST_NUMBER - ${testname}: timeout;
> > duration=$timeout" >&3
> > +        fi
> >      elif [ $ret -gt 127 ]; then
> > -        print_result "FAIL" $testname "" "terminated on SIG$(kill
> > -l $(($ret - 128)))"
> > +        signame="SIG"$(kill -l $(($ret - 128)))
> > +        print_result "FAIL" $testname "" "terminated on $signame"
> > +        if [[ $tap_output != "no" ]]; then
> > +            echo "not ok TEST_NUMBER - ${testname}: terminated on
> > $signame" >&3
> > +        fi
> >      else
> > +        if [ $ret -eq 127 ] && [[ $tap_output != "no" ]]; then
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>            This is a new case, no? If so please add a separate
>            patch creating another `elif` branch.

Probably depends on what you mean by 'new'. The else branch handles the
test aborting (for example, exception in the guest) _and_ the case of
at least one report failing.

In the latter case, I wanted no additional line in the TAP because we
can already see the failed report there. 

Making the if an elif makes sense, will do that. 

I don't get what you would want to see in a separate patch, can you
please make a pseudocode example?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP
  2022-03-10 15:03 [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP Nico Boehr
  2022-03-18  8:46 ` Marc Hartmayer
@ 2022-03-22 11:05 ` Thomas Huth
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Huth @ 2022-03-22 11:05 UTC (permalink / raw)
  To: Nico Boehr, kvm; +Cc: mhartmay, frankja, pbonzini

On 10/03/2022 16.03, Nico Boehr wrote:
> When we have crashes, timeouts or aborts, there is currently no indication for
> this in the TAP output. When all reports() up to this point succeeded, this
> might result in a TAP file looking completely fine even though things went
> terribly wrong.
> 
> For example, when I set the timeout for the diag288 test on s390x to 1 second,
> it fails because it takes quite long, which is properly indicated in the
> normal output:
> 
> $ ./run_tests.sh diag288
> FAIL diag288 (timeout; duration=1s)
> 
> But, when I enable TAP output, I get this:
> 
> $ ./run_tests.sh -t diag288
> TAP version 13
> ok 1 - diag288: diag288: privileged: Program interrupt: expected(2) == received(2)
> ok 2 - diag288: diag288: specification: uneven: Program interrupt: expected(6) == received(6)
> ok 3 - diag288: diag288: specification: unsupported action: Program interrupt: expected(6) == received(6)
> ok 4 - diag288: diag288: specification: unsupported function: Program interrupt: expected(6) == received(6)
> ok 5 - diag288: diag288: specification: no init: Program interrupt: expected(6) == received(6)
> ok 6 - diag288: diag288: specification: min timer: Program interrupt: expected(6) == received(6)
> 1..6
> 
> Which looks like a completely fine TAP file, but actually we ran into a timeout
> and didn't even run all tests.
> 
> With this patch, we get an additional line at the end which properly shows
> something went wrong:
> 
> not ok 7 - diag288: timeout; duration=1s
> 
> Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
> ---
>   scripts/runtime.bash | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/runtime.bash b/scripts/runtime.bash
> index 6d5fced94246..b41b3d444e27 100644
> --- a/scripts/runtime.bash
> +++ b/scripts/runtime.bash
> @@ -163,9 +163,19 @@ function run()
>           print_result "SKIP" $testname "$summary"
>       elif [ $ret -eq 124 ]; then
>           print_result "FAIL" $testname "" "timeout; duration=$timeout"
> +        if [[ $tap_output != "no" ]]; then

I'd like to avoid "[[" in new code, and the double negation (!= "no) also 
looks a little bit ugly ... could you please replace this line with:

	if [ "$tap_output" = "yes" ]; then

?

> +            echo "not ok TEST_NUMBER - ${testname}: timeout; duration=$timeout" >&3
> +        fi
>       elif [ $ret -gt 127 ]; then
> -        print_result "FAIL" $testname "" "terminated on SIG$(kill -l $(($ret - 128)))"
> +        signame="SIG"$(kill -l $(($ret - 128)))
> +        print_result "FAIL" $testname "" "terminated on $signame"
> +        if [[ $tap_output != "no" ]]; then

dito

> +            echo "not ok TEST_NUMBER - ${testname}: terminated on $signame" >&3
> +        fi
>       else
> +        if [ $ret -eq 127 ] && [[ $tap_output != "no" ]]; then

dito (especially since this mixes [ and [[ in one line)

> +            echo "not ok TEST_NUMBER - ${testname}: aborted" >&3
> +        fi
>           print_result "FAIL" $testname "$summary"
>       fi

As Marc already mentioned, it's indeed a little bit sad that we now have 
parts of the TAP handling in run_tests.sh and some parts in 
scripts/runtime.bash, but I also can't think of a much nicer solution right 
now ... so with this cosmetics fixed:

Reviewed-by: Thomas Huth <thuth@redhat.com>


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-03-22 11:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-10 15:03 [kvm-unit-tests PATCH] runtime: indicate failure on crash/timeout/abort in TAP Nico Boehr
2022-03-18  8:46 ` Marc Hartmayer
2022-03-18  9:56   ` Nico Boehr
2022-03-22 11:05 ` Thomas Huth

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.