How to abort a test?

Message ID 5697DABA.8010008@redhat.com
State New, archived
Headers

Commit Message

Pedro Alves Jan. 14, 2016, 5:28 p.m. UTC
  On 01/14/2016 05:13 PM, Simon Marchi wrote:
> On 16-01-14 11:25 AM, Pedro Alves wrote:
>>> The test will be aborted, runtest will output a detailed error, but the test will still
>>> pass.  Intuitively, I would think that a test that throws an error should automatically
>>> be failed or unresolved, since something unexpected happened.
>>
>> IIUC, you wouldn't want pass/fail to even be reached, so I don't understand
>> what test would you want to fail.
> 
> I think unresolved would be appropriate in that case, since it's not the program under test
> failed, but the test, or its setup/environment.

OK.  I'd say "testcase" to disambiguate with a particular pass/fail invocation.

> 
> According to the Dejagnu doc, unresolved: "... usually means the test did not execute as
> expected, and a human being must go over results to determine if it passed or failed (and
> to improve the test case). ".  I think that applies here.
> 
> So when there is an uncaught exception coming from the test, Dejagnu could issue a:
> 
>   unresolved "Uncaught error from test " # needs a better message
> 
> The important part is that the runtest command returns a failure return code, so that
> automated builds or scripts consider the run as failed.  To me, a test that ends with
> an exception is not a test that ran successfully.
> 
>>>
>>> The only option I see right now would be to fix the whole return chain and add proper
>>> error handling everywhere, to exit early when an error happens.  However, that means
>>> changing tens (hundreds?) of callsites through the testsuite, which is why I'm
>>> looking for alternative solutions first.
>>
>> Test usually return early if running to main fails (if ![runto_main]), so maybe it
>> won't be that many places.  Maybe some judiciously placed "return -code return"
>> hacks saves you a good chunk.  I don't have any other idea.
> 
> Many tests use gdb_run_cmd directly without checking the result:
> 
> gdb/testsuite $ grep -rin '^gdb_run_cmd$' gdb.* | wc -l
> 73
> 
> So they would need to be changed:
> 
> -gdb_run_cmd
> +if ![gdb_run_cmd] {
> +    fail "Failed to run"
> +}
> 

That's where the "return -code return" hack would come in.  You'd
do that inside gdb_run_cmd so that would return directly to gdb_run_cmd's
caller.  But it's an ugly hack.

>> For the particular case of gdbserver not being present in the target,
>> probably the easiest would be to check that earlier, likely before
>> runtest, even.  Not ideal, since the testsuite can mix native and gdbserver
>> tests, for instance, but...
> 
> And it's a bit hard to check in the case of a remote target, given that it's runtest
> that knows how to "compute" the remote hostname, username, expected gdbserver path,
> etc, from the --target_board.

I meant, before runtest, the dejagnu procedure, not before invoking the
runtest binary.  E.g., straight from the target board file, say.

E.g., with:


then I get:

$ time make check RUNTESTFLAGS="--target_board=native-gdbserver"
make[1]: Entering directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
Nothing to be done for all...
make check-single
make[2]: Entering directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
rootme=`pwd`; export rootme; srcdir=/home/pedro/gdb/mygit/src/gdb/testsuite ; export srcdir ; EXPECT=`if [ "${READ1}" != "" ] ; then echo ${rootme}/expect-read1; elif [ -f ${rootme}/../../expect/expect ] ; then echo ${rootme}/../../expect/expect ; else echo expect ; fi` ; export EXPECT ; EXEEXT= ; export EXEEXT ; LD_LIBRARY_PATH=$rootme/../../expect:$rootme/../../libstdc++:$rootme/../../tk/unix:$rootme/../../tcl/unix:$rootme/../../bfd:$rootme/../../opcodes:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; if [ -f ${rootme}/../../expect/expect ] ; then TCL_LIBRARY=${srcdir}/../../tcl/library ; export TCL_LIBRARY ; fi ; runtest --target_board=native-gdbserver
Test Run By pedro on Thu Jan 14 17:24:37 2016
Native configuration is x86_64-pc-linux-gnu

                === gdb tests ===

Schedule of variations:
    native-gdbserver

Running target native-gdbserver
Using /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/native-gdbserver.exp as board description file for target.
Using /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp as board description file for target.
ERROR: tcl error sourcing board description file for target /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp.
gdbserver not present on target!
gdbserver not present on target!
    while executing
"error "gdbserver not present on target!""
    (file "/home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp" line 36)
    invoked from within
"source /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 source /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp"
    invoked from within
"catch "uplevel #0 source ${dir}/${initfile}" error"
make[2]: *** [check-single] Error 1
make[2]: Leaving directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
make[1]: *** [check] Error 2
make[1]: Leaving directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
make: *** [check] Error 2

real    0m0.404s
user    0m0.336s
sys     0m0.060s

Obviously not ideal; I'm just pointing out the likely easiest.

Thanks,
Pedro Alves
  

Comments

Simon Marchi Jan. 14, 2016, 7:43 p.m. UTC | #1
On 16-01-14 12:28 PM, Pedro Alves wrote:
>> So they would need to be changed:
>>
>> -gdb_run_cmd
>> +if ![gdb_run_cmd] {
>> +    fail "Failed to run"
>> +}
>>
> 
> That's where the "return -code return" hack would come in.  You'd
> do that inside gdb_run_cmd so that would return directly to gdb_run_cmd's
> caller.  But it's an ugly hack.

Now that's some tcl dark magic.

>>> For the particular case of gdbserver not being present in the target,
>>> probably the easiest would be to check that earlier, likely before
>>> runtest, even.  Not ideal, since the testsuite can mix native and gdbserver
>>> tests, for instance, but...
>>
>> And it's a bit hard to check in the case of a remote target, given that it's runtest
>> that knows how to "compute" the remote hostname, username, expected gdbserver path,
>> etc, from the --target_board.
> 
> I meant, before runtest, the dejagnu procedure, not before invoking the
> runtest binary.  E.g., straight from the target board file, say.
> 
> E.g., with:
> 
> --- c/gdb/testsuite/boards/gdbserver-base.exp
> +++ w/gdb/testsuite/boards/gdbserver-base.exp
> @@ -33,6 +33,8 @@ set_board_info gdb,predefined_tsv "\\\$trace_timestamp"
> 
>  set GDBFLAGS "${GDBFLAGS} -ex \"set auto-connect-native-target off\""
> 
> +error "gdbserver not present on target!"
> +
>  proc ${board}_file { dest op args } {
>      if { $op == "delete" } {
>         return 0
> 
> then I get:
> 
> $ time make check RUNTESTFLAGS="--target_board=native-gdbserver"
> make[1]: Entering directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
> Nothing to be done for all...
> make check-single
> make[2]: Entering directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
> rootme=`pwd`; export rootme; srcdir=/home/pedro/gdb/mygit/src/gdb/testsuite ; export srcdir ; EXPECT=`if [ "${READ1}" != "" ] ; then echo ${rootme}/expect-read1; elif [ -f ${rootme}/../../expect/expect ] ; then echo ${rootme}/../../expect/expect ; else echo expect ; fi` ; export EXPECT ; EXEEXT= ; export EXEEXT ; LD_LIBRARY_PATH=$rootme/../../expect:$rootme/../../libstdc++:$rootme/../../tk/unix:$rootme/../../tcl/unix:$rootme/../../bfd:$rootme/../../opcodes:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; if [ -f ${rootme}/../../expect/expect ] ; then TCL_LIBRARY=${srcdir}/../../tcl/library ; export TCL_LIBRARY ; fi ; runtest --target_board=native-gdbserver
> Test Run By pedro on Thu Jan 14 17:24:37 2016
> Native configuration is x86_64-pc-linux-gnu
> 
>                 === gdb tests ===
> 
> Schedule of variations:
>     native-gdbserver
> 
> Running target native-gdbserver
> Using /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/native-gdbserver.exp as board description file for target.
> Using /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp as board description file for target.
> ERROR: tcl error sourcing board description file for target /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp.
> gdbserver not present on target!
> gdbserver not present on target!
>     while executing
> "error "gdbserver not present on target!""
>     (file "/home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp" line 36)
>     invoked from within
> "source /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp"
>     ("uplevel" body line 1)
>     invoked from within
> "uplevel #0 source /home/pedro/gdb/mygit/src/gdb/testsuite/boards/../boards/gdbserver-base.exp"
>     invoked from within
> "catch "uplevel #0 source ${dir}/${initfile}" error"
> make[2]: *** [check-single] Error 1
> make[2]: Leaving directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/pedro/gdb/mygit/build/gdb/testsuite'
> make: *** [check] Error 2
> 
> real    0m0.404s
> user    0m0.336s
> sys     0m0.060s
> 
> Obviously not ideal; I'm just pointing out the likely easiest.

Hmm if I try to put a call to

  [remote_file target exists "/home/simark/nonexistent"]

It always returns true somehow.  When I dig, I find this error:

  {rsh to target failed for test -f /home/simark/nonexistent, rsh: Could not resolve hostname target: Name or service not known}

At this point I think the target board is not set (we are currently parsing it), so we
probably can't do operations on it.

Anyway, the point is not to fix it for my particular case, but to make the general
testsuite more robust.  I think I'll send a patch for Dejagnu for the "unresolved"
solution, and see what the maintainers think.

Thanks for you help Pedro.

Simon
  
Pedro Alves Jan. 14, 2016, 7:48 p.m. UTC | #2
On 01/14/2016 07:43 PM, Simon Marchi wrote:
> On 16-01-14 12:28 PM, Pedro Alves wrote:

> Hmm if I try to put a call to
> 
>   [remote_file target exists "/home/simark/nonexistent"]
> 
> It always returns true somehow.  When I dig, I find this error:
> 
>   {rsh to target failed for test -f /home/simark/nonexistent, rsh: Could not resolve hostname target: Name or service not known}
> 
> At this point I think the target board is not set (we are currently parsing it), so we
> probably can't do operations on it.

Yeah, sounds like it.

> 
> Anyway, the point is not to fix it for my particular case, but to make the general
> testsuite more robust.  I think I'll send a patch for Dejagnu for the "unresolved"
> solution, and see what the maintainers think.

If you're willing to do the leg work, that's fantastic, of course.
  
Simon Marchi Jan. 15, 2016, 4:34 p.m. UTC | #3
On 16-01-14 02:48 PM, Pedro Alves wrote:
>> Anyway, the point is not to fix it for my particular case, but to make the general
>> testsuite more robust.  I think I'll send a patch for Dejagnu for the "unresolved"
>> solution, and see what the maintainers think.
> 
> If you're willing to do the leg work, that's fantastic, of course.

I feel completely stupid now.  Just under the line I pasted in my original message,
there is this comment:

1474         if { [catch "uplevel #0 source $test_file_name"] == 1 } {
1475             # If we have a Tcl error, propogate the exit status do make
1476             # notices the error.
1477             global exit_status exit_error
1478             # exit error is set by a command line option
1479             if { $exit_status == 0 } {
1480                 set exit_status $exit_error
1481             }

After searching where this goes, I saw the --status argument to runtest:

       --status
              Set the exit status to fail on Tcl errors.

Turns out we just need to add this to our runtest invocation to make it return non-zero
when a test driver, as it calls the test .exp's, fails.  So we just need to add this to
our call to runtest (will send a patch later).  Sometimes it can be worth it to read the
man page first...
  
Ben Elliston Jan. 15, 2016, 8:22 p.m. UTC | #4
On Fri, Jan 15, 2016 at 11:34:56AM -0500, Simon Marchi wrote:

> 1475             # If we have a Tcl error, propogate the exit status do make
> 1476             # notices the error.

I'll fix the error in the comment, too.

Cheers, Ben
  
Joel Brobecker Jan. 17, 2016, 8:27 a.m. UTC | #5
Hi Simon,

> I feel completely stupid now.  Just under the line I pasted in my
> original message, there is this comment:
> 
> 1474         if { [catch "uplevel #0 source $test_file_name"] == 1 } {
> 1475             # If we have a Tcl error, propogate the exit status do make
> 1476             # notices the error.
> 1477             global exit_status exit_error
> 1478             # exit error is set by a command line option
> 1479             if { $exit_status == 0 } {
> 1480                 set exit_status $exit_error
> 1481             }
> 
> After searching where this goes, I saw the --status argument to runtest:
> 
>        --status
>               Set the exit status to fail on Tcl errors.
> 
> Turns out we just need to add this to our runtest invocation to make
> it return non-zero when a test driver, as it calls the test .exp's,
> fails.  So we just need to add this to our call to runtest (will send
> a patch later).  Sometimes it can be worth it to read the man page
> first...

I admit that some of it flying a little over my head... Does it
mean that we will be able to abort a test if we detect a timeout?
This is one of the things I wanted to do, but didn't have a solution
for, so it'd be fantastic if it helped with that!
  
Simon Marchi Jan. 18, 2016, 4:43 p.m. UTC | #6
On 16-01-17 03:27 AM, Joel Brobecker wrote:
> I admit that some of it flying a little over my head... Does it
> mean that we will be able to abort a test if we detect a timeout?
> This is one of the things I wanted to do, but didn't have a solution
> for, so it'd be fantastic if it helped with that!

--status does no really impact how a timeout is handled.  What happens
when a timeout occurs in gdb_test is defined by us.  Right now, it
produces a test "fail":

testsuite/lib/gdb.exp:
  937         timeout {
  938             if ![string match "" $message] then {
  939                 fail "$message (timeout)"
  940             }
  941             set result 1
  942         }

So you see, for example:

  FAIL: gdb.base/break.exp: some test (timeout)

and the test case carries on.  A gdb_test timeout is no different than
a regular fail.  If you replace the "fail" at line 939 with "error",
which throws a TCL exception, then the current test case will abort,
although in this quite ugly way:

  Running /home/emaisin/src/binutils-gdb/gdb/testsuite/gdb.base/break.exp ...
  ERROR OCCURED: some test (timeout)
      while executing
  "error "$message (timeout)""
      invoked from within
  "if ![string match "" $message] then {
  		error "$message (timeout)"
  	    }"
      invoked from within
  "expect {
  -i exp8 -timeout 1
  	-re ".*A problem internal to GDB has been detected" {
  	    fail "$message (GDB internal error)"
  	    gdb_internal_error_..."
      ("uplevel" body line 1)
      invoked from within
  "uplevel $body" NONE insert breakpoint (timeout)WARNING: remote_expect statement without a default case?!

The current test case is stopped and the next .exp is ran.  However,
if you don't use --status on the runtest command line, the uncaught
exception will not be considered as an error.  So if you use
"make check" in a script, you won't notice the failure.
  
Joel Brobecker Jan. 21, 2016, 10:42 a.m. UTC | #7
> --status does no really impact how a timeout is handled.  What happens
> when a timeout occurs in gdb_test is defined by us.  Right now, it
> produces a test "fail":
> 
> testsuite/lib/gdb.exp:
>   937         timeout {
>   938             if ![string match "" $message] then {
>   939                 fail "$message (timeout)"
>   940             }
>   941             set result 1
>   942         }
> 
> So you see, for example:
> 
>   FAIL: gdb.base/break.exp: some test (timeout)
> 
> and the test case carries on.  A gdb_test timeout is no different than
> a regular fail.  If you replace the "fail" at line 939 with "error",
> which throws a TCL exception, then the current test case will abort,
> although in this quite ugly way:

Argh. I was hoping that we would be able to add something at the end
of the timeout handler to abort, but in a nice way.

So, IIUC, without help from dejagnu, there is no way we know of
to abort testcases...

Thanks for having looked into this!
  

Patch

--- c/gdb/testsuite/boards/gdbserver-base.exp
+++ w/gdb/testsuite/boards/gdbserver-base.exp
@@ -33,6 +33,8 @@  set_board_info gdb,predefined_tsv "\\\$trace_timestamp"

 set GDBFLAGS "${GDBFLAGS} -ex \"set auto-connect-native-target off\""

+error "gdbserver not present on target!"
+
 proc ${board}_file { dest op args } {
     if { $op == "delete" } {
        return 0