Sporadic failures of selftest tests

Message ID 7e8c90cf-c12b-0095-16e5-3dcae94aff6f@redhat.com
State New, archived

Commit Message

Pedro Alves Oct. 17, 2017, 2:24 p.m. UTC
  On 10/17/2017 03:00 PM, Ulrich Weigand wrote:
> Hello,
> 
> I'm now seeing sporadic failures of some of the self-test test cases
> running natively on x86_64.  A failing case looks like:
> 
> (gdb) break captured_command_loop
> Breakpoint 1 at 0x7117b0: file ../../binutils-gdb/gdb/main.c, line 324.
> (gdb) PASS: gdb.gdb/complaints.exp: breakpoint in captured_command_loop

[...]

> Type "apropos word" to search for commands related to "word".
> (gdb) 
> Breakpoint 1, captured_main (data=0x7fffffffd2e0) at ../../binutils-gdb/gdb/main.c:1147
> 1147              captured_command_loop ();
> (gdb) XFAIL: gdb.gdb/complaints.exp: run until breakpoint at captured_command_loop (line numbers scrambled?)
> 
> Just from reading the logs this looks like the gdb_test_multiple in selftest_setup
> gets the GDB prompt from the inferior GDB and assumes it comes from the outer GDB.

I don't think so.  "captured_main" is a function in the inferior
GDB, so that's the superior gdb's prompt.  I.e., the test ran
the inferior GDB to main, well before the inferior GDB could print
a prompt.

> Pedro, I'm not sure if this may have anything to do with your recent testsuite
> changes, but I didn't see any other obvious candidates in the logs either :-)

This one kind of looks related:

commit bf4692711232eb96cd840f96d88897a2746d8190
Author:     Pedro Alves <palves@redhat.com>
AuthorDate: Tue Oct 10 16:45:50 2017 +0100

    Eliminate catch_errors

which did (the selftest-support.exp hunk is shown under "Patch" below):

while above in your log, we still see that the test stopped
in the captured_main function, which has a data parameter:

> Breakpoint 1, captured_main (data=0x7fffffffd2e0) at ../../binutils-gdb/gdb/main.c:1147

However, that means GDB stopped in the totally wrong function, so likely
it'd have failed before too.  

Now the question should be why did GDB stop there, when breakpoint 1
was supposedly set on captured_command_loop ?

 > (gdb) break captured_command_loop
 > Breakpoint 1 at 0x7117b0: file ../../binutils-gdb/gdb/main.c, line 324.
 > (gdb) PASS: gdb.gdb/complaints.exp: breakpoint in captured_command_loop
...
 > (gdb) 
 > Breakpoint 1, captured_main (data=0x7fffffffd2e0) at ../../binutils-gdb/gdb/main.c:1147
 > 1147              captured_command_loop ();

That seems to be the root of the problem.

I wonder whether that's somehow related to the other Power regression
Simon reported:
  https://sourceware.org/ml/gdb-patches/2017-10/msg00444.html

I haven't managed to investigate that one.

Does it reproduce easily for you?  If so, I'd suggest a git bisect to
find the culprit.

> Have you seen this issue before?  How is this supposed to work in the first place?
> Is there anything that would allow the testsuite to distinguish the gdb prompts
> emitted by the two GDBs?

The selftests that need to distinguish the prompts do "set prompt (xgdb)" to
change the prompt of one of the GDBs.  But that's a red herring in this case.
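
To illustrate why a distinct prompt helps, here is a minimal Python sketch.
It is not part of the testsuite (which is Tcl/DejaGnu); it assumes the two
prompts are "(gdb) " and "(xgdb) " and that output is matched with
end-anchored regular expressions.  The names PROMPTS and prompt_owner are
hypothetical.

```python
import re

# Assumption: "(gdb) " is the default prompt and "(xgdb) " is the one
# installed with "set prompt (xgdb)".  Both regexes are end-anchored,
# like the "$gdb_prompt $" idiom used in the testsuite.
PROMPTS = {
    "default": re.compile(r"\(gdb\) $"),
    "renamed": re.compile(r"\(xgdb\) $"),
}


def prompt_owner(buffer):
    """Name the prompt that terminates buffer, or None if neither does."""
    for name, regex in PROMPTS.items():
        if regex.search(buffer):
            return name
    return None


print(prompt_owner("For help, type \"help\".\n(gdb) "))
print(prompt_owner("For help, type \"help\".\n(xgdb) "))
```

Once one GDB has been renamed, a buffer ending in "(xgdb) " can no longer be
mistaken for one ending in "(gdb) ", since the literal "(" must be followed
directly by "gdb".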

Thanks,
Pedro Alves
  

Comments

Pedro Alves Oct. 17, 2017, 2:34 p.m. UTC | #1
On 10/17/2017 03:24 PM, Pedro Alves wrote:

> Now the question should be why did GDB stop there, when breakpoint 1
> was supposedly set on captured_command_loop ?
> 
>  > (gdb) break captured_command_loop
>  > Breakpoint 1 at 0x7117b0: file ../../binutils-gdb/gdb/main.c, line 324.
>  > (gdb) PASS: gdb.gdb/complaints.exp: breakpoint in captured_command_loop
> ...
>  > (gdb) 
>  > Breakpoint 1, captured_main (data=0x7fffffffd2e0) at ../../binutils-gdb/gdb/main.c:1147
>  > 1147              captured_command_loop ();
> 
> That seems to be the root of the problem.
> 
> I wonder whether that's somehow related to the other Power regression
> Simon reported:
>   https://sourceware.org/ml/gdb-patches/2017-10/msg00444.html
> 
> I haven't managed to investigate that one.
> 
> Does it reproduce easily for you?  If so, I'd suggest a git bisect to
> find the culprit.

Wait, is your build of GDB an optimized build?  Maybe the compiler
managed to inline captured_command_loop for you?  Currently, when
GDB stops for an inline breakpoint, it stops at the stack caller,
which would explain this.

Thanks,
Pedro Alves
  
Pedro Alves Oct. 17, 2017, 2:40 p.m. UTC | #2
On 10/17/2017 03:34 PM, Pedro Alves wrote:

> Wait, is your build of GDB an optimized build?  Maybe the compiler
> managed to inline captured_command_loop for you?  Currently, when
> GDB stops for an inline breakpoint, it stops at the stack caller,
> which would explain this.

Yup, I can reproduce this with:

$ rm -f main.o && make CXXFLAGS="-g3 -O2"
$ make check TESTS="*/complaints.exp"
[...]
Running src/gdb/testsuite/gdb.gdb/complaints.exp ...
FAIL: gdb.gdb/complaints.exp: run until breakpoint at captured_command_loop
WARNING: Couldn't test self

Thanks,
Pedro Alves
  
Pedro Alves Oct. 17, 2017, 2:55 p.m. UTC | #3
On 10/17/2017 03:40 PM, Pedro Alves wrote:
> On 10/17/2017 03:34 PM, Pedro Alves wrote:
> 
>> Wait, is your build of GDB an optimized build?  Maybe the compiler
>> managed to inline captured_command_loop for you?  Currently, when
>> GDB stops for an inline breakpoint, it stops at the stack caller,
>> which would explain this.
> 
> Yup, I can reproduce this with:
> 
> $ rm -f main.o && make CXXFLAGS="-g3 -O2"
> $ make check TESTS="*/complaints.exp"
> [...]
> Running src/gdb/testsuite/gdb.gdb/complaints.exp ...
> FAIL: gdb.gdb/complaints.exp: run until breakpoint at captured_command_loop
> WARNING: Couldn't test self

Ah, and in addition to the wrong function hit, I do also see
the prompt issue then.  Here's what I just saw in a manual run:

(top-gdb) b captured_command_loop
Breakpoint 3 at 0x71ee60: file src/gdb/main.c, line 324.
(top-gdb) r
Starting program: build/gdb/gdb -data-directory=data-directory
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffed131700 (LWP 4358)]
[New Thread 0x7fffec930700 (LWP 4359)]
[New Thread 0x7fffec12f700 (LWP 4360)]
[New Thread 0x7fffeb92e700 (LWP 4361)]
[New Thread 0x7fffeb12d700 (LWP 4362)]
[New Thread 0x7fffea92c700 (LWP 4363)]
[New Thread 0x7fffea12b700 (LWP 4364)]
GNU gdb (GDB) 8.0.50.20171017-git
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)                  <<<<<< PROMPT HERE
Thread 1 "gdb" hit Breakpoint 3, captured_main (data=<optimized out>) at src/gdb/main.c:1147
1147              captured_command_loop ();
(top-gdb) 

Note the "PROMPT HERE" line.

So indeed, that inferior gdb prompt can confuse gdb_test_multiple.
That prompt is only output in optimized builds, because in that
case, due to inlining, the breakpoint happens to trigger _after_
captured_command_loop prints the prompt...  In non-optimized builds,
the prompt is _not_ output.
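
One way to picture why the extra prompt makes the failure sporadic is the
rough Python model below.  It is only a sketch under assumptions: it
imitates expect-style matching (patterns are retried, in order, each time
new output is appended to a buffer), both GDBs are assumed to print the
default "(gdb) " prompt, and the two arms are simplified stand-ins, one for
the test's own pattern and one for a bare-prompt fallback of the kind
gdb_test_multiple adds.  The names ARMS and first_match are hypothetical.

```python
import re

PROMPT = r"\(gdb\) $"  # assumption: both GDBs use the default prompt

# Simplified, hypothetical arms, tried in this order after every read.
ARMS = [
    ("stop", re.compile(r"Breakpoint [0-9]+,.*" + PROMPT, re.DOTALL)),
    ("bare prompt", re.compile(r"\n" + PROMPT)),
]


def first_match(chunks):
    """Append chunks to a buffer; return the name of the first arm to match."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        for name, regex in ARMS:
            if regex.search(buffer):
                return name
    return None


banner = 'Type "apropos word" to search for commands related to "word".\n'
inferior_prompt = "(gdb) "
stop = (
    "\nBreakpoint 1, captured_main (data=0x7fffffffd2e0) at "
    "../../binutils-gdb/gdb/main.c:1147\n"
    "1147              captured_command_loop ();\n"
    "(gdb) "
)

# Everything arrives in a single read: the stop arm wins.
print(first_match([banner + inferior_prompt + stop]))   # -> stop
# The inferior's prompt arrives in its own read: the fallback fires first.
print(first_match([banner + inferior_prompt, stop]))    # -> bare prompt
```

In this model the outcome depends only on how the output happens to be
split across reads, which would be consistent with the failure showing up
only some of the time.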

Thanks,
Pedro Alves
  

Patch

--- a/gdb/testsuite/lib/selftest-support.exp
+++ b/gdb/testsuite/lib/selftest-support.exp
@@ -88,10 +88,10 @@  proc selftest_setup { executable function } {
 
     set description "run until breakpoint at $function"
     gdb_test_multiple "run $INTERNAL_GDBFLAGS" "$description" {
-        -re "Starting program.*Breakpoint \[0-9\]+,.*$function .data.* at .*main.c:.*$gdb_prompt $" {
+        -re "Starting program.*Breakpoint \[0-9\]+,.*$function \\(\\).* at .*main.c:.*$gdb_prompt $" {
             pass "$description"
         }
-        -re "Starting program.*Breakpoint \[0-9\]+,.*$function .data.*$gdb_prompt $" {
+        -re "Starting program.*Breakpoint \[0-9\]+,.*$function \\(\\).*$gdb_prompt $" {
             xfail "$description (line numbers scrambled?)"
         }
        -re "Starting program.*Breakpoint \[0-9\]+,.* at .*main.c:.*$function.*$gdb_prompt $" {
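
The effect of the two changed -re arms can be approximated in Python.  This
is a sketch, not the testsuite code: Python's re stands in for Tcl's regex
engine (re.DOTALL is used because "." in a Tcl regex also matches newlines
by default), $gdb_prompt is assumed to be "\(gdb\)", and the names arms and
classify are hypothetical.  The first sample output is stitched together
from lines quoted in this thread; the other two are hypothetical stop
messages.

```python
import re

GDB_PROMPT = r"\(gdb\)"  # assumption: default value of $gdb_prompt
OLD_ARGS = r".data"      # before the patch: matches e.g. "(data"
NEW_ARGS = r"\(\)"       # after the patch: matches "()"


def arms(function, arg_re):
    """Build the first two -re arms for the given argument-list regex."""
    head = r"Starting program.*Breakpoint [0-9]+,.*" + function + " " + arg_re
    tail = r".*" + GDB_PROMPT + r" $"
    pass_arm = re.compile(head + r".* at .*main.c:" + tail, re.DOTALL)
    xfail_arm = re.compile(head + tail, re.DOTALL)
    return pass_arm, xfail_arm


def classify(output, function, arg_re):
    """Return which of the two arms matches first, if any."""
    pass_arm, xfail_arm = arms(function, arg_re)
    if pass_arm.search(output):
        return "PASS"
    if xfail_arm.search(output):
        return "XFAIL"
    return "no match"


START = "Starting program: build/gdb/gdb -data-directory=data-directory\n"
MAIN_C = "../../binutils-gdb/gdb/main.c"
FN = "captured_command_loop"

# Stop reported in the caller, as in the failing log quoted above.
inlined_stop = (START
                + "Breakpoint 1, captured_main (data=0x7fffffffd2e0) at "
                + MAIN_C + ":1147\n"
                + "1147              captured_command_loop ();\n(gdb) ")
# Hypothetical stop in the function itself, new signature (no parameter).
plain_stop = (START + "Breakpoint 1, captured_command_loop () at "
              + MAIN_C + ":324\n(gdb) ")
# Hypothetical stop in the function itself, old signature (data parameter).
old_stop = (START + "Breakpoint 1, captured_command_loop (data=0x0) at "
            + MAIN_C + ":324\n(gdb) ")

print(classify(old_stop, FN, OLD_ARGS))      # -> PASS
print(classify(plain_stop, FN, NEW_ARGS))    # -> PASS
print(classify(inlined_stop, FN, NEW_ARGS))  # -> XFAIL
```

Note that in the inlined case the XFAIL arm matches only because the source
line "captured_command_loop ();" happens to contain the text the new pattern
looks for, not because GDB reported a stop in that function.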