[RFC] fix gdb.threads/non-stop-fair-events.exp timeouts

While running GDB tests on nios2-linux-gnu with gdbserver and "target
remote", I've been seeing random failures in
gdb.threads/non-stop-fair-events.exp.  For example, in one test run I got

FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 2 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 3 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=7: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 2 broke out of loop (timeout)

and in other test runs I got different ones.  The pattern seemed to be
that sometimes it took an extra long time for the first thread to break
out of the loop, but once that happened they would all stop correctly
and send the expected replies, even though GDB had already given up on
waiting for the first few.

I've come up with the attached patch to scale the timeout for the
failing tests by the number of threads still running, which seems to
take care of the problem.  Does this seem reasonable?
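
To illustrate the idea only (this is a Python sketch, not the Tcl in the
attached patch; the function name and the numbers are made up for the
example), the wait for each stop report is multiplied by the number of
threads that have not stopped yet:

```python
def scaled_timeout(base_timeout, threads_running):
    """Return the wait budget for the next stop report.

    Hypothetical helper: base_timeout is the harness's normal timeout in
    seconds, threads_running is how many threads are still in the loop.
    """
    # Never drop below the base timeout, even with one thread left.
    return base_timeout * max(1, threads_running)


# Assumed values for illustration: a 10-second base timeout, 10 threads.
base = 10
num_threads = 10

# Budget used while waiting for each successive thread to report a stop.
budgets = [scaled_timeout(base, num_threads - stopped)
           for stopped in range(num_threads)]
```

With these assumed numbers the first wait gets the largest budget and
the last one falls back to the normal timeout, which matches the
observation that only the first stop is slow to arrive.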

I'm somewhat confused because, in spite of it sometimes taking at least 
3 times the normal timeout for the first stop message to appear, the 
alarm in the test case (which is tied to the normal timeout) was never 
triggering.  My best theory on that is that the slowness is not in the 
test case, but rather in gdbserver.  IOW, all the threads are already 
stopped by the time the alarm would expire, but gdb and gdbserver 
haven't finished all the notifications and requests to print a stop 
message for any of the threads yet.  Is that plausible?  Should the
timeout for the alarm be scaled by the number of threads, too, just to
be safe?

I'm also not entirely sure what this test case is supposed to test.
From the original commit message and comments in the .exp file it seems
like timeouts were supposed to be a sign of a broken kernel with thread
starvation problems, not bugs in gdb or gdbserver.  But don't we
normally just skip tests that the target doesn't support or can't run
properly, rather than report them as FAILs?  And I don't know how to
distinguish timeouts that mean the kernel is broken from timeouts that
mean the target is just slow and you need to set a bigger value in the
test harness.

-Sandra the confused
