Restrict gdb.base/gcore-relro-pie.exp to native/linux targets

Message ID 1487859197-6269-1-git-send-email-lgustavo@codesourcery.com
State New, archived
Headers

Commit Message

Luis Machado Feb. 23, 2017, 2:13 p.m. UTC
  This particular test doesn't work correctly on aarch64-elf for a couple
reasons.

First, the test is supposed to crash and bare-metal targets are not always
ready to handle such exceptions like Linux does. In our case the board just
keeps cycling within an ISR function.

Second, core file generation is not guaranteed/not supposed to work with
bare-metal targets.  If it does, it may just be luck.

Therefore the following patch restricts this particular test to only native
and Linux targets, like some other core file tests.

I see other core file tests that are not restricted to native/linux, but i
don't see failures due to those. It doesn't mean they are correct in their
assumptions though. I could fix those up in a single batch if it is deemed
appropriate. Otherwise i'm happy with fixing only this one.

gdb/testsuite/ChangeLog:
2017-02-23  Luis Machado  <lgustavo@codesourcery.com>

	* gdb.base/gcore-relro-pie.exp: Restrict to native/linux targets.
---
 gdb/testsuite/gdb.base/gcore-relro-pie.exp | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Pedro Alves Feb. 23, 2017, 2:24 p.m. UTC | #1
On 02/23/2017 02:13 PM, Luis Machado wrote:
> This particular test doesn't work correctly on aarch64-elf for a couple
> reasons.
> 
> First, the test is supposed to crash and bare-metal targets are not always
> ready to handle such exceptions like Linux does. In our case the board just
> keeps cycling within an ISR function.

Hmm?  AFAICS, the testcase is generating the core with gdb_gcore_cmd.
How on earth is that causing this cycling?

Either I'm missing something, or the reason for the cycling is
something else, like for example, the fact that the test is
compiled with flags (-fpie -pie Wl,-z,relro) that might generate a
binary that does not work correctly, or strip is broken for that
target, or something like that.

Thanks,
Pedro Alves
  
Luis Machado Feb. 23, 2017, 2:47 p.m. UTC | #2
On 02/23/2017 08:24 AM, Pedro Alves wrote:
> On 02/23/2017 02:13 PM, Luis Machado wrote:
>> This particular test doesn't work correctly on aarch64-elf for a couple
>> reasons.
>>
>> First, the test is supposed to crash and bare-metal targets are not always
>> ready to handle such exceptions like Linux does. In our case the board just
>> keeps cycling within an ISR function.
>
> Hmm?  AFAICS, the testcase is generating the core with gdb_gcore_cmd.
> How on earth is that causing this cycling?

I tend to see this particular target stopping later in the source file 
compared to x86. I had an (incorrect) assumption it was dereferencing 
NULL when running to break_here, but it doesn't seem to be the case. It 
stops right at what would cause the crash:

(gdb) b break_here
Breakpoint 2 at 0x80000a20
(gdb) disass break_here
Dump of assembler code for function break_here:
    0x0000000080000a1c <+0>:     mov     x0, #0x0
    0x0000000080000a20 <+4>:     str     wzr, [x0]

It seems something is off with startup code/stack pointer then. Back to 
the drawing board.

I still think it doesn't make much sense to run these tests if we're not 
sure gcore will support them. They may run a few early tests/setup 
tests, but that won't translate into meaningful PASSes. But i'm ok 
keeping it as-is if others think the early test PASSes are useful.
  
Pedro Alves Feb. 23, 2017, 2:54 p.m. UTC | #3
On 02/23/2017 02:47 PM, Luis Machado wrote:

> I still think it doesn't make much sense to run these tests if we're not
> sure gcore will support them. 

I don't understand what you're saying.  We can't be sure up
front.  The "gcore" that is run is GDB's "gcore" command.  If that 
doesn't work, gdb_gcore_cmd calls unsupported, and the rest of the 
testcase is skipped.

> They may run a few early tests/setup
> tests, but that won't translate into meaningful PASSes. But i'm ok
> keeping it as-is if others think the early test PASSes are useful.

Looks like it's been useful to catch a startup code problem.  ;-)

Thanks,
Pedro Alves
  
Luis Machado Feb. 23, 2017, 3:10 p.m. UTC | #4
On 02/23/2017 08:54 AM, Pedro Alves wrote:
> On 02/23/2017 02:47 PM, Luis Machado wrote:
>
>> I still think it doesn't make much sense to run these tests if we're not
>> sure gcore will support them.
>
> I don't understand what you're saying.  We can't be sure up
> front.  The "gcore" that is run is GDB's "gcore" command.  If that
> doesn't work, gdb_gcore_cmd calls unsupported, and the rest of the
> testcase is skipped.
>

The point is that we are indeed sure this isn't supported, unless we 
officially support core files on bare-metal targets (i don't think we 
do). See below.

>> They may run a few early tests/setup
>> tests, but that won't translate into meaningful PASSes. But i'm ok
>> keeping it as-is if others think the early test PASSes are useful.
>
> Looks like it's been useful to catch a startup code problem.  ;-)

Don't you agree this is a clear sign we are really testing something 
else other than core file support? It just means a proper standalone 
test doesn't exist to catch such a problem and we got lucky crashing 
when doing a core file test on a target that doesn't support it. It is 
not like the test was designed to catch this.

I find it a bit messy. A proper test would attempt to run pie 
executables to completion (and i can contribute that to verify this 
particular problem).

I just don't like the concept of running a test that is unsupported on a 
particular target (core files) only to test a side-effect during 
pre-test/setup phases. It only adds to artificial PASSes that don't 
necessarily translate to robustness.

But i understand if this is not acceptable. I just want to clean this 
target up, and seeing core file tests being executed is just confusing. :-)
  
Pedro Alves Feb. 23, 2017, 3:43 p.m. UTC | #5
On 02/23/2017 03:10 PM, Luis Machado wrote:
> On 02/23/2017 08:54 AM, Pedro Alves wrote:
>> On 02/23/2017 02:47 PM, Luis Machado wrote:
>>
>>> I still think it doesn't make much sense to run these tests if we're not
>>> sure gcore will support them.
>>
>> I don't understand what you're saying.  We can't be sure up
>> front.  The "gcore" that is run is GDB's "gcore" command.  If that
>> doesn't work, gdb_gcore_cmd calls unsupported, and the rest of the
>> testcase is skipped.
>>
> 
> The point is that we are indeed sure this isn't supported, unless we
> officially support core files on bare-metal targets (i don't think we
> do). 

There's been talks / patches about adding that for years.
Unfortunately it hasn't happened yet, but I think it'd
be quite reasonable to have.

> See below.

Note your patch did not skip the test on bare metal targets. 
That's not even what the subject says.  You've skipped it
everywhere but Linux.  It's not quite the same thing.

> 
>>> They may run a few early tests/setup
>>> tests, but that won't translate into meaningful PASSes. But i'm ok
>>> keeping it as-is if others think the early test PASSes are useful.
>>
>> Looks like it's been useful to catch a startup code problem.  ;-)
> 
> Don't you agree this is a clear sign we are really testing something
> else other than core file support? 

Not really.  It's a sign that the test is missing a predicate, but
it's not "cores work", IMO.  The testcase has a few requirements.
At least:

#1 - PIEs make sense for this target.
#2 - gcore works

To me this is a clear sign a predicate for #1 is missing.
We already detect #2.  Other tests build PIEs too.

IOW, IMO, all this talk about core files is a distraction
from the real issue.

But then I wonder why does compilation, linking, loading all
succeed without something realizing the binary can't run on
this target.  It may well be that PIEs do make sense for this
target.  It's not uncommon to call things "bare-metal" when
they're really pretty sophisticated RTOSs.

> It just means a proper standalone
> test doesn't exist to catch such a problem and we got lucky crashing
> when doing a core file test on a target that doesn't support it. It is
> not like the test was designed to catch this.
> 
> I find it a bit messy. A proper test would attempt to run pie
> executables to completion (and i can contribute that to verify this
> particular problem).

How would that help, if it'd crash and loop the board too?

> I just don't like the concept of running a test that is unsupported on a
> particular target (core files) only to test a side-effect during
> pre-test/setup phases. It only adds to artificial PASSes that don't
> necessarily translate to robustness.
> 
> But i understand if this is not acceptable. I just want to clean this
> target up, and seeing core file tests being executed is just confusing. :-)

Thanks,
Pedro Alves
  
Luis Machado Feb. 23, 2017, 4:16 p.m. UTC | #6
On 02/23/2017 09:43 AM, Pedro Alves wrote:
> There's been talks / patches about adding that for years.
> Unfortunately it hasn't happened yet, but I think it'd
> be quite reasonable to have.

I find it reasonable too, but it is not here yet.

> Not really.  It's a sign that the test is missing a predicate, but
> it's not "cores work", IMO.  The testcase has a few requirements.
> At least:
>
> #1 - PIEs make sense for this target.
> #2 - gcore works
>
> To me this is a clear sign a predicate for #1 is missing.
> We already detect #2.  Other tests build PIEs too.
>
> IOW, IMO, all this talk about core files is a distraction
> from the real issue.
>
> But then I wonder why does compilation, linking, loading all
> succeed without something realizing the binary can't run on
> this target.  It may well be that PIEs do make sense for this
> target.  It's not uncommon to call things "bare-metal" when
> they're really pretty sophisticated RTOSs.

This is just a stack pointer startup problem, so not a matter of 
supporting PIE's or not.

> How would that help, if it'd crash and loop the board too?
>

The test would need to be tweaked. The assumption that these targets can 
handle a crash is not always valid as it is with OS-based targets.

I remember i addressed problems with these assumptions in the past. It 
is similar to assuming reading 0x0 would error out or that it is a bad 
address. We are much more aware of that problem these days.

Unfortunately not a lot of bare-metal/RTOS testing going on for gdb at 
the moment. I think even QEMU shields these tests from problems sometimes.

Anyway, i know what i have to do to get this FAIL fixed!

The discussion about how focused/robust a particular gdb test is/needs 
to be can be rather long. At this point i don't think there is a lot of 
benefit from that, but maybe a worthy discussion in the future about how 
to improve gdb's testsuite, or not!
  

Patch

diff --git a/gdb/testsuite/gdb.base/gcore-relro-pie.exp b/gdb/testsuite/gdb.base/gcore-relro-pie.exp
index 28321fc..fd255e4 100644
--- a/gdb/testsuite/gdb.base/gcore-relro-pie.exp
+++ b/gdb/testsuite/gdb.base/gcore-relro-pie.exp
@@ -19,6 +19,11 @@ 
 
 standard_testfile
 
+if {![isnative] || ![istarget "*-*-linux*"]} {
+    untested "skipping core file test"
+    return 0
+}
+
 if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug additional_flags=-fpie "ldflags=-pie -Wl,-z,relro"}]} {
     return -1
 }