Update dg-extract-results.* from gcc

Message ID yddh8ku2mrc.fsf@CeBiTec.Uni-Bielefeld.DE
State New, archived

Commit Message

Rainer Orth July 20, 2018, 11:02 a.m. UTC
  When looking at the gdb.sum file produced by dg-extract-results.sh on
Solaris 11/x86, I noticed some wrong sorting, like this:

PASS: gdb.ada/addr_arith.exp: print something'address + 0
PASS: gdb.ada/addr_arith.exp: print 0 + something'address
PASS: gdb.ada/addr_arith.exp: print something'address - 0
PASS: gdb.ada/addr_arith.exp: print 0 - something'address

Looking closer, I noticed that while dg-extract-results.sh had been
copied over from contrib in the gcc repo, the corresponding
dg-extract-results.py file had not.  The latter not only fixes the
sorting problem I'd observed, but is also way faster than the shell
version (like a factor of 50 faster).

Therefore I propose to update both files from the gcc repo.  The changes
to the .sh version are trivial, just counting the number of DejaGnu
ERROR lines, too.
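For reference, the lines being tallied carry the "ERROR: (DejaGnu)"
prefix that both updated scripts match.  A minimal sketch of the
counting in Python (the count_dejagnu_errors helper and the 'gdb.sum'
path are illustrative only, not part of the patch):

  def count_dejagnu_errors (path):
      # Count lines carrying the DejaGnu ERROR prefix, the same
      # prefix the updated .sh/.py scripts look for.
      with open (path, errors = 'surrogateescape') as f:
          return sum (line.startswith ('ERROR: (DejaGnu)') for line in f)

  print ('# of DejaGnu errors\t\t%d' % count_dejagnu_errors ('gdb.sum'))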

There are other possible improvements in this area:

* One could keep the files in toplevel contrib as in gcc, instead of
  stashing them away in gdb/testsuite.

* One could also copy over gcc's contrib/test_summary, used by the
  toplevel make mail-report.log to provide a nice summary of test
  results.  However, this is currently hampered by the fact that for
  parallel make check the gdb.sum and gdb.log files are left in
  outputs/*/*/gdb.{sum,log} after dg-extract-results.sh has run instead
  of moving them to *.sep like gcc's gcc/Makefile.in does, so
  mail-report.log lists every failure twice.

Thoughts?

	Rainer
  

Comments

Tom Tromey July 25, 2018, 7:06 p.m. UTC | #1
>>>>> "Rainer" == Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:

Rainer> Therefore I propose to update both files from the gcc repo.  The changes
Rainer> to the .sh version are trivial, just counting the number of DejaGnu
Rainer> ERROR lines, too.

Thank you for doing this.  This is ok.

Rainer> * One could keep the files in toplevel contrib as in gcc, instead of
Rainer>   stashing them away in gdb/testsuite.

I don't have an opinion on this one, either way is ok by me.

Rainer> * One could also copy over gcc's contrib/test_summary, used by the
Rainer>   toplevel make mail-report.log to provide a nice summary of test
Rainer>   results.  However, this is currently hampered by the fact that for
Rainer>   parallel make check the gdb.sum and gdb.log files are left in
Rainer>   outputs/*/*/gdb.{sum,log} after dg-extract-results.sh has run instead
Rainer>   of moving them to *.sep like gcc's gcc/Makefile.in does, so
Rainer>   mail-report.log lists every failure twice.

I don't understand the "*.sep" comment - would you mind spelling it out?

Anyway, if this script is useful to you, it's fine with me if you want
to find a way to make it work.  I think the outputs/** stuff can be
moved around or messed with pretty freely, though of course it is best
not to outright lose things.

Tom
  
Sergio Durigan Junior July 26, 2018, 8:54 p.m. UTC | #2
On Wednesday, July 25 2018, Tom Tromey wrote:

>>>>>> "Rainer" == Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
> Rainer> * One could also copy over gcc's contrib/test_summary, used by the
> Rainer>   toplevel make mail-report.log to provide a nice summary of test
> Rainer>   results.  However, this is currently hampered by the fact that for
> Rainer>   parallel make check the gdb.sum and gdb.log files are left in
> Rainer>   outputs/*/*/gdb.{sum,log} after dg-extract-results.sh has run instead
> Rainer>   of moving them to *.sep like gcc's gcc/Makefile.in does, so
> Rainer>   mail-report.log lists every failure twice.
>
> I don't understand the "*.sep" comment - would you mind spelling it out?
>
> Anyway, if this script is useful to you, it's fine with me if you want
> to find a way to make it work.  I think the outputs/** stuff can be
> moved around or messed with pretty freely, though of course it is best
> not to outright lose things.

Just a note: if you're going to move/change the outputs/** stuff, please
make sure the racy targets in gdb/testsuite/Makefile still work.

Thanks,
  
Rainer Orth July 31, 2018, 12:44 p.m. UTC | #3
Hi Tom,

>>>>>> "Rainer" == Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>
> Rainer> Therefore I propose to update both files from the gcc repo.  The changes
> Rainer> to the .sh version are trivial, just counting the number of DejaGnu
> Rainer> ERROR lines, too.
>
> Thank you for doing this.  This is ok.
>
> Rainer> * One could keep the files in toplevel contrib as in gcc, instead of
> Rainer>   stashing them away in gdb/testsuite.
>
> I don't have an opinion on this one, either way is ok by me.

On second thought, there are some arguments for moving them to toplevel
contrib:

* This way, they can easily be used should someone decide to parallelize
  one or more of the binutils, gas, or ld testsuites.

* They are less easily overlooked for updates from the gcc repo when
  they reside in the same place in both repositories.

* The test_summary script needs to live in contrib since the toplevel
  Makefile's mail-report.log target expects it there.

So I'll go that route.

> Rainer> * One could also copy over gcc's contrib/test_summary, used by the
> Rainer>   toplevel make mail-report.log to provide a nice summary of test
> Rainer>   results.  However, this is currently hampered by the fact that for
> Rainer>   parallel make check the gdb.sum and gdb.log files are left in
> Rainer>   outputs/*/*/gdb.{sum,log} after dg-extract-results.sh has run instead
> Rainer>   of moving them to *.sep like gcc's gcc/Makefile.in does, so
> Rainer>   mail-report.log lists every failure twice.
>
> I don't understand the "*.sep" comment - would you mind spelling it out?

The test_summary script works by searching for *.sum and *.log files in
the whole tree (given that those live at different levels of the build
tree, they cannot easily be found with a single glob pattern).

Currently, once dg-extract-results.sh has summarized the individual
gdb.sum and gdb.log files in outputs, we have both the individual
per-subdir files in place and the summarized one in gdb/testsuite.  When
test_summary runs, it finds all of those and lists every non-PASS
result twice in its output, which isn't particularly helpful.

To avoid this, the gcc testsuite moves the subdir .sum/.log files for
parallelized testsuites to .sum.sep/.log.sep before passing them to
dg-extract-results.sh.  This way, we get one summary per testsuite
(e.g. gcc or g++, or gdb in the case at hand), and test_summary won't
pick them up twice.
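
As a rough sketch of that renaming (the paths assume the
outputs/*/*/gdb.{sum,log} layout mentioned above; the real logic lives
in gcc's gcc/Makefile.in, not in this snippet):

  import glob, os

  # Move each per-subdir summary aside so that tree-wide searches for
  # *.sum/*.log only see the merged file; the originals stay available
  # as *.sum.sep/*.log.sep.
  for f in glob.glob ('outputs/*/*/gdb.sum') + glob.glob ('outputs/*/*/gdb.log'):
      os.rename (f, f + '.sep')

dg-extract-results.* would then be run on the *.sum.sep files to
produce the single merged gdb/testsuite/gdb.sum.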

> Anyway, if this script is useful to you, it's fine with me if you want
> to find a way to make it work.  I think the outputs/** stuff can be
> moved around or messed with pretty freely, though of course it is best
> not to outright lose things.

Absolutely: as I said, the individual files are just moved aside so
they don't interfere with the likes of test_summary, but they are still
kept around since dg-extract-results.* isn't always perfect at merging
them.

I'll go ahead and prepare a patch then.

	Rainer
  
Rainer Orth July 31, 2018, 12:46 p.m. UTC | #4
Hi Sergio,

> On Wednesday, July 25 2018, Tom Tromey wrote:
>
>>>>>>> "Rainer" == Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>> Rainer> * One could also copy over gcc's contrib/test_summary, used by the
>> Rainer>   toplevel make mail-report.log to provide a nice summary of test
>> Rainer>   results.  However, this is currently hampered by the fact that for
>> Rainer>   parallel make check the gdb.sum and gdb.log files are left in
>> Rainer>   outputs/*/*/gdb.{sum,log} after dg-extract-results.sh has run
>> Rainer> instead
>> Rainer>   of moving them to *.sep like gcc's gcc/Makefile.in does, so
>> Rainer>   mail-report.log lists every failure twice.
>>
>> I don't understand the "*.sep" comment - would you mind spelling it out?
>>
>> Anyway, if this script is useful to you, it's fine with me if you want
>> to find a way to make it work.  I think the outputs/** stuff can be
>> moved around or messed with pretty freely, though of course it is best
>> not to outright lose things.
>
> Just a note: if you're going to move/change the outputs/** stuff, please
> make sure the racy targets in gdb/testsuite/Makefile still work.

good point.  I hope I'll have adapted every use of dg-extract-results.sh
in gdb/testsuite/Makefile.in, but I'll certainly double-check.

Thanks.
        Rainer
  
Pedro Alves Aug. 7, 2018, 2:35 p.m. UTC | #5
On 07/20/2018 12:02 PM, Rainer Orth wrote:
> When looking at the gdb.sum file produced by dg-extract-results.sh on
> Solaris 11/x86, I noticed some wrong sorting, like this:
> 
> PASS: gdb.ada/addr_arith.exp: print something'address + 0
> PASS: gdb.ada/addr_arith.exp: print 0 + something'address
> PASS: gdb.ada/addr_arith.exp: print something'address - 0
> PASS: gdb.ada/addr_arith.exp: print 0 - something'address
> 
> Looking closer, I noticed that while dg-extract-results.sh had been
> copied over from contrib in the gcc repo, the corresponding
> dg-extract-results.py file had not.  The latter not only fixes the
> sorting problem I'd observed, but is also way faster than the shell
> version (like a factor of 50 faster).

We used to have the dg-extract-results.py file, but we deleted it
because it caused (funnily enough, sorting) problems.  See:

  https://sourceware.org/ml/gdb-patches/2015-02/msg00333.html

Has that sorting stability issue been meanwhile fixed upstream?

Thanks,
Pedro Alves
  
Rainer Orth Aug. 8, 2018, 2:36 p.m. UTC | #6
Hi Pedro,

> On 07/20/2018 12:02 PM, Rainer Orth wrote:
>> When looking at the gdb.sum file produced by dg-extract-results.sh on
>> Solaris 11/x86, I noticed some wrong sorting, like this:
>> 
>> PASS: gdb.ada/addr_arith.exp: print something'address + 0
>> PASS: gdb.ada/addr_arith.exp: print 0 + something'address
>> PASS: gdb.ada/addr_arith.exp: print something'address - 0
>> PASS: gdb.ada/addr_arith.exp: print 0 - something'address
>> 
>> Looking closer, I noticed that while dg-extract-results.sh had been
>> copied over from contrib in the gcc repo, the corresponding
>> dg-extract-results.py file had not.  The latter not only fixes the
>> sorting problem I'd observed, but is also way faster than the shell
>> version (like a factor of 50 faster).
>
> We used to have the dg-extract-results.py file, but we deleted it
> because it caused (funnily enough, sorting) problems.  See:
>
>   https://sourceware.org/ml/gdb-patches/2015-02/msg00333.html
>
> Has that sorting stability issue been meanwhile fixed upstream?

not that I can see: between the version of dg-extract-results.py removed
in early 2015 and the one in current gcc trunk, the only changes are
added handling for DejaGnu ERRORs and another minor change to do with
summaries, neither of which seems to change anything wrt. sorting at
first blush.

However, I've just run make -j16 check three times in a row on
amd64-pc-solaris2.11, followed by make -j48 check, and the only
differences were due to 200+ racy tests, the vast majority of them in
gdb.threads.  Maybe the prior problems have been due to bugs in older
versions of python?

	Rainer
  
Pedro Alves Aug. 8, 2018, 3:24 p.m. UTC | #7
On 08/08/2018 03:36 PM, Rainer Orth wrote:

>> On 07/20/2018 12:02 PM, Rainer Orth wrote:
>>> When looking at the gdb.sum file produced by dg-extract-results.sh on
>>> Solaris 11/x86, I noticed some wrong sorting, like this:
>>>
>>> PASS: gdb.ada/addr_arith.exp: print something'address + 0
>>> PASS: gdb.ada/addr_arith.exp: print 0 + something'address
>>> PASS: gdb.ada/addr_arith.exp: print something'address - 0
>>> PASS: gdb.ada/addr_arith.exp: print 0 - something'address
>>>
>>> Looking closer, I noticed that while dg-extract-results.sh had been
>>> copied over from contrib in the gcc repo, the corresponding
>>> dg-extract-results.py file had not.  The latter not only fixes the
>>> sorting problem I'd observed, but is also way faster than the shell
>>> version (like a factor of 50 faster).
>>
>> We used to have the dg-extract-results.py file, but we deleted it
>> because it caused (funnily enough, sorting) problems.  See:
>>
>>   https://sourceware.org/ml/gdb-patches/2015-02/msg00333.html
>>
>> Has that sorting stability issue been meanwhile fixed upstream?
> 
> not that I can see: between the version of dg-extract-results.py removed
> in early 2015 and the one in current gcc trunk, the only changes are
> added handling for DejaGnu ERRORs and another minor change to do with
> summaries, neither of which seems to change anything wrt. sorting at
> first blush.

OK.

> However, I've just run make -j16 check three times in a row on
> amd64-pc-solaris2.11, followed by make -j48 check, and the only
> differences were due to 200+ racy tests, the vast majority of them in
> gdb.threads.

Thanks for testing.

> Maybe the prior problems have been due to bugs in older
> versions of python?
Might be.

IIRC, the sorting that would change was the order in which the different
individual gdb.sum results would be merged (the per-gdb.sum order was
stable).  So comparing two runs, you'd get something like, in one run this:

 gdb.base/foo.exp:PASS: test1
 gdb.base/foo.exp:PASS: test2
 gdb.base/foo.exp:PASS: test3
 gdb.base/bar.exp:PASS: testA
 gdb.base/bar.exp:PASS: testB
 gdb.base/bar.exp:PASS: testC

and another run this:

 gdb.base/bar.exp:PASS: testA
 gdb.base/bar.exp:PASS: testB
 gdb.base/bar.exp:PASS: testC
 gdb.base/foo.exp:PASS: test1
 gdb.base/foo.exp:PASS: test2
 gdb.base/foo.exp:PASS: test3

which would result in all those tests spuriously showing up in a
gdb.sum old/new diff.
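
For what it's worth, merging the per-.exp chunks in a fixed
harness-name order avoids that instability.  A toy sketch of the idea,
reusing the test names from above (the chunks dict is made up):

  chunks = {
      'gdb.base/foo.exp': ['PASS: test1', 'PASS: test2', 'PASS: test3'],
      'gdb.base/bar.exp': ['PASS: testA', 'PASS: testB', 'PASS: testC'],
  }
  # Emitting harnesses in sorted-name order gives the same inter-file
  # ordering on every run, whatever order the chunks arrived in.
  for name in sorted (chunks):
      for line in chunks[name]:
          print ('%s:%s' % (name, line))

This mirrors the sorted-by-harness-name merge that the .py script in
the patch below performs when combining files.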

I'm not sure whether we were seeing that if you compared runs
of the same tree multiple times.  It could be that it only happened
when comparing the results of different trees, which contained
a slightly different set of tests and testcases, like for example
comparing testresults of a patched master against the testresults
of master from a month or week ago, which is something I frequently
do, for example.

*time passes*

Wait wait wait, can you clarify what you meant by wrong sorting in:

 PASS: gdb.ada/addr_arith.exp: print something'address + 0
 PASS: gdb.ada/addr_arith.exp: print 0 + something'address
 PASS: gdb.ada/addr_arith.exp: print something'address - 0
 PASS: gdb.ada/addr_arith.exp: print 0 - something'address

?

Why do you think those results _should_ be sorted?  And in what order?

Typically, the order/sequence in which the tests of a given .exp
file are executed is important.  The order in the gdb.sum file must
be the order in which the fail/pass calls are written/issued in the .exp file.
It'd be absolutely incorrect to alphabetically sort the gdb.sum output.
Is that what the .py version does?  That's not what I recall, though.
I guess I may be confused.

Thanks,
Pedro Alves
  
Jan Kratochvil Sept. 11, 2018, 11:03 a.m. UTC | #8
Hi Joel,

there is now toplevel contrib/ directory in GIT but for example in
gdb-8.2.50.20180910.tar.xz snapshot it is missing, could it be added?


Thanks,
Jan
  
Rainer Orth Sept. 11, 2018, 11:20 a.m. UTC | #9
Hi Jan,

> there is now toplevel contrib/ directory in GIT but for example in
> gdb-8.2.50.20180910.tar.xz snapshot it is missing, could it be added?

I suspect it just needs to be added to src-release.sh
(GDB_SUPPORT_DIRS), as it isn't used on the binutils side so far, but
I've never tried creating a gdb tarball myself.

	Rainer
  
Joel Brobecker Sept. 11, 2018, noon UTC | #10
> > there is now toplevel contrib/ directory in GIT but for example in
> > gdb-8.2.50.20180910.tar.xz snapshot it is missing, could it be added?
> 
> I suspect it just needs to be added to src-release.sh
> (GDB_SUPPORT_DIRS), as it isn't used on the binutils side so far, but
> I've never tried creating a gdb tarball myself.

That's my understanding as well. I can take a look at that, but not
before Thu at the very earliest (traveling).
  
Sergio Durigan Junior Sept. 12, 2018, 5:23 a.m. UTC | #11
On Tuesday, September 11 2018, Joel Brobecker wrote:

>> > there is now toplevel contrib/ directory in GIT but for example in
>> > gdb-8.2.50.20180910.tar.xz snapshot it is missing, could it be added?
>> 
>> I suspect it just needs to be added to src-release.sh
>> (GDB_SUPPORT_DIRS), as it isn't used on the binutils side so far, but
>> I've never tried creating a gdb tarball myself.
>
> That's my understanding as well. I can take a look at that, but not
> before Thu at the very earliest (traveling).

Thanks to everybody involved in this.  I took a stab at the problem
and added "contrib" to the list of GDB support dirs.  Tested by
generating a release here, and things look fine.  The patch has been
submitted (Cc'ed binutils@s.o).

Thanks,
  
Pedro Alves Oct. 1, 2018, 9:36 a.m. UTC | #12
On 08/08/2018 04:24 PM, Pedro Alves wrote:
> On 08/08/2018 03:36 PM, Rainer Orth wrote:
> 
>>> On 07/20/2018 12:02 PM, Rainer Orth wrote:
>>>> When looking at the gdb.sum file produced by dg-extract-results.sh on
>>>> Solaris 11/x86, I noticed some wrong sorting, like this:
>>>>
>>>> PASS: gdb.ada/addr_arith.exp: print something'address + 0
>>>> PASS: gdb.ada/addr_arith.exp: print 0 + something'address
>>>> PASS: gdb.ada/addr_arith.exp: print something'address - 0
>>>> PASS: gdb.ada/addr_arith.exp: print 0 - something'address
>>>>
>>>> Looking closer, I noticed that while dg-extract-results.sh had been
>>>> copied over from contrib in the gcc repo, the corresponding
>>>> dg-extract-results.py file had not.  The latter not only fixes the
>>>> sorting problem I'd observed, but is also way faster than the shell
>>>> version (like a factor of 50 faster).
>>>
>>> We used to have the dg-extract-results.py file, but we deleted it
>>> because it caused (funnily enough, sorting) problems.  See:
>>>
>>>   https://sourceware.org/ml/gdb-patches/2015-02/msg00333.html
>>>
>>> Has that sorting stability issue been meanwhile fixed upstream?
>>
>> not that I can see: between the version of dg-extract-results.py removed
>> in early 2015 and the one in current gcc trunk, the only changes are
>> added handling for DejaGnu ERRORs and another minor change to do with
>> summaries, neither of which seems to change anything wrt. sorting at
>> first blush.
> 
> OK.
> 
>> However, I've just run make -j16 check three times in a row on
>> amd64-pc-solaris2.11, followed by make -j48 check, and the only
>> differences were due to 200+ racy tests, the vast majority of them in
>> gdb.threads.
> 
> Thanks for testing.
> 
>> Maybe the prior problems have been due to bugs in older
>> versions of python?
> Might be.
> 
> IIRC, the sorting that would change was the order in which the different
> individual gdb.sum results would be merged (the per-gdb.sum order was
> stable).  So comparing two runs, you'd get something like, in one run this:
> 
>  gdb.base/foo.exp:PASS: test1
>  gdb.base/foo.exp:PASS: test2
>  gdb.base/foo.exp:PASS: test3
>  gdb.base/bar.exp:PASS: testA
>  gdb.base/bar.exp:PASS: testB
>  gdb.base/bar.exp:PASS: testC
> 
> and another run this:
> 
>  gdb.base/bar.exp:PASS: testA
>  gdb.base/bar.exp:PASS: testB
>  gdb.base/bar.exp:PASS: testC
>  gdb.base/foo.exp:PASS: test1
>  gdb.base/foo.exp:PASS: test2
>  gdb.base/foo.exp:PASS: test3
> 
> which would result in all those tests spuriously showing up in a
> gdb.sum old/new diff.
> 
> I'm not sure whether we were seeing that if you compared runs
> of the same tree multiple times.  It could be that it only happened
> when comparing the results of different trees, which contained
> a slightly different set of tests and testcases, like for example
> comparing testresults of a patched master against the testresults
> of master from a month or week ago, which is something I frequently
> do, for example.
> 
> *time passes*
> 
> Wait wait wait, can you clarify what you meant by wrong sorting in:
> 
>  PASS: gdb.ada/addr_arith.exp: print something'address + 0
>  PASS: gdb.ada/addr_arith.exp: print 0 + something'address
>  PASS: gdb.ada/addr_arith.exp: print something'address - 0
>  PASS: gdb.ada/addr_arith.exp: print 0 - something'address
> 
> ?
> 
> Why do you think those results _should_ be sorted?  And in what order?
> 
> Typically, the order/sequence in which the tests of a given .exp
> file are executed is important.  The order in the gdb.sum file must
> be the order in which the fail/pass calls are written/issued in the .exp file.
> It'd be absolutely incorrect to alphabetically sort the gdb.sum output.
> Is that what the .py version does?  That's not what I recall, though.
> I guess I may be confused.

Getting back to this, because I just diffed testresults between
runs of different vintage, and got bitten by the sorting problems.

I'm diffing testresults between a run on 20180713 and a run
against today's gdb, and I got a _ton_ of spurious diffs like these:

 -PASS: gdb.ada/complete.exp: complete p my_glob 
 -PASS: gdb.ada/complete.exp: complete p insi 
 -PASS: gdb.ada/complete.exp: complete p inner.insi 
 -PASS: gdb.ada/complete.exp: complete p pck.inne 
 -PASS: gdb.ada/complete.exp: complete p pck__inner__ins 
 -PASS: gdb.ada/complete.exp: complete p pck.inner.ins 
 -PASS: gdb.ada/complete.exp: complete p side 
 -PASS: gdb.ada/complete.exp: complete p exported 
 +PASS: gdb.ada/complete.exp: complete break ada 
  PASS: gdb.ada/complete.exp: complete p <Exported
 -PASS: gdb.ada/complete.exp: p <Exported_Capitalized> 
 -PASS: gdb.ada/complete.exp: p Exported_Capitalized 
 -PASS: gdb.ada/complete.exp: p exported_capitalized 
 -PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra 
  PASS: gdb.ada/complete.exp: complete p <__gnat_ada_main_prog
 -PASS: gdb.ada/complete.exp: complete p some 
 +PASS: gdb.ada/complete.exp: complete p <pck__my 
 +PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra 
 +PASS: gdb.ada/complete.exp: complete p ambig 
 +PASS: gdb.ada/complete.exp: complete p ambiguous_f 
 +PASS: gdb.ada/complete.exp: complete p ambiguous_func 
 +PASS: gdb.ada/complete.exp: complete p exported 
 +PASS: gdb.ada/complete.exp: complete p external_ident 
 +PASS: gdb.ada/complete.exp: complete p inner.insi 
 +PASS: gdb.ada/complete.exp: complete p insi 
 +PASS: gdb.ada/complete.exp: complete p local_ident 
 +PASS: gdb.ada/complete.exp: complete p my_glob 
  PASS: gdb.ada/complete.exp: complete p not_in_sco
 -PASS: gdb.ada/complete.exp: complete p pck.ins 
 -PASS: gdb.ada/complete.exp: complete p pck.my 
 +PASS: gdb.ada/complete.exp: complete p pck 
 +PASS: gdb.ada/complete.exp: complete p pck. 
 +PASS: gdb.ada/complete.exp: complete p pck.inne 
  PASS: gdb.ada/complete.exp: complete p pck.inne
  PASS: gdb.ada/complete.exp: complete p pck.inner.
 -PASS: gdb.ada/complete.exp: complete p local_ident 
 +PASS: gdb.ada/complete.exp: complete p pck.inner.ins 
 +PASS: gdb.ada/complete.exp: complete p pck.ins 
  PASS: gdb.ada/complete.exp: complete p pck.local_ident
 +PASS: gdb.ada/complete.exp: complete p pck.my 
 +PASS: gdb.ada/complete.exp: complete p pck__inner__ins 
  PASS: gdb.ada/complete.exp: complete p pck__local_ident
 -PASS: gdb.ada/complete.exp: complete p external_ident 
 -PASS: gdb.ada/complete.exp: complete p pck 
 -PASS: gdb.ada/complete.exp: complete p pck. 
 -PASS: gdb.ada/complete.exp: complete p <pck__my 
 +PASS: gdb.ada/complete.exp: complete p side 
 +PASS: gdb.ada/complete.exp: complete p some 
  PASS: gdb.ada/complete.exp: interactive complete 'print some'
 -PASS: gdb.ada/complete.exp: complete p ambig 
 -PASS: gdb.ada/complete.exp: complete p ambiguous_f 
 -PASS: gdb.ada/complete.exp: complete p ambiguous_func 
 +PASS: gdb.ada/complete.exp: p <Exported_Capitalized> 
 +PASS: gdb.ada/complete.exp: p Exported_Capitalized 
 +PASS: gdb.ada/complete.exp: p exported_capitalized 
  PASS: gdb.ada/complete.exp: set max-completions unlimited
 -PASS: gdb.ada/complete.exp: complete break ada 


Given the earlier discussions about sorting, I could
immediately recognize what is wrong.  It's that while
testsuite/outputs/gdb.ada/complete/gdb.sum lists the
test results in chronological order, preserving
execution sequence:

 Running src/gdb/testsuite/gdb.ada/complete.exp ...
 PASS: gdb.ada/complete.exp: compilation foo.adb
 PASS: gdb.ada/complete.exp: complete p my_glob
 PASS: gdb.ada/complete.exp: complete p insi
 PASS: gdb.ada/complete.exp: complete p inner.insi
 PASS: gdb.ada/complete.exp: complete p pck.inne
 PASS: gdb.ada/complete.exp: complete p pck__inner__ins
 PASS: gdb.ada/complete.exp: complete p pck.inner.ins
 PASS: gdb.ada/complete.exp: complete p side
 PASS: gdb.ada/complete.exp: complete p exported
 PASS: gdb.ada/complete.exp: complete p <Exported
 PASS: gdb.ada/complete.exp: p <Exported_Capitalized>
 PASS: gdb.ada/complete.exp: p Exported_Capitalized
 PASS: gdb.ada/complete.exp: p exported_capitalized
 PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra
 PASS: gdb.ada/complete.exp: complete p <__gnat_ada_main_prog
 PASS: gdb.ada/complete.exp: complete p some
 PASS: gdb.ada/complete.exp: complete p not_in_sco
 PASS: gdb.ada/complete.exp: complete p pck.ins
 PASS: gdb.ada/complete.exp: complete p pck.my
 PASS: gdb.ada/complete.exp: complete p pck.inne
 PASS: gdb.ada/complete.exp: complete p pck.inner.
 PASS: gdb.ada/complete.exp: complete p local_ident
 PASS: gdb.ada/complete.exp: complete p pck.local_ident
 PASS: gdb.ada/complete.exp: complete p pck__local_ident
 PASS: gdb.ada/complete.exp: complete p external_ident
 PASS: gdb.ada/complete.exp: complete p pck
 PASS: gdb.ada/complete.exp: complete p pck.
 PASS: gdb.ada/complete.exp: complete p <pck__my
 PASS: gdb.ada/complete.exp: interactive complete 'print some'
 PASS: gdb.ada/complete.exp: complete p ambig
 PASS: gdb.ada/complete.exp: complete p ambiguous_f
 PASS: gdb.ada/complete.exp: complete p ambiguous_func
 PASS: gdb.ada/complete.exp: set max-completions unlimited
 PASS: gdb.ada/complete.exp: complete break ada

... the squashed testsuite/gdb.sum ended up with those tests above
sorted lexically:

 PASS: gdb.ada/complete.exp: compilation foo.adb
 PASS: gdb.ada/complete.exp: complete break ada
 PASS: gdb.ada/complete.exp: complete p <Exported
 PASS: gdb.ada/complete.exp: complete p <__gnat_ada_main_prog
 PASS: gdb.ada/complete.exp: complete p <pck__my
 PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra
 PASS: gdb.ada/complete.exp: complete p ambig
 PASS: gdb.ada/complete.exp: complete p ambiguous_f
 PASS: gdb.ada/complete.exp: complete p ambiguous_func
 PASS: gdb.ada/complete.exp: complete p exported
 PASS: gdb.ada/complete.exp: complete p external_ident
 PASS: gdb.ada/complete.exp: complete p inner.insi
 PASS: gdb.ada/complete.exp: complete p insi
 PASS: gdb.ada/complete.exp: complete p local_ident
 PASS: gdb.ada/complete.exp: complete p my_glob
 PASS: gdb.ada/complete.exp: complete p not_in_sco
 PASS: gdb.ada/complete.exp: complete p pck
 PASS: gdb.ada/complete.exp: complete p pck.
 PASS: gdb.ada/complete.exp: complete p pck.inne
 PASS: gdb.ada/complete.exp: complete p pck.inne
 PASS: gdb.ada/complete.exp: complete p pck.inner.
 PASS: gdb.ada/complete.exp: complete p pck.inner.ins
 PASS: gdb.ada/complete.exp: complete p pck.ins
 PASS: gdb.ada/complete.exp: complete p pck.local_ident
 PASS: gdb.ada/complete.exp: complete p pck.my
 PASS: gdb.ada/complete.exp: complete p pck__inner__ins
 PASS: gdb.ada/complete.exp: complete p pck__local_ident
 PASS: gdb.ada/complete.exp: complete p side
 PASS: gdb.ada/complete.exp: complete p some
 PASS: gdb.ada/complete.exp: interactive complete 'print some'
 PASS: gdb.ada/complete.exp: p <Exported_Capitalized>
 PASS: gdb.ada/complete.exp: p Exported_Capitalized
 PASS: gdb.ada/complete.exp: p exported_capitalized
 PASS: gdb.ada/complete.exp: set max-completions unlimited

... which is clearly incorrect.
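
That lexical order appears to come straight from the per-harness sort
in the .py script below: each result is keyed by (test name,
per-harness index), and harness.results.sort() then orders primarily by
name.  A minimal sketch of the effect (the three sample lines are taken
from the run above):

  lines = ['PASS: gdb.ada/complete.exp: complete p my_glob',
           'PASS: gdb.ada/complete.exp: complete p insi',
           'PASS: gdb.ada/complete.exp: complete break ada']
  results = []
  for line in lines:
      # Same key shape as the .py script: (test name, index).
      name = line.split (': ', 1)[1]
      results.append (((name, len (results)), line))
  results.sort()
  for key, line in results:
      print (line)

This prints 'complete break ada' first, matching the merged gdb.sum
above rather than the execution order.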

So you won't see the problem if you compare test results of
two runs that both postdate the dg-extract-results update,
and if they're both run in parallel mode.  I assume the problem
is visible if you compare a parallel mode run against
a serial mode run, since the latter won't sort.

Is this something that can be easily fixed?

Thanks,
Pedro Alves
  

Patch

# HG changeset patch
# Parent  a912cfbcb6270fabdf12dd47297d162123e1e738
Update dg-extract-results.* from gcc

diff --git a/gdb/testsuite/dg-extract-results.py b/gdb/testsuite/dg-extract-results.py
new file mode 100644
--- /dev/null
+++ b/gdb/testsuite/dg-extract-results.py
@@ -0,0 +1,592 @@ 
+#!/usr/bin/python
+#
+# Copyright (C) 2014 Free Software Foundation, Inc.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+import sys
+import getopt
+import re
+import io
+from datetime import datetime
+from operator import attrgetter
+
+# True if unrecognised lines should cause a fatal error.  Might want to turn
+# this on by default later.
+strict = False
+
+# True if the order of .log segments should match the .sum file, false if
+# they should keep the original order.
+sort_logs = True
+
+# A version of open() that is safe against whatever binary output
+# might be added to the log.
+def safe_open (filename):
+    if sys.version_info >= (3, 0):
+        return open (filename, 'r', errors = 'surrogateescape')
+    return open (filename, 'r')
+
+# Force stdout to handle escape sequences from a safe_open file.
+if sys.version_info >= (3, 0):
+    sys.stdout = io.TextIOWrapper (sys.stdout.buffer,
+                                   errors = 'surrogateescape')
+
+class Named:
+    def __init__ (self, name):
+        self.name = name
+
+class ToolRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # The variations run for this tool, mapped by --target_board name.
+        self.variations = dict()
+
+    # Return the VariationRun for variation NAME.
+    def get_variation (self, name):
+        if name not in self.variations:
+            self.variations[name] = VariationRun (name)
+        return self.variations[name]
+
+class VariationRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # A segment of text before the harness runs start, describing which
+        # baseboard files were loaded for the target.
+        self.header = None
+        # The harnesses run for this variation, mapped by filename.
+        self.harnesses = dict()
+        # A list giving the number of times each type of result has
+        # been seen.
+        self.counts = []
+
+    # Return the HarnessRun for harness NAME.
+    def get_harness (self, name):
+        if name not in self.harnesses:
+            self.harnesses[name] = HarnessRun (name)
+        return self.harnesses[name]
+
+class HarnessRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # Segments of text that make up the harness run, mapped by a test-based
+        # key that can be used to order them.
+        self.segments = dict()
+        # Segments of text that make up the harness run but which have
+        # no recognized test results.  These are typically harnesses that
+        # are completely skipped for the target.
+        self.empty = []
+        # A list of results.  Each entry is a pair in which the first element
+        # is a unique sorting key and in which the second is the full
+        # PASS/FAIL line.
+        self.results = []
+
+    # Add a segment of text to the harness run.  If the segment includes
+    # test results, KEY is an example of one of them, and can be used to
+    # combine the individual segments in order.  If the segment has no
+    # test results (e.g. because the harness doesn't do anything for the
+    # current configuration) then KEY is None instead.  In that case
+    # just collect the segments in the order that we see them.
+    def add_segment (self, key, segment):
+        if key:
+            assert key not in self.segments
+            self.segments[key] = segment
+        else:
+            self.empty.append (segment)
+
+class Segment:
+    def __init__ (self, filename, start):
+        self.filename = filename
+        self.start = start
+        self.lines = 0
+
+class Prog:
+    def __init__ (self):
+        # The variations specified on the command line.
+        self.variations = []
+        # The variations seen in the input files.
+        self.known_variations = set()
+        # The tools specified on the command line.
+        self.tools = []
+        # Whether to create .sum rather than .log output.
+        self.do_sum = True
+        # Regexps used while parsing.
+        self.test_run_re = re.compile (r'^Test Run By (\S+) on (.*)$')
+        self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$')
+        self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
+                                     r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
+                                     r'|KFAIL):\s*(.+)')
+        self.completed_re = re.compile (r'.* completed at (.*)')
+        # Pieces of text to write at the head of the output.
+        # start_line is a pair in which the first element is a datetime
+        # and in which the second is the associated 'Test Run By' line.
+        self.start_line = None
+        self.native_line = ''
+        self.target_line = ''
+        self.host_line = ''
+        self.acats_premable = ''
+        # Pieces of text to write at the end of the output.
+        # end_line is like start_line but for the 'runtest completed' line.
+        self.acats_failures = []
+        self.version_output = ''
+        self.end_line = None
+        # Known summary types.
+        self.count_names = [
+            '# of DejaGnu errors\t\t',
+            '# of expected passes\t\t',
+            '# of unexpected failures\t',
+            '# of unexpected successes\t',
+            '# of expected failures\t\t',
+            '# of unknown successes\t\t',
+            '# of known failures\t\t',
+            '# of untested testcases\t\t',
+            '# of unresolved testcases\t',
+            '# of unsupported tests\t\t'
+        ]
+        self.runs = dict()
+
+    def usage (self):
+        name = sys.argv[0]
+        sys.stderr.write ('Usage: ' + name
+                          + ''' [-t tool] [-l variant-list] [-L] log-or-sum-file ...
+
+    tool           The tool (e.g. g++, libffi) for which to create a
+                   new test summary file.  If not specified then output
+                   is created for all tools.
+    variant-list   One or more test variant names.  If the list is
+                   not specified then one is constructed from all
+                   variants in the files for <tool>.
+    sum-file       A test summary file with the format of those
+                   created by runtest from DejaGnu.
+    If -L is used, merge *.log files instead of *.sum.  In this
+    mode the exact order of lines may not be preserved, just different
+    Running *.exp chunks should be in correct order.
+''')
+        sys.exit (1)
+
+    def fatal (self, what, string):
+        if not what:
+            what = sys.argv[0]
+        sys.stderr.write (what + ': ' + string + '\n')
+        sys.exit (1)
+
+    # Parse the command-line arguments.
+    def parse_cmdline (self):
+        try:
+            (options, self.files) = getopt.getopt (sys.argv[1:], 'l:t:L')
+            if len (self.files) == 0:
+                self.usage()
+            for (option, value) in options:
+                if option == '-l':
+                    self.variations.append (value)
+                elif option == '-t':
+                    self.tools.append (value)
+                else:
+                    self.do_sum = False
+        except getopt.GetoptError as e:
+            self.fatal (None, e.msg)
+
+    # Try to parse time string TIME, returning an arbitrary time on failure.
+    # Getting this right is just a nice-to-have so failures should be silent.
+    def parse_time (self, time):
+        try:
+            return datetime.strptime (time, '%c')
+        except ValueError:
+            return datetime.now()
+
+    # Parse an integer and abort on failure.
+    def parse_int (self, filename, value):
+        try:
+            return int (value)
+        except ValueError:
+            self.fatal (filename, 'expected an integer, got: ' + value)
+
+    # Return a list that represents no test results.
+    def zero_counts (self):
+        return [0 for x in self.count_names]
+
+    # Return the ToolRun for tool NAME.
+    def get_tool (self, name):
+        if name not in self.runs:
+            self.runs[name] = ToolRun (name)
+        return self.runs[name]
+
+    # Add the result counts in list FROMC to TOC.
+    def accumulate_counts (self, toc, fromc):
+        for i in range (len (self.count_names)):
+            toc[i] += fromc[i]
+
+    # Parse the list of variations after 'Schedule of variations:'.
+    # Return the number seen.
+    def parse_variations (self, filename, file):
+        num_variations = 0
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'could not parse variation list')
+            if line == '\n':
+                break
+            self.known_variations.add (line.strip())
+            num_variations += 1
+        return num_variations
+
+    # Parse from the first line after 'Running target ...' to the end
+    # of the run's summary.
+    def parse_run (self, filename, file, tool, variation, num_variations):
+        header = None
+        harness = None
+        segment = None
+        final_using = 0
+
+        # If this is the first run for this variation, add any text before
+        # the first harness to the header.
+        if not variation.header:
+            segment = Segment (filename, file.tell())
+            variation.header = segment
+
+        # Parse the rest of the summary (the '# of ' lines).
+        if len (variation.counts) == 0:
+            variation.counts = self.zero_counts()
+
+        # Parse up until the first line of the summary.
+        if num_variations == 1:
+            end = '\t\t=== ' + tool.name + ' Summary ===\n'
+        else:
+            end = ('\t\t=== ' + tool.name + ' Summary for '
+                   + variation.name + ' ===\n')
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'no recognised summary line')
+            if line == end:
+                break
+
+            # Look for the start of a new harness.
+            if line.startswith ('Running ') and line.endswith (' ...\n'):
+                # Close off the current harness segment, if any.
+                if harness:
+                    segment.lines -= final_using
+                    harness.add_segment (first_key, segment)
+                name = line[len ('Running '):-len(' ...\n')]
+                harness = variation.get_harness (name)
+                segment = Segment (filename, file.tell())
+                first_key = None
+                final_using = 0
+                continue
+
+            # Record test results.  Associate the first test result with
+            # the harness segment, so that if a run for a particular harness
+            # has been split up, we can reassemble the individual segments
+            # in a sensible order.
+            #
+            # dejagnu sometimes issues warnings about the testing environment
+            # before running any tests.  Treat them as part of the header
+            # rather than as a test result.
+            match = self.result_re.match (line)
+            if match and (harness or not line.startswith ('WARNING:')):
+                if not harness:
+                    self.fatal (filename, 'saw test result before harness name')
+                name = match.group (2)
+                # Ugly hack to get the right order for gfortran.
+                if name.startswith ('gfortran.dg/g77/'):
+                    name = 'h' + name
+                key = (name, len (harness.results))
+                harness.results.append ((key, line))
+                if not first_key and sort_logs:
+                    first_key = key
+                if line.startswith ('ERROR: (DejaGnu)'):
+                    for i in range (len (self.count_names)):
+                        if 'DejaGnu errors' in self.count_names[i]:
+                            variation.counts[i] += 1
+                            break
+
+            # 'Using ...' lines are only interesting in a header.  Splitting
+            # the test up into parallel runs leads to more 'Using ...' lines
+            # than there would be in a single log.
+            if line.startswith ('Using '):
+                final_using += 1
+            else:
+                final_using = 0
+
+            # Add other text to the current segment, if any.
+            if segment:
+                segment.lines += 1
+
+        # Close off the final harness segment, if any.
+        if harness:
+            segment.lines -= final_using
+            harness.add_segment (first_key, segment)
+
+        while True:
+            before = file.tell()
+            line = file.readline()
+            if line == '':
+                break
+            if line == '\n':
+                continue
+            if not line.startswith ('# '):
+                file.seek (before)
+                break
+            found = False
+            for i in range (len (self.count_names)):
+                if line.startswith (self.count_names[i]):
+                    count = line[len (self.count_names[i]):-1].strip()
+                    variation.counts[i] += self.parse_int (filename, count)
+                    found = True
+                    break
+            if not found:
+                self.fatal (filename, 'unknown test result: ' + line[:-1])
+
+    # Parse an acats run, which uses a different format from dejagnu.
+    # We have just skipped over '=== acats configuration ==='.
+    def parse_acats_run (self, filename, file):
+        # Parse the preamble, which describes the configuration and logs
+        # the creation of support files.
+        record = (self.acats_premable == '')
+        if record:
+            self.acats_premable = '\t\t=== acats configuration ===\n'
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'could not parse acats preamble')
+            if line == '\t\t=== acats tests ===\n':
+                break
+            if record:
+                self.acats_premable += line
+
+        # Parse the test results themselves, using a dummy variation name.
+        tool = self.get_tool ('acats')
+        variation = tool.get_variation ('none')
+        self.parse_run (filename, file, tool, variation, 1)
+
+        # Parse the failure list.
+        while True:
+            before = file.tell()
+            line = file.readline()
+            if line.startswith ('*** FAILURES: '):
+                self.acats_failures.append (line[len ('*** FAILURES: '):-1])
+                continue
+            file.seek (before)
+            break
+
+    # Parse the final summary at the end of a log in order to capture
+    # the version output that follows it.
+    def parse_final_summary (self, filename, file):
+        record = (self.version_output == '')
+        while True:
+            line = file.readline()
+            if line == '':
+                break
+            if line.startswith ('# of '):
+                continue
+            if record:
+                self.version_output += line
+            if line == '\n':
+                break
+
+    # Parse a .log or .sum file.
+    def parse_file (self, filename, file):
+        tool = None
+        target = None
+        num_variations = 1
+        while True:
+            line = file.readline()
+            if line == '':
+                return
+
+            # Parse the list of variations, which comes before the test
+            # runs themselves.
+            if line.startswith ('Schedule of variations:'):
+                num_variations = self.parse_variations (filename, file)
+                continue
+
+            # Parse a testsuite run for one tool/variation combination.
+            if line.startswith ('Running target '):
+                name = line[len ('Running target '):-1]
+                if not tool:
+                    self.fatal (filename, 'could not parse tool name')
+                if name not in self.known_variations:
+                    self.fatal (filename, 'unknown target: ' + name)
+                self.parse_run (filename, file, tool,
+                                tool.get_variation (name),
+                                num_variations)
+                # If there is only one variation then there is no separate
+                # summary for it.  Record any following version output.
+                if num_variations == 1:
+                    self.parse_final_summary (filename, file)
+                continue
+
+            # Parse the start line.  In the case where several files are being
+            # parsed, pick the one with the earliest time.
+            match = self.test_run_re.match (line)
+            if match:
+                time = self.parse_time (match.group (2))
+                if not self.start_line or self.start_line[0] > time:
+                    self.start_line = (time, line)
+                continue
+
+            # Parse the form used for native testing.
+            if line.startswith ('Native configuration is '):
+                self.native_line = line
+                continue
+
+            # Parse the target triplet.
+            if line.startswith ('Target is '):
+                self.target_line = line
+                continue
+
+            # Parse the host triplet.
+            if line.startswith ('Host   is '):
+                self.host_line = line
+                continue
+
+            # Parse the acats premable.
+            if line == '\t\t=== acats configuration ===\n':
+                self.parse_acats_run (filename, file)
+                continue
+
+            # Parse the tool name.
+            match = self.tool_re.match (line)
+            if match:
+                tool = self.get_tool (match.group (1))
+                continue
+
+            # Skip over the final summary (which we instead create from
+            # individual runs) and parse the version output.
+            if tool and line == '\t\t=== ' + tool.name + ' Summary ===\n':
+                if file.readline() != '\n':
+                    self.fatal (filename, 'expected blank line after summary')
+                self.parse_final_summary (filename, file)
+                continue
+
+            # Parse the completion line.  In the case where several files
+            # are being parsed, pick the one with the latest time.
+            match = self.completed_re.match (line)
+            if match:
+                time = self.parse_time (match.group (1))
+                if not self.end_line or self.end_line[0] < time:
+                    self.end_line = (time, line)
+                continue
+
+            # Sanity check to make sure that important text doesn't get
+            # dropped accidentally.
+            if strict and line.strip() != '':
+                self.fatal (filename, 'unrecognised line: ' + line[:-1])
+
+    # Output a segment of text.
+    def output_segment (self, segment):
+        with safe_open (segment.filename) as file:
+            file.seek (segment.start)
+            for i in range (segment.lines):
+                sys.stdout.write (file.readline())
+
+    # Output a summary giving the number of times each type of result has
+    # been seen.
+    def output_summary (self, tool, counts):
+        for i in range (len (self.count_names)):
+            name = self.count_names[i]
+            # dejagnu only prints result types that were seen at least once,
+            # but acats always prints a number of unexpected failures.
+            if (counts[i] > 0
+                or (tool.name == 'acats'
+                    and name.startswith ('# of unexpected failures'))):
+                sys.stdout.write ('%s%d\n' % (name, counts[i]))
+
+    # Output unified .log or .sum information for a particular variation,
+    # with a summary at the end.
+    def output_variation (self, tool, variation):
+        self.output_segment (variation.header)
+        for harness in sorted (variation.harnesses.values(),
+                               key = attrgetter ('name')):
+            sys.stdout.write ('Running ' + harness.name + ' ...\n')
+            if self.do_sum:
+                harness.results.sort()
+                for (key, line) in harness.results:
+                    sys.stdout.write (line)
+            else:
+                # Rearrange the log segments into test order (but without
+                # rearranging text within those segments).
+                for key in sorted (harness.segments.keys()):
+                    self.output_segment (harness.segments[key])
+                for segment in harness.empty:
+                    self.output_segment (segment)
+        if len (self.variations) > 1:
+            sys.stdout.write ('\t\t=== ' + tool.name + ' Summary for '
+                              + variation.name + ' ===\n\n')
+            self.output_summary (tool, variation.counts)
+
+    # Output unified .log or .sum information for a particular tool,
+    # with a summary at the end.
+    def output_tool (self, tool):
+        counts = self.zero_counts()
+        if tool.name == 'acats':
+            # acats doesn't use variations, so just output everything.
+            # It also has a different approach to whitespace.
+            sys.stdout.write ('\t\t=== ' + tool.name + ' tests ===\n')
+            for variation in tool.variations.values():
+                self.output_variation (tool, variation)
+                self.accumulate_counts (counts, variation.counts)
+            sys.stdout.write ('\t\t=== ' + tool.name + ' Summary ===\n')
+        else:
+            # Output the results in the usual dejagnu runtest format.
+            sys.stdout.write ('\n\t\t=== ' + tool.name + ' tests ===\n\n'
+                              'Schedule of variations:\n')
+            for name in self.variations:
+                if name in tool.variations:
+                    sys.stdout.write ('    ' + name + '\n')
+            sys.stdout.write ('\n')
+            for name in self.variations:
+                if name in tool.variations:
+                    variation = tool.variations[name]
+                    sys.stdout.write ('Running target '
+                                      + variation.name + '\n')
+                    self.output_variation (tool, variation)
+                    self.accumulate_counts (counts, variation.counts)
+            sys.stdout.write ('\n\t\t=== ' + tool.name + ' Summary ===\n\n')
+        self.output_summary (tool, counts)
+
+    def main (self):
+        self.parse_cmdline()
+        try:
+            # Parse the input files.
+            for filename in self.files:
+                with safe_open (filename) as file:
+                    self.parse_file (filename, file)
+
+            # Decide what to output.
+            if len (self.variations) == 0:
+                self.variations = sorted (self.known_variations)
+            else:
+                for name in self.variations:
+                    if name not in self.known_variations:
+                        self.fatal (None, 'no results for ' + name)
+            if len (self.tools) == 0:
+                self.tools = sorted (self.runs.keys())
+
+            # Output the header.
+            if self.start_line:
+                sys.stdout.write (self.start_line[1])
+            sys.stdout.write (self.native_line)
+            sys.stdout.write (self.target_line)
+            sys.stdout.write (self.host_line)
+            sys.stdout.write (self.acats_premable)
+
+            # Output the main body.
+            for name in self.tools:
+                if name not in self.runs:
+                    self.fatal (None, 'no results for ' + name)
+                self.output_tool (self.runs[name])
+
+            # Output the footer.
+            if len (self.acats_failures) > 0:
+                sys.stdout.write ('*** FAILURES: '
+                                  + ' '.join (self.acats_failures) + '\n')
+            sys.stdout.write (self.version_output)
+            if self.end_line:
+                sys.stdout.write (self.end_line[1])
+        except IOError as e:
+            self.fatal (e.filename, e.strerror)
+
+Prog().main()
diff --git a/gdb/testsuite/dg-extract-results.sh b/gdb/testsuite/dg-extract-results.sh
--- a/gdb/testsuite/dg-extract-results.sh
+++ b/gdb/testsuite/dg-extract-results.sh
@@ -6,7 +6,7 @@ 
 # The resulting file can be used with test result comparison scripts for
 # results from tests that were run in parallel.  See usage() below.
 
-# Copyright (C) 2008-2018 Free Software Foundation, Inc.
+# Copyright (C) 2008, 2009, 2010, 2012 Free Software Foundation
 # Contributed by Janis Johnson <janis187@us.ibm.com>
 #
 # This file is part of GCC.
@@ -22,7 +22,9 @@ 
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
-# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
 
 PROGNAME=dg-extract-results.sh
 
@@ -30,7 +32,7 @@  PROGNAME=dg-extract-results.sh
 PYTHON_VER=`echo "$0" | sed 's/sh$/py/'`
 if test "$PYTHON_VER" != "$0" &&
    test -f "$PYTHON_VER" &&
-   python -c 'import sys; sys.exit (0 if sys.version_info >= (2, 6) else 1)' \
+   python -c 'import sys, getopt, re, io, datetime, operator; sys.exit (0 if sys.version_info >= (2, 6) else 1)' \
      > /dev/null 2> /dev/null; then
   exec python $PYTHON_VER "$@"
 fi
@@ -367,10 +369,11 @@  EOF
 BEGIN {
   variant="$VAR"
   tool="$TOOL"
-  passcnt=0; failcnt=0; untstcnt=0; xpasscnt=0; xfailcnt=0; kpasscnt=0; kfailcnt=0; unsupcnt=0; unrescnt=0;
+  passcnt=0; failcnt=0; untstcnt=0; xpasscnt=0; xfailcnt=0; kpasscnt=0; kfailcnt=0; unsupcnt=0; unrescnt=0; dgerrorcnt=0;
   curvar=""; insummary=0
 }
 /^Running target /		{ curvar = \$3; next }
+/^ERROR: \(DejaGnu\)/		{ if (variant == curvar) dgerrorcnt += 1 }
 /^# of /			{ if (variant == curvar) insummary = 1 }
 /^# of expected passes/		{ if (insummary == 1) passcnt += \$5; next; }
 /^# of unexpected successes/	{ if (insummary == 1) xpasscnt += \$5; next; }
@@ -388,6 +391,7 @@  BEGIN {
 { next }
 END {
   printf ("\t\t=== %s Summary for %s ===\n\n", tool, variant)
+  if (dgerrorcnt != 0) printf ("# of DejaGnu errors\t\t%d\n", dgerrorcnt)
   if (passcnt != 0) printf ("# of expected passes\t\t%d\n", passcnt)
   if (failcnt != 0) printf ("# of unexpected failures\t%d\n", failcnt)
   if (xpasscnt != 0) printf ("# of unexpected successes\t%d\n", xpasscnt)
@@ -417,8 +421,9 @@  TOTAL_AWK=${TMP}/total.awk
 cat << EOF > $TOTAL_AWK
 BEGIN {
   tool="$TOOL"
-  passcnt=0; failcnt=0; untstcnt=0; xpasscnt=0; xfailcnt=0; kfailcnt=0; unsupcnt=0; unrescnt=0
+  passcnt=0; failcnt=0; untstcnt=0; xpasscnt=0; xfailcnt=0; kfailcnt=0; unsupcnt=0; unrescnt=0; dgerrorcnt=0
 }
+/^# of DejaGnu errors/		{ dgerrorcnt += \$5 }
 /^# of expected passes/		{ passcnt += \$5 }
 /^# of unexpected failures/	{ failcnt += \$5 }
 /^# of unexpected successes/	{ xpasscnt += \$5 }
@@ -430,6 +435,7 @@  BEGIN {
 /^# of unsupported tests/	{ unsupcnt += \$5 }
 END {
   printf ("\n\t\t=== %s Summary ===\n\n", tool)
+  if (dgerrorcnt != 0) printf ("# of DejaGnu errors\t\t%d\n", dgerrorcnt)
   if (passcnt != 0) printf ("# of expected passes\t\t%d\n", passcnt)
   if (failcnt != 0) printf ("# of unexpected failures\t%d\n", failcnt)
   if (xpasscnt != 0) printf ("# of unexpected successes\t%d\n", xpasscnt)