Make dg-extract-results.sh explicitly treat .{sum,log} files as text
Commit Message
This weekend I was running GDB's testsuite with many options enabled,
and I noticed that, for some specific configurations (specifically
when testing gdbserver), I was getting the following error:
dg-extract-results.sh: sum files are for multiple tools, specify a tool
I remembered seeing this a lot before, so I spent some time
investigating the cause...
First, I found the line on dg-extract-results.sh that printed this
error message. The code does:
CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
if [ $CNT -eq 1 ]; then
  TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
else
  msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
  msg ""
  usage
  exit 1
fi
So, the first thing to do was to identify why $CNT was not 1. When I
ran the command that generated the result for CNT, I found:
$ grep '=== .* tests ===' `find outputs -name gdb.log -print` \
| awk '{ print $3 }' | sort -u | wc -l
7
Hm, strange. So, removing the wc command, the output was:
gdb
outputs/gdb.base/gdb-sigterm/gdb.log
outputs/gdb.threads/non-ldr-exc-1/gdb.log
outputs/gdb.threads/non-ldr-exc-2/gdb.log
outputs/gdb.threads/non-ldr-exc-3/gdb.log
outputs/gdb.threads/non-ldr-exc-4/gdb.log
outputs/gdb.threads/thread-execl/gdb.log
And, when I used only the grep command, without the awk and the sort,
I saw that the majority of the lines were like this:
outputs/gdb.trace/tfind/gdb.log: === gdb tests ===
Which would generate the first line in the output above, "gdb". But,
for the other 6 files above, I saw:
Binary file outputs/gdb.base/gdb-sigterm/gdb.log matches
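The effect on the count can be reproduced without any real log files.
The sketch below is only an illustration: it feeds awk two hand-written
lines shaped like the two kinds of grep output shown above (the paths
are copied from this message; the variable names are made up) and runs
the same awk/sort/wc steps on them.

```shell
# Two sample lines mimicking grep's output: one ordinary match and one
# "Binary file ... matches" notice (both copied from the output above).
sample_output='outputs/gdb.trace/tfind/gdb.log: === gdb tests ===
Binary file outputs/gdb.base/gdb-sigterm/gdb.log matches'

# Field 3 is the tool name for the ordinary match, but the file path
# for the binary notice.
third_fields=$(printf '%s\n' "$sample_output" | awk '{ print $3 }' | sort -u)

# Same counting step as the script; tr strips the padding that some wc
# implementations add.
count=$(printf '%s\n' "$third_fields" | wc -l | tr -d ' ')

printf '%s\n' "$third_fields"
echo "distinct values: $count"
```

Each file that grep reports as binary contributes its own path as a
distinct "tool", which is how six such files plus "gdb" add up to 7.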
Right, the problem is that grep is assuming those 6 files are binary,
not text. This happens because of this code, in grep:
<http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n526>
static enum textbin
buffer_textbin (char *buf, size_t size)
{
  if (eolbyte && memchr (buf, '\0', size))
    return TEXTBIN_BINARY;
  ...
If one looks at those 6 files, one will find that they all contain a
NUL byte. In every case it comes from the same message, printed by
gdbserver's code:
input_interrupt, count = 0 c = 0 ('^@')
(The ^@ above is the NUL byte.)
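One way to check a given log for NUL bytes without relying on grep is
to compare its size before and after stripping them. This is a rough
sketch; the function name has_nul and the temporary sample files are
made up for illustration.

```shell
# Succeed if the file named by $1 contains at least one NUL byte.
# (has_nul is a hypothetical helper, not part of any existing script.)
has_nul () {
  total=$(wc -c < "$1" | tr -d ' ')
  without_nul=$(tr -d '\000' < "$1" | wc -c | tr -d ' ')
  [ "$total" -ne "$without_nul" ]
}

# Build two small sample logs: one clean, one with a NUL byte in a
# line shaped like the gdbserver message above.
tmpdir=$(mktemp -d)
printf '=== gdb tests ===\n' > "$tmpdir/clean.log"
printf "=== gdb tests ===\ninput_interrupt, count = 0 c = 0 ('\\000')\n" > "$tmpdir/nul.log"

for f in "$tmpdir/clean.log" "$tmpdir/nul.log"; do
  if has_nul "$f"; then
    echo "$f: contains NUL"
  else
    echo "$f: no NUL"
  fi
done

rm -rf "$tmpdir"
```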
Maybe the right fix would be to improve input_interrupt in
gdbserver/remote-utils.c (see PR server/16359), but I decided to go
the easier route and adjust dg-extract-results.sh to be more
robust when dealing with the sum and log files. To do that, I
suggest passing the '--text' option to grep, which overrides grep's
binary-file detection and forces it to treat every file as text.
For me, it makes sense to do that because sum and log files are
always meant to be text, even if a stray NUL byte ends up in them.
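The difference can be seen on a small file. The sketch below assumes
GNU grep: --text (short form -a) is a GNU option, and the wording and
destination of the "binary file matches" notice vary between grep
versions, so stderr is discarded here. The sample file and variable
names are made up.

```shell
# A tiny stand-in for a .sum/.log file that has picked up a NUL byte.
sample=$(mktemp)
printf '=== gdb tests ===\nstray byte: \000\n' > "$sample"

# Default behaviour: grep sees the NUL, treats the file as binary and
# does not print the matching line itself.
without_text=$(grep '=== .* tests ===' "$sample" 2>/dev/null || true)

# With --text, the file is processed as text and the line is printed.
with_text=$(grep --text '=== .* tests ===' "$sample")

echo "default: [$without_text]"
echo "--text:  [$with_text]"

rm -f "$sample"
```

The patch appends --text after the file operands, which GNU grep
accepts because it permutes its arguments (unless POSIXLY_CORRECT is
set); the sketch puts the option first only for readability.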
OK to apply?
2014-12-14 Sergio Durigan Junior <sergiodj@redhat.com>
* dg-extract-results.sh: Pass '--text' option to grep when
filtering .{sum,log} files, which may contain binary data.
---
gdb/testsuite/dg-extract-results.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
Comments
On Mon, 15 Dec 2014 21:59:56 +0100, Sergio Durigan Junior wrote:
> --- a/gdb/testsuite/dg-extract-results.sh
> +++ b/gdb/testsuite/dg-extract-results.sh
Master of dg-extract-results.sh is gcc/contrib/dg-extract-results.sh
Which is rewritten to Python BTW. (I have noticed it only now.)
Regards,
Jan
On Monday, December 15 2014, Jan Kratochvil wrote:
> Master of dg-extract-results.sh is gcc/contrib/dg-extract-results.sh
>
> Which is rewritten to Python BTW. (I have noticed it only now.)
Ops, OK, thanks, I will submit the patch to gcc-patches then. Is there
any documented procedure to merge this file later? BTW, I see that the
Python version already takes care of binary logs.
On Mon, 15 Dec 2014 23:07:49 +0100, Sergio Durigan Junior wrote:
> Is there any documented procedure to merge this file later?
git log gdb/testsuite/dg-extract-results.sh
shows some commits like:
Merge dg-extract-results.sh from gcc upstream (svn 195224).
or
* dg-extract-results.sh: Sync with GCC HEAD (import r155655, r157175
and r157645).
> Ops, OK, thanks, I will submit the patch to gcc-patches then.
+
> BTW, I see that the Python version already takes care of binary logs.
Does the patch to gcc-patches make sense then?
Jan
On Monday, December 15 2014, Jan Kratochvil wrote:
> On Mon, 15 Dec 2014 23:07:49 +0100, Sergio Durigan Junior wrote:
>> Is there any documented procedure to merge this file later?
>
> git log gdb/testsuite/dg-extract-results.sh
> shows some commits like:
> Merge dg-extract-results.sh from gcc upstream (svn 195224).
> or
> * dg-extract-results.sh: Sync with GCC HEAD (import r155655, r157175
> and r157645).
OK, thanks.
>> Ops, OK, thanks, I will submit the patch to gcc-patches then.
> +
>> BTW, I see that the Python version already takes care of binary logs.
>
> Does the patch to gcc-patches make sense then?
IMO it still does for the case when there is no Python available in the
system.
@@ -120,9 +120,9 @@ if [ -z "$TOOL" ]; then
   # If no tool was specified, all specified summary files must be for
   # the same tool.
-  CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
+  CNT=`grep '=== .* tests ===' $SUM_FILES --text | $AWK '{ print $3 }' | sort -u | wc -l`
   if [ $CNT -eq 1 ]; then
-    TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
+    TOOL=`grep '=== .* tests ===' $FIRST_SUM --text | $AWK '{ print $2 }'`
   else
     msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
     msg ""
@@ -133,7 +133,7 @@ else
   # Ignore the specified summary files that are not for this tool. This
   # should keep the relevant files in the same order.
-  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES`
+  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES --text`
   if test -z "$SUM_FILES" ; then
     msg "${PROGNAME}: none of the specified files are results for $TOOL"
     exit 1
@@ -222,7 +222,7 @@ else
   VARIANTS=""
   for VAR in $VARS
   do
-    grep "Running target $VAR" $SUM_FILES > /dev/null && VARIANTS="$VARIANTS $VAR"
+    grep "Running target $VAR" $SUM_FILES --text > /dev/null && VARIANTS="$VARIANTS $VAR"
   done
 fi