Make dg-extract-results.sh explicitly treat .{sum,log} files as text

Message ID 1418677196-20721-1-git-send-email-sergiodj@redhat.com
State New, archived

Commit Message

Sergio Durigan Junior Dec. 15, 2014, 8:59 p.m. UTC
  This weekend I was running GDB's testsuite with many options enabled,
and I noticed that, for some specific configurations (specifically
when testing gdbserver), I was getting the following error:

 dg-extract-results.sh: sum files are for multiple tools, specify a tool

I remembered seeing this a lot before, so I spent some time
investigating the cause...

First, I found the line on dg-extract-results.sh that printed this
error message.  The code does:

  CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
  if [ $CNT -eq 1 ]; then
    TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
  else
    msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
    msg ""
    usage
    exit 1
  fi

So, the first thing to do was to identify why $CNT was not 1.  When I
ran the command that generated the result for CNT, I found:

  $ grep '=== .* tests ===' `find outputs -name gdb.log -print` \
     | awk '{ print $3 }' | sort -u | wc -l
  7

Hm, strange.  So, removing the wc command, the output was:

  gdb
  outputs/gdb.base/gdb-sigterm/gdb.log
  outputs/gdb.threads/non-ldr-exc-1/gdb.log
  outputs/gdb.threads/non-ldr-exc-2/gdb.log
  outputs/gdb.threads/non-ldr-exc-3/gdb.log
  outputs/gdb.threads/non-ldr-exc-4/gdb.log
  outputs/gdb.threads/thread-execl/gdb.log

And, when I used only the grep command, without the awk and the sort,
I saw that the majority of the lines were like this:

  outputs/gdb.trace/tfind/gdb.log:                === gdb tests ===

This produces the first line in the output above, "gdb".  But,
for the other 6 files above, I saw:

  Binary file outputs/gdb.base/gdb-sigterm/gdb.log matches

Right, the problem is that grep is assuming those 6 files are binary,
not text.  This happens because of this code, in grep:

  <http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n526>

  static enum textbin
  buffer_textbin (char *buf, size_t size)
  {
    if (eolbyte && memchr (buf, '\0', size))
      return TEXTBIN_BINARY;
  ...

If one looks at those 6 files, one will find that each contains a NUL
byte.  In every case it comes from the same message, printed by
gdbserver's code:

  input_interrupt, count = 0 c = 0 ('^@')

(The ^@ above is the NUL byte.)
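
The diagnosis above can be reproduced with a small sketch.  The file
names and the helpers make_logs, has_nul and count_tools are
hypothetical; only the grep/awk/sort/wc pipeline is taken from
dg-extract-results.sh.  It assumes a grep that understands -a, the
short form of --text.

```shell
#!/bin/sh
# Sketch only: file and helper names are hypothetical.

# Write one clean log and one log with a NUL byte, mimicking the
# gdbserver "input_interrupt" line.  \047 is a single quote, \000 is NUL.
make_logs() {
  printf '\t\t=== gdb tests ===\n' > "$1/clean.log"
  {
    printf '\t\t=== gdb tests ===\n'
    printf 'input_interrupt, count = 0 c = 0 (\047\000\047)\n'
  } > "$1/nul.log"
}

# Succeed if the file contains at least one NUL byte: strip NULs and
# compare the result with the original file.
has_nul() {
  ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# Same pipeline as the script.  $1 is an optional grep flag ("" or
# "-a"); the remaining arguments are the files to scan.
count_tools() {
  flag=$1; shift
  grep $flag '=== .* tests ===' "$@" | awk '{ print $3 }' | sort -u | wc -l | tr -d ' '
}

dir=$(mktemp -d)
make_logs "$dir"
echo "without -a: $(count_tools '' "$dir"/*.log)"
echo "with -a:    $(count_tools -a "$dir"/*.log)"
rm -rf "$dir"
```

The first count depends on the grep version: a grep that writes
"Binary file ... matches" to standard output adds the file name to the
list of "tools", while newer GNU grep releases send that notice to
standard error instead.  With -a, both logs are read as text.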

Maybe the right fix would be to improve input_interrupt in
gdbserver/remote-utils.c (see PR server/16359), but I decided to go
the easier route and adjust dg-extract-results.sh to be more
robust when dealing with the sum and log files.  To do that, I
suggest passing the '--text' option to grep, which bypasses grep's
binary-file detection and forces it to treat every file as text.  For
me, it makes sense to do that because sum and log files are meant to
be text, even when a stray NUL byte ends up in them.
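
As a minimal sketch of that idea, assuming GNU grep (where -a is the
short form of --text): the helpers text_grep and tool_of below are
hypothetical and are not part of dg-extract-results.sh.  The option is
placed before the pattern because GNU grep accepts options after file
operands only while option permutation is enabled, which is the
default but is switched off when POSIXLY_CORRECT is set.

```shell
#!/bin/sh
# Hypothetical helpers, not part of dg-extract-results.sh.

# Force grep to treat every input as text.  The option comes first so
# the call also works when option permutation is disabled.
text_grep() {
  grep -a "$@"
}

# Print the tool name found in the header of a single summary file.
# With one file operand grep adds no file-name prefix, so the tool
# name is the second field of the "=== gdb tests ===" line.
tool_of() {
  text_grep '=== .* tests ===' "$1" | awk '{ print $2 }'
}
```

A call such as `tool_of gdb.sum` would then keep working for a summary
file that contains a NUL byte.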

OK to apply?

2014-12-14  Sergio Durigan Junior  <sergiodj@redhat.com>

	* dg-extract-results.sh: Pass '--text' option to grep when
	filtering .{sum,log} files, which may contain binary data.
---
 gdb/testsuite/dg-extract-results.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
  

Comments

Jan Kratochvil Dec. 15, 2014, 9:33 p.m. UTC | #1
On Mon, 15 Dec 2014 21:59:56 +0100, Sergio Durigan Junior wrote:
> --- a/gdb/testsuite/dg-extract-results.sh
> +++ b/gdb/testsuite/dg-extract-results.sh

Master of dg-extract-results.sh is gcc/contrib/dg-extract-results.sh

Which is rewritten to Python BTW.  (I have noticed it only now.)


Regards,
Jan
  
Sergio Durigan Junior Dec. 15, 2014, 10:07 p.m. UTC | #2
On Monday, December 15 2014, Jan Kratochvil wrote:

> Master of dg-extract-results.sh is gcc/contrib/dg-extract-results.sh
>
> Which is rewritten to Python BTW.  (I have noticed it only now.)

Oops, OK, thanks, I will submit the patch to gcc-patches then.  Is there
any documented procedure to merge this file later?  BTW, I see that the
Python version already takes care of binary logs.
  
Jan Kratochvil Dec. 15, 2014, 10:15 p.m. UTC | #3
On Mon, 15 Dec 2014 23:07:49 +0100, Sergio Durigan Junior wrote:
> Is there any documented procedure to merge this file later?

git log gdb/testsuite/dg-extract-results.sh
shows some commits like:
    Merge dg-extract-results.sh from gcc upstream (svn 195224).
or
        * dg-extract-results.sh: Sync with GCC HEAD (import r155655, r157175
        and r157645).


> Oops, OK, thanks, I will submit the patch to gcc-patches then.
+
> BTW, I see that the Python version already takes care of binary logs.

Does the patch to gcc-patches make sense then?


Jan
  
Sergio Durigan Junior Dec. 15, 2014, 10:17 p.m. UTC | #4
On Monday, December 15 2014, Jan Kratochvil wrote:

> On Mon, 15 Dec 2014 23:07:49 +0100, Sergio Durigan Junior wrote:
>> Is there any documented procedure to merge this file later?
>
> git log gdb/testsuite/dg-extract-results.sh
> shows some commits like:
>     Merge dg-extract-results.sh from gcc upstream (svn 195224).
> or
>         * dg-extract-results.sh: Sync with GCC HEAD (import r155655, r157175
>         and r157645).

OK, thanks.

>> Oops, OK, thanks, I will submit the patch to gcc-patches then.
> +
>> BTW, I see that the Python version already takes care of binary logs.
>
> Does the patch to gcc-patches make sense then?

IMO it still does, for the case where Python is not available on the
system.
  

Patch

diff --git a/gdb/testsuite/dg-extract-results.sh b/gdb/testsuite/dg-extract-results.sh
index 42190ae..5fe935a 100755
--- a/gdb/testsuite/dg-extract-results.sh
+++ b/gdb/testsuite/dg-extract-results.sh
@@ -120,9 +120,9 @@  if [ -z "$TOOL" ]; then
   # If no tool was specified, all specified summary files must be for
   # the same tool.
 
-  CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
+  CNT=`grep '=== .* tests ===' $SUM_FILES --text | $AWK '{ print $3 }' | sort -u | wc -l`
   if [ $CNT -eq 1 ]; then
-    TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
+    TOOL=`grep '=== .* tests ===' $FIRST_SUM --text | $AWK '{ print $2 }'`
   else
     msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
     msg ""
@@ -133,7 +133,7 @@  else
   # Ignore the specified summary files that are not for this tool.  This
   # should keep the relevant files in the same order.
 
-  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES`
+  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES --text`
   if test -z "$SUM_FILES" ; then
     msg "${PROGNAME}: none of the specified files are results for $TOOL"
     exit 1
@@ -222,7 +222,7 @@  else
   VARIANTS=""
   for VAR in $VARS
   do
-    grep "Running target $VAR" $SUM_FILES > /dev/null && VARIANTS="$VARIANTS $VAR"
+    grep "Running target $VAR" $SUM_FILES --text > /dev/null && VARIANTS="$VARIANTS $VAR"
   done
 fi