Add macros for diagnostic control, use them in locale/weightwc.h

Message ID 54701BAA.1030805@cs.ucla.edu
State Committed
Headers

Commit Message

Paul Eggert Nov. 22, 2014, 5:14 a.m. UTC
  Joseph Myers wrote:
> it's warning about str[1] for an array
> wint_t str[1] for which only str[0] is valid and only str[0] exists to be
> initialized (but the - unreachable - code is, after propagating some
> constants, accessing str[1]).

Yeouch, GCC is even more confused than I thought.

Instead of masking the GCC bug, how about changing the libc source slightly so 
that the bug isn't tickled?  The attached patch does that, and this fixes the 
bug for me (x86-64, both GCC 4.9.2 and GCC 4.8.3) without having to fiddle with 
any pragmas.  As a bonus this patch seems to make the resulting machine code 
slightly more efficient; it's smaller, anyway.

We did this sort of thing with gnulib etc. too.  If memory serves, occasionally 
the buggy warnings were signs of other trouble within GCC, so it wasn't 
unreasonable to change the code to pacify the compiler.  I do try to file GCC 
bug reports for this sort of thing but I'm not perfect in that regard.
  

Comments

Roland McGrath Nov. 24, 2014, 11:47 p.m. UTC | #1
When I said I wanted a macro, what I had in mind was a single macro
that would wrap the affected code, e.g.

	SUPPRESS_WARNING ("maybe-uninitialized",
			  ... code inside here ...
			 )

My rationale was to structure the source code such that it would be
difficult to accidentally spread the suppression zone across more code
than really needed it (or at least be easy to be lazy and still not do
that).

Paul took my strawman "tightest possible scope" to its absurd extreme.
That maybe have been a bit of baiting on my part.  I always expected
we would find a balance between the literal tighest possible scope and
a scope that is tight enough to minimize the risk of masking warnings
we actually want to see (especially as the code is changed later)
while not making the code so contorted that it really hampers
maintainability in another way.

If GCC's limitations make it infeasible to narrow the suppression
scope very much, and/or make it impossible to use a surrounding macro
to do so like my idea, then those ideas are pretty much out the
window.  The pragma syntax is a bit verbose and clunky, so some macros
to make things more concise are still probably worthwhile.

The bottom line is that I want it to be tractable and not gratuitously
cumbersome to suppress particular warnings, but I want it to be
difficult to do so indiscriminately or accidentally and easiest to
maintain and change the code such that we won't inadvertently have
warnings suppressed for new code.  I'm far less concerned about
warnings in test code.  So, if outside of test code we in fact have a
very small number of warnings in need of suppression, perhaps I've
made a mountain of a molehill.

Regardless of the mechanics we settle on, I want to say some things
about review standards for new warning suppressions.  The bar to
adding a warning suppression should be quite high.  It should be
especially high for the "find a bug via control-flow analysis" class
of warnings such as -Wmaybe-uninitialized.  The weightwc.h example in
Joseph's patch has a little of the important information, but IMHO
it's almost a perfect example of what not to do.  It says under what
conditions the warning was observed (machine, compiler version), which
is important information.  

What's at least as important is the rationale by which we're confident
that the compiler is wrong and the code is right.  Just because a
warning has been there for a long time, and we've let it pass, and we
aren't aware of a bug in the code, does not mean that there isn't
actually a bug in the code.  I looked at the fnmatch/weightwc case a
while ago and after a few minutes of looking at the code (with which I
was not especially familiar) could not easily convince myself that I
knew for sure that use of an uninitialized value was impossible.
Joseph has since described his own analysis of the code and made the
case that it is in fact safe.  An explanation of that analysis and the
"proof" (informally speaking) of why the warning is wrong is what
should be required in comments at the site of a warning suppression.

When someone takes the time to do such analysis, he will then be close
enough to the code that he might see ways to change the code so that
the warning goes away without a suppression, as Paul did for this
case.  That is always the best possible result, and what we should be
striving for.  Accepting that a warning suppression is correct should
be the penultimate resort, and never done lightly (the last one, that
we never get to, being pessimizing the code with dead stores and the
like).


Thanks,
Roland
  
Roland McGrath Nov. 25, 2014, midnight UTC | #2
Thanks very much for tracking down that change!  That's exactly what we'd
like to see done instead of warning suppression whenever it's possible.
The only thing I'd like you to do differently is to put more explanation in
the code itself rather than only in the ChangeLog.  e.g.

	/* It's important that this be a scalar variable rather than a
	   one-element array, because GCC (at least 4.9.2 -O2 on x86-64)
	   can be confused by the array and diagnose a "used initialized"
	   in a dead branch in the findidx function.  */


Thanks,
Roland
  
Joseph Myers Nov. 25, 2014, 1:29 a.m. UTC | #3
On Mon, 24 Nov 2014, Roland McGrath wrote:

> If GCC's limitations make it infeasible to narrow the suppression
> scope very much, and/or make it impossible to use a surrounding macro
> to do so like my idea, then those ideas are pretty much out the
> window.  The pragma syntax is a bit verbose and clunky, so some macros
> to make things more concise are still probably worthwhile.

The surrounding macro doesn't provide the interval of lines with the 
pragma enabled, which is needed for it to work.

(And in some cases, a pragma for the whole file may make sense - if a test 
is deliberately testing deprecated interfaces or _FORTIFY_SOURCE runtime 
handling, disabling corresponding warnings for the whole file seems 
appropriate.)

> The bottom line is that I want it to be tractable and not gratuitously
> cumbersome to suppress particular warnings, but I want it to be
> difficult to do so indiscriminately or accidentally and easiest to
> maintain and change the code such that we won't inadvertently have
> warnings suppressed for new code.  I'm far less concerned about
> warnings in test code.  So, if outside of test code we in fact have a
> very small number of warnings in need of suppression, perhaps I've
> made a mountain of a molehill.

We have the weightwc.h warnings for which Paul has found another fix.  We 
have the nscd/connections.c "ignoring return value of 'setuid', declared 
with attribute warn_unused_result [-Wunused-result]" (and similar for 
setgid), in a context where another id-setting call has already failed, 
and while you could assign to a dummy variable it's not clear that has any 
advantage over explicitly suppressing the warnings.  We have, for some 
architecture/compiler combinations, "res_send.c:795:41: warning: 'resplen' 
may be used uninitialized in this function [-Wmaybe-uninitialized]" (a 
case where the relevant code only gets executed if the variable is 
initialized - the first time round a loop will execute a different if 
branch and cause it to be initialized).  And that's it for warnings (on 
x86_64 / ARM / PowerPC, in my testing) for which suppression seems a 
sensible first approach.

Maybe there would be more arising from conversion of -Wno- options in 
makefiles to use of macros / pragmas - but if so, high standards simply 
imply it's hard to move from suppression for a whole file to more 
selective suppression.

> Regardless of the mechanics we settle on, I want to say some things
> about review standards for new warning suppressions.  The bar to
> adding a warning suppression should be quite high.  It should be

I think we should be perfectly willing to file a bug in Bugzilla and then 
add a suppression referencing the bug, if there's something tricky about 
determining if there's a real glibc bug there or a better way to address 
the warning - that means the build stays working on more platforms (in the 
presence of -Werror).  I see no reason why we should assume a possible 
problem in existing code shown up by a warning (from new GCC or on another 
architecture) is more serious than other bugs.  -Werror is first about 
stopping new issues getting in accidentally, rather than forcing certain 
existing issues (those that cause warnings) to be higher priority than 
other existing issues.

(For example, on PowerPC you get

../sysdeps/ieee754/ldbl-128/e_lgammal_r.c:75:1: warning: floating constant 
exceeds range of 'long double' [-Woverflow]
../sysdeps/ieee754/ldbl-128/e_lgammal_r.c:78:1: warning: floating constant 
exceeds range of 'long double' [-Woverflow]

and while the right long-term fix is certainly *not* a warning 
suppression, it would seem reasonable as an interim fix for build breakage 
for an indefinite amount of time, with the comment referring to bug 
16347.)
  
Paul Eggert Nov. 25, 2014, 10:35 p.m. UTC | #4
On 11/24/2014 05:29 PM, Joseph Myers wrote:
> We have the nscd/connections.c "ignoring return value of 'setuid', declared
> with attribute warn_unused_result [-Wunused-result]" (and similar for
> setgid), in a context where another id-setting call has already failed

Yes, we've run into that several times in GNU applications and we came 
up with a better way to fix it, in the Gnulib ignore-value module, which 
defines this:

# define ignore_value(x) \
     (__extension__ ({ __typeof__ (x) __x = (x); (void) __x; }))

One can then write "ignore_value (setuid (server_uid))" instead of 
"setuid (server_uid)".  So we shouldn't need a pragma for this one.

> We have, for some
> architecture/compiler combinations, "res_send.c:795:41: warning: 'resplen'
> may be used uninitialized in this function [-Wmaybe-uninitialized]" (a
> case where the relevant code only gets executed if the variable is
> initialized - the first time round a loop will execute a different if
> branch and cause it to be initialized).

Neither you nor Roland liked the IF_LINT approach that Gnulib uses for 
this sort of thing, but I hope we can figure out something else to 
satisfy glibc's constraints.  How about something like this?

#define ASSUME_MEM_INITIALIZED(v) asm ("" : "=m" (v))

and then write "ASSUME_MEM_INITIALIZED (resplen)" after declaring resplen.

> And that's it for warnings (on
> x86_64 / ARM / PowerPC, in my testing) for which suppression seems a
> sensible first approach.

In that case, it's not unreasonable to hope that we can avoid the 
pragmas entirely.
  
Joseph Myers Nov. 25, 2014, 10:42 p.m. UTC | #5
On Tue, 25 Nov 2014, Paul Eggert wrote:

> On 11/24/2014 05:29 PM, Joseph Myers wrote:
> > We have the nscd/connections.c "ignoring return value of 'setuid', declared
> > with attribute warn_unused_result [-Wunused-result]" (and similar for
> > setgid), in a context where another id-setting call has already failed
> 
> Yes, we've run into that several times in GNU applications and we came up with
> a better way to fix it, in the Gnulib ignore-value module, which defines this:
> 
> # define ignore_value(x) \
>     (__extension__ ({ __typeof__ (x) __x = (x); (void) __x; }))
> 
> One can then write "ignore_value (setuid (server_uid))" instead of "setuid
> (server_uid)".  So we shouldn't need a pragma for this one.

Sure, that can be done if preferred to using pragmas to make it clear this 
is exactly about disabling a diagnostic.

> > We have, for some
> > architecture/compiler combinations, "res_send.c:795:41: warning: 'resplen'
> > may be used uninitialized in this function [-Wmaybe-uninitialized]" (a
> > case where the relevant code only gets executed if the variable is
> > initialized - the first time round a loop will execute a different if
> > branch and cause it to be initialized).
> 
> Neither you nor Roland liked the IF_LINT approach that Gnulib uses for this
> sort of thing, but I hope we can figure out something else to satisfy glibc's
> constraints.  How about something like this?
> 
> #define ASSUME_MEM_INITIALIZED(v) asm ("" : "=m" (v))
> 
> and then write "ASSUME_MEM_INITIALIZED (resplen)" after declaring resplen.

It's only in a particular bit of code where we want to assume it's 
initialized.
  
Roland McGrath Nov. 25, 2014, 10:42 p.m. UTC | #6
> Neither you nor Roland liked the IF_LINT approach that Gnulib uses for 
> this sort of thing, but I hope we can figure out something else to 
> satisfy glibc's constraints.  How about something like this?
> 
> #define ASSUME_MEM_INITIALIZED(v) asm ("" : "=m" (v))
> 
> and then write "ASSUME_MEM_INITIALIZED (resplen)" after declaring resplen.

With "=g" as the constraint it probably won't affect code generation at all.
So that could be OK.
  
Paul Eggert Nov. 26, 2014, 12:23 a.m. UTC | #7
Joseph Myers wrote:
> It's only in a particular bit of code where we want to assume it's
> initialized.

We could use ASSUME_INITIALIZED only in the particular bit of code where it's 
needed, as it doesn't have to be used immediately after a declaration.
  
Joseph Myers Nov. 26, 2014, 12:36 a.m. UTC | #8
On Tue, 25 Nov 2014, Paul Eggert wrote:

> Joseph Myers wrote:
> > It's only in a particular bit of code where we want to assume it's
> > initialized.
> 
> We could use ASSUME_INITIALIZED only in the particular bit of code where it's
> needed, as it doesn't have to be used immediately after a declaration.

But won't the proposed asm, with its output operand for resplen, imply 
that any previous value in resplen is dead - which is not correct if you 
use it only there (where the point is that in fact there is a value there, 
which is not dead, but the compiler can't see there must be a value there) 
rather than before any possible initialization?
  
Paul Eggert Nov. 26, 2014, 12:42 a.m. UTC | #9
Joseph Myers wrote:
> On Tue, 25 Nov 2014, Paul Eggert wrote:
>
>> >Joseph Myers wrote:
>>> > >It's only in a particular bit of code where we want to assume it's
>>> > >initialized.
>> >
>> >We could use ASSUME_INITIALIZED only in the particular bit of code where it's
>> >needed, as it doesn't have to be used immediately after a declaration.
> But won't the proposed asm, with its output operand for resplen, imply
> that any previous value in resplen is dead

Ah, true.  Still, if we use it just after resplen's declaration, we've localized 
the assumption well -- better than we would have done with the pragmas, and 
that's good enough.
  

Patch

From 7c16bdc58c74e3515d8d71e5e006005566e5a32a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 21 Nov 2014 20:57:58 -0800
Subject: [PATCH] fnmatch: work around GCC compiler warning bug with uninit var

* posix/fnmatch_loop.c (FCT): Use a scalar not a one-item array.
This works around a bug with x86-64 GCC 4.9.2 and earlier
where 'gcc -O2 -Wmaybe-uninitialized' incorrectly complains
"../locale/weightwc.h:93:7: warning: '*((void *)&str+4)' may be
used uninitialized in this function [-Wmaybe-uninitialized]".
---
 ChangeLog            | 9 +++++++++
 posix/fnmatch_loop.c | 8 ++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index c75dab7..3c7b9e4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@ 
+2014-11-21  Paul Eggert  <eggert@cs.ucla.edu>
+
+	fnmatch: work around GCC compiler warning bug with uninit var
+	* posix/fnmatch_loop.c (FCT): Use a scalar not a one-item array.
+	This works around a bug with x86-64 GCC 4.9.2 and earlier
+	where 'gcc -O2 -Wmaybe-uninitialized' incorrectly complains
+	"../locale/weightwc.h:93:7: warning: '*((void *)&str+4)' may be
+	used uninitialized in this function [-Wmaybe-uninitialized]".
+
 2014-11-21  Roland McGrath  <roland@hack.frob.com>
 
 	* nptl/pthread_create.c (__pthread_create_2_1): Set
diff --git a/posix/fnmatch_loop.c b/posix/fnmatch_loop.c
index db6d9d7..c1d8b69 100644
--- a/posix/fnmatch_loop.c
+++ b/posix/fnmatch_loop.c
@@ -343,7 +343,7 @@  FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used)
 #ifdef _LIBC
 		else if (c == L('[') && *p == L('='))
 		  {
-		    UCHAR str[1];
+		    UCHAR str;
 		    uint32_t nrules =
 		      _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES);
 		    const CHAR *startp = p;
@@ -355,7 +355,7 @@  FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used)
 			c = L('[');
 			goto normal_bracket;
 		      }
-		    str[0] = c;
+		    str = c;
 
 		    c = *++p;
 		    if (c != L('=') || p[1] != L(']'))
@@ -368,7 +368,7 @@  FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used)
 
 		    if (nrules == 0)
 		      {
-			if ((UCHAR) *n == str[0])
+			if ((UCHAR) *n == str)
 			  goto matched;
 		      }
 		    else
@@ -383,7 +383,7 @@  FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used)
 # endif
 			const int32_t *indirect;
 			int32_t idx;
-			const UCHAR *cp = (const UCHAR *) str;
+			const UCHAR *cp = (const UCHAR *) &str;
 
 # if WIDE_CHAR_VERSION
 			table = (const int32_t *)
-- 
1.9.3