diff mbox

[BZ,17979,BZ,17721] Fix issues with sys/cdefs.h and uchar.h when using non-gcc compiler.

Message ID 27c31890079f41775175b94a4abedb0c.squirrel@server316.webhostingpad.com
State New, archived
Headers show

Commit Message

Dwight Guth Jan. 28, 2016, 8:20 p.m. UTC
If linking against Glibc with a compiler for which the __GNUC__ macro is not
defined, problems arise when including header files that use the __restrict
or __inline keyword and when including uchar.h.

Glibc strips __restrict from the prototypes of C library functions in this
case. This is incorrect if the compiler is a C99-compliant compiler, because
C99 includes the restrict keyword and uses it in the declaration of a number
of functions in the C library. This leads to undefined behavior because
the definitions of those functions were defined with the restrict keyword,
which makes their type signatures incompatible with their declaration,
a violation of C99 sec. 6.2.7 paragraph 2. The same thing occurs with the
__inline keyword, which, while not undefined behavior per-se, seems
undesirable
in cases where the compiler is C99-compliant and therefore includes the
inline keyword. Here we except the case where the compiler declares itself
to be C99-compliant from these checks in order to allow better C99 compliance
for compilers other than gcc which link against Glibc.

Glibc defines char16_t and char32_t in uchar.h as __CHAR16_TYPE__ and
__CHAR32_TYPE__ when the __GNUC__ macro is defined, but when linking against
Glibc with a different compiler, these types are not defined at all,
which is a violation of C11 sec. 7.28 paragraph 2, as well as a syntax
error because these types are used in the prototypes of functions declared
later in the file. According to this section of the standard, these types
must be defined in this header file and must be the same type as
uint_least16_t and uint_least32_t, which are defined in stdint.h as
"unsigned short int" and "unsigned int" respectively. Here we modify the
header so that if __GNUC__ is not defined, we still provide these typedefs,
but we default them manually to the same type as uint_least16_t and
uint_least32_t if __CHAR16_TYPE__ and __CHAR32_TYPE__ are not defined by
the compiler.

-----

I had trouble testing this patch because I ran into unrelated errors in the
test suite. If someone could help me figure out how to set up a test
environment that is likely to pass all the tests, I can try again, but I
don't
have the resources to struggle with all the errors that arise without knowing
their solutions. The patch should not affect any version of GCC, however.

-----

2016-01-28  Dwight Guth  <dwight.guth@runtimeverification.com>

	[BZ 17979]
	* wcsmbs/uchar.h (char16_t, char32_t): Define types if __GNUC__,
	__CHAR16_TYPE__, or __CHAR32_TYPE__ are not defined.

	[BZ 17721]
	* misc/sys/cdefs.h (__restrict, __inline): Define as keywords if
	__GNUC__ is not defined but __STDC_VERSION__ is at least C99.

-----

Comments

Joseph Myers Jan. 28, 2016, 9:19 p.m. UTC | #1
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Glibc strips __restrict from the prototypes of C library functions in this
> case. This is incorrect if the compiler is a C99-compliant compiler, because
> C99 includes the restrict keyword and uses it in the declaration of a number
> of functions in the C library. This leads to undefined behavior because
> the definitions of those functions were defined with the restrict keyword,
> which makes their type signatures incompatible with their declaration,
> a violation of C99 sec. 6.2.7 paragraph 2. The same thing occurs with the

This affects hardly any functions, if any at all, because qualifiers on 
function argument types are ignored for the purposes of compatibility 
(6.7.5.3#15).  For a function to be affected it would need a restrict 
qualifier that isn't directly on the argument type itself (type *restrict 
*param, for example).
Dwight Guth Jan. 28, 2016, 10:20 p.m. UTC | #2
Thank you for drawing that to my attention. You are correct that this
means that my original issue that caused me to create this part of the
patch is in fact well defined. However, this still seems like an issue
of correctness to me. I can't seem to find any explicit paragraph in
the standard saying that, e.g. fprintf must be declared with the type:

int fprintf(FILE * restrict stream, const char * restrict format, ...);

but I assume that that is implied, otherwise we could give any of the
functions in the library any signature as long as they broadly
followed the requirements of their corresponding subclause, which
seems wrong to me.

On Thu, Jan 28, 2016 at 3:19 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
>
>> Glibc strips __restrict from the prototypes of C library functions in this
>> case. This is incorrect if the compiler is a C99-compliant compiler, because
>> C99 includes the restrict keyword and uses it in the declaration of a number
>> of functions in the C library. This leads to undefined behavior because
>> the definitions of those functions were defined with the restrict keyword,
>> which makes their type signatures incompatible with their declaration,
>> a violation of C99 sec. 6.2.7 paragraph 2. The same thing occurs with the
>
> This affects hardly any functions, if any at all, because qualifiers on
> function argument types are ignored for the purposes of compatibility
> (6.7.5.3#15).  For a function to be affected it would need a restrict
> qualifier that isn't directly on the argument type itself (type *restrict
> *param, for example).
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers Jan. 28, 2016, 10:48 p.m. UTC | #3
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Thank you for drawing that to my attention. You are correct that this
> means that my original issue that caused me to create this part of the
> patch is in fact well defined. However, this still seems like an issue
> of correctness to me. I can't seem to find any explicit paragraph in
> the standard saying that, e.g. fprintf must be declared with the type:
> 
> int fprintf(FILE * restrict stream, const char * restrict format, ...);
> 
> but I assume that that is implied, otherwise we could give any of the
> functions in the library any signature as long as they broadly
> followed the requirements of their corresponding subclause, which
> seems wrong to me.

It's impossible for a valid program to distinguish what qualifiers were 
used for parameters in a function declaration in a header, thus it's not 
even a meaningful question.
Dwight Guth Jan. 28, 2016, 10:50 p.m. UTC | #4
Okay but if so, then why put the __restrict in the header files at all
if it doesn't really matter? And why put it there only if the compiler
is gcc?

On Thu, Jan 28, 2016 at 4:48 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
>
>> Thank you for drawing that to my attention. You are correct that this
>> means that my original issue that caused me to create this part of the
>> patch is in fact well defined. However, this still seems like an issue
>> of correctness to me. I can't seem to find any explicit paragraph in
>> the standard saying that, e.g. fprintf must be declared with the type:
>>
>> int fprintf(FILE * restrict stream, const char * restrict format, ...);
>>
>> but I assume that that is implied, otherwise we could give any of the
>> functions in the library any signature as long as they broadly
>> followed the requirements of their corresponding subclause, which
>> seems wrong to me.
>
> It's impossible for a valid program to distinguish what qualifiers were
> used for parameters in a function declaration in a header, thus it's not
> even a meaningful question.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers Jan. 28, 2016, 10:52 p.m. UTC | #5
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Okay but if so, then why put the __restrict in the header files at all
> if it doesn't really matter? And why put it there only if the compiler
> is gcc?

Effectively it serves as documentation of intent for people reading the 
headers (much like the argument names with __ prefixes).
Mike Frysinger Jan. 28, 2016, 10:58 p.m. UTC | #6
On 28 Jan 2016 22:52, Joseph Myers wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
> > Okay but if so, then why put the __restrict in the header files at all
> > if it doesn't really matter? And why put it there only if the compiler
> > is gcc?
> 
> Effectively it serves as documentation of intent for people reading the 
> headers (much like the argument names with __ prefixes).

wouldn't it also assist automated tools like linters/static analyzers ?
-mike
Dwight Guth Jan. 28, 2016, 11:08 p.m. UTC | #7
Yes, actually. A dynamic analysis tool (like the one I am developing)
that uses the GCC preprocessor on this header file to try to decide
the types of functions in order to decide whether they are
restrict-qualified and therefore subject to the restrictions of
restrict-qualified pointers would be highly interested in this
information.

It also seems strange to me to be declaring functions in header files
with different types than are mandated by the standard just because it
won't matter in most cases... It also sounds to me like a highly
subtle bug waiting to happen. Can you guarantee that in the entire C
standard, there isn't anywhere in it where having the wrong qualifiers
on a function declaration might have some kind of impact on which
programs are well-defined or not, or on the behavior of a program? Can
you guarantee that no such cases will continue to exist in all future
revisions of the C standard? What happens when someone uses __restrict
someday in a situation where it does matter (e.g. if the next version
of C introduces a "char * restrict *" parameter somewhere), and
doesn't know anything about this discussion?

On Thu, Jan 28, 2016 at 4:58 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On 28 Jan 2016 22:52, Joseph Myers wrote:
>> On Thu, 28 Jan 2016, Dwight Guth wrote:
>> > Okay but if so, then why put the __restrict in the header files at all
>> > if it doesn't really matter? And why put it there only if the compiler
>> > is gcc?
>>
>> Effectively it serves as documentation of intent for people reading the
>> headers (much like the argument names with __ prefixes).
>
> wouldn't it also assist automated tools like linters/static analyzers ?
> -mike
Joseph Myers Jan. 28, 2016, 11:16 p.m. UTC | #8
On Thu, 28 Jan 2016, Mike Frysinger wrote:

> On 28 Jan 2016 22:52, Joseph Myers wrote:
> > On Thu, 28 Jan 2016, Dwight Guth wrote:
> > > Okay but if so, then why put the __restrict in the header files at all
> > > if it doesn't really matter? And why put it there only if the compiler
> > > is gcc?
> > 
> > Effectively it serves as documentation of intent for people reading the 
> > headers (much like the argument names with __ prefixes).
> 
> wouldn't it also assist automated tools like linters/static analyzers ?

If they hardcode information about particular functions, the qualifiers in 
the headers are irrelevant.  If not, even having restrict qualifiers on 
the parameters in the function definitions is only useful when you look at 
the body of the definitions as well, unless you apply heuristics beyond 
what is supported by the standard.

Remember that if a function has two restricted pointer arguments (that are 
restricted in the definition), this does *not* mean that they don't alias 
- only that *if* a particular execution of the function modifies some 
elements pointed to by one of the pointers, those elements are not also 
accessed other than through that pointer.  (But it's completely valid to 
have two restricted pointers to the same array, one only used to access / 
modify odd-numbered elements of that array, and the other one only used to 
access / modify even-numbered elements of that array.  Now, static 
analyzers might reasonably consider that dubious usage that should be 
diagnosed.)
Joseph Myers Jan. 28, 2016, 11:20 p.m. UTC | #9
On Thu, 28 Jan 2016, Dwight Guth wrote:

> It also seems strange to me to be declaring functions in header files
> with different types than are mandated by the standard just because it
> won't matter in most cases... It also sounds to me like a highly

I didn't say that's the reason, simply pointed out that there is no 
conformance bug from missing the restrict qualifiers - the reason is more 
likely that no-one in practice tends to use the glibc headers with 
compilers not defining __GNUC__ (they use non-GNU compilers, but non-GNU 
compilers that define __GNUC__ to indicate support for GNU extensions) and 
so there has been very little interest in fixing hypothetical issues with 
such compilers (and where people have submitted such patches, they've 
drifted away without resolving issues from review and pinging as needed).  
Combined with: __restrict in the headers may well date back to before C99 
was released and so before the final __STDC_VERSION__ value was known.
Dwight Guth Jan. 28, 2016, 11:28 p.m. UTC | #10
Yes, that's reasonable. What would you see as the correct resolution
to the issue?

On Thu, Jan 28, 2016 at 5:20 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
>
>> It also seems strange to me to be declaring functions in header files
>> with different types than are mandated by the standard just because it
>> won't matter in most cases... It also sounds to me like a highly
>
> I didn't say that's the reason, simply pointed out that there is no
> conformance bug from missing the restrict qualifiers - the reason is more
> likely that no-one in practice tends to use the glibc headers with
> compilers not defining __GNUC__ (they use non-GNU compilers, but non-GNU
> compilers that define __GNUC__ to indicate support for GNU extensions) and
> so there has been very little interest in fixing hypothetical issues with
> such compilers (and where people have submitted such patches, they've
> drifted away without resolving issues from review and pinging as needed).
> Combined with: __restrict in the headers may well date back to before C99
> was released and so before the final __STDC_VERSION__ value was known.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers Jan. 28, 2016, 11:33 p.m. UTC | #11
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Yes, that's reasonable. What would you see as the correct resolution
> to the issue?

I think defining to restrict based on __STDC_VERSION__ is reasonable 
(outside of the release freeze period).

uchar.h is trickier in that stdint.h is theoretically system-specific 
(it's in sysdeps), so there's an abstraction violation in copying its 
types into uchar.h (not in sysdeps).  I'd be happy for stdint.h and 
inttypes.h to move out of sysdeps, but any hardcoding should probably 
still be accompanied at least by a comment in stdint.h about where else 
the type information is embedded.
Mike Frysinger Jan. 28, 2016, 11:43 p.m. UTC | #12
On 28 Jan 2016 23:16, Joseph Myers wrote:
> On Thu, 28 Jan 2016, Mike Frysinger wrote:
> > On 28 Jan 2016 22:52, Joseph Myers wrote:
> > > On Thu, 28 Jan 2016, Dwight Guth wrote:
> > > > Okay but if so, then why put the __restrict in the header files at all
> > > > if it doesn't really matter? And why put it there only if the compiler
> > > > is gcc?
> > > 
> > > Effectively it serves as documentation of intent for people reading the 
> > > headers (much like the argument names with __ prefixes).
> > 
> > wouldn't it also assist automated tools like linters/static analyzers ?
> 
> If they hardcode information about particular functions, the qualifiers in 
> the headers are irrelevant.  If not, even having restrict qualifiers on 
> the parameters in the function definitions is only useful when you look at 
> the body of the definitions as well, unless you apply heuristics beyond 
> what is supported by the standard.
> 
> Remember that if a function has two restricted pointer arguments (that are 
> restricted in the definition), this does *not* mean that they don't alias 
> - only that *if* a particular execution of the function modifies some 
> elements pointed to by one of the pointers, those elements are not also 
> accessed other than through that pointer.  (But it's completely valid to 
> have two restricted pointers to the same array, one only used to access / 
> modify odd-numbered elements of that array, and the other one only used to 
> access / modify even-numbered elements of that array.  Now, static 
> analyzers might reasonably consider that dubious usage that should be 
> diagnosed.)

i think having analyzers warn about that by default is a sane position.
i would expect that such usage is more commonly an unintended mistake
rather than the function actually has such esoteric behavior.  i don't
think any of the glibc functions are written in this manner and expect
the pointers to be distinct memory locations.

i don't see an issue with allowing the headers to use keywords that are
supposed to exist according to the standard, even if we don't know the
exact compiler that is in use (i.e. non-gcc).
-mike
Dwight Guth Jan. 28, 2016, 11:47 p.m. UTC | #13
Would you see it as better to move stdint.h out of sysdeps, or to do
something similar to what's done with mbstate_t, NULL, size_t, etc by
defining a __need_uint_least16_t and __need_uint_least32_t and having
stdint.h define only a __uint_least16_t and __uint_least32_t type and
then exit, allowing those types to be used across an abstraction
boundary by uchar.h?

On Thu, Jan 28, 2016 at 5:33 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
>
>> Yes, that's reasonable. What would you see as the correct resolution
>> to the issue?
>
> I think defining to restrict based on __STDC_VERSION__ is reasonable
> (outside of the release freeze period).
>
> uchar.h is trickier in that stdint.h is theoretically system-specific
> (it's in sysdeps), so there's an abstraction violation in copying its
> types into uchar.h (not in sysdeps).  I'd be happy for stdint.h and
> inttypes.h to move out of sysdeps, but any hardcoding should probably
> still be accompanied at least by a comment in stdint.h about where else
> the type information is embedded.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers Jan. 28, 2016, 11:51 p.m. UTC | #14
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Would you see it as better to move stdint.h out of sysdeps, or to do
> something similar to what's done with mbstate_t, NULL, size_t, etc by
> defining a __need_uint_least16_t and __need_uint_least32_t and having
> stdint.h define only a __uint_least16_t and __uint_least32_t type and
> then exit, allowing those types to be used across an abstraction
> boundary by uchar.h?

I don't think we want to add more __need_* cases; if we split things out, 
it should be by adding a bits/stdint.h header that defines __*_t (and 
moving stdint.h out of sysdeps would still make sense).  Cf 
<https://sourceware.org/ml/libc-alpha/2012-08/msg00510.html>.
Dwight Guth Jan. 29, 2016, 12:06 a.m. UTC | #15
Alright, fair enough. So the solution you want to see is to move
stdint.h out of sysdeps, move the definition of uint_least16_t and
uint_least32_t to a definition of __*_t in bits/stdint.h, and then
define uint_least16_t, uint_least32_t, char16_t, and char32_t on the
basis of the __*_t types? Or is this overkill given that we are
assuming that stdint.h doesn't need to be system-specific? Which would
you prefer?

On Thu, Jan 28, 2016 at 5:51 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Thu, 28 Jan 2016, Dwight Guth wrote:
>
>> Would you see it as better to move stdint.h out of sysdeps, or to do
>> something similar to what's done with mbstate_t, NULL, size_t, etc by
>> defining a __need_uint_least16_t and __need_uint_least32_t and having
>> stdint.h define only a __uint_least16_t and __uint_least32_t type and
>> then exit, allowing those types to be used across an abstraction
>> boundary by uchar.h?
>
> I don't think we want to add more __need_* cases; if we split things out,
> it should be by adding a bits/stdint.h header that defines __*_t (and
> moving stdint.h out of sysdeps would still make sense).  Cf
> <https://sourceware.org/ml/libc-alpha/2012-08/msg00510.html>.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Alexander Cherepanov Jan. 29, 2016, 12:48 a.m. UTC | #16
On 2016-01-29 02:43, Mike Frysinger wrote:
> On 28 Jan 2016 23:16, Joseph Myers wrote:
>> On Thu, 28 Jan 2016, Mike Frysinger wrote:
>>> On 28 Jan 2016 22:52, Joseph Myers wrote:
>>>> On Thu, 28 Jan 2016, Dwight Guth wrote:
>>>>> Okay but if so, then why put the __restrict in the header files at all
>>>>> if it doesn't really matter? And why put it there only if the compiler
>>>>> is gcc?
>>>>
>>>> Effectively it serves as documentation of intent for people reading the
>>>> headers (much like the argument names with __ prefixes).
>>>
>>> wouldn't it also assist automated tools like linters/static analyzers ?
>>
>> If they hardcode information about particular functions, the qualifiers in
>> the headers are irrelevant.  If not, even having restrict qualifiers on
>> the parameters in the function definitions is only useful when you look at
>> the body of the definitions as well, unless you apply heuristics beyond
>> what is supported by the standard.

I would add that for an analyzer to depend on a specific implementation 
is a bit risky. The meaning of the restrict qualifier on the parameters 
of a libc function depends on the description of the function in the 
standard, not on any implementation.

>> Remember that if a function has two restricted pointer arguments (that are
>> restricted in the definition), this does *not* mean that they don't alias
>> - only that *if* a particular execution of the function modifies some
>> elements pointed to by one of the pointers, those elements are not also
>> accessed other than through that pointer.  (But it's completely valid to
>> have two restricted pointers to the same array, one only used to access /
>> modify odd-numbered elements of that array, and the other one only used to
>> access / modify even-numbered elements of that array.  Now, static
>> analyzers might reasonably consider that dubious usage that should be
>> diagnosed.)
>
> i think having analyzers warn about that by default is a sane position.
> i would expect that such usage is more commonly an unintended mistake
> rather than the function actually has such esoteric behavior.  i don't
> think any of the glibc functions are written in this manner and expect
> the pointers to be distinct memory locations.

What about strncat? I think this:

   char s[10] = "abc";
   strncat(s, s, 2);

is fine according to C11.
Mike Frysinger Jan. 29, 2016, 12:58 a.m. UTC | #17
On 29 Jan 2016 03:48, Alexander Cherepanov wrote:
> On 2016-01-29 02:43, Mike Frysinger wrote:
> > On 28 Jan 2016 23:16, Joseph Myers wrote:
> >> On Thu, 28 Jan 2016, Mike Frysinger wrote:
> >>> On 28 Jan 2016 22:52, Joseph Myers wrote:
> >>>> On Thu, 28 Jan 2016, Dwight Guth wrote:
> >>>>> Okay but if so, then why put the __restrict in the header files at all
> >>>>> if it doesn't really matter? And why put it there only if the compiler
> >>>>> is gcc?
> >>>>
> >>>> Effectively it serves as documentation of intent for people reading the
> >>>> headers (much like the argument names with __ prefixes).
> >>>
> >>> wouldn't it also assist automated tools like linters/static analyzers ?
> >>
> >> If they hardcode information about particular functions, the qualifiers in
> >> the headers are irrelevant.  If not, even having restrict qualifiers on
> >> the parameters in the function definitions is only useful when you look at
> >> the body of the definitions as well, unless you apply heuristics beyond
> >> what is supported by the standard.
> 
> I would add that for an analyzer to depend on a specific implementation 
> is a bit risky. The meaning of the restrict qualifier on the parameters 
> of a libc function depends on the description of the function in the 
> standard, not on any implementation.

not all functions glibc uses restrict on are in any standard, unless you
are lumping "the GNU standard" in there.

> >> Remember that if a function has two restricted pointer arguments (that are
> >> restricted in the definition), this does *not* mean that they don't alias
> >> - only that *if* a particular execution of the function modifies some
> >> elements pointed to by one of the pointers, those elements are not also
> >> accessed other than through that pointer.  (But it's completely valid to
> >> have two restricted pointers to the same array, one only used to access /
> >> modify odd-numbered elements of that array, and the other one only used to
> >> access / modify even-numbered elements of that array.  Now, static
> >> analyzers might reasonably consider that dubious usage that should be
> >> diagnosed.)
> >
> > i think having analyzers warn about that by default is a sane position.
> > i would expect that such usage is more commonly an unintended mistake
> > rather than the function actually has such esoteric behavior.  i don't
> > think any of the glibc functions are written in this manner and expect
> > the pointers to be distinct memory locations.
> 
> What about strncat? I think this:
> 
>    char s[10] = "abc";
>    strncat(s, s, 2);
> 
> is fine according to C11.

linters/static analyzers aren't about doing standard validation.  if i
saw that snippet, i would assume an accidental bug, or pointless code
and should be deleted.  you don't run random hacks through linters, you
run code you want to send to production through them.
-mike
Joseph Myers Jan. 29, 2016, 3:34 p.m. UTC | #18
On Thu, 28 Jan 2016, Dwight Guth wrote:

> Alright, fair enough. So the solution you want to see is to move
> stdint.h out of sysdeps, move the definition of uint_least16_t and
> uint_least32_t to a definition of __*_t in bits/stdint.h, and then
> define uint_least16_t, uint_least32_t, char16_t, and char32_t on the
> basis of the __*_t types? Or is this overkill given that we are
> assuming that stdint.h doesn't need to be system-specific? Which would
> you prefer?

That seems logically right, though it might be overkill.  The minimum is 
cross-references between the headers pointing out the other places where 
type information is embedded.
Alexander Cherepanov Jan. 30, 2016, 1:50 a.m. UTC | #19
On 2016-01-29 03:58, Mike Frysinger wrote:
>>>>> wouldn't it also assist automated tools like linters/static analyzers ?
>>>>
>>>> If they hardcode information about particular functions, the qualifiers in
>>>> the headers are irrelevant.  If not, even having restrict qualifiers on
>>>> the parameters in the function definitions is only useful when you look at
>>>> the body of the definitions as well, unless you apply heuristics beyond
>>>> what is supported by the standard.
>>
>> I would add that for an analyzer to depend on a specific implementation
>> is a bit risky. The meaning of the restrict qualifier on the parameters
>> of a libc function depends on the description of the function in the
>> standard, not on any implementation.
>
> not all functions glibc uses restrict on are in any standard, unless you
> are lumping "the GNU standard" in there.

It doesn't matter. My point is that you have to work from the 
description instead of implementation. The C standard seems to impose 
minimal restrictions so there would be no difference between the 
description and any conforming implementation in many cases. For 
contrast consider proposed strlcpy/strlcat -- the last revision of their 
description is intentionally more strict than their implementation. The 
whole point of this distinction is to enable tools to catch problematic 
uses.

>>>> Remember that if a function has two restricted pointer arguments (that are
>>>> restricted in the definition), this does *not* mean that they don't alias
>>>> - only that *if* a particular execution of the function modifies some
>>>> elements pointed to by one of the pointers, those elements are not also
>>>> accessed other than through that pointer.  (But it's completely valid to
>>>> have two restricted pointers to the same array, one only used to access /
>>>> modify odd-numbered elements of that array, and the other one only used to
>>>> access / modify even-numbered elements of that array.  Now, static
>>>> analyzers might reasonably consider that dubious usage that should be
>>>> diagnosed.)
>>>
>>> i think having analyzers warn about that by default is a sane position.
>>> i would expect that such usage is more commonly an unintended mistake
>>> rather than the function actually has such esoteric behavior.  i don't
>>> think any of the glibc functions are written in this manner and expect
>>> the pointers to be distinct memory locations.
>>
>> What about strncat? I think this:
>>
>>     char s[10] = "abc";
>>     strncat(s, s, 2);
>>
>> is fine according to C11.
>
> linters/static analyzers aren't about doing standard validation.  if i
> saw that snippet, i would assume an accidental bug, or pointless code
> and should be deleted.  you don't run random hacks through linters, you
> run code you want to send to production through them.

I guess opinions will differ.

1. Personally, I would appreciate the distinction between definite 
standard violations and doubtful/risky code.

2. The stylistic preferences differ greatly. You call it a hack, others 
will call it a clever trick (though a couple of ifs and assignments in 
between to make the example code non-pointless).

But we can consider something much more real: glibc code contains many 
cases of undefined behavior (according to C11). I consider it bugs 
unwanted in production (maybe with the exception of strlen) but it's not 
clear to me what is the POV of the project on this. E.g., are bug 
reports about UB are welcome?
Joseph Myers Jan. 31, 2016, 4:12 p.m. UTC | #20
On Sat, 30 Jan 2016, Alexander Cherepanov wrote:

> But we can consider something much more real: glibc code contains many cases
> of undefined behavior (according to C11). I consider it bugs unwanted in
> production (maybe with the exception of strlen) but it's not clear to me what
> is the POV of the project on this. E.g., are bug reports about UB are welcome?

It's an intrinsic part of implementing the C library that it involves 
doing things that are not defined in ISO C (the ISO C library cannot all 
be implemented in ISO C).

If something involves undefined behavior in GNU C, on systems with the 
properties that hold for all systems supported by glibc (e.g. 32-bit int, 
32-bit or 64-bit long and pointers), for inputs to library functions where 
the library function semantics don't involve undefined behavior, taking 
due account of both separate compilation and use of asm to reduce the 
information available to the compiler in certain places and so make things 
not undefined that would otherwise be undefined, then I'd consider bug 
reports appropriate.  Examples of this include buffer overruns and signed 
integer arithmetic overflow.  I think that applies even if there is no 
plausible optimization that could result in the signed integer arithmetic 
overflow causing problems.  Of course, patches are even better.

If something is fully defined in GNU C, but undefined, unspecified or 
implementation-defined in ISO C, I don't think bug reports are 
appropriate.  E.g., it's not a bug for glibc code to rely on signed shifts 
by a nonnegative amount that's less than the width of the type, including 
the aspects of those that are not defined by ISO C, or on the 
fully-defined nature of conversions of out-of-range integer values to 
signed integer types.  Some such cases in code shared with gnulib (or 
other external sources) may still be gnulib bugs where they aren't glibc 
bugs, since gnulib is meant to be much more portable.  And there may be 
cases where you could argue that eliminating such usages is a desirable 
*cleanup* in glibc - e.g., that the code is cleaner when it uses unsigned 
types for bitwise operations.  But in such cases there is not bug and so 
patches should be sent without any bug reports in Bugzilla (or more 
general groups of issues identified in the list of possible cleanups in 
the todo list on the wiki).

Where something depends on a property common to all systems supported by 
glibc, again it may be a useful *cleanup* to make this more explicit in 
the code, without there being any *bug* that's appropriate to report to 
Bugzilla.  E.g., if code is using int in a way that requires it to be 
exactly 32-bit, changing the code to use int32_t is a sensible cleanup; if 
it really does mean int but is hardcoding the value of INT_MAX, making it 
use INT_MAX is a sensible cleanup; if it's using long as a pointer-sized 
integer type, changing it to use intptr_t is a sensible cleanup.

Where glibc code relies on separate compilation to avoid undefined 
behavior, this is not a bug; use of LTO on glibc's own code is not 
supported.  For example, where functions with different C types but 
compatible ABIs are aliased, or a function is otherwise defined with a 
different type from that used when it is called.  Similarly, asm may be 
used to limit code movement, and that could e.g. mean that aliasing is not 
undefined behavior where it would otherwise be undefined.

Of course, if the semantics for a function say that certain inputs result 
in undefined behavior, it does not matter if C-level undefined behavior 
occurs within that function for those inputs.
diff mbox

Patch

diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index 7fd4154..af23ff7 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -69,8 +69,11 @@ 

 #else  /* Not GCC.  */

-# define __inline              /* No inline functions.  */
-
+# if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
+#  define __inline             inline
+# else
+#  define __inline             /* No inline functions.  */
+# endif
 # define __THROW
 # define __THROWNL
 # define __NTH(fct)    fct
@@ -360,7 +363,11 @@ 

 /* __restrict is known in EGCS 1.2 and above. */
 #if !__GNUC_PREREQ (2,92)
-# define __restrict    /* Ignore */
+# if !defined __GNUC__ && defined __STDC_VERSION__ && __STDC_VERSION__ >=
199901L
+#  define __restrict   restrict
+# else
+#  define __restrict   /* Ignore */
+# endif
 #endif

 /* ISO C99 also allows to declare arrays as non-overlapping.  The syntax is
diff --git a/wcsmbs/uchar.h b/wcsmbs/uchar.h
index ce92b25..1484e56 100644
--- a/wcsmbs/uchar.h
+++ b/wcsmbs/uchar.h
@@ -39,14 +39,16 @@  __END_NAMESPACE_C99
 #endif


-#if defined __GNUC__ && !defined __USE_ISOCXX11
+#if !defined __USE_ISOCXX11
 /* Define the 16-bit and 32-bit character types.  Use the information
    provided by the compiler.  */
 # if !defined __CHAR16_TYPE__ || !defined __CHAR32_TYPE__
 #  if defined __STDC_VERSION__ && __STDC_VERSION__ < 201000L
 #   error "<uchar.h> requires ISO C11 mode"
 #  else
-#   error "definitions of __CHAR16_TYPE__ and/or __CHAR32_TYPE__ missing"
+/* Same as uint_least16_t and uint_least32_t in stdint.h. */
+typedef unsigned short int __CHAR16_TYPE__;
+typedef unsigned int       __CHAR32_TYPE__;
 #  endif
 # endif
 typedef __CHAR16_TYPE__ char16_t;