add attribute nonstring

Message ID 85ee9ddd-0be5-04d3-7319-600bcf1eb8a6@cs.ucla.edu
State New, archived
Headers

Commit Message

Paul Eggert Nov. 12, 2017, 9:06 a.m. UTC
  Thanks for mentioning the problem with GCC 8. If the intent of 
__attribute__((__nonstring__)) is to consistently mark char arrays that are not 
necessarily null-terminated strings, then lots of data structures in the tzdb 
code would need to be marked with this attribute. For example, every member of 
tzfile.h's 'struct tzhead' would have the attribute, because they are all char 
arrays that might not be null-terminated.

Looking at the proposed glibc patch, it seems its intent is more modest. All 
it's trying to do is to pacify GCC 8 so that GCC doesn't complain about calls to 
strncpy and strncat. In that case, a simpler and cleaner fix for tzdb is to stop 
using strncpy and strncat. (I suspect that the main reason tzdb uses strncpy is 
that its code is so old that it predates the widespread availability of memcpy.) 
So I installed the attached patch into tzdb's zic.c and it should be part of the 
next tzdb release.

Perhaps you can this idea elsewhere in the rest of the glibc patch too. It would 
be a lot of work to consistently change glibc to use 
__attribute__((__nonstring__)) for its documented purpose. If glibc can avoid 
calling strncpy and strncat, it won't need __NONSTRING; if not, then I suggest 
changing this:

/* Describes a char array that is not necessarily a NUL-terminated
    string.  */
# define __NONSTRING __attribute__ ((__nonstring__))

to something more like this:

/* Describes a char array whose address can safely be passed as the first
    argument to strncpy and strncat, as the char array is not necessarily
    a NUL-terminated string.  */
# define __NONSTRING __attribute__ ((__nonstring__))

in sys/cdefs.h, so that the more-modest goal of the patch is clearer to the reader.

I'll CC: this to tz@iana.org as a heads-up there.
From 641f2c48b10dec5ab6dd90f560b66e061aa660ba Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Nov 2017 00:57:19 -0800
Subject: [PROPOSED] Port to GCC 8 -Wstringop-truncation

Problem reported by Martin Sebor in:
https://sourceware.org/ml/libc-alpha/2017-11/msg00336.html
* NEWS: Mention this.
* zic.c (writezone): Use memcpy, not strncpy.
---
 NEWS  | 3 +++
 zic.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)
  

Comments

Martin Sebor Nov. 12, 2017, 5:33 p.m. UTC | #1
On 11/12/2017 02:06 AM, Paul Eggert wrote:
> Thanks for mentioning the problem with GCC 8. If the intent of
> __attribute__((__nonstring__)) is to consistently mark char arrays that
> are not necessarily null-terminated strings, then lots of data
> structures in the tzdb code would need to be marked with this attribute.
> For example, every member of tzfile.h's 'struct tzhead' would have the
> attribute, because they are all char arrays that might not be
> null-terminated.
>
> Looking at the proposed glibc patch, it seems its intent is more modest.
> All it's trying to do is to pacify GCC 8 so that GCC doesn't complain
> about calls to strncpy and strncat. In that case, a simpler and cleaner
> fix for tzdb is to stop using strncpy and strncat. (I suspect that the
> main reason tzdb uses strncpy is that its code is so old that it
> predates the widespread availability of memcpy.) So I installed the
> attached patch into tzdb's zic.c and it should be part of the next tzdb
> release.

Great!

>
> Perhaps you can this idea elsewhere in the rest of the glibc patch too.
> It would be a lot of work to consistently change glibc to use
> __attribute__((__nonstring__)) for its documented purpose. If glibc can
> avoid calling strncpy and strncat, it won't need __NONSTRING;

Replacing strncpy with memcpy works well in some case (such
the TZ_MAGIC macro).  In other cases the string pointed to by
the constant argument is unknown and can be shorter than the size
of the destination so in those the replacement would need to be
more involved.  I tried to make only minimal changes to suppress
the new warning to avoid excessive churn.  I also didn't want to
remove the originally intended and correct uses of strncpy (to
completely fill a buffer).

I have another GCC enhancement that I plan to submit this week
that points out uses of these non-string arrays with common
string functions like strlen.  With it, annotating more of
the non-string arrays with the attribute to help detect their
accidental misuses might be preferable to using memcpy.  I'll
leave that decision to the Glibc folks here.

  if not,
> then I suggest changing this:
>
> /* Describes a char array that is not necessarily a NUL-terminated
>    string.  */
> # define __NONSTRING __attribute__ ((__nonstring__))
>
> to something more like this:
>
> /* Describes a char array whose address can safely be passed as the first
>    argument to strncpy and strncat, as the char array is not necessarily
>    a NUL-terminated string.  */
> # define __NONSTRING __attribute__ ((__nonstring__))
>
> in sys/cdefs.h, so that the more-modest goal of the patch is clearer to
> the reader.
>
> I'll CC: this to tz@iana.org as a heads-up there.

That would make sense to me as well  I'll wait to submit an updated
patch until this and Joseph's style question have been settled.

Thanks
Martin
  
Paul Eggert Nov. 12, 2017, 8:39 p.m. UTC | #2
[Dropping tz@iana.org as this email isn't relevant to tzdb.]

Martin Sebor wrote:
> I tried to make only minimal changes to suppress
> the new warning to avoid excessive churn.

As the main goal here is to find bugs not to pacify GCC, it's helpful to take 
these diagnostics as an opportunity to look more carefully at the strncpy calls. 
I just now looked at the other two data structures affected by the patch, and 
found some issues.

First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the members 
ut_user etc. have used strncpy format since the 1970s and people who deal with 
this format are used to it. However, this format does not appear to be 
documented in the manual. I suggest accompanying it with a patch to the part of 
manual/users.texi that talks about ut_user etc., so that the __NONSTRING issue 
is documented there.

Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears to be to 
pacify calls like the first call to strncpy in 
sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a bug in 
glibc, since the underlying Linux system call expects a null-terminated string. 
My guess is that
this particular data structure should indeed be a null-terminated string, and 
should not be marked with __NONSTRING, and that we should instead change the 
glibc code to not use strncpy here. I suppose the code should report an error 
instead of creating a non-null-terminated array, and that it need not initialize 
the rest of the array to zeros.
  
Martin Sebor Nov. 12, 2017, 11:49 p.m. UTC | #3
On 11/12/2017 01:39 PM, Paul Eggert wrote:
> [Dropping tz@iana.org as this email isn't relevant to tzdb.]
>
> Martin Sebor wrote:
>> I tried to make only minimal changes to suppress
>> the new warning to avoid excessive churn.
>
> As the main goal here is to find bugs not to pacify GCC, it's helpful to
> take these diagnostics as an opportunity to look more carefully at the
> strncpy calls. I just now looked at the other two data structures
> affected by the patch, and found some issues.
>
> First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the
> members ut_user etc. have used strncpy format since the 1970s and people
> who deal with this format are used to it. However, this format does not
> appear to be documented in the manual. I suggest accompanying it with a
> patch to the part of manual/users.texi that talks about ut_user etc., so
> that the __NONSTRING issue is documented there.
>
> Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears
> to be to pacify calls like the first call to strncpy in
> sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a
> bug in glibc, since the underlying Linux system call expects a
> null-terminated string. My guess is that
> this particular data structure should indeed be a null-terminated
> string, and should not be marked with __NONSTRING, and that we should
> instead change the glibc code to not use strncpy here. I suppose the
> code should report an error instead of creating a non-null-terminated
> array, and that it need not initialize the rest of the array to zeros.

I'm not familiar with this ioct call beyond what I've read on
the Linux netdevice man page before submitting the patch.  If
ifr_name is, in fact, supposed to be a string and I missed it
then that would actually be awesome because it would prove
the warning useful.

Martin

PS I still don't see it discussed on the Linux man page but
I did find such a requirement on an AIX 6.1 ioctl man page:
https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioctl.htm

The descriptions of the if_indextoname and if_nametoindex
functions specified by RFC 3493 also talk about the name being
a nul-terminated string so it looks to me like you are correct
and the warning has found a Glibc bug.  Yay! :)
  
Joseph Myers Nov. 13, 2017, 12:50 a.m. UTC | #4
On Sun, 12 Nov 2017, Paul Eggert wrote:

> availability of memcpy.) So I installed the attached patch into tzdb's zic.c
> and it should be part of the next tzdb release.

Please commit this zic.c fix also to glibc.  (We can do a bulk update from 
tzcode after the next tzcode release.)
  

Patch

diff --git a/NEWS b/NEWS
index 966f9b6..7881936 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,9 @@  Unreleased, experimental changes
     Diagnostics and commentary now distinguish UT from UTC more
     carefully; see theory.html for more information about UT vs UTC.
 
+    zic has been ported to GCC 8's -Wstringop-truncation option.
+    (Problem reported by Martin Sebor.)
+
 
 Release 2017c - 2017-10-20 14:49:34 -0700
 
diff --git a/zic.c b/zic.c
index 82e653d..e36301f 100644
--- a/zic.c
+++ b/zic.c
@@ -1960,7 +1960,7 @@  writezone(const char *const name, const char *const string, char version)
 		}
 #define DO(field)	fwrite(tzh.field, sizeof tzh.field, 1, fp)
 		tzh = tzh0;
-		strncpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic);
+		memcpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic);
 		tzh.tzh_version[0] = version;
 		convert(thistypecnt, tzh.tzh_ttisgmtcnt);
 		convert(thistypecnt, tzh.tzh_ttisstdcnt);