Message ID | 85ee9ddd-0be5-04d3-7319-600bcf1eb8a6@cs.ucla.edu |
---|---|
State | New, archived |
Headers |
Received: (qmail 3820 invoked by alias); 12 Nov 2017 09:06:48 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 3809 invoked by uid 89); 12 Nov 2017 09:06:47 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=goal, 8s, modest, H*Ad:D*iana.org X-HELO: zimbra.cs.ucla.edu Subject: Re: [PATCH] add attribute nonstring To: Martin Sebor <msebor@gmail.com> References: <d0432148-24d3-baaa-faf6-b00c33e45103@gmail.com> From: Paul Eggert <eggert@cs.ucla.edu> Cc: GNU C Library <libc-alpha@sourceware.org>, Time zone mailing list <tz@iana.org> Message-ID: <85ee9ddd-0be5-04d3-7319-600bcf1eb8a6@cs.ucla.edu> Date: Sun, 12 Nov 2017 01:06:43 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <d0432148-24d3-baaa-faf6-b00c33e45103@gmail.com> Content-Type: multipart/mixed; boundary="------------65693D68C6856841DBB3DFFC" |
Commit Message
Paul Eggert
Nov. 12, 2017, 9:06 a.m. UTC
Thanks for mentioning the problem with GCC 8. If the intent of
__attribute__((__nonstring__)) is to consistently mark char arrays that are not
necessarily null-terminated strings, then lots of data structures in the tzdb
code would need to be marked with this attribute. For example, every member of
tzfile.h's 'struct tzhead' would have the attribute, because they are all char
arrays that might not be null-terminated.
Looking at the proposed glibc patch, it seems its intent is more modest. All
it's trying to do is to pacify GCC 8 so that GCC doesn't complain about calls to
strncpy and strncat. In that case, a simpler and cleaner fix for tzdb is to stop
using strncpy and strncat. (I suspect that the main reason tzdb uses strncpy is
that its code is so old that it predates the widespread availability of memcpy.)
So I installed the attached patch into tzdb's zic.c and it should be part of the
next tzdb release.
Perhaps you can this idea elsewhere in the rest of the glibc patch too. It would
be a lot of work to consistently change glibc to use
__attribute__((__nonstring__)) for its documented purpose. If glibc can avoid
calling strncpy and strncat, it won't need __NONSTRING; if not, then I suggest
changing this:
/* Describes a char array that is not necessarily a NUL-terminated
string. */
# define __NONSTRING __attribute__ ((__nonstring__))
to something more like this:
/* Describes a char array whose address can safely be passed as the first
argument to strncpy and strncat, as the char array is not necessarily
a NUL-terminated string. */
# define __NONSTRING __attribute__ ((__nonstring__))
in sys/cdefs.h, so that the more-modest goal of the patch is clearer to the reader.
I'll CC: this to tz@iana.org as a heads-up there.
From 641f2c48b10dec5ab6dd90f560b66e061aa660ba Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Nov 2017 00:57:19 -0800
Subject: [PROPOSED] Port to GCC 8 -Wstringop-truncation
Problem reported by Martin Sebor in:
https://sourceware.org/ml/libc-alpha/2017-11/msg00336.html
* NEWS: Mention this.
* zic.c (writezone): Use memcpy, not strncpy.
---
NEWS | 3 +++
zic.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
Comments
On 11/12/2017 02:06 AM, Paul Eggert wrote: > Thanks for mentioning the problem with GCC 8. If the intent of > __attribute__((__nonstring__)) is to consistently mark char arrays that > are not necessarily null-terminated strings, then lots of data > structures in the tzdb code would need to be marked with this attribute. > For example, every member of tzfile.h's 'struct tzhead' would have the > attribute, because they are all char arrays that might not be > null-terminated. > > Looking at the proposed glibc patch, it seems its intent is more modest. > All it's trying to do is to pacify GCC 8 so that GCC doesn't complain > about calls to strncpy and strncat. In that case, a simpler and cleaner > fix for tzdb is to stop using strncpy and strncat. (I suspect that the > main reason tzdb uses strncpy is that its code is so old that it > predates the widespread availability of memcpy.) So I installed the > attached patch into tzdb's zic.c and it should be part of the next tzdb > release. Great! > > Perhaps you can this idea elsewhere in the rest of the glibc patch too. > It would be a lot of work to consistently change glibc to use > __attribute__((__nonstring__)) for its documented purpose. If glibc can > avoid calling strncpy and strncat, it won't need __NONSTRING; Replacing strncpy with memcpy works well in some case (such the TZ_MAGIC macro). In other cases the string pointed to by the constant argument is unknown and can be shorter than the size of the destination so in those the replacement would need to be more involved. I tried to make only minimal changes to suppress the new warning to avoid excessive churn. I also didn't want to remove the originally intended and correct uses of strncpy (to completely fill a buffer). I have another GCC enhancement that I plan to submit this week that points out uses of these non-string arrays with common string functions like strlen. With it, annotating more of the non-string arrays with the attribute to help detect their accidental misuses might be preferable to using memcpy. I'll leave that decision to the Glibc folks here. if not, > then I suggest changing this: > > /* Describes a char array that is not necessarily a NUL-terminated > string. */ > # define __NONSTRING __attribute__ ((__nonstring__)) > > to something more like this: > > /* Describes a char array whose address can safely be passed as the first > argument to strncpy and strncat, as the char array is not necessarily > a NUL-terminated string. */ > # define __NONSTRING __attribute__ ((__nonstring__)) > > in sys/cdefs.h, so that the more-modest goal of the patch is clearer to > the reader. > > I'll CC: this to tz@iana.org as a heads-up there. That would make sense to me as well I'll wait to submit an updated patch until this and Joseph's style question have been settled. Thanks Martin
[Dropping tz@iana.org as this email isn't relevant to tzdb.] Martin Sebor wrote: > I tried to make only minimal changes to suppress > the new warning to avoid excessive churn. As the main goal here is to find bugs not to pacify GCC, it's helpful to take these diagnostics as an opportunity to look more carefully at the strncpy calls. I just now looked at the other two data structures affected by the patch, and found some issues. First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the members ut_user etc. have used strncpy format since the 1970s and people who deal with this format are used to it. However, this format does not appear to be documented in the manual. I suggest accompanying it with a patch to the part of manual/users.texi that talks about ut_user etc., so that the __NONSTRING issue is documented there. Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears to be to pacify calls like the first call to strncpy in sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a bug in glibc, since the underlying Linux system call expects a null-terminated string. My guess is that this particular data structure should indeed be a null-terminated string, and should not be marked with __NONSTRING, and that we should instead change the glibc code to not use strncpy here. I suppose the code should report an error instead of creating a non-null-terminated array, and that it need not initialize the rest of the array to zeros.
On 11/12/2017 01:39 PM, Paul Eggert wrote: > [Dropping tz@iana.org as this email isn't relevant to tzdb.] > > Martin Sebor wrote: >> I tried to make only minimal changes to suppress >> the new warning to avoid excessive churn. > > As the main goal here is to find bugs not to pacify GCC, it's helpful to > take these diagnostics as an opportunity to look more carefully at the > strncpy calls. I just now looked at the other two data structures > affected by the patch, and found some issues. > > First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the > members ut_user etc. have used strncpy format since the 1970s and people > who deal with this format are used to it. However, this format does not > appear to be documented in the manual. I suggest accompanying it with a > patch to the part of manual/users.texi that talks about ut_user etc., so > that the __NONSTRING issue is documented there. > > Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears > to be to pacify calls like the first call to strncpy in > sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a > bug in glibc, since the underlying Linux system call expects a > null-terminated string. My guess is that > this particular data structure should indeed be a null-terminated > string, and should not be marked with __NONSTRING, and that we should > instead change the glibc code to not use strncpy here. I suppose the > code should report an error instead of creating a non-null-terminated > array, and that it need not initialize the rest of the array to zeros. I'm not familiar with this ioct call beyond what I've read on the Linux netdevice man page before submitting the patch. If ifr_name is, in fact, supposed to be a string and I missed it then that would actually be awesome because it would prove the warning useful. Martin PS I still don't see it discussed on the Linux man page but I did find such a requirement on an AIX 6.1 ioctl man page: https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioctl.htm The descriptions of the if_indextoname and if_nametoindex functions specified by RFC 3493 also talk about the name being a nul-terminated string so it looks to me like you are correct and the warning has found a Glibc bug. Yay! :)
On Sun, 12 Nov 2017, Paul Eggert wrote: > availability of memcpy.) So I installed the attached patch into tzdb's zic.c > and it should be part of the next tzdb release. Please commit this zic.c fix also to glibc. (We can do a bulk update from tzcode after the next tzcode release.)
diff --git a/NEWS b/NEWS index 966f9b6..7881936 100644 --- a/NEWS +++ b/NEWS @@ -29,6 +29,9 @@ Unreleased, experimental changes Diagnostics and commentary now distinguish UT from UTC more carefully; see theory.html for more information about UT vs UTC. + zic has been ported to GCC 8's -Wstringop-truncation option. + (Problem reported by Martin Sebor.) + Release 2017c - 2017-10-20 14:49:34 -0700 diff --git a/zic.c b/zic.c index 82e653d..e36301f 100644 --- a/zic.c +++ b/zic.c @@ -1960,7 +1960,7 @@ writezone(const char *const name, const char *const string, char version) } #define DO(field) fwrite(tzh.field, sizeof tzh.field, 1, fp) tzh = tzh0; - strncpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic); + memcpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic); tzh.tzh_version[0] = version; convert(thistypecnt, tzh.tzh_ttisgmtcnt); convert(thistypecnt, tzh.tzh_ttisstdcnt);