From patchwork Fri Dec 4 20:20:44 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paul Eggert X-Patchwork-Id: 9900 Received: (qmail 36264 invoked by alias); 4 Dec 2015 20:20:55 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 36252 invoked by uid 89); 4 Dec 2015 20:20:54 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.5 required=5.0 tests=AWL, BAYES_50, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: zimbra.cs.ucla.edu Subject: Re: [PATCH] Implement strlcat [BZ#178] To: Florian Weimer References: <56547472.3010302@redhat.com> <5654B1FE.5020100@cs.ucla.edu> <5654B796.7070302@redhat.com> <5656E018.5020608@cs.ucla.edu> <565F211A.2030909@redhat.com> <56607CD1.3050209@cs.ucla.edu> <566197FD.9020001@redhat.com> Cc: libc-alpha@sourceware.org From: Paul Eggert Message-ID: <5661F59C.5050608@cs.ucla.edu> Date: Fri, 4 Dec 2015 12:20:44 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <566197FD.9020001@redhat.com> On 12/04/2015 05:41 AM, Florian Weimer wrote: > Looks good. Thanks, I installed the first patch. Revised versions of the other two patches are attached. Responding to your comments: > _FORTIFY_SOURCE mostly covers the > write-to-statically-sized-buffer case. The manual should not make > promises the current GCC/glibc combinations cannot deliver. > _FORTIFY_SOURCE will always be brittle for the more complex cases > because of the dependency on GCC optimization behavior. > > It's also highly application-specific whether a crash (induced by > _FORTIFY_SOURCE) or truncation (from strncpy or strlcpy) is better. > Thanks, good points. I addressed them in the attached patches by replacing that paragraph with the following text. Although some buffer overruns can be prevented by manually replacing calls to copying functions with calls to truncation functions, nowadays there are easier and more-reliable automatic techniques that cause buffer overruns to reliably terminate a program. These include GCC's @option{-fsanitize=address} option and, if the destination buffer is statically sized, defining the @code{_FORTIFY_SOURCE} macro. Because truncation functions can mask application bugs that would otherwise be caught by the automatic techniques, these functions should be used only when the application's underlying logic requires truncation. Or perhaps you'd rather not document _FORTIFY_SOURCE at all? I notice it's mentioned nowhere in the manual; is that intended? If so, I can further revise accordingly. The attached 2nd patch also alter the proposed strlcpy+strlcat documentation as per my more-recent emails. One more thing: the attached 1st patch also removes the strncat example, as I discovered that it's quite misleading and the manual shouldn't be pushing strncat anyway. From 7d64e0be6baff7155cb753a653952c164894dfa6 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Fri, 4 Dec 2015 12:11:14 -0800 Subject: [PATCH 2/2] Add strlcpy, strlcat [BZ #178] This patch was derived from text by Florian Weimer in: https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html * manual/string.texi (Truncating Strings): New functions from BSD. --- ChangeLog | 6 ++++ manual/string.texi | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 107 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 083a08d..c822b34 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,11 @@ 2015-12-04 Paul Eggert + Add strlcpy, strlcat + [BZ #178] + This patch was derived from text by Florian Weimer in: + https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html + * manual/string.texi (Truncating Strings): New functions from BSD. + Split large string section; add truncation advice * manual/examples/strncat.c: Remove. This example was misleading, as the code would have undefined diff --git a/manual/string.texi b/manual/string.texi index 5b11b3b..398b41c 100644 --- a/manual/string.texi +++ b/manual/string.texi @@ -745,7 +745,7 @@ As noted below, this function has significant performance issues. @end deftypefun Programmers using the @code{strcat} or @code{wcscat} function (or the -@code{strncat} or @code{wcsncat} functions defined in +@code{strlcat}, @code{strncat}, or @code{wcsncat} functions defined in a later section, for that matter) can easily be recognized as lazy and reckless. In almost all situations the lengths of the participating strings are known (it better should be @@ -906,6 +906,47 @@ greater than the length of @var{from}. As noted below, this function is generally a poor choice for processing text. @end deftypefun +@comment string.h +@comment BSD +@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} +This function is similar to @code{strcpy}, but copies at most +@var{size} bytes from the string @var{from} into the destination +buffer @var{to}, including a terminating null byte. + +If @var{size} is greater than the length of @var{from}, this function +copies all of the string @var{from} to the destination buffer +@var{to}, including the terminating null byte. Like other string +functions such as @code{strcpy}, but unlike @code{strncpy}, any +remaining bytes in the destination buffer remain unchanged. + +If @var{size} is nonzero and is not greater than the length of the +string @var{from}, this function copies only the first +@samp{@var{size} - 1} bytes to the destination buffer @var{to}, and +writes a terminating null byte to the last byte in the buffer. + +This function returns the length of @var{from}. This means that +truncation occurs whenever the returned value is not less than +@var{size}. + +The behavior of this function is undefined if @var{size} is zero, if +the source and destination strings overlap, or if the source or +destination are null pointers. + +Unlike @code{strncpy}, this function always null-terminates the +destination string, does not zero-fill the destination buffer, +requires @var{size} to be nonzero, requires @var{from} to be a +null-terminated string, and always computes @var{from}'s length even +when greater than @var{size}. + +This function was designed as a stopgap for quickly retrofitting +possibly-unsafe uses of @code{strcpy} on platforms lacking +buffer-overrun protection. As noted below, this function is generally +a poor choice for processing text. + +This function is derived from BSD. +@end deftypefun + @comment wchar.h @comment ISO @deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) @@ -1064,6 +1105,65 @@ choice for processing text. Also, this function has significant performance issues. @xref{Concatenating Strings}. @end deftypefun +@comment string.h +@comment BSD +@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} +This function is similar to @code{strcat}, except that it truncates +@var{to} to at most @var{size} bytes (including the terminating null +byte) when appending the string @var{from} to the string @var{to}. + +This function appends a prefix of the string @var{from} to the string +@var{to}. The prefix contains all of @var{from} if that can be +appended to @var{to} without requiring more than @var{size} bytes +total, including the null terminator. Otherwise, it contains only as +many leading bytes of @var{from} as will fit. The resulting string in +@var{to} is always null-terminated, and any excess trailing bytes of +@var{from} are not copied. + +This function returns the sum of the original length of @var{to} and +the length of @var{from}. This means that truncation occurs whenever +the returned value is not less than @var{size}. + +The behavior is undefined if @var{to} does not contain a null byte in +its first @var{size} bytes, if the source and resulting destination +strings overlap, if the source or destination are null pointers, or if +the result would exceed @code{SIZE_MAX}. + +@strong{Portability Note:} With @theglibc{} this function's behavior +is never undefined due to the result exceeding @code{SIZE_MAX}, as +@var{from} and @var{to} cannot overlap and @theglibc{} assumes a flat +address space. + +This function could be implemented like this: + +@smallexample +@group +size_t +strlcat (char *to, const char *from, size_t size) +@{ + size_t len = strlen (to); + return len + strlcpy (to + len, from, size - len); +@} +@end group +@end smallexample + +Whereas @code{strncat} limits the length of the appended string, this +function limits the total destination size. Also, unlike +@code{strncat}, this function requires @var{from} to be +null-terminated and always computes @var{from}'s length even when +greater than the limit. + +As a companion to @code{strlcpy}, this function was designed as a +stopgap for quickly retrofitting possibly-unsafe uses of @code{strcat} +on platforms lacking buffer-overrun protection. As noted below, this +function is generally a poor choice for processing text. Also, like +@code{strcat} this function has significant performance issues. +@xref{Concatenating Strings}. + +This function is derived from BSD. +@end deftypefun + @comment wchar.h @comment ISO @deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) -- 2.1.0