Implement strlcat [BZ#178]

Message ID 5654B796.7070302@redhat.com
State Dropped
Headers

Commit Message

Florian Weimer Nov. 24, 2015, 7:16 p.m. UTC
  On 11/24/2015 07:52 PM, Paul Eggert wrote:

> First, the proposed change to the manual still suffers from the problems
> I mentioned earlier, and the strlcpy patch makes these problems worse. I
> would like to help fix this by condensing my scattered
> documentation-related comments into a proposed patch.

I'm attaching the consolidated documentation patch.

> Second, strlcpy and strlcat should both be marked __wur.  This would
> remove a significant part of my objection to the proposed change, which
> is that these poorly-designed APIs encourage silent-truncation bugs.

I disagree.  We don't do that for snprintf, which is very similar, and
we do not check most snprintf results in glibc.

Florian
  

Comments

Paul Eggert Nov. 24, 2015, 8:36 p.m. UTC | #1
On 11/24/2015 11:16 AM, Florian Weimer wrote:
> We don't do that for snprintf, which is very similar, and
> we do not check most snprintf results in glibc.

That's a bad precedent and we should not make things worse by furthering 
it. In the places where glibc calls snprintf without checking the return 
value, snprintf never silently truncates its output, so the extra size 
argument is merely making the code more confusing (and a bit slower) and 
glibc should be calling sprintf instead.

The confusion in question doesn't merely affect the human reader. This 
is an area where using snprintf makes glibc less reliable, assuming 
reasonably modern security technology.  If one of these unchecked 
snprintf calls were buggy and actually did silently truncate its output, 
the bug would be masked by glibc's current use of snprintf, whereas if 
the code used sprintf with fortify checking the bug would be caught and 
reported.

Thanks for the updated manual patch; I'll take a look at it.
  
Rich Felker Nov. 24, 2015, 10:51 p.m. UTC | #2
On Tue, Nov 24, 2015 at 08:16:38PM +0100, Florian Weimer wrote:
> On 11/24/2015 07:52 PM, Paul Eggert wrote:
> 
> > First, the proposed change to the manual still suffers from the problems
> > I mentioned earlier, and the strlcpy patch makes these problems worse. I
> > would like to help fix this by condensing my scattered
> > documentation-related comments into a proposed patch.
> 
> I'm attaching the consolidated documentation patch.
> 
> > Second, strlcpy and strlcat should both be marked __wur.  This would
> > remove a significant part of my objection to the proposed change, which
> > is that these poorly-designed APIs encourage silent-truncation bugs.
> 
> I disagree.  We don't do that for snprintf, which is very similar, and
> we do not check most snprintf results in glibc.

I'm with Florian on this. __wur is only appropriate for functions
where there is no safe way to use them if you ignore the return value.
This is not the case for strlcpy/strlcat. Sometimes you don't care
about truncation at all. Even if you do need to know truncation
status, it's obtainable simply by calling strlen.

As a general principle I feel that __wur should not be used to impose
coding style but only to detect things that are genuinely unsafe. I
believe there are existing places in glibc where this principle is
violated, but that's a topic for another discussion.

Rich
  
Paul Eggert Nov. 24, 2015, 11:24 p.m. UTC | #3
On 11/24/2015 02:51 PM, Rich Felker wrote:
> __wur is only appropriate for functions
> where there is no safe way to use them if you ignore the return value.
> This is not the case for strlcpy/strlcat.

Fair enough, so I withdraw that suggestion.
  

Patch

diff --git a/manual/string.texi b/manual/string.texi
index 5f8a17e..65f50f9 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -576,17 +576,59 @@  there is no null terminator written into @var{to}.
 
 If the length of @var{from} is less than @var{size}, then @code{strncpy}
 copies all of @var{from}, followed by enough null characters to add up
-to @var{size} characters in all.  This behavior is rarely useful, but it
-is specified by the @w{ISO C} standard.
+to @var{size} characters in all.
 
 The behavior of @code{strncpy} is undefined if the strings overlap.
 
-Using @code{strncpy} as opposed to @code{strcpy} is a way to avoid bugs
+Not guaranteeing null termination and always overwriting the entire
+destination buffer makes @code{strncpy} rarely useful, but this behavior
+is specified by the @w{ISO C} standard.  See @code{strlcpy} below for an
+alternative which null-terminates the destination string as long as the
+destination buffer does not have length zero.
+@end deftypefun
+
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcpy}, but copies at most @var{size}
+characters into the destination buffer @var{to}, including the
+terminating null character.
+
+If @var{size} is zero, nothing is written to @var{to}.
+
+If the length of the string @var{from} is equal to or greater than
+@var{size}, @code{strlcpy} copies only the first @samp{@var{size} - 1}
+characters to the destination buffer @var{to}, and writes a terminating
+null character to the last character in the buffer.
+
+If the length of @var{from} is less than @var{size}, @code{strlcpy}
+copies all of the string @var{from} to the destination buffer @var{to},
+followed by a single null character.  Like other string functions such
+as @code{strcpy}, but unlike @code{strncpy}, the remaining characters in
+the destination buffer, if any, remain unchanged.
+
+The return value @var{result} of @code{strlcpy} is the length of the
+string @var{from}.  This means that @samp{@var{result} >= @var{size}} is
+true whenever truncation occurs.
+
+The string @var{from} must be null-terminated even if its length exceeds
+that of the destination buffer specified as @var{size}.  The behavior of
+@code{strlcpy} is undefined if the strings overlap or if the source or
+destination are null pointers.
+
+Using @code{strlcpy} as opposed to @code{strcpy} is a way to avoid bugs
 relating to writing past the end of the allocated space for @var{to}.
-However, it can also make your program much slower in one common case:
-copying a string which is probably small into a potentially large buffer.
-In this case, @var{size} may be large, and when it is, @code{strncpy} will
-waste a considerable amount of time copying null characters.
+Unlike @code{strncpy}, @code{strlcpy} ensures that the destination
+string is always null-terminated (unless the buffer size is zero), and
+it does not fill the remaining part of the buffer with null characters.
+
+@strong{Note:} GNU programs should not use statically sized buffers for
+storing strings.  @xref{Semantics, , Writing Robust Programs, standards,
+The GNU Coding Standards}.  Instead of using @code{strlcpy}, it is
+usually better to use dynamic memory allocation (@pxref{Unconstrained
+Allocation}) and functions such as @code{strdup} or @code{asprintf} to
+construct strings.
 @end deftypefun
 
 @comment wchar.h
@@ -1041,6 +1083,38 @@  hello, wo
 
 @comment string.h
 @comment BSD
+@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function appends the null-terminated string @var{from} to the
+string @var{to}, limiting the total length of the result string at
+@var{to} (including the null terminator) to @var{size}.
+
+If the original string @var{to} contains a null terminator among the
+first @var{size} characters, @code{strlcat} copies as much as possible
+of the string @var{from} into the buffer at @var{to} of @var{size}
+characters, starting at the terminating null character of the original
+string @var{to}.  In effect, this appends the string @var{from} to the
+string @var{to}.  In this case, the resulting string in @var{to} will
+always be null-terminated, and it is truncated if necessary (not all
+characters in @var{from} are copied).  @code{strlcat} still returns the
+length of the untruncated string, the sum of the original length of
+@var{to} and the length of @var{from}.
+
+If the original string @var{to} is not null-terminated (within the first
+@var{size} characters of the buffer), @code{strlcat} returns @var{size}
+plus the length of the string @var{from}.  In this case, the string
+@var{to} is unchanged, and will not be null-terminated.  This also
+covers the case when @var{size} is zero.
+
+@strong{Note:} GNU programs should not use statically sized buffers for
+storing strings.  @xref{Semantics, , Writing Robust Programs, standards,
+The GNU Coding Standards}.  Instead of using @code{strlcat}, it is
+usually better to use dynamic memory allocation (@pxref{Unconstrained
+Allocation}) and and the @code{asprintf} function to construct strings.
+@end deftypefun
+
+@comment string.h
+@comment BSD
 @deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size})
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 This is a partially obsolete alternative for @code{memmove}, derived from