Implement strlcat [BZ#178]

Message ID 5661F59C.5050608@cs.ucla.edu
State New, archived
Headers

Commit Message

Paul Eggert Dec. 4, 2015, 8:20 p.m. UTC
  On 12/04/2015 05:41 AM, Florian Weimer wrote:
> Looks good.

Thanks, I installed the first patch.  Revised versions of the other two 
patches are attached.  Responding to your comments:

> _FORTIFY_SOURCE mostly covers the
> write-to-statically-sized-buffer case.  The manual should not make
> promises the current GCC/glibc combinations cannot deliver.
> _FORTIFY_SOURCE will always be brittle for the more complex cases
> because of the dependency on GCC optimization behavior.
>
> It's also highly application-specific whether a crash (induced by
> _FORTIFY_SOURCE) or truncation (from strncpy or strlcpy) is better.
>

Thanks, good points. I addressed them in the attached patches by 
replacing that paragraph with the following text.

Although some buffer overruns can be prevented by manually replacing
calls to copying functions with calls to truncation functions,
nowadays there are easier and more-reliable automatic techniques that
cause buffer overruns to reliably terminate a program.  These include
GCC's @option{-fsanitize=address} option and, if the destination
buffer is statically sized, defining the @code{_FORTIFY_SOURCE} macro.
Because truncation functions can mask application bugs that would
otherwise be caught by the automatic techniques, these functions
should be used only when the application's underlying logic requires
truncation.


Or perhaps you'd rather not document _FORTIFY_SOURCE at all? I notice 
it's mentioned nowhere in the manual; is that intended? If so, I can 
further revise accordingly.

The attached 2nd patch also alter the proposed strlcpy+strlcat 
documentation as per my more-recent emails. One more thing: the attached 
1st patch also removes the strncat example, as I discovered that it's 
quite misleading and the manual shouldn't be pushing strncat anyway.
  

Comments

Florian Weimer Dec. 10, 2015, 7:33 p.m. UTC | #1
On 12/04/2015 09:20 PM, Paul Eggert wrote:
> Thanks, good points. I addressed them in the attached patches by
> replacing that paragraph with the following text.
> 
> Although some buffer overruns can be prevented by manually replacing
> calls to copying functions with calls to truncation functions,
> nowadays there are easier and more-reliable automatic techniques that
> cause buffer overruns to reliably terminate a program.  These include
> GCC's @option{-fsanitize=address} option and, if the destination
> buffer is statically sized, defining the @code{_FORTIFY_SOURCE} macro.
> Because truncation functions can mask application bugs that would
> otherwise be caught by the automatic techniques, these functions
> should be used only when the application's underlying logic requires
> truncation.
> 
> 
> Or perhaps you'd rather not document _FORTIFY_SOURCE at all? I notice
> it's mentioned nowhere in the manual; is that intended?

It's difficult to document because it is an emergent property resulting
from glibc headers and GCC support.

> If so, I can further revise accordingly.

Right now, it is probably best to drop it.  I don't want to deal with an
influx of security vulnerabilities because it does not actually work in
some cases just yet.

I see that you have added a reference to -fcheck-pointer-bounds.  Do you
have practical experience with this option?  Does it actually work?

Florian
  
Paul Eggert Dec. 10, 2015, 11:13 p.m. UTC | #2
On 12/10/2015 11:33 AM, Florian Weimer wrote:
> I see that you have added a reference to -fcheck-pointer-bounds.  Do you
> have practical experience with this option?

No, I'm afraid we are not up-to-date with the latest Intel hardware 
here; my usual development platform at UCLA runs a 5.5-year-old AMD CPU, 
alas. I will try to get my hands on a Skylake, but this may take some 
time. In the meantime my impression is that -fcheck-pointer-bounds works 
well enough for some packages when using recent-enough hardware and 
software, and I expect that by the time the next glibc comes out 
-fcheck-pointer-bounds will be more-used and better-supported. I suppose 
we could add wording to the glibc manual to say that 
-fcheck-pointer-bounds is bleeding-edge as of 2015.

It is amusing that -fcheck-pointer-bounds removes the main justification 
for strlcpy+strlcat. What timing!
  

Patch

From 7d64e0be6baff7155cb753a653952c164894dfa6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 4 Dec 2015 12:11:14 -0800
Subject: [PATCH 2/2] Add strlcpy, strlcat

[BZ #178]
This patch was derived from text by Florian Weimer in:
https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
* manual/string.texi (Truncating Strings): New functions from BSD.
---
 ChangeLog          |   6 ++++
 manual/string.texi | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 083a08d..c822b34 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@ 
 2015-12-04  Paul Eggert  <eggert@cs.ucla.edu>
 
+	Add strlcpy, strlcat
+	[BZ #178]
+	This patch was derived from text by Florian Weimer in:
+	https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
+	* manual/string.texi (Truncating Strings): New functions from BSD.
+
 	Split large string section; add truncation advice
 	* manual/examples/strncat.c: Remove.
 	This example was misleading, as the code would have undefined
diff --git a/manual/string.texi b/manual/string.texi
index 5b11b3b..398b41c 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -745,7 +745,7 @@  As noted below, this function has significant performance issues.
 @end deftypefun
 
 Programmers using the @code{strcat} or @code{wcscat} function (or the
-@code{strncat} or @code{wcsncat} functions defined in
+@code{strlcat}, @code{strncat}, or @code{wcsncat} functions defined in
 a later section, for that matter)
 can easily be recognized as lazy and reckless.  In almost all situations
 the lengths of the participating strings are known (it better should be
@@ -906,6 +906,47 @@  greater than the length of @var{from}.  As noted below, this function
 is generally a poor choice for processing text.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcpy}, but copies at most
+@var{size} bytes from the string @var{from} into the destination
+buffer @var{to}, including a terminating null byte.
+
+If @var{size} is greater than the length of @var{from}, this function
+copies all of the string @var{from} to the destination buffer
+@var{to}, including the terminating null byte.  Like other string
+functions such as @code{strcpy}, but unlike @code{strncpy}, any
+remaining bytes in the destination buffer remain unchanged.
+
+If @var{size} is nonzero and is not greater than the length of the
+string @var{from}, this function copies only the first
+@samp{@var{size} - 1} bytes to the destination buffer @var{to}, and
+writes a terminating null byte to the last byte in the buffer.
+
+This function returns the length of @var{from}.  This means that
+truncation occurs whenever the returned value is not less than
+@var{size}.
+
+The behavior of this function is undefined if @var{size} is zero, if
+the source and destination strings overlap, or if the source or
+destination are null pointers.
+
+Unlike @code{strncpy}, this function always null-terminates the
+destination string, does not zero-fill the destination buffer,
+requires @var{size} to be nonzero, requires @var{from} to be a
+null-terminated string, and always computes @var{from}'s length even
+when greater than @var{size}.
+
+This function was designed as a stopgap for quickly retrofitting
+possibly-unsafe uses of @code{strcpy} on platforms lacking
+buffer-overrun protection.  As noted below, this function is generally
+a poor choice for processing text.
+
+This function is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
@@ -1064,6 +1105,65 @@  choice for processing text.  Also, this function has significant
 performance issues.  @xref{Concatenating Strings}.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcat}, except that it truncates
+@var{to} to at most @var{size} bytes (including the terminating null
+byte) when appending the string @var{from} to the string @var{to}.
+
+This function appends a prefix of the string @var{from} to the string
+@var{to}.  The prefix contains all of @var{from} if that can be
+appended to @var{to} without requiring more than @var{size} bytes
+total, including the null terminator.  Otherwise, it contains only as
+many leading bytes of @var{from} as will fit.  The resulting string in
+@var{to} is always null-terminated, and any excess trailing bytes of
+@var{from} are not copied.
+
+This function returns the sum of the original length of @var{to} and
+the length of @var{from}.  This means that truncation occurs whenever
+the returned value is not less than @var{size}.
+
+The behavior is undefined if @var{to} does not contain a null byte in
+its first @var{size} bytes, if the source and resulting destination
+strings overlap, if the source or destination are null pointers, or if
+the result would exceed @code{SIZE_MAX}.
+
+@strong{Portability Note:} With @theglibc{} this function's behavior
+is never undefined due to the result exceeding @code{SIZE_MAX}, as
+@var{from} and @var{to} cannot overlap and @theglibc{} assumes a flat
+address space.
+
+This function could be implemented like this:
+
+@smallexample
+@group
+size_t
+strlcat (char *to, const char *from, size_t size)
+@{
+  size_t len = strlen (to);
+  return len + strlcpy (to + len, from, size - len);
+@}
+@end group
+@end smallexample
+
+Whereas @code{strncat} limits the length of the appended string, this
+function limits the total destination size.  Also, unlike
+@code{strncat}, this function requires @var{from} to be
+null-terminated and always computes @var{from}'s length even when
+greater than the limit.
+
+As a companion to @code{strlcpy}, this function was designed as a
+stopgap for quickly retrofitting possibly-unsafe uses of @code{strcat}
+on platforms lacking buffer-overrun protection.  As noted below, this
+function is generally a poor choice for processing text.  Also, like
+@code{strcat} this function has significant performance issues.
+@xref{Concatenating Strings}.
+
+This function is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
-- 
2.1.0