[v7] Implement strlcpy, strlcat [BZ #178]
Commit Message
Florian Weimer wrote:
> I got conflicting advice (on-list and off-list)
It might be helpful to see a summary of the off-list advice, as it appears to
have influenced the proposed patch without leaving much trace of what was
discussed. Did the off-list advisors consider the points raised on libc-alpha?
> One suggestion is to make a non-terminated buffer undefined, but that breaks
> the snprintf analogy for size 0 inputs.
Sorry, what analogy is that? snprintf does not concatenate to a buffer directly.
What is the practical use case here? How does the use case ignore the principle
that strlcpy should null-terminate its output?
> Another option is to leave the
> array untouched (which is what I implemented and documented).
It is OK to *implement* it that way. But *documenting* this weird corner case
would raise a can of worms. For example, it would mean strlcat's destination
might not be a string, which would mean that the proposed documentation's talk
about behavior being "undefined if the strings overlap" is not worded correctly.
These problems are discussed in more detail below.
> I did not pick up
> the full criticism because glibc has repeatedly demonstrated that its
> recommended approach to string processing is rather error-prone, so we
> should not judge other approaches too harshly.
Fair enough. However, the wording in the latest proposal has the opposite
problem: when documenting strncpy it inserts a seemingly-partisan suggestion to
use strlcpy instead. Furthermore, its strlcpy and strlcat sections both promote
these functions as ways to avoid buffer overruns even though strlcpy and strlcat
are not on their face more effective at that than the other truncation
functions. The strlcpy/strlcat documentation should be like the other truncation
functions, and defer to the general wording about buffer overruns that is
already in that section of the manual, rather than contain special advocacy for
that particular approach.
Here are some other (in some cases, related) issues with the documentation
change just proposed:
* It starts off by saying strcpy copies data that is "including a terminating
null byte" but this is not true if SIZE is zero.
* The paragraph that specifies strlcpy semantics "If @var{size} is less than or
equal to the the length of the string" is incorrect when SIZE is zero. The
wording should be something more like "If @var{size} is nonzero and less than or
equal to the the length of the string".
* It says strlcpy's behavior is undefined if the strings overlap, but if SIZE is
zero there is no destination string. It should say the behavior is undefined if
SIZE is nonzero and the strings overflap.
* When giving the differences between strncpy and strlcpy, it does not mention
that strlcpy requires the source string to be null-terminated, nor that it
computes the source string's length. Similarly when giving the differences
between strncat and strlcat.
* The initial sentence describing strlcat assumes that the destination is a
string, which is not necessarily the case.
* Typo: "If array @var{to}" should be "If the array @var{to}".
* The phrase "the length of the untruncated string" does not apply if the
destination is not a string.
* On multiple occasions, the description for strlcat talks about "the length of
the string @var{to}" without making it clear whether it is talking about the old
length of TO, or about the new length.
* It does not mention that truncation in strlcat occurs when the result is
greater than or equal to the size.
* It says that strlcat's behavior is undefined if the strings overlap, but the
destination is not necessarily a string, and the description of overlap does not
make it clear whether it's talking about the old destination string or array,
the new destination string or array, or the modified part of the destination.
* When giving the differences between strncat and strlcat, it does not mention
that strlcat ensures that an already-null-terminated destination stays
null-terminated.
* The documentation does not state that it is OK for the destination pointer to
be NULL if SIZE is zero.
* In some places the documentation says "length" (not counting the null
terminator) when it is referring to size (counting the null terminator).
Fixing these problems would take some work and would complicate the
documentation. Instead, I'll update my earlier patch (which leaves these weird
corner case behavior undefined; this is considerably simpler), by removing the
"full criticisms" that you mentioned and by adopting many of the other textual
changes you made. This revised doc patch is shorter than before, and is
compatible with the most-recently-proposed implementation.
As before I hope that the strlcpy/strlcat code changes do not go in, but if they
do go in they will need proper documentation.
The first attachment is a patch relative to master; the second is a diff of
string.texi compared to the text you just proposed.
@@ -904,10 +904,6 @@ non-null bytes followed by zero or more null bytes. It needs to set
all @var{size} bytes of the destination, even when @var{size} is much
greater than the length of @var{from}. As noted below, this function
is generally a poor choice for processing text.
-
-See @code{strlcpy} below for an alternative which null-terminates the
-destination string as long as the destination buffer does not have
-length zero.
@end deftypefun
@comment wchar.h
@@ -1111,15 +1107,13 @@ This function is similar to @code{strcpy}, but copies at most
@var{size} bytes from the string @var{from} into the destination
array @var{to}, including a terminating null byte.
-If @var{size} is zero, nothing is written to @var{to}.
-
If @var{size} is greater than the length of the string @var{from},
this function copies all of the string @var{from} to the destination
array @var{to}, including the terminating null byte. Like other
string functions such as @code{strcpy}, but unlike @code{strncpy}, any
remaining bytes in the destination array remain unchanged.
-If @var{size} is less than or equal to the the length of the string
+If @var{size} is nonzero and less than or equal to the the length of the string
@var{from}, this function copies only the first @samp{@var{size} - 1}
bytes to the destination array @var{to}, and writes a terminating null
byte to the last byte of the array.
@@ -1128,14 +1122,15 @@ The return value @var{result} of @code{strlcpy} is the length of the
string @var{from}. This means that @samp{@var{result} >= @var{size}} is
true whenever truncation occurs.
-The behavior of @code{strlcpy} is undefined if the strings overlap.
+The behavior of @code{strlcpy} is undefined if @var{size} is zero, or
+if the source and destination strings overlap.
As noted below, this function is generally a poor choice for processing
-text. Using @code{strlcpy} instead of @code{strcpy} is a way to avoid
-bugs caused by writing past the end of the allocated space for @var{to}.
-Unlike @code{strncpy}, @code{strlcpy} ensures that the destination
-string is always null-terminated (unless the array size is zero), and it
-does not fill the remaining part of the array with null bytes.
+text. Unlike @code{strncpy}, @code{strlcpy} requires @var{size} to be
+nonzero and the source string to be null-terminated, computes the
+source string's length, ensures that the destination is
+null-terminated, and does not fill the remaining part of the destination
+with null bytes.
This function is derived from BSD.
@end deftypefun
@@ -1145,38 +1140,33 @@ This function is derived from BSD.
@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function appends the string @var{from} to the
-string @var{to}, limiting the total length of the result string at
+string @var{to}, limiting the total size of the result string at
@var{to} (including the null terminator) to @var{size}.
-If array @var{to} contains a null terminator among the first @var{size}
-bytes, this function copies as much as possible of the string @var{from}
-into the array at @var{to} of @var{size} bytes, starting at the
-terminating null byte of the original string @var{to}. In effect, this
-appends the string @var{from} to the string @var{to}. In this case, the
-resulting array at @var{to} will contain a null terminator, but the
-result string can be truncated (not all bytes in @var{from} are copied).
-
-If the array @var{to} does not contain a null terminator among the first
-@var{size} bytes, the array @var{to} is not modified, and it will not
-contain a null terminator after this function returns. This also covers
-the case when @var{size} is zero.
+This function copies as much as possible of the string @var{from} into
+the array at @var{to} of @var{size} bytes, starting at the terminating
+null byte of the original string @var{to}. In effect, this appends
+the string @var{from} to the string @var{to}. Although the resulting
+string will contain a null terminator, it can be truncated (not all
+bytes in @var{from} are copied).
-This function returns the length of the untruncated string: the length
-of @var{from} plus the length of the string @var{to} (if there is a null
-terminator within the first @var{size} bytes of the array @var{to}), or
-the length of @var{from} plus @var{size} (if the array @var{to} of
-@var{size} bytes does not contain a null terminator).
+This function returns the sum of the original length of @var{to} and
+the length of @var{from}. This means that truncation occurs unless
+the returned value is less than @var{size}.
-The behavior of @code{strlcat} is undefined if the strings overlap.
+The behavior is undefined if @var{to} does not contain a null byte in
+its first @var{size} bytes, or if the source and resulting destination
+strings overlap.
As noted below, this function is generally a poor choice for processing
text. Also, this function has significant performance issues.
-@xref{Concatenating Strings}. Using @code{strlcat} instead of
-@code{strcat} is a way to avoid bugs caused by writing past the end of
-the allocated space for @var{to}. Unlike @code{strncat}, @var{size}
-specifies the maximum total length of the result string (including its
+@xref{Concatenating Strings}. Unlike @code{strncat}, @var{size}
+specifies the maximum total size of the result string (including its
null terminator), not the number of bytes copied from the source string
@var{from}.
+Also, unlike @code{strncat} this function requires the source and
+destination to be null-terminated, computes the source string's
+length, and keeps the destination null-terminated.
This function is derived from BSD.
@end deftypefun