[v2,3/3] manual: Manual update for strlcat, strlcpy, wcslcat, wcslcpy

Message ID f39fcf3e4b98dd53f27a2d196038c73e91148cd4.1681993374.git.fweimer@redhat.com
State Committed
Commit d2fda60e7c4072180ba91df46bbbdacc0f4a133c
Delegated to: Siddhesh Poyarekar
Series strlcpy and related functions

Checks

Context                 Check    Description
dj/TryBot-apply_patch   success  Patch applied to master at the time it was sent
dj/TryBot-32bit         success  Build for i686

Commit Message

Florian Weimer April 20, 2023, 12:28 p.m. UTC
  From: Paul Eggert <eggert@cs.ucla.edu>

Co-authored-by: Florian Weimer <fweimer@redhat.com>
---
 manual/maint.texi  |  8 ++++
 manual/string.texi | 96 ++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 101 insertions(+), 3 deletions(-)
  

Comments

Siddhesh Poyarekar June 6, 2023, 6:03 a.m. UTC | #1
On 2023-04-20 08:28, Florian Weimer via Libc-alpha wrote:
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> Co-authored-by: Florian Weimer <fweimer@redhat.com>
> ---
>   manual/maint.texi  |  8 ++++
>   manual/string.texi | 96 ++++++++++++++++++++++++++++++++++++++++++++--
>   2 files changed, 101 insertions(+), 3 deletions(-)
> 
> diff --git a/manual/maint.texi b/manual/maint.texi
> index a8441e20b6..89da704f45 100644
> --- a/manual/maint.texi
> +++ b/manual/maint.texi
> @@ -371,6 +371,10 @@ The following functions and macros are fortified in @theglibc{}:
>   
>   @item @code{strcpy}
>   
> +@item @code{strlcat}
> +
> +@item @code{strlcpy}
> +
>   @item @code{strncat}
>   
>   @item @code{strncpy}
> @@ -411,6 +415,10 @@ The following functions and macros are fortified in @theglibc{}:
>   
>   @item @code{wcscpy}
>   
> +@item @code{wcslcat}
> +
> +@item @code{wcslcpy}
> +
>   @item @code{wcsncat}
>   
>   @item @code{wcsncpy}
> diff --git a/manual/string.texi b/manual/string.texi
> index ad57265274..4149d54ee7 100644
> --- a/manual/string.texi
> +++ b/manual/string.texi
> @@ -726,8 +726,8 @@ This function has undefined results if the strings overlap.
>   As noted below, this function has significant performance issues.
>   @end deftypefun
>   
> -Programmers using the @code{strcat} or @code{wcscat} function (or the
> -@code{strncat} or @code{wcsncat} functions defined in
> +Programmers using the @code{strcat} or @code{wcscat} functions (or the
> +@code{strlcat}, @code{strncat} and @code{wcsncat} functions defined in
>   a later section, for that matter)
>   can easily be recognized as lazy and reckless.  In almost all situations
>   the lengths of the participating strings are known (it better should be
> @@ -848,7 +848,8 @@ function.  The example would work for wide characters the same way.
>   Whenever a programmer feels the need to use @code{strcat} she or he
>   should think twice and look through the program to see whether the code cannot
>   be rewritten to take advantage of already calculated results.
> -The related functions @code{strncat} and @code{wcscat}
> +The related functions @code{strlcat}, @code{strncat},
> +@code{wcscat} and @code{wcsncat}
>   are almost always unnecessary, too.
>   Again: it is almost always unnecessary to use functions like @code{strcat}.
>   
> @@ -1076,6 +1077,95 @@ processing strings.  Also, this function has significant performance
>   issues.  @xref{Concatenating Strings}.
>   @end deftypefun
>   
> +@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
> +@standards{BSD, string.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +This function copies the string @var{from} to the destination array
> +@var{to}, limiting the result's size (including the null terminator)
> +to @var{size}.  The caller should ensure that @var{size} includes room
> +for the result's terminating null byte.
> +
> +If @var{size} is greater than the length of the string @var{from},
> +this function copies the non-null bytes of the string
> +@var{from} to the destination array @var{to},
> +and terminates the copy with a null byte.  Like other
> +string functions such as @code{strcpy}, but unlike @code{strncpy}, any
> +remaining bytes in the destination array remain unchanged.
> +
> +If @var{size} is nonzero and less than or equal to the length of the string
> +@var{from}, this function copies only the first @samp{@var{size} - 1}
> +bytes to the destination array @var{to}, and writes a terminating null
> +byte to the last byte of the array.
> +
> +This function returns the length of the string @var{from}.  This means
> +that truncation occurs if and only if the returned value is greater
> +than or equal to @var{size}.
> +
> +The behavior is undefined if @var{to} or @var{from} is a null pointer,
> +or if the destination array's size is less than @var{size}, or if the
> +string @var{from} overlaps the first @var{size} bytes of the
> +destination array.

Shouldn't this be undefined for all kinds of overlaps between @var{to} 
and @var{from} and not just when @var{from} overlaps with the first
@var{size} bytes of @var{to}?  Also, perhaps s/destination 
array/@var{to}/ to make it clearer.

> +
> +As noted below, this function is generally a poor choice for
> +processing strings.  Also, this function has a performance issue,
> +as its time cost is proportional to the length of @var{from}
> +even when @var{size} is small.
> +
> +This function is derived from OpenBSD 2.4.
> +@end deftypefun
> +
> +@deftypefun size_t wcslcpy (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
> +@standards{BSD, string.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +This function is a variant of @code{strlcpy} for wide strings.
> +The  @var{size} argument counts the length of the destination buffer in
> +wide characters (and not bytes).
> +
> +This function is derived from BSD.
> +@end deftypefun
> +
> +@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
> +@standards{BSD, string.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +This function appends the string @var{from} to the
> +string @var{to}, limiting the result's total size (including the null
> +terminator) to @var{size}.  The caller should ensure that @var{size}
> +includes room for the result's terminating null byte.
> +
> +This function copies as much as possible of the string @var{from} into
> +the array at @var{to} of @var{size} bytes, starting at the terminating
> +null byte of the original string @var{to}.  In effect, this appends
> +the string @var{from} to the string @var{to}.  Although the resulting
> +string will contain a null terminator, it can be truncated (not all
> +bytes in @var{from} may be copied).
> +
> +This function returns the sum of the original length of @var{to} and
> +the length of @var{from}.  This means that truncation occurs if and
> +only if the returned value is greater than or equal to @var{size}.
> +
> +The behavior is undefined if @var{to} or @var{from} is a null pointer,
> +or if the destination array's size is less than @var{size}, or if the
> +destination array does not contain a null byte in its first @var{size}
> +bytes, or if the string @var{from} overlaps the first @var{size} bytes
> +of the destination array.

Same question about overlaps: shouldn't we specify all overlaps as
undefined?

> +
> +As noted below, this function is generally a poor choice for
> +processing strings.  Also, this function has significant performance
> +issues.  @xref{Concatenating Strings}.
> +
> +This function is derived from OpenBSD 2.4.
> +@end deftypefun
> +
> +@deftypefun size_t wcslcat (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
> +@standards{BSD, string.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +This function is a variant of @code{strlcat} for wide strings.
> +The  @var{size} argument counts the length of the destination buffer in
> +wide characters (and not bytes).
> +
> +This function is derived from BSD.
> +@end deftypefun
> +
>   Because these functions can abruptly truncate strings or wide strings,
>   they are generally poor choices for processing them.  When copying or
>   concatening multibyte strings, they can truncate within a multibyte
  
Florian Weimer June 6, 2023, 12:27 p.m. UTC | #2
* Siddhesh Poyarekar:

>> +The behavior is undefined if @var{to} or @var{from} is a null pointer,
>> +or if the destination array's size is less than @var{size}, or if the
>> +string @var{from} overlaps the first @var{size} bytes of the
>> +destination array.
>
> Shouldn't this be undefined for all kinds of overlaps between @var{to}
> and @var{from} and not just when @var{from} overlaps with the first
> @var{size} bytes of @var{to}?

I don't think so.  There is no reason why the data couldn't be copied
within the same array.  This can plausibly happen if a custom memory
allocator is used, for example.
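
To make that case concrete, here is a minimal sketch in C.  It assumes
a hypothetical helper, sketch_strlcpy, written only to mirror the
semantics described in the patch (it is not glibc's implementation),
and a hypothetical copy_tail_to_front that copies a string stored in
the second half of an array into its first half.  Source and
destination share one array, but the source does not overlap the first
SIZE bytes of the destination, which is the only overlap the proposed
wording rules out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in following the semantics documented in the
   patch: copy at most SIZE - 1 bytes, always terminate when SIZE is
   nonzero, and return the length of FROM.  */
static size_t
sketch_strlcpy (char *to, const char *from, size_t size)
{
  size_t len = strlen (from);
  if (size > 0)
    {
      size_t n = len < size ? len : size - 1;
      memcpy (to, from, n);
      to[n] = '\0';
    }
  return len;
}

/* Copy the string stored at ARENA + HALF to the front of ARENA,
   writing at most HALF bytes.  The source starts HALF bytes in, so it
   lies outside the first HALF bytes of the destination.  The caller
   must ensure the source string is null-terminated within ARENA.  */
static size_t
copy_tail_to_front (char *arena, size_t half)
{
  return sketch_strlcpy (arena, arena + half, half);
}
```

With a 32-byte array holding "tail" at offset 16, calling
copy_tail_to_front (arena, 16) returns 4 and leaves "tail" at the
start of the array, with the original copy at offset 16 untouched.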

> Also, perhaps s/destination array/@var{to}/ to make it clearer.

I don't think so because it's confusing whether the size refers to TO
itself or (conceptually) to *TO.

Thanks,
Florian
  
Siddhesh Poyarekar June 6, 2023, 12:42 p.m. UTC | #3
On 2023-06-06 08:27, Florian Weimer wrote:
> * Siddhesh Poyarekar:
> 
>>> +The behavior is undefined if @var{to} or @var{from} is a null pointer,
>>> +or if the destination array's size is less than @var{size}, or if the
>>> +string @var{from} overlaps the first @var{size} bytes of the
>>> +destination array.
>>
>> Shouldn't this be undefined for all kinds of overlaps between @var{to}
>> and @var{from} and not just when @var{from} overlaps with the first
>> @var{size} bytes of @var{to}?
> 
> I don't think so.  There is no reason why the data couldn't be copied
> within the same array.  This can plausibly happen if a custom memory
> allocator is used, for example.

Uhmm, I haven't heard of the custom allocator argument being used in 
this context but I suppose it makes sense.  I'm probably just getting 
jitters about specifying behaviour to that much detail when a simple 
"behavior is undefined if FROM and TO overlap" would suffice.  I mean 
this argument could be made for memcpy or any function that assumes 
non-aliased inputs.

>> Also, perhaps s/destination array/@var{to}/ to make it clearer.
> 
> I don't think so because it's confusing whether the size refers to TO
> itself or (conceptually) to *TO.

Fair enough.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
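
For readers following the thread, the truncation rule documented in
the patch (truncation occurs if and only if the returned value is
greater than or equal to SIZE) can be sketched as follows.  The
helpers sketch_strlcpy and sketch_strlcat are hypothetical stand-ins
that follow the documented semantics so the example compiles on
systems whose C library lacks these functions; they are not glibc's
implementation.  join_path is likewise an illustrative name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: copy at most SIZE - 1 bytes of FROM,
   terminate when SIZE is nonzero, return the length of FROM.  */
static size_t
sketch_strlcpy (char *to, const char *from, size_t size)
{
  size_t len = strlen (from);
  if (size > 0)
    {
      size_t n = len < size ? len : size - 1;
      memcpy (to, from, n);
      to[n] = '\0';
    }
  return len;
}

/* Hypothetical stand-in: append as much of FROM as fits in SIZE bytes
   and return the original length of TO plus the length of FROM.
   Assumes TO holds a null byte within its first SIZE bytes, as the
   patch requires; otherwise the behavior is undefined.  */
static size_t
sketch_strlcat (char *to, const char *from, size_t size)
{
  size_t to_len = strlen (to);
  size_t from_len = strlen (from);
  size_t room = size - to_len - 1;
  size_t n = from_len < room ? from_len : room;
  memcpy (to + to_len, from, n);
  to[to_len + n] = '\0';
  return to_len + from_len;
}

/* Build "DIR/NAME" in BUF of BUF_SIZE bytes.  Return true if the
   whole result fit, false if any step truncated.  */
static bool
join_path (char *buf, size_t buf_size, const char *dir, const char *name)
{
  if (sketch_strlcpy (buf, dir, buf_size) >= buf_size)
    return false;
  if (sketch_strlcat (buf, "/", buf_size) >= buf_size)
    return false;
  return sketch_strlcat (buf, name, buf_size) < buf_size;
}
```

With an 8-byte buffer, join_path (buf, 8, "usr", "lib") returns true
and stores "usr/lib"; with the name "libx" the final call returns 8,
which is not less than the buffer size, so join_path reports
truncation and the buffer holds the truncated "usr/lib".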
  

Patch

diff --git a/manual/maint.texi b/manual/maint.texi
index a8441e20b6..89da704f45 100644
--- a/manual/maint.texi
+++ b/manual/maint.texi
@@ -371,6 +371,10 @@  The following functions and macros are fortified in @theglibc{}:
 
 @item @code{strcpy}
 
+@item @code{strlcat}
+
+@item @code{strlcpy}
+
 @item @code{strncat}
 
 @item @code{strncpy}
@@ -411,6 +415,10 @@  The following functions and macros are fortified in @theglibc{}:
 
 @item @code{wcscpy}
 
+@item @code{wcslcat}
+
+@item @code{wcslcpy}
+
 @item @code{wcsncat}
 
 @item @code{wcsncpy}
diff --git a/manual/string.texi b/manual/string.texi
index ad57265274..4149d54ee7 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -726,8 +726,8 @@  This function has undefined results if the strings overlap.
 As noted below, this function has significant performance issues.
 @end deftypefun
 
-Programmers using the @code{strcat} or @code{wcscat} function (or the
-@code{strncat} or @code{wcsncat} functions defined in
+Programmers using the @code{strcat} or @code{wcscat} functions (or the
+@code{strlcat}, @code{strncat} and @code{wcsncat} functions defined in
 a later section, for that matter)
 can easily be recognized as lazy and reckless.  In almost all situations
 the lengths of the participating strings are known (it better should be
@@ -848,7 +848,8 @@  function.  The example would work for wide characters the same way.
 Whenever a programmer feels the need to use @code{strcat} she or he
 should think twice and look through the program to see whether the code cannot
 be rewritten to take advantage of already calculated results.
-The related functions @code{strncat} and @code{wcscat}
+The related functions @code{strlcat}, @code{strncat},
+@code{wcscat} and @code{wcsncat}
 are almost always unnecessary, too.
 Again: it is almost always unnecessary to use functions like @code{strcat}.
 
@@ -1076,6 +1077,95 @@  processing strings.  Also, this function has significant performance
 issues.  @xref{Concatenating Strings}.
 @end deftypefun
 
+@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@standards{BSD, string.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function copies the string @var{from} to the destination array
+@var{to}, limiting the result's size (including the null terminator)
+to @var{size}.  The caller should ensure that @var{size} includes room
+for the result's terminating null byte.
+
+If @var{size} is greater than the length of the string @var{from},
+this function copies the non-null bytes of the string
+@var{from} to the destination array @var{to},
+and terminates the copy with a null byte.  Like other
+string functions such as @code{strcpy}, but unlike @code{strncpy}, any
+remaining bytes in the destination array remain unchanged.
+
+If @var{size} is nonzero and less than or equal to the length of the string
+@var{from}, this function copies only the first @samp{@var{size} - 1}
+bytes to the destination array @var{to}, and writes a terminating null
+byte to the last byte of the array.
+
+This function returns the length of the string @var{from}.  This means
+that truncation occurs if and only if the returned value is greater
+than or equal to @var{size}.
+
+The behavior is undefined if @var{to} or @var{from} is a null pointer,
+or if the destination array's size is less than @var{size}, or if the
+string @var{from} overlaps the first @var{size} bytes of the
+destination array.
+
+As noted below, this function is generally a poor choice for
+processing strings.  Also, this function has a performance issue,
+as its time cost is proportional to the length of @var{from}
+even when @var{size} is small.
+
+This function is derived from OpenBSD 2.4.
+@end deftypefun
+
+@deftypefun size_t wcslcpy (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
+@standards{BSD, string.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is a variant of @code{strlcpy} for wide strings.
+The  @var{size} argument counts the length of the destination buffer in
+wide characters (and not bytes).
+
+This function is derived from BSD.
+@end deftypefun
+
+@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@standards{BSD, string.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function appends the string @var{from} to the
+string @var{to}, limiting the result's total size (including the null
+terminator) to @var{size}.  The caller should ensure that @var{size}
+includes room for the result's terminating null byte.
+
+This function copies as much as possible of the string @var{from} into
+the array at @var{to} of @var{size} bytes, starting at the terminating
+null byte of the original string @var{to}.  In effect, this appends
+the string @var{from} to the string @var{to}.  Although the resulting
+string will contain a null terminator, it can be truncated (not all
+bytes in @var{from} may be copied).
+
+This function returns the sum of the original length of @var{to} and
+the length of @var{from}.  This means that truncation occurs if and
+only if the returned value is greater than or equal to @var{size}.
+
+The behavior is undefined if @var{to} or @var{from} is a null pointer,
+or if the destination array's size is less than @var{size}, or if the
+destination array does not contain a null byte in its first @var{size}
+bytes, or if the string @var{from} overlaps the first @var{size} bytes
+of the destination array.
+
+As noted below, this function is generally a poor choice for
+processing strings.  Also, this function has significant performance
+issues.  @xref{Concatenating Strings}.
+
+This function is derived from OpenBSD 2.4.
+@end deftypefun
+
+@deftypefun size_t wcslcat (wchar_t *restrict @var{to}, const wchar_t *restrict @var{from}, size_t @var{size})
+@standards{BSD, string.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is a variant of @code{strlcat} for wide strings.
+The  @var{size} argument counts the length of the destination buffer in
+wide characters (and not bytes).
+
+This function is derived from BSD.
+@end deftypefun
+
 Because these functions can abruptly truncate strings or wide strings,
 they are generally poor choices for processing them.  When copying or
 concatening multibyte strings, they can truncate within a multibyte