fputs return the number of characters written

Message ID 20210919190305.10050-1-cosiekvfj@o2.pl
State Rejected
Headers
Series fputs return the number of characters written |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Kacper Piwiński Sept. 19, 2021, 7:03 p.m. UTC
  We already compute strlen so there is no reason to not return it and be consistent with puts
---
 libio/iofputs.c   | 4 +++-
 libio/iofputs_u.c | 4 +++-
 libio/ioputs.c    | 3 +--
 3 files changed, 7 insertions(+), 4 deletions(-)
  

Comments

Siddhesh Poyarekar Sept. 20, 2021, 5:45 a.m. UTC | #1
On 9/20/21 00:33, Kacper Piwiński via Libc-alpha wrote:
> We already compute strlen so there is no reason to not return it and be consistent with puts

What's the reason to return string length though, especially since it 
needs an additional operation:

>     if ((_IO_vtable_offset (fp) != 0 || _IO_fwide (fp, -1) == -1)
>         && _IO_sputn (fp, str, len) == len)
> -    result = 1;
> +    result = MIN (INT_MAX, len);
> +

Users should not depend on puts/fputs to return the string length since 
(1) it is not guaranteed across implementations and hence not portable 
and (2) it is correct only up to INT_MAX.

Siddhesh
  
Kacper Piwiński Sept. 20, 2021, 6:02 p.m. UTC | #2
On 9/20/21 00:33, Kacper Piwiński via Libc-alpha wrote:  We already compute strlen so there is no reason to not return it and be consistent with puts   What's the reason to return string length though, especially since it  needs an additional operation:      if ((_IO_vtable_offset (fp) != 0 || _IO_fwide (fp, -1) == -1)         && _IO_sputn (fp, str, len) == len)  -    result = 1;  +    result = MIN (INT_MAX, len);  +   Users should not depend on puts/fputs to return the string length since  (1) it is not guaranteed across implementations and hence not portable  and (2) it is correct only up to INT_MAX.   Siddhesh   I checked assembly and this operation is implemented with cmovbe. So quite fast.  What's funny overall code size stayed the same because compilation was able to reduce some code paths.     (1)I agree that users should not depend on that if they want a portable program.  But people can target only glibc. Which is not a problem if they chose to?  (2)Yes that's a limitation. Do the same limitation apply to *printf?   So basically we should return strlen because :  1. we already compute it  3. string length is arguably more useful than constant 1 3. it is valid to return any non negative value for success  4. consistency:  - we already do that in puts  - *printf function family do that   Reason to not return strlen:  1. additional operation  2. arguably not that much useful  3. only INT_MAX   I see that I didn't change fputws and fputws_unlocked. So I will make a v2 with them also changed or  alternatively we can change puts to return 1 and also be consistent between *put*s functions.
  
Zack Weinberg Sept. 20, 2021, 6:13 p.m. UTC | #3
On Mon, Sep 20, 2021, at 2:02 PM, Kacper Piwiński via Libc-alpha wrote:
> So basically we should return strlen because:
> 1. we already compute it
> 2. string length is arguably more useful than constant 1
> 3. it is valid to return any non negative value for success
> 4. consistency:
>    - we already do that in puts
>    - *printf function family do that
> Reason to not return strlen:
> 1. additional operation
> 2. arguably not that much useful
> 3. only INT_MAX

A really important reason not to change _anything_ in this area is backward compatibility.  It's not safe for us to change fputs to return something other than 1, without first checking that there isn't any existing software that depends on it to return _only_ 1 or EOF.  Similarly, it would not be safe for us to change puts to return only 1 or EOF, without checking first that there isn't any existing software that depends on it returning the string length.

Have you done any of that checking?

zw
  
Kacper Piwiński Sept. 20, 2021, 6:15 p.m. UTC | #4
Sorry for the wrong indentation. Here is the same message:

I checked assembly and this operation is implemented with cmovbe. So quite fast.
What's funny overall code size stayed the same because compilation was able to reduce some code paths. 

(1)I agree that users should not depend on that if they want a portable program.
But people can target only glibc. Which is not a problem if they chose to?
(2)Yes that's a limitation. Do the same limitation apply to *printf?

So basically we should return strlen because :
1. we already compute it
3. string length is arguably more useful than constant 1
3. it is valid to return any non negative value for success
4. consistency:
- we already do that in puts
- *printf function family do that

Reason to not return strlen:
1. additional operation
2. arguably not that much useful
3. only INT_MAX

I see that I didn't change fputws and fputws_unlocked. So I will make a v2 with them also changed or
alternatively we can change puts to return 1 and also be consistent between *put*s functions.
  
Siddhesh Poyarekar Sept. 20, 2021, 6:39 p.m. UTC | #5
On 9/20/21 23:32, Kacper Piwiński wrote:
> I checked assembly and this operation is implemented with cmovbe. So 
> quite fast.

It's still likely slower than moving a constant, but performance is not 
really a major concern IMO since that instruction barely figures.

> What's funny overall code size stayed the same because compilation was 
> able to reduce some code paths.
> 
> (1)I agree that users should not depend on that if they want a portable 
> program.
> But people can target only glibc. Which is not a problem if they chose to?
> (2)Yes that's a limitation. Do the same limitation apply to *printf?
> 
> So basically we should return strlen because :
> 1. we already compute it
> 3. string length is arguably more useful than constant 1
> 3. it is valid to return any non negative value for success
> 4. consistency:
> - we already do that in puts
> - *printf function family do that
> 
> Reason to not return strlen:
> 1. additional operation
> 2. arguably not that much useful
> 3. only INT_MAX
> 
> I see that I didn't change fputws and fputws_unlocked. So I will make a 
> v2 with them also changed or
> alternatively we can change puts to return 1 and also be consistent 
> between *put*s functions.

AFAICT, the return value is considered implementation defined[1], which 
means that whatever is currently implemented is the advertised behaviour 
and changing would introduce an ABI event.  That's the main concern with 
changing the return values of these functions.

Siddhesh

[1] 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/puts.html#tag_16_468
  
Paul Eggert Sept. 20, 2021, 8:34 p.m. UTC | #6
I'm not convinced it'd be useful to modify puts / fputs, due to the 
INT_MAX and portability issues mentioned. That being said....

[putting my standards-nerd hat on; apologies in advance]

On 9/20/21 11:39 AM, Siddhesh Poyarekar wrote:
> AFAICT, the return value is considered implementation defined[1],

[1] doesn't say the return value of 'puts' is implementation-defined. It 
says merely that 'puts' returns a non-negative number on success. This 
is in contrast to the specs for functions like 'malloc'[2], where the 
spec is careful to say that the behavior of 'malloc (0)' is 
implementation-defined on success.

The current glibc manual doesn't say what 'puts' returns, which is 
surely a defect in the manual. The manual should use return-value 
wording similar to that of 'fputs'.

Although the current glibc manual does say that 'fputs' returns a 
non-negative value on success, it doesn't say which value, and this 
vagueness is OK as far as POSIX goes, as POSIX doesn't say the behavior 
is implementation-defined. So from what I can see, changing a successful 
'puts' to return some other non-negative value is merely of pragmatic 
concern rather than being an API or ABI event (as I understand these 
things, anyway).

> [1] 
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/puts.html#tag_16_468

[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html
  
Siddhesh Poyarekar Sept. 21, 2021, 12:14 a.m. UTC | #7
On 9/21/21 02:04, Paul Eggert wrote:
> On 9/20/21 11:39 AM, Siddhesh Poyarekar wrote:
>> AFAICT, the return value is considered implementation defined[1],
> 
> [1] doesn't say the return value of 'puts' is implementation-defined. It 
> says merely that 'puts' returns a non-negative number on success. This 
> is in contrast to the specs for functions like 'malloc'[2], where the 
> spec is careful to say that the behavior of 'malloc (0)' is 
> implementation-defined on success.

further down, it says this:

"""
This volume of POSIX.1-2017 requires that successful completion simply 
return a non-negative integer. There are at least three known different 
implementation conventions for this requirement:

     Return a constant value.

     Return the last character written.

     Return the number of bytes written. Note that this implementation 
convention cannot be adhered to for strings longer than {INT_MAX} bytes 
as the value would not be representable in the return type of the 
function. For backwards-compatibility, implementations can return the 
number of bytes for strings of up to {INT_MAX} bytes, and return 
{INT_MAX} for all longer strings.
"""

which I agree isn't quite the same as saying that behaviour is 
implementation defined.  However I argue that by mentioning specific 
behaviours, it burdens our implementation with the same level of 
commitment to stay consistent in the behaviour we have implemented. 
That is to say, in the absence of this paragraph I would have been less 
pedantic about the actual return value.

Siddhesh
  
Andreas Schwab Sept. 21, 2021, 7:15 a.m. UTC | #8
On Sep 21 2021, Siddhesh Poyarekar wrote:

> which I agree isn't quite the same as saying that behaviour is
> implementation defined.  However I argue that by mentioning specific 
> behaviours, it burdens our implementation with the same level of
> commitment to stay consistent in the behaviour we have implemented. 
> That is to say, in the absence of this paragraph I would have been less
> pedantic about the actual return value.

This part is non-normative, there is nothing to be pedantic about.

Andreas.
  
Siddhesh Poyarekar Sept. 21, 2021, 7:34 a.m. UTC | #9
On 9/21/21 12:45, Andreas Schwab wrote:
> On Sep 21 2021, Siddhesh Poyarekar wrote:
> 
>> which I agree isn't quite the same as saying that behaviour is
>> implementation defined.  However I argue that by mentioning specific
>> behaviours, it burdens our implementation with the same level of
>> commitment to stay consistent in the behaviour we have implemented.
>> That is to say, in the absence of this paragraph I would have been less
>> pedantic about the actual return value.
> 
> This part is non-normative, there is nothing to be pedantic about.

I don't disagree, but it has practical implications because it has the 
tendency to change user behaviour.  Maybe pedantic is too strong a word 
for what I feel about it, apprehension may be better.

Siddhesh
  
Kacper Piwiński Sept. 22, 2021, 5:03 p.m. UTC | #10
> Have you done any of that checking?
> 
> zw

How to do that?
  

Patch

diff --git a/libio/iofputs.c b/libio/iofputs.c
index a8721314bc..83fde6687a 100644
--- a/libio/iofputs.c
+++ b/libio/iofputs.c
@@ -34,9 +34,11 @@  _IO_fputs (const char *str, FILE *fp)
   int result = EOF;
   CHECK_FILE (fp, EOF);
   _IO_acquire_lock (fp);
+
   if ((_IO_vtable_offset (fp) != 0 || _IO_fwide (fp, -1) == -1)
       && _IO_sputn (fp, str, len) == len)
-    result = 1;
+    result = MIN (INT_MAX, len);
+
   _IO_release_lock (fp);
   return result;
 }
diff --git a/libio/iofputs_u.c b/libio/iofputs_u.c
index b9eb3b415f..907f6119af 100644
--- a/libio/iofputs_u.c
+++ b/libio/iofputs_u.c
@@ -34,8 +34,10 @@  __fputs_unlocked (const char *str, FILE *fp)
   size_t len = strlen (str);
   int result = EOF;
   CHECK_FILE (fp, EOF);
+
   if (_IO_fwide (fp, -1) == -1 && _IO_sputn (fp, str, len) == len)
-    result = 1;
+    result = MIN (INT_MAX, len);
+
   return result;
 }
 libc_hidden_def (__fputs_unlocked)
diff --git a/libio/ioputs.c b/libio/ioputs.c
index 99a177876a..5969c0af30 100644
--- a/libio/ioputs.c
+++ b/libio/ioputs.c
@@ -35,8 +35,7 @@  _IO_puts (const char *str)
   size_t len = strlen (str);
   _IO_acquire_lock (stdout);
 
-  if ((_IO_vtable_offset (stdout) != 0
-       || _IO_fwide (stdout, -1) == -1)
+  if ((_IO_vtable_offset (stdout) != 0 || _IO_fwide (stdout, -1) == -1)
       && _IO_sputn (stdout, str, len) == len
       && _IO_putc_unlocked ('\n', stdout) != EOF)
     result = MIN (INT_MAX, len + 1);