Patchwork manual: Document the standardized scanf flag, "m". [BZ #16376]

Submitter Rical Jasan
Date Feb. 9, 2018, 1:07 p.m.
Message ID <20180209130754.16006-1-ricaljasan@pacific.net>
Permalink /patch/25886/
State Superseded

Comments

Rical Jasan - Feb. 9, 2018, 1:07 p.m.
POSIX.1-2008 introduced the optional assignment-allocation modifier,
"m", whose functionality was previously provided by the GNU extension
"a".

	[BZ #16376]
	* manual/stdio.texi (Input Conversion Syntax)
	(String Input Conversions, Dynamic String Input): Document the
	"m" flag.
---
 manual/stdio.texi | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)
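
For readers who have not used the modifier, here is a minimal sketch of the "variable = value" parse that the manual example in this patch performs, written against a string instead of standard input so it can be tried in isolation. The helper name parse_assignment and its return convention are assumptions made for illustration; they are not part of the patch or of glibc. It assumes a C library that implements the POSIX.1-2008 "m" modifier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: parse "name = value"
   from LINE.  On success, return 0 and store malloc'd strings in
   *NAME and *VALUE; the caller must free both.  On failure, return
   -1 and leave both set to NULL.  */
int
parse_assignment (const char *line, char **name, char **value)
{
  /* Start from NULL so that cleanup is safe whichever conversion
     fails.  */
  *name = NULL;
  *value = NULL;

  /* %m[...] asks sscanf to allocate a buffer big enough for the
     matched text and to store its address through the char **
     argument.  */
  int n = sscanf (line, "%m[a-zA-Z0-9] = %m[^\n]", name, value);
  if (n != 2)
    {
      /* A partial match may already have allocated *NAME.  */
      free (*name);
      free (*value);
      *name = NULL;
      *value = NULL;
      return -1;
    }
  return 0;
}
```

Initializing both pointers before the call is a deliberate choice in this sketch: it lets the failure path call free unconditionally, whether zero or one of the conversions succeeded.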
Zack Weinberg - Feb. 9, 2018, 3:06 p.m.
On Fri, Feb 9, 2018 at 8:07 AM, Rical Jasan <ricaljasan@pacific.net> wrote:
> POSIX.1-2008 introduced the optional assignment-allocation modifier,
> "m", whose functionality was previously provided by the GNU extension
> "a".

OK with just one tweak...

> +You should free the buffer with @code{free} when you no longer need it.
> +
> +As a GNU extension predating @samp{m}, @samp{a} is also available, but
> +its use is considered deprecated.

let's be a little more specific here:

+As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
+This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
+may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
+floating-point numbers, followed by a literal @samp{s}.

zw
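
To make the ambiguity concrete: ISO C99 defines "%a" as a floating-point input conversion that accepts the same text as "%f", including hexadecimal floats. The sketch below uses "%a" on its own, where it cannot be mistaken for the GNU modifier; a library following C99 would read "%as" as this same conversion followed by a literal "s". The helper name parse_float_a is an assumption made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper for illustration: parse a floating-point
   number from TEXT with the C99 %a conversion.  Return 1 on
   success, 0 otherwise.  */
int
parse_float_a (const char *text, float *out)
{
  /* With no s, S, or [ after it, %a can only be the floating-point
     conversion.  It accepts decimal and hexadecimal input.  */
  return sscanf (text, "%a", out) == 1;
}
```

For example, the input "0x1.8p1" is hexadecimal 1.8 (decimal 1.5) times two, so the stored value is 3.0.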
Joseph Myers - Feb. 9, 2018, 4:39 p.m.
On Fri, 9 Feb 2018, Zack Weinberg wrote:

> > +As a GNU extension predating @samp{m}, @samp{a} is also available, but
> > +its use is considered deprecated.
> 
> let's be a little more specific here:
> 
> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
> +floating-point numbers, followed by a literal @samp{s}.

Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is 
defined by default.  The redirection to __isoc99_scanf etc. is done if:

#if defined __USE_ISOC99 && !defined __USE_GNU \
    && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
    && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
Zack Weinberg - Feb. 9, 2018, 11:22 p.m.
On Fri, Feb 9, 2018 at 11:39 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Fri, 9 Feb 2018, Zack Weinberg wrote:
>
>> > +As a GNU extension predating @samp{m}, @samp{a} is also available, but
>> > +its use is considered deprecated.
>>
>> let's be a little more specific here:
>>
>> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
>> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
>> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
>> +floating-point numbers, followed by a literal @samp{s}.
>
> Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is
> defined by default.  The redirection to __isoc99_scanf etc. is done if:
>
> #if defined __USE_ISOC99 && !defined __USE_GNU \
>     && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
>     && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)

I'm tempted to suggest that we drop the "!defined __USE_GNU" test,
meaning that 'a' would only be a modifier under -std=gnu89, if I'm
reading that correctly. That would be easier to document, and it
already seems to be what GCC's scanf format warnings do:

$ cat test.c
#include <stdio.h>

int main(void)
{
  char *s;
  scanf("%as", &s);
  puts(s);
  return 0;
}

$ gcc -std=gnu89 -Wformat test.c
$ gcc -std=gnu11 -Wformat test.c
test.c: In function ‘main’:
test.c:6:11: warning: format ‘%a’ expects argument of type ‘float *’, but argument 2 has type ‘char **’ [-Wformat=]
   scanf("%as", &s);
          ~^    ~~
$ gcc -std=gnu11 -Wformat -D_GNU_SOURCE test.c
test.c: In function ‘main’:
test.c:6:11: warning: format ‘%a’ expects argument of type ‘float *’, but argument 2 has type ‘char **’ [-Wformat=]
   scanf("%as", &s);
          ~^    ~~

$ gcc --version
gcc (Debian 7.3.0-3) 7.3.0
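
For comparison, a sketch of the same read written with the POSIX "m" modifier, which takes a char ** and so should avoid the type-mismatch warning shown above. It reads from a string rather than standard input so it is easy to try; the helper name first_word is an assumption made for illustration, and the code assumes a C library that implements POSIX.1-2008 "%ms".

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd copy of the first
   whitespace-delimited word in TEXT, or NULL if there is none.
   The caller frees the result.  */
char *
first_word (const char *text)
{
  char *s = NULL;

  /* %ms expects a char **, which is the type of &s.  */
  if (sscanf (text, "%ms", &s) != 1)
    return NULL;
  return s;
}
```

Unlike test.c, this checks the return value before using the pointer, so empty input yields NULL instead of an uninitialized pointer being passed on.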
Joseph Myers - Feb. 9, 2018, 11:33 p.m.
On Fri, 9 Feb 2018, Zack Weinberg wrote:

> >> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
> >> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
> >> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
> >> +floating-point numbers, followed by a literal @samp{s}.
> >
> > Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is
> > defined by default.  The redirection to __isoc99_scanf etc. is done if:
> >
> > #if defined __USE_ISOC99 && !defined __USE_GNU \
> >     && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
> >     && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
> 
> I'm tempted to suggest that we drop the "!defined __USE_GNU" test,
> meaning that 'a' would only be a modifier under -std=gnu89, if I'm
> reading that

That seems reasonable to me.  (With a corresponding change to 
bits/stdio-ldbl.h to keep things consistent in the -mlong-double-64 case.)

Patch

diff --git a/manual/stdio.texi b/manual/stdio.texi
index 38be236991..22c338f8ea 100644
--- a/manual/stdio.texi
+++ b/manual/stdio.texi
@@ -3440,9 +3440,8 @@  successful assignments.
 @cindex flag character (@code{scanf})
 
 @item
-An optional flag character @samp{a} (valid with string conversions only)
+An optional flag character @samp{m} (valid with string conversions only)
 which requests allocation of a buffer long enough to store the string in.
-(This is a GNU extension.)
 @xref{Dynamic String Input}.
 
 @item
@@ -3720,8 +3719,8 @@  provide the buffer, always specify a maximum field width to prevent
 overflow.}
 
 @item
-Ask @code{scanf} to allocate a big enough buffer, by specifying the
-@samp{a} flag character.  This is a GNU extension.  You should provide
+Ask @code{scanf} to allocate a big-enough buffer, by specifying the
+@samp{m} flag character.  You should provide
 an argument of type @code{char **} for the buffer address to be stored
 in.  @xref{Dynamic String Input}.
 @end itemize
@@ -3825,7 +3824,7 @@  is said about @samp{%ls} above is true for @samp{%l[}.
 
 One more reminder: the @samp{%s} and @samp{%[} conversions are
 @strong{dangerous} if you don't specify a maximum width or use the
-@samp{a} flag, because input too long would overflow whatever buffer you
+@samp{m} flag, because too-long input would overflow whatever buffer you
 have provided for it.  No matter how long your buffer is, a user could
 supply input that is longer.  A well-written program reports invalid
 input with a comprehensible error message, not with a crash.
@@ -3833,18 +3832,27 @@  input with a comprehensible error message, not with a crash.
 @node Dynamic String Input
 @subsection Dynamically Allocating String Conversions
 
-A GNU extension to formatted input lets you safely read a string with no
+POSIX.1-2008 specifies an @dfn{assignment-allocation character}
+@samp{m}, valid for use with the string conversion specifiers
+@samp{s}, @samp{S}, @samp{[}, @samp{c}, and @samp{C}, which
+lets you safely read a string with no
 maximum size.  Using this feature, you don't supply a buffer; instead,
 @code{scanf} allocates a buffer big enough to hold the data and gives
-you its address.  To use this feature, write @samp{a} as a flag
-character, as in @samp{%as} or @samp{%a[0-9a-z]}.
+you its address.  To use this feature, write @samp{m} as a flag
+character; e.g., @samp{%ms} or @samp{%m[0-9a-z]}.
 
 The pointer argument you supply for where to store the input should have
 type @code{char **}.  The @code{scanf} function allocates a buffer and
-stores its address in the word that the argument points to.  You should
-free the buffer with @code{free} when you no longer need it.
+stores its address in the word that the argument points to.  When
+using the @samp{l} modifier (or equivalently, @samp{S} or @samp{C}),
+the pointer argument should have the type @code{wchar_t **}.
+
+You should free the buffer with @code{free} when you no longer need it.
+
+As a GNU extension predating @samp{m}, @samp{a} is also available, but
+its use is considered deprecated.
 
-Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]}
+Here is an example of using the @samp{m} flag with the @samp{%[@dots{}]}
 conversion specification to read a ``variable assignment'' of the form
 @samp{@var{variable} = @var{value}}.
 
@@ -3852,7 +3860,7 @@  conversion specification to read a ``variable assignment'' of the form
 @{
   char *variable, *value;
 
-  if (2 > scanf ("%a[a-zA-Z0-9] = %a[^\n]\n",
+  if (2 > scanf ("%m[a-zA-Z0-9] = %m[^\n]\n",
 		 &variable, &value))
     @{
       invalid_input_error ();