From patchwork Wed Feb 7 05:56:30 2018
X-Patchwork-Submitter: Rical Jasan
X-Patchwork-Id: 25840
To: libc-alpha
From: Rical Jasan
Subject: [PATCH] manual: Document the standardized scanf flag, "m". [BZ #16376]
Message-ID: <7c42f58d-d076-aeb3-a229-2581aa03af94@pacific.net>
Date: Tue, 6 Feb 2018 21:56:30 -0800

Looking in stdio-common/vfscanf.c, "a" uses GNU_MALLOC and "m" uses
POSIX_MALLOC. There is also a MALLOC, which is defined to match either,
and throughout the code there are several conditionals where
POSIX_MALLOC is nested beneath MALLOC (GNU_MALLOC is always MALLOC).
I'm not sure we want to document any internal implementation
differences (if they're even significant). The existing documentation
wasn't very detailed itself, but I still wanted to mention it.

I confirmed "m" was introduced in POSIX.1-2008 (mentioned in the
Issue 7 section of CHANGE HISTORY on the s/f/scanf page).

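For reviewers who want to try the flag, here is a minimal sketch of the
behaviour being documented. It assumes a C library that implements the
POSIX.1-2008 "m" modifier. parse_assignment is a hypothetical helper, not
part of the patch, and it uses sscanf instead of scanf only so the input
can be given inline.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): parse "name = value"
   from LINE.  On success, store two malloc'd strings in *NAME and
   *VALUE and return 0; the caller frees both.  On failure, return -1
   and leave both pointers NULL.  */
static int
parse_assignment (const char *line, char **name, char **value)
{
  char *n = NULL, *v = NULL;

  /* Each "m" asks sscanf to allocate a buffer of the needed size and
     store its address through the matching char ** argument.  */
  int matched = sscanf (line, "%m[a-zA-Z0-9] = %m[^\n]", &n, &v);

  if (matched != 2)
    {
      /* Only a conversion that was counted as assigned left a buffer
         for us to release; on 0 or EOF nothing is ours to free.  */
      if (matched == 1)
        free (n);
      *name = NULL;
      *value = NULL;
      return -1;
    }

  *name = n;
  *value = v;
  return 0;
}
```

No maximum field width is needed here because the library sizes the
buffers itself. The caller's only obligation is to free both strings
after a successful parse.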
The NEWS file says entries for bugs will be generated automatically,
and once ACK'd, I'll refactor the paragraphs before committing (tried
to make it easier to see what changed here) and provide a ChangeLog
entry.

Rical

From cc32f967605178358d7b0d3e6fc1d36d036e6a8f Mon Sep 17 00:00:00 2001
From: Rical Jasan
Date: Tue, 6 Feb 2018 21:22:27 -0800
Subject: [PATCH] manual: Document the standardized scanf flag, "m". [BZ #16376]

POSIX.1-2008 introduced the optional assignment-allocation modifier,
"m", whose functionality was previously provided by the GNU extension
"a" (and still is).

	[BZ #16376]
	* manual/stdio.texi (Input Conversion Syntax)
	(String Input Conversions, Dynamic String Input): Document the
	"m" flag.
---
 manual/stdio.texi | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/manual/stdio.texi b/manual/stdio.texi
index 5d7b50c442..4db65f4e18 100644
--- a/manual/stdio.texi
+++ b/manual/stdio.texi
@@ -3440,10 +3440,9 @@ successful assignments.
 
 @cindex flag character (@code{scanf})
 @item
-An optional flag character @samp{a} (valid with string conversions only)
+An optional flag character @samp{m} (valid with string conversions only)
 which requests allocation of a buffer long enough to store the string in.
-(This is a GNU extension.)
-@xref{Dynamic String Input}.
+(There is also a GNU extension, @samp{a}.)  @xref{Dynamic String Input}.
 
 @item
 An optional decimal integer that specifies the @dfn{maximum field
@@ -3720,8 +3719,9 @@ provide the buffer, always specify a maximum field width to prevent
 overflow.}
 
 @item
-Ask @code{scanf} to allocate a big enough buffer, by specifying the
-@samp{a} flag character.  This is a GNU extension.  You should provide
+Ask @code{scanf} to allocate a big-enough buffer, by specifying the
+@samp{m} flag character (or the GNU extension, @samp{a}).
+You should provide
 an argument of type @code{char **} for the buffer address to be stored
 in.  @xref{Dynamic String Input}.
 @end itemize
@@ -3825,7 +3825,7 @@ is said about @samp{%ls} above is true for @samp{%l[}.
 
 One more reminder: the @samp{%s} and @samp{%[} conversions are
 @strong{dangerous} if you don't specify a maximum width or use the
-@samp{a} flag, because input too long would overflow whatever buffer you
+@samp{m} or @samp{a} flags, because too-long input would overflow whatever buffer you
 have provided for it.  No matter how long your buffer is, a user could
 supply input that is longer.  A well-written program reports invalid
 input with a comprehensible error message, not with a crash.
@@ -3833,18 +3833,26 @@ input with a comprehensible error message, not with a crash.
 @node Dynamic String Input
 @subsection Dynamically Allocating String Conversions
 
-A GNU extension to formatted input lets you safely read a string with no
+POSIX.1-2008 specifies an @dfn{assignment-allocation character}
+@samp{m}, valid for use with the string conversion specifiers
+@samp{s}, @samp{S}, @samp{[}, @samp{c}, and @samp{C}, which
+lets you safely read a string with no
 maximum size.  Using this feature, you don't supply a buffer; instead,
 @code{scanf} allocates a buffer big enough to hold the data and gives
-you its address.  To use this feature, write @samp{a} as a flag
-character, as in @samp{%as} or @samp{%a[0-9a-z]}.
+you its address.  To use this feature, write @samp{m} as a flag
+character; e.g., @samp{%ms} or @samp{%m[0-9a-z]}.
 
 The pointer argument you supply for where to store the input should have
 type @code{char **}.  The @code{scanf} function allocates a buffer and
-stores its address in the word that the argument points to.  You should
-free the buffer with @code{free} when you no longer need it.
+stores its address in the word that the argument points to.  When
+using the @samp{l} modifier (or equivalently, @samp{S} or @samp{C}),
+the pointer argument should have the type @code{wchar_t **}.
+
+You should free the buffer with @code{free} when you no longer need it.
+
+As a GNU extension, @samp{a} is available in place of @samp{m}.
 
-Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]}
+Here is an example of using the @samp{m} flag with the @samp{%[@dots{}]}
 conversion specification to read a ``variable assignment'' of the form
 @samp{@var{variable} = @var{value}}.
 
@@ -3852,7 +3860,7 @@ conversion specification to read a ``variable assignment'' of the form
 @{
   char *variable, *value;
 
-  if (2 > scanf ("%a[a-zA-Z0-9] = %a[^\n]\n",
+  if (2 > scanf ("%m[a-zA-Z0-9] = %m[^\n]\n",
                  &variable, &value))
     @{
       invalid_input_error ();
-- 
2.14.3