From patchwork Thu Nov 28 09:43:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 36346 Received: (qmail 42485 invoked by alias); 28 Nov 2019 09:43:55 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 42474 invoked by uid 89); 28 Nov 2019 09:43:55 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-15.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.1 spammy=HX-Languages-Length:2747, useless X-HELO: albireo.enyo.de From: Florian Weimer To: libc-alpha@sourceware.org Subject: Re: [review] manual: Clarify strnlen, wcsnlen, strndup null termination behavior References: <87a79i33kt.fsf@oldenburg2.str.redhat.com> <875zk6337v.fsf@oldenburg2.str.redhat.com> Date: Thu, 28 Nov 2019 10:43:51 +0100 In-Reply-To: <875zk6337v.fsf@oldenburg2.str.redhat.com> (Florian Weimer's message of "Wed, 30 Oct 2019 12:03:16 +0100") Message-ID: <87wobk8hew.fsf@mid.deneb.enyo.de> MIME-Version: 1.0 * Florian Weimer: > * Andreas Schwab: > >> On Okt 30 2019, Florian Weimer wrote: >> >>> * Andreas Schwab: >>> >>>> On Okt 30 2019, Florian Weimer (Code Review) wrote: >>>> >>>>> +Note that @var{s} must be an array of at least @var{maxlen} bytes. It >>>>> +is undefined to call @code{strnlen} on a shorter array, even if it is >>>>> +known that the shorter array contains a null terminator. >>>> >>>> This is not true. strnlen _always_ stops before the null byte. >>> >>> This is not how it is specified in POSIX. >> >> Yes, it is. >> >> The strnlen() function shall return the number of bytes preceding >> the first null byte in the array to which s points, if s contains a >> null byte within the first maxlen bytes; otherwise, it shall return >> maxlen. >> >> There is nothing undefined here. Your interpretation would be >> completely useless anyway. > > It says “array”, which implies a length. Admittedly, it does not say > that maxlen corresponds to the arrray length. POSIX also says this: > > | The strnlen() function shall never examine more than maxlen bytes of > | the array pointed to by s. > > But it does NOT say that reading stops after the first null terminator. I have built glibc with --disable-multi-arch and this patch on x86-64: The resulting crashes demonstrate that the test suite verifies that we do not treat the input as an array (to some degree; there might be scopes in coverage). I think we should document this as a GNU extension. Thoughts? diff --git a/string/strnlen.c b/string/strnlen.c index 0b3a12e8b1..d5781dbb6f 100644 --- a/string/strnlen.c +++ b/string/strnlen.c @@ -33,6 +33,10 @@ size_t __strnlen (const char *str, size_t maxlen) { + /* Assert that the entire input is readable. */ + for (size_t i = 0; i < maxlen; ++i) + asm volatile ("" :: "r" (str[i])); + const char *char_ptr, *end_ptr = str + maxlen; const unsigned long int *longword_ptr; unsigned long int longword, himagic, lomagic; diff --git a/sysdeps/x86_64/strnlen.S b/sysdeps/x86_64/strnlen.S deleted file mode 100644 index d3c43ac482..0000000000 --- a/sysdeps/x86_64/strnlen.S +++ /dev/null @@ -1,6 +0,0 @@ -#define AS_STRNLEN -#define strlen __strnlen -#include "strlen.S" - -weak_alias (__strnlen, strnlen); -libc_hidden_builtin_def (strnlen) diff --git a/wcsmbs/wcsnlen.c b/wcsmbs/wcsnlen.c index 17e004dcc0..0d3709ac91 100644 --- a/wcsmbs/wcsnlen.c +++ b/wcsmbs/wcsnlen.c @@ -26,6 +26,10 @@ size_t __wcsnlen (const wchar_t *s, size_t maxlen) { + /* Assert that the entire input is readable. */ + for (size_t i = 0; i < maxlen; ++i) + asm volatile ("" :: "r" (s[i])); + const wchar_t *ret = __wmemchr (s, L'\0', maxlen); if (ret) maxlen = ret - s;