From patchwork Tue Jun 13 14:24:48 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 20989 Received: (qmail 35177 invoked by alias); 13 Jun 2017 14:24:57 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 35160 invoked by uid 89); 13 Jun 2017 14:24:56 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=skeleton X-HELO: mail-yw0-f169.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:from:to:cc:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=g/X/gS3XUPKIh6JO9QS18ZUkrJlm64/C1Th+irfjfJY=; b=PIL3DxzfM7U2S2WFtF/uTQNhTxzumLOWaLtVaSSr5EJ8yuRbzNpRzCVQYe8pGMDG4x 0ydWyCYMyJ2o5ryzoAWXPWtypXu8G6+CJGBNBY10qV465TmRpRHljpnko1OJMGyQnq8A Kh7p0kBxpDqSZwAJeZ8DiuKgr7SL6+5sGX4sFLJGKA9286+T89Cy1HCgdUHaa/PU09rP YZGEJueBDSh3hRNpZ4W52QRXOs/1q3iVSzxGVVI9KUpbAywatg1D1nMkkSBuxFFrp98U FESernOHjHGGDajc37j3tHTxmdhm5yo+e2NGXuZiEgWBwZuUGiPHWatGOjS2FADVdU/V 80Ug== X-Gm-Message-State: AKS2vOyOzCZZEPkKKqN8wqqIAYE81tChXFCGqSFANszcE9/IZfPbdp8u wIdRS/FIq/a3pPHG/TxaIQ== X-Received: by 10.55.215.130 with SMTP id t2mr163805qkt.152.1497363891832; Tue, 13 Jun 2017 07:24:51 -0700 (PDT) Subject: Re: [PATCH 07/17] malloc: Add specialized dynarray for C strings From: Adhemerval Zanella To: Florian Weimer Cc: libc-alpha@sourceware.org References: <1496956411-25594-1-git-send-email-adhemerval.zanella@linaro.org> <1496956411-25594-8-git-send-email-adhemerval.zanella@linaro.org> <877f0g43ur.fsf@oldenburg.str.redhat.com> Message-ID: <5881a348-d176-0a53-f970-361d200f5602@linaro.org> Date: Tue, 13 Jun 2017 11:24:48 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: On 13/06/2017 09:17, Adhemerval Zanella wrote: > > > On 13/06/2017 06:46, Florian Weimer wrote: >> Adhemerval Zanella writes: >> >>> The provided function are not extensive and meant mainly to be use in >>> subsequent glob implementation cleanup. For internal object >>> consistency only the function provided by char_array.c should be used, >>> including internal object manipulation. >> >> It's surprising that glob needs these many functions. >> >> I'm not sure if the automatic NUL termination makes sense if we ever >> want to generalize this interface. How ingrained is that to the glob >> use case? > > glob internally prepends the search directory by replacing the tilde with > a external home directory (GLOB_TILDE, either by environment variable or > through getXXX functions) or append new data by calling new patterns > recursively (GLOB_BRACE). > > It is not an specific usage to glob, but rather a generic one which with > more higher level languages all the memory buffer bikeshedding will be > avoided by a string object. The idea is to keep a consistent C string buffer > stack-allocated (for general use) or heap based (to not bound due the > internal expansion) which can be used as constant argument for other > function call (fnmatch or recursively in case of GLOB_BRACE). > >> >>> +++ b/malloc/char_array.c >> >> Should this be malloc/char_array-skeleton.c, to indicate that this file >> is parameterized and intended for inclusion into another .c file? > > I am not sure, the idea of char_array is to be self-contained, there is > no need to define anything in the file to include it. Also, if we see > other usage of dynamic C string inside glibc it would be a good idea to > add internal symbols for common symbols. > >> >>> +#define DYNARRAY_ELEMENT_INIT(__e) (*__e = '\0') >> >> Why is this needed? I don't think the code cares for the contents of >> freshly added bytes. For the trailing NUL byte, it seems better to add >> that separately from the initialization operations. > > It does not, it is an artefact I added and it should be removed. > >> >>> +static bool __attribute_used__ >>> +char_array_replace_str_pos (struct char_array *array, size_t pos, >>> + const char *str, size_t len) >>> +{ >>> + if (pos > array->dynarray_header.used) >>> + return false; >>> + >>> + size_t newsize; >>> + if (check_add_wrapv_size_t (pos, len, &newsize) >>> + || check_add_wrapv_size_t (newsize, 1, &newsize) >>> + || !char_array_resize (array, newsize)) >>> + return false; >> >> I don't think it is a good idea to mix the reporting of usage errors >> (index out of bounds) with environmental conditions (memory allocation >> failure). Have you considered calling __libc_fatal if pos is out of >> range, similar to the existing *_at function? > > Indeed, I will change to call __libc_fatal in such cases. > >> >>> +++ b/malloc/malloc-internal.h >> >>> +/* Set *R = A + B. Return true if the answer is mathematically incorrect due >>> + to overflow; in this case, *R is the low order bits of the correct >>> + answer. */ >>> +static inline bool >>> +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) >> >> Why not use check_add_overflow_size_t, for consistency with >> check_mul_overflow_size_t? >> > > I will change it. > Here is an updated patch for the specialized dynarray. I have incorporated all your suggestion but the skeleton name change (which I am not sure if it should follow the idea since it should be a complete 'api'). I also added some nonull attribute as for dynarray implementation. ---- diff --git a/malloc/Makefile b/malloc/Makefile index 14c13f1..323b3fb 100644 --- a/malloc/Makefile +++ b/malloc/Makefile @@ -46,6 +46,7 @@ tests-internal += \ tst-dynarray \ tst-dynarray-fail \ tst-dynarray-at-fail \ + tst-char_array ifneq (no,$(have-tunables)) tests += tst-malloc-usable-tunables @@ -58,7 +59,7 @@ test-srcs = tst-mtrace routines = malloc morecore mcheck mtrace obstack reallocarray \ scratch_buffer_grow scratch_buffer_grow_preserve \ scratch_buffer_set_array_size \ - dynarray_at_failure \ + dynarray_at_failure dynarray_overflow_failure \ dynarray_emplace_enlarge \ dynarray_finalize \ dynarray_resize \ diff --git a/malloc/char_array.c b/malloc/char_array.c new file mode 100644 index 0000000..4d73203 --- /dev/null +++ b/malloc/char_array.c @@ -0,0 +1,281 @@ +/* Specialized dynarray for C strings. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* This file provides a dynamic C string with an initial stack allocated + buffer. Since it is based on dynarray, it provided dynamic size + expansion and heap usage for large strings. + + The following parameters are optional: + + CHAR_ARRAY_INITIAL_SIZE + The size of the statically allocated array (default is 256). It will + be used to define DYNARRAY_INITIAL_SIZE. + + The following functions are provided: + + bool char_array_init_empty (struct char_array *); + bool char_array_init_str (struct char_array *, const char *); + bool char_array_init_str_size (struct char_array *, const char *, size_t); + bool char_array_is_empty (struct char_array *); + const char *char_array_str (struct char_array *); + char char_array_pos (struct char_array *, size_t); + size_t char_array_length (struct char_array *); + bool char_array_set_str (struct char_array *, const char *); + bool char_array_set_str_size (struct char_array *, const char *, size_t); + void char_array_erase (struct char_array *, size_t); + bool char_array_resize_str (struct char_array *, size_t); + bool char_array_prepend_str (struct char_array *, const char *); + bool char_array_append_str (struct char_array *, const char *); + bool char_array_replace_str_pos (struct char_array *, size_t, const char *, + size_t); + + For instance: + + struct char_array str; + // str == "testing"; + char_array_init_str (&str, "testing"); + // c == 's' + char c = char_array_pos (&str, 2); + // str = "testing2"; + char_array_set_str (&str, "testing2"); + // str = "testi"; + char_array_erase (&str, 5); + // str = "123testi"; + char_array_prepend_str (&str, "123"); + // len = 8; + size_t len = char_array_length (&str); + // str = "123testi456"; + char_array_append_str (&str, "456"); + // str = "123testi789"; + char_array_replace_str_pos (&str, 7, "789", 3); + */ + +#define DYNARRAY_STRUCT char_array +#define DYNARRAY_ELEMENT char +#define DYNARRAY_PREFIX char_array_ +#ifndef CHAR_ARRAY_INITIAL_SIZE +# define CHAR_ARRAY_INITIAL_SIZE 256 +#endif +#define DYNARRAY_INITIAL_SIZE CHAR_ARRAY_INITIAL_SIZE +#include +#include + +/* Return a const char for the internal C string handled by 'array'. */ +__attribute__ ((nonnull (1))) +static const char * +char_array_str (struct char_array *array) +{ + return char_array_at (array, 0); +} + +/* Return the character at position 'pos' from the char_array 'array'. */ +__attribute__ ((nonnull (1))) +static char +char_array_pos (struct char_array *array, size_t pos) +{ + return *char_array_at (array, pos); +} + +/* Calculate the length of the string, excluding the terminating null. */ +__attribute__ ((unused, nonnull (1))) +static size_t +char_array_length (struct char_array *array) +{ + /* Exclude the final '\0'. */ + return array->dynarray_header.used - 1; +} + +/* Copy the contents of string 'str' to char_array 'array', including the + final '\0'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str) + 1; + if (!char_array_resize (array, size)) + return false; + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = size; + return true; +} + +/* Copy up 'size' bytes from string 'str' to char_array 'array'. A final + '\0' is appended in the char_array. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str_size (struct char_array *array, const char *str, + size_t size) +{ + size_t newsize; + if (check_add_wrapv_size_t (size, 1, &newsize)) + __libc_dynarray_overflow_failure (size, 1); + + if (!char_array_resize (array, newsize)) + return false; + + *((char *) mempcpy (array->dynarray_header.array, str, size)) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Initialize the char_array 'array' and sets it to an empty string (""). */ +__attribute__ ((nonnull (1))) +static bool +char_array_init_empty (struct char_array *array) +{ + char_array_init (array); + return char_array_set_str (array, ""); +} + +/* Initialize the char_array 'array' and copy the content of string 'str'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str (struct char_array *array, const char *str) +{ + char_array_init (array); + return char_array_set_str (array, str); +} + +/* Initialize the char_array 'array' and copy the content of string 'str' + up to 'size' characteres. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str_size (struct char_array *array, const char *str, + size_t size) +{ + char_array_init (array); + return char_array_set_str_size (array, str, size); +} + +/* Return if the char_array contain any characteres. */ +__attribute__ ((nonnull (1))) +static bool +char_array_is_empty (struct char_array *array) +{ + return *char_array_at (array, 0) == '\0'; +} + +/* Remove the byte at position 'pos' from char_array 'array'. The contents + are moved internally if the position is not at the end of the internal + buffer. */ +__attribute__ ((nonnull (1))) +static bool +char_array_erase (struct char_array *array, size_t pos) +{ + if (pos >= array->dynarray_header.used - 1) + return false; + + char *ppos = char_array_at (array, pos); + char *lpos = array->dynarray_header.array + array->dynarray_header.used; + ptrdiff_t size = lpos - ppos; + memmove (ppos, ppos + 1, size); + array->dynarray_header.used--; + return true; +} + +/* Resize the char_array 'array' to size 'count' maintaining the ending + '\0' byte. */ +__attribute__ ((nonnull (1))) +static bool +char_array_crop (struct char_array *array, size_t size) +{ + if (size >= (array->dynarray_header.used - 1) + || !char_array_resize (array, size + 1)) + return false; + + array->dynarray_header.array[array->dynarray_header.used] = '\0'; + return true; +} + +/* Prepend the contents of string 'str' to char_array 'array', including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_prepend_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and we need below + to correct copy the elements. */ + size_t used = array->dynarray_header.used; + + size_t newsize; + if (check_add_wrapv_size_t (used, size, &newsize)) + __libc_dynarray_overflow_failure (used, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Make room for the string and copy it. */ + memmove (array->dynarray_header.array + size, array->dynarray_header.array, + used); + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = newsize; + return true; +} + +/* Append the contents of string 'str' to char_array 'array, including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_append_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and it used it below + to correct copy the elements. */ + size_t used = array->dynarray_header.used - 1; + + /* 'used' does account for final '\0', so there is no need to add + an extra element to calculate the final required size. */ + size_t newsize; + if (check_add_wrapv_size_t (used + 1, size, &newsize)) + __libc_dynarray_overflow_failure (used + 1, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Start to append at '\0' up to string length and add a final '\0'. */ + *(char*) mempcpy (array->dynarray_header.array + used, str, size) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Replace the contents starting of position 'pos' of char_array 'array' + with the contents of string 'str' up to 'len' bytes. A final '\0' + is appended in the string. */ +__attribute__ ((nonnull (1, 3))) +static bool +char_array_replace_str_pos (struct char_array *array, size_t pos, + const char *str, size_t len) +{ + if (pos > array->dynarray_header.used) + return false; + + size_t newsize; + if (check_add_wrapv_size_t (pos, len, &newsize)) + __libc_dynarray_overflow_failure (pos, len); + if (check_add_wrapv_size_t (newsize, 1, &newsize)) + __libc_dynarray_overflow_failure (newsize, 1); + + if (!char_array_resize (array, newsize)) + return false; + + char *start = char_array_at (array, pos); + *(char *) mempcpy (start, str, len) = '\0'; + array->dynarray_header.used = newsize; + return true; +} diff --git a/malloc/dynarray.h b/malloc/dynarray.h index c73e08b..5a491b4 100644 --- a/malloc/dynarray.h +++ b/malloc/dynarray.h @@ -173,4 +173,12 @@ void __libc_dynarray_at_failure (size_t size, size_t index) __attribute__ ((noreturn)); libc_hidden_proto (__libc_dynarray_at_failure) +/* Internal function. TErminate the process after an overflow in + new size allocation. SIZE is the current number of elements in + dynamic array and INCR is the new elements to add on current + size. */ +void __libc_dynarray_overflow_failure (size_t size, size_t incr) + __attribute__ ((noreturn)); +libc_hidden_proto (__libc_dynarray_overflow_failure) + #endif /* _DYNARRAY_H */ diff --git a/malloc/dynarray_overflow_failure.c b/malloc/dynarray_overflow_failure.c new file mode 100644 index 0000000..14936b0 --- /dev/null +++ b/malloc/dynarray_overflow_failure.c @@ -0,0 +1,31 @@ +/* Report an dynamic array size overflow condition. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +void +__libc_dynarray_overflow_failure (size_t size, size_t incr) +{ + char buf[200]; + __snprintf (buf, sizeof (buf), "Fatal glibc error: " + "new size overflows (old %zu and increment %zu)\n", + size, incr); + __libc_fatal (buf); +} +libc_hidden_def (__libc_dynarray_overflow_failure) diff --git a/malloc/malloc-internal.h b/malloc/malloc-internal.h index dbd801a..3066cd3 100644 --- a/malloc/malloc-internal.h +++ b/malloc/malloc-internal.h @@ -101,4 +101,18 @@ check_mul_overflow_size_t (size_t left, size_t right, size_t *result) #endif } +/* Set *R = A + B. Return true if the answer is mathematically incorrect due + to overflow; in this case, *R is the low order bits of the correct + answer. */ +static inline bool +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) +{ +#if 5 <= __GNUC__ + return __builtin_add_overflow (a, b, r); +#else + *r = a + b; + return *r < a; +#endif +} + #endif /* _MALLOC_INTERNAL_H */ diff --git a/malloc/tst-char_array.c b/malloc/tst-char_array.c new file mode 100644 index 0000000..53f9482 --- /dev/null +++ b/malloc/tst-char_array.c @@ -0,0 +1,107 @@ +/* Test for char_array. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +#include + +#include +#include +#include +#include +#include + +static int +do_test (void) +{ + mtrace (); + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_empty (&str) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == 0); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == true); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing")); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testing") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str_size (&str, "testing", 4)); + TEST_VERIFY_EXIT (char_array_length (&str) == 4); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "test") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_set_str (&str, "abcdef")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcdef") == 0); + TEST_VERIFY_EXIT (char_array_set_str_size (&str, "abcdef", 4)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcd") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_erase (&str, 4) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testng") == 0); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str)) + == false); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str) - 1) + == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 2); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testn") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "test")); + TEST_VERIFY_EXIT (char_array_prepend_str (&str, "123")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test")); + TEST_VERIFY_EXIT (char_array_append_str (&str, "456")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test456") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test456")); + TEST_VERIFY_EXIT (char_array_replace_str_pos (&str, 7, "789", 3)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test789") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test789")); + char_array_free (&str); + } + + return 0; +} + +#include