[RFC] Add reallocarray function.
Commit Message
The reallocarray function is an extension from OpenBSD. It is an
integer-overflow-safe replacement for realloc(p, X*Y) and
malloc(X*Y) (realloc(NULL, X*Y)). It can therefore help in preventing
certain security issues in code.
See
http://www.openbsd.org/cgi-bin/man.cgi?query=reallocarray&sektion=3&manpath=OpenBSD+Current
I think this would be a useful addition to the glibc as well. The patch
currently contains no documentation. I will add it to the patch if
there is interest in including the function. I have signed the FSF
papers.
The code reuses the integer overflow check from __libc_calloc (that's
why it is using __builtin_expect instead of the recommended
__glibc_unlikely) and on success simply calls __libc_realloc.
Please CC me in replies because I'm not subscribed to libc-alpha.
---
ChangeLog | 9 +++
malloc/Makefile | 2 +-
malloc/Versions | 4 ++
malloc/malloc.c | 22 +++++++
malloc/malloc.h | 8 +++
malloc/tst-reallocarray.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++
stdlib/stdlib.h | 7 ++
7 files changed, 212 insertions(+), 1 deletion(-)
create mode 100644 malloc/tst-reallocarray.c
Comments
Rüdiger Sonderfeld wrote:
> + /* size_t is unsigned so the behavior on overflow is defined. */
> + INTERNAL_SIZE_T bytes = nmemb * elem_size;
> +#define HALF_INTERNAL_SIZE_T \
> + (((INTERNAL_SIZE_T) 1) << (8 * sizeof (INTERNAL_SIZE_T) / 2))
> + if (__builtin_expect ((nmemb | elem_size) >= HALF_INTERNAL_SIZE_T, 0))
> + {
> + if (elem_size != 0 && bytes / elem_size != nmemb)
Instead of cut and pasting this code from calloc, please refactor so
that the code is present only once, with the goal of optimizing it when
the GCC folks get their act together and have a function like the
__builtin_umul_overflow function that Clang has had since January. This
will let calloc and reallocarray do the unsigned multiplication and
inspect the hardware's overflow bit directly, which is nicer than the
above hackery.
On 05/18/2014 10:32 PM, Rüdiger Sonderfeld wrote:
> +/* Re-allocate the previously allocated block in PTR, making the new
> + block large enough for NMEMB elements of SIZE bytes each. */
> +/* __attribute_malloc__ is not used, because if realloc returns
> + the same pointer that was passed to it, aliasing needs to be allowed
> + between objects pointed by the old and new pointers. */
> +extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
> + __THROW __attribute_warn_unused_result__;
I'm not sure if this is still on the table, but experience shows that
the realloc interface is error-prone for another reason: The straight
way to write an a reallocation,
ptr = realloc(ptr, new_size);
leads to a memory leak on error. It would be less error-prone to have
reallocarray to update the pointer directly on success, e.g.:
if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
// handle error
}
However, this cannot be implemented as a C function, only as a macro.
On Mon, Sep 01, 2014 at 05:48:38PM +0200, Florian Weimer wrote:
> On 05/18/2014 10:32 PM, Rüdiger Sonderfeld wrote:
> >+/* Re-allocate the previously allocated block in PTR, making the new
> >+ block large enough for NMEMB elements of SIZE bytes each. */
> >+/* __attribute_malloc__ is not used, because if realloc returns
> >+ the same pointer that was passed to it, aliasing needs to be allowed
> >+ between objects pointed by the old and new pointers. */
> >+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
> >+ __THROW __attribute_warn_unused_result__;
>
> I'm not sure if this is still on the table, but experience shows
> that the realloc interface is error-prone for another reason: The
> straight way to write an a reallocation,
>
> ptr = realloc(ptr, new_size);
>
> leads to a memory leak on error. It would be less error-prone to
> have reallocarray to update the pointer directly on success, e.g.:
>
> if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
> // handle error
> }
No, allocation functions which take void** are an extremely bad idiom
because they encourage UB. It's rare that ptr actually has type void*
(usually it has type T* for whatever type T the array members have).
So to use an allocation function that takes void**, a correct program
must use a temporary pointer, as in:
void *tmp = ptr;
if (reallocarray(&tmp, new_count, sizeof(T)) < 0) { ... }
ptr = tmp;
Note that this is even more awkward than the current situation with
realloc. However, almost everyone will actually write:
if (reallocarray((void **)&ptr, new_countl sizeof(T)) < 0) { ... }
thereby invoking UB via an aliasing violation.
Also, reallocarray is already defined by OpenBSD and perhaps others
with a particular signature. Defining an incompatible function with
the same name is just asking for trouble. In particular, if any
program using the proposed GNU variant with your semantics, but links
to a library which provides its own replacement function with the BSD
semantics, the symbols will conflict and the wrong function will get
called.
> However, this cannot be implemented as a C function, only as a macro.
Perhaps you meant to solve the aliasing violation a macro? It's
possible with GNU C statement-expressions, but ugly, and not really
clean. Perhaps there's a way to do it without any nonstandard
extensions too, but I don't think this kind of ugly hack that LOOKS
LIKE an aliasing violation until you decipher the macro belongs in
glibc.
Rich
On 09/01/2014 07:24 PM, Rich Felker wrote:
>> I'm not sure if this is still on the table, but experience shows
>> that the realloc interface is error-prone for another reason: The
>> straight way to write an a reallocation,
>>
>> ptr = realloc(ptr, new_size);
>>
>> leads to a memory leak on error. It would be less error-prone to
>> have reallocarray to update the pointer directly on success, e.g.:
>>
>> if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
>> // handle error
>> }
>
> No, allocation functions which take void** are an extremely bad idiom
> because they encourage UB.
That's why I wrote that reallocarray has to be a macro.
On the other hand, I'm not particularly worried about the aliasing
violation because according to one reading of the standard, realloc
returns a pointer to untyped (but partially initialized) memory, which
needs out-of-language support anyway.
> Also, reallocarray is already defined by OpenBSD and perhaps others
> with a particular signature.
We'd have to give the fixed version a separate name, to avoid another
strerror_r-like fiasco.
On Tue, Sep 02, 2014 at 11:16:05AM +0200, Florian Weimer wrote:
> On 09/01/2014 07:24 PM, Rich Felker wrote:
>
> >>I'm not sure if this is still on the table, but experience shows
> >>that the realloc interface is error-prone for another reason: The
> >>straight way to write an a reallocation,
> >>
> >> ptr = realloc(ptr, new_size);
> >>
> >>leads to a memory leak on error. It would be less error-prone to
> >>have reallocarray to update the pointer directly on success, e.g.:
> >>
> >> if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
> >> // handle error
> >> }
> >
> >No, allocation functions which take void** are an extremely bad idiom
> >because they encourage UB.
>
> That's why I wrote that reallocarray has to be a macro.
>
> On the other hand, I'm not particularly worried about the aliasing
> violation because according to one reading of the standard, realloc
> returns a pointer to untyped (but partially initialized) memory,
> which needs out-of-language support anyway.
This is a non-sequitur. Even if "realloc needs out-of-language
support" (a topic I don't want to mix with this discussion), that
would not be a reason to preclude unrelated compiler optimization. The
fact that T* and void* cannot alias is unrelated to anything about the
contents of the member realloc returns.
> >Also, reallocarray is already defined by OpenBSD and perhaps others
> >with a particular signature.
>
> We'd have to give the fixed version a separate name, to avoid
> another strerror_r-like fiasco.
I think it's much better just to avoid it. Like I said the "fix"
actually makes a worse interface that's clunkier to use. And if the
caller is just going to abort when reallocarray fails (this is very
bad practice, but it's common GNU practice) then there's no point in
using a temporary anyway. Direct assignment works fine.
Rich
@@ -1,3 +1,12 @@
+2014-05-18 Rüdiger Sonderfeld <ruediger@c-plusplus.de>
+
+ * malloc/Versions: Add reallocarray and __libc_rallocarray.
+ * malloc/Makefile (tests): Add tst-reallocarray.c.
+ * malloc/tst-reallocarray.c: New test file.
+ * malloc/malloc.h (reallocarray): New declaration.
+ * stdlib/stdlib.h (reallocarray): Likewise.
+ * malloc/malloc.c (__libc_reallocarray): New function.
+
2014-05-17 Jose E. Marchesi <jose.marchesi@oracle.com>
[BZ #16958]
@@ -26,7 +26,7 @@ dist-headers := malloc.h
headers := $(dist-headers) obstack.h mcheck.h
tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
tst-mallocstate tst-mcheck tst-mallocfork tst-trim1 \
- tst-malloc-usable tst-realloc tst-posix_memalign \
+ tst-malloc-usable tst-realloc tst-reallocarray tst-posix_memalign \
tst-pvalloc tst-memalign tst-mallopt
test-srcs = tst-mtrace
@@ -61,6 +61,10 @@ libc {
GLIBC_2.16 {
aligned_alloc;
}
+ GLIBC_2.20 {
+ __libc_reallocarray;
+ reallocarray;
+ }
GLIBC_PRIVATE {
# Internal startup hook for libpthread.
__libc_malloc_pthread_startup;
@@ -2944,6 +2944,26 @@ void *weak_variable (*__memalign_hook)
libc_hidden_def (__libc_free)
void *
+__libc_reallocarray(void *optr, size_t nmemb, size_t elem_size)
+{
+ /* Overflow handling from __libc_calloc: */
+
+ /* size_t is unsigned so the behavior on overflow is defined. */
+ INTERNAL_SIZE_T bytes = nmemb * elem_size;
+#define HALF_INTERNAL_SIZE_T \
+ (((INTERNAL_SIZE_T) 1) << (8 * sizeof (INTERNAL_SIZE_T) / 2))
+ if (__builtin_expect ((nmemb | elem_size) >= HALF_INTERNAL_SIZE_T, 0))
+ {
+ if (elem_size != 0 && bytes / elem_size != nmemb)
+ {
+ __set_errno (ENOMEM);
+ return 0;
+ }
+ }
+ return __libc_realloc (optr, bytes);
+}
+
+void *
__libc_realloc (void *oldmem, size_t bytes)
{
mstate ar_ptr;
@@ -5171,6 +5191,8 @@ struct mallinfo
strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)
strong_alias (__libc_memalign, __memalign)
weak_alias (__libc_memalign, memalign)
+strong_alias (__libc_reallocarray, __reallocarray)
+ strong_alias (__libc_reallocarray, reallocarray)
strong_alias (__libc_realloc, __realloc) strong_alias (__libc_realloc, realloc)
strong_alias (__libc_valloc, __valloc) weak_alias (__libc_valloc, valloc)
strong_alias (__libc_pvalloc, __pvalloc) weak_alias (__libc_pvalloc, pvalloc)
@@ -49,6 +49,14 @@ extern void *calloc (size_t __nmemb, size_t __size)
extern void *realloc (void *__ptr, size_t __size)
__THROW __attribute_warn_unused_result__;
+/* Re-allocate the previously allocated block in PTR, making the new
+ block large enough for NMEMB elements of SIZE bytes each. */
+/* __attribute_malloc__ is not used, because if realloc returns
+ the same pointer that was passed to it, aliasing needs to be allowed
+ between objects pointed by the old and new pointers. */
+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
+ __THROW __attribute_warn_unused_result__;
+
/* Free a block allocated by `malloc', `realloc' or `calloc'. */
extern void free (void *__ptr) __THROW;
new file mode 100644
@@ -0,0 +1,161 @@
+/* Copyright (C) 2014 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <errno.h>
+#include <malloc.h>
+#include <stdio.h>
+#include <math.h>
+#include <string.h>
+
+static int errors = 0;
+
+static void
+merror (const char *msg)
+{
+ ++errors;
+ printf ("Error: %s.\n", msg);
+}
+
+static int
+do_test (void)
+
+{
+ void *ptr = NULL;
+ void *ptr2 = NULL;
+ unsigned char *c;
+ size_t i;
+ int ok;
+ const size_t max = ~(size_t)0;
+ size_t a, b;
+
+ /* Test overflow detection. */
+ errno = 0;
+ ptr = reallocarray (NULL, max, 2);
+ if (ptr)
+ {
+ merror ("Overflow for size_t MAX * 2 not detected");
+ free(ptr);
+ }
+ else if (errno != ENOMEM)
+ merror ("errno is not set correctly");
+
+ errno = 0;
+ ptr = reallocarray (NULL, 2, max);
+ if (ptr)
+ {
+ merror ("Overflow for 2 * size_t MAX not detected");
+ free(ptr);
+ }
+ else if (errno != ENOMEM)
+ merror ("errno is not set correctly");
+
+ a = 65537;
+ b = max/65537 + 1;
+ errno = 0;
+ ptr = reallocarray (NULL, a, b);
+ if (ptr)
+ {
+ merror ("Overflow for (size_t MAX/65537 + 1) * 65537 not detected");
+ free(ptr);
+ }
+ else if (errno != ENOMEM)
+ merror ("errno is not set correctly");
+
+ errno = 0;
+ ptr = reallocarray (NULL, b, a);
+ if (ptr)
+ {
+ merror ("Overflow for 65537 * (size_t MAX/65537 + 1) not detected");
+ free(ptr);
+ }
+ else if (errno != ENOMEM)
+ merror ("errno is not set correctly");
+
+ /* Test realloc-like behavior. */
+ /* Allocate memory like malloc. */
+ ptr = reallocarray(NULL, 10, 2);
+ if (!ptr)
+ merror ("realloc(NULL, 10, 2) failed");
+
+ memset (ptr, 0xAF, 10*2);
+
+ /* Enlarge buffer. */
+ ptr2 = reallocarray(ptr, 20, 2);
+ if (!ptr2)
+ merror ("realloc(ptr, 20, 2) failed (enlarge)");
+ else
+ ptr = ptr2;
+
+ c = ptr;
+ ok = 1;
+ for (i = 0; i < 10*2; ++i)
+ {
+ if (c[i] != 0xAF)
+ ok = 0;
+ }
+ if (!ok)
+ merror ("Enlarging changed buffer content (10*2)");
+
+ /* Decrease buffer size. */
+ ptr2 = reallocarray(ptr, 5, 3);
+ if (!ptr2)
+ merror ("realloc(ptr, 5, 3) failed (decrease)");
+ else
+ ptr = ptr2;
+
+ c = ptr;
+ ok = 1;
+ for (i = 0; i < 5*3; ++i)
+ {
+ if (c[i] != 0xAF)
+ ok = 0;
+ }
+ if (!ok)
+ merror ("Reducing changed buffer content (5*3)");
+
+ /* Overflow should leave buffer untouched. */
+ errno = 0;
+ ptr2 = reallocarray(ptr, 2, ~(size_t)0);
+ if (ptr2)
+ merror ("realloc(ptr, 2, size_t MAX) failed to detect overflow");
+ if (errno != ENOMEM)
+ merror ("errno not set correctly");
+
+ c = ptr;
+ ok = 1;
+ for (i = 0; i < 5*3; ++i)
+ {
+ if (c[i] != 0xAF)
+ ok = 0;
+ }
+ if (!ok)
+ merror ("Overflow changed buffer content (5*3)");
+
+ /* Free buffer (glibc). */
+ errno = 0;
+ ptr2 = reallocarray (ptr, 0, 0);
+ if (ptr2)
+ merror ("reallocarray (ptr, 0, 0) returned non-NULL");
+
+ free (ptr2);
+
+ //return errors != 0;
+ return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
@@ -479,6 +479,13 @@ extern void *calloc (size_t __nmemb, size_t __size)
between objects pointed by the old and new pointers. */
extern void *realloc (void *__ptr, size_t __size)
__THROW __attribute_warn_unused_result__;
+/* Re-allocate the previously allocated block in PTR, making the new
+ block large enough for NMEMB elements of SIZE bytes each. */
+/* __attribute_malloc__ is not used, because if realloc returns
+ the same pointer that was passed to it, aliasing needs to be allowed
+ between objects pointed by the old and new pointers. */
+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
+ __THROW __attribute_warn_unused_result__;
/* Free a block allocated by `malloc', `realloc' or `calloc'. */
extern void free (void *__ptr) __THROW;
__END_NAMESPACE_STD