[RFC] Add reallocarray function.

Message ID 5040375.QgetdG6juS@descartes
State Superseded
Headers

Commit Message

Rüdiger Sonderfeld May 18, 2014, 8:32 p.m. UTC
  The reallocarray function is an extension from OpenBSD.  It is an
integer-overflow-safe replacement for realloc(p, X*Y) and
malloc(X*Y) (realloc(NULL, X*Y)).  It can therefore help in preventing
certain security issues in code.

See
http://www.openbsd.org/cgi-bin/man.cgi?query=reallocarray&sektion=3&manpath=OpenBSD+Current

I think this would be a useful addition to the glibc as well.  The patch
currently contains no documentation.  I will add it to the patch if
there is interest in including the function.  I have signed the FSF
papers.

The code reuses the integer overflow check from __libc_calloc (that's
why it is using __builtin_expect instead of the recommended
__glibc_unlikely) and on success simply calls __libc_realloc.

Please CC me in replies because I'm not subscribed to libc-alpha.
---
 ChangeLog                 |   9 +++
 malloc/Makefile           |   2 +-
 malloc/Versions           |   4 ++
 malloc/malloc.c           |  22 +++++++
 malloc/malloc.h           |   8 +++
 malloc/tst-reallocarray.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++
 stdlib/stdlib.h           |   7 ++
 7 files changed, 212 insertions(+), 1 deletion(-)
 create mode 100644 malloc/tst-reallocarray.c
  

Comments

Paul Eggert May 18, 2014, 9:05 p.m. UTC | #1
Rüdiger Sonderfeld wrote:
> +  /* size_t is unsigned so the behavior on overflow is defined.  */
> +  INTERNAL_SIZE_T bytes = nmemb * elem_size;
> +#define HALF_INTERNAL_SIZE_T                                    \
> +  (((INTERNAL_SIZE_T) 1) << (8 * sizeof (INTERNAL_SIZE_T) / 2))
> +  if (__builtin_expect ((nmemb | elem_size) >= HALF_INTERNAL_SIZE_T, 0))
> +    {
> +      if (elem_size != 0 && bytes / elem_size != nmemb)

Instead of cut and pasting this code from calloc, please refactor so 
that the code is present only once, with the goal of optimizing it when 
the GCC folks get their act together and have a function like the 
__builtin_umul_overflow function that Clang has had since January.  This 
will let calloc and reallocarray do the unsigned multiplication and 
inspect the hardware's overflow bit directly, which is nicer than the 
above hackery.
  
Florian Weimer Sept. 1, 2014, 3:48 p.m. UTC | #2
On 05/18/2014 10:32 PM, Rüdiger Sonderfeld wrote:
> +/* Re-allocate the previously allocated block in PTR, making the new
> +   block large enough for NMEMB elements of SIZE bytes each.  */
> +/* __attribute_malloc__ is not used, because if realloc returns
> +   the same pointer that was passed to it, aliasing needs to be allowed
> +   between objects pointed by the old and new pointers.  */
> +extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
> +     __THROW __attribute_warn_unused_result__;

I'm not sure if this is still on the table, but experience shows that 
the realloc interface is error-prone for another reason: The straight 
way to write an a reallocation,

   ptr = realloc(ptr, new_size);

leads to a memory leak on error.  It would be less error-prone to have 
reallocarray to update the pointer directly on success, e.g.:

   if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
     // handle error
   }

However, this cannot be implemented as a C function, only as a macro.
  
Rich Felker Sept. 1, 2014, 5:24 p.m. UTC | #3
On Mon, Sep 01, 2014 at 05:48:38PM +0200, Florian Weimer wrote:
> On 05/18/2014 10:32 PM, Rüdiger Sonderfeld wrote:
> >+/* Re-allocate the previously allocated block in PTR, making the new
> >+   block large enough for NMEMB elements of SIZE bytes each.  */
> >+/* __attribute_malloc__ is not used, because if realloc returns
> >+   the same pointer that was passed to it, aliasing needs to be allowed
> >+   between objects pointed by the old and new pointers.  */
> >+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
> >+     __THROW __attribute_warn_unused_result__;
> 
> I'm not sure if this is still on the table, but experience shows
> that the realloc interface is error-prone for another reason: The
> straight way to write an a reallocation,
> 
>   ptr = realloc(ptr, new_size);
> 
> leads to a memory leak on error.  It would be less error-prone to
> have reallocarray to update the pointer directly on success, e.g.:
> 
>   if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
>     // handle error
>   }

No, allocation functions which take void** are an extremely bad idiom
because they encourage UB. It's rare that ptr actually has type void*
(usually it has type T* for whatever type T the array members have).
So to use an allocation function that takes void**, a correct program
must use a temporary pointer, as in:

void *tmp = ptr;
if (reallocarray(&tmp, new_count, sizeof(T)) < 0) { ... }
ptr = tmp;

Note that this is even more awkward than the current situation with
realloc. However, almost everyone will actually write:

if (reallocarray((void **)&ptr, new_countl sizeof(T)) < 0) { ... }

thereby invoking UB via an aliasing violation.

Also, reallocarray is already defined by OpenBSD and perhaps others
with a particular signature. Defining an incompatible function with
the same name is just asking for trouble. In particular, if any
program using the proposed GNU variant with your semantics, but links
to a library which provides its own replacement function with the BSD
semantics, the symbols will conflict and the wrong function will get
called.

> However, this cannot be implemented as a C function, only as a macro.

Perhaps you meant to solve the aliasing violation a macro? It's
possible with GNU C statement-expressions, but ugly, and not really
clean. Perhaps there's a way to do it without any nonstandard
extensions too, but I don't think this kind of ugly hack that LOOKS
LIKE an aliasing violation until you decipher the macro belongs in
glibc.

Rich
  
Florian Weimer Sept. 2, 2014, 9:16 a.m. UTC | #4
On 09/01/2014 07:24 PM, Rich Felker wrote:

>> I'm not sure if this is still on the table, but experience shows
>> that the realloc interface is error-prone for another reason: The
>> straight way to write an a reallocation,
>>
>>    ptr = realloc(ptr, new_size);
>>
>> leads to a memory leak on error.  It would be less error-prone to
>> have reallocarray to update the pointer directly on success, e.g.:
>>
>>    if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
>>      // handle error
>>    }
>
> No, allocation functions which take void** are an extremely bad idiom
> because they encourage UB.

That's why I wrote that reallocarray has to be a macro.

On the other hand, I'm not particularly worried about the aliasing 
violation because according to one reading of the standard, realloc 
returns a pointer to untyped (but partially initialized) memory, which 
needs out-of-language support anyway.

> Also, reallocarray is already defined by OpenBSD and perhaps others
> with a particular signature.

We'd have to give the fixed version a separate name, to avoid another 
strerror_r-like fiasco.
  
Rich Felker Sept. 2, 2014, 1:02 p.m. UTC | #5
On Tue, Sep 02, 2014 at 11:16:05AM +0200, Florian Weimer wrote:
> On 09/01/2014 07:24 PM, Rich Felker wrote:
> 
> >>I'm not sure if this is still on the table, but experience shows
> >>that the realloc interface is error-prone for another reason: The
> >>straight way to write an a reallocation,
> >>
> >>   ptr = realloc(ptr, new_size);
> >>
> >>leads to a memory leak on error.  It would be less error-prone to
> >>have reallocarray to update the pointer directly on success, e.g.:
> >>
> >>   if (reallocarray(&ptr, new_count, sizeof(T)) < 0) {
> >>     // handle error
> >>   }
> >
> >No, allocation functions which take void** are an extremely bad idiom
> >because they encourage UB.
> 
> That's why I wrote that reallocarray has to be a macro.
> 
> On the other hand, I'm not particularly worried about the aliasing
> violation because according to one reading of the standard, realloc
> returns a pointer to untyped (but partially initialized) memory,
> which needs out-of-language support anyway.

This is a non-sequitur. Even if "realloc needs out-of-language
support" (a topic I don't want to mix with this discussion), that
would not be a reason to preclude unrelated compiler optimization. The
fact that T* and void* cannot alias is unrelated to anything about the
contents of the member realloc returns.

> >Also, reallocarray is already defined by OpenBSD and perhaps others
> >with a particular signature.
> 
> We'd have to give the fixed version a separate name, to avoid
> another strerror_r-like fiasco.

I think it's much better just to avoid it. Like I said the "fix"
actually makes a worse interface that's clunkier to use. And if the
caller is just going to abort when reallocarray fails (this is very
bad practice, but it's common GNU practice) then there's no point in
using a temporary anyway. Direct assignment works fine.

Rich
  

Patch

diff --git a/ChangeLog b/ChangeLog
index c606b0d..d325cc4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@ 
+2014-05-18  Rüdiger Sonderfeld  <ruediger@c-plusplus.de>
+
+	* malloc/Versions: Add reallocarray and __libc_rallocarray.
+	* malloc/Makefile (tests): Add tst-reallocarray.c.
+	* malloc/tst-reallocarray.c: New test file.
+	* malloc/malloc.h (reallocarray): New declaration.
+	* stdlib/stdlib.h (reallocarray): Likewise.
+	* malloc/malloc.c (__libc_reallocarray): New function.
+
 2014-05-17  Jose E. Marchesi  <jose.marchesi@oracle.com>
 
 	[BZ #16958]
diff --git a/malloc/Makefile b/malloc/Makefile
index 7a716f9..1b415ae 100644
--- a/malloc/Makefile
+++ b/malloc/Makefile
@@ -26,7 +26,7 @@  dist-headers := malloc.h
 headers := $(dist-headers) obstack.h mcheck.h
 tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
 	 tst-mallocstate tst-mcheck tst-mallocfork tst-trim1 \
-	 tst-malloc-usable tst-realloc tst-posix_memalign \
+	 tst-malloc-usable tst-realloc tst-reallocarray tst-posix_memalign \
 	 tst-pvalloc tst-memalign tst-mallopt
 test-srcs = tst-mtrace
 
diff --git a/malloc/Versions b/malloc/Versions
index 7ca9bdf..64fade5 100644
--- a/malloc/Versions
+++ b/malloc/Versions
@@ -61,6 +61,10 @@  libc {
   GLIBC_2.16 {
     aligned_alloc;
   }
+  GLIBC_2.20 {
+    __libc_reallocarray;
+    reallocarray;
+  }
   GLIBC_PRIVATE {
     # Internal startup hook for libpthread.
     __libc_malloc_pthread_startup;
diff --git a/malloc/malloc.c b/malloc/malloc.c
index 1120d4d..e86e10e 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -2944,6 +2944,26 @@  void *weak_variable (*__memalign_hook)
 libc_hidden_def (__libc_free)
 
 void *
+__libc_reallocarray(void *optr, size_t nmemb, size_t elem_size)
+{
+  /* Overflow handling from __libc_calloc: */
+
+  /* size_t is unsigned so the behavior on overflow is defined.  */
+  INTERNAL_SIZE_T bytes = nmemb * elem_size;
+#define HALF_INTERNAL_SIZE_T                                    \
+  (((INTERNAL_SIZE_T) 1) << (8 * sizeof (INTERNAL_SIZE_T) / 2))
+  if (__builtin_expect ((nmemb | elem_size) >= HALF_INTERNAL_SIZE_T, 0))
+    {
+      if (elem_size != 0 && bytes / elem_size != nmemb)
+        {
+          __set_errno (ENOMEM);
+          return 0;
+        }
+    }
+  return __libc_realloc (optr, bytes);
+}
+
+void *
 __libc_realloc (void *oldmem, size_t bytes)
 {
   mstate ar_ptr;
@@ -5171,6 +5191,8 @@  struct mallinfo
 strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)
 strong_alias (__libc_memalign, __memalign)
 weak_alias (__libc_memalign, memalign)
+strong_alias (__libc_reallocarray, __reallocarray)
+  strong_alias (__libc_reallocarray, reallocarray)
 strong_alias (__libc_realloc, __realloc) strong_alias (__libc_realloc, realloc)
 strong_alias (__libc_valloc, __valloc) weak_alias (__libc_valloc, valloc)
 strong_alias (__libc_pvalloc, __pvalloc) weak_alias (__libc_pvalloc, pvalloc)
diff --git a/malloc/malloc.h b/malloc/malloc.h
index 30bb91a..db7e5a9 100644
--- a/malloc/malloc.h
+++ b/malloc/malloc.h
@@ -49,6 +49,14 @@  extern void *calloc (size_t __nmemb, size_t __size)
 extern void *realloc (void *__ptr, size_t __size)
 __THROW __attribute_warn_unused_result__;
 
+/* Re-allocate the previously allocated block in PTR, making the new
+   block large enough for NMEMB elements of SIZE bytes each.  */
+/* __attribute_malloc__ is not used, because if realloc returns
+   the same pointer that was passed to it, aliasing needs to be allowed
+   between objects pointed by the old and new pointers.  */
+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
+     __THROW __attribute_warn_unused_result__;
+
 /* Free a block allocated by `malloc', `realloc' or `calloc'.  */
 extern void free (void *__ptr) __THROW;
 
diff --git a/malloc/tst-reallocarray.c b/malloc/tst-reallocarray.c
new file mode 100644
index 0000000..608ad64
--- /dev/null
+++ b/malloc/tst-reallocarray.c
@@ -0,0 +1,161 @@ 
+/* Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <malloc.h>
+#include <stdio.h>
+#include <math.h>
+#include <string.h>
+
+static int errors = 0;
+
+static void
+merror (const char *msg)
+{
+  ++errors;
+  printf ("Error: %s.\n", msg);
+}
+
+static int
+do_test (void)
+
+{
+  void *ptr = NULL;
+  void *ptr2 = NULL;
+  unsigned char *c;
+  size_t i;
+  int ok;
+  const size_t max = ~(size_t)0;
+  size_t a, b;
+
+  /* Test overflow detection.  */
+  errno = 0;
+  ptr = reallocarray (NULL, max, 2);
+  if (ptr)
+    {
+      merror ("Overflow for size_t MAX * 2 not detected");
+      free(ptr);
+    }
+  else if (errno != ENOMEM)
+    merror ("errno is not set correctly");
+
+  errno = 0;
+  ptr = reallocarray (NULL, 2, max);
+  if (ptr)
+    {
+      merror ("Overflow for 2 * size_t MAX not detected");
+      free(ptr);
+    }
+  else if (errno != ENOMEM)
+    merror ("errno is not set correctly");
+
+  a = 65537;
+  b = max/65537 + 1;
+  errno = 0;
+  ptr = reallocarray (NULL, a, b);
+  if (ptr)
+    {
+      merror ("Overflow for (size_t MAX/65537 + 1) * 65537 not detected");
+      free(ptr);
+    }
+  else if (errno != ENOMEM)
+    merror ("errno is not set correctly");
+
+  errno = 0;
+  ptr = reallocarray (NULL, b, a);
+  if (ptr)
+    {
+      merror ("Overflow for 65537 * (size_t MAX/65537 + 1)  not detected");
+      free(ptr);
+    }
+  else if (errno != ENOMEM)
+    merror ("errno is not set correctly");
+
+  /* Test realloc-like behavior.  */
+  /* Allocate memory like malloc.  */
+  ptr = reallocarray(NULL, 10, 2);
+  if (!ptr)
+    merror ("realloc(NULL, 10, 2) failed");
+
+  memset (ptr, 0xAF, 10*2);
+
+  /* Enlarge buffer.   */
+  ptr2 = reallocarray(ptr, 20, 2);
+  if (!ptr2)
+    merror ("realloc(ptr, 20, 2) failed (enlarge)");
+  else
+    ptr = ptr2;
+
+  c = ptr;
+  ok = 1;
+  for (i = 0; i < 10*2; ++i)
+    {
+      if (c[i] != 0xAF)
+        ok = 0;
+    }
+  if (!ok)
+    merror ("Enlarging changed buffer content (10*2)");
+
+  /* Decrease buffer size.  */
+  ptr2 = reallocarray(ptr, 5, 3);
+  if (!ptr2)
+    merror ("realloc(ptr, 5, 3) failed (decrease)");
+  else
+    ptr = ptr2;
+
+  c = ptr;
+  ok = 1;
+  for (i = 0; i < 5*3; ++i)
+    {
+      if (c[i] != 0xAF)
+        ok = 0;
+    }
+  if (!ok)
+    merror ("Reducing changed buffer content (5*3)");
+
+  /* Overflow should leave buffer untouched.  */
+  errno = 0;
+  ptr2 = reallocarray(ptr, 2, ~(size_t)0);
+  if (ptr2)
+    merror ("realloc(ptr, 2, size_t MAX) failed to detect overflow");
+  if (errno != ENOMEM)
+    merror ("errno not set correctly");
+
+  c = ptr;
+  ok = 1;
+  for (i = 0; i < 5*3; ++i)
+    {
+      if (c[i] != 0xAF)
+        ok = 0;
+    }
+  if (!ok)
+    merror ("Overflow changed buffer content (5*3)");
+
+  /* Free buffer (glibc).  */
+  errno = 0;
+  ptr2 = reallocarray (ptr, 0, 0);
+  if (ptr2)
+    merror ("reallocarray (ptr, 0, 0) returned non-NULL");
+
+  free (ptr2);
+
+  //return errors != 0;
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/stdlib/stdlib.h b/stdlib/stdlib.h
index 00329a2..b75c28f 100644
--- a/stdlib/stdlib.h
+++ b/stdlib/stdlib.h
@@ -479,6 +479,13 @@  extern void *calloc (size_t __nmemb, size_t __size)
    between objects pointed by the old and new pointers.  */
 extern void *realloc (void *__ptr, size_t __size)
      __THROW __attribute_warn_unused_result__;
+/* Re-allocate the previously allocated block in PTR, making the new
+   block large enough for NMEMB elements of SIZE bytes each.  */
+/* __attribute_malloc__ is not used, because if realloc returns
+   the same pointer that was passed to it, aliasing needs to be allowed
+   between objects pointed by the old and new pointers.  */
+extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size)
+     __THROW __attribute_warn_unused_result__;
 /* Free a block allocated by `malloc', `realloc' or `calloc'.  */
 extern void free (void *__ptr) __THROW;
 __END_NAMESPACE_STD