Patchwork [BZ,20628] make mallinfo saturating

login
register
mail settings
Submitter DJ Delorie
Date Sept. 27, 2016, 6:09 p.m.
Message ID <xny42drz8h.fsf@greed.delorie.com>
Download mbox | patch
Permalink /patch/16084/
State New
Headers show

Comments

DJ Delorie - Sept. 27, 2016, 6:09 p.m.
The mallinfo() API is fixed as "int" even though on 64-bit systems,
where size_t is bigger than int, these can overflow.  The man pages
document this behavior.  The API cannot be changed.  However, this
patch changes the "returns incorrect values" case from overflowing to
saturation, so that the bogus values are easily spotted as 0x7fffffff
(INT_MAX in that case) instead of being an arbitrary (possibly
"reasonable") overflowed value.

A demonstration case is included in this email (and the bz, with
output), but as testing it requires at least 2 Gb of free memory, I
have not included a test case in the source tree at this time.

Built and tested with no new warnings or failures on both Fedora 20
and rawhide (no surprise, no tests test mallinfo()) on x86_64.

Man page change: After "However, because some internal bookkeeping
values may be of type long, the reported values may wrap around zero
and thus be inaccurate." add "In versions of glibc after X.YY, the
reported values will saturate at INT_MAX instead of overflowing."

Related BZs:
3973 - malloc_stats/struct mallinfo is not 64bits ready (WONTFIX)
5711 - mallinfo structure cannot hold 64 bit data (WONTFIX)

2016-09-27  DJ Delorie  <dj@delorie.com>

	[BZ #20628]
	* malloc/malloc.c (int_mallinfo): Use saturating add instead of
	overflow to collect statistics into a fixed "int" container.

-------------------- mallinfo-overflow-demo.c
#include <stdlib.h>
#include <stdio.h>
#include <malloc.h>

void
print_stats ()
{
  struct mallinfo m = mallinfo ();
  printf("Malloc stats...\n");
#define F(x) printf("%-8s 0x%08x %d\n", #x, m.x, m.x);
  F(arena);
  F(ordblks);
  F(smblks);
  F(hblks);
  F(hblkhd);
  F(usmblks);
  F(fsmblks);
  F(uordblks);
  F(fordblks);
  F(keepcost);
}

int
main()
{
  int i;
  print_stats ();
  for (i=0; i<0x22345678; i++)
    malloc (0x10);
  print_stats ();
}
--------------------

$ ./mallinfo-overflow-demo 
Malloc stats...
arena    0x00000000 0
...
uordblks 0x00000000 0
Malloc stats...
arena    0x468b0000 1183514624
...
uordblks 0x468acf00 1183502080
...

$ ~/tools/upstream/glibc.mallinfo.build/testrun.sh ./mallinfo-overflow-demo
Malloc stats...
arena    0x00000000 0
...
uordblks 0x00000000 0
...
Malloc stats...
arena    0x7fffffff 2147483647
...
uordblks 0x7fffffff 2147483647
...

--------------------
Paul Eggert - Sept. 27, 2016, 7:07 p.m.
It would be more backward-compatible to represent minor overflows as 
negative numbers that are equivalent to the correct answers modulo 
(UINT_MAX + 1). That way, callers can continue to retrieve the correct 
values by casting int to unsigned. The code can use -1 to represent a 
value greater than UINT_MAX. This all should be doable just as 
efficiently as the proposed patch.
Carlos O'Donell - Sept. 28, 2016, 8:26 p.m.
On 09/27/2016 03:07 PM, Paul Eggert wrote:
> It would be more backward-compatible to represent minor overflows as
> negative numbers that are equivalent to the correct answers modulo
> (UINT_MAX + 1). That way, callers can continue to retrieve the
> correct values by casting int to unsigned. The code can use -1 to
> represent a value greater than UINT_MAX. This all should be doable
> just as efficiently as the proposed patch.
 
Agreed, I had not considered that case. It would certainly make the
interface as useful as it could be with a 32-bit address space, but
it would still be mostly useless on a 64-bit system (even with 48-bit VA).

I just double checked that C11 does continue to contain the clause
that allows the conversion to work:
~~~
6.3.1.3
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.60)
60) The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
~~~

This way you could cast the values to 'unsigned int' and know you had
a valid result as long as it was less than '(unsigned int)-1' (reserved
for overflow).

DJ, Care to make a version 2 of the patch?

Patch

diff --git a/malloc/malloc.c b/malloc/malloc.c
index ef04360..4f438ef 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -224,6 +224,7 @@ 
 #include <unistd.h>
 #include <stdio.h>    /* needed for malloc_stats */
 #include <errno.h>
+#include <limits.h>
 
 #include <shlib-compat.h>
 
@@ -4633,18 +4634,29 @@  int_mallinfo (mstate av, struct mallinfo *m)
         }
     }
 
-  m->smblks += nfastblocks;
-  m->ordblks += nblocks;
-  m->fordblks += avail;
-  m->uordblks += av->system_mem - avail;
-  m->arena += av->system_mem;
-  m->fsmblks += fastavail;
+/* Saturated add - add ADD to SUM.  If the result exceeds the range of
+   "int", set SUM to INT_MAX instead.  Assumes ADD and SUM are
+   positive.  The published ABI prevents us from bumping "int" to a
+   larger type.  */
+#define SAT_ADD(SUM, ADD) \
+  ({ INTERNAL_SIZE_T tmp = (INTERNAL_SIZE_T)(SUM) + (INTERNAL_SIZE_T)(ADD); SUM = (tmp > INT_MAX) ? INT_MAX : tmp; })
+
+/* Likewise, but assign ADD to SUM.  */
+#define SAT_SET(SUM, ADD) \
+  ({ SUM = ((INTERNAL_SIZE_T)(ADD) > INT_MAX) ? INT_MAX : (ADD); })
+
+  SAT_ADD (m->smblks, nfastblocks);
+  SAT_ADD (m->ordblks, nblocks);
+  SAT_ADD (m->fordblks, avail);
+  SAT_ADD (m->uordblks, av->system_mem - avail);
+  SAT_ADD (m->arena, av->system_mem);
+  SAT_ADD (m->fsmblks, fastavail);
   if (av == &main_arena)
     {
-      m->hblks = mp_.n_mmaps;
-      m->hblkhd = mp_.mmapped_mem;
-      m->usmblks = 0;
-      m->keepcost = chunksize (av->top);
+      SAT_SET (m->hblks, mp_.n_mmaps);
+      SAT_SET (m->hblkhd, mp_.mmapped_mem);
+      SAT_SET (m->usmblks, 0);
+      SAT_SET (m->keepcost, chunksize (av->top));
     }
 }