Use size_t for mallinfo fields.

Message ID e85c11c0-6b16-8298-39a3-320655c2bee6@suse.cz
State Committed
Headers
Series Use size_t for mallinfo fields. |

Commit Message

Martin Liška July 7, 2020, noon UTC
  Hi.

The current int type can easily overflow for allocation of more
than 4GB.

The following patch changes that to size_t. I guess I need to adjust
the API version of the function, right?

Thanks,
Martin

---
  malloc/malloc.h    | 20 ++++++++++----------
  manual/memory.texi | 22 +++++++++++-----------
  2 files changed, 21 insertions(+), 21 deletions(-)
  

Comments

Andreas Schwab July 7, 2020, 12:17 p.m. UTC | #1
On Jul 07 2020, Martin Liška wrote:

> The current int type can easily overflow for allocation of more
> than 4GB.
>
> The following patch changes that to size_t. I guess I need to adjust
> the API version of the function, right?

Not only that, it breaks the ABI of mallinfo.

Andreas.
  
Martin Liška July 7, 2020, 1:07 p.m. UTC | #2
On 7/7/20 2:17 PM, Andreas Schwab wrote:
> On Jul 07 2020, Martin Liška wrote:
> 
>> The current int type can easily overflow for allocation of more
>> than 4GB.
>>
>> The following patch changes that to size_t. I guess I need to adjust
>> the API version of the function, right?
> 
> Not only that, it breaks the ABI of mallinfo.

Sure, so what options do I have? I'm new to glibc so a hint would be appreciated.

Thanks,
Martin

> 
> Andreas.
>
  
H.J. Lu July 7, 2020, 1:19 p.m. UTC | #3
On Tue, Jul 7, 2020 at 6:08 AM Martin Liška <mliska@suse.cz> wrote:
>
> On 7/7/20 2:17 PM, Andreas Schwab wrote:
> > On Jul 07 2020, Martin Liška wrote:
> >
> >> The current int type can easily overflow for allocation of more
> >> than 4GB.
> >>
> >> The following patch changes that to size_t. I guess I need to adjust
> >> the API version of the function, right?
> >
> > Not only that, it breaks the ABI of mallinfo.
>
> Sure, so what options do I have? I'm new to glibc so a hint would be appreciated.
>

You need to keep the old version of mallinfo in libc.so.  There are
plenty of examples
in glibc source:

malloc/hooks.c:#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_25)
malloc/hooks.c:#endif   /* SHLIB_COMPAT */
malloc/malloc.c:#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_24)
malloc/malloc.c:#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_26)
malloc/obstack.c:#  if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_3_4)
  
Florian Weimer July 7, 2020, 1:49 p.m. UTC | #4
* Martin Liška:

> On 7/7/20 2:17 PM, Andreas Schwab wrote:
>> On Jul 07 2020, Martin Liška wrote:
>>
>>> The current int type can easily overflow for allocation of more
>>> than 4GB.
>>>
>>> The following patch changes that to size_t. I guess I need to adjust
>>> the API version of the function, right?
>>
>> Not only that, it breaks the ABI of mallinfo.
>
> Sure, so what options do I have? I'm new to glibc so a hint would be
> appreciated.

We need to add a new function.  Symbol versioning does not work because
mallinfo is interposed by alternative mallocs (tcmalloc, Address
Sanitizer, etc.).  Without the new function name, the interposer does
not know which ABI the application expects, so it's going to be quite
messy.

Thanks,
Florian
  
Martin Liška July 7, 2020, 1:52 p.m. UTC | #5
On 7/7/20 3:49 PM, Florian Weimer wrote:
> * Martin Liška:
> 
>> On 7/7/20 2:17 PM, Andreas Schwab wrote:
>>> On Jul 07 2020, Martin Liška wrote:
>>>
>>>> The current int type can easily overflow for allocation of more
>>>> than 4GB.
>>>>
>>>> The following patch changes that to size_t. I guess I need to adjust
>>>> the API version of the function, right?
>>>
>>> Not only that, it breaks the ABI of mallinfo.
>>
>> Sure, so what options do I have? I'm new to glibc so a hint would be
>> appreciated.
> 
> We need to add a new function.  Symbol versioning does not work because
> mallinfo is interposed by alternative mallocs (tcmalloc, Address
> Sanitizer, etc.).  Without the new function name, the interposer does
> not know which ABI the application expects, so it's going to be quite
> messy.
> 
> Thanks,
> Florian
> 

All right, am I closer with the suggested patch?

Martin
  
Florian Weimer July 7, 2020, 2:22 p.m. UTC | #6
* Martin Liška:

> On 7/7/20 3:49 PM, Florian Weimer wrote:
>> * Martin Liška:
>>
>>> On 7/7/20 2:17 PM, Andreas Schwab wrote:
>>>> On Jul 07 2020, Martin Liška wrote:
>>>>
>>>>> The current int type can easily overflow for allocation of more
>>>>> than 4GB.
>>>>>
>>>>> The following patch changes that to size_t. I guess I need to adjust
>>>>> the API version of the function, right?
>>>>
>>>> Not only that, it breaks the ABI of mallinfo.
>>>
>>> Sure, so what options do I have? I'm new to glibc so a hint would be
>>> appreciated.
>>
>> We need to add a new function.  Symbol versioning does not work because
>> mallinfo is interposed by alternative mallocs (tcmalloc, Address
>> Sanitizer, etc.).  Without the new function name, the interposer does
>> not know which ABI the application expects, so it's going to be quite
>> messy.

> All right, am I closer with the suggested patch?

If what I wrote above is right (we'd first gather consensus around
that), we should probably add struct mallinfo2 and mallinfo2, deprecate
the original mallinfo function, and eventually remove them from the
public API (turning the original mallinfo into a compatibility symbol).

I suppose it would make sense to raise this issue with the tcmalloc, tbb
and Address Sanitizer people, to see if they would be willing to
implement mallinfo2 on their end.

The end result, having mallinfo2 and not mallinfo, is a bit ugly, but
it's an improvement over the current state.  I do not see a need to get
creative with symbol redirects or symbol versions.

Thanks,
Florian
  
Andreas Schwab July 7, 2020, 2:32 p.m. UTC | #7
On Jul 07 2020, Florian Weimer wrote:

> If what I wrote above is right (we'd first gather consensus around
> that), we should probably add struct mallinfo2 and mallinfo2, deprecate
> the original mallinfo function, and eventually remove them from the
> public API (turning the original mallinfo into a compatibility symbol).

Isn't mallinfo obsoleted by malloc_info anyway?

Andreas.
  
Florian Weimer July 7, 2020, 2:36 p.m. UTC | #8
* Andreas Schwab:

> On Jul 07 2020, Florian Weimer wrote:
>
>> If what I wrote above is right (we'd first gather consensus around
>> that), we should probably add struct mallinfo2 and mallinfo2, deprecate
>> the original mallinfo function, and eventually remove them from the
>> public API (turning the original mallinfo into a compatibility symbol).
>
> Isn't mallinfo obsoleted by malloc_info anyway?

None of the malloc alternatives seem to have picked it up.

Martin, why do you want to change mallinfo, rather than switching to
malloc_info?

Thanks,
Florian
  
Martin Liška July 8, 2020, 7:24 a.m. UTC | #9
On 7/7/20 4:22 PM, Florian Weimer wrote:
> * Martin Liška:
> 
>> On 7/7/20 3:49 PM, Florian Weimer wrote:
>>> * Martin Liška:
>>>
>>>> On 7/7/20 2:17 PM, Andreas Schwab wrote:
>>>>> On Jul 07 2020, Martin Liška wrote:
>>>>>
>>>>>> The current int type can easily overflow for allocation of more
>>>>>> than 4GB.
>>>>>>
>>>>>> The following patch changes that to size_t. I guess I need to adjust
>>>>>> the API version of the function, right?
>>>>>
>>>>> Not only that, it breaks the ABI of mallinfo.
>>>>
>>>> Sure, so what options do I have? I'm new to glibc so a hint would be
>>>> appreciated.
>>>
>>> We need to add a new function.  Symbol versioning does not work because
>>> mallinfo is interposed by alternative mallocs (tcmalloc, Address
>>> Sanitizer, etc.).  Without the new function name, the interposer does
>>> not know which ABI the application expects, so it's going to be quite
>>> messy.
> 
>> All right, am I closer with the suggested patch?
> 
> If what I wrote above is right (we'd first gather consensus around
> that), we should probably add struct mallinfo2 and mallinfo2, deprecate
> the original mallinfo function, and eventually remove them from the
> public API (turning the original mallinfo into a compatibility symbol).

All right, I'm sending patch for that.

> 
> I suppose it would make sense to raise this issue with the tcmalloc, tbb
> and Address Sanitizer people, to see if they would be willing to
> implement mallinfo2 on their end.

Once we're done I can file issues to all these to inform them.

Thoughts?
Martin

> 
> The end result, having mallinfo2 and not mallinfo, is a bit ugly, but
> it's an improvement over the current state.  I do not see a need to get
> creative with symbol redirects or symbol versions.
> 
> Thanks,
> Florian
>
  
Martin Liška July 8, 2020, 7:25 a.m. UTC | #10
On 7/7/20 4:36 PM, Florian Weimer wrote:
> * Andreas Schwab:
> 
>> On Jul 07 2020, Florian Weimer wrote:
>>
>>> If what I wrote above is right (we'd first gather consensus around
>>> that), we should probably add struct mallinfo2 and mallinfo2, deprecate
>>> the original mallinfo function, and eventually remove them from the
>>> public API (turning the original mallinfo into a compatibility symbol).
>>
>> Isn't mallinfo obsoleted by malloc_info anyway?
> 
> None of the malloc alternatives seem to have picked it up.
> 
> Martin, why do you want to change mallinfo, rather than switching to
> malloc_info?

We use it in the GCC to inform about current memory usage (it's handy for LTO debugging).

If I see correctly, for malloc_info, one would have to parse a XML output..

Martin

> 
> Thanks,
> Florian
>
  
Martin Liška July 23, 2020, 10:23 a.m. UTC | #11
PING^1

On 7/8/20 9:24 AM, Martin Liška wrote:
> On 7/7/20 4:22 PM, Florian Weimer wrote:
>> * Martin Liška:
>>
>>> On 7/7/20 3:49 PM, Florian Weimer wrote:
>>>> * Martin Liška:
>>>>
>>>>> On 7/7/20 2:17 PM, Andreas Schwab wrote:
>>>>>> On Jul 07 2020, Martin Liška wrote:
>>>>>>
>>>>>>> The current int type can easily overflow for allocation of more
>>>>>>> than 4GB.
>>>>>>>
>>>>>>> The following patch changes that to size_t. I guess I need to adjust
>>>>>>> the API version of the function, right?
>>>>>>
>>>>>> Not only that, it breaks the ABI of mallinfo.
>>>>>
>>>>> Sure, so what options do I have? I'm new to glibc so a hint would be
>>>>> appreciated.
>>>>
>>>> We need to add a new function.  Symbol versioning does not work because
>>>> mallinfo is interposed by alternative mallocs (tcmalloc, Address
>>>> Sanitizer, etc.).  Without the new function name, the interposer does
>>>> not know which ABI the application expects, so it's going to be quite
>>>> messy.
>>
>>> All right, am I closer with the suggested patch?
>>
>> If what I wrote above is right (we'd first gather consensus around
>> that), we should probably add struct mallinfo2 and mallinfo2, deprecate
>> the original mallinfo function, and eventually remove them from the
>> public API (turning the original mallinfo into a compatibility symbol).
> 
> All right, I'm sending patch for that.
> 
>>
>> I suppose it would make sense to raise this issue with the tcmalloc, tbb
>> and Address Sanitizer people, to see if they would be willing to
>> implement mallinfo2 on their end.
> 
> Once we're done I can file issues to all these to inform them.
> 
> Thoughts?
> Martin
> 
>>
>> The end result, having mallinfo2 and not mallinfo, is a bit ugly, but
>> it's an improvement over the current state.  I do not see a need to get
>> creative with symbol redirects or symbol versions.
>>
>> Thanks,
>> Florian
>>
>
  
Szabolcs Nagy July 23, 2020, 2:38 p.m. UTC | #12
The 07/08/2020 09:24, Martin Liška wrote:
> Subject: [PATCH] Add mallinfo2 function that support sizes >= 4GB.
> 
> The current int type can easily overflow for allocation of more
> than 4GB.

i don't think a new symbol with a similar bad
design as the old one is desirable.

(i think a good design would allow exposing malloc
internal info without abi and api issues when the
internals change, e.g. no struct field names and
struct layout that are tied to internals. and
supportable by other implementations so whatever
gcc is doing would work elsewhere not just on glibc)

one hack that may be acceptable is to use some
nonsensical value in a mallinfo field to signal that
the fields have new meaning, then the api can work
backward compatibly when all field values are less
than 2G and after that things are broken anyway so
we can switch to some different struct content that
has no overflow issue, but abi compatible with the
old struct (e.g. round up the values to multiples of
1M), so we don't need a new abi symbol and interposers
continue to work. (but i'm not entirely sure this
is a good idea either)
  
Martin Liška July 27, 2020, 12:08 p.m. UTC | #13
On 7/23/20 4:38 PM, Szabolcs Nagy wrote:
> The 07/08/2020 09:24, Martin Liška wrote:
>> Subject: [PATCH] Add mallinfo2 function that support sizes >= 4GB.
>>
>> The current int type can easily overflow for allocation of more
>> than 4GB.
> 
> i don't think a new symbol with a similar bad
> design as the old one is desirable.
> 
> (i think a good design would allow exposing malloc
> internal info without abi and api issues when the
> internals change, e.g. no struct field names and
> struct layout that are tied to internals. and
> supportable by other implementations so whatever
> gcc is doing would work elsewhere not just on glibc)

Hello.

All right, I agree with that. So something like:

enum malloc_info
{
   ARENA_BYTES,
   ...
};

size_t get_mallinfo (malloc_info type) ?

That would allow adding new enum values that can supported in the future.

> 
> one hack that may be acceptable is to use some
> nonsensical value in a mallinfo field to signal that
> the fields have new meaning, then the api can work
> backward compatibly when all field values are less
> than 2G and after that things are broken anyway so
> we can switch to some different struct content that
> has no overflow issue, but abi compatible with the
> old struct (e.g. round up the values to multiples of
> 1M), so we don't need a new abi symbol and interposers
> continue to work. (but i'm not entirely sure this
> is a good idea either)

Huh, that's quite hacky approach and I don't like it much :)
Let's forget about the old API/ABI and design a new proper one.

Martin

>
  
Florian Weimer July 27, 2020, 12:21 p.m. UTC | #14
* Martin Liška:

> All right, I agree with that. So something like:
>
> enum malloc_info
> {
>   ARENA_BYTES,
>   ...
> };
>
> size_t get_mallinfo (malloc_info type) ?
>
> That would allow adding new enum values that can supported in the future.

It does not allow to obtain a consistent snapshot of multiple values,
though.

Thanks,
Florian
  
Martin Liška July 27, 2020, 12:45 p.m. UTC | #15
On 7/27/20 2:21 PM, Florian Weimer wrote:
> * Martin Liška:
> 
>> All right, I agree with that. So something like:
>>
>> enum malloc_info
>> {
>>    ARENA_BYTES,
>>    ...
>> };
>>
>> size_t get_mallinfo (malloc_info type) ?
>>
>> That would allow adding new enum values that can supported in the future.
> 
> It does not allow to obtain a consistent snapshot of multiple values,
> though.

Good point!

So are we back to mallinfo2 which I suggested in patch?
Can you please make an opinion about it?

Thanks,
Martin

> 
> Thanks,
> Florian
>
  
Martin Liška Aug. 11, 2020, 12:26 p.m. UTC | #16
PING^1

On 7/27/20 2:45 PM, Martin Liška wrote:
> On 7/27/20 2:21 PM, Florian Weimer wrote:
>> * Martin Liška:
>>
>>> All right, I agree with that. So something like:
>>>
>>> enum malloc_info
>>> {
>>>    ARENA_BYTES,
>>>    ...
>>> };
>>>
>>> size_t get_mallinfo (malloc_info type) ?
>>>
>>> That would allow adding new enum values that can supported in the future.
>>
>> It does not allow to obtain a consistent snapshot of multiple values,
>> though.
> 
> Good point!
> 
> So are we back to mallinfo2 which I suggested in patch?
> Can you please make an opinion about it?
> 
> Thanks,
> Martin
> 
>>
>> Thanks,
>> Florian
>>
>
  
Florian Weimer Aug. 11, 2020, 1:44 p.m. UTC | #17
* Martin Liška:

> On 7/27/20 2:21 PM, Florian Weimer wrote:
>> * Martin Liška:
>> 
>>> All right, I agree with that. So something like:
>>>
>>> enum malloc_info
>>> {
>>>    ARENA_BYTES,
>>>    ...
>>> };
>>>
>>> size_t get_mallinfo (malloc_info type) ?
>>>
>>> That would allow adding new enum values that can supported in the future.
>> 
>> It does not allow to obtain a consistent snapshot of multiple values,
>> though.
>
> Good point!
>
> So are we back to mallinfo2 which I suggested in patch?
> Can you please make an opinion about it?

DJ, what do you think about this patch?
  

Patch

diff --git a/malloc/malloc.h b/malloc/malloc.h
index a6903fdd54..785c38e164 100644
--- a/malloc/malloc.h
+++ b/malloc/malloc.h
@@ -85,16 +85,16 @@  __THROW __attribute_malloc__;
 
 struct mallinfo
 {
-  int arena;    /* non-mmapped space allocated from system */
-  int ordblks;  /* number of free chunks */
-  int smblks;   /* number of fastbin blocks */
-  int hblks;    /* number of mmapped regions */
-  int hblkhd;   /* space in mmapped regions */
-  int usmblks;  /* always 0, preserved for backwards compatibility */
-  int fsmblks;  /* space available in freed fastbin blocks */
-  int uordblks; /* total allocated space */
-  int fordblks; /* total free space */
-  int keepcost; /* top-most, releasable (via malloc_trim) space */
+  size_t arena;    /* non-mmapped space allocated from system */
+  size_t ordblks;  /* number of free chunks */
+  size_t smblks;   /* number of fastbin blocks */
+  size_t hblks;    /* number of mmapped regions */
+  size_t hblkhd;   /* space in mmapped regions */
+  size_t usmblks;  /* always 0, preserved for backwards compatibility */
+  size_t fsmblks;  /* space available in freed fastbin blocks */
+  size_t uordblks; /* total allocated space */
+  size_t fordblks; /* total free space */
+  size_t keepcost; /* top-most, releasable (via malloc_trim) space */
 };
 
 /* Returns a copy of the updated current mallinfo. */
diff --git a/manual/memory.texi b/manual/memory.texi
index aa5011e4f9..ac803dd2d5 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -1516,39 +1516,39 @@  This structure type is used to return information about the dynamic
 memory allocator.  It contains the following members:
 
 @table @code
-@item int arena
+@item size_t arena
 This is the total size of memory allocated with @code{sbrk} by
 @code{malloc}, in bytes.
 
-@item int ordblks
+@item size_t ordblks
 This is the number of chunks not in use.  (The memory allocator
-internally gets chunks of memory from the operating system, and then
+size_ternally gets chunks of memory from the operating system, and then
 carves them up to satisfy individual @code{malloc} requests;
 @pxref{The GNU Allocator}.)
 
-@item int smblks
+@item size_t smblks
 This field is unused.
 
-@item int hblks
+@item size_t hblks
 This is the total number of chunks allocated with @code{mmap}.
 
-@item int hblkhd
+@item size_t hblkhd
 This is the total size of memory allocated with @code{mmap}, in bytes.
 
-@item int usmblks
+@item size_t usmblks
 This field is unused and always 0.
 
-@item int fsmblks
+@item size_t fsmblks
 This field is unused.
 
-@item int uordblks
+@item size_t uordblks
 This is the total size of memory occupied by chunks handed out by
 @code{malloc}.
 
-@item int fordblks
+@item size_t fordblks
 This is the total size of memory occupied by free (not in use) chunks.
 
-@item int keepcost
+@item size_t keepcost
 This is the size of the top-most releasable chunk that normally
 borders the end of the heap (i.e., the high end of the virtual address
 space's data segment).