elf: dl-minimal malloc needs to respect fundamental alignment

Message ID 20160621111702.39A5B402F6E95@oldenburg.str.redhat.com
State Superseded
Delegated to: Carlos O'Donell
Headers

Commit Message

Florian Weimer June 21, 2016, 11:17 a.m. UTC
  If malloc is used instead of memalign for small alignments,
malloc needs to provide the ABI-required alignment.

2016-06-21  Florian Weimer  <fweimer@redhat.com>

	* elf/dl-minimal.c (malloc): Allocate with fundamental alignment.
  

Comments

H.J. Lu June 21, 2016, 12:54 p.m. UTC | #1
On Tue, Jun 21, 2016 at 4:17 AM, Florian Weimer <fweimer@redhat.com> wrote:
> If malloc is used instead of memalign for small alignments,
> malloc needs to provide the ABI-required alignment.
>
> 2016-06-21  Florian Weimer  <fweimer@redhat.com>
>
>         * elf/dl-minimal.c (malloc): Allocate with fundamental alignment.
>
> diff --git a/elf/dl-minimal.c b/elf/dl-minimal.c
> index c8a8f8d..2d7d139 100644
> --- a/elf/dl-minimal.c
> +++ b/elf/dl-minimal.c
> @@ -27,6 +27,7 @@
>  #include <sys/types.h>
>  #include <ldsodefs.h>
>  #include <_itoa.h>
> +#include <stdalign.h>
>
>  #include <assert.h>
>
> @@ -90,7 +91,7 @@ __libc_memalign (size_t align, size_t n)
>  void * weak_function
>  malloc (size_t n)
>  {
> -  return __libc_memalign (sizeof (double), n);
> +  return __libc_memalign (_Alignof (max_align_t), n);
>  }
>
>  /* We use this function occasionally since the real implementation may

We also have MALLOC_ALIGNMENT in malloc.  Should they be consistent?
  
Florian Weimer June 21, 2016, 12:57 p.m. UTC | #2
On 06/21/2016 02:54 PM, H.J. Lu wrote:
> On Tue, Jun 21, 2016 at 4:17 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> If malloc is used instead of memalign for small alignments,
>> malloc needs to provide the ABI-required alignment.
>>
>> 2016-06-21  Florian Weimer  <fweimer@redhat.com>
>>
>>         * elf/dl-minimal.c (malloc): Allocate with fundamental alignment.
>>
>> diff --git a/elf/dl-minimal.c b/elf/dl-minimal.c
>> index c8a8f8d..2d7d139 100644
>> --- a/elf/dl-minimal.c
>> +++ b/elf/dl-minimal.c
>> @@ -27,6 +27,7 @@
>>  #include <sys/types.h>
>>  #include <ldsodefs.h>
>>  #include <_itoa.h>
>> +#include <stdalign.h>
>>
>>  #include <assert.h>
>>
>> @@ -90,7 +91,7 @@ __libc_memalign (size_t align, size_t n)
>>  void * weak_function
>>  malloc (size_t n)
>>  {
>> -  return __libc_memalign (sizeof (double), n);
>> +  return __libc_memalign (_Alignof (max_align_t), n);
>>  }
>>
>>  /* We use this function occasionally since the real implementation may
>
> We also have MALLOC_ALIGNMENT in malloc.  Should they be consistent?

MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail 
tests for alignment.  To my knowledge, it passes on all regularly tested 
architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.

Thanks,
Florian
  
H.J. Lu June 21, 2016, 1 p.m. UTC | #3
On Tue, Jun 21, 2016 at 5:57 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 06/21/2016 02:54 PM, H.J. Lu wrote:
>>
>> On Tue, Jun 21, 2016 at 4:17 AM, Florian Weimer <fweimer@redhat.com>
>> wrote:
>>>
>>> If malloc is used instead of memalign for small alignments,
>>> malloc needs to provide the ABI-required alignment.
>>>
>>> 2016-06-21  Florian Weimer  <fweimer@redhat.com>
>>>
>>>         * elf/dl-minimal.c (malloc): Allocate with fundamental alignment.
>>>
>>> diff --git a/elf/dl-minimal.c b/elf/dl-minimal.c
>>> index c8a8f8d..2d7d139 100644
>>> --- a/elf/dl-minimal.c
>>> +++ b/elf/dl-minimal.c
>>> @@ -27,6 +27,7 @@
>>>  #include <sys/types.h>
>>>  #include <ldsodefs.h>
>>>  #include <_itoa.h>
>>> +#include <stdalign.h>
>>>
>>>  #include <assert.h>
>>>
>>> @@ -90,7 +91,7 @@ __libc_memalign (size_t align, size_t n)
>>>  void * weak_function
>>>  malloc (size_t n)
>>>  {
>>> -  return __libc_memalign (sizeof (double), n);
>>> +  return __libc_memalign (_Alignof (max_align_t), n);
>>>  }
>>>
>>>  /* We use this function occasionally since the real implementation may
>>
>>
>> We also have MALLOC_ALIGNMENT in malloc.  Should they be consistent?
>
>
> MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail tests
> for alignment.  To my knowledge, it passes on all regularly tested
> architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.
>

MALLOC_ALIGNMENT is kind of mapped to the malloc alignment of
a psABI.  Shouldn't ld.so malloc have the same alignment of libc malloc?
  
Florian Weimer June 21, 2016, 1:06 p.m. UTC | #4
On 06/21/2016 03:00 PM, H.J. Lu wrote:

>> MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail tests
>> for alignment.  To my knowledge, it passes on all regularly tested
>> architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.
>
> MALLOC_ALIGNMENT is kind of mapped to the malloc alignment of
> a psABI.  Shouldn't ld.so malloc have the same alignment of libc malloc?

I don't see why.  MALLOC_ALIGNMENT has to match both the ABI constraint 
and the malloc/malloc.c implementation constraint (which requires a 
minimum alignment of 2 * sizeof (size_t)).

Other mallocs do not have matching implementation constraints, and it is 
standard practice (in non-glibc mallocs) to lower the alignment for 
allocations which are smaller in size than _Alignof (max_align_t), 
although this is not compliant with C11.

Florian
  
H.J. Lu June 21, 2016, 1:20 p.m. UTC | #5
On Tue, Jun 21, 2016 at 6:06 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 06/21/2016 03:00 PM, H.J. Lu wrote:
>
>>> MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail
>>> tests
>>> for alignment.  To my knowledge, it passes on all regularly tested
>>> architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.
>>
>>
>> MALLOC_ALIGNMENT is kind of mapped to the malloc alignment of
>> a psABI.  Shouldn't ld.so malloc have the same alignment of libc malloc?
>
>
> I don't see why.  MALLOC_ALIGNMENT has to match both the ABI constraint and
> the malloc/malloc.c implementation constraint (which requires a minimum
> alignment of 2 * sizeof (size_t)).

My understanding is since the minimum constraint of malloc alignment
<= ABI alignment, MALLOC_ALIGNMENT == ABI alignment.  Do you
have a glibc platform where it isn't true?

> Other mallocs do not have matching implementation constraints, and it is
> standard practice (in non-glibc mallocs) to lower the alignment for
> allocations which are smaller in size than _Alignof (max_align_t), although
> this is not compliant with C11.
>
> Florian
  
Florian Weimer June 23, 2016, 3:01 p.m. UTC | #6
On 06/21/2016 03:20 PM, H.J. Lu wrote:
> On Tue, Jun 21, 2016 at 6:06 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 06/21/2016 03:00 PM, H.J. Lu wrote:
>>
>>>> MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail
>>>> tests
>>>> for alignment.  To my knowledge, it passes on all regularly tested
>>>> architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.
>>>
>>>
>>> MALLOC_ALIGNMENT is kind of mapped to the malloc alignment of
>>> a psABI.  Shouldn't ld.so malloc have the same alignment of libc malloc?
>>
>>
>> I don't see why.  MALLOC_ALIGNMENT has to match both the ABI constraint and
>> the malloc/malloc.c implementation constraint (which requires a minimum
>> alignment of 2 * sizeof (size_t)).
>
> My understanding is since the minimum constraint of malloc alignment
> <= ABI alignment, MALLOC_ALIGNMENT == ABI alignment.  Do you
> have a glibc platform where it isn't true?

If I'm not mistaken, m68k has fundamental alignment 2 (the alignment of 
long, double, and long double), but our malloc still aligns to 8 (2 * 
sizeof (size_t)).

Florian
  
H.J. Lu June 23, 2016, 4:15 p.m. UTC | #7
On Thu, Jun 23, 2016 at 8:01 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 06/21/2016 03:20 PM, H.J. Lu wrote:
>>
>> On Tue, Jun 21, 2016 at 6:06 AM, Florian Weimer <fweimer@redhat.com>
>> wrote:
>>>
>>> On 06/21/2016 03:00 PM, H.J. Lu wrote:
>>>
>>>>> MALLOC_ALIGNMENT is potentially larger.  malloc/tst-malloc-thread-fail
>>>>> tests
>>>>> for alignment.  To my knowledge, it passes on all regularly tested
>>>>> architectures after commit dea39b13e2958a7f0e75b5594a06d97d61cc439f.
>>>>
>>>>
>>>>
>>>> MALLOC_ALIGNMENT is kind of mapped to the malloc alignment of
>>>> a psABI.  Shouldn't ld.so malloc have the same alignment of libc malloc?
>>>
>>>
>>>
>>> I don't see why.  MALLOC_ALIGNMENT has to match both the ABI constraint
>>> and
>>> the malloc/malloc.c implementation constraint (which requires a minimum
>>> alignment of 2 * sizeof (size_t)).
>>
>>
>> My understanding is since the minimum constraint of malloc alignment
>> <= ABI alignment, MALLOC_ALIGNMENT == ABI alignment.  Do you
>> have a glibc platform where it isn't true?
>
>
> If I'm not mistaken, m68k has fundamental alignment 2 (the alignment of
> long, double, and long double), but our malloc still aligns to 8 (2 * sizeof
> (size_t)).
>

Malloc alignment has been an isssie.  GCC defines
MALLOC_ABI_ALIGNMENT as

#ifndef MALLOC_ABI_ALIGNMENT
#define MALLOC_ABI_ALIGNMENT BITS_PER_WORD
#endif

which is incorrect for Linux and unusable:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36159

I'd like to see a reasonable solution to address malloc alignment
on Linux, not just in ld.so.
  
Florian Weimer June 23, 2016, 4:25 p.m. UTC | #8
On 06/23/2016 06:15 PM, H.J. Lu wrote:

> Malloc alignment has been an isssie.  GCC defines
> MALLOC_ABI_ALIGNMENT as
>
> #ifndef MALLOC_ABI_ALIGNMENT
> #define MALLOC_ABI_ALIGNMENT BITS_PER_WORD
> #endif
>
> which is incorrect for Linux and unusable:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36159
>
> I'd like to see a reasonable solution to address malloc alignment
> on Linux, not just in ld.so.

But this glibc bug is fixed:

   https://sourceware.org/bugzilla/show_bug.cgi?id=6527

As far as I know, we know match or exceed the C11 max_align_t 
requirements.  What we cannot do is to bump max_align_t alignment 
because it affects ABI.  Overall, this entire concept of fundamental 
alignment is somewhat ill-defined.

Over-aligned allocations with default operator new in C++ are a 
different matter and require C++ ABI changes, so that the C++ runtime 
can call the appropriate aligned allocation function.

There are popular interposed mallocs which do not provide sufficient 
alignment, matching max_align_t, as well.

Florian
  
Joseph Myers June 23, 2016, 5:02 p.m. UTC | #9
On Thu, 23 Jun 2016, Florian Weimer wrote:

> As far as I know, we know match or exceed the C11 max_align_t requirements.
> What we cannot do is to bump max_align_t alignment because it affects ABI.

That's not obvious to me.  It shouldn't affect the ABI of any function in 
glibc, for example; it's not the sort of type you embed in other 
structures.  Increasing max_align_t to cover _Float128 (for 32-bit x86 as 
the sole affected case) is a question I left open in my GCC _FloatN / 
_FloatNx patch posting (not included in that patch, potentially relevant 
for a followup); that issue properly applies to _Decimal128 as well.
  
Florian Weimer June 23, 2016, 5:54 p.m. UTC | #10
On 06/23/2016 07:02 PM, Joseph Myers wrote:
> On Thu, 23 Jun 2016, Florian Weimer wrote:
>
>> As far as I know, we know match or exceed the C11 max_align_t requirements.
>> What we cannot do is to bump max_align_t alignment because it affects ABI.
>
> That's not obvious to me.  It shouldn't affect the ABI of any function in
> glibc, for example; it's not the sort of type you embed in other
> structures.

I'm more worried about something using _Alignas (max_align_t), affecting 
struct layout.

> Increasing max_align_t to cover _Float128 (for 32-bit x86 as
> the sole affected case) is a question I left open in my GCC _FloatN /
> _FloatNx patch posting (not included in that patch, potentially relevant
> for a followup); that issue properly applies to _Decimal128 as well.

If we hurry up, we could bump max_align_t on i386, true.  The type and 
its implied promise is still fairly new, after all.

But a path towards increasing the fundamental alignment from time to 
time is less clear to me.

We also interpret the max_align_t requirement strictly in our malloc in 
the sense that we also apply it to small allocations.  Other mallocs 
will happily return blocks with just 8-byte alignment on x86_64, where 
_Alignof (max_align_t) is 16.  Maybe this something we could address in 
the definition of max_align_t in the C standard, but its definition is 
already very confusing.

Surprisingly the impact on i386 wouldn't even be *that* severe.  The 
minimum allocation size would still be 12 (for a total chunk size of 
16).  The size range from 13 to 20 is affected most prominently: The 
total chunk size would go from 24 bytes to 32 bytes, an increase of one 
third.  But this is also the largest such increase, the subsequent ones 
are progressively smaller:

def roundup(value, align):
     return (value + align - 1) / align * align

def chunk_size(pointer_size, align, object_size):
     minimum_chunk_size = roundup(4 * pointer_size, align)
     size = roundup(object_size + pointer_size, align)
     return max(minimum_chunk_size, size)

print "size  old  new  increase"
for size in range(0, 65):
     old_size = chunk_size(4, 8, size)
     new_size = chunk_size(4, 16, size)
     increase = (float(new_size) / old_size - 1) * 100
     print "  %2d   %2d   %2d  %8.2f" % (
         size, old_size, new_size, increase)

size  old  new  increase
    0   16   16      0.00
    1   16   16      0.00
    2   16   16      0.00
    3   16   16      0.00
    4   16   16      0.00
    5   16   16      0.00
    6   16   16      0.00
    7   16   16      0.00
    8   16   16      0.00
    9   16   16      0.00
   10   16   16      0.00
   11   16   16      0.00
   12   16   16      0.00
   13   24   32     33.33
   14   24   32     33.33
   15   24   32     33.33
   16   24   32     33.33
   17   24   32     33.33
   18   24   32     33.33
   19   24   32     33.33
   20   24   32     33.33
   21   32   32      0.00
   22   32   32      0.00
   23   32   32      0.00
   24   32   32      0.00
   25   32   32      0.00
   26   32   32      0.00
   27   32   32      0.00
   28   32   32      0.00
   29   40   48     20.00
   30   40   48     20.00
   31   40   48     20.00
   32   40   48     20.00
…

glibc malloc is not based on size groups for allocations, but fewer, 
more granular sizes will still help to reduce fragmentation somewhat.

Maybe we should just make the change, document the increased alignment, 
and have GCC change their definition of max_align-t?

The overhead would be more pronounced if we go from 16 bytes to 32 bytes 
alignment on 64-bit architectures.

Florian
  
Carlos O'Donell July 11, 2016, 2:35 a.m. UTC | #11
On 06/21/2016 07:17 AM, Florian Weimer wrote:
> If malloc is used instead of memalign for small alignments,
> malloc needs to provide the ABI-required alignment.
> 
> 2016-06-21  Florian Weimer  <fweimer@redhat.com>
> 
> 	* elf/dl-minimal.c (malloc): Allocate with fundamental alignment.
> 
> diff --git a/elf/dl-minimal.c b/elf/dl-minimal.c
> index c8a8f8d..2d7d139 100644
> --- a/elf/dl-minimal.c
> +++ b/elf/dl-minimal.c
> @@ -27,6 +27,7 @@
>  #include <sys/types.h>
>  #include <ldsodefs.h>
>  #include <_itoa.h>
> +#include <stdalign.h>
>  
>  #include <assert.h>
>  
> @@ -90,7 +91,7 @@ __libc_memalign (size_t align, size_t n)
>  void * weak_function
>  malloc (size_t n)
>  {
> -  return __libc_memalign (sizeof (double), n);
> +  return __libc_memalign (_Alignof (max_align_t), n);
>  }
>  
>  /* We use this function occasionally since the real implementation may
> 

I agree with H.J. here, this should be MALLOC_ALIGNMENT, and if it's larger
then so be it. It should logically match the behaviour, as best it can, of
glibc's malloc since we're handing off most commonly to that malloc after
relocation. Thus I'd like to see the behaviours harmonized. 

Other mallocs may not have the same behaviour but that's not a reason to
avoid MALLOC_ALIGNMENT.

The discussions about fixing libc malloc's alignment are out of scope for
this change IMO. We should focus on fixing ld.so's behaviour.

Out of curiosity have you tried to assemble a unit test for these functions
based on linking directly with dl-minimal.os? It would be nice to run them
through similar testing as is done by malloc.

OK to checkin if you use MALLOC_ALIGMENT.

Cheers,
Carlos.
  
Florian Weimer July 11, 2016, 8:55 a.m. UTC | #12
On 07/11/2016 04:35 AM, Carlos O'Donell wrote:

> I agree with H.J. here, this should be MALLOC_ALIGNMENT, and if it's larger
> then so be it. It should logically match the behaviour, as best it can, of
> glibc's malloc since we're handing off most commonly to that malloc after
> relocation. Thus I'd like to see the behaviours harmonized.
>
> Other mallocs may not have the same behaviour but that's not a reason to
> avoid MALLOC_ALIGNMENT.

I'm still not convinced.  Obviously, we cannot rely on MALLOC_ALIGNMENT 
anywhere because interposed mallocs may not provide it.  So I still 
don't see the value of compatibility with the main malloc here.

> The discussions about fixing libc malloc's alignment are out of scope for
> this change IMO. We should focus on fixing ld.so's behaviour.
>
> Out of curiosity have you tried to assemble a unit test for these functions
> based on linking directly with dl-minimal.os? It would be nice to run them
> through similar testing as is done by malloc.
>
> OK to checkin if you use MALLOC_ALIGMENT.

This change is not exactly trivial because it's currently defined in 
malloc/malloc.c only.  I will need to post another version for review.

Thanks,
Florian
  

Patch

diff --git a/elf/dl-minimal.c b/elf/dl-minimal.c
index c8a8f8d..2d7d139 100644
--- a/elf/dl-minimal.c
+++ b/elf/dl-minimal.c
@@ -27,6 +27,7 @@ 
 #include <sys/types.h>
 #include <ldsodefs.h>
 #include <_itoa.h>
+#include <stdalign.h>
 
 #include <assert.h>
 
@@ -90,7 +91,7 @@  __libc_memalign (size_t align, size_t n)
 void * weak_function
 malloc (size_t n)
 {
-  return __libc_memalign (sizeof (double), n);
+  return __libc_memalign (_Alignof (max_align_t), n);
 }
 
 /* We use this function occasionally since the real implementation may