[RFC,1/1] elf: align the mapping address of LOAD segments with p_align

Message ID 20211204045848.71105-2-rongwei.wang@linux.alibaba.com
State Superseded
Delegated to: H.J. Lu
Series make ld.so map .text LOAD segments and aligned by p_align

Checks

Context                 Check    Description
dj/TryBot-apply_patch   success  Patch applied to master at the time it was sent
dj/TryBot-32bit         success  Build for i686

Commit Message

Rongwei Wang Dec. 4, 2021, 4:58 a.m. UTC
Currently, ld.so always maps LOAD segments aligned to the base
page size (e.g. 4k on x86, or 4k, 16k and 64k on arm64). This
patch improves on that scheme: ld.so can now align the mapping
address of the first LOAD segment to p_align when p_align is
greater than the current base page size.

This change makes it simple to back code segments with huge
pages.

Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
---
 elf/dl-load.c         |  1 +
 elf/dl-map-segments.h | 54 +++++++++++++++++++++++++++++++++++++++++--
 include/link.h        |  3 +++
 3 files changed, 56 insertions(+), 2 deletions(-)
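
The alignment scheme described in the commit message boils down to
over-allocating, rounding the start address up, and trimming the excess.
The following is a minimal sketch of that idea in plain C, not the patch
itself. reserve_aligned is a hypothetical helper name; it only reserves
anonymous PROT_NONE address space, and it assumes that length is a
multiple of the page size and that alignment is a power of two. A loader
would then map the file over the returned address with MAP_FIXED.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper (not glibc code): reserve LENGTH bytes of address
   space whose start is a multiple of ALIGNMENT.  LENGTH is assumed to be
   a multiple of the page size and ALIGNMENT a power of two.  Returns
   MAP_FAILED on error.  */
static void *
reserve_aligned (size_t length, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);

  /* Over-allocate by ALIGNMENT so that an aligned start always fits.  */
  size_t span = length + alignment;
  char *raw = mmap (NULL, span, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED)
    return MAP_FAILED;

  uintptr_t start = (uintptr_t) raw;
  uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t) (alignment - 1);
  uintptr_t end = aligned + length;

  /* Give back the unaligned head and the unused tail.  */
  if (aligned > start)
    munmap (raw, aligned - start);
  if (start + span > end)
    munmap ((void *) end, start + span - end);

  return (void *) aligned;
}
```

Compared with a single mmap call, this approach costs up to two extra
munmap calls per mapped object.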

Comments

Florian Weimer Dec. 4, 2021, 6:10 p.m. UTC | #1
* Rongwei Wang via Libc-alpha:

> Now, ld.so always map the LOAD segments and aligned by base
> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
> this patch improve the scheme here. In this patch, ld.so
> can align the mapping address of the first LOAD segment with
> p_align when p_align is greater than the current base page
> size.
>
> And this change makes code segments using huge pages become
> simple and available.

I feel like we should be able to tell the kernel that we want an aligned
mapping.  munmap is not actually cheap.

Do you know if there are some ideas in this area?  Perhaps with another
MAP_ flag and encoding the requested alignment in the lower bits of the
hints address?

Thanks,
Florian
Rongwei Wang Dec. 6, 2021, 2:47 a.m. UTC | #2
Hi, Florian

On 12/5/21 2:10 AM, Florian Weimer wrote:
> * Rongwei Wang via Libc-alpha:
> 
>> Now, ld.so always map the LOAD segments and aligned by base
>> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
>> this patch improve the scheme here. In this patch, ld.so
>> can align the mapping address of the first LOAD segment with
>> p_align when p_align is greater than the current base page
>> size.
>>
>> And this change makes code segments using huge pages become
>> simple and available.
> 
> I feel like we should be able to tell the kernel that we want an aligned
> mapping.  munmap is not actually cheap.
I see that munmap here will slow down applications that link against
DSOs, though the effect seems small? Sorry, I don't know much about
this impact. Of course, I know the effect would be magnified if an
application has many DSOs.

> 
> Do you know if there are some ideas in this area?  Perhaps with another
As far as I can tell, there are two methods to make .text use huge pages:
1. The way mentioned above: madvise + khugepaged;
2. Use libhugetlbfs[1]: this requires recompiling the application's
source code and linking against libhugetlbfs.so. The obvious difference
is that this way uses hugetlb.
Neither of the two methods above seems friendly to applications.
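
For the first method, the start address of the mapping matters because a
transparent huge page can, as far as I know, only back a virtual range
that is itself huge-page aligned. The sketch below is only an
illustration: huge_chunks_in_range is a hypothetical helper, and the
2 MiB huge page size is an assumption that holds for x86-64 but not for
every configuration. It counts how many huge-page-sized, huge-page-aligned
chunks fit entirely inside a mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed huge page size; 2 MiB is typical on x86-64, not universal.  */
#define HPAGE_SIZE ((uint64_t) 0x200000)

/* Hypothetical helper: number of HPAGE_SIZE-aligned chunks of HPAGE_SIZE
   bytes that lie entirely inside the address range [START, END).  */
static uint64_t
huge_chunks_in_range (uint64_t start, uint64_t end)
{
  assert (start <= end);

  /* Round the start up and the end down to huge page boundaries.  */
  uint64_t first = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
  uint64_t last = end & ~(HPAGE_SIZE - 1);

  return last > first ? (last - first) / HPAGE_SIZE : 0;
}
```

A 4 MiB segment mapped at 0x400000 contains two such chunks, while the
same segment mapped one 4k page higher, at 0x401000, contains only one.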

> MAP_ flag and encoding the requested alignment in the lower bits of the
Actually, it seems the kernel developers don't want to introduce new
MAP_ flags to support this. Maybe they think mmap and munmap are enough
for applications to get an aligned mapping address. Of course, this
is just my guess and my own view.

In addition, back to the patch, what do you think of the alignment
method used in our patch? It aligns the mapping to 4k, 64k or 2M,
according to the p_align value. If we try to introduce new MAP_ flags
in the kernel, I am not sure whether we would need to add multiple
flags, for example MAP_64K, MAP_2M, etc.

LINK[1]: https://github.com/libhugetlbfs/libhugetlbfs

Thanks!

> hints address?
> 
> Thanks,
> Florian
>
H.J. Lu Dec. 6, 2021, 2:48 p.m. UTC | #3
On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Now, ld.so always map the LOAD segments and aligned by base
> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
> this patch improve the scheme here. In this patch, ld.so
> can align the mapping address of the first LOAD segment with
> p_align when p_align is greater than the current base page
> size.

This is a bug fix.  Please open a glibc bug:

https://sourceware.org/bugzilla/enter_bug.cgi

with a testcase which should align variables to 2MB in the main
program and a shared library.   Please include the testcase in
your patch and mention the glibc bug in your commit message.

> And this change makes code segments using huge pages become
> simple and available.
>
> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
> ---
>  elf/dl-load.c         |  1 +
>  elf/dl-map-segments.h | 54 +++++++++++++++++++++++++++++++++++++++++--
>  include/link.h        |  3 +++
>  3 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index e39980fb19..136cfe2fa8 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1154,6 +1154,7 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>           c->dataend = ph->p_vaddr + ph->p_filesz;
>           c->allocend = ph->p_vaddr + ph->p_memsz;
>           c->mapoff = ALIGN_DOWN (ph->p_offset, GLRO(dl_pagesize));
> +          l->l_load_align = ph->p_align;
>
>           /* Determine whether there is a gap between the last segment
>              and this one.  */
> diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
> index ac9f09ab4c..ae03236045 100644
> --- a/elf/dl-map-segments.h
> +++ b/elf/dl-map-segments.h
> @@ -18,6 +18,47 @@
>
>  #include <dl-load.h>
>
> +static __always_inline void *
> +_dl_map_segments_align (const struct loadcmd *c,
> +                   ElfW(Addr) mappref, int fd, size_t alignment,
> +                   const size_t maplength)
> +{
> +       unsigned long map_start, map_start_align, map_end;
> +       unsigned long maplen = (maplength >= alignment) ?
> +                               (maplength + alignment) : (2 * alignment);
> +
> +       /* Allocate enough space to ensure that address aligned by
> +           p_align is included. */
> +       map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplen,
> +                                    PROT_NONE,
> +                                    MAP_ANONYMOUS | MAP_PRIVATE,
> +                                    -1, 0);
> +       if (__glibc_unlikely ((void *) map_start == MAP_FAILED)) {
> +               /* If mapping a aligned address failed, then ... */
> +               map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
> +                                    c->prot,
> +                                    MAP_COPY|MAP_FILE,
> +                                    fd, c->mapoff);
> +
> +               return (void *) map_start;
> +       }
> +       map_start_align = ALIGN_UP(map_start, alignment);
> +       map_end = map_start_align + maplength;
> +
> +       /* Remember which part of the address space this object uses.  */
> +       map_start_align = (ElfW(Addr)) __mmap ((void *) map_start_align, maplength,
> +                                    c->prot,
> +                                    MAP_COPY|MAP_FILE|MAP_FIXED,
> +                                    fd, c->mapoff);
> +       if (__glibc_unlikely ((void *) map_start_align == MAP_FAILED))
> +               return MAP_FAILED;
> +       if (map_start_align > map_start)
> +               __munmap((void *)map_start, map_start_align - map_start);
> +       __munmap((void *)map_end, map_start + maplen - map_end);
> +
> +       return (void *) map_start_align;
> +}
> +

Please follow the glibc coding format.

>  /* This implementation assumes (as does the corresponding implementation
>     of _dl_unmap_segments, in dl-unmap-segments.h) that shared objects
>     are always laid out with all segments contiguous (or with gaps
> @@ -52,11 +93,20 @@ _dl_map_segments (struct link_map *l, int fd,
>                                    c->mapstart & GLRO(dl_use_load_bias))
>             - MAP_BASE_ADDR (l));
>
> -      /* Remember which part of the address space this object uses.  */
> -      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
> +       /* During mapping, align the mapping address of the LOAD segments
> +          according to own p_align. This helps OS map its code segment to
> +          huge pages. */
> +       if (l->l_load_align > GLRO(dl_pagesize)) {
> +               l->l_map_start = (ElfW(Addr)) _dl_map_segments_align (c,
> +                                            mappref, fd,
> +                                            l->l_load_align, maplength);
> +       } else {
> +               /* Remember which part of the address space this object uses.  */
> +               l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>                                              c->prot,
>                                              MAP_COPY|MAP_FILE,
>                                              fd, c->mapoff);

Please follow the glibc coding format.

> +       }
>        if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
>          return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
>
> diff --git a/include/link.h b/include/link.h
> index aea268439c..fc6ce29fab 100644
> --- a/include/link.h
> +++ b/include/link.h
> @@ -298,6 +298,9 @@ struct link_map
>
>      /* Thread-local storage related info.  */
>
> +    /* Alignment requirement of the LOAD block.  */
> +    size_t l_load_align;
> +
>      /* Start of the initialization image.  */
>      void *l_tls_initimage;
>      /* Size of the initialization image.  */
> --
> 2.27.0
>

Thanks.
Rongwei Wang Dec. 8, 2021, 2:14 a.m. UTC | #4
Hi hjl

On 12/6/21 10:48 PM, H.J. Lu wrote:
> On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
>>
>> Now, ld.so always map the LOAD segments and aligned by base
>> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
>> this patch improve the scheme here. In this patch, ld.so
>> can align the mapping address of the first LOAD segment with
>> p_align when p_align is greater than the current base page
>> size.
> 
> This is a bug fix.  Please open a glibc bug:
> 
> https://sourceware.org/bugzilla/enter_bug.cgi
OK

I am requesting an account.
> 
> with a testcase which should align variables to 2MB in the main
By the way, do we need to align every LOAD segment? Our patch only
fixes the mapping address of the first LOAD segment.
> program and a shared library.   Please include the testcase in
> your patch and mention the glibc bug in your commit message.
> 
>> And this change makes code segments using huge pages become
>> simple and available.
>>
>> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
>> ---
>>   elf/dl-load.c         |  1 +
>>   elf/dl-map-segments.h | 54 +++++++++++++++++++++++++++++++++++++++++--
>>   include/link.h        |  3 +++
>>   3 files changed, 56 insertions(+), 2 deletions(-)
>>
>> diff --git a/elf/dl-load.c b/elf/dl-load.c
>> index e39980fb19..136cfe2fa8 100644
>> --- a/elf/dl-load.c
>> +++ b/elf/dl-load.c
>> @@ -1154,6 +1154,7 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>>            c->dataend = ph->p_vaddr + ph->p_filesz;
>>            c->allocend = ph->p_vaddr + ph->p_memsz;
>>            c->mapoff = ALIGN_DOWN (ph->p_offset, GLRO(dl_pagesize));
>> +          l->l_load_align = ph->p_align;
>>
>>            /* Determine whether there is a gap between the last segment
>>               and this one.  */
>> diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
>> index ac9f09ab4c..ae03236045 100644
>> --- a/elf/dl-map-segments.h
>> +++ b/elf/dl-map-segments.h
>> @@ -18,6 +18,47 @@
>>
>>   #include <dl-load.h>
>>
>> +static __always_inline void *
>> +_dl_map_segments_align (const struct loadcmd *c,
>> +                   ElfW(Addr) mappref, int fd, size_t alignment,
>> +                   const size_t maplength)
>> +{
>> +       unsigned long map_start, map_start_align, map_end;
>> +       unsigned long maplen = (maplength >= alignment) ?
>> +                               (maplength + alignment) : (2 * alignment);
>> +
>> +       /* Allocate enough space to ensure that address aligned by
>> +           p_align is included. */
>> +       map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplen,
>> +                                    PROT_NONE,
>> +                                    MAP_ANONYMOUS | MAP_PRIVATE,
>> +                                    -1, 0);
>> +       if (__glibc_unlikely ((void *) map_start == MAP_FAILED)) {
>> +               /* If mapping a aligned address failed, then ... */
>> +               map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>> +                                    c->prot,
>> +                                    MAP_COPY|MAP_FILE,
>> +                                    fd, c->mapoff);
>> +
>> +               return (void *) map_start;
>> +       }
>> +       map_start_align = ALIGN_UP(map_start, alignment);
>> +       map_end = map_start_align + maplength;
>> +
>> +       /* Remember which part of the address space this object uses.  */
>> +       map_start_align = (ElfW(Addr)) __mmap ((void *) map_start_align, maplength,
>> +                                    c->prot,
>> +                                    MAP_COPY|MAP_FILE|MAP_FIXED,
>> +                                    fd, c->mapoff);
>> +       if (__glibc_unlikely ((void *) map_start_align == MAP_FAILED))
>> +               return MAP_FAILED;
>> +       if (map_start_align > map_start)
>> +               __munmap((void *)map_start, map_start_align - map_start);
>> +       __munmap((void *)map_end, map_start + maplen - map_end);
>> +
>> +       return (void *) map_start_align;
>> +}
>> +
> 
> Please follow the glibc coding format.
> 
>>   /* This implementation assumes (as does the corresponding implementation
>>      of _dl_unmap_segments, in dl-unmap-segments.h) that shared objects
>>      are always laid out with all segments contiguous (or with gaps
>> @@ -52,11 +93,20 @@ _dl_map_segments (struct link_map *l, int fd,
>>                                     c->mapstart & GLRO(dl_use_load_bias))
>>              - MAP_BASE_ADDR (l));
>>
>> -      /* Remember which part of the address space this object uses.  */
>> -      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>> +       /* During mapping, align the mapping address of the LOAD segments
>> +          according to own p_align. This helps OS map its code segment to
>> +          huge pages. */
>> +       if (l->l_load_align > GLRO(dl_pagesize)) {
>> +               l->l_map_start = (ElfW(Addr)) _dl_map_segments_align (c,
>> +                                            mappref, fd,
>> +                                            l->l_load_align, maplength);
>> +       } else {
>> +               /* Remember which part of the address space this object uses.  */
>> +               l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>>                                               c->prot,
>>                                               MAP_COPY|MAP_FILE,
>>                                               fd, c->mapoff);
> 
> Please follow the glibc coding format.
OK

Thanks.
> 
>> +       }
>>         if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
>>           return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
>>
>> diff --git a/include/link.h b/include/link.h
>> index aea268439c..fc6ce29fab 100644
>> --- a/include/link.h
>> +++ b/include/link.h
>> @@ -298,6 +298,9 @@ struct link_map
>>
>>       /* Thread-local storage related info.  */
>>
>> +    /* Alignment requirement of the LOAD block.  */
>> +    size_t l_load_align;
>> +
>>       /* Start of the initialization image.  */
>>       void *l_tls_initimage;
>>       /* Size of the initialization image.  */
>> --
>> 2.27.0
>>
> 
> Thanks.
>
H.J. Lu Dec. 8, 2021, 2:33 a.m. UTC | #5
On Tue, Dec 7, 2021 at 6:14 PM Rongwei Wang
<rongwei.wang@linux.alibaba.com> wrote:
>
> Hi hjl
>
> On 12/6/21 10:48 PM, H.J. Lu wrote:
> > On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> >>
> >> Now, ld.so always map the LOAD segments and aligned by base
> >> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
> >> this patch improve the scheme here. In this patch, ld.so
> >> can align the mapping address of the first LOAD segment with
> >> p_align when p_align is greater than the current base page
> >> size.
> >
> > This is a bug fix.  Please open a glibc bug:
> >
> > https://sourceware.org/bugzilla/enter_bug.cgi
> OK
>
> And I requesting the account.
> >
> > with a testcase which should align variables to 2MB in the main
> By the way, I have a question about whether we need to align each LOAD
> segments? In our patch, we only fixed the mapping address for the first
> LOAD segment.

I think the first one should be sufficient.   You can verify it with a
2MB aligned variable in PIE:

[hjl@gnu-cfl-2 tmp]$ cat x.c
#include <stdio.h>

int foo  __attribute__((aligned(0x200000))) = 1;

int
main ()
{
  printf ("foo: %p\n", &foo);
}
[hjl@gnu-cfl-2 tmp]$ gcc -no-pie x.c
[hjl@gnu-cfl-2 tmp]$ ./a.out
foo: 0x800000
[hjl@gnu-cfl-2 tmp]$ gcc x.c -fpie -pie
[hjl@gnu-cfl-2 tmp]$ ./a.out
foo: 0x55c529afe000
[hjl@gnu-cfl-2 tmp]$
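
The same check can be made by the program itself instead of by reading
the printed address. This is a minimal sketch, assuming the 2MB alignment
used in x.c above; is_aligned and misalignment are hypothetical helpers,
and addresses are passed as 64-bit integers so that the two values from
the transcript can be tested directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of ADDR from the previous multiple of
   ALIGN.  ALIGN must be a power of two.  */
static uint64_t
misalignment (uint64_t addr, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return addr & (align - 1);
}

/* Hypothetical helper: nonzero if ADDR is a multiple of ALIGN.  */
static int
is_aligned (uint64_t addr, uint64_t align)
{
  return misalignment (addr, align) == 0;
}
```

Applied to the transcript, 0x800000 from the non-PIE build is aligned,
while 0x55c529afe000 from the PIE build is off by 0xfe000. In x.c the
call would be is_aligned ((uintptr_t) &foo, 0x200000).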

Rongwei Wang Dec. 8, 2021, 3:04 a.m. UTC | #6
On 12/8/21 10:33 AM, H.J. Lu wrote:
> On Tue, Dec 7, 2021 at 6:14 PM Rongwei Wang
> <rongwei.wang@linux.alibaba.com> wrote:
>>
>> Hi hjl
>>
>> On 12/6/21 10:48 PM, H.J. Lu wrote:
>>> On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
>>> <libc-alpha@sourceware.org> wrote:
>>>>
>>>> Now, ld.so always map the LOAD segments and aligned by base
>>>> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
>>>> this patch improve the scheme here. In this patch, ld.so
>>>> can align the mapping address of the first LOAD segment with
>>>> p_align when p_align is greater than the current base page
>>>> size.
>>>
>>> This is a bug fix.  Please open a glibc bug:
>>>
>>> https://sourceware.org/bugzilla/enter_bug.cgi
>> OK
>>
>> And I requesting the account.
>>>
>>> with a testcase which should align variables to 2MB in the main
>> By the way, I have a question about whether we need to align each LOAD
>> segments? In our patch, we only fixed the mapping address for the first
>> LOAD segment.
> 
> I think the first one should be sufficient.   You can verify it with a
> 2MB aligned variable in PIE:
> 
> [hjl@gnu-cfl-2 tmp]$ cat x.c
> #include <stdio.h>
> 
> int foo  __attribute__((aligned(0x200000))) = 1;
> 
> int
> main ()
> {
>    printf ("foo: %p\n", &foo);
> }
> [hjl@gnu-cfl-2 tmp]$ gcc -no-pie x.c
> [hjl@gnu-cfl-2 tmp]$ ./a.out
> foo: 0x800000
> [hjl@gnu-cfl-2 tmp]$ gcc x.c -fpie -pie
> [hjl@gnu-cfl-2 tmp]$ ./a.out
> foo: 0x55c529afe000
> [hjl@gnu-cfl-2 tmp]$

Learned it!

Thanks.
H.J. Lu Dec. 8, 2021, 11:52 p.m. UTC | #7
On Tue, Dec 7, 2021 at 7:05 PM Rongwei Wang
<rongwei.wang@linux.alibaba.com> wrote:
>
>
>
> On 12/8/21 10:33 AM, H.J. Lu wrote:
> > On Tue, Dec 7, 2021 at 6:14 PM Rongwei Wang
> > <rongwei.wang@linux.alibaba.com> wrote:
> >>
> >> Hi hjl
> >>
> >> On 12/6/21 10:48 PM, H.J. Lu wrote:
> >>> On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
> >>> <libc-alpha@sourceware.org> wrote:
> >>>>
> >>>> Now, ld.so always map the LOAD segments and aligned by base
> >>>> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
> >>>> this patch improve the scheme here. In this patch, ld.so
> >>>> can align the mapping address of the first LOAD segment with
> >>>> p_align when p_align is greater than the current base page
> >>>> size.
> >>>
> >>> This is a bug fix.  Please open a glibc bug:
> >>>
> >>> https://sourceware.org/bugzilla/enter_bug.cgi
> >> OK
> >>
> >> And I requesting the account.
> >>>
> >>> with a testcase which should align variables to 2MB in the main
> >> By the way, I have a question about whether we need to align each LOAD
> >> segments? In our patch, we only fixed the mapping address for the first
> >> LOAD segment.
> >
> > I think the first one should be sufficient.   You can verify it with a
> > 2MB aligned variable in PIE:
> >
> > [hjl@gnu-cfl-2 tmp]$ cat x.c
> > #include <stdio.h>
> >
> > int foo  __attribute__((aligned(0x200000))) = 1;
> >
> > int
> > main ()
> > {
> >    printf ("foo: %p\n", &foo);
> > }
> > [hjl@gnu-cfl-2 tmp]$ gcc -no-pie x.c
> > [hjl@gnu-cfl-2 tmp]$ ./a.out
> > foo: 0x800000
> > [hjl@gnu-cfl-2 tmp]$ gcc x.c -fpie -pie
> > [hjl@gnu-cfl-2 tmp]$ ./a.out
> > foo: 0x55c529afe000
> > [hjl@gnu-cfl-2 tmp]$
>
> Learned it!

I opened:

https://sourceware.org/bugzilla/show_bug.cgi?id=28676

> Thanks.
> >
> >>> program and a shared library.   Please include the testcase in
> >>> your patch and mention the glibc bug in your commit message.
> >>>
> >>>> And this change makes code segments using huge pages become
> >>>> simple and available.
> >>>>
> >>>> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
> >>>> ---
> >>>>    elf/dl-load.c         |  1 +
> >>>>    elf/dl-map-segments.h | 54 +++++++++++++++++++++++++++++++++++++++++--
> >>>>    include/link.h        |  3 +++
> >>>>    3 files changed, 56 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/elf/dl-load.c b/elf/dl-load.c
> >>>> index e39980fb19..136cfe2fa8 100644
> >>>> --- a/elf/dl-load.c
> >>>> +++ b/elf/dl-load.c
> >>>> @@ -1154,6 +1154,7 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> >>>>             c->dataend = ph->p_vaddr + ph->p_filesz;
> >>>>             c->allocend = ph->p_vaddr + ph->p_memsz;
> >>>>             c->mapoff = ALIGN_DOWN (ph->p_offset, GLRO(dl_pagesize));
> >>>> +          l->l_load_align = ph->p_align;
> >>>>
> >>>>             /* Determine whether there is a gap between the last segment
> >>>>                and this one.  */
> >>>> diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
> >>>> index ac9f09ab4c..ae03236045 100644
> >>>> --- a/elf/dl-map-segments.h
> >>>> +++ b/elf/dl-map-segments.h
> >>>> @@ -18,6 +18,47 @@
> >>>>
> >>>>    #include <dl-load.h>
> >>>>
> >>>> +static __always_inline void *
> >>>> +_dl_map_segments_align (const struct loadcmd *c,
> >>>> +                   ElfW(Addr) mappref, int fd, size_t alignment,
> >>>> +                   const size_t maplength)
> >>>> +{
> >>>> +       unsigned long map_start, map_start_align, map_end;
> >>>> +       unsigned long maplen = (maplength >= alignment) ?
> >>>> +                               (maplength + alignment) : (2 * alignment);
> >>>> +
> >>>> +       /* Allocate enough space to ensure that address aligned by
> >>>> +           p_align is included. */
> >>>> +       map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplen,
> >>>> +                                    PROT_NONE,
> >>>> +                                    MAP_ANONYMOUS | MAP_PRIVATE,
> >>>> +                                    -1, 0);
> >>>> +       if (__glibc_unlikely ((void *) map_start == MAP_FAILED)) {
> >>>> +               /* If mapping a aligned address failed, then ... */
> >>>> +               map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
> >>>> +                                    c->prot,
> >>>> +                                    MAP_COPY|MAP_FILE,
> >>>> +                                    fd, c->mapoff);
> >>>> +
> >>>> +               return (void *) map_start;
> >>>> +       }
> >>>> +       map_start_align = ALIGN_UP(map_start, alignment);
> >>>> +       map_end = map_start_align + maplength;
> >>>> +
> >>>> +       /* Remember which part of the address space this object uses.  */
> >>>> +       map_start_align = (ElfW(Addr)) __mmap ((void *) map_start_align, maplength,
> >>>> +                                    c->prot,
> >>>> +                                    MAP_COPY|MAP_FILE|MAP_FIXED,
> >>>> +                                    fd, c->mapoff);
> >>>> +       if (__glibc_unlikely ((void *) map_start_align == MAP_FAILED))
> >>>> +               return MAP_FAILED;
> >>>> +       if (map_start_align > map_start)
> >>>> +               __munmap((void *)map_start, map_start_align - map_start);
> >>>> +       __munmap((void *)map_end, map_start + maplen - map_end);
> >>>> +
> >>>> +       return (void *) map_start_align;
> >>>> +}
> >>>> +
> >>>
> >>> Please follow the glibc coding format.
> >>>
> >>>>    /* This implementation assumes (as does the corresponding implementation
> >>>>       of _dl_unmap_segments, in dl-unmap-segments.h) that shared objects
> >>>>       are always laid out with all segments contiguous (or with gaps
> >>>> @@ -52,11 +93,20 @@ _dl_map_segments (struct link_map *l, int fd,
> >>>>                                      c->mapstart & GLRO(dl_use_load_bias))
> >>>>               - MAP_BASE_ADDR (l));
> >>>>
> >>>> -      /* Remember which part of the address space this object uses.  */
> >>>> -      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
> >>>> +       /* During mapping, align the mapping address of the LOAD segments
> >>>> +          according to own p_align. This helps OS map its code segment to
> >>>> +          huge pages. */
> >>>> +       if (l->l_load_align > GLRO(dl_pagesize)) {
> >>>> +               l->l_map_start = (ElfW(Addr)) _dl_map_segments_align (c,
> >>>> +                                            mappref, fd,
> >>>> +                                            l->l_load_align, maplength);
> >>>> +       } else {
> >>>> +               /* Remember which part of the address space this object uses.  */
> >>>> +               l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
> >>>>                                                c->prot,
> >>>>                                                MAP_COPY|MAP_FILE,
> >>>>                                                fd, c->mapoff);
> >>>
> >>> Please follow the glibc coding format.
> >> OK
> >>
> >> Thanks.
> >>>
> >>>> +       }
> >>>>          if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
> >>>>            return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
> >>>>
> >>>> diff --git a/include/link.h b/include/link.h
> >>>> index aea268439c..fc6ce29fab 100644
> >>>> --- a/include/link.h
> >>>> +++ b/include/link.h
> >>>> @@ -298,6 +298,9 @@ struct link_map
> >>>>
> >>>>        /* Thread-local storage related info.  */
> >>>>
> >>>> +    /* Alignment requirement of the LOAD block.  */
> >>>> +    size_t l_load_align;
> >>>> +
> >>>>        /* Start of the initialization image.  */
> >>>>        void *l_tls_initimage;
> >>>>        /* Size of the initialization image.  */
> >>>> --
> >>>> 2.27.0
> >>>>
> >>>
> >>> Thanks.
> >>>
> >
> >
> >
Rongwei Wang Dec. 9, 2021, 1:43 a.m. UTC | #8
On 12/9/21 7:52 AM, H.J. Lu wrote:
> On Tue, Dec 7, 2021 at 7:05 PM Rongwei Wang
> <rongwei.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 12/8/21 10:33 AM, H.J. Lu wrote:
>>> On Tue, Dec 7, 2021 at 6:14 PM Rongwei Wang
>>> <rongwei.wang@linux.alibaba.com> wrote:
>>>>
>>>> Hi hjl
>>>>
>>>> On 12/6/21 10:48 PM, H.J. Lu wrote:
>>>>> On Fri, Dec 3, 2021 at 9:00 PM Rongwei Wang via Libc-alpha
>>>>> <libc-alpha@sourceware.org> wrote:
>>>>>>
>>>>>> Now, ld.so always map the LOAD segments and aligned by base
>>>>>> page size (e.g. 4k in x86 or 4k, 16k and 64k in arm64). And
>>>>>> this patch improve the scheme here. In this patch, ld.so
>>>>>> can align the mapping address of the first LOAD segment with
>>>>>> p_align when p_align is greater than the current base page
>>>>>> size.
>>>>>
>>>>> This is a bug fix.  Please open a glibc bug:
>>>>>
>>>>> https://sourceware.org/bugzilla/enter_bug.cgi
>>>> OK
>>>>
>>>> And I requesting the account.
>>>>>
>>>>> with a testcase which should align variables to 2MB in the main
>>>> By the way, I have a question about whether we need to align each LOAD
>>>> segments? In our patch, we only fixed the mapping address for the first
>>>> LOAD segment.
>>>
>>> I think the first one should be sufficient.   You can verify it with a
>>> 2MB aligned variable in PIE:
>>>
>>> [hjl@gnu-cfl-2 tmp]$ cat x.c
>>> #include <stdio.h>
>>>
>>> int foo  __attribute__((aligned(0x200000))) = 1;
>>>
>>> int
>>> main ()
>>> {
>>>     printf ("foo: %p\n", &foo);
>>> }
>>> [hjl@gnu-cfl-2 tmp]$ gcc -no-pie x.c
>>> [hjl@gnu-cfl-2 tmp]$ ./a.out
>>> foo: 0x800000
>>> [hjl@gnu-cfl-2 tmp]$ gcc x.c -fpie -pie
>>> [hjl@gnu-cfl-2 tmp]$ ./a.out
>>> foo: 0x55c529afe000
>>> [hjl@gnu-cfl-2 tmp]$
>>
>> Learned it!
> 
> I opened:
Thanks.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=28676
Hi, I saw that you reported a kernel bug about PIE:

https://bugzilla.kernel.org/show_bug.cgi?id=215275

I recall that our patchset also includes a fix related to this bug:

https://lore.kernel.org/linux-mm/20211009092658.59665-4-rongwei.wang@linux.alibaba.com/

Since glibc regards this issue as a bug, I can resend this patch
to the kernel mailing list and CC you.

Thanks.
> 
>> Thanks.
>>>
>>>>> program and a shared library.   Please include the testcase in
>>>>> your patch and mention the glibc bug in your commit message.
>>>>>
>>>>>> And this change makes code segments using huge pages become
>>>>>> simple and available.
>>>>>>
>>>>>> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
>>>>>> ---
>>>>>>     elf/dl-load.c         |  1 +
>>>>>>     elf/dl-map-segments.h | 54 +++++++++++++++++++++++++++++++++++++++++--
>>>>>>     include/link.h        |  3 +++
>>>>>>     3 files changed, 56 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/elf/dl-load.c b/elf/dl-load.c
>>>>>> index e39980fb19..136cfe2fa8 100644
>>>>>> --- a/elf/dl-load.c
>>>>>> +++ b/elf/dl-load.c
>>>>>> @@ -1154,6 +1154,7 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
>>>>>>              c->dataend = ph->p_vaddr + ph->p_filesz;
>>>>>>              c->allocend = ph->p_vaddr + ph->p_memsz;
>>>>>>              c->mapoff = ALIGN_DOWN (ph->p_offset, GLRO(dl_pagesize));
>>>>>> +          l->l_load_align = ph->p_align;
>>>>>>
>>>>>>              /* Determine whether there is a gap between the last segment
>>>>>>                 and this one.  */
>>>>>> diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
>>>>>> index ac9f09ab4c..ae03236045 100644
>>>>>> --- a/elf/dl-map-segments.h
>>>>>> +++ b/elf/dl-map-segments.h
>>>>>> @@ -18,6 +18,47 @@
>>>>>>
>>>>>>     #include <dl-load.h>
>>>>>>
>>>>>> +static __always_inline void *
>>>>>> +_dl_map_segments_align (const struct loadcmd *c,
>>>>>> +                   ElfW(Addr) mappref, int fd, size_t alignment,
>>>>>> +                   const size_t maplength)
>>>>>> +{
>>>>>> +       unsigned long map_start, map_start_align, map_end;
>>>>>> +       unsigned long maplen = (maplength >= alignment) ?
>>>>>> +                               (maplength + alignment) : (2 * alignment);
>>>>>> +
>>>>>> +       /* Allocate enough space to ensure that address aligned by
>>>>>> +           p_align is included. */
>>>>>> +       map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplen,
>>>>>> +                                    PROT_NONE,
>>>>>> +                                    MAP_ANONYMOUS | MAP_PRIVATE,
>>>>>> +                                    -1, 0);
>>>>>> +       if (__glibc_unlikely ((void *) map_start == MAP_FAILED)) {
>>>>>> +               /* If mapping a aligned address failed, then ... */
>>>>>> +               map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>>>>>> +                                    c->prot,
>>>>>> +                                    MAP_COPY|MAP_FILE,
>>>>>> +                                    fd, c->mapoff);
>>>>>> +
>>>>>> +               return (void *) map_start;
>>>>>> +       }
>>>>>> +       map_start_align = ALIGN_UP(map_start, alignment);
>>>>>> +       map_end = map_start_align + maplength;
>>>>>> +
>>>>>> +       /* Remember which part of the address space this object uses.  */
>>>>>> +       map_start_align = (ElfW(Addr)) __mmap ((void *) map_start_align, maplength,
>>>>>> +                                    c->prot,
>>>>>> +                                    MAP_COPY|MAP_FILE|MAP_FIXED,
>>>>>> +                                    fd, c->mapoff);
>>>>>> +       if (__glibc_unlikely ((void *) map_start_align == MAP_FAILED))
>>>>>> +               return MAP_FAILED;
>>>>>> +       if (map_start_align > map_start)
>>>>>> +               __munmap((void *)map_start, map_start_align - map_start);
>>>>>> +       __munmap((void *)map_end, map_start + maplen - map_end);
>>>>>> +
>>>>>> +       return (void *) map_start_align;
>>>>>> +}
>>>>>> +
>>>>>
>>>>> Please follow the glibc coding format.
>>>>>
>>>>>>     /* This implementation assumes (as does the corresponding implementation
>>>>>>        of _dl_unmap_segments, in dl-unmap-segments.h) that shared objects
>>>>>>        are always laid out with all segments contiguous (or with gaps
>>>>>> @@ -52,11 +93,20 @@ _dl_map_segments (struct link_map *l, int fd,
>>>>>>                                       c->mapstart & GLRO(dl_use_load_bias))
>>>>>>                - MAP_BASE_ADDR (l));
>>>>>>
>>>>>> -      /* Remember which part of the address space this object uses.  */
>>>>>> -      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>>>>>> +       /* During mapping, align the mapping address of the LOAD segments
>>>>>> +          according to own p_align. This helps OS map its code segment to
>>>>>> +          huge pages. */
>>>>>> +       if (l->l_load_align > GLRO(dl_pagesize)) {
>>>>>> +               l->l_map_start = (ElfW(Addr)) _dl_map_segments_align (c,
>>>>>> +                                            mappref, fd,
>>>>>> +                                            l->l_load_align, maplength);
>>>>>> +       } else {
>>>>>> +               /* Remember which part of the address space this object uses.  */
>>>>>> +               l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
>>>>>>                                                 c->prot,
>>>>>>                                                 MAP_COPY|MAP_FILE,
>>>>>>                                                 fd, c->mapoff);
>>>>>
>>>>> Please follow the glibc coding format.
>>>> OK
>>>>
>>>> Thanks.
>>>>>
>>>>>> +       }
>>>>>>           if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
>>>>>>             return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
>>>>>>
>>>>>> diff --git a/include/link.h b/include/link.h
>>>>>> index aea268439c..fc6ce29fab 100644
>>>>>> --- a/include/link.h
>>>>>> +++ b/include/link.h
>>>>>> @@ -298,6 +298,9 @@ struct link_map
>>>>>>
>>>>>>         /* Thread-local storage related info.  */
>>>>>>
>>>>>> +    /* Alignment requirement of the LOAD block.  */
>>>>>> +    size_t l_load_align;
>>>>>> +
>>>>>>         /* Start of the initialization image.  */
>>>>>>         void *l_tls_initimage;
>>>>>>         /* Size of the initialization image.  */
>>>>>> --
>>>>>> 2.27.0
>>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>
>>>
>>>
> 
> 
>

Patch

diff --git a/elf/dl-load.c b/elf/dl-load.c
index e39980fb19..136cfe2fa8 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1154,6 +1154,7 @@  _dl_map_object_from_fd (const char *name, const char *origname, int fd,
 	  c->dataend = ph->p_vaddr + ph->p_filesz;
 	  c->allocend = ph->p_vaddr + ph->p_memsz;
 	  c->mapoff = ALIGN_DOWN (ph->p_offset, GLRO(dl_pagesize));
+          l->l_load_align = ph->p_align;
 
 	  /* Determine whether there is a gap between the last segment
 	     and this one.  */
diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
index ac9f09ab4c..ae03236045 100644
--- a/elf/dl-map-segments.h
+++ b/elf/dl-map-segments.h
@@ -18,6 +18,47 @@ 
 
 #include <dl-load.h>
 
+static __always_inline void *
+_dl_map_segments_align (const struct loadcmd *c,
+                   ElfW(Addr) mappref, int fd, size_t alignment,
+                   const size_t maplength)
+{
+	unsigned long map_start, map_start_align, map_end;
+	unsigned long maplen = (maplength >= alignment) ?
+				(maplength + alignment) : (2 * alignment);
+
+	/* Allocate enough space to ensure that address aligned by
+           p_align is included. */
+	map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplen,
+                                    PROT_NONE,
+                                    MAP_ANONYMOUS | MAP_PRIVATE,
+                                    -1, 0);
+	if (__glibc_unlikely ((void *) map_start == MAP_FAILED)) {
+		/* If mapping a aligned address failed, then ... */
+		map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
+                                    c->prot,
+                                    MAP_COPY|MAP_FILE,
+                                    fd, c->mapoff);
+
+		return (void *) map_start;
+	}
+	map_start_align = ALIGN_UP(map_start, alignment);
+	map_end = map_start_align + maplength;
+
+	/* Remember which part of the address space this object uses.  */
+	map_start_align = (ElfW(Addr)) __mmap ((void *) map_start_align, maplength,
+                                    c->prot,
+                                    MAP_COPY|MAP_FILE|MAP_FIXED,
+                                    fd, c->mapoff);
+	if (__glibc_unlikely ((void *) map_start_align == MAP_FAILED))
+		return MAP_FAILED;
+	if (map_start_align > map_start)
+		__munmap((void *)map_start, map_start_align - map_start);
+	__munmap((void *)map_end, map_start + maplen - map_end);
+
+	return (void *) map_start_align;
+}
+
 /* This implementation assumes (as does the corresponding implementation
    of _dl_unmap_segments, in dl-unmap-segments.h) that shared objects
    are always laid out with all segments contiguous (or with gaps
@@ -52,11 +93,20 @@  _dl_map_segments (struct link_map *l, int fd,
                                   c->mapstart & GLRO(dl_use_load_bias))
            - MAP_BASE_ADDR (l));
 
-      /* Remember which part of the address space this object uses.  */
-      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
+	/* During mapping, align the mapping address of the LOAD segments
+ 	   according to own p_align. This helps OS map its code segment to
+	   huge pages. */
+	if (l->l_load_align > GLRO(dl_pagesize)) {
+		l->l_map_start = (ElfW(Addr)) _dl_map_segments_align (c,
+                                            mappref, fd,
+                                            l->l_load_align, maplength);
+	} else {
+		/* Remember which part of the address space this object uses.  */
+		l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
                                             c->prot,
                                             MAP_COPY|MAP_FILE,
                                             fd, c->mapoff);
+	}
       if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
         return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
 
diff --git a/include/link.h b/include/link.h
index aea268439c..fc6ce29fab 100644
--- a/include/link.h
+++ b/include/link.h
@@ -298,6 +298,9 @@  struct link_map
 
     /* Thread-local storage related info.  */
 
+    /* Alignment requirement of the LOAD block.  */
+    size_t l_load_align;
+
     /* Start of the initialization image.  */
     void *l_tls_initimage;
     /* Size of the initialization image.  */