[v4,0/7] malloc: Improve Huge Page support

Message ID 20210830185215.449572-1-adhemerval.zanella@linaro.org


Adhemerval Zanella Aug. 30, 2021, 6:52 p.m. UTC
Linux currently supports two ways to use Huge Pages: either by passing
specific flags directly to the syscall (MAP_HUGETLB for mmap(), or
SHM_HUGETLB for shmget()), or by using Transparent Huge Pages (THP),
where the kernel tries to move allocated anonymous pages to Huge
Page blocks transparently to the application.
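
For reference, the explicit route can be sketched as below.  This is only
an illustration of the MAP_HUGETLB flag, not code from the series; the
helper name map_explicit_huge is made up for the example, and the call is
expected to fail (returning NULL here) when the system has no huge pages
reserved.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of the patchset: try to map LEN bytes of
   anonymous memory backed by explicitly requested huge pages.  LEN is
   expected to be a multiple of the huge page size.  Returns NULL on
   failure, for instance when the huge page pool is empty.  */
static void *
map_explicit_huge (size_t len)
{
#ifdef MAP_HUGETLB
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
  return p == MAP_FAILED ? NULL : p;
#else
  /* No explicit huge page support on this target.  */
  (void) len;
  return NULL;
#endif
}
```

A caller has to be prepared for the NULL result, since the mapping only
succeeds if the administrator has reserved huge pages beforehand.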

Also, THP currently supports three different modes [1]: 'never', 'madvise',
and 'always'.  'never' is self-explanatory and 'always' enables
THP for all anonymous memory.  However, 'madvise' is still the default
on some systems, and in such cases THP is only used if the memory
range is explicitly advised by the program through a
madvise(MADV_HUGEPAGE) call.
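
To make the 'madvise' mode concrete, here is a small sketch.  It assumes
the usual sysfs layout, where /sys/kernel/mm/transparent_hugepage/enabled
holds a line such as "always [madvise] never" with the active mode in
brackets.  The names thp_mode, parse_thp_mode and map_with_thp_hint are
invented for this example and do not come from the patches.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

enum thp_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS, THP_UNKNOWN };

/* Hypothetical helper: parse the contents of the sysfs 'enabled' file,
   where the active mode is the bracketed word.  */
static enum thp_mode
parse_thp_mode (const char *s)
{
  if (strstr (s, "[never]") != NULL)
    return THP_NEVER;
  if (strstr (s, "[madvise]") != NULL)
    return THP_MADVISE;
  if (strstr (s, "[always]") != NULL)
    return THP_ALWAYS;
  return THP_UNKNOWN;
}

/* Hypothetical helper: map LEN bytes of anonymous memory and advise the
   kernel that the range may be backed by THP.  The hint is best effort,
   so a madvise failure is ignored.  Returns NULL if mmap itself fails.  */
static void *
map_with_thp_hint (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
#ifdef MADV_HUGEPAGE
  (void) madvise (p, len, MADV_HUGEPAGE);
#endif
  return p;
}
```

With the mode set to 'madvise', only ranges that received the hint are
candidates for THP; with 'always' the hint is redundant, and with 'never'
it has no effect.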

This patchset adds a new tunable to improve malloc() support for
Huge Pages, 'glibc.malloc.hugetlb', with the following supported values:

  - 0: default value, do not enable any huge page usage.

  - 1: instructs the system allocator to issue a madvise(MADV_HUGEPAGE)
    call after a mmap() for sizes larger than the default huge page size
    and on sbrk() calls to extend the program data segment.

  - 2 or larger: instructs the system allocator to round allocations to
    huge page sizes along with the required flags (MAP_HUGETLB for Linux).
    If the memory allocation fails, the default system page size is used
    instead.  A positive value larger than 2 sets a specific huge page
    size and the value is checked against the supported one by the

'glibc.malloc.hugetlb=2' aims to replace the 'morecore' callback
removed in 2.34 for libhugetlbfs (where the library tries to leverage
huge page usage rather than to provide a system allocator).  By
implementing the support directly in the mmap() code path there is
no need to try to emulate the morecore()/sbrk() semantics, which simplifies
the code and makes the memory shrink logic more straightforward.
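
The hugetlb=2 behaviour described above (round the request to the huge
page size, try MAP_HUGETLB, and fall back to the default page size on
failure) can be sketched roughly as follows.  This is a simplified
illustration and not the code in malloc/malloc.c; round_up_pow2 and
map_hugetlb_or_fallback are hypothetical names, and the page sizes are
passed in by the caller instead of being queried from the system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Round SIZE up to a multiple of ALIGN, which must be a power of two.
   Overflow is not handled in this sketch.  */
static size_t
round_up_pow2 (size_t size, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

/* Hypothetical helper: map at least SIZE bytes.  First try a MAP_HUGETLB
   mapping rounded up to HP_SIZE; if that fails, fall back to a regular
   mapping rounded up to PAGE_SIZE.  The mapped length is stored in
   *MAPPED.  Returns NULL if both attempts fail.  */
static void *
map_hugetlb_or_fallback (size_t size, size_t hp_size, size_t page_size,
                         size_t *mapped)
{
  void *p;
#ifdef MAP_HUGETLB
  size_t hlen = round_up_pow2 (size, hp_size);
  p = mmap (NULL, hlen, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
  if (p != MAP_FAILED)
    {
      *mapped = hlen;
      return p;
    }
#else
  (void) hp_size;
#endif
  /* Fallback path: default system page size, no huge page flag.  */
  size_t len = round_up_pow2 (size, page_size);
  p = mmap (NULL, len, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  *mapped = len;
  return p;
}
```

Recording the length actually mapped matters for the shrink and unmap
paths, because the caller cannot otherwise know which page size ended up
being used.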

I also ran a sniff test with SPECcpu2017 intspeed on a Ryzen 9
5900X machine using gcc 10.3 to compare glibc.malloc.hugetlb=0 and
glibc.malloc.hugetlb=2 (and THP set to 'madvise').  The improvement is
about 7.5% for hugetlb=2 (10.7 vs 11.5).

Changes from previous version:

  - Fixed the area shrink logic, where the pagesize was not updated.
  - Removed malloc/tst-free-errno* from the hugetlb2 test set, since it
    requires knowing which page size was used by the malloc call.
  - Added huge page support on the main arena.

Adhemerval Zanella (7):
  malloc: Add madvise support for Transparent Huge Pages
  malloc: Add THP/madvise support for sbrk
  malloc: Move mmap logic to its own function
  malloc: Add Huge Page support for mmap()
  malloc: Add huge page support to arenas
  malloc: Move MORECORE fallback mmap to sysmalloc_mmap_fallback
  malloc: Enable huge page support on main arena

 NEWS                                       |  11 +-
 Rules                                      |  36 +++
 elf/dl-tunables.list                       |   4 +
 elf/tst-rtld-list-tunables.exp             |   1 +
 include/libc-pointer-arith.h               |  10 +
 malloc/Makefile                            |  23 ++
 malloc/arena.c                             | 133 +++++---
 malloc/malloc-internal.h                   |   1 +
 malloc/malloc.c                            | 357 ++++++++++++++-------
 malloc/morecore.c                          |   2 -
 manual/tunables.texi                       |  16 +
 sysdeps/generic/Makefile                   |   8 +
 sysdeps/generic/malloc-hugepages.c         |  38 +++
 sysdeps/generic/malloc-hugepages.h         |  44 +++
 sysdeps/unix/sysv/linux/malloc-hugepages.c | 202 ++++++++++++
 15 files changed, 731 insertions(+), 155 deletions(-)
 create mode 100644 sysdeps/generic/malloc-hugepages.c
 create mode 100644 sysdeps/generic/malloc-hugepages.h
 create mode 100644 sysdeps/unix/sysv/linux/malloc-hugepages.c