| Message ID | 20260405035323.558335-1-wangrui@loongson.cn (mailing list archive) |
|---|---|
| Headers | From: WANG Rui <wangrui@loongson.cn>; To: libc-alpha@sourceware.org; Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>, Dev Jain <dev.jain@arm.com>, Florian Weimer <fweimer@redhat.com>, Wilco Dijkstra <Wilco.Dijkstra@arm.com>, Xi Ruoyao <xry111@xry111.site>, WANG Xuerui <git@xen0n.name>, caiyinyu <caiyinyu@loongson.cn>, mengqinggang <mengqinggang@loongson.cn>, Huacai Chen <chenhuacai@kernel.org>, hjl.tools@gmail.com; Subject: [PATCH v8 0/6] elf: THP-aware load segment alignment; Date: Sun, 5 Apr 2026 11:53:16 +0800 |
| Series | elf: THP-aware load segment alignment |
Message
WANG Rui
April 5, 2026, 3:53 a.m. UTC
OK for commit?
Changes since [v7]:
* Rename tunable glibc.elf.hugetlb to glibc.elf.thp.
* Rebase on current master.
Changes since [v6]:
* Move MAX_THP_PAGESIZE to hugepages.
* Skip THP mode probing when DL_MAP_DEFAULT_THP_PAGESIZE is non-zero.
Changes since [v5]:
* No functional changes.
* Add benchmark results to the commit message.
Changes since [v4]:
* Merge malloc-hugepages into hugepages.
* Limit the glibc.elf.hugetlb tunable maximum value to 1.
Changes since [v3]:
* Rebased on current master.
* Resolved conflicts with recently merged changes.
* No functional changes intended.
Changes since [v2]:
* Refactor THP detection into a new generic hugepages abstraction,
moving helpers out of malloc-hugepages.
* Add a new tunable, `glibc.elf.hugetlb`, to control THP-aware ELF
segment alignment.
* Move the Linux implementation of `_dl_map_segment_align` to
sysdeps/unix/sysv/linux/, making it generic for Linux (including
32-bit) and avoiding wordsize-64-specific overrides.
* Use `static inline` instead of `static __always_inline`.
Changes since [v1]:
* Fix CI build failure (-Wunused-function).
This patch series introduces a small extension point in the ELF loader to allow
architecture-specific adjustment of load segment alignment, and uses it to
improve Transparent Huge Page (THP) usage on Linux.
Patch 1 moves Transparent Huge Page helpers into a new generic hugepages
abstraction, so THP mode detection and default huge page size probing can be
shared between malloc and the dynamic loader. There is no functional change.
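As a rough illustration of the THP mode detection mentioned for Patch 1, the sketch below parses the contents of `/sys/kernel/mm/transparent_hugepage/enabled`, where the kernel marks the active mode with square brackets (for example `always [madvise] never`). The enum and function names are hypothetical and are not taken from the patch, and the sketch uses stdio for brevity, which an actual dynamic loader would likely avoid.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names for this sketch; the patch's own identifiers
   are not shown in the cover letter.  */
enum thp_mode
{
  THP_MODE_UNKNOWN,
  THP_MODE_ALWAYS,
  THP_MODE_MADVISE,
  THP_MODE_NEVER
};

/* Parse a line such as "always [madvise] never": the bracketed
   word is the mode currently in effect.  */
enum thp_mode
parse_thp_mode (const char *buf)
{
  if (strstr (buf, "[always]") != NULL)
    return THP_MODE_ALWAYS;
  if (strstr (buf, "[madvise]") != NULL)
    return THP_MODE_MADVISE;
  if (strstr (buf, "[never]") != NULL)
    return THP_MODE_NEVER;
  return THP_MODE_UNKNOWN;
}

/* Read the mode from sysfs; returns THP_MODE_UNKNOWN when the file
   is missing (for example on a kernel built without THP).  */
enum thp_mode
read_thp_mode (void)
{
  char buf[64] = "";
  FILE *f = fopen ("/sys/kernel/mm/transparent_hugepage/enabled", "r");
  if (f == NULL)
    return THP_MODE_UNKNOWN;
  if (fgets (buf, sizeof buf, f) == NULL)
    buf[0] = '\0';
  fclose (f);
  return parse_thp_mode (buf);
}
```

Only the `THP_MODE_ALWAYS` result corresponds to THP being "used unconditionally" in the sense of the Patch 5 description.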
Patch 2 removes a redundant declaration of `_dl_map_segments` from
dl-load.h, avoiding `-Wunused-function` build failures and keeping the
prototype colocated with its definition.
Patch 3 adds a new helper, `_dl_map_segment_align`, which is called when
determining the maximum alignment for ELF load segments. The generic
implementation is a no-op and preserves existing behavior.
Patch 4 introduces a new tunable, `glibc.elf.thp`, which controls
THP-aware alignment of ELF loadable segments. By default, the value is 0
and existing behavior is preserved.
Patch 5 provides a Linux implementation of `_dl_map_segment_align` that
opportunistically aligns large, suitably aligned, non-writable `PT_LOAD`
segments to the system’s default THP page size when THP is configured to
be used unconditionally and the tunable is enabled.
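The conditions listed for Patch 5 can be sketched as a small decision function. This is only an illustration under stated assumptions: the struct, the function name, and the parameter list are hypothetical; "large" is read as a memory size of at least one THP page; and "suitably aligned" is read as both the virtual address and the file offset being multiples of the THP size. The actual patch may use different tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as PF_W in <elf.h>; redefined so the sketch is
   self-contained.  */
#define SKETCH_PF_W 0x2

/* Hypothetical summary of one PT_LOAD program header.  */
struct segment_info
{
  uint64_t vaddr;   /* p_vaddr */
  uint64_t offset;  /* p_offset */
  uint64_t memsz;   /* p_memsz */
  uint32_t flags;   /* p_flags */
};

/* Return the alignment to use for SEG: either CUR_ALIGN unchanged,
   or THP_SIZE when every condition from the description holds.  */
uint64_t
sketch_segment_align (const struct segment_info *seg, uint64_t cur_align,
                      uint64_t thp_size, bool thp_always, bool tunable_on)
{
  /* Tunable disabled, THP not in "always" mode, or size unknown.  */
  if (!tunable_on || !thp_always || thp_size == 0)
    return cur_align;
  /* Already aligned at least as strictly.  */
  if (cur_align >= thp_size)
    return cur_align;
  /* Only non-writable segments are considered.  */
  if ((seg->flags & SKETCH_PF_W) != 0)
    return cur_align;
  /* Assumption: "large" means at least one THP page.  */
  if (seg->memsz < thp_size)
    return cur_align;
  /* Assumption: address and file offset are THP-size multiples.  */
  if (((seg->vaddr | seg->offset) & (thp_size - 1)) != 0)
    return cur_align;
  return thp_size;
}
```

Every early return leaves the existing alignment untouched, which matches the "opportunistic" wording: a segment that fails any test is loaded exactly as before.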
Patch 6 enables this behavior by default on LoongArch64 Linux and defines
the default THP page size (32MB), matching the architecture’s PMD huge page
geometry.
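The 32MB figure can be checked with a short calculation. The sketch below assumes 16 KiB base pages and 8-byte page table entries, which is not stated in the cover letter but is, to my knowledge, the usual LoongArch64 Linux configuration; the function name is hypothetical. One PMD-level entry covers a full last-level page table, so its span is the number of entries per table times the base page size.

```c
#include <stdint.h>

/* Span of one PMD-level mapping when each page table occupies
   exactly one base page.  Hypothetical helper for illustration.  */
uint64_t
pmd_huge_page_size (uint64_t base_page_size, uint64_t pte_size)
{
  uint64_t ptes_per_table = base_page_size / pte_size;
  return ptes_per_table * base_page_size;
}
```

With 16 KiB pages this gives 2048 entries times 16 KiB, that is 32 MiB; with 4 KiB pages the same formula gives the familiar 2 MiB.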
[v7]: https://sourceware.org/pipermail/libc-alpha/2026-March/175776.html
[v6]: https://sourceware.org/pipermail/libc-alpha/2026-March/175737.html
[v5]: https://sourceware.org/pipermail/libc-alpha/2026-March/175694.html
[v4]: https://sourceware.org/pipermail/libc-alpha/2026-March/175644.html
[v3]: https://sourceware.org/pipermail/libc-alpha/2026-February/175464.html
[v2]: https://sourceware.org/pipermail/libc-alpha/2026-February/175394.html
[v1]: https://sourceware.org/pipermail/libc-alpha/2026-February/175359.html
WANG Rui (6):
hugepages: Move THP helpers to generic hugepages abstraction
elf: Remove redundant _dl_map_segments declaration from dl-load.h
elf: Introduce _dl_map_segment_align hook for segment alignment tuning
tunables: Add glibc.elf.thp tunable for THP-aware segment alignment
elf: Align large load segments to PMD huge page size for THP
loongarch: Enable THP-aligned load segments by default on 64-bit
elf/dl-load.c | 4 ++
elf/dl-load.h | 5 +-
elf/dl-tunables.c | 7 +--
elf/dl-tunables.list | 8 +++
malloc/malloc-internal.h | 2 +-
malloc/malloc.c | 27 +++++-----
manual/tunables.texi | 24 +++++++++
sysdeps/generic/Makefile | 4 +-
sysdeps/generic/dl-map-segment-align.h | 26 +++++++++
.../{malloc-hugepages.c => hugepages.c} | 13 +++--
.../{malloc-hugepages.h => hugepages.h} | 32 ++++++-----
sysdeps/unix/sysv/linux/Makefile | 1 +
.../{malloc-hugepages.h => hugepages.h} | 4 +-
.../unix/sysv/linux/dl-map-segment-align.c | 53 +++++++++++++++++++
.../unix/sysv/linux/dl-map-segment-align.h | 27 ++++++++++
.../linux/{malloc-hugepages.c => hugepages.c} | 33 ++++++------
.../unix/sysv/linux/loongarch/cpu-features.c | 6 +++
.../loongarch/lp64/dl-map-segment-align.h | 22 ++++++++
18 files changed, 237 insertions(+), 61 deletions(-)
create mode 100644 sysdeps/generic/dl-map-segment-align.h
rename sysdeps/generic/{malloc-hugepages.c => hugepages.c} (76%)
rename sysdeps/generic/{malloc-hugepages.h => hugepages.h} (68%)
rename sysdeps/unix/sysv/linux/aarch64/{malloc-hugepages.h => hugepages.h} (91%)
create mode 100644 sysdeps/unix/sysv/linux/dl-map-segment-align.c
create mode 100644 sysdeps/unix/sysv/linux/dl-map-segment-align.h
rename sysdeps/unix/sysv/linux/{malloc-hugepages.c => hugepages.c} (89%)
create mode 100644 sysdeps/unix/sysv/linux/loongarch/lp64/dl-map-segment-align.h
Comments
On Sun, Apr 5, 2026 at 11:54 AM WANG Rui <wangrui@loongson.cn> wrote:
> OK for commit?
> [...]

Please add some tests for this feature.
On Tue, Apr 7, 2026 at 7:32 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Apr 5, 2026 at 11:54 AM WANG Rui <wangrui@loongson.cn> wrote:
> > OK for commit?
> > [...]
>
> Please add some tests for this feature.

I opened:

https://sourceware.org/bugzilla/show_bug.cgi?id=34056

for GNU_PROPERTY_1_NEEDED_TRANSPARENT_HUGEPAGE.
Hi HJ,

> I opened:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=34056
>
> for GNU_PROPERTY_1_NEEDED_TRANSPARENT_HUGEPAGE.

So what is the reasoning for this? Where will it be set and under which conditions?
It looks like adding a lot of startup overhead for a flag that would either never be
set if it is not the default or always set if it is the default. The whole point of THP
is that it is transparent, i.e. automatic, always enabled.

Cheers,
Wilco
On Thu, Apr 9, 2026 at 6:29 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> So what is the reasoning for this? Where will it be set and under which conditions?

A programmer can ask for it.

> It looks like adding a lot of startup overhead for a flag that would either never be
> set if it is not the default or always set if it is the default. The whole point of THP
> is that it is transparent, i.e. automatic, always enabled.

Is it always beneficial to load every program, big or small,
performance sensitive or non-sensitive, with THP? If it is
the case, why bother with a tunable?
Hi HJ,

> Is it always beneficial to load every program, big or small,
> performance sensitive or non-sensitive, with THP? If it is
> the case, why bother with a tunable?

Linux reads much larger blocks from the filesystem, so yes,
it's beneficial if the binary is large enough and you have THP.
Small binaries can't really benefit.

Tunables are almost for free, it's just checking a global
variable, not a system call.

Cheers,
Wilco
On Thu, Apr 9, 2026 at 9:30 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Linux reads much larger blocks from the filesystem, so yes,
> it's beneficial if the binary is large enough and you have THP.
> Small binaries can't really benefit.
>
> Tunables are almost for free, it's just checking a global
> variable, not a system call.

The end-user still needs to set the tunable. It is beyond
the programmer's control. With a property bit, the developer
can set it on the binary. If needed, it can be set in one of
the crt files.
Hi HJ,

> The end-user still needs to set the tunable. It is beyond
> the programmer's control. With a property bit, the developer
> can set it on the binary. If needed, it can be set in one of
> crt files.

The goal is to enable this by default on modern 64-bit targets given the extra
alignment is not harmful (only a minor reduction of ASLR bits when PMD size is 2MB).

All this configurability is insane - nobody is ever going to figure out which settings are
available, how to modify each setting, let alone which combination gives the best
performance on every application on a particular machine... Users just want stuff to
work well out of the box. Call them lazy if you like.

Cheers,
Wilco
On Fri, Apr 10, 2026 at 5:40 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> The goal is to enable this by default on modern 64-bit targets given the extra
> alignment is not harmful (only a minor reduction of ASLR bits when PMD size is 2MB).
>
> All this configurability is insane - nobody is ever going to figure out which settings are
> available, how to modify each setting, let alone which combination gives the best
> performance on every application on a particular machine... Users just want stuff to
> work well out of the box. Call them lazy if you like.

Why a tunable at all? Users may not set it.
Hi HJ,
> Why a tunable at all? Users may not set it.
If it is already the default, users never need to set it. It makes benchmarking
simpler at a cost that is ~10000x lower than checking the huge page size.
Cheers,
Wilco
On Fri, Apr 10, 2026 at 7:33 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> If it is already the default, users never need to set it. It makes benchmarking
> simpler at a cost that is ~10000x lower than checking the huge page size.

Even if THP isn't enabled by default, as in 32-bit mode, we may still
want to enable THP on specific 32-bit applications. A property bit provides
such flexibility without requiring the user to set a tunable at run-time.
On Fri, Apr 10, 2026 at 9:26 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> Even if THP isn't enabled by default, as in 32-bit mode, we may still
> want to enable THP on specific 32-bit applications. A property bit provides
> such flexibility without requiring the user to set a tunable at run-time.

Do we really need a flag in the executable file to hint the loader to
cooperate with THP? If we already have the ability to modify the
executable, why not just make the LOAD segments hugepage-aligned in
the first place?

As you can see, this optimization is currently opt-in on most arches
rather than enabled by default. I see the tunable as more of a
transitional mechanism: it simply gives users a way to turn the
optimization on, and on arches where it is enabled by default, it can
also be used to turn it off.

Thanks,
Rui
On Fri, Apr 10, 2026 at 10:57 AM WANG Rui <wangrui@loongson.cn> wrote:
> Do we really need a flag in the executable file to hint the loader to
> cooperate with THP? If we already have the ability to modify the
> executable, why not just make the LOAD segments hugepage-aligned in
> the first place?
>
> As you can see, this optimization is currently opt-in on most arches
> rather than enabled by default. I see the tunable as more of a
> transitional mechanism: it simply gives users a way to turn the
> optimization on, and on arches where it is enabled by default, it can
> also be used to turn it off.

This all-or-nothing approach is my main concern. A programmer
may want to enable THP on an application even if THP isn't enabled
by default. The overhead of checking a bit in the l_1_needed field
is minimal.
On Fri, Apr 10, 2026 at 11:35 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> This all-or-nothing approach is my main concern. A programmer
> may want to enable THP on an application even if THP isn't enabled
> by default. The overhead of checking a bit in the l_1_needed field
> is minimal.

In earlier discussions, there was mention of memory pressure caused by
THP, which is really a runtime system concern. When THP is not enabled
by default on the system, I'm a bit unsure whether it's a good idea
for a programmer's decision to override the system default.

Ideally, the executable itself would take care of hugepage-friendly
alignment, the loader would make sure the mappings land on
hugepage-aligned virtual addresses, and the final decision on whether
to use huge pages would be left solely to THP. It has visibility into
the system's dynamic state, and users would only need to deal with
this single control point, making it relatively straightforward to
turn it on or off.

Thanks,
Rui
On Fri, Apr 10, 2026 at 11:58 AM WANG Rui <wangrui@loongson.cn> wrote:
> In earlier discussions, there was mention of memory pressure caused by
> THP, which is really a runtime system concern. When THP is not enabled
> by default on the system, I'm a bit unsure whether it's a good idea
> for a programmer's decision to override the system default.
>
> Ideally, the executable itself would take care of hugepage-friendly
> alignment, the loader would make sure the mappings land on
> hugepage-aligned virtual addresses, and the final decision on whether
> to use huge pages would be left solely to THP. It has visibility into
> the system's dynamic state, and users would only need to deal with
> this single control point, making it relatively straightforward to
> turn it on or off.

What a property bit provides is flexibility.