[v2] s390: Do not use canonical PLT if pointer equality is not needed
Checks
| Context | Check | Description |
| linaro-tcwg-bot/tcwg_binutils_build--master-arm | success | Build passed |
| linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 | success | Build passed |
| linaro-tcwg-bot/tcwg_binutils_check--master-arm | fail | Test failed |
| linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 | fail | Test failed |
Commit Message
Require pointer equality in executables for symbols with non-PLT
PC-relative relocations, that are likely in address taken context, and
direct relocations, that are likely in function reference context. Do so
for IFUNC symbols defined in a non-shared object. Clear value of PLT
undefined symbols if pointer equality is not needed and do not hash them
in '.gnu.hash' section.
Linkers need to make this distinction in order to decide whether to ask
the dynamic linker to use canonical PLT entries. Normally GOT entries
in shared libraries contain addresses of the respective functions, with
one notable exception: when a PDE (Position-Dependent Executable;
no-PIE) calls the respective function and also takes its address. Such
executables assume that all addresses are known in advance, so they use
addresses of the respective PLT entries. For consistency reasons, all
respective GOT entries in the process must also use them.
When a linker sees that a PDE both calls a function and also takes its
address, it creates a PLT entry and asks the dynamic linker to consider
it canonical by setting the respective undefined symbol's value, which
is normally zero, to the address of this executable PLT entry.
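The rule above can be sketched in a few lines of C. This is only an
illustration of the decision this patch adds to
elf_s390_finish_dynamic_symbol; the function name and its parameters are
hypothetical and are not part of BFD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a BFD function.  Return the st_value to emit
   for an undefined dynamic symbol that has a PLT entry in the
   executable: the PLT entry address if that entry must be treated as
   canonical (pointer equality needed), otherwise zero.  */
uint64_t
undef_plt_sym_value (bool pointer_equality_needed, uint64_t plt_entry_addr)
{
  return pointer_equality_needed ? plt_entry_addr : 0;
}
```

With a zero value the dynamic linker resolves the symbol to the function
itself, so GOT entries in shared libraries keep pointing at the real
function address.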
As a workaround for GCC 12-14, treat a PLT32DBL relocation for the
address taking instruction "larl rX,<sym>@PLT" as if it were written
without the @PLT suffix and require pointer equality.
GCC 12-14, since GCC commit 0990d93dd8a4 ("IBM Z: Use @PLT symbols for
local functions in 64-bit mode") [1], unconditionally suffix non-local
symbols with @PLT, regardless of whether they are used in function call
instructions (i.e. brasl) or address taking instructions (i.e. larl).
The assembler therefore generates a PLT32DBL instead of a PC32DBL
relocation for larl. The linker therefore cannot distinguish between
function call and address taking instructions solely from the relocation
type, although only the latter require pointer equality.
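Since the relocation type alone is not enough, the workaround looks at
the instruction bytes. A minimal stand-alone sketch of that check,
mirroring the mask and opcode used in the patch (0xff0f and 0xc000); the
function name and the explicit size parameter are assumptions made for
this illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a BFD function.  Return true if the
   relocation at RELOC_OFFSET in CONTENTS (SIZE bytes) applies to a larl
   instruction.  The relocated 32-bit field starts 2 bytes into the
   6-byte instruction.  larl and brasl share the first opcode byte 0xc0
   and differ in the low nibble of the second byte (0x0 for larl, 0x5
   for brasl); the high nibble holds the register.  */
bool
reloc_targets_larl (const uint8_t *contents, size_t size, size_t reloc_offset)
{
  if (reloc_offset < 2 || reloc_offset > size)
    return false;

  const uint8_t *insn = contents + reloc_offset - 2;
  /* s390 is big-endian: first byte is the most significant.  */
  uint16_t op = (uint16_t) ((insn[0] << 8) | insn[1]);

  /* Mask out the register nibble before comparing.  */
  return (op & 0xff0f) == 0xc000;
}
```

For example, the bytes c0 10 ... (larl %r1) match, while c0 e5 ...
(brasl %r14) do not.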
This complements GCC commit a2e0a30c52fa ("IBM zSystems: Do not use
@PLT with larl") [2], which makes GCC stop suffixing @PLT to address
taking larl instructions, so that the correct behavior with regard to
pointer equality is also achieved with the affected GCC 12-14.
Note that this workaround can be reverted once GCC 12-14 emitting
address taking larl instructions with @PLT suffix have become
irrelevant.
Note that without the workaround for GCC 12-14 suffixing @PLT to larl
the following linker tests would fail:
FAIL: shared
FAIL: visibility (hidden_normal)
FAIL: visibility (hidden_weak)
FAIL: visibility (protected)
FAIL: visibility (protected_undef_def)
FAIL: visibility (protected_weak)
FAIL: visibility (normal)
Based on x86-64, especially Jakub Jelinek's x86 commits 47a9f7b34f7a
(clearing value of PLT undefined symbols if pointer equality not needed)
and fdc90cb46b0f (omitting PLT undefined symbols from '.gnu.hash').
[1] GCC commit 0990d93dd8a4 ("IBM Z: Use @PLT symbols for local
functions in 64-bit mode"),
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0990d93dd8a4
[2] GCC commit a2e0a30c52fa ("IBM zSystems: Do not use @PLT with larl"),
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a2e0a30c52fa
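The '.gnu.hash' part of the change boils down to one predicate. The
following sketch restates the condition from elf_s390_hash_symbol with
plain booleans; the function name and parameters are hypothetical, and
the real function falls back to _bfd_elf_hash_symbol, which may exclude
a symbol for other reasons.

```c
#include <stdbool.h>

/* Hypothetical helper, not a BFD function.  Return true if a symbol is
   left out of .gnu.hash because it is an undefined symbol with a PLT
   entry that does not need pointer equality.  Such a symbol gets a zero
   value, so no other object should ever resolve to it.  */
bool
omit_plt_undef_from_gnu_hash (bool has_plt_entry, bool def_regular,
                              bool pointer_equality_needed)
{
  return has_plt_entry && !def_regular && !pointer_equality_needed;
}
```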
bfd/
PR ld/29655
* elf64-s390.c (elf_s390_check_relocs): Require pointer equality
for direct and non-PLT PC-relative relocations indicating
address taking instructions and for PLT32DBL relocations, when
used with address taking larl instruction.
(elf_s390_finish_dynamic_symbol): Do not use canonical PLT for
non-local undefined symbols if pointer equality is not needed.
Abort if pointer equality needed flag not set although required.
(elf_s390_copy_indirect_symbol): Copy pointer equality needed
flag.
(elf_s390_hash_symbol): New function. Based on x86-64.
(elf_backend_hash_symbol): Wire up elf_s390_hash_symbol.
ld/testsuite/
PR ld/29655
* ld-elf/shared.exp: Add new pr29655 test.
* ld-elf/pr29655a.c: New file. Based on Rui's sample in PR.
* ld-elf/pr29655b.c: Likewise.
* ld-s390/plt_64-1.wf: Adjust expected test output to change in
.gnu.hash due to omitted PLT undefined symbols that do not need
pointer equality.
* ld-s390/plt_64-1_eh.wf: Likewise.
Bug: https://sourceware.org/PR29655
Co-authored-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
Notes (jremus):
Changes in v2:
- Reword commit message to reflect that function pointer equality is
required for all direct and non-PLT PC-relative relocations and to
mention which tests fail without the GCC 12-14 workaround.
- Fix typo in comment on GCC 12-14 workaround. (Andreas)
- Have main() return 0 in test. (Andreas)
- Adding Nick, Alan, and Jan due to the added common test case.
Note: Splitting the GCC 12-14 workaround into a separate patch either
requires it to be the first patch or causes one shared and multiple
visibility tests to fail. Addressing the latter using a setup_fail
condition as follows (that would get removed by the workaround-patch)
seemed rather odd to me:
    # On s390 64-bit (s390x) GCC 12-14 suffix symbols in address
    # taken context with @PLT, which breaks function pointer equality.
    if { [istarget s390x-*-linux*]
         && [at_least_gcc_version 12 0]
         && ![at_least_gcc_version 15 0] } {
        setup_xfail "s390x-*-linux*"
    }
bfd/elf64-s390.c | 66 ++++++++++++++++++++++++++++-
ld/testsuite/ld-elf/pr29655a.c | 20 +++++++++
ld/testsuite/ld-elf/pr29655b.c | 15 +++++++
ld/testsuite/ld-elf/shared.exp | 30 +++++++++++++
ld/testsuite/ld-s390/plt_64-1.wf | 8 ++--
ld/testsuite/ld-s390/plt_64-1_eh.wf | 2 +-
6 files changed, 134 insertions(+), 7 deletions(-)
create mode 100644 ld/testsuite/ld-elf/pr29655a.c
create mode 100644 ld/testsuite/ld-elf/pr29655b.c
Comments
On 3/16/2026 6:07 PM, Jens Remus wrote:
> Require pointer equality in executables for symbols with non-PLT
> PC-relative relocations, that are likely in address taken context, and
> direct relocations, that are likely in function reference context. Do so
> for IFUNC symbols defined in a non-shared object. Clear value of PLT
> undefined symbols if pointer equality is not needed and do not hash them
> in '.gnu.hash' section.
> ld/testsuite/
> PR ld/29655
> * ld-elf/shared.exp: Add new pr29655 test.
> * ld-elf/pr29655a.c: New file. Based on Rui's sample in PR.
> * ld-elf/pr29655b.c: Likewise.
> * ld-s390/plt_64-1.wf: Adjust expected test output to change in
> .gnu.hash due to omitted PLT undefined symbols that do not need
> pointer equality.
> * ld-s390/plt_64-1_eh.wf: Likewise.
>
> Bug: https://sourceware.org/PR29655
> Co-authored-by: Andreas Krebbel <krebbel@linux.ibm.com>
> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
> diff --git a/ld/testsuite/ld-elf/pr29655a.c b/ld/testsuite/ld-elf/pr29655a.c
> new file mode 100644
> @@ -0,0 +1,20 @@
> +#include <stdio.h>
> +
> +typedef void Fn();
> +
> +void __attribute__((visibility("hidden")))
> +fun (void)
> +{}
> +
> +extern void fun_public() __attribute__((alias("fun")));
> +
> +void
> +call_callback (Fn *callback)
> +{
> + if (callback == fun)
> + printf("PASS\n");
> + else
> + printf("FAIL\n");
> +
> + callback ();
> +}
> diff --git a/ld/testsuite/ld-elf/pr29655b.c b/ld/testsuite/ld-elf/pr29655b.c
> new file mode 100644
> @@ -0,0 +1,15 @@
> +#ifndef __PIC__
> +#error "this file must be compiled with -fPIC"
> +#endif
> +
> +typedef void Fn();
> +void fun_public(void);
> +void call_callback(Fn *callback);
> +
> +int
> +main ()
> +{
> + fun_public ();
> + call_callback (fun_public);
> + return 0;
> +}
> diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
> @@ -1863,3 +1863,33 @@ run_ld_link_tests [list \
> "pr23658-2" \
> ] \
> ]
> +
> +# PR 29655
> +run_cc_link_tests [list \
> + [list \
> + "Build pr29655.so" \
> + "-shared" \
> + "-fPIC" \
> + { pr29655a.c } \
> + {} \
> + "pr29655.so" \
> + ] \
> +]
> +# PR 29655 (cont.): Check that in PIC code linked as PDE taking the address
> +# of a function defined in a DSO results in the function address (from GOT)
> +# and not the "canonical PLT" address from the PDE.
> +# This is just an optimization and both is valid, although libraries may
> +# depend on this specific behavior, so do not complain loudly.
> +setup_xfail *-*-*
> +clear_xfail i?86-*-* powerpc*-*-* s390x-*-* x86_64-*-*
This needs arm*-*-* aarch64-*-* to be added according to the
Linaro-TCWG-CI.
Nick, Alan, Jan, any thoughts on whether adding a common test makes
sense or whether I should rather make this an s390-specific test (in
ld-s390)? Given that this test needs to be run, I cannot verify it for
other targets.
> +run_ld_link_exec_tests [list \
> + [list \
> + "Run pr29655" \
> + "$NOPIE_LDFLAGS -Wl,--no-as-needed,-rpath,tmpdir tmpdir/pr29655.so" \
> + "" \
> + { pr29655b.c } \
> + "pr29655" \
> + "pass.out" \
> + "-fPIC" \
> + ] \
> +]
Thanks and regards,
Jens
On Mon, Mar 16, 2026 at 06:07:47PM +0100, Jens Remus wrote:
> +# PR 29655 (cont.): Check that in PIC code linked as PDE taking the address
> +# of a function defined in a DSO results in the function address (from GOT)
> +# and not the "canonical PLT" address from the PDE.
> +# This is just an optimization and both is valid, although libraries may
> +# depend on this specific behavior, so do not complain loudly.
> +setup_xfail *-*-*
> +clear_xfail i?86-*-* powerpc*-*-* s390x-*-* x86_64-*-*
> +run_ld_link_exec_tests [list \
> + [list \
> + "Run pr29655" \
> + "$NOPIE_LDFLAGS -Wl,--no-as-needed,-rpath,tmpdir tmpdir/pr29655.so" \
> + "" \
> + { pr29655b.c } \
> + "pr29655" \
> + "pass.out" \
> + "-fPIC" \
> + ] \
> +]
A test that needs to be run is inferior to a test that compiles and
examines an object, because the former only works natively. I think
you can likely turn this into a run_cc_link_tests that checks the
executable with readelf --dyn-sym looking for a zero value undefined
fun_public.
eg. this (s390x)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fun_public
Not this (s390), which is defining the sym on a plt stub (via ld.so
magic) and would therefore fail when run.
8: 00400424 0 FUNC GLOBAL DEFAULT UND fun_public
For the cross-toolchain targets that I test, it looks like
aarch64-linux, alpha-linux, arm-linux, hppa-linux, ia64-linux,
microblaze-linux, mips-linux, mips64-linux-gnuabi64,
powerpc64le-linux, powerpc64-linux, powerpc-linux, s390x-linux and
x86_64-pc-linux will pass.
m68k-linux, riscv64-linux, s390-linux, sh4-linux, sparc64-linux and
tilepro-linux will fail.
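As an illustration only (not part of the patch or of the binutils test
suite, which would match the dump with a regexp instead), the check
described above amounts to the following. The function name is
hypothetical, and the parsing assumes the column layout of the two
sample lines above, without a symbol version suffix.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper.  Return true if LINE, one row of
   "readelf --dyn-sym" output, describes an undefined symbol NAME whose
   value is zero.  */
bool
is_zero_value_undef (const char *line, const char *name)
{
  unsigned int num;
  unsigned long long value, size;
  char type[16], bind[16], vis[16], ndx[16], sym[128];

  /* Columns: Num, Value (hex), Size, Type, Bind, Vis, Ndx, Name.  */
  if (sscanf (line, " %u: %llx %llu %15s %15s %15s %15s %127s",
              &num, &value, &size, type, bind, vis, ndx, sym) != 8)
    return false;

  return value == 0 && strcmp (ndx, "UND") == 0 && strcmp (sym, name) == 0;
}
```

Applied to the two sample lines, the s390x line is accepted and the s390
line, whose value is the PLT stub address, is rejected.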
On 3/17/2026 12:22 PM, Alan Modra wrote:
> On Mon, Mar 16, 2026 at 06:07:47PM +0100, Jens Remus wrote:
>> +# PR 29655 (cont.): Check that in PIC code linked as PDE taking the address
>> +# of a function defined in a DSO results in the function address (from GOT)
>> +# and not the "canonical PLT" address from the PDE.
>> +# This is just an optimization and both is valid, although libraries may
>> +# depend on this specific behavior, so do not complain loudly.
>> +setup_xfail *-*-*
>> +clear_xfail i?86-*-* powerpc*-*-* s390x-*-* x86_64-*-*
>> +run_ld_link_exec_tests [list \
>> + [list \
>> + "Run pr29655" \
>> + "$NOPIE_LDFLAGS -Wl,--no-as-needed,-rpath,tmpdir tmpdir/pr29655.so" \
>> + "" \
>> + { pr29655b.c } \
>> + "pr29655" \
>> + "pass.out" \
>> + "-fPIC" \
>> + ] \
>> +]
>
> A test that needs to be run is inferior to a test that compiles and
> examines an object, because the former only works natively. I think
> you can likely turn this into a run_cc_link_tests that checks the
> executable with readelf --dyn-sym looking for a zero value undefined
> fun_public.
> eg. this (s390x)
> 6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fun_public
Great suggestion! I'll send a v3.
I'll see whether I can use Fangrui Song's cross-build/test script [1]
as base to run some tests for other targets myself.
[1]: https://inbox.sourceware.org/binutils/CAN30aBFaYE4CLUcxGk61Qc4+2K7yDPEDyhbKb7QBEJ-T4ENrxg@mail.gmail.com/
> Not this (s390), which is defining the sym on a plt stub (via ld.so
> magic) and would therefore fail when run.
> 8: 00400424 0 FUNC GLOBAL DEFAULT UND fun_public
Andreas and I are aware that I fixed the issue for s390 64-bit (s390x)
only.
s390 32-bit (s390) is quite obsolete: Linux kernel 6.19 (February 2026)
removed 32-bit compat support on s390 [2] and is thus s390 64-bit
(s390x) only, and Glibc 2.43 (January 2026) deprecated s390 32-bit with
the intent to remove it in Glibc 2.44 [3].
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e0b986c59c6
[3]: https://inbox.sourceware.org/libc-alpha/52c31093-9750-42ab-a830-134b3ffa2d4c@linux.ibm.com/T/#u
> For the cross-toolchain targets that I test, it looks like
> aarch64-linux, alpha-linux, arm-linux, hppa-linux, ia64-linux,
> microblaze-linux, mips-linux, mips64-linux-gnuabi64,
> powerpc64le-linux, powerpc64-linux, powerpc-linux, s390x-linux and
> x86_64-pc-linux will pass.
Thanks! I have added those to the clear-xfail list.
> m68k-linux, riscv64-linux, s390-linux, sh4-linux, sparc64-linux and
> tilepro-linux will fail.
Regards,
Jens
On 3/17/2026 5:28 PM, Jens Remus wrote:
> On 3/17/2026 12:22 PM, Alan Modra wrote:
>> On Mon, Mar 16, 2026 at 06:07:47PM +0100, Jens Remus wrote:
>> A test that needs to be run is inferior to a test that compiles and
>> examines an object, because the former only works natively. I think
>> you can likely turn this into a run_cc_link_tests that checks the
>> executable with readelf --dyn-sym looking for a zero value undefined
>> fun_public.
>> eg. this (s390x)
>> 6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fun_public
>
> Great suggestion! I'll send a v3.
>
> I'll see whether I can use Fangrui Song's cross-build/test script [1]
> as base to run some tests for other targets myself.
>
> [1]: https://inbox.sourceware.org/binutils/CAN30aBFaYE4CLUcxGk61Qc4+2K7yDPEDyhbKb7QBEJ-T4ENrxg@mail.gmail.com/
I did not think this through. run_cc_link_tests obviously requires a
cross C compiler, which I would have to build as well for every target.
In addition, ld-elf/shared.exp checks for the availability of a C++
compiler right in the middle and stops further testing if none is found,
which silently skips all following tests (whether or not they actually
require C++).
So my own testing is still very limited.
Regards,
Jens
diff --git a/bfd/elf64-s390.c b/bfd/elf64-s390.c
@@ -852,6 +852,7 @@ elf_s390_copy_indirect_symbol (struct bfd_link_info *info,
dir->ref_regular |= ind->ref_regular;
dir->ref_regular_nonweak |= ind->ref_regular_nonweak;
dir->needs_plt |= ind->needs_plt;
+ dir->pointer_equality_needed |= ind->pointer_equality_needed;
}
else
_bfd_elf_link_hash_copy_indirect (info, dir, ind);
@@ -1037,6 +1038,7 @@ elf_s390_check_relocs (bfd *abfd,
referenced. */
h->ref_regular = 1;
h->needs_plt = 1;
+ h->pointer_equality_needed = 1;
}
}
@@ -1078,6 +1080,38 @@ elf_s390_check_relocs (bfd *abfd,
{
h->needs_plt = 1;
h->plt.refcount += 1;
+
+ /* GCC 12-14 unconditionally suffix non-local symbols
+ with @PLT, regardless of whether they are used in
+ function call instructions (i.e. brasl) or address
+ taking instructions (i.e. larl). Treat PLT32DBL
+ relocation for "larl rX,<sym>@PLT" instruction as
+ address taking and require pointer equality. */
+ if (bfd_link_executable (info)
+ && r_type == R_390_PLT32DBL
+ && rel->r_offset >= 2)
+ {
+ bfd_byte *contents;
+ void *insn_start;
+ uint16_t op;
+
+ if (elf_section_data (sec)->this_hdr.contents != NULL)
+ contents = elf_section_data (sec)->this_hdr.contents;
+ else if (!_bfd_elf_mmap_section_contents (abfd, sec, &contents))
+ return false;
+
+ insn_start = contents + rel->r_offset - 2;
+ op = bfd_get_16 (abfd, insn_start) & 0xff0f;
+
+ if (op == 0xc000)
+ {
+ /* larl rX,<sym>@PLT */
+ h->pointer_equality_needed = 1;
+ }
+
+ if (elf_section_data (sec)->this_hdr.contents != contents)
+ _bfd_elf_munmap_section_contents (sec, contents);
+ }
}
break;
@@ -1227,6 +1261,12 @@ elf_s390_check_relocs (bfd *abfd,
refers to is in a shared lib. */
h->plt.refcount += 1;
}
+
+ /* Require pointer equality in PDE for above PC-relative
+ relocations, that are likely in address taken context,
+ and direct relocations, that are likely in function
+ reference context. */
+ h->pointer_equality_needed = 1;
}
/* If we are creating a shared library, and this is a reloc
@@ -3692,11 +3732,16 @@ elf_s390_finish_dynamic_symbol (bfd *output_bfd,
if (!h->def_regular)
{
/* Mark the symbol as undefined, rather than as defined in
- the .plt section. Leave the value alone. This is a clue
+ the .plt section. Leave the value if there were any
+ relocations where pointer equality matters (this is a clue
for the dynamic linker, to make function pointer
comparisons work between an application and shared
- library. */
+ library), otherwise set it to zero. If a function is only
+ called from a binary, there is no need to slow down
+ shared libraries because of that. */
sym->st_shndx = SHN_UNDEF;
+ if (!h->pointer_equality_needed)
+ sym->st_value = 0;
}
}
}
@@ -3730,6 +3775,9 @@ elf_s390_finish_dynamic_symbol (bfd *output_bfd,
}
else
{
+ if (!h->pointer_equality_needed)
+ abort ();
+
/* For non-shared objects explicit GOT slots must be
filled with the PLT slot address for pointer
equality reasons. */
@@ -4344,6 +4392,19 @@ elf_s390_create_dynamic_sections (bfd *dynobj,
return true;
}
+/* Return TRUE if symbol should be hashed in the `.gnu.hash' section. */
+
+static bool
+elf_s390_hash_symbol (struct elf_link_hash_entry *h)
+{
+ if (h->plt.offset != (bfd_vma) -1
+ && !h->def_regular
+ && !h->pointer_equality_needed)
+ return false;
+
+ return _bfd_elf_hash_symbol (h);
+}
+
/* Why was the hash table entry size definition changed from
ARCH_SIZE/8 to 4? This breaks the 64 bit dynamic linker and
this is the only reason for the s390_elf64_size_info structure. */
@@ -4424,6 +4485,7 @@ static const struct elf_size_info s390_elf64_size_info =
#define elf_backend_sort_relocs_p elf_s390_elf_sort_relocs_p
#define elf_backend_additional_program_headers elf_s390_additional_program_headers
#define elf_backend_modify_segment_map elf_s390_modify_segment_map
+#define elf_backend_hash_symbol elf_s390_hash_symbol
#define bfd_elf64_mkobject elf_s390_mkobject
#define elf_backend_object_p elf_s390_object_p
diff --git a/ld/testsuite/ld-elf/pr29655a.c b/ld/testsuite/ld-elf/pr29655a.c
new file mode 100644
@@ -0,0 +1,20 @@
+#include <stdio.h>
+
+typedef void Fn();
+
+void __attribute__((visibility("hidden")))
+fun (void)
+{}
+
+extern void fun_public() __attribute__((alias("fun")));
+
+void
+call_callback (Fn *callback)
+{
+ if (callback == fun)
+ printf("PASS\n");
+ else
+ printf("FAIL\n");
+
+ callback ();
+}
diff --git a/ld/testsuite/ld-elf/pr29655b.c b/ld/testsuite/ld-elf/pr29655b.c
new file mode 100644
@@ -0,0 +1,15 @@
+#ifndef __PIC__
+#error "this file must be compiled with -fPIC"
+#endif
+
+typedef void Fn();
+void fun_public(void);
+void call_callback(Fn *callback);
+
+int
+main ()
+{
+ fun_public ();
+ call_callback (fun_public);
+ return 0;
+}
diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
@@ -1863,3 +1863,33 @@ run_ld_link_tests [list \
"pr23658-2" \
] \
]
+
+# PR 29655
+run_cc_link_tests [list \
+ [list \
+ "Build pr29655.so" \
+ "-shared" \
+ "-fPIC" \
+ { pr29655a.c } \
+ {} \
+ "pr29655.so" \
+ ] \
+]
+# PR 29655 (cont.): Check that in PIC code linked as PDE taking the address
+# of a function defined in a DSO results in the function address (from GOT)
+# and not the "canonical PLT" address from the PDE.
+# This is just an optimization and both is valid, although libraries may
+# depend on this specific behavior, so do not complain loudly.
+setup_xfail *-*-*
+clear_xfail i?86-*-* powerpc*-*-* s390x-*-* x86_64-*-*
+run_ld_link_exec_tests [list \
+ [list \
+ "Run pr29655" \
+ "$NOPIE_LDFLAGS -Wl,--no-as-needed,-rpath,tmpdir tmpdir/pr29655.so" \
+ "" \
+ { pr29655b.c } \
+ "pr29655" \
+ "pass.out" \
+ "-fPIC" \
+ ] \
+]
diff --git a/ld/testsuite/ld-s390/plt_64-1.wf b/ld/testsuite/ld-s390/plt_64-1.wf
@@ -19,14 +19,14 @@ Contents of the .eh_frame section:
DW_CFA_nop
DW_CFA_nop
-00000018 000000000000001c 0000001c FDE cie=00000000 pc=00000000010002b8..00000000010002e4
+00000018 000000000000001c 0000001c FDE cie=00000000 pc=00000000010002b0..00000000010002dc
DW_CFA_remember_state
- DW_CFA_advance_loc: 6 to 00000000010002be
+ DW_CFA_advance_loc: 6 to 00000000010002b6
DW_CFA_offset: r14 at cfa-48
DW_CFA_offset: r15 at cfa-40
- DW_CFA_advance_loc: 8 to 00000000010002c6
+ DW_CFA_advance_loc: 8 to 00000000010002be
DW_CFA_def_cfa_offset: 320
- DW_CFA_advance_loc: 24 to 00000000010002de
+ DW_CFA_advance_loc: 24 to 00000000010002d6
DW_CFA_restore_state
DW_CFA_nop
DW_CFA_nop
diff --git a/ld/testsuite/ld-s390/plt_64-1_eh.wf b/ld/testsuite/ld-s390/plt_64-1_eh.wf
@@ -19,7 +19,7 @@ Contents of the .eh_frame section:
DW_CFA_nop
DW_CFA_nop
-00000018 0000000000000014 0000001c FDE cie=00000000 pc=0000000001000258..00000000010002b8
+00000018 0000000000000014 0000001c FDE cie=00000000 pc=0000000001000250..00000000010002b0
DW_CFA_nop
DW_CFA_nop
DW_CFA_nop