[v2] LoongArch: Add overflow check and new relaxations when relaxing pcalau12i+ld.d
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_binutils_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-arm |
success
|
Test passed
|
Commit Message
For pcalau12i+ld.d that can be relaxed, the relaxation is
pcalau12i+ld.d => pcalau12i+addi.d => pcaddi and there is no
overflow check.
So the overflow check in the pcalau12i+ld.d relaxation and the
relaxation of pcalau12i+ld.d => pcaddi are added.
Changes from v1:
* Add relaxation for pcalau12i+ld.d=>pcaddi.
* Update test cases.
v1: https://sourceware.org/pipermail/binutils/2024-November/137597.html
---
bfd/elfnn-loongarch.c | 65 ++++++++++++---
.../ld-loongarch-elf/ld-loongarch-elf.exp | 1 +
ld/testsuite/ld-loongarch-elf/relax-got.d | 79 +++++++++++++++++++
ld/testsuite/ld-loongarch-elf/relax-got.ld | 31 ++++++++
ld/testsuite/ld-loongarch-elf/relax-got.s | 69 ++++++++++++++++
5 files changed, 233 insertions(+), 12 deletions(-)
create mode 100644 ld/testsuite/ld-loongarch-elf/relax-got.d
create mode 100644 ld/testsuite/ld-loongarch-elf/relax-got.ld
create mode 100644 ld/testsuite/ld-loongarch-elf/relax-got.s
Comments
On Fri, 2024-11-15 at 10:51 +0800, Lulu Cai wrote:
> For pcalau12i+ld.d that can be relaxed, the relaxation is
> pcalau12i+ld.d => pcalau12i+addi.d => pcaddi and there is no
> overflow check.
>
> So the overflow check in the pcalau12i+ld.d relaxation and the
> relaxation of pcalau12i+ld.d => pcaddi are added.
>
> Changes from v1:
> * Add relaxation for pcalau12i+ld.d=>pcaddi.
Why do we need this?
/* snip */
> - if (relax_func (abfd, sec, sym_sec, rel, symval,
> - info, again, max_alignment)
> - && relax_func == loongarch_relax_pcala_ld)
> - loongarch_relax_pcala_addi (abfd, sec, sym_sec, rel, symval,
> - info, again, max_alignment);
If pcalau12i + ld.d can be relaxed to pcaddi, here the pair will be
relaxed to pcalau12i + addi.d by loongarch_relax_pcala_ld and then
further relaxed to pcaddi anyway.
On 11/15/24 2:13 PM, Xi Ruoyao wrote:
> On Fri, 2024-11-15 at 10:51 +0800, Lulu Cai wrote:
>> For pcalau12i+ld.d that can be relaxed, the relaxation is
>> pcalau12i+ld.d => pcalau12i+addi.d => pcaddi and there is no
>> overflow check.
>>
>> So the overflow check in the pcalau12i+ld.d relaxation and the
>> relaxation of pcalau12i+ld.d => pcaddi are added.
>>
>> Changes from v1:
>> * Add relaxation for pcalau12i+ld.d=>pcaddi.
> Why do we need this?
>
> /* snip */
>
>> - if (relax_func (abfd, sec, sym_sec, rel, symval,
>> - info, again, max_alignment)
>> - && relax_func == loongarch_relax_pcala_ld)
>> - loongarch_relax_pcala_addi (abfd, sec, sym_sec, rel, symval,
>> - info, again, max_alignment);
> If pcalau12i + ld.d can be relaxed to pcaddi, here the pair will be
> relaxed to pcalau12i + addi.d by loongarch_relax_pcala_ld and then
> further relaxed to pcaddi anyway.
>
The idea here is to do the relaxation of pcalau12i + ld.d once instead
of twice if possible.
But is this necessary in your opinion?
On Fri, 2024-11-15 at 15:09 +0800, Lulu Cai wrote:
> On 11/15/24 2:13 PM, Xi Ruoyao wrote:
> > On Fri, 2024-11-15 at 10:51 +0800, Lulu Cai wrote:
> > > For pcalau12i+ld.d that can be relaxed, the relaxation is
> > > pcalau12i+ld.d => pcalau12i+addi.d => pcaddi and there is no
> > > overflow check.
> > >
> > > So the overflow check in the pcalau12i+ld.d relaxation and the
> > > relaxation of pcalau12i+ld.d => pcaddi are added.
> > >
> > > Changes from v1:
> > > * Add relaxation for pcalau12i+ld.d=>pcaddi.
> > Why do we need this?
> >
> > /* snip */
> >
> > > - if (relax_func (abfd, sec, sym_sec, rel, symval,
> > > - info, again, max_alignment)
> > > - && relax_func == loongarch_relax_pcala_ld)
> > > - loongarch_relax_pcala_addi (abfd, sec, sym_sec, rel, symval,
> > > - info, again, max_alignment);
> > If pcalau12i + ld.d can be relaxed to pcaddi, here the pair will be
> > relaxed to pcalau12i + addi.d by loongarch_relax_pcala_ld and then
> > further relaxed to pcaddi anyway.
> >
>
> The idea here is to do the relaxation of pcalau12i + ld.d once instead
> of twice if possible.
> But is this necessary in your opinion?
"If things are not broken don't fix it." Besides that there's another
reason to keep them separated:
A key difference between pcalau12i + ld.d => pcalau12i + addi.d and
pcalau12i + addi.d => pcaddi is the latter deletes bytes but the former
does not. So the latter must happens in elf_relax_section, but the
former can be moved to early_size_sections, like where
R_X86_64_GOTPCRELX is handled for x86_64.
Moving pcalau12i + ld.d => pcalau12i + addi.d earlier will allow us to
resolve https://github.com/loongson-community/discussions/issues/58. I
just have not got enough time to do the work myself...
On 11/15/24 3:54 PM, Xi Ruoyao wrote:
> On Fri, 2024-11-15 at 15:09 +0800, Lulu Cai wrote:
>> On 11/15/24 2:13 PM, Xi Ruoyao wrote:
>>> On Fri, 2024-11-15 at 10:51 +0800, Lulu Cai wrote:
>>>> For pcalau12i+ld.d that can be relaxed, the relaxation is
>>>> pcalau12i+ld.d => pcalau12i+addi.d => pcaddi and there is no
>>>> overflow check.
>>>>
>>>> So the overflow check in the pcalau12i+ld.d relaxation and the
>>>> relaxation of pcalau12i+ld.d => pcaddi are added.
>>>>
>>>> Changes from v1:
>>>> * Add relaxation for pcalau12i+ld.d=>pcaddi.
>>> Why do we need this?
>>>
>>> /* snip */
>>>
>>>> - if (relax_func (abfd, sec, sym_sec, rel, symval,
>>>> - info, again, max_alignment)
>>>> - && relax_func == loongarch_relax_pcala_ld)
>>>> - loongarch_relax_pcala_addi (abfd, sec, sym_sec, rel, symval,
>>>> - info, again, max_alignment);
>>> If pcalau12i + ld.d can be relaxed to pcaddi, here the pair will be
>>> relaxed to pcalau12i + addi.d by loongarch_relax_pcala_ld and then
>>> further relaxed to pcaddi anyway.
>>>
>> The idea here is to do the relaxation of pcalau12i + ld.d once instead
>> of twice if possible.
>> But is this necessary in your opinion?
> "If things are not broken don't fix it." Besides that there's another
> reason to keep them separated:
>
> A key difference between pcalau12i + ld.d => pcalau12i + addi.d and
> pcalau12i + addi.d => pcaddi is the latter deletes bytes but the former
> does not. So the latter must happens in elf_relax_section, but the
> former can be moved to early_size_sections, like where
> R_X86_64_GOTPCRELX is handled for x86_64.
It's reasonable...
> Moving pcalau12i + ld.d => pcalau12i + addi.d earlier will allow us to
> resolve https://github.com/loongson-community/discussions/issues/58. I
> just have not got enough time to do the work myself...
>
I also found this problem and want to solve it, but it is still in the
initial stage.
I will pay attention to this issue later.
Thanks.
@@ -5096,15 +5096,16 @@ loongarch_relax_call36 (bfd *abfd, asection *sec, asection *sym_sec,
return true;
}
-/* Relax pcalau12i,ld.d => pcalau12i,addi.d. */
+/* Relax pcalau12i,ld.d => pcalau12i,addi.d
+ or => pcaddi. */
static bool
loongarch_relax_pcala_ld (bfd *abfd, asection *sec,
- asection *sym_sec ATTRIBUTE_UNUSED,
+ asection *sym_sec,
Elf_Internal_Rela *rel_hi,
- bfd_vma symval ATTRIBUTE_UNUSED,
- struct bfd_link_info *info ATTRIBUTE_UNUSED,
- bool *again ATTRIBUTE_UNUSED,
- bfd_vma max_alignment ATTRIBUTE_UNUSED)
+ bfd_vma symval,
+ struct bfd_link_info *info,
+ bool *again,
+ bfd_vma max_alignment)
{
bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
Elf_Internal_Rela *rel_lo = rel_hi + 2;
@@ -5113,12 +5114,55 @@ loongarch_relax_pcala_ld (bfd *abfd, asection *sec,
uint32_t rd = LARCH_GET_RD (pca);
uint32_t addi_d = LARCH_OP_ADDI_D;
+ /* This section's output_offset need to subtract the bytes of instructions
+ relaxed by the previous sections, so it needs to be updated beforehand.
+ size_input_section already took care of updating it after relaxation,
+ so we additionally update once here. */
+ sec->output_offset = sec->output_section->size;
+ bfd_vma pc = sec_addr (sec) + rel_hi->r_offset;
+
+ /* If pc and symbol not in the same segment, add/sub segment alignment. */
+ if (!loongarch_two_sections_in_same_segment (info->output_bfd,
+ sec->output_section,
+ sym_sec->output_section))
+ max_alignment = info->maxpagesize > max_alignment ? info->maxpagesize
+ : max_alignment;
+
+ if (symval > pc)
+ pc -= (max_alignment > 4 ? max_alignment : 0);
+ else if (symval < pc)
+ pc += (max_alignment > 4 ? max_alignment : 0);
+
if ((ELFNN_R_TYPE (rel_lo->r_info) != R_LARCH_GOT_PC_LO12)
|| (LARCH_GET_RD (ld) != rd)
|| (LARCH_GET_RJ (ld) != rd)
- || !LARCH_INSN_LD_D (ld))
+ || !LARCH_INSN_LD_D (ld)
+ /* Within [-2G,2G) addressing range. */
+ || (bfd_signed_vma)(symval - pc) < (bfd_signed_vma)(int32_t)0x80000000
+ || (bfd_signed_vma)(symval - pc) > (bfd_signed_vma)(int32_t)0x7fffffff)
return false;
+ /* Continue next relax trip. */
+ *again = true;
+
+ /* Can be relaxed to pcaddi? */
+ if ((symval & 0x3) == 0
+ && ((bfd_signed_vma)(symval - pc) >= (bfd_signed_vma)(int32_t)0xffe00000)
+ && ((bfd_signed_vma)(symval - pc) <= (bfd_signed_vma)(int32_t)0x1ffffc))
+ {
+ const uint32_t pcaddi = LARCH_OP_PCADDI;
+ pca = pcaddi | rd;
+ bfd_put (32, abfd, pca, contents + rel_hi->r_offset);
+
+ /* Adjust relocations. */
+ rel_hi->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel_hi->r_info),
+ R_LARCH_PCREL20_S2);
+ rel_lo->r_info = ELFNN_R_INFO (0, R_LARCH_NONE);
+ loongarch_relax_delete_bytes (abfd, sec, rel_lo->r_offset, 4, info);
+
+ return true;
+ }
+
addi_d = addi_d | (rd << 5) | rd;
bfd_put (32, abfd, addi_d, contents + rel_lo->r_offset);
@@ -5585,11 +5629,8 @@ loongarch_elf_relax_section (bfd *abfd, asection *sec,
if (r_type == R_LARCH_GOT_PC_HI20 && !local_got)
continue;
- if (relax_func (abfd, sec, sym_sec, rel, symval,
- info, again, max_alignment)
- && relax_func == loongarch_relax_pcala_ld)
- loongarch_relax_pcala_addi (abfd, sec, sym_sec, rel, symval,
- info, again, max_alignment);
+ relax_func (abfd, sec, sym_sec, rel, symval, info,
+ again, max_alignment);
}
return true;
@@ -47,6 +47,7 @@ if [istarget "loongarch64-*-*"] {
run_dump_test "tls-le-relax"
run_dump_test "relax-medium-call"
run_dump_test "relax-medium-call-1"
+ run_dump_test "relax-got"
}
if [istarget "loongarch32-*-*"] {
new file mode 100644
@@ -0,0 +1,79 @@
+#ld: -T relax-got.ld
+#objdump: -D
+
+.* file format .*
+
+
+Disassembly of section \.text:
+
+0+10000 <_start>:
+ 10000: 18080004 pcaddi \$a0, 16384
+ 10004: 18080004 pcaddi \$a0, 16384
+ 10008: 18080004 pcaddi \$a0, 16384
+ 1000c: 1a01fe04 pcalau12i \$a0, 4080
+ 10010: 02c00084 addi.d \$a0, \$a0, 0
+ 10014: 1a01fe04 pcalau12i \$a0, 4080
+ 10018: 02c02084 addi.d \$a0, \$a0, 8
+ 1001c: 1a01fe04 pcalau12i \$a0, 4080
+ 10020: 02c04084 addi.d \$a0, \$a0, 16
+ 10024: 1a03fe04 pcalau12i \$a0, 8176
+ 10028: 28c0c084 ld.d \$a0, \$a0, 48
+ 1002c: 1a03fe04 pcalau12i \$a0, 8176
+ 10030: 28c04084 ld.d \$a0, \$a0, 16
+ 10034: 1a03fe04 pcalau12i \$a0, 8176
+ 10038: 28c0e084 ld.d \$a0, \$a0, 56
+
+0+1003c <sym>:
+ 1003c: 03400000 nop
+
+Disassembly of section \.pcaddi:
+
+0+20000 <sym_default_1>:
+ 20000: 19f801e4 pcaddi \$a0, -16369
+
+0+20004 <sym_hidden_1>:
+ 20004: 19f801c4 pcaddi \$a0, -16370
+
+0+20008 <sym_protected_1>:
+ 20008: 19f801a4 pcaddi \$a0, -16371
+
+Disassembly of section \.pcalau12i_addi:
+
+0+1000000 <sym_default_2>:
+ 1000000: 1bfe0204 pcalau12i \$a0, -4080
+ 1000004: 02c0f084 addi.d \$a0, \$a0, 60
+
+0+1000008 <sym_hidden_2>:
+ 1000008: 1bfe0204 pcalau12i \$a0, -4080
+ 100000c: 02c0f084 addi.d \$a0, \$a0, 60
+
+0+1000010 <sym_protected_2>:
+ 1000010: 1bfe0204 pcalau12i \$a0, -4080
+ 1000014: 02c0f084 addi.d \$a0, \$a0, 60
+
+Disassembly of section \.got:
+
+0+2000000 <_GLOBAL_OFFSET_TABLE_>:
+ ...
+ 2000010: 80020008 .word 0x80020008
+ ...
+ 2000030: 80020000 .word 0x80020000
+ 2000034: 00000000 .word 0x00000000
+ 2000038: 80020010 .word 0x80020010
+ ...
+ 2000050: 0001003c .word 0x0001003c
+ 2000054: 00000000 .word 0x00000000
+
+Disassembly of section \.pcalau12i_ld:
+
+0+80020000 <sym_default_3>:
+ 80020000: 1b03fc04 pcalau12i \$a0, -516128
+ 80020004: 28c14084 ld.d \$a0, \$a0, 80
+
+0+80020008 <sym_hidden_3>:
+ 80020008: 1b03fc04 pcalau12i \$a0, -516128
+ 8002000c: 28c14084 ld.d \$a0, \$a0, 80
+
+0+80020010 <sym_protected_3>:
+ 80020010: 1b03fc04 pcalau12i \$a0, -516128
+ 80020014: 28c14084 ld.d \$a0, \$a0, 80
new file mode 100644
@@ -0,0 +1,31 @@
+OUTPUT_ARCH(loongarch)
+ENTRY(_start)
+SECTIONS
+{
+ PROVIDE (__executable_start = 0x8000);
+ . = 0x10000;
+ .text :
+ {
+ *(.text)
+ } =0
+ . = 0x20000;
+ .pcaddi :
+ {
+ *(.pcaddi)
+ }
+ . = 0x1000000;
+ .pcalau12i_addi :
+ {
+ *(.pcalau12i_addi)
+ }
+ . = 0x2000000;
+ .got :
+ {
+ *(.got)
+ }
+ . = 0x80020000;
+ .pcalau12i_ld :
+ {
+ *(.pcalau12i_ld)
+ }
+}
new file mode 100644
@@ -0,0 +1,69 @@
+ .text
+ .global _start
+_start:
+ # relax to pcaddi
+ la.got $a0,sym_default_1
+ la.got $a0,sym_hidden_1
+ la.got $a0,sym_protected_1
+
+ # relax to pcalau12i + addi.d
+ la.got $a0,sym_default_2
+ la.got $a0,sym_hidden_2
+ la.got $a0,sym_protected_2
+
+ # dot not relax when overflow
+ la.got $a0,sym_default_3
+ la.got $a0,sym_hidden_3
+ la.got $a0,sym_protected_3
+
+ .global sym
+sym:
+ nop
+
+
+ .section .pcaddi,"ax",@progbits
+ .global sym_default_1
+sym_default_1:
+ la.got $a0,sym
+
+ .global sym_hidden_1
+ .hidden sym_hidden_1
+sym_hidden_1:
+ la.got $a0,sym
+
+ .global sym_protected_1
+ .protected sym_protected_1
+sym_protected_1:
+ la.got $a0,sym
+
+
+ .section .pcalau12i_addi,"ax",@progbits
+ .global sym_default_2
+sym_default_2:
+ la.got $a0,sym
+
+ .global sym_hidden_2
+ .hidden sym_hidden_2
+sym_hidden_2:
+ la.got $a0,sym
+
+ .global sym_protected_2
+ .protected sym_protected_2
+sym_protected_2:
+ la.got $a0,sym
+
+
+ .section .pcalau12i_ld,"ax",@progbits
+ .global sym_default_3
+sym_default_3:
+ la.got $a0,sym
+
+ .global sym_hidden_3
+ .hidden sym_hidden_3
+sym_hidden_3:
+ la.got $a0,sym
+
+ .global sym_protected_3
+ .protected sym_protected_3
+sym_protected_3:
+ la.got $a0,sym