[v2,9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

Message ID GVXPR83MB06295CD92B26BAC44CD47438F84D2@GVXPR83MB0629.EURPRD83.prod.outlook.com
State New
Headers
Series None |

Commit Message

Evgeny Karpov Oct. 23, 2024, 1:20 p.m. UTC
  Tuesday, October 22, 2024
Richard Sandiford <richard.sandiford@arm.com> wrote:

>> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT,
>> it might trigger a relocation issue.
>>
>> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L
>
> Sorry to press the issue, but: why does that happen?

#define IMAGE_REL_ARM64_PAGEOFFSET_12L  0x0007  /* The 12-bit page offset of the target, for instruction LDR (indexed, unsigned immediate). */

Based on the documentation for LDR
https://developer.arm.com/documentation/ddi0596/2020-12/Base-Instructions/LDR--immediate---Load-Register--immediate--
"For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0 to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8"

This means BIGGEST_ALIGNMENT (128) could be replaced with 64.

auto rounded = ROUND_UP (MAX ((SIZE), 1),       \
    MAX ((ALIGNMENT), 64) / BITS_PER_UNIT);

It works for most cases, however, not for all of them.

cross-aarch64-w64-mingw32-msvcrt/lib/gcc/aarch64-w64-mingw32/15.0.0/libgomp.a(oacc-profiling.o): in function `goacc_profiling_initialize':
mingw-woarm64-build/code/gcc/libgomp/oacc-profiling.c:105:(.text+0x2c): relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L against `no symbol'

This case should be investigated separately and fixed in the following
patch series and BIGGEST_ALIGNMENT should be used for now.

>>> Better to use "auto" rather than "unsigned".
>> It looks like "auto" cannot be used there.
>
>What goes wrong if you use it?
>
>The reason for asking for "auto" was to avoid silent truncation.

After the second try and recompiling, it looks like it works.

Regards,
Evgeny
  

Comments

Richard Sandiford Oct. 23, 2024, 2:06 p.m. UTC | #1
Evgeny Karpov <Evgeny.Karpov@microsoft.com> writes:
> Tuesday, October 22, 2024
> Richard Sandiford <richard.sandiford@arm.com> wrote:
>
>>> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT,
>>> it might trigger a relocation issue.
>>>
>>> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L
>>
>> Sorry to press the issue, but: why does that happen?
>
> #define IMAGE_REL_ARM64_PAGEOFFSET_12L  0x0007  /* The 12-bit page offset of the target, for instruction LDR (indexed, unsigned immediate). */
>
> Based on the documentation for LDR
> https://developer.arm.com/documentation/ddi0596/2020-12/Base-Instructions/LDR--immediate---Load-Register--immediate--
> "For the 64-bit variant: is the optional positive immediate byte offset, a multiple of 8 in the range 0 to 32760, defaulting to 0 and encoded in the "imm12" field as <pimm>/8"

This in itself is relatively standard thouugh.  We can't assume
without checking that any given offset will be "nicely" aligned.
So...

> This means BIGGEST_ALIGNMENT (128) could be replaced with 64.
>
> auto rounded = ROUND_UP (MAX ((SIZE), 1),       \
>     MAX ((ALIGNMENT), 64) / BITS_PER_UNIT);
>
> It works for most cases, however, not for all of them.

...although this will work for, say, loading all of:

unsigned char x[8];

using a single LDR, it doesn't look like it would cope with:

  struct __attribute__((packed)) {
    char x;
    void *y;
  } foo;

  void *f() { return foo.y; }

Or, even if that does work, it isn't clear to me why patching
ASM_OUTPUT_ALIGNED_LOCAL is a complete solution to the problem.

ISTM that we should be checking the known alignment during code generation,
and only using relocations if their alignment requirements are known to
be met.

Once that's done, it would make sense to increase the default alignment
if that improves code quality.  But it would be good to fix the correctness
issue first, while the problem is still easily visible.

If we do want to increase the default alignment to improve code quality,
the normal way would be via macros like DATA_ALIGNMENT or LOCAL_ALIGNMENT.
The advantage of those macros is that the increased alignment is visible
during code generation, rather than something that is only applied at
output time.

Thanks,
Richard
  

Patch

diff --git a/gcc/config/aarch64/aarch64-coff.h b/gcc/config/aarch64/aarch64-coff.h
index 8fc6ca0440d..52c8c8d99c2 100644
--- a/gcc/config/aarch64/aarch64-coff.h
+++ b/gcc/config/aarch64/aarch64-coff.h
@@ -58,6 +58,13 @@ 
   assemble_name ((FILE), (NAME)),              \
   fprintf ((FILE), "," HOST_WIDE_INT_PRINT_UNSIGNED "\n", (ROUNDED)))

+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGNMENT)  \
+  {                                                            \
+    auto rounded = ROUND_UP (MAX ((SIZE), 1),          \
+      MAX ((ALIGNMENT), BIGGEST_ALIGNMENT) / BITS_PER_UNIT);   \
+    ASM_OUTPUT_LOCAL (FILE, NAME, SIZE, rounded);              \
+  }
+
 #define ASM_OUTPUT_SKIP(STREAM, NBYTES)        \
   fprintf (STREAM, "\t.space\t%d  // skip\n", (int) (NBYTES))