[RFC] objcopy, elf: Align Header Offset [PR 32324]
Checks
Commit Message
Hello,
This RFC 'fixes' the issue described in PR32324, but I doubt this is the
right thing to do. I am not at all familiar with this piece of code but
it looked wrong for the offset after objcopy to be lower than the
'Align' value, whereas all other headers seem to have an 'Offset' that
is larger than their 'Align'.
Debugging this also hinted that had the offset been the same as 'Align'
and in fact the original 'Offset' then everything would fall in place.
(deleted some leading 0s so it would fit in 80-char line).
Before objcopy/strip:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x00000000200000 0x00000000600000 0x00000000600000
0x0000000040467c 0x0000000040467c R E 0x200000
After objcopy/strip:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x00000000030120 0x00000000430120 0x00000000600000
0x0000000040467c 0x0000000040467c R E 0x200000
So I decided to try this hack for now and for this case it seems to work
for the binary I was using to reproduce this.
This is just intended as breadcrumbs for the next person to pick this up.
My reasoning was that I was seeing that the p_offset and p_vaddr of the
Header where things went wrong was much lower than the original, however
the section addresses were not, so some sections were no longer fitting
in this Segment and were being moved to the next one and eventually we
ran out of space for some Sections, hence the warnings.
Kind regards,
Andre
Comments
On 02.12.2024 15:16, Andre Vieira (lists) wrote:
> This RFC 'fixes' the issue described in PR32324, but I doubt this is the
> right thing to do. I am not at all familiar with this piece of code but
> it looked wrong for the offset after objcopy to be lower than the
> 'Align' value, whereas all other headers seem to have an 'Offset' that
> is larger than their 'Align'.
> Debugging this also hinted that had the offset been the same as 'Align'
> and in fact the original 'Offset' then everything would fall in place.
>
> (deleted some leading 0s so it would fit in 80-char line).
> Before objcopy/strip:
> Type Offset VirtAddr PhysAddr
> FileSiz MemSiz Flags Align
> ...
> LOAD 0x00000000200000 0x00000000600000 0x00000000600000
> 0x0000000040467c 0x0000000040467c R E 0x200000
>
> After objcopy/strip:
> Type Offset VirtAddr PhysAddr
> FileSiz MemSiz Flags Align
> ...
> LOAD 0x00000000030120 0x00000000430120 0x00000000600000
> 0x0000000040467c 0x0000000040467c R E 0x200000
Neither here nor in the bug report a complete picture is presented. I
don't think providing the info for just a single segment / section can
suffice. What's more, in the PR you mention the input image is the
result of some other tool's optimization. Pre-optimized image data may
also be relevant, especially as long as we can't exclude there's an
issue with that optimization (as you hint at as a possibility in the
PR).
Therefore can you please provide at least a full program and section
header dump from the one binary in question, for all three states of
it (pre optimization, post optimization, and post stripping)? Ideally
you'd make available a shrunk down testcase, for people (like me) to
actually play with.
> So I decided to try this hack for now and for this case it seems to work
> for the binary I was using to reproduce this.
> This is just intended as breadcrumbs for the next person to pick this up.
>
> My reasoning was that I was seeing that the p_offset and p_vaddr of the
> Header where things went wrong was much lower than the original, however
> the section addresses were not, so some sections were no longer fitting
> in this Segment and were being moved to the next one and eventually we
> ran out of space for some Sections, hence the warnings.
Right, except that I don't think it makes much sense to special case
offsets below p_align. p_align, after all, says nothing about full
addresses, but is just about requirements towards their low bits.
Jan
On Dez 02 2024, Andre Vieira (lists) wrote:
> diff --git a/bfd/elf.c b/bfd/elf.c
> index 74236a658fd9a10a61f466d9a2191998c2f4ce06..28d0143acca351203c840e6afc1dca710d6f671a 100644
> --- a/bfd/elf.c
> +++ b/bfd/elf.c
> @@ -6152,7 +6152,12 @@ assign_file_positions_for_load_sections (bfd *abfd,
> if (m == phdr_load_seg)
> {
> if (!m->includes_filehdr)
> - p->p_offset = off;
> + {
> + if (((bfd_vma) off) < p->p_align)
> + p->p_offset = p->p_align;
> + else
> + p->p_offset = off;
> + }
That is definitely wrong. p_align only asserts that (p_offset -
p_vaddr) % p_align == 0. It is _not_ required that p_offset % p_align
== 0.
Hi Jan,
Thank you for responding!
On 03/12/2024 09:26, Jan Beulich wrote:
>
> Neither here nor in the bug report a complete picture is presented. I
> don't think providing the info for just a single segment / section can
> suffice. What's more, in the PR you mention the input image is the
> result of some other tool's optimization. Pre-optimized image data may
> also be relevant, especially as long as we can't exclude there's an
> issue with that optimization (as you hint at as a possibility in the
> PR).
>
> Therefore can you please provide at least a full program and section
> header dump from the one binary in question, for all three states of
> it (pre optimization, post optimization, and post stripping)? Ideally
> you'd make available a shrunk down testcase, for people (like me) to
> actually play with.
Yeah I fully appreciate this, I was not sure what the best way was to go
about it, I've attached the headers and segments of the binary before
and after copying.
I'll see if I can create a binary reproducer of code I am able to share,
but in 9 years of GNU development I've never sent binaries over for
debugging, we usually operate with source-level reproducers... so not
sure about whether it was something people did. Also not sure I'll be
able to come up with a shareable reproducer for you, but I'll try.
I think I was hoping others would have seen similar behavior and
would've been able to investigate, but I guess if they had, it might
have already been fixed, even if I'm not 100% convinced this is an issue
with objcopy/strip ... could well be binary that was modified to
something that isn't quite compliant with ELF.
>
>> So I decided to try this hack for now and for this case it seems to work
>> for the binary I was using to reproduce this.
>> This is just intended as breadcrumbs for the next person to pick this up.
>>
>> My reasoning was that I was seeing that the p_offset and p_vaddr of the
>> Header where things went wrong was much lower than the original, however
>> the section addresses were not, so some sections were no longer fitting
>> in this Segment and were being moved to the next one and eventually we
>> ran out of space for some Sections, hence the warnings.
>
> Right, except that I don't think it makes much sense to special case
> offsets below p_align. p_align, after all, says nothing about full
> addresses, but is just about requirements towards their low bits.
True... and from what I could find p_vaddr should equal p_offset modulo
p_align, but I probably need to read that as:
p_vaddr % p_align == p_offset % p_align
Both before and after headers uphold that ... though strangely, unlike
other headers, the p_paddr of the header where the problems start didn't
change and is no longer the same as p_vaddr.
Like I said, the p_align hack was just something I noticed whilst
debugging that would 'make it work', but I'm not at all convinced it's
the right approach either, but I don't know what is either.
Kind Regards,
Andre
PS: Apologies if I stop responding, I'll make sure someone else in my
team picks this up whilst I'm away.
Elf file type is EXEC (Executable file)
Entry point 0x800000
There are 11 program headers, starting at offset 2097152
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000200000 0x0000000000600000 0x0000000000600000
0x0000000000000268 0x0000000000000268 R 0x8
INTERP 0x0000000000000294 0x0000000000400294 0x0000000000400294
0x0000000000000042 0x0000000000000042 R 0x1
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000024458 0x0000000000024458 R E 0x10000
LOAD 0x000000000002fdc0 0x000000000043fdc0 0x000000000043fdc0
0x0000000000000360 0x0000000000000360 RW 0x10000
LOAD 0x0000000000200000 0x0000000000600000 0x0000000000600000
0x000000000040467c 0x000000000040467c R E 0x200000
DYNAMIC 0x000000000002fdd8 0x000000000043fdd8 0x000000000043fdd8
0x00000000000001f0 0x00000000000001f0 RW 0x8
NOTE 0x0000000000000270 0x0000000000400270 0x0000000000400270
0x0000000000000024 0x0000000000000024 R 0x4
NOTE 0x0000000000030100 0x0000000000440100 0x0000000000440100
0x0000000000000020 0x0000000000000020 R 0x4
GNU_EH_FRAME 0x0000000000603d08 0x0000000000a03d08 0x0000000000a03d08
0x0000000000000974 0x0000000000000974 R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x000000000002fdc0 0x000000000043fdc0 0x000000000043fdc0
0x0000000000000240 0x0000000000000240 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .interp .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .bolt.org.text .fini .rodata .bolt.org.eh_frame_hdr .bolt.org.eh_frame
03 .init_array .fini_array .dynamic .got .got.plt .data .tm_clone_table .bss .note.ABI-tag
04 .text .text.cold .eh_frame .eh_frame_hdr
05 .dynamic
06 .note.gnu.build-id
07 .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
Elf file type is EXEC (Executable file)
Entry point 0x800000
There are 11 program headers, starting at offset 196896
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000030120 0x0000000000430120 0x0000000000600000
0x0000000000000268 0x0000000000000268 R 0x8
INTERP 0x0000000000000294 0x0000000000400294 0x0000000000400294
0x0000000000000042 0x0000000000000042 R 0x1
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000024458 0x0000000000024458 R E 0x10000
LOAD 0x000000000002fdc0 0x000000000043fdc0 0x000000000043fdc0
0x0000000000000360 0x0000000000000360 RW 0x10000
LOAD 0x0000000000030120 0x0000000000430120 0x0000000000600000
0x000000000040467c 0x000000000040467c R E 0x200000
DYNAMIC 0x000000000002fdd8 0x000000000043fdd8 0x000000000043fdd8
0x00000000000001f0 0x00000000000001f0 RW 0x8
NOTE 0x0000000000000270 0x0000000000400270 0x0000000000400270
0x0000000000000024 0x0000000000000024 R 0x4
NOTE 0x0000000000030100 0x0000000000440100 0x0000000000440100
0x0000000000000020 0x0000000000000020 R 0x4
GNU_EH_FRAME 0x0000000000433e28 0x0000000000a03d08 0x0000000000a03d08
0x0000000000000974 0x0000000000000974 R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x000000000002fdc0 0x000000000043fdc0 0x000000000043fdc0
0x0000000000000240 0x0000000000000240 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .interp .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .bolt.org.text .fini .rodata .bolt.org.eh_frame_hdr .bolt.org.eh_frame
03 .init_array .fini_array .dynamic .got .got.plt .data .tm_clone_table .bss .note.ABI-tag
04 .bss .text .text.cold
05 .dynamic
06 .note.gnu.build-id
07 .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
OK so I have a binary reproducer for you, it's just a nonsensical hello
world with a loop to make it run for long enough for perf to create a
record that actually captures something. I compressed it using gzip, so
hopefully the mailing list accepts it.
This binary was built on an aarch64-none-linux-gnu target, using a top
of the trunk build of llvm-bolt and a gcc 11.4 with the following commands:
$ gcc hello.c fn.c -O2 -g -Wl,--emit-relocs
-fno-reorder-blocks-and-partition
$ perf record -e cycles:u -o perf.data -- ./a.out
$ ./bolt_build/bin/perf2bolt -p perf.data -o perf.fdata -nl ./a.out
$ ./bolt_build/bin/llvm-bolt ./a.out -o ./a-bolted.out -data perf.fdata
-reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions
-split-all-cold -split-eh -dyno-stats
To reproduce the issue with GNU strip/objcopy it is sufficient to call:
$ objcopy a-bolted.out bad_copy
objcopy: bad_copy: section `.eh_frame' can't be allocated in segment 2
LOAD: .text .text.cold .eh_frame .eh_frame_hdr
objcopy: bad_copy: section `.eh_frame_hdr' can't be allocated in segment 2
LOAD: .text .text.cold .eh_frame .eh_frame_hdr
$ ./bad_copy
Illegal Instruction
The sources used to create the binary are ...
hello.c:
#include <stdio.h>
extern int fn (int *, int *, int);
# define N 2230000
int a[N];
int b[N];
int main (void)
{
for (int i = 0; i < N; ++i)
{
a[i] = i % 10;
b[i] = i % 2;
}
printf ("Hello World! %d\n", fn(&a[0], &b[0], N));
return 0;
}
... and ...
fn.c:
int fn (int *a, int *b, int n)
{
int sum = 0;
for (int i = 0; i < n; ++ i)
sum += a[i] * b[i];
return sum;
}
On 03.12.2024 17:30, Andre Vieira (lists) wrote:
> OK so I have a binary reproducer for you, it's just a nonsensical hello
> world with a loop to make it run for long enough for perf to create a
> record that actually captures something. I compressed it using gzip, so
> hopefully the mailing list accepts it.
>
> This binary was built on an aarch64-none-linux-gnu target, using a top
> of the trunk build of llvm-bolt and a gcc 11.4 with the following commands:
> $ gcc hello.c fn.c -O2 -g -Wl,--emit-relocs
> -fno-reorder-blocks-and-partition
> $ perf record -e cycles:u -o perf.data -- ./a.out
The original binary would also have been of interest.
> $ ./bolt_build/bin/perf2bolt -p perf.data -o perf.fdata -nl ./a.out
> $ ./bolt_build/bin/llvm-bolt ./a.out -o ./a-bolted.out -data perf.fdata
> -reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions
> -split-all-cold -split-eh -dyno-stats
>
> To reproduce the issue with GNU strip/objcopy it is sufficient to call:
> $ objcopy a-bolted.out bad_copy
> objcopy: bad_copy: section `.eh_frame' can't be allocated in segment 2
> LOAD: .text .text.cold .eh_frame .eh_frame_hdr
> objcopy: bad_copy: section `.eh_frame_hdr' can't be allocated in segment 2
> LOAD: .text .text.cold .eh_frame .eh_frame_hdr
It looks to me as if the problem was that objcopy tries to move the two
sections from segment 4 to segment 2. While I don't think I'm aware of a
place where it is written down what exact changes objcopy is expected /
permitted to make, it would seem to me that it shouldn't alter the
section-to-segment mapping. The two sections in question should remain
in segment 4. Doing so would feel like partial re-linking of the binary,
which pretty certainly isn't what objcopy is supposed to do (and even
less so strip).
Jan
On 17.12.2024 15:39, Jan Beulich wrote:
> On 03.12.2024 17:30, Andre Vieira (lists) wrote:
>> OK so I have a binary reproducer for you, it's just a nonsensical hello
>> world with a loop to make it run for long enough for perf to create a
>> record that actually captures something. I compressed it using gzip, so
>> hopefully the mailing list accepts it.
>>
>> This binary was built on an aarch64-none-linux-gnu target, using a top
>> of the trunk build of llvm-bolt and a gcc 11.4 with the following commands:
>> $ gcc hello.c fn.c -O2 -g -Wl,--emit-relocs
>> -fno-reorder-blocks-and-partition
>> $ perf record -e cycles:u -o perf.data -- ./a.out
>
> The original binary would also have been of interest.
>
>> $ ./bolt_build/bin/perf2bolt -p perf.data -o perf.fdata -nl ./a.out
>> $ ./bolt_build/bin/llvm-bolt ./a.out -o ./a-bolted.out -data perf.fdata
>> -reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions
>> -split-all-cold -split-eh -dyno-stats
>>
>> To reproduce the issue with GNU strip/objcopy it is sufficient to call:
>> $ objcopy a-bolted.out bad_copy
>> objcopy: bad_copy: section `.eh_frame' can't be allocated in segment 2
>> LOAD: .text .text.cold .eh_frame .eh_frame_hdr
>> objcopy: bad_copy: section `.eh_frame_hdr' can't be allocated in segment 2
>> LOAD: .text .text.cold .eh_frame .eh_frame_hdr
>
> It looks to me as if the problem was that objcopy tries to move the two
> sections from segment 4 to segment 2.
That was a red herring. The diagnostic emits an internal loop counter,
making the number reported entirely meaningless. There's no moving of
sections between segments, afaict right now. Back to square 1.
Jan
@@ -6152,7 +6152,12 @@ assign_file_positions_for_load_sections (bfd *abfd,
if (m == phdr_load_seg)
{
if (!m->includes_filehdr)
- p->p_offset = off;
+ {
+ if (((bfd_vma) off) < p->p_align)
+ p->p_offset = p->p_align;
+ else
+ p->p_offset = off;
+ }
off += actual * bed->s->sizeof_phdr;
}