RFC: Padding segments to eliminate gaps
Commit Message
Hi Guys,
We have an interesting situation here at Red Hat. For various
reasons(1) we want to ensure that there are no gaps between loadable
segments in a binary. But we do not want to increase the size of
binaries unnecessarily.
Originally I tried increasing the p_memsz and p_filesz of loadable
segments to eliminate the gaps. Without actually adding in any extra
sections. (As far as I can tell the ELF standard does not require
that segments be completely filled with sections). But this proved to
be a bust as tools like objcopy/strip could not handle the empty parts
of the segments. (This is a potential bug in the BFD library, but I
chose not to investigate as I felt that fixing it might break lots of
things).
Instead I have gone with a second method - adding a new option to
objcopy that inserts padding sections that fill up segments so that
they end at the start of the next segments. The new sections are
SHT_NOBITS so that they do not take up any extra space in the file
(well apart from their section headers), and using objcopy means that
the transformation can be performed post-link, possibly after
stripping or other kinds of file munging.
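As a rough illustration of the arithmetic behind such a padding section, here is a minimal sketch. It is not code from the patch: the pad_range type and the compute_padding function are hypothetical names, and the sketch assumes the padding runs from the end of one segment (p_vaddr + p_memsz) up to the page-aligned start of the next one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: where a padding section starts and how big it is.  */
struct pad_range
{
  uint64_t start;
  uint64_t size;
};

/* Round ADDR down to a multiple of PAGESIZE, which must be a power of two.  */
static uint64_t
page_align_down (uint64_t pagesize, uint64_t addr)
{
  return addr & ~(pagesize - 1);
}

/* Compute the padding needed after a segment that starts at PREV_VADDR and
   occupies PREV_MEMSZ bytes of memory, so that it ends where the page
   containing NEXT_VADDR begins.  A size of zero means no padding is needed
   (this sketch also returns zero if the segments overlap or are unordered).  */
static struct pad_range
compute_padding (uint64_t pagesize, uint64_t prev_vaddr,
                 uint64_t prev_memsz, uint64_t next_vaddr)
{
  struct pad_range pad = { 0, 0 };
  uint64_t prev_end = prev_vaddr + prev_memsz;
  uint64_t next_start = page_align_down (pagesize, next_vaddr);

  assert (pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

  if (next_start > prev_end)
    {
      pad.start = prev_end;
      pad.size = next_start - prev_end;
    }

  return pad;
}
```

For a segment at 0x400000 with a MemSiz of 0x39c, followed by a segment at 0x401000 with 0x1000 byte pages, this gives a padding range starting at 0x40039c with a size of 0xc64.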
I am not sure if this feature is really suitable for submission to the
upstream sources. It feels a bit hackish to me. But maybe someone
will be interested in it, which is why I am posting the patch here.
As for what the results look like, here is a short example:
$ gcc hello.c
$ objcopy --add-segment-padding-sections a.out a.new
$ ls -l a.out a.new
-rwxr-xr-x. 1 nickc nickc 17832 Jan 21 14:27 a.out*
-rwxr-xr-x. 1 nickc nickc 18064 Jan 21 14:29 a.new*
(a small increase in file size)
$ readelf -lW a.out
[...]
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
[...]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000
LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000
$ readelf -lW a.new
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x001000 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x001000 R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x001000 R 0x1000
LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000
(note the change in MemSiz)
Section to Segment mapping:
Segment Sections...
02 .note.gnu.property .note.gnu.build-id .note.ABI-tag .segment.pad.2
03 .init .plt .text .fini .segment.pad.3
04 .interp .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .rodata .eh_frame_hdr .eh_frame .segment.pad.4
(the .segment.pad.N sections are the newly inserted padding sections)
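The MemSiz values above can be checked mechanically. The following is only a sketch for illustration: the load_segment type and the count_gaps function are hypothetical, a gap is assumed to mean that one LOAD segment ends before the page-aligned start of the next, and the two tables simply copy the VirtAddr and MemSiz columns from the readelf output of a.out and a.new.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the p_vaddr and p_memsz
   fields of a PT_LOAD program header.  */
struct load_segment
{
  uint64_t vaddr;
  uint64_t memsz;
};

/* Count the places where a segment ends before the page-aligned start of
   the segment that follows it.  SEGS must be sorted by increasing vaddr
   and PAGESIZE must be a power of two.  */
static size_t
count_gaps (const struct load_segment *segs, size_t count, uint64_t pagesize)
{
  size_t gaps = 0;

  for (size_t i = 1; i < count; i++)
    {
      uint64_t prev_end = segs[i - 1].vaddr + segs[i - 1].memsz;
      uint64_t cur_start = segs[i].vaddr & ~(pagesize - 1);

      if (prev_end < cur_start)
        gaps++;
    }

  return gaps;
}

/* LOAD segments of a.out (before padding).  */
static const struct load_segment before[] =
{
  { 0x400000, 0x39c },
  { 0x401000, 0x155 },
  { 0x402000, 0x24c },
  { 0x403df8, 0x230 },
};

/* LOAD segments of a.new (after padding).  */
static const struct load_segment after[] =
{
  { 0x400000, 0x1000 },
  { 0x401000, 0x1000 },
  { 0x402000, 0x1000 },
  { 0x403df8, 0x230 },
};
```

With a 0x1000 byte page size, count_gaps reports three gaps for the a.out table and none for the a.new table.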
If you have any comments or questions please feel free to let me know.
Cheers
Nick
(1) Gaps between loadable segments make it possible for something else
to be inserted in between them. This typically happens only when the
common page size and the maximum page size for the given architecture
differ, but in theory it could happen on any architecture. The
unexpected code or data loaded into a gap might then pose a security
risk if a buffer overflow or similar attack is able to reach it.
Comments
* Nick Clifton:
> We have an interesting situation here at Red Hat. For various
> reasons(1) we want to ensure that there are no gaps between loadable
> segments in a binary. But we do not want to increase the size of
> binaries unnecessarily.
> (1) Gaps between loadable segments make it possible for something else
> to be inserted in between them. This typically happens only when the
> common page size and the maximum page size for the given architecture
> differ, but in theory it could happen on any architecture. The
> unexpected code or data loaded into a gap might then pose a security
> risk if a buffer overflow or similar attack is able to reach it.
The main concern is that it creates situations and code paths that have
not been tested, and are difficult to test due to ASLR.
> Originally I tried increasing the p_memsz and p_filesz of loadable
> segments to eliminate the gaps. Without actually adding in any extra
> sections. (As far as I can tell the ELF standard does not require
> that segments be completely filled with sections). But this proved to
> be a bust as tools like objcopy/strip could not handle the empty parts
> of the segments. (This is a potential bug in the BFD library, but I
> chose not to investigate as I felt that fixing it might break lots of
> things).
This is rather unfortunate.
> Instead I have gone with a second method - adding a new option to
> objcopy that inserts padding sections that fill up segments so that
> they end at the start of the next segments. The new sections are
> SHT_NOBITS so that they do not take up any extra space in the file
> (well apart from their section headers), and using objcopy means that
> the transformation can be performed post-link, possibly after
> stripping or other kinds of file munging.
Will those objects have the same fate? In the sense that future
tooling is likely to corrupt them?
> I am not sure if this feature is really suitable for submission to the
> upstream sources. It feels a bit hackish to me. But maybe someone
> will be interested in it, which is why I am posting the patch here.
>
> As for what the results look like, here is a short example:
>
> $ gcc hello.c
> $ objcopy --add-segment-padding-sections a.out a.new
> $ ls -l a.out a.new
> -rwxr-xr-x. 1 nickc nickc 17832 Jan 21 14:27 a.out*
> -rwxr-xr-x. 1 nickc nickc 18064 Jan 21 14:29 a.new*
>
> (a small increase in file size)
>
> $ readelf -lW a.out
> [...]
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> [...]
> LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x00039c R 0x1000
> LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x000155 R E 0x1000
> LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x00024c R 0x1000
> LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000
>
> $ readelf -lW a.new
>
> LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x001000 R 0x1000
> LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x001000 R E 0x1000
> LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x001000 R 0x1000
> LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000
>
> (note the change in MemSiz)
>
> Section to Segment mapping:
> Segment Sections...
> 02 .note.gnu.property .note.gnu.build-id .note.ABI-tag .segment.pad.2
> 03 .init .plt .text .fini .segment.pad.3
> 04 .interp .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .rodata .eh_frame_hdr .eh_frame .segment.pad.4
> (the .segment.pad.N sections are the newly inserted padding sections)
Could you show the readelf -SW output? I'm kind of surprised that a
NOBITS section appears in the readelf -lW output. But reading the ELF
specification, it sort of makes sense.
Is there a chance you could teach the link editor to insert this
padding? But if we can use objcopy to add tooling-compatible padding, I
think this would be sufficient for our purposes.
Thanks,
Florian
Hi Florian,
>> Instead I have gone with a second method - adding a new option to
>> objcopy that inserts padding sections that fill up segments so that
>> they end at the start of the next segments. The new sections are
>> SHT_NOBITS so that they do not take up any extra space in the file
>> (well apart from their section headers), and using objcopy means that
>> the transformation can be performed post-link, possibly after
>> stripping or other kinds of file munging.
>
> Will those objects have the same fate? In the sense that future
> tooling is likely to corrupt them?
No. Or at least not with the tests I have performed. I have been able
to strip the augmented binaries using the binutils, the elfutils and the
llvm versions of strip and they continue to work and to be correctly
padded.
> Could you show the readelf -SW output? I'm kind of surprised that a
> NOBITS section appears in the readelf -lW output. But reading the ELF
> specification, it sort of makes sense.
Sure, the output is attached. I am also attaching the augmented binary
(in a compressed form) in case you wish to examine or experiment with it.
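As for why a NOBITS section shows up in the section to segment mapping: a NOBITS section occupies no space in the file, so it can only be matched to a segment by its memory address range. The sketch below is a deliberately simplified assumption about that kind of test, not the actual check used by readelf (which considers more cases), and nobits_section_in_segment is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified containment test (an assumption for illustration only):
   a NOBITS section is treated as part of a segment when its address
   range [sh_addr, sh_addr + sh_size) lies entirely inside the memory
   image [p_vaddr, p_vaddr + p_memsz) of that segment.  */
static bool
nobits_section_in_segment (uint64_t sh_addr, uint64_t sh_size,
                           uint64_t p_vaddr, uint64_t p_memsz)
{
  return sh_addr >= p_vaddr
    && sh_addr + sh_size <= p_vaddr + p_memsz;
}
```

Using the attached values, .segment.pad.2 (address 0x40039c, size 0xc64) fits inside the first LOAD segment once its MemSiz is 0x1000, but would not fit with the original MemSiz of 0x39c.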
> Is there a chance you could teach the link editor to insert this
> padding?
It would be a lot harder and less flexible. One of the benefits of using
objcopy in this way is that it can be used on binaries that were not
created by ld. So if gold or lld or mold have been used to create an
executable they can still be augmented by using objcopy.
Cheers
Nick
There are 41 section headers, starting at offset 0x3c50:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .note.gnu.property NOTE 0000000000400318 000318 000040 00 A 0 0 8
[ 2] .note.gnu.build-id NOTE 0000000000400358 000358 000024 00 A 0 0 4
[ 3] .note.ABI-tag NOTE 000000000040037c 00037c 000020 00 A 0 0 4
[ 4] .init PROGBITS 0000000000401000 001000 00001b 00 AX 0 0 4
[ 5] .plt PROGBITS 0000000000401020 001020 000020 10 AX 0 0 16
[ 6] .text PROGBITS 0000000000401040 001040 000106 00 AX 0 0 16
[ 7] .fini PROGBITS 0000000000401148 001148 00000d 00 AX 0 0 4
[ 8] .interp PROGBITS 0000000000402000 002000 00001c 00 A 0 0 1
[ 9] .gnu.hash GNU_HASH 0000000000402020 002020 00001c 00 A 10 0 8
[10] .dynsym DYNSYM 0000000000402040 002040 000060 18 A 11 1 8
[11] .dynstr STRTAB 00000000004020a0 0020a0 00004a 00 A 0 0 1
[12] .gnu.version VERSYM 00000000004020ea 0020ea 000008 02 A 10 0 2
[13] .gnu.version_r VERNEED 00000000004020f8 0020f8 000030 00 A 11 1 8
[14] .rela.dyn RELA 0000000000402128 002128 000030 18 A 10 0 8
[15] .rela.plt RELA 0000000000402158 002158 000018 18 AI 10 23 8
[16] .rodata PROGBITS 0000000000402170 002170 000020 00 A 0 0 8
[17] .eh_frame_hdr PROGBITS 0000000000402190 002190 00002c 00 A 0 0 4
[18] .eh_frame PROGBITS 00000000004021c0 0021c0 00008c 00 A 0 0 8
[19] .init_array INIT_ARRAY 0000000000403df8 002df8 000008 08 WA 0 0 8
[20] .fini_array FINI_ARRAY 0000000000403e00 002e00 000008 08 WA 0 0 8
[21] .dynamic DYNAMIC 0000000000403e08 002e08 0001d0 10 WA 11 0 8
[22] .got PROGBITS 0000000000403fd8 002fd8 000010 08 WA 0 0 8
[23] .got.plt PROGBITS 0000000000403fe8 002fe8 000020 08 WA 0 0 8
[24] .data PROGBITS 0000000000404008 003008 000018 00 WA 0 0 8
[25] .bss NOBITS 0000000000404020 003020 000008 00 WA 0 0 1
[26] .comment PROGBITS 0000000000000000 003020 00005c 01 MS 0 0 1
[27] .annobin.notes STRTAB 0000000000000000 00307c 00014f 01 MS 0 0 1
[28] .gnu.build.attributes NOTE 0000000000406028 0031cc 000144 00 0 0 4
[29] .debug_aranges PROGBITS 0000000000000000 003310 000030 00 0 0 1
[30] .debug_info PROGBITS 0000000000000000 003340 0000ac 00 0 0 1
[31] .debug_abbrev PROGBITS 0000000000000000 0033ec 00008d 00 0 0 1
[32] .debug_line PROGBITS 0000000000000000 003479 000054 00 0 0 1
[33] .debug_str PROGBITS 0000000000000000 0034cd 00005c 01 MS 0 0 1
[34] .debug_line_str PROGBITS 0000000000000000 003529 00005c 01 MS 0 0 1
[35] .segment.pad.2 NOBITS 000000000040039c 00039c 000c64 00 A 0 0 1
[36] .segment.pad.3 NOBITS 0000000000401155 001155 000eab 00 AX 0 0 1
[37] .segment.pad.4 NOBITS 000000000040224c 00224c 000db4 00 A 0 0 1
[38] .symtab SYMTAB 0000000000000000 003588 000360 18 39 18 8
[39] .strtab STRTAB 0000000000000000 0038e8 0001ae 00 0 0 1
[40] .shstrtab STRTAB 0000000000000000 003a96 0001b8 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
Elf file type is EXEC (Executable file)
Entry point 0x401040
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x000230 0x000230 R 0x8
INTERP 0x002000 0x0000000000402000 0x0000000000402000 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00039c 0x001000 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x000155 0x001000 R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x00024c 0x001000 R 0x1000
LOAD 0x002df8 0x0000000000403df8 0x0000000000403df8 0x000228 0x000230 RW 0x1000
DYNAMIC 0x002e08 0x0000000000403e08 0x0000000000403e08 0x0001d0 0x0001d0 RW 0x8
NOTE 0x000318 0x0000000000400318 0x0000000000400318 0x000040 0x000040 R 0x8
NOTE 0x000358 0x0000000000400358 0x0000000000400358 0x000044 0x000044 R 0x4
GNU_PROPERTY 0x000318 0x0000000000400318 0x0000000000400318 0x000040 0x000040 R 0x8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.property .note.gnu.build-id .note.ABI-tag .segment.pad.2
03 .init .plt .text .fini .segment.pad.3
04 .interp .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .rodata .eh_frame_hdr .eh_frame .segment.pad.4
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.gnu.property
08 .note.gnu.build-id .note.ABI-tag
09 .note.gnu.property
@@ -1285,6 +1285,7 @@ objcopy [@option{-F} @var{bfdname}|@opti
[@option{--debugging}]
[@option{--gap-fill=}@var{val}]
[@option{--pad-to=}@var{address}]
+ [@option{--add-segment-padding-sections}]
[@option{--set-start=}@var{val}]
[@option{--adjust-start=}@var{incr}]
[@option{--change-addresses=}@var{incr}]
@@ -1666,6 +1667,18 @@ Pad the output file up to the load addre
done by increasing the size of the last section. The extra space is
filled in with the value specified by @option{--gap-fill} (default zero).
+@item --add-segment-padding-sections
+Add empty sections that eliminate any gaps between loadable segments.
+This option only works with ELF format files that contain both
+sections and segments. The added sections will have the SHT_NOBITS
+type and permissions that match the segment that they are padding.
+The section names will be of the form .segment.pad.<N> where <N> is
+the number of the load segment that is being padded.
+
+Note - this option should not be used in conjunction with any other
+option that adds or removes sections, or changes their size in any
+way.
+
@item --set-start @var{val}
Set the start address (also known as the entry address) of the new
file to @var{val}. Not all object file formats support setting the
@@ -245,6 +245,9 @@ static bool change_leading_char = false;
/* Whether to remove the leading character from global symbol names. */
static bool remove_leading_char = false;
+/* If true add sections that pad loadable segments so that there are no gaps between them. */
+static bool add_segment_padding_sections = false;
+
/* Whether to permit wildcard in symbol comparison. */
static bool wildcard = false;
@@ -309,6 +312,7 @@ enum command_line_switch
{
OPTION_ADD_SECTION=150,
OPTION_ADD_GNU_DEBUGLINK,
+ OPTION_ADD_SEGMENT_PADDING_SECTIONS,
OPTION_ADD_SYMBOL,
OPTION_ALT_MACH_CODE,
OPTION_CHANGE_ADDRESSES,
@@ -424,6 +428,7 @@ static struct option copy_options[] =
{
{"add-gnu-debuglink", required_argument, 0, OPTION_ADD_GNU_DEBUGLINK},
{"add-section", required_argument, 0, OPTION_ADD_SECTION},
+ {"add-segment-padding-sections", no_argument, 0, OPTION_ADD_SEGMENT_PADDING_SECTIONS},
{"add-symbol", required_argument, 0, OPTION_ADD_SYMBOL},
{"adjust-section-vma", required_argument, 0, OPTION_CHANGE_SECTION_ADDRESS},
{"adjust-start", required_argument, 0, OPTION_CHANGE_START},
@@ -626,6 +631,7 @@ copy_usage (FILE *stream, int exit_statu
-b --byte <num> Select byte <num> in every interleaved block\n\
--gap-fill <val> Fill gaps between sections with <val>\n\
--pad-to <addr> Pad the last section up to address <addr>\n\
+ --add-segment-padding-sections Add padding sections so that load segments do not have gaps\n\
--set-start <addr> Set the start address to <addr>\n\
{--change-start|--adjust-start} <incr>\n\
Add <incr> to the start address\n\
@@ -2647,6 +2653,143 @@ set_long_section_mode (bfd *output_bfd,
}
}
+static bool
+is_loadable_segment (const Elf_Internal_Phdr * phdr)
+{
+ return phdr->p_type == PT_LOAD || phdr->p_type == PT_TLS;
+}
+
+static bfd_vma
+page_align_down (bfd_vma pagesize, bfd_vma addr)
+{
+ return addr & ~ (pagesize - 1);
+}
+
+static void
+add_padding_sections (bfd *ibfd, bfd *obfd)
+{
+ if (obfd == NULL) /* Paranoia. */
+ return;
+
+
+ if (bfd_get_flavour (ibfd) != bfd_target_elf_flavour
+ || bfd_get_flavour (obfd) != bfd_target_elf_flavour)
+ {
+ non_fatal (_("the --add-segment-padding-sections option only works with ELF format files\n"));
+ return;
+ }
+
+ if (add_sections != NULL || update_sections != NULL || gap_fill_set || pad_to_set)
+ {
+ non_fatal (_("the --add-segment-padding-sections option does not work with other padding/modifying options\n"));
+ return;
+ }
+
+ const struct elf_backend_data * bed = get_elf_backend_data (ibfd);
+ const struct elf_obj_tdata * tdata = elf_tdata (ibfd);
+ const Elf_Internal_Ehdr * ehdr = elf_elfheader (ibfd);
+
+ if (bed == NULL || tdata == NULL || ehdr == NULL)
+ {
+ non_fatal ("could not find ELF data\n");
+ return;
+ }
+
+ const Elf_Internal_Phdr * prev = NULL;
+ unsigned int i;
+ bool sections_added = false;
+
+ for (i = 1; i < ehdr->e_phnum; i++)
+ {
+ const Elf_Internal_Phdr * current = tdata->phdr + i;
+
+ if (! is_loadable_segment (current))
+ continue;
+
+ /* If this is the first loadable segment, just record it. */
+ if (prev == NULL)
+ {
+ prev = current;
+ continue;
+ }
+
+ bfd_vma prev_end = prev->p_vaddr + prev->p_memsz;
+ bfd_vma current_start = page_align_down (bed->commonpagesize, current->p_vaddr);
+
+ /* If the segments are not ordered by increasing p_vaddr then abort.
+ Note: the ELF standard requires that segments be sorted by p_vaddr,
+ but linker scripts are able to override this. */
+ if (current_start < prev_end)
+ break;
+
+ /* If the previous segment ended at the start of the
+ current segment then there is nothing to do. */
+ if (prev_end == current_start)
+ {
+ prev = current;
+ continue;
+ }
+
+ flagword flags = SEC_LINKER_CREATED | SEC_ALLOC;
+
+ /* We do not add SEC_HAS_CONTENTS because we want to create a SHT_NOBITS
+ section. That way we will not take up (much) extra space in the file. */
+
+ if (prev->p_flags & PF_X)
+ flags |= SEC_CODE | SEC_READONLY;
+ else if (prev->p_flags & PF_W)
+ flags |= SEC_DATA;
+ else if (prev->p_flags & PF_R)
+ flags |= SEC_DATA | SEC_READONLY;
+ else
+ {
+ prev = current;
+ continue;
+ }
+
+#define SEGMENT_PADDING_SECTION_NAME ".segment.pad"
+ char * new_name;
+ if (asprintf (& new_name, SEGMENT_PADDING_SECTION_NAME ".%u", i - 1) < 1)
+ {
+ non_fatal ("unable to construct padding section name\n");
+ break;
+ }
+
+ asection * new_section = bfd_make_section_with_flags (obfd, new_name, flags);
+
+ if (new_section == NULL)
+ {
+ free (new_name);
+ non_fatal ("unable to make padding section\n");
+ break;
+ }
+
+ bfd_vma new_size = (current_start - prev->p_vaddr) - prev->p_memsz;
+
+ if (! bfd_set_section_size (new_section, new_size))
+ {
+ bfd_nonfatal_message (NULL, obfd, new_section, NULL);
+ break;
+ }
+
+ if (! bfd_set_section_vma (new_section, prev_end))
+ {
+ bfd_nonfatal_message (NULL, obfd, new_section, NULL);
+ break;
+ }
+
+ /* Do not free the name - it is used later on. */
+
+ sections_added = true;
+ prev = current;
+ }
+
+ /* If we have added any sections, remove the section
+ to segment map so that it is regenerated. */
+ if (sections_added)
+ elf_seg_map (obfd) = NULL;
+}
+
/* Copy object file IBFD onto OBFD.
Returns TRUE upon success, FALSE otherwise. */
@@ -3284,6 +3427,9 @@ copy_object (bfd *ibfd, bfd *obfd, const
}
}
+ if (add_segment_padding_sections)
+ add_padding_sections (ibfd, obfd);
+
/* Symbol filtering must happen after the output sections
have been created, but before their contents are set. */
dhandle = NULL;
@@ -5596,6 +5742,10 @@ copy_main (int argc, char *argv[])
pad_to_set = true;
break;
+ case OPTION_ADD_SEGMENT_PADDING_SECTIONS:
+ add_segment_padding_sections = true;
+ break;
+
case OPTION_REMOVE_LEADING_CHAR:
remove_leading_char = true;
break;
@@ -1454,6 +1454,7 @@ proc objcopy_remove_relocations_from_exe
if [is_elf_format] {
objcopy_remove_relocations_from_executable
+ run_dump_test "add-segment-padding-sections"
}
run_dump_test "pr23633"
@@ -1481,3 +1482,5 @@ if { ![is_xcoff_format] } {
}
run_dump_test "rename-section-01"
+
+
@@ -0,0 +1,10 @@
+#PROG: objcopy
+#name: objcopy --add-segment-padding-sections
+#source: bintest.s
+#ld: -e 0 --defsym external_symbol=0
+#objcopy: --add-segment-padding-sections
+#readelf: -lW
+
+#...
+[ ]+LOAD[ ]+0x[0-9a-f]+[ ]+0x[0-9a-f]+[ ]+0x[0-9a-f]+[ ]+0x[0-9a-f]+[ ]+0x[0-9a-f]+000.*
+#pass