[1/4] Nios II: elf subdir changes, unsigned divide in ld.so
Commit Message
This patch contains the changes in the elf/ subdir. The main changes are:
(1) The changes to add Nios II to elf/elf.h, machine number,
relocations, etc. Pretty standard stuff copied from BFD.
(2) A change in elf/dynamic-link.h, to allow a port to define a special
unsigned integer divide during RTLD_BOOTSTRAP, by defining
RTLD_BOOTSTRAP_UDIV.
The rationale is: while Nios II does have optional hardware integer
divide defined as part of the ISA spec, the current support we're
working on here is for the common no-divider case, relying on libgcc
__udivsi3.
Unfortunately however, Nios II does not have PC-relative calls (only a
direct transfer call similar to MIPS), so all subroutine calls under PIC
have to go through the GOT, which... is not available yet during
RTLD_BOOTSTRAP.
So the RTLD_BOOTSTRAP_UDIV macro facility allows nios2 to define its own
inline udiv in dl-machine.h, for use when libgcc is still not accessible.
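To make the idea concrete, here is a minimal sketch of the kind of call-free unsigned divide a port could supply through RTLD_BOOTSTRAP_UDIV. It is not the code from the Nios II dl-machine.h; the name bootstrap_udiv is hypothetical, a 32-bit word is assumed, and the assert is only a host-side sanity check that would not be usable during bootstrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: restoring
   (shift-and-subtract) unsigned division that makes no function calls,
   so it needs neither libgcc nor the GOT.  */
static inline uint32_t
bootstrap_udiv (uint32_t num, uint32_t den)
{
  assert (den != 0);		/* Host-side check only.  */

  uint32_t quot = 0;
  uint32_t rem = 0;

  for (int bit = 31; bit >= 0; bit--)
    {
      /* Remember the bit about to be shifted out of the remainder, so
	 divisors above 2^31 are handled correctly.  */
      uint32_t carry = rem >> 31;

      /* Bring down the next dividend bit.  */
      rem = (rem << 1) | ((num >> bit) & 1);

      if (carry || rem >= den)
	{
	  rem -= den;
	  quot |= UINT32_C (1) << bit;
	}
    }

  return quot;
}
```

Under these assumptions a port would map the macro onto the helper, for example `#define RTLD_BOOTSTRAP_UDIV(A, B) bootstrap_udiv (A, B)`. The generated code would still need to be inspected to confirm that the compiler emits no library call for the loop.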
Thanks,
Chung-Lin
* elf/dynamic-link.h (RTLD_UDIV): New macro.
(elf_machine_lazy_rel): Change divide operation to use RTLD_UDIV.
* elf/elf.h (EM_ALTERA_NIOS2): New machine number for Altera
Nios II.
(DT_NIOS2_GP): New dynamic entry type for Nios II _gp address.
(R_NIOS2_*): Define Nios II relocations.
Comments
On Fri, 28 Mar 2014, Chung-Lin Tang wrote:
> (2) A change in elf/dynamic-link.h, to allow a port to define a special
> unsigned integer divide during RTLD_BOOTSTRAP, by defining
> RTLD_BOOTSTRAP_UDIV.
>
> The rationale is: while Nios II does have optional hardware integer
> divide defined as part of the ISA spec, the current support we're
> working on here is for the common no-divider case, relying on libgcc
> __udivsi3.
>
> Unfortunately however, Nios II does not have PC-relative calls (only a
> direct transfer call similar to MIPS), so all subroutine calls under PIC
> have to go through the GOT, which... is not available yet during
> RTLD_BOOTSTRAP.
>
> So the RTLD_BOOTSTRAP_UDIV macro facility allows nios2 to define its own
> inline udiv in dl-machine.h, for use when libgcc is still not accessible.
With this approach, there's a risk of a future GCC version noticing an
implementation of division and optimizing it back into a library call.
Did you consider a -minline-divide or similar GCC option that could be
used for building the dynamic linker (and if so, what was the rationale
for rejecting it)? (See below for one possible reason not to use that
approach.)
> (R_NIOS2_*): Define Nios II relocations.
ChangeLog entries need to mention each new symbol individually.
> +/* Allow targets that do not have libgcc __udivsi3 available to define
> + one locally. */
> +# if defined RTLD_BOOTSTRAP && defined RTLD_BOOTSTRAP_UDIV
> +# define RTLD_UDIV(A, B) RTLD_BOOTSTRAP_UDIV (A, B)
> +# else
> +# define RTLD_UDIV(A, B) ((A) / (B))
> +# endif
#ifdef-like conditionals carry the risk of typos going quietly unnoticed, as
discussed recently in the -Wundef discussions, so for a new macro it would
be better to have it always defined exactly once, without any "#if
defined". For example, a header dl-udiv.h could define
RTLD_BOOTSTRAP_UDIV trivially in the generic version and non-trivially in
the Nios II version.
The comment here needs to explain the semantics to be provided by the
macro RTLD_BOOTSTRAP_UDIV. In particular, can it be assumed (a) that the
division is always for word-sized values, (b) that it is always exact, (c)
that it is always dividing by sizeof (ElfW(reloc))?
Given those assumptions, you could do something like
#define RTLD_BOOTSTRAP_UDIV(A, B) \
  ({ \
    _Static_assert ((B) == 12, "bad arguments to RTLD_BOOTSTRAP_UDIV"); \
    ((A) >> 2) * 0xaaaaaaab; \
  })
(untested), to implement exact division in terms of multiplication, if
that's available inline and smaller / faster than dividing bit-by-bit.
(And if multiplication isn't inline either, then you can use a few shifts
and additions for that particular multiplication.)
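The arithmetic behind that suggestion can be checked on a host with a sketch like the one below. The function names are hypothetical, a 32-bit word is assumed, and the results are only meaningful when the dividend is an exact multiple of 12. The first function multiplies by the inverse of 3 modulo 2^32; the second performs the same multiplication with shifts and additions only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: exact division of a 32-bit value by 12.
   Only valid when N is a multiple of 12.  */
static inline uint32_t
exact_div12 (uint32_t n)
{
  assert ((n & 3) == 0);	/* Host-side check: N must be a multiple of 4.  */

  /* 12 = 4 * 3.  Shift out the factor of 4, then multiply by the
     inverse of 3 modulo 2^32:
     3 * 0xaaaaaaab = 0x200000001, which is 1 modulo 2^32.  */
  return (n >> 2) * UINT32_C (0xaaaaaaab);
}

/* Same computation without a multiply instruction.  Uses
   0xaaaaaaab = 2 * 0x55555555 + 1 and
   0x55555555 = 0x5 * 0x11 * 0x101 * 0x10001.  */
static inline uint32_t
exact_div12_shift_add (uint32_t n)
{
  assert ((n & 3) == 0);	/* Host-side check: N must be a multiple of 4.  */

  uint32_t q = n >> 2;
  uint32_t t = q + (q << 2);	/* q * 0x5 */
  t += t << 4;			/* q * 0x55 */
  t += t << 8;			/* q * 0x5555 */
  t += t << 16;			/* q * 0x55555555 */
  return (t << 1) + q;		/* q * 0xaaaaaaab */
}
```

Both functions wrap modulo 2^32 by design, which is why unsigned arithmetic is used throughout. For a dividend that is not a multiple of 12 they return an unrelated value rather than a truncated quotient, so they are only a drop-in replacement if the interface guarantees exactness.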
I'm not saying these other approaches to the division are necessarily
smaller / faster on Nios II, but if the interface means the division is
exact then they should at least be considered. (And if it's exact, but
the compiler can't tell that, that's a reason against using an option for
the compiler to inline division.)
Also, it would be worth reviewing the other approaches discussed when this
issue was raised for the Xtensa port of glibc
<https://sourceware.org/ml/libc-alpha/2007-04/msg00002.html>.
> +/* Nios II relocations. */
> +#define R_NIOS2_NONE 0 /* No reloc. */
> +#define R_NIOS2_S16 1 /* Direct signed 16 bit. */
> +#define R_NIOS2_U16 2 /* Direct unsigned 16 bit. */
> +#define R_NIOS2_PCREL16 3 /* PC relative 16 bit. */
> +#define R_NIOS2_CALL26 4
Each macro should have a comment rather than giving up on comments after
the first few.
Send the elf.h changes alone. Those can go in first, assuming the
corresponding constants and names are already committed in binutils.
But we'd like to see a short comment on each and every R_* macro.
Thanks,
Roland
@@ -97,6 +97,15 @@ elf_machine_lazy_rel (struct link_map *map,
# define ELF_DURING_STARTUP (0)
# endif
+/* Allow targets that do not have libgcc __udivsi3 available to define
+ one locally. */
+# if defined RTLD_BOOTSTRAP && defined RTLD_BOOTSTRAP_UDIV
+# define RTLD_UDIV(A, B) RTLD_BOOTSTRAP_UDIV (A, B)
+# else
+# define RTLD_UDIV(A, B) ((A) / (B))
+# endif
+
+
/* Get the definitions of `elf_dynamic_do_rel' and `elf_dynamic_do_rela'.
These functions are almost identical, so we use cpp magic to avoid
duplicating their code. It cannot be done in a more general function
@@ -123,7 +132,7 @@ elf_machine_lazy_rel (struct link_map *map,
if (map->l_info[VERSYMIDX (DT_##RELOC##COUNT)] != NULL) \
ranges[0].nrelative \
= MIN (map->l_info[VERSYMIDX (DT_##RELOC##COUNT)]->d_un.d_val, \
- ranges[0].size / sizeof (ElfW(reloc))); \
+ RTLD_UDIV (ranges[0].size, sizeof (ElfW(reloc)))); \
} \
if ((map)->l_info[DT_PLTREL] \
&& (!test_rel || (map)->l_info[DT_PLTREL]->d_un.d_val == DT_##RELOC)) \
@@ -249,6 +249,7 @@ typedef struct
#define EM_OPENRISC 92 /* OpenRISC 32-bit embedded processor */
#define EM_ARC_A5 93 /* ARC Cores Tangent-A5 */
#define EM_XTENSA 94 /* Tensilica Xtensa Architecture */
+#define EM_ALTERA_NIOS2 113 /* Altera Nios II */
#define EM_AARCH64 183 /* ARM AARCH64 */
#define EM_TILEPRO 188 /* Tilera TILEPro */
#define EM_MICROBLAZE 189 /* Xilinx MicroBlaze */
@@ -3132,6 +3133,57 @@ typedef Elf32_Addr Elf32_Conflict;
#define R_MICROBLAZE_TLSGOTTPREL32 28 /* TLS Offset From Thread Pointer. */
#define R_MICROBLAZE_TLSTPREL32 29 /* TLS Offset From Thread Pointer. */
+/* Legal values for d_tag (dynamic entry type). */
+#define DT_NIOS2_GP 0x70000002 /* Address of _gp. */
+
+/* Nios II relocations. */
+#define R_NIOS2_NONE 0 /* No reloc. */
+#define R_NIOS2_S16 1 /* Direct signed 16 bit. */
+#define R_NIOS2_U16 2 /* Direct unsigned 16 bit. */
+#define R_NIOS2_PCREL16 3 /* PC relative 16 bit. */
+#define R_NIOS2_CALL26 4
+#define R_NIOS2_IMM5 5
+#define R_NIOS2_CACHE_OPX 6
+#define R_NIOS2_IMM6 7
+#define R_NIOS2_IMM8 8
+#define R_NIOS2_HI16 9
+#define R_NIOS2_LO16 10
+#define R_NIOS2_HIADJ16 11
+#define R_NIOS2_BFD_RELOC_32 12
+#define R_NIOS2_BFD_RELOC_16 13
+#define R_NIOS2_BFD_RELOC_8 14
+#define R_NIOS2_GPREL 15
+#define R_NIOS2_GNU_VTINHERIT 16
+#define R_NIOS2_GNU_VTENTRY 17
+#define R_NIOS2_UJMP 18
+#define R_NIOS2_CJMP 19
+#define R_NIOS2_CALLR 20
+#define R_NIOS2_ALIGN 21
+#define R_NIOS2_GOT16 22
+#define R_NIOS2_CALL16 23
+#define R_NIOS2_GOTOFF_LO 24
+#define R_NIOS2_GOTOFF_HA 25
+#define R_NIOS2_PCREL_LO 26
+#define R_NIOS2_PCREL_HA 27
+#define R_NIOS2_TLS_GD16 28
+#define R_NIOS2_TLS_LDM16 29
+#define R_NIOS2_TLS_LDO16 30
+#define R_NIOS2_TLS_IE16 31
+#define R_NIOS2_TLS_LE16 32
+#define R_NIOS2_TLS_DTPMOD 33
+#define R_NIOS2_TLS_DTPREL 34
+#define R_NIOS2_TLS_TPREL 35
+#define R_NIOS2_COPY 36
+#define R_NIOS2_GLOB_DAT 37
+#define R_NIOS2_JUMP_SLOT 38
+#define R_NIOS2_RELATIVE 39
+#define R_NIOS2_GOTOFF 40
+#define R_NIOS2_CALL26_NOAT 41
+#define R_NIOS2_GOT_LO 42
+#define R_NIOS2_GOT_HA 43
+#define R_NIOS2_CALL_LO 44
+#define R_NIOS2_CALL_HA 45
+
/* TILEPro relocations. */
#define R_TILEPRO_NONE 0 /* No reloc */
#define R_TILEPRO_32 1 /* Direct 32 bit */