[RFC] Building for SuperH fails with gcc5/6

Message ID d752ba99-6900-0df2-b7b0-146a8803bc11@att.net
State New, archived
Headers

Commit Message

Alexey Neyman Jan. 26, 2017, 7:03 a.m. UTC
  Hi,

Build glibc for sh4-unknown-linux-gnu currently fails if one's using 
GCC5/6: in dl-conflict.c, the elf_machine_rela() function is called with 
NULL as its 3rd argument, sym. The implementation of that function in 
sysdeps/sh/dl-machine.h dereferences that pointer:

const Elf32_Sym *const refsym = sym;
...
if (map == &GL(dl_rtld_map))
   value -= map->l_addr + refsym->st_value + reloc->r_addend;

GCC discovers a null pointer dereference, and in accordance with 
-fdelete-null-pointer-checks (which is enabled in -O2) replaces this 
code with a trap - which, as SH does not implement a trap pattern in 
GCC, evaluates to an abort() call. This abort() call pulls many more 
objects from libc_nonshared.a, eventually resulting in link failure due 
to multiple definitions for a number of symbols.

As far as I see, the conditional before this code is always false in 
rtld: _dl_resolve_conflicts() is called with main_map as the first 
argument, not GL(_dl_rtld_map), but since that call is in yet another 
compilation unit, GCC does not know about it. Patch that wraps this 
conditional into !defined RESOLVE_CONFLICT_FIND_MAP attached.

However, I thought the abort() call may be also inserted by GCC on other 
architectures that do not implement the trap pattern, so it may make 
sense to provide a dummy abort() in rtld that does not pull any other 
dependencies. An alternative patch for this approach also attached.

And, of course, it is also possible to just add 
-fno-delete-null-pointer-checks to the compiler flags for dl-conflict.c 
on SH, even though it would unnecessarily pessimize the code.

Comments?

Regards,
Alexey.
  

Comments

Adhemerval Zanella Jan. 26, 2017, 12:25 p.m. UTC | #1
On 26/01/2017 05:03, Alexey Neyman wrote:
> Hi,
> 
> Build glibc for sh4-unknown-linux-gnu currently fails if one's using GCC5/6: in dl-conflict.c, the elf_machine_rela() function is called with NULL as its 3rd argument, sym. The implementation of that function in sysdeps/sh/dl-machine.h dereferences that pointer:
> 
> const Elf32_Sym *const refsym = sym;
> ...
> if (map == &GL(dl_rtld_map))
>   value -= map->l_addr + refsym->st_value + reloc->r_addend;
> 
> GCC discovers a null pointer dereference, and in accordance with -fdelete-null-pointer-checks (which is enabled in -O2) replaces this code with a trap - which, as SH does not implement a trap pattern in GCC, evaluates to an abort() call. This abort() call pulls many more objects from libc_nonshared.a, eventually resulting in link failure due to multiple definitions for a number of symbols.
> 

This issue is being tracked by PR#70216 "[SH] Implement __builtin_trap" [1]. 

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70216


> As far as I see, the conditional before this code is always false in rtld: _dl_resolve_conflicts() is called with main_map as the first argument, not GL(_dl_rtld_map), but since that call is in yet another compilation unit, GCC does not know about it. Patch that wraps this conditional into !defined RESOLVE_CONFLICT_FIND_MAP attached.
> 
> However, I thought the abort() call may be also inserted by GCC on other architectures that do not implement the trap pattern, so it may make sense to provide a dummy abort() in rtld that does not pull any other dependencies. An alternative patch for this approach also attached.
> 
> And, of course, it is also possible to just add -fno-delete-null-pointer-checks to the compiler flags for dl-conflict.c on SH, even though it would unnecessarily pessimize the code.
> 
> Comments?

Current build-many-glibcs scripts uses both '-fno-isolate-erroneous-paths-dereference' and
'-fno-isolate-erroneous-paths-attribute' on all SH targets and I am not sure which option
is better in this specific situation.

> 
> Regards,
> Alexey.
> 
> 
> 0001-Provide-a-minimal-abort-stub-for-rtld.patch
> 
> 

> +
> +void
> +abort(void)
> +{
> +  // It's noreturn...
> +  for (;;)
> +    ;
> +}

I do not think issue an infinite loop should be a default abort, its contract
explicit states the program should abnormally exit.  I think we need to implement
an arch-specific override using the defined abort instruction for the archicture
(ABORT_INSTRUCTION).

> diff --git a/sysdeps/sh/Makefile b/sysdeps/sh/Makefile
> index 0c6db9a..039217c 100644
> --- a/sysdeps/sh/Makefile
> +++ b/sysdeps/sh/Makefile
> @@ -9,3 +9,7 @@ endif
>  ifeq ($(subdir),debug)
>  CFLAGS-backtrace.c += -funwind-tables
>  endif
> +
> +ifeq ($(subdir),elf)
> +sysdep-rtld-routines += dl-abort-stub
> +endif
> -- 2.9.3
> 
> 
> 0001-sh-conditional-is-false-in-dl-conflict.c.patch
> 
> 
> From 279acf7a059f2d2296f690d7f2d886bd0e404f30 Mon Sep 17 00:00:00 2001
> From: Alexey Neyman <stilor@att.net>
> Date: Wed, 25 Jan 2017 21:46:53 -0800
> Subject: [PATCH] sh: conditional is false in dl-conflict.c
> 
> ... ifdef it out, so that it doesn't create a call to abort().
> 
> Signed-off-by: Alexey Neyman <stilor@att.net>
> ---
>  sysdeps/sh/dl-machine.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sysdeps/sh/dl-machine.h b/sysdeps/sh/dl-machine.h
> index 449deea..2b468af 100644
> --- a/sysdeps/sh/dl-machine.h
> +++ b/sysdeps/sh/dl-machine.h
> @@ -389,7 +389,7 @@ elf_machine_rela (struct link_map *map, const Elf32_Rela *reloc,
>  	  break;
>  	case R_SH_DIR32:
>  	  {
> -#ifndef RTLD_BOOTSTRAP
> +#if !defined RTLD_BOOTSTRAP && !defined RESOLVE_CONFLICT_FIND_MAP
>  	   /* This is defined in rtld.c, but nowhere in the static
>  	      libc.a; make the reference weak so static programs can
>  	      still link.  This declaration cannot be done when
> -- 2.9.3
> 

This change seems a better alternative. Cross-compiling sh it shows no 
more build issues and I am assuming here this change would not cause
any side effects in runtime. 

In any way, I would prefer to add a SH specific abort implementation
to be used by the loader.  However it seems in PR#70216 comments that
gcc developers are still defining which would be best instruction
to use.
  
Alexey Neyman Feb. 4, 2017, 8:28 p.m. UTC | #2
On 01/26/2017 03:52 AM, Adhemerval Zanella wrote:
> On 26/01/2017 05:03, Alexey Neyman wrote:
>> Hi,
>>
>> Build glibc for sh4-unknown-linux-gnu currently fails if one's using GCC5/6: in dl-conflict.c, the elf_machine_rela() function is called with NULL as its 3rd argument, sym. The implementation of that function in sysdeps/sh/dl-machine.h dereferences that pointer:
>>
>> const Elf32_Sym *const refsym = sym;
>> ...
>> if (map == &GL(dl_rtld_map))
>>    value -= map->l_addr + refsym->st_value + reloc->r_addend;
>>
>> GCC discovers a null pointer dereference, and in accordance with -fdelete-null-pointer-checks (which is enabled in -O2) replaces this code with a trap - which, as SH does not implement a trap pattern in GCC, evaluates to an abort() call. This abort() call pulls many more objects from libc_nonshared.a, eventually resulting in link failure due to multiple definitions for a number of symbols.
>>
> This issue is being tracked by PR#70216 "[SH] Implement __builtin_trap" [1].
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70216
Thanks. Sorry, I wanted but forgot to reference this GCC issue.
>> As far as I see, the conditional before this code is always false in rtld: _dl_resolve_conflicts() is called with main_map as the first argument, not GL(_dl_rtld_map), but since that call is in yet another compilation unit, GCC does not know about it. Patch that wraps this conditional into !defined RESOLVE_CONFLICT_FIND_MAP attached.
>>
>> However, I thought the abort() call may be also inserted by GCC on other architectures that do not implement the trap pattern, so it may make sense to provide a dummy abort() in rtld that does not pull any other dependencies. An alternative patch for this approach also attached.
>>
>> And, of course, it is also possible to just add -fno-delete-null-pointer-checks to the compiler flags for dl-conflict.c on SH, even though it would unnecessarily pessimize the code.
>>
>> Comments?
> Current build-many-glibcs scripts uses both '-fno-isolate-erroneous-paths-dereference' and
> '-fno-isolate-erroneous-paths-attribute' on all SH targets and I am not sure which option
> is better in this specific situation.
This condition is covered by the former; the latter is not currently 
enabled by -O2. My bad, I referred to a wrong GCC option.

However, enabling it in the whole build (as that script does) will 
pessimize all the code, even in locations where abort() can be called. 
If anything, it should be constrained to rtld or even to specific files 
in rtld.
>> 0001-Provide-a-minimal-abort-stub-for-rtld.patch
>>  From 279acf7a059f2d2296f690d7f2d886bd0e404f30 Mon Sep 17 00:00:00 2001
>> From: Alexey Neyman <stilor@att.net>
>> Date: Wed, 25 Jan 2017 21:46:53 -0800
>> Subject: [PATCH] sh: conditional is false in dl-conflict.c
>>
>> ... ifdef it out, so that it doesn't create a call to abort().
>>
>> Signed-off-by: Alexey Neyman <stilor@att.net>
>> ---
>>   sysdeps/sh/dl-machine.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sysdeps/sh/dl-machine.h b/sysdeps/sh/dl-machine.h
>> index 449deea..2b468af 100644
>> --- a/sysdeps/sh/dl-machine.h
>> +++ b/sysdeps/sh/dl-machine.h
>> @@ -389,7 +389,7 @@ elf_machine_rela (struct link_map *map, const Elf32_Rela *reloc,
>>   	  break;
>>   	case R_SH_DIR32:
>>   	  {
>> -#ifndef RTLD_BOOTSTRAP
>> +#if !defined RTLD_BOOTSTRAP && !defined RESOLVE_CONFLICT_FIND_MAP
>>   	   /* This is defined in rtld.c, but nowhere in the static
>>   	      libc.a; make the reference weak so static programs can
>>   	      still link.  This declaration cannot be done when
>> -- 2.9.3
>>
> This change seems a better alternative. Cross-compiling sh it shows no
> more build issues and I am assuming here this change would not cause
> any side effects in runtime.
I have no SH hardware either. If anyone could verify the resulting 
glibc, that would be great

It shouldn't be much worse that GCC4 behavior, or the behavior with 
-fno-isolate-errnoneous-paths-dereference (which, if I am wrong and that 
conditional can execute from dl-conflict.c, would've dereferenced a NULL 
pointer).
> In any way, I would prefer to add a SH specific abort implementation
> to be used by the loader.  However it seems in PR#70216 comments that
> gcc developers are still defining which would be best instruction
> to use.
GCC5/6 do not have such a trap, and these versions are going to be in 
use for a while. GCC7 is around the corner and as it looks, isn't going 
to implement SH-specific trap. I think glibc should adapt to its 
shortcomings.

Can the patch be picked up?

Regards,
Alexey.
  

Patch

From 279acf7a059f2d2296f690d7f2d886bd0e404f30 Mon Sep 17 00:00:00 2001
From: Alexey Neyman <stilor@att.net>
Date: Wed, 25 Jan 2017 21:46:53 -0800
Subject: [PATCH] sh: conditional is false in dl-conflict.c

... ifdef it out, so that it doesn't create a call to abort().

Signed-off-by: Alexey Neyman <stilor@att.net>
---
 sysdeps/sh/dl-machine.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/sh/dl-machine.h b/sysdeps/sh/dl-machine.h
index 449deea..2b468af 100644
--- a/sysdeps/sh/dl-machine.h
+++ b/sysdeps/sh/dl-machine.h
@@ -389,7 +389,7 @@  elf_machine_rela (struct link_map *map, const Elf32_Rela *reloc,
 	  break;
 	case R_SH_DIR32:
 	  {
-#ifndef RTLD_BOOTSTRAP
+#if !defined RTLD_BOOTSTRAP && !defined RESOLVE_CONFLICT_FIND_MAP
 	   /* This is defined in rtld.c, but nowhere in the static
 	      libc.a; make the reference weak so static programs can
 	      still link.  This declaration cannot be done when
-- 
2.9.3