[0/3] Simplify ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and revert aarch64/arm's extern protected data handling

Message ID 20220501060615.1694317-1-maskray@google.com
Headers
Series Simplify ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and revert aarch64/arm's extern protected data handling |

Message

Fangrui Song May 1, 2022, 6:06 a.m. UTC
  Say both a.so and b.so define protected var and the executable copy
relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
semantics: a.so accesses the copy in the executable while b.so accesses
its own. This behavior requires that (a) the compiler emits
GOT-generating relocations (b) the linker produces GLOB_DAT instead of
RELATIVE.

Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
will bind to the executable (normal behavior).

For aarch64/arm it makes sense to restore the original behavior and don't
pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
unlikely used by anyone.

* Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
  no GOT-generating relocation in the first place.
* gold and lld reject copy relocation on a STV_PROTECTED symbol.
* Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
  GOT-generating relocation when accessing an default visibility
  external symbol which avoids copy relocation.

Fangrui Song (3):
  elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
    non-DL_EXTERN_PROTECTED_DATA ports
  Revert "[AArch64][BZ #17711] Fix extern protected data handling"
  Revert "[ARM][BZ #17711] Fix extern protected data handling"

 elf/dl-lookup.c              | 46 ++++++++++++------------------------
 sysdeps/aarch64/dl-machine.h | 13 +++++-----
 sysdeps/aarch64/dl-sysdep.h  |  2 --
 sysdeps/arm/dl-machine.h     | 10 +++-----
 sysdeps/arm/dl-sysdep.h      |  2 --
 5 files changed, 24 insertions(+), 49 deletions(-)
  

Comments

H.J. Lu May 9, 2022, 9:26 p.m. UTC | #1
On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Say both a.so and b.so define protected var and the executable copy
> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
> semantics: a.so accesses the copy in the executable while b.so accesses
> its own. This behavior requires that (a) the compiler emits
> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
> RELATIVE.
>
> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
> will bind to the executable (normal behavior).
>
> For aarch64/arm it makes sense to restore the original behavior and don't
> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
> unlikely used by anyone.
>
> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>   no GOT-generating relocation in the first place.
> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>   GOT-generating relocation when accessing an default visibility
>   external symbol which avoids copy relocation.
>
> Fangrui Song (3):
>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
>     non-DL_EXTERN_PROTECTED_DATA ports
>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
>
>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
>  sysdeps/aarch64/dl-sysdep.h  |  2 --
>  sysdeps/arm/dl-machine.h     | 10 +++-----
>  sysdeps/arm/dl-sysdep.h      |  2 --
>  5 files changed, 24 insertions(+), 49 deletions(-)
>
> --
> 2.36.0.464.gb9c8b46e94-goog
>

Given that protected symbols never work properly with copy relocation,
can we change protected symbol handling to simply warn copy relocation
against protected data symbol and non-zero symbol values of undefined
symbols against protected function symbols?
  
Fangrui Song May 10, 2022, 7:42 a.m. UTC | #2
On 2022-05-09, H.J. Lu wrote:
>On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
><libc-alpha@sourceware.org> wrote:
>>
>> Say both a.so and b.so define protected var and the executable copy
>> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
>> semantics: a.so accesses the copy in the executable while b.so accesses
>> its own. This behavior requires that (a) the compiler emits
>> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
>> RELATIVE.
>>
>> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
>> will bind to the executable (normal behavior).
>>
>> For aarch64/arm it makes sense to restore the original behavior and don't
>> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
>> unlikely used by anyone.
>>
>> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>>   no GOT-generating relocation in the first place.
>> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>>   GOT-generating relocation when accessing an default visibility
>>   external symbol which avoids copy relocation.
>>
>> Fangrui Song (3):
>>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
>>     non-DL_EXTERN_PROTECTED_DATA ports
>>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
>>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
>>
>>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
>>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
>>  sysdeps/aarch64/dl-sysdep.h  |  2 --
>>  sysdeps/arm/dl-machine.h     | 10 +++-----
>>  sysdeps/arm/dl-sysdep.h      |  2 --
>>  5 files changed, 24 insertions(+), 49 deletions(-)
>>
>> --
>> 2.36.0.464.gb9c8b46e94-goog
>>
>
>Given that protected symbols never work properly with copy relocation,
>can we change protected symbol handling to simply warn copy relocation
>against protected data symbol and non-zero symbol values of undefined
>symbols against protected function symbols?

"protected symbols never work properly with copy relocation" generally
applies to all non-x86 architectures (arm/aarch64 has some guarantee,
but not reliable). My goal is indeed to remove relevant code for all
non-x86 architectures. They should not even bother testing
STV_PROTECTED.

On x86 there is some guarantee. The gcc>=5 -fpie
HAVE_LD_PIE_COPYRELOC behavior makes the problem trigger in some cases.
Having a warning in x86 specific files (e.g.
sysdeps/x86_64/dl-machine.h R_X86_64_COPY) seems appropriate.
  
H.J. Lu May 10, 2022, 3:02 p.m. UTC | #3
On Tue, May 10, 2022 at 12:42 AM Fangrui Song <maskray@google.com> wrote:
>
> On 2022-05-09, H.J. Lu wrote:
> >On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
> ><libc-alpha@sourceware.org> wrote:
> >>
> >> Say both a.so and b.so define protected var and the executable copy
> >> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
> >> semantics: a.so accesses the copy in the executable while b.so accesses
> >> its own. This behavior requires that (a) the compiler emits
> >> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
> >> RELATIVE.
> >>
> >> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
> >> will bind to the executable (normal behavior).
> >>
> >> For aarch64/arm it makes sense to restore the original behavior and don't
> >> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
> >> unlikely used by anyone.
> >>
> >> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
> >>   no GOT-generating relocation in the first place.
> >> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
> >> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
> >>   GOT-generating relocation when accessing an default visibility
> >>   external symbol which avoids copy relocation.
> >>
> >> Fangrui Song (3):
> >>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
> >>     non-DL_EXTERN_PROTECTED_DATA ports
> >>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
> >>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
> >>
> >>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
> >>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
> >>  sysdeps/aarch64/dl-sysdep.h  |  2 --
> >>  sysdeps/arm/dl-machine.h     | 10 +++-----
> >>  sysdeps/arm/dl-sysdep.h      |  2 --
> >>  5 files changed, 24 insertions(+), 49 deletions(-)
> >>
> >> --
> >> 2.36.0.464.gb9c8b46e94-goog
> >>
> >
> >Given that protected symbols never work properly with copy relocation,
> >can we change protected symbol handling to simply warn copy relocation
> >against protected data symbol and non-zero symbol values of undefined
> >symbols against protected function symbols?
>
> "protected symbols never work properly with copy relocation" generally
> applies to all non-x86 architectures (arm/aarch64 has some guarantee,
> but not reliable). My goal is indeed to remove relevant code for all
> non-x86 architectures. They should not even bother testing
> STV_PROTECTED.

Even on x86, protected symbols don't work as intended.   They don't
provide any performance benefits.  I think we should move away from
copy relocation on protected symbols.  We can either issue a warning
or an error.

> On x86 there is some guarantee. The gcc>=5 -fpie
> HAVE_LD_PIE_COPYRELOC behavior makes the problem trigger in some cases.
> Having a warning in x86 specific files (e.g.
> sysdeps/x86_64/dl-machine.h R_X86_64_COPY) seems appropriate.
  
Fangrui Song May 12, 2022, 4:56 a.m. UTC | #4
On 2022-05-10, H.J. Lu wrote:
>On Tue, May 10, 2022 at 12:42 AM Fangrui Song <maskray@google.com> wrote:
>>
>> On 2022-05-09, H.J. Lu wrote:
>> >On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
>> ><libc-alpha@sourceware.org> wrote:
>> >>
>> >> Say both a.so and b.so define protected var and the executable copy
>> >> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
>> >> semantics: a.so accesses the copy in the executable while b.so accesses
>> >> its own. This behavior requires that (a) the compiler emits
>> >> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
>> >> RELATIVE.
>> >>
>> >> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
>> >> will bind to the executable (normal behavior).
>> >>
>> >> For aarch64/arm it makes sense to restore the original behavior and don't
>> >> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
>> >> unlikely used by anyone.
>> >>
>> >> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>> >>   no GOT-generating relocation in the first place.
>> >> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>> >> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>> >>   GOT-generating relocation when accessing an default visibility
>> >>   external symbol which avoids copy relocation.
>> >>
>> >> Fangrui Song (3):
>> >>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
>> >>     non-DL_EXTERN_PROTECTED_DATA ports
>> >>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
>> >>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
>> >>
>> >>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
>> >>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
>> >>  sysdeps/aarch64/dl-sysdep.h  |  2 --
>> >>  sysdeps/arm/dl-machine.h     | 10 +++-----
>> >>  sysdeps/arm/dl-sysdep.h      |  2 --
>> >>  5 files changed, 24 insertions(+), 49 deletions(-)
>> >>
>> >> --
>> >> 2.36.0.464.gb9c8b46e94-goog
>> >>
>> >
>> >Given that protected symbols never work properly with copy relocation,
>> >can we change protected symbol handling to simply warn copy relocation
>> >against protected data symbol and non-zero symbol values of undefined
>> >symbols against protected function symbols?
>>
>> "protected symbols never work properly with copy relocation" generally
>> applies to all non-x86 architectures (arm/aarch64 has some guarantee,
>> but not reliable). My goal is indeed to remove relevant code for all
>> non-x86 architectures. They should not even bother testing
>> STV_PROTECTED.
>
>Even on x86, protected symbols don't work as intended.   They don't
>provide any performance benefits.  I think we should move away from
>copy relocation on protected symbols.  We can either issue a warning
>or an error.
>
>> On x86 there is some guarantee. The gcc>=5 -fpie
>> HAVE_LD_PIE_COPYRELOC behavior makes the problem trigger in some cases.
>> Having a warning in x86 specific files (e.g.
>> sysdeps/x86_64/dl-machine.h R_X86_64_COPY) seems appropriate.

How about adding a warning for R_X86_64_COPY like the following, then removing
all code paths of (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA!=0)?

 From 3f905ff283178396840861a965acf35bd21cf5f0 Mon Sep 17 00:00:00 2001
From: Fangrui Song <maskray@google.com>
Date: Wed, 11 May 2022 21:54:02 -0700
Subject: [PATCH] x86: Warn for copy relocations on a protected symbol

---
  sysdeps/x86_64/dl-machine.h | 6 ++++++
  1 file changed, 6 insertions(+)

diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index c70af7ab1e..48f08ecc0b 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -474,6 +474,12 @@ and creates an unsatisfiable circular dependency.\n",
             break;
           memcpy (reloc_addr_arg, (void *) value,
                   MIN (sym->st_size, refsym->st_size));
+         if (ELFW(ST_VISIBILITY) (sym->st_other) == STV_PROTECTED)
+           {
+             fmt = "\
+%s: Protected symbol `%s' is incompatible with copy relocations\n";
+             goto print_err;
+           }
           if (__glibc_unlikely (sym->st_size > refsym->st_size)
               || (__glibc_unlikely (sym->st_size < refsym->st_size)
                   && GLRO(dl_verbose)))
  
H.J. Lu May 13, 2022, 10:18 p.m. UTC | #5
On Wed, May 11, 2022 at 9:56 PM Fangrui Song <maskray@google.com> wrote:
>
> On 2022-05-10, H.J. Lu wrote:
> >On Tue, May 10, 2022 at 12:42 AM Fangrui Song <maskray@google.com> wrote:
> >>
> >> On 2022-05-09, H.J. Lu wrote:
> >> >On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
> >> ><libc-alpha@sourceware.org> wrote:
> >> >>
> >> >> Say both a.so and b.so define protected var and the executable copy
> >> >> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
> >> >> semantics: a.so accesses the copy in the executable while b.so accesses
> >> >> its own. This behavior requires that (a) the compiler emits
> >> >> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
> >> >> RELATIVE.
> >> >>
> >> >> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
> >> >> will bind to the executable (normal behavior).
> >> >>
> >> >> For aarch64/arm it makes sense to restore the original behavior and don't
> >> >> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
> >> >> unlikely used by anyone.
> >> >>
> >> >> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
> >> >>   no GOT-generating relocation in the first place.
> >> >> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
> >> >> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
> >> >>   GOT-generating relocation when accessing an default visibility
> >> >>   external symbol which avoids copy relocation.
> >> >>
> >> >> Fangrui Song (3):
> >> >>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
> >> >>     non-DL_EXTERN_PROTECTED_DATA ports
> >> >>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
> >> >>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
> >> >>
> >> >>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
> >> >>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
> >> >>  sysdeps/aarch64/dl-sysdep.h  |  2 --
> >> >>  sysdeps/arm/dl-machine.h     | 10 +++-----
> >> >>  sysdeps/arm/dl-sysdep.h      |  2 --
> >> >>  5 files changed, 24 insertions(+), 49 deletions(-)
> >> >>
> >> >> --
> >> >> 2.36.0.464.gb9c8b46e94-goog
> >> >>
> >> >
> >> >Given that protected symbols never work properly with copy relocation,
> >> >can we change protected symbol handling to simply warn copy relocation
> >> >against protected data symbol and non-zero symbol values of undefined
> >> >symbols against protected function symbols?
> >>
> >> "protected symbols never work properly with copy relocation" generally
> >> applies to all non-x86 architectures (arm/aarch64 has some guarantee,
> >> but not reliable). My goal is indeed to remove relevant code for all
> >> non-x86 architectures. They should not even bother testing
> >> STV_PROTECTED.
> >
> >Even on x86, protected symbols don't work as intended.   They don't
> >provide any performance benefits.  I think we should move away from
> >copy relocation on protected symbols.  We can either issue a warning
> >or an error.
> >
> >> On x86 there is some guarantee. The gcc>=5 -fpie
> >> HAVE_LD_PIE_COPYRELOC behavior makes the problem trigger in some cases.
> >> Having a warning in x86 specific files (e.g.
> >> sysdeps/x86_64/dl-machine.h R_X86_64_COPY) seems appropriate.
>
> How about adding a warning for R_X86_64_COPY like the following, then removing
> all code paths of (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA!=0)?
>
>  From 3f905ff283178396840861a965acf35bd21cf5f0 Mon Sep 17 00:00:00 2001
> From: Fangrui Song <maskray@google.com>
> Date: Wed, 11 May 2022 21:54:02 -0700
> Subject: [PATCH] x86: Warn for copy relocations on a protected symbol
>
> ---
>   sysdeps/x86_64/dl-machine.h | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> index c70af7ab1e..48f08ecc0b 100644
> --- a/sysdeps/x86_64/dl-machine.h
> +++ b/sysdeps/x86_64/dl-machine.h
> @@ -474,6 +474,12 @@ and creates an unsatisfiable circular dependency.\n",
>              break;
>            memcpy (reloc_addr_arg, (void *) value,
>                    MIN (sym->st_size, refsym->st_size));
> +         if (ELFW(ST_VISIBILITY) (sym->st_other) == STV_PROTECTED)
> +           {
> +             fmt = "\
> +%s: Protected symbol `%s' is incompatible with copy relocations\n";
> +             goto print_err;
> +           }
>            if (__glibc_unlikely (sym->st_size > refsym->st_size)
>                || (__glibc_unlikely (sym->st_size < refsym->st_size)
>                    && GLRO(dl_verbose)))

Should it be handled in sysdeps/generic/dl-protected.h?

BTW, I'd like to see a migration plan to remove copy relocation and
use protected
visibility in shared libraries.   Qt uses protected/default
visibilities to mark exported
and imported symbols with GCC 12:

https://github.com/qt/qtbase/commit/19b7f854a274812d9c95fc7aaf134a12530c105f

Should glibc provide some EXPORT and IMPORT macros to help this?

--
H.J.
  
Fangrui Song May 13, 2022, 11:09 p.m. UTC | #6
On 2022-05-13, H.J. Lu wrote:
>On Wed, May 11, 2022 at 9:56 PM Fangrui Song <maskray@google.com> wrote:
>>
>> On 2022-05-10, H.J. Lu wrote:
>> >On Tue, May 10, 2022 at 12:42 AM Fangrui Song <maskray@google.com> wrote:
>> >>
>> >> On 2022-05-09, H.J. Lu wrote:
>> >> >On Sat, Apr 30, 2022 at 11:06 PM Fangrui Song via Libc-alpha
>> >> ><libc-alpha@sourceware.org> wrote:
>> >> >>
>> >> >> Say both a.so and b.so define protected var and the executable copy
>> >> >> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
>> >> >> semantics: a.so accesses the copy in the executable while b.so accesses
>> >> >> its own. This behavior requires that (a) the compiler emits
>> >> >> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
>> >> >> RELATIVE.
>> >> >>
>> >> >> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
>> >> >> will bind to the executable (normal behavior).
>> >> >>
>> >> >> For aarch64/arm it makes sense to restore the original behavior and don't
>> >> >> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
>> >> >> unlikely used by anyone.
>> >> >>
>> >> >> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>> >> >>   no GOT-generating relocation in the first place.
>> >> >> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>> >> >> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>> >> >>   GOT-generating relocation when accessing an default visibility
>> >> >>   external symbol which avoids copy relocation.
>> >> >>
>> >> >> Fangrui Song (3):
>> >> >>   elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
>> >> >>     non-DL_EXTERN_PROTECTED_DATA ports
>> >> >>   Revert "[AArch64][BZ #17711] Fix extern protected data handling"
>> >> >>   Revert "[ARM][BZ #17711] Fix extern protected data handling"
>> >> >>
>> >> >>  elf/dl-lookup.c              | 46 ++++++++++++------------------------
>> >> >>  sysdeps/aarch64/dl-machine.h | 13 +++++-----
>> >> >>  sysdeps/aarch64/dl-sysdep.h  |  2 --
>> >> >>  sysdeps/arm/dl-machine.h     | 10 +++-----
>> >> >>  sysdeps/arm/dl-sysdep.h      |  2 --
>> >> >>  5 files changed, 24 insertions(+), 49 deletions(-)
>> >> >>
>> >> >> --
>> >> >> 2.36.0.464.gb9c8b46e94-goog
>> >> >>
>> >> >
>> >> >Given that protected symbols never work properly with copy relocation,
>> >> >can we change protected symbol handling to simply warn copy relocation
>> >> >against protected data symbol and non-zero symbol values of undefined
>> >> >symbols against protected function symbols?
>> >>
>> >> "protected symbols never work properly with copy relocation" generally
>> >> applies to all non-x86 architectures (arm/aarch64 has some guarantee,
>> >> but not reliable). My goal is indeed to remove relevant code for all
>> >> non-x86 architectures. They should not even bother testing
>> >> STV_PROTECTED.
>> >
>> >Even on x86, protected symbols don't work as intended.   They don't
>> >provide any performance benefits.  I think we should move away from
>> >copy relocation on protected symbols.  We can either issue a warning
>> >or an error.
>> >
>> >> On x86 there is some guarantee. The gcc>=5 -fpie
>> >> HAVE_LD_PIE_COPYRELOC behavior makes the problem trigger in some cases.
>> >> Having a warning in x86 specific files (e.g.
>> >> sysdeps/x86_64/dl-machine.h R_X86_64_COPY) seems appropriate.
>>
>> How about adding a warning for R_X86_64_COPY like the following, then removing
>> all code paths of (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA!=0)?
>>
>>  From 3f905ff283178396840861a965acf35bd21cf5f0 Mon Sep 17 00:00:00 2001
>> From: Fangrui Song <maskray@google.com>
>> Date: Wed, 11 May 2022 21:54:02 -0700
>> Subject: [PATCH] x86: Warn for copy relocations on a protected symbol
>>
>> ---
>>   sysdeps/x86_64/dl-machine.h | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
>> index c70af7ab1e..48f08ecc0b 100644
>> --- a/sysdeps/x86_64/dl-machine.h
>> +++ b/sysdeps/x86_64/dl-machine.h
>> @@ -474,6 +474,12 @@ and creates an unsatisfiable circular dependency.\n",
>>              break;
>>            memcpy (reloc_addr_arg, (void *) value,
>>                    MIN (sym->st_size, refsym->st_size));
>> +         if (ELFW(ST_VISIBILITY) (sym->st_other) == STV_PROTECTED)
>> +           {
>> +             fmt = "\
>> +%s: Protected symbol `%s' is incompatible with copy relocations\n";
>> +             goto print_err;
>> +           }
>>            if (__glibc_unlikely (sym->st_size > refsym->st_size)
>>                || (__glibc_unlikely (sym->st_size < refsym->st_size)
>>                    && GLRO(dl_verbose)))
>
>Should it be handled in sysdeps/generic/dl-protected.h?

Yes, look like the suitable place. I think we should report a warning
regardless of GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS, at least for
all non-x86 architectures.

For x86, gating the logic under GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS may be fine
but I think it may be a bit overengineering as well.

>BTW, I'd like to see a migration plan to remove copy relocation and
>use protected
>visibility in shared libraries.   

As mentioned, the elf/dl-lookup.c code is to handle the case when a
protected variable is defined in multiple shared objects and the
executable copy relocates it. This is a pathologic case and does not
have clear semantics.  So IMO we should just remove the
ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code that let the second shared
object access its own copy.

I sent the patch series keeping ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA,
because I did not know whether you want to keep it for x86 :)

Note: the single protected data copy relocated by the executable does
not need any ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code to work. It
simply depends on whether the compiler and the linker cooperate and
produce a GLOB_DAT dynamic relocation.

>Qt uses protected/default
>visibilities to mark exported
>and imported symbols with GCC 12:
>
>https://github.com/qt/qtbase/commit/19b7f854a274812d9c95fc7aaf134a12530c105f

The Qt example belongs to the majority of cases where there is one
single definition for a protected variable.  Removing the dl-lookup.c
STV_PROTECTED code won't affect it.  If Qt ever defines the protected
variable in two shared objects, we can see that it is in a very broken
state.

And the original issue was really related to GCC 5's x86-64 decision to use
HAVE_LD_PIE_COPYRELOC. Other architectures don't have the problem.
In recent years most (if not all) distributions have switched to default
PIE. On all non-x86 architecture supporting shared objects, -fPIE
codegen uses an indirection for an external data symbol.

On the GCC side, we should just fix x86-64 -fpie by removing
HAVE_LD_PIE_COPYRELOC
(https://gcc.gnu.org/pipermail/gcc-patches/2021-November/582987.html).

>Should glibc provide some EXPORT and IMPORT macros to help this?

The -fno-pic codegen incompatibility with -Bsymbolic/protected/some
--dynamic-list has been there since like forever for architectures with
copy relocations. It doesn't help for glibc to tell GCC what to do.

Programs specifying -fno-pic likely rely on the existing behavior.
Changing GCC -fno-pic to default indirect access may be fine. That will
just let them opt out as they currently do for some security hardening
options pushed by Debian/Gentoo/RedHat/etc (e.g. -fno-stack-protector)