[v7] Check global symbols in static library against shared library

Message ID 20240524233515.1400088-1-hjl.tools@gmail.com
State Under Review
Delegated to: Florian Weimer
Headers
Series [v7] Check global symbols in static library against shared library |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit success Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Testing passed

Commit Message

H.J. Lu May 24, 2024, 11:35 p.m. UTC
  Here is the v7 patch to check global symbols in static library against
shared library.  The test results may not be clean due to:

https://sourceware.org/bugzilla/show_bug.cgi?id=31776

H.J.
--
Changes in v7:

1. Remove --just-symbols since it was added only in binutils 2.37.
2. Fold archive-function.awk into check-static-abi.sh.

Changes in v6:

1. Drop __nldbl__IO_vfscanf libc-shared-only-symbols.

Changes in v5:

1. Move libc-shared-only-symbols referenced from libnldbl_nonshared.a to
sysdeps/ieee754/ldbl-opt/Makefile.

Changes in v4:

1. Use a per library shared-only-symbols.
2. Report extra functions with filenames in an archive.

Changes in v3:

1. Update the shared-only-symbols order in sysdeps/sparc/sparc32/Makefile.

Changes in v2:

1. Add export LC_ALL for nm.
2. Pass $* to check-static-abi.sh.
3. Don't pass $(static-only-symbols) to check-static-abi.sh.
4. Get installed static libraries from $(extra-libs).
5. Add shared-only-symbols for powerpc and sparc32.

---
The shared library ABIs are checked against the abilist files.  Extract
global symbol definitions from static libraries and check them against
exported global symbol definitions in shared libraries.  Special cases
are handled:

1. Some global symbols are available only in libc.so.
2. Some math functions are in libc.a, but not in libc.so.
3. The libgcc symbols are in libc.so only on some targets.
4. Internal symbols have global visibility only in the static library.

This check is enabled only if nm displays symbol versions for dynamic
symbols.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
 Makerules                      |  71 ++++++++++++++++++
 configure                      |  42 +++++++++++
 configure.ac                   |  31 ++++++++
 elf/Makefile                   |  15 ++++
 math/Makefile                  |   4 +
 scripts/check-static-abi.sh    | 133 +++++++++++++++++++++++++++++++++
 sysdeps/generic/Makefile       |  11 +++
 sysdeps/sparc/sparc32/Makefile |  12 +++
 8 files changed, 319 insertions(+)
 create mode 100644 scripts/check-static-abi.sh
  

Comments

H.J. Lu May 27, 2024, 2:04 p.m. UTC | #1
Hi Peter, Joseph,

Any comments on this patch?

Thanks.

On Fri, May 24, 2024 at 4:35 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Here is the v7 patch to check global symbols in static library against
> shared library.  The test results may not be clean due to:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=31776
>
> H.J.
> --
> Changes in v7:
>
> 1. Remove --just-symbols since it was added only in binutils 2.37.
> 2. Fold archive-function.awk into check-static-abi.sh.
>
> Changes in v6:
>
> 1. Drop __nldbl__IO_vfscanf libc-shared-only-symbols.
>
> Changes in v5:
>
> 1. Move libc-shared-only-symbols referenced from libnldbl_nonshared.a to
> sysdeps/ieee754/ldbl-opt/Makefile.
>
> Changes in v4:
>
> 1. Use a per library shared-only-symbols.
> 2. Report extra functions with filenames in an archive.
>
> Changes in v3:
>
> 1. Update the shared-only-symbols order in sysdeps/sparc/sparc32/Makefile.
>
> Changes in v2:
>
> 1. Add export LC_ALL for nm.
> 2. Pass $* to check-static-abi.sh.
> 3. Don't pass $(static-only-symbols) to check-static-abi.sh.
> 4. Get installed static libraries from $(extra-libs).
> 5. Add shared-only-symbols for powerpc and sparc32.
>
> ---
> The shared library ABIs are checked against the abilist files.  Extract
> global symbol definitions from static libraries and check them against
> exported global symbol definitions in shared libraries.  Special cases
> are handled:
>
> 1. Some global symbols are available only in libc.so.
> 2. Some math functions are in libc.a, but not in libc.so.
> 3. The libgcc symbols are in libc.so only on some targets.
> 4. Internal symbols have global visibility only in the static library.
>
> This check is enabled only if nm displays symbol versions for dynamic
> symbols.
>
> Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
> ---
>  Makerules                      |  71 ++++++++++++++++++
>  configure                      |  42 +++++++++++
>  configure.ac                   |  31 ++++++++
>  elf/Makefile                   |  15 ++++
>  math/Makefile                  |   4 +
>  scripts/check-static-abi.sh    | 133 +++++++++++++++++++++++++++++++++
>  sysdeps/generic/Makefile       |  11 +++
>  sysdeps/sparc/sparc32/Makefile |  12 +++
>  8 files changed, 319 insertions(+)
>  create mode 100644 scripts/check-static-abi.sh
>
> diff --git a/Makerules b/Makerules
> index 275110dda8..84046dbffc 100644
> --- a/Makerules
> +++ b/Makerules
> @@ -1238,6 +1238,23 @@ ifeq ($(build-shared),yes)
>         LC_ALL=C $(OBJDUMP) --dynamic-syms $< > $@T
>         mv -f $@T $@
>
> +# Extract global symbol definitions from shared and static libraries.
> +NM-FLAGS=--extern-only --defined-only
> +
> +%.dynamicsymlist: %.so
> +       LC_ALL=C; export LC_ALL; $(NM) --dynamic $(NM-FLAGS) $< \
> +         | $(AWK) 'NF == 3 { print $$3 }' \
> +         | grep -v "^_.*GLIBC_PRIVATE" | grep @@ | sed -e "s/@@.*//" \
> +         | sort > $@T
> +       mv -f $@T $@
> +
> +%.staticsymlist: %.a
> +       LC_ALL=C; export LC_ALL; $(NM) $(NM-FLAGS) $< \
> +         | $(AWK) 'NF == 3 { print $$3 }' \
> +         | grep -v DW.ref.__gcc_personality_v0 \
> +         | sort > $@T
> +       mv -f $@T $@
> +
>  vpath %.abilist $(+sysdep_dirs)
>
>  # The .PRECIOUS rule prevents the files built by an implicit rule whose
> @@ -1246,6 +1263,16 @@ vpath %.abilist $(+sysdep_dirs)
>  # 'make clean', which is handled by the 'generated' variable.
>  .PRECIOUS: %.symlist
>  generated += $(extra-libs:=.symlist)
> +ifeq (yes,$(nm-has-symbol-version))
> +.PRECIOUS: %.dynamicsymlist %.staticsymlist
> +# Only check global symbols in static libraries in $(extra-libs).
> +static-libs := $(foreach lib, $(extra-libs), \
> +                $(if $($(lib)-inhibit-o),,$(lib)))
> +generated += \
> +  $(static-libs:=.dynamicsymlist) \
> +  $(static-libs:=.staticsymlist) \
> +# generated
> +endif
>
>  $(objpfx)check-abi-%.out: $(common-objpfx)config.make %.abilist \
>                           $(objpfx)%.symlist
> @@ -1259,6 +1286,24 @@ define check-abi
>         diff -p -U 0 $(filter %.abilist,$^) $(filter %.symlist,$^) > $@
>  endef
>
> +$(objpfx)check-static-abi-%.out: $(common-objpfx)config.make \
> +                                $(objpfx)%.dynamicsymlist \
> +                                $(objpfx)%.staticsymlist
> +       $(check-static-abi); \
> +       $(evaluate-test)
> +$(objpfx)check-static-abi-%.out: $(common-objpfx)config.make \
> +                                $(common-objpfx)%.dynamicsymlist \
> +                                $(common-objpfx)%.staticsymlist
> +       $(check-static-abi); \
> +       $(evaluate-test)
> +define check-static-abi
> +       $(SHELL) $(..)scripts/check-static-abi.sh $* \
> +         $(subst elf/libc.a,libc.a,$(objpfx)$*.a) $(NM) $(AWK) \
> +         $(filter %.dynamicsymlist,$^) \
> +         $(filter %.staticsymlist,$^) \
> +         "$($*-shared-only-symbols)" > $@
> +endef
> +
>  update-abi-%: $(objpfx)%.symlist %.abilist
>         $(update-abi)
>  update-abi-%: $(common-objpfx)%.symlist %.abilist
> @@ -1290,6 +1335,11 @@ update-all-abi: $(patsubst %.so,update-all-abi-%,$(install-lib.so-versioned))
>  check-abi-list = $(patsubst %.so,$(objpfx)check-abi-%.out, \
>                                  $(install-lib.so-versioned))
>  check-abi: $(check-abi-list)
> +ifeq (yes,$(nm-has-symbol-version))
> +check-static-abi-list = $(patsubst %.a,$(objpfx)check-static-abi-%.out, \
> +                                      $(static-libs:=.a))
> +check-abi: $(check-static-abi-list)
> +endif
>  ifdef subdir
>  subdir_check-abi: check-abi
>  subdir_update-abi: update-abi
> @@ -1298,6 +1348,10 @@ else
>  check-abi: subdir_check-abi
>         if grep -q '^FAIL:' $(objpfx)*/check-abi*.test-result; then \
>                 cat $(objpfx)*/check-abi*.out && exit 1; fi
> +ifeq (yes,$(nm-has-symbol-version))
> +       if grep -q '^FAIL:' $(objpfx)*/check-static-abi*.test-result; \
> +       then cat $(objpfx)*/check-static-abi*.out && exit 1; fi
> +endif
>  update-abi: subdir_update-abi
>  update-all-abi: subdir_update-all-abi
>  endif
> @@ -1308,11 +1362,28 @@ tests-special += $(objpfx)check-abi-libc.out
>  update-abi: update-abi-libc
>  update-all-abi: update-all-abi-libc
>  common-generated += libc.symlist
> +ifeq (yes,$(nm-has-symbol-version))
> +check-abi: $(objpfx)check-static-abi-libc.out
> +tests-special += $(objpfx)check-static-abi-libc.out
> +common-generated += \
> +  libc.dynamicsymlist \
> +  libc.staticsymlist \
> +  libc_nonshared.staticsymlist \
> +  # common-generated
> +# check-static-abi.sh needs libc_nonshared.staticsymlist and
> +# libm.dynamicsymlist to check the global symbol list in libc.a.
> +$(objpfx)check-static-abi-libc.out: \
> +  $(common-objpfx)libc_nonshared.staticsymlist \
> +  $(common-objpfx)math/libm.dynamicsymlist
> +endif
>  endif
>
>  ifeq ($(build-shared),yes)
>  ifdef subdir
>  tests-special += $(check-abi-list)
> +ifeq (yes,$(nm-has-symbol-version))
> +tests-special += $(check-static-abi-list)
> +endif
>  endif
>  endif
>
> diff --git a/configure b/configure
> index 432e40a592..11ac7c094a 100755
> --- a/configure
> +++ b/configure
> @@ -7566,6 +7566,48 @@ if test "$libc_cv_symver_needs_alias" = yes; then
>
>  fi
>
> +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking whether nm displays symbol versions for dynamic symbols" >&5
> +printf %s "checking whether nm displays symbol versions for dynamic symbols... " >&6; }
> +if test ${libc_cv_nm_has_symbol_version+y}
> +then :
> +  printf %s "(cached) " >&6
> +else $as_nop
> +    cat > conftest.s <<EOF
> +       .text
> +       .globl  new_foo
> +       .type   new_foo, %function
> +new_foo:
> +       .byte 0
> +       .symver new_foo,foo@@VERS_2.0
> +EOF
> +  cat > conftest.t <<EOF
> +VERS_2.0 {
> +global:
> +  foo;
> +local:
> +  *;
> +};
> +EOF
> +  libc_cv_nm_has_symbol_version=no
> +  if { ac_try='${CC-cc} -nostdlib -nostartfiles -fPIC -shared conftest.s -Wl,-version-script=conftest.t -o conftest.so'
> +  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
> +  (eval $ac_try) 2>&5
> +  ac_status=$?
> +  printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
> +  test $ac_status = 0; }; }; then
> +    if $NM --dynamic --extern-only --defined-only conftest.so 2>&1 \
> +       | grep -q foo@@VERS_2.0; then
> +      libc_cv_nm_has_symbol_version=yes
> +    fi
> +  fi
> +  rm conftest.*
> +
> +fi
> +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_nm_has_symbol_version" >&5
> +printf "%s\n" "$libc_cv_nm_has_symbol_version" >&6; }
> +config_vars="$config_vars
> +nm-has-symbol-version = $libc_cv_nm_has_symbol_version"
> +
>  { printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for __builtin_trap with no external dependencies" >&5
>  printf %s "checking for __builtin_trap with no external dependencies... " >&6; }
>  if test ${libc_cv_builtin_trap+y}
> diff --git a/configure.ac b/configure.ac
> index bdc385d03c..e0ad986ae0 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1554,6 +1554,37 @@ if test "$libc_cv_symver_needs_alias" = yes; then
>    AC_DEFINE(SYMVER_NEEDS_ALIAS)
>  fi
>
> +dnl Starting with binutils 2.35, NM displays symbol versions for dynamic
> +dnl symbols (PR 25708).
> +AC_CACHE_CHECK(whether nm displays symbol versions for dynamic symbols,
> +               libc_cv_nm_has_symbol_version, [dnl
> +  cat > conftest.s <<EOF
> +       .text
> +       .globl  new_foo
> +       .type   new_foo, %function
> +new_foo:
> +       .byte 0
> +       .symver new_foo,foo@@VERS_2.0
> +EOF
> +  cat > conftest.t <<EOF
> +VERS_2.0 {
> +global:
> +  foo;
> +local:
> +  *;
> +};
> +EOF
> +  libc_cv_nm_has_symbol_version=no
> +  if AC_TRY_COMMAND([${CC-cc} -nostdlib -nostartfiles -fPIC -shared conftest.s -Wl,-version-script=conftest.t -o conftest.so]); then
> +    if $NM --dynamic --extern-only --defined-only conftest.so 2>&1 \
> +       | grep -q foo@@VERS_2.0; then
> +      libc_cv_nm_has_symbol_version=yes
> +    fi
> +  fi
> +  rm conftest.*
> +])
> +LIBC_CONFIG_VAR([nm-has-symbol-version], [$libc_cv_nm_has_symbol_version])
> +
>  AC_CACHE_CHECK(for __builtin_trap with no external dependencies,
>                libc_cv_builtin_trap, [dnl
>  libc_cv_builtin_trap=no
> diff --git a/elf/Makefile b/elf/Makefile
> index 280e777c19..eb4357a0dc 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -182,6 +182,21 @@ CFLAGS-rtld.os += $(rtld-early-cflags)
>  ifeq ($(unwind-find-fde),yes)
>  routines += unwind-dw2-fde-glibc
>  shared-only-routines += unwind-dw2-fde-glibc
> +
> +# These global symbols are only in libc.so.
> +libc-shared-only-symbols += \
> +  _Unwind_Find_FDE \
> +  __deregister_frame \
> +  __deregister_frame_info \
> +  __deregister_frame_info_bases \
> +  __register_frame \
> +  __register_frame_info \
> +  __register_frame_info_bases \
> +  __register_frame_info_bases \
> +  __register_frame_info_table \
> +  __register_frame_info_table_bases \
> +  __register_frame_table \
> +# libc-shared-only-symbols
>  endif
>
>  before-compile  += $(objpfx)trusted-dirs.h
> diff --git a/math/Makefile b/math/Makefile
> index 58e5c070cf..ae268e0088 100644
> --- a/math/Makefile
> +++ b/math/Makefile
> @@ -1674,3 +1674,7 @@ object-suffixes-left := $(libmvec-tests)
>  include $(o-iterator)
>
>  $(objpfx)test-fenv-tls: $(shared-thread-library)
> +
> +# check-static-abi.sh needs libc.staticsymlist to check the global symbol
> +# list in libm.a.
> +$(objpfx)check-static-abi-libm.out: $(common-objpfx)libc.staticsymlist
> diff --git a/scripts/check-static-abi.sh b/scripts/check-static-abi.sh
> new file mode 100644
> index 0000000000..72a198d481
> --- /dev/null
> +++ b/scripts/check-static-abi.sh
> @@ -0,0 +1,133 @@
> +#!/bin/sh
> +# Check global symbols in static library against global symbols in shared
> +# library.
> +# Copyright (C) 2024 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +export LC_ALL=C
> +
> +lib="$1"
> +shift
> +archive="$1"
> +shift
> +nm="$1"
> +shift
> +awk="$1"
> +shift
> +dynamicsymlist="$1"
> +shift
> +if test $lib = libc; then
> +  libm_dynamicsymlist="$1"
> +  shift
> +fi
> +staticsymlist="$1"
> +shift
> +if test $lib = libc; then
> +  libc_nonshared_staticsymlist="$1"
> +  shift
> +elif test $lib = libm; then
> +  libc_staticsymlist="$1"
> +  shift
> +fi
> +shared_only_functions="$1"
> +
> +staticdiff="$(mktemp)"
> +staticlist="$(mktemp)"
> +
> +cleanup () {
> +  rm -f -- "$staticdiff" "$staticlist"
> +}
> +
> +trap cleanup 0
> +
> +status=0
> +
> +diff -up $dynamicsymlist $staticsymlist > $staticdiff
> +
> +# Check missing global symbols, including symbols with a leading
> +# underscore, in static library.
> +shared_only="$(grep '^-[^-]' $staticdiff | sed -e 's/^-//')"
> +if test -n "$shared_only"; then
> +  # Exclude shared only functions.
> +  for f in $shared_only; do
> +    case " $shared_only_functions " in
> +    *" $f "*)
> +      ;;
> +    *)
> +      # Skip the missing function, $f, in libm.a if $f is in libc.a.
> +      if test $lib = libm \
> +        && grep -q "^$f$" $libc_staticsymlist; then
> +       continue
> +      fi
> +      missing="$missing $f"
> +      ;;
> +    esac
> +  done
> +  if test -n "$missing"; then
> +    status=1
> +    echo Missing functions in ${lib}.a:
> +    echo "$missing"
> +  fi
> +fi
> +
> +# Check extra global symbols, excluding symbols with a leading underscore
> +# which can be internal symbols, in static library.
> +static_only="$(grep '^+[^_^\+]' $staticdiff | sed -e 's/^+//')"
> +if test -n "$static_only"; then
> +  # Exclude static only functions.
> +  for f in $static_only; do
> +    # Skip the extra function, $f, in libc.a if $f is in libc_nonshared.a
> +    # or libm.so.
> +    if test $lib = libc; then
> +      if grep -q "^$f$" $libc_nonshared_staticsymlist \
> +        || grep -q "^$f$" $libm_dynamicsymlist; then
> +       continue
> +      fi
> +    fi
> +    extra="$extra $f"
> +  done
> +  if test -n "$extra"; then
> +    status=1
> +    echo Extra functions in ${lib}.a:
> +    $nm --extern-only --defined-only $archive > $staticlist 2>&1
> +    $awk -v functions="$extra" \
> +      ' \
> +NF == 1 { \
> +  thisfile = $1; \
> +  filenames[$1] = 1; \
> +  next; \
> +} \
> +NF == 3 { \
> +  allsymbols[thisfile, $3] = 1; \
> +  next; \
> +} \
> +END { \
> +  split (functions, symbols, " "); \
> +  for (s in symbols) { \
> +    for (f in filenames) { \
> +      if (allsymbols[f, symbols[s]] == 1) { \
> +        print f, symbols[s];
> +      } \
> +    } \
> +  } \
> +} \
> +      ' \
> +      $staticlist
> +  fi
> +fi
> +
> +exit $status
> diff --git a/sysdeps/generic/Makefile b/sysdeps/generic/Makefile
> index 72522d1f0e..f24c517bbd 100644
> --- a/sysdeps/generic/Makefile
> +++ b/sysdeps/generic/Makefile
> @@ -25,7 +25,18 @@ ifeq (yes:yes,$(build-shared):$(unwind-find-fde))
>  # This is needed to support g++ v2 and v3.
>  sysdep_routines += framestate unwind-pe
>  shared-only-routines += framestate unwind-pe
> +
> +# These global symbols are only in libc.so.
> +libc-shared-only-symbols += __frame_state_for
>  endif
> +
> +# These global symbols are only in libc.so.
> +libc-shared-only-symbols += \
> +  __xmknod \
> +  __xmknodat \
> +  _dl_mcount_wrapper_check \
> +  re_max_failures \
> +# libc-shared-only-symbols
>  endif
>
>  ifeq ($(subdir),malloc)
> diff --git a/sysdeps/sparc/sparc32/Makefile b/sysdeps/sparc/sparc32/Makefile
> index 4ba8794aa0..154af1b8e1 100644
> --- a/sysdeps/sparc/sparc32/Makefile
> +++ b/sysdeps/sparc/sparc32/Makefile
> @@ -15,6 +15,18 @@
>  # License along with the GNU C Library; if not, see
>  # <https://www.gnu.org/licenses/>.
>
> +ifeq ($(subdir),elf)
> +# These global symbols are only in libc.so.  Filter out the
> +# $(shared-only-routines) routines.
> +libc-shared-only-symbols += \
> +  .div \
> +  .rem \
> +  .udiv \
> +  .umul \
> +  .urem \
> +  # libc-shared-only-symbols
> +endif
> +
>  ifeq ($(subdir),gnulib)
>  sysdep_routines = dotmul umul $(divrem) alloca
>  shared-only-routines += umul $(divrem)
> --
> 2.45.1
>
  
Joseph Myers May 28, 2024, 8:10 p.m. UTC | #2
On Mon, 27 May 2024, H.J. Lu wrote:

> Hi Peter, Joseph,
> 
> Any comments on this patch?

I agree with the previous concerns about using nm here (in a way that 
requires a new configure check!) instead of aligning the logic with 
existing code checking symbol versions and defined symbols.

The ABI tests use objdump --dynamic-syms to generate .dynsym files, and 
then scripts/abilist.awk generates .symlist files from those.  I'd hope 
that the .dynsym files contain all the information needed for this check.  
Now, duplicating the logic from scripts/abilist.awk wouldn't be desirable 
either, and I understand that you can't use the .symlist files themselves 
because you want to exclude compat symbols in this check.  So that 
suggests two possibilities:

* Add a new option to abilist.awk (it already has an option 
include_private=1, see nptl_db/Makefile) to exclude compat symbols.

* Replace abilist.awk by Python library code with such an option, and then 
use scripts that call that library code with different options in the 
existing uses of abilist.awk and for the new test.

As for lists of static symbols, we already have Python code in 
linknamespace.py that works with readelf -W -s output.  So I'd suggest 
factoring that out into Python library code that can be used in whatever 
tests need such information.

In both cases, ideally we might directly parse ELF in glibcelf.py in the 
long term, but at least sharing code parsing output of objdump / readelf 
that we already parse for similar uses would be better than adding yet 
another parser that might do things slightly differently and that works 
with output from a different tool.
  
H.J. Lu May 29, 2024, 10:48 p.m. UTC | #3
On Tue, May 28, 2024 at 1:10 PM Joseph Myers <josmyers@redhat.com> wrote:
>
> On Mon, 27 May 2024, H.J. Lu wrote:
>
> > Hi Peter, Joseph,
> >
> > Any comments on this patch?
>
> I agree with the previous concerns about using nm here (in a way that
> requires a new configure check!) instead of aligning the logic with
> existing code checking symbol versions and defined symbols.

The new NM feature provides what is needed so that no additional
codes are needed.

> The ABI tests use objdump --dynamic-syms to generate .dynsym files, and
> then scripts/abilist.awk generates .symlist files from those.  I'd hope
> that the .dynsym files contain all the information needed for this check.
> Now, duplicating the logic from scripts/abilist.awk wouldn't be desirable
> either, and I understand that you can't use the .symlist files themselves
> because you want to exclude compat symbols in this check.  So that
> suggests two possibilities:
>
> * Add a new option to abilist.awk (it already has an option
> include_private=1, see nptl_db/Makefile) to exclude compat symbols.
>
> * Replace abilist.awk by Python library code with such an option, and then
> use scripts that call that library code with different options in the
> existing uses of abilist.awk and for the new test.
>
> As for lists of static symbols, we already have Python code in
> linknamespace.py that works with readelf -W -s output.  So I'd suggest
> factoring that out into Python library code that can be used in whatever
> tests need such information.
>
> In both cases, ideally we might directly parse ELF in glibcelf.py in the
> long term, but at least sharing code parsing output of objdump / readelf
> that we already parse for similar uses would be better than adding yet
> another parser that might do things slightly differently and that works
> with output from a different tool.
>

My patch works and doesn't require any new tools.  It already discovered
some libc.a ABI issues.   Someone can replace it with something better in
the future.
  
Joseph Myers May 30, 2024, 5:11 p.m. UTC | #4
On Wed, 29 May 2024, H.J. Lu wrote:

> On Tue, May 28, 2024 at 1:10 PM Joseph Myers <josmyers@redhat.com> wrote:
> >
> > On Mon, 27 May 2024, H.J. Lu wrote:
> >
> > > Hi Peter, Joseph,
> > >
> > > Any comments on this patch?
> >
> > I agree with the previous concerns about using nm here (in a way that
> > requires a new configure check!) instead of aligning the logic with
> > existing code checking symbol versions and defined symbols.
> 
> The new NM feature provides what is needed so that no additional
> codes are needed.

It needs a new configure check.  And I'm thinking of "additional codes" 
here primarily in terms of how many different ways are used for extracting 
symbols from objects - meaning reusing existing code (appropriately 
refactored as needed to allow multiple places to share it) is strongly 
preferred to using a different tool in one place.
  
H.J. Lu May 31, 2024, 1:11 a.m. UTC | #5
On Thu, May 30, 2024 at 10:12 AM Joseph Myers <josmyers@redhat.com> wrote:
>
> On Wed, 29 May 2024, H.J. Lu wrote:
>
> > On Tue, May 28, 2024 at 1:10 PM Joseph Myers <josmyers@redhat.com>
wrote:
> > >
> > > On Mon, 27 May 2024, H.J. Lu wrote:
> > >
> > > > Hi Peter, Joseph,
> > > >
> > > > Any comments on this patch?
> > >
> > > I agree with the previous concerns about using nm here (in a way that
> > > requires a new configure check!) instead of aligning the logic with
> > > existing code checking symbol versions and defined symbols.
> >
> > The new NM feature provides what is needed so that no additional
> > codes are needed.
>
> It needs a new configure check.  And I'm thinking of "additional codes"

My changes are simple. Write the new codes just implement the same features
which have been implemented in binutils.

> here primarily in terms of how many different ways are used for extracting
> symbols from objects - meaning reusing existing code (appropriately
> refactored as needed to allow multiple places to share it) is strongly
> preferred to using a different tool in one place.

This sounds an interesting project for someone who wants to get
Involved with the project.

H.J.
  

Patch

diff --git a/Makerules b/Makerules
index 275110dda8..84046dbffc 100644
--- a/Makerules
+++ b/Makerules
@@ -1238,6 +1238,23 @@  ifeq ($(build-shared),yes)
 	LC_ALL=C $(OBJDUMP) --dynamic-syms $< > $@T
 	mv -f $@T $@
 
+# Extract global symbol definitions from shared and static libraries.
+NM-FLAGS=--extern-only --defined-only
+
+%.dynamicsymlist: %.so
+	LC_ALL=C; export LC_ALL; $(NM) --dynamic $(NM-FLAGS) $< \
+	  | $(AWK) 'NF == 3 { print $$3 }' \
+	  | grep -v "^_.*GLIBC_PRIVATE" | grep @@ | sed -e "s/@@.*//" \
+	  | sort > $@T
+	mv -f $@T $@
+
+%.staticsymlist: %.a
+	LC_ALL=C; export LC_ALL; $(NM) $(NM-FLAGS) $< \
+	  | $(AWK) 'NF == 3 { print $$3 }' \
+	  | grep -v DW.ref.__gcc_personality_v0 \
+	  | sort > $@T
+	mv -f $@T $@
+
 vpath %.abilist $(+sysdep_dirs)
 
 # The .PRECIOUS rule prevents the files built by an implicit rule whose
@@ -1246,6 +1263,16 @@  vpath %.abilist $(+sysdep_dirs)
 # 'make clean', which is handled by the 'generated' variable.
 .PRECIOUS: %.symlist
 generated += $(extra-libs:=.symlist)
+ifeq (yes,$(nm-has-symbol-version))
+.PRECIOUS: %.dynamicsymlist %.staticsymlist
+# Only check global symbols in static libraries in $(extra-libs).
+static-libs := $(foreach lib, $(extra-libs), \
+		 $(if $($(lib)-inhibit-o),,$(lib)))
+generated += \
+  $(static-libs:=.dynamicsymlist) \
+  $(static-libs:=.staticsymlist) \
+# generated
+endif
 
 $(objpfx)check-abi-%.out: $(common-objpfx)config.make %.abilist \
 			  $(objpfx)%.symlist
@@ -1259,6 +1286,24 @@  define check-abi
 	diff -p -U 0 $(filter %.abilist,$^) $(filter %.symlist,$^) > $@
 endef
 
+$(objpfx)check-static-abi-%.out: $(common-objpfx)config.make \
+				 $(objpfx)%.dynamicsymlist \
+				 $(objpfx)%.staticsymlist
+	$(check-static-abi); \
+	$(evaluate-test)
+$(objpfx)check-static-abi-%.out: $(common-objpfx)config.make \
+				 $(common-objpfx)%.dynamicsymlist \
+				 $(common-objpfx)%.staticsymlist
+	$(check-static-abi); \
+	$(evaluate-test)
+define check-static-abi
+	$(SHELL) $(..)scripts/check-static-abi.sh $* \
+	  $(subst elf/libc.a,libc.a,$(objpfx)$*.a) $(NM) $(AWK) \
+	  $(filter %.dynamicsymlist,$^) \
+	  $(filter %.staticsymlist,$^) \
+	  "$($*-shared-only-symbols)" > $@
+endef
+
 update-abi-%: $(objpfx)%.symlist %.abilist
 	$(update-abi)
 update-abi-%: $(common-objpfx)%.symlist %.abilist
@@ -1290,6 +1335,11 @@  update-all-abi: $(patsubst %.so,update-all-abi-%,$(install-lib.so-versioned))
 check-abi-list = $(patsubst %.so,$(objpfx)check-abi-%.out, \
 				 $(install-lib.so-versioned))
 check-abi: $(check-abi-list)
+ifeq (yes,$(nm-has-symbol-version))
+check-static-abi-list = $(patsubst %.a,$(objpfx)check-static-abi-%.out, \
+				       $(static-libs:=.a))
+check-abi: $(check-static-abi-list)
+endif
 ifdef subdir
 subdir_check-abi: check-abi
 subdir_update-abi: update-abi
@@ -1298,6 +1348,10 @@  else
 check-abi: subdir_check-abi
 	if grep -q '^FAIL:' $(objpfx)*/check-abi*.test-result; then \
 		cat $(objpfx)*/check-abi*.out && exit 1; fi
+ifeq (yes,$(nm-has-symbol-version))
+	if grep -q '^FAIL:' $(objpfx)*/check-static-abi*.test-result; \
+	then cat $(objpfx)*/check-static-abi*.out && exit 1; fi
+endif
 update-abi: subdir_update-abi
 update-all-abi: subdir_update-all-abi
 endif
@@ -1308,11 +1362,28 @@  tests-special += $(objpfx)check-abi-libc.out
 update-abi: update-abi-libc
 update-all-abi: update-all-abi-libc
 common-generated += libc.symlist
+ifeq (yes,$(nm-has-symbol-version))
+check-abi: $(objpfx)check-static-abi-libc.out
+tests-special += $(objpfx)check-static-abi-libc.out
+common-generated += \
+  libc.dynamicsymlist \
+  libc.staticsymlist \
+  libc_nonshared.staticsymlist \
+  # common-generated
+# check-static-abi.sh needs libc_nonshared.staticsymlist and
+# libm.dynamicsymlist to check the global symbol list in libc.a.
+$(objpfx)check-static-abi-libc.out: \
+  $(common-objpfx)libc_nonshared.staticsymlist \
+  $(common-objpfx)math/libm.dynamicsymlist
+endif
 endif
 
 ifeq ($(build-shared),yes)
 ifdef subdir
 tests-special += $(check-abi-list)
+ifeq (yes,$(nm-has-symbol-version))
+tests-special += $(check-static-abi-list)
+endif
 endif
 endif
 
diff --git a/configure b/configure
index 432e40a592..11ac7c094a 100755
--- a/configure
+++ b/configure
@@ -7566,6 +7566,48 @@  if test "$libc_cv_symver_needs_alias" = yes; then
 
 fi
 
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking whether nm displays symbol versions for dynamic symbols" >&5
+printf %s "checking whether nm displays symbol versions for dynamic symbols... " >&6; }
+if test ${libc_cv_nm_has_symbol_version+y}
+then :
+  printf %s "(cached) " >&6
+else $as_nop
+    cat > conftest.s <<EOF
+	.text
+	.globl	new_foo
+	.type	new_foo, %function
+new_foo:
+	.byte 0
+	.symver new_foo,foo@@VERS_2.0
+EOF
+  cat > conftest.t <<EOF
+VERS_2.0 {
+global:
+  foo;
+local:
+  *;
+};
+EOF
+  libc_cv_nm_has_symbol_version=no
+  if { ac_try='${CC-cc} -nostdlib -nostartfiles -fPIC -shared conftest.s -Wl,-version-script=conftest.t -o conftest.so'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }; then
+    if $NM --dynamic --extern-only --defined-only conftest.so 2>&1 \
+       | grep -q foo@@VERS_2.0; then
+      libc_cv_nm_has_symbol_version=yes
+    fi
+  fi
+  rm conftest.*
+
+fi
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_nm_has_symbol_version" >&5
+printf "%s\n" "$libc_cv_nm_has_symbol_version" >&6; }
+config_vars="$config_vars
+nm-has-symbol-version = $libc_cv_nm_has_symbol_version"
+
 { printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for __builtin_trap with no external dependencies" >&5
 printf %s "checking for __builtin_trap with no external dependencies... " >&6; }
 if test ${libc_cv_builtin_trap+y}
diff --git a/configure.ac b/configure.ac
index bdc385d03c..e0ad986ae0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1554,6 +1554,37 @@  if test "$libc_cv_symver_needs_alias" = yes; then
   AC_DEFINE(SYMVER_NEEDS_ALIAS)
 fi
 
+dnl Starting with binutils 2.35, NM displays symbol versions for dynamic
+dnl symbols (PR 25708).
+AC_CACHE_CHECK(whether nm displays symbol versions for dynamic symbols,
+               libc_cv_nm_has_symbol_version, [dnl
+  cat > conftest.s <<EOF
+	.text
+	.globl	new_foo
+	.type	new_foo, %function
+new_foo:
+	.byte 0
+	.symver new_foo,foo@@VERS_2.0
+EOF
+  cat > conftest.t <<EOF
+VERS_2.0 {
+global:
+  foo;
+local:
+  *;
+};
+EOF
+  libc_cv_nm_has_symbol_version=no
+  if AC_TRY_COMMAND([${CC-cc} -nostdlib -nostartfiles -fPIC -shared conftest.s -Wl,-version-script=conftest.t -o conftest.so]); then
+    if $NM --dynamic --extern-only --defined-only conftest.so 2>&1 \
+       | grep -q foo@@VERS_2.0; then
+      libc_cv_nm_has_symbol_version=yes
+    fi
+  fi
+  rm conftest.*
+])
+LIBC_CONFIG_VAR([nm-has-symbol-version], [$libc_cv_nm_has_symbol_version])
+
 AC_CACHE_CHECK(for __builtin_trap with no external dependencies,
 	       libc_cv_builtin_trap, [dnl
 libc_cv_builtin_trap=no
diff --git a/elf/Makefile b/elf/Makefile
index 280e777c19..eb4357a0dc 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -182,6 +182,21 @@  CFLAGS-rtld.os += $(rtld-early-cflags)
 ifeq ($(unwind-find-fde),yes)
 routines += unwind-dw2-fde-glibc
 shared-only-routines += unwind-dw2-fde-glibc
+
+# These global symbols are only in libc.so.
+libc-shared-only-symbols += \
+  _Unwind_Find_FDE \
+  __deregister_frame \
+  __deregister_frame_info \
+  __deregister_frame_info_bases \
+  __register_frame \
+  __register_frame_info \
+  __register_frame_info_bases \
+  __register_frame_info_bases \
+  __register_frame_info_table \
+  __register_frame_info_table_bases \
+  __register_frame_table \
+# libc-shared-only-symbols
 endif
 
 before-compile  += $(objpfx)trusted-dirs.h
diff --git a/math/Makefile b/math/Makefile
index 58e5c070cf..ae268e0088 100644
--- a/math/Makefile
+++ b/math/Makefile
@@ -1674,3 +1674,7 @@  object-suffixes-left := $(libmvec-tests)
 include $(o-iterator)
 
 $(objpfx)test-fenv-tls: $(shared-thread-library)
+
+# check-static-abi.sh needs libc.staticsymlist to check the global symbol
+# list in libm.a.
+$(objpfx)check-static-abi-libm.out: $(common-objpfx)libc.staticsymlist
diff --git a/scripts/check-static-abi.sh b/scripts/check-static-abi.sh
new file mode 100644
index 0000000000..72a198d481
--- /dev/null
+++ b/scripts/check-static-abi.sh
@@ -0,0 +1,133 @@ 
+#!/bin/sh
+# Check global symbols in static library against global symbols in shared
+# library.
+# Copyright (C) 2024 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+
+export LC_ALL=C
+
+lib="$1"
+shift
+archive="$1"
+shift
+nm="$1"
+shift
+awk="$1"
+shift
+dynamicsymlist="$1"
+shift
+if test $lib = libc; then
+  libm_dynamicsymlist="$1"
+  shift
+fi
+staticsymlist="$1"
+shift
+if test $lib = libc; then
+  libc_nonshared_staticsymlist="$1"
+  shift
+elif test $lib = libm; then
+  libc_staticsymlist="$1"
+  shift
+fi
+shared_only_functions="$1"
+
+staticdiff="$(mktemp)"
+staticlist="$(mktemp)"
+
+cleanup () {
+  rm -f -- "$staticdiff" "$staticlist"
+}
+
+trap cleanup 0
+
+status=0
+
+diff -up $dynamicsymlist $staticsymlist > $staticdiff
+
+# Check missing global symbols, including symbols with a leading
+# underscore, in static library.
+shared_only="$(grep '^-[^-]' $staticdiff | sed -e 's/^-//')"
+if test -n "$shared_only"; then
+  # Exclude shared only functions.
+  for f in $shared_only; do
+    case " $shared_only_functions " in
+    *" $f "*)
+      ;;
+    *)
+      # Skip the missing function, $f, in libm.a if $f is in libc.a.
+      if test $lib = libm \
+	 && grep -q "^$f$" $libc_staticsymlist; then
+	continue
+      fi
+      missing="$missing $f"
+      ;;
+    esac
+  done
+  if test -n "$missing"; then
+    status=1
+    echo Missing functions in ${lib}.a:
+    echo "$missing"
+  fi
+fi
+
+# Check extra global symbols, excluding symbols with a leading underscore
+# which can be internal symbols, in static library.
+static_only="$(grep '^+[^_^\+]' $staticdiff | sed -e 's/^+//')"
+if test -n "$static_only"; then
+  # Exclude static only functions.
+  for f in $static_only; do
+    # Skip the extra function, $f, in libc.a if $f is in libc_nonshared.a
+    # or libm.so.
+    if test $lib = libc; then
+      if grep -q "^$f$" $libc_nonshared_staticsymlist \
+	 || grep -q "^$f$" $libm_dynamicsymlist; then
+	continue
+      fi
+    fi
+    extra="$extra $f"
+  done
+  if test -n "$extra"; then
+    status=1
+    echo Extra functions in ${lib}.a:
+    $nm --extern-only --defined-only $archive > $staticlist 2>&1
+    $awk -v functions="$extra" \
+      ' \
+NF == 1 { \
+  thisfile = $1; \
+  filenames[$1] = 1; \
+  next; \
+} \
+NF == 3 { \
+  allsymbols[thisfile, $3] = 1; \
+  next; \
+} \
+END { \
+  split (functions, symbols, " "); \
+  for (s in symbols) { \
+    for (f in filenames) { \
+      if (allsymbols[f, symbols[s]] == 1) { \
+        print f, symbols[s];
+      } \
+    } \
+  } \
+} \
+      ' \
+      $staticlist
+  fi
+fi
+
+exit $status
diff --git a/sysdeps/generic/Makefile b/sysdeps/generic/Makefile
index 72522d1f0e..f24c517bbd 100644
--- a/sysdeps/generic/Makefile
+++ b/sysdeps/generic/Makefile
@@ -25,7 +25,18 @@  ifeq (yes:yes,$(build-shared):$(unwind-find-fde))
 # This is needed to support g++ v2 and v3.
 sysdep_routines += framestate unwind-pe
 shared-only-routines += framestate unwind-pe
+
+# These global symbols are only in libc.so.
+libc-shared-only-symbols += __frame_state_for
 endif
+
+# These global symbols are only in libc.so.
+libc-shared-only-symbols += \
+  __xmknod \
+  __xmknodat \
+  _dl_mcount_wrapper_check \
+  re_max_failures \
+# libc-shared-only-symbols
 endif
 
 ifeq ($(subdir),malloc)
diff --git a/sysdeps/sparc/sparc32/Makefile b/sysdeps/sparc/sparc32/Makefile
index 4ba8794aa0..154af1b8e1 100644
--- a/sysdeps/sparc/sparc32/Makefile
+++ b/sysdeps/sparc/sparc32/Makefile
@@ -15,6 +15,18 @@ 
 # License along with the GNU C Library; if not, see
 # <https://www.gnu.org/licenses/>.
 
+ifeq ($(subdir),elf)
+# These global symbols are only in libc.so.  Filter out the
+# $(shared-only-routines) routines.
+libc-shared-only-symbols += \
+  .div \
+  .rem \
+  .udiv \
+  .umul \
+  .urem \
+  # libc-shared-only-symbols
+endif
+
 ifeq ($(subdir),gnulib)
 sysdep_routines = dotmul umul $(divrem) alloca
 shared-only-routines += umul $(divrem)