Narrow Linux symbol CRCs to 32 bits

Message ID 20221027140928.1480353-1-gprocida@google.com
State New
Headers
Series Narrow Linux symbol CRCs to 32 bits |

Commit Message

Giuliano Procida Oct. 27, 2022, 2:09 p.m. UTC
  MODVERSIONS CRCs are 32-bit hashes of strings representing C type
elements or typed symbols. The hash is calculated using a 32-bit CRC,
hence the name. The kernel module loading code (implicitly) truncates
any provided CRC value to 32 bits before comparing it with anything.

When support was added to libabigail, values up to 64 bits wide were
supported. Recently, Linux kernel builds have now started generating
ELF CRC symbols with 64-bit values (where the low 32 bits are the
CRC). Together this has resulted in incorrect CRCs in ABIs.

This change resolves the problem by narrowing libabigail's concept of
Linux CRC to 32 bits. No tests are affected.

	* include/abg-ir.h (elf_symbol::elf_symbol): Change CRC type
	from optional<uint64_t> to optional<uint32_t>.
	(elf_symbol::create): Likewise.
	(elf_symbol::get_crc): Likewise.
	(elf_symbol::set_crc): Likewise.
	* src/abg-ir.cc (elf_symbol::priv) Change CRC type from
	optional<uint64_t> to optional<uint32_t>.
	(elf_symbol::priv::priv): Likewise.
	(elf_symbol::elf_symbol): Likewise.
	(elf_symbol::create): Likewise.
	(elf_symbol::get_crc): Likewise.
	(elf_symbol::set_crc): Likewise.
	* src/abg-reporter-priv.cc (maybe_report_diff_for_symbol):
	Change CRC type from optional<uint64_t> to
	optional<uint32_t>.
	* src/abg-symtab-reader.cc (symtab::load_): Change crc_values
	value type from uint64_t to uint32_t.

Signed-off-by: Giuliano Procida <gprocida@google.com>
---
 include/abg-ir.h         |  8 ++++----
 src/abg-ir.cc            | 12 ++++++------
 src/abg-reporter-priv.cc |  4 ++--
 src/abg-symtab-reader.cc |  2 +-
 4 files changed, 13 insertions(+), 13 deletions(-)
  

Comments

Giuliano Procida Oct. 27, 2022, 3:53 p.m. UTC | #1
Self-reply.

On Thu, 27 Oct 2022 at 15:09, Giuliano Procida <gprocida@google.com> wrote:
>
> MODVERSIONS CRCs are 32-bit hashes of strings representing C type
> elements or typed symbols. The hash is calculated using a 32-bit CRC,
> hence the name. The kernel module loading code (implicitly) truncates
> any provided CRC value to 32 bits before comparing it with anything.
>
> When support was added to libabigail, values up to 64 bits wide were
> supported.

True so far.

> Recently, Linux kernel builds have now started generating
> ELF CRC symbols with 64-bit values (where the low 32 bits are the
> CRC). Together this has resulted in incorrect CRCs in ABIs.

The actual problem is a change to how CRCs are emitted in kernel objects.
I think the change 7b4537199a4a8480b8c3ba37a2d44765ce76cd9b was
responsible.

> This change resolves the problem by narrowing libabigail's concept of
> Linux CRC to 32 bits. No tests are affected.

The representation change is fine. The problem with bad CRCs is still there.

>
>         * include/abg-ir.h (elf_symbol::elf_symbol): Change CRC type
>         from optional<uint64_t> to optional<uint32_t>.
>         (elf_symbol::create): Likewise.
>         (elf_symbol::get_crc): Likewise.
>         (elf_symbol::set_crc): Likewise.
>         * src/abg-ir.cc (elf_symbol::priv) Change CRC type from
>         optional<uint64_t> to optional<uint32_t>.
>         (elf_symbol::priv::priv): Likewise.
>         (elf_symbol::elf_symbol): Likewise.
>         (elf_symbol::create): Likewise.
>         (elf_symbol::get_crc): Likewise.
>         (elf_symbol::set_crc): Likewise.
>         * src/abg-reporter-priv.cc (maybe_report_diff_for_symbol):
>         Change CRC type from optional<uint64_t> to
>         optional<uint32_t>.
>         * src/abg-symtab-reader.cc (symtab::load_): Change crc_values
>         value type from uint64_t to uint32_t.
>
> Signed-off-by: Giuliano Procida <gprocida@google.com>
> ---
>  include/abg-ir.h         |  8 ++++----
>  src/abg-ir.cc            | 12 ++++++------
>  src/abg-reporter-priv.cc |  4 ++--
>  src/abg-symtab-reader.cc |  2 +-
>  4 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/include/abg-ir.h b/include/abg-ir.h
> index 4892f0e2..546603f4 100644
> --- a/include/abg-ir.h
> +++ b/include/abg-ir.h
> @@ -940,7 +940,7 @@ private:
>              const version&     ve,
>              visibility         vi,
>              bool               is_in_ksymtab = false,
> -            const abg_compat::optional<uint64_t>&      crc = {},
> +            const abg_compat::optional<uint32_t>&      crc = {},
>              const abg_compat::optional<std::string>&   ns = {},
>              bool               is_suppressed = false);
>
> @@ -966,7 +966,7 @@ public:
>          const version&     ve,
>          visibility         vi,
>          bool               is_in_ksymtab = false,
> -        const abg_compat::optional<uint64_t>&          crc = {},
> +        const abg_compat::optional<uint32_t>&          crc = {},
>          const abg_compat::optional<std::string>&       ns = {},
>          bool               is_suppressed = false);
>
> @@ -1039,11 +1039,11 @@ public:
>    void
>    set_is_in_ksymtab(bool is_in_ksymtab);
>
> -  const abg_compat::optional<uint64_t>&
> +  const abg_compat::optional<uint32_t>&
>    get_crc() const;
>
>    void
> -  set_crc(const abg_compat::optional<uint64_t>& crc);
> +  set_crc(const abg_compat::optional<uint32_t>& crc);
>
>    const abg_compat::optional<std::string>&
>    get_namespace() const;
> diff --git a/src/abg-ir.cc b/src/abg-ir.cc
> index 5193934a..a043a532 100644
> --- a/src/abg-ir.cc
> +++ b/src/abg-ir.cc
> @@ -1788,7 +1788,7 @@ struct elf_symbol::priv
>    //     STT_COMMON definition of that name that has the largest size.
>    bool                 is_common_;
>    bool                 is_in_ksymtab_;
> -  abg_compat::optional<uint64_t>       crc_;
> +  abg_compat::optional<uint32_t>       crc_;
>    abg_compat::optional<std::string>    namespace_;
>    bool                 is_suppressed_;
>    elf_symbol_wptr      main_symbol_;
> @@ -1822,7 +1822,7 @@ struct elf_symbol::priv
>         const elf_symbol::version& ve,
>         elf_symbol::visibility    vi,
>         bool                      is_in_ksymtab,
> -       const abg_compat::optional<uint64_t>&   crc,
> +       const abg_compat::optional<uint32_t>&   crc,
>         const abg_compat::optional<std::string>&        ns,
>         bool                      is_suppressed)
>      : env_(e),
> @@ -1896,7 +1896,7 @@ elf_symbol::elf_symbol(const environment* e,
>                        const version&     ve,
>                        visibility         vi,
>                        bool               is_in_ksymtab,
> -                      const abg_compat::optional<uint64_t>&    crc,
> +                      const abg_compat::optional<uint32_t>&    crc,
>                        const abg_compat::optional<std::string>& ns,
>                        bool               is_suppressed)
>    : priv_(new priv(e,
> @@ -1971,7 +1971,7 @@ elf_symbol::create(const environment* e,
>                    const version&     ve,
>                    visibility         vi,
>                    bool               is_in_ksymtab,
> -                  const abg_compat::optional<uint64_t>&        crc,
> +                  const abg_compat::optional<uint32_t>&        crc,
>                    const abg_compat::optional<std::string>&     ns,
>                    bool               is_suppressed)
>  {
> @@ -2208,7 +2208,7 @@ elf_symbol::set_is_in_ksymtab(bool is_in_ksymtab)
>  /// Getter of the 'crc' property.
>  ///
>  /// @return the CRC (modversions) value for Linux Kernel symbols, if any
> -const abg_compat::optional<uint64_t>&
> +const abg_compat::optional<uint32_t>&
>  elf_symbol::get_crc() const
>  {return priv_->crc_;}
>
> @@ -2216,7 +2216,7 @@ elf_symbol::get_crc() const
>  ///
>  /// @param crc the new CRC (modversions) value for Linux Kernel symbols
>  void
> -elf_symbol::set_crc(const abg_compat::optional<uint64_t>& crc)
> +elf_symbol::set_crc(const abg_compat::optional<uint32_t>& crc)
>  {priv_->crc_ = crc;}
>
>  /// Getter of the 'namespace' property.
> diff --git a/src/abg-reporter-priv.cc b/src/abg-reporter-priv.cc
> index 2ecf8016..70839085 100644
> --- a/src/abg-reporter-priv.cc
> +++ b/src/abg-reporter-priv.cc
> @@ -1149,8 +1149,8 @@ maybe_report_diff_for_symbol(const elf_symbol_sptr&       symbol1,
>           << "\n";
>      }
>
> -  const abg_compat::optional<uint64_t>& crc1 = symbol1->get_crc();
> -  const abg_compat::optional<uint64_t>& crc2 = symbol2->get_crc();
> +  const abg_compat::optional<uint32_t>& crc1 = symbol1->get_crc();
> +  const abg_compat::optional<uint32_t>& crc2 = symbol2->get_crc();
>    if (crc1 != crc2)
>      {
>        const std::string none = "(none)";
> diff --git a/src/abg-symtab-reader.cc b/src/abg-symtab-reader.cc
> index 606c96a2..8a566f8d 100644
> --- a/src/abg-symtab-reader.cc
> +++ b/src/abg-symtab-reader.cc
> @@ -265,7 +265,7 @@ symtab::load_(Elf*         elf_handle,
>
>    const bool is_kernel = elf_helpers::is_linux_kernel(elf_handle);
>    std::unordered_set<std::string> exported_kernel_symbols;
> -  std::unordered_map<std::string, uint64_t> crc_values;
> +  std::unordered_map<std::string, uint32_t> crc_values;
>    std::unordered_map<std::string, std::string> namespaces;
>
>    for (size_t i = 0; i < number_syms; ++i)
> --
> 2.38.0.135.g90850a2211-goog
>
  
Dodji Seketeli Oct. 28, 2022, 3:13 p.m. UTC | #2
Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

> Self-reply.
>
> On Thu, 27 Oct 2022 at 15:09, Giuliano Procida <gprocida@google.com> wrote:
>>
>> MODVERSIONS CRCs are 32-bit hashes of strings representing C type
>> elements or typed symbols. The hash is calculated using a 32-bit CRC,
>> hence the name. The kernel module loading code (implicitly) truncates
>> any provided CRC value to 32 bits before comparing it with anything.
>>
>> When support was added to libabigail, values up to 64 bits wide were
>> supported.
>
> True so far.
>
>> Recently, Linux kernel builds have now started generating
>> ELF CRC symbols with 64-bit values (where the low 32 bits are the
>> CRC). Together this has resulted in incorrect CRCs in ABIs.
>
> The actual problem is a change to how CRCs are emitted in kernel objects.
> I think the change 7b4537199a4a8480b8c3ba37a2d44765ce76cd9b was
> responsible.
>
>> This change resolves the problem by narrowing libabigail's concept of
>> Linux CRC to 32 bits. No tests are affected.
>
> The representation change is fine. The problem with bad CRCs is still
> there.

The changes look good to me.  Do you want me to apply it right now, or
do you prefer to send a subsequent patch that addresses the new way how
CRCs are emitted in kernel object?

I am fine with either way. 

[...]

Cheers,
  
Giuliano Procida Oct. 28, 2022, 4:08 p.m. UTC | #3
Hi Dodji.

On Fri, 28 Oct 2022 at 16:13, Dodji Seketeli <dodji@seketeli.org> wrote:
>
> Hello Giuliano,
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> > Self-reply.
> >
> > On Thu, 27 Oct 2022 at 15:09, Giuliano Procida <gprocida@google.com> wrote:
> >>
> >> MODVERSIONS CRCs are 32-bit hashes of strings representing C type
> >> elements or typed symbols. The hash is calculated using a 32-bit CRC,
> >> hence the name. The kernel module loading code (implicitly) truncates
> >> any provided CRC value to 32 bits before comparing it with anything.
> >>
> >> When support was added to libabigail, values up to 64 bits wide were
> >> supported.
> >
> > True so far.
> >
> >> Recently, Linux kernel builds have now started generating
> >> ELF CRC symbols with 64-bit values (where the low 32 bits are the
> >> CRC). Together this has resulted in incorrect CRCs in ABIs.
> >
> > The actual problem is a change to how CRCs are emitted in kernel objects.
> > I think the change 7b4537199a4a8480b8c3ba37a2d44765ce76cd9b was
> > responsible.
> >
> >> This change resolves the problem by narrowing libabigail's concept of
> >> Linux CRC to 32 bits. No tests are affected.
> >
> > The representation change is fine. The problem with bad CRCs is still
> > there.
>
> The changes look good to me.  Do you want me to apply it right now, or
> do you prefer to send a subsequent patch that addresses the new way how
> CRCs are emitted in kernel object?

I'm happy for the change to go in. However, the commit message text
should be updated. I would make it:

--

MODVERSIONS CRCs are 32-bit hashes of strings representing C type
elements or typed symbols. The hashes are calculated using a 32-bit CRC,
hence the name. The kernel module loading code (implicitly) truncates
any provided CRC value to 32 bits before comparing it with anything.

When support was added to libabigail, values up to 64 bits wide were
supported. This change narrows libabigail's concept of Linux CRC to 32
bits. No tests are affected.

--

I'm not sure when we'll get around to updating the symtab reader. It'll
probably not be done by me. As I understand things, there are
something like 3 or 4 different encodings of CRCs depending on the
kernel version and architecture (and that's ignoring endianness issues).

Matthias has even suggested it might be better (more reliable) just reading
CRCs from Modules.symvers instead. However, that would require more
integration into build scripts etc.

> I am fine with either way.
>
> [...]
>
> Cheers,
>
> --
>                 Dodji
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
  
Dodji Seketeli Nov. 17, 2022, 10:13 a.m. UTC | #4
Hello,

[...]

On Fri, 28 Oct 2022 at 16:13, Dodji Seketeli <dodji@seketeli.org> wrote:

>> The changes look good to me.  Do you want me to apply it right now, or
>> do you prefer to send a subsequent patch that addresses the new way how
>> CRCs are emitted in kernel object?

Giuliano Procida <gprocida@google.com> a écrit:

> I'm happy for the change to go in. However, the commit message text
> should be updated. I would make it:
>
> --
>
> MODVERSIONS CRCs are 32-bit hashes of strings representing C type
> elements or typed symbols. The hashes are calculated using a 32-bit CRC,
> hence the name. The kernel module loading code (implicitly) truncates
> any provided CRC value to 32 bits before comparing it with anything.
>
> When support was added to libabigail, values up to 64 bits wide were
> supported. This change narrows libabigail's concept of Linux CRC to 32
> bits. No tests are affected.
>
> --
>
> I'm not sure when we'll get around to updating the symtab reader. It'll
> probably not be done by me. As I understand things, there are
> something like 3 or 4 different encodings of CRCs depending on the
> kernel version and architecture (and that's ignoring endianness issues).
>
> Matthias has even suggested it might be better (more reliable) just reading
> CRCs from Modules.symvers instead. However, that would require more
> integration into build scripts etc.

This felt through the cracks on my end, sorry about that.

I have just applied the v2 of the patch that you've sent 5 days ago.

Thank you for that.

[...]

Cheers,
  

Patch

diff --git a/include/abg-ir.h b/include/abg-ir.h
index 4892f0e2..546603f4 100644
--- a/include/abg-ir.h
+++ b/include/abg-ir.h
@@ -940,7 +940,7 @@  private:
 	     const version&	ve,
 	     visibility		vi,
 	     bool		is_in_ksymtab = false,
-	     const abg_compat::optional<uint64_t>&	crc = {},
+	     const abg_compat::optional<uint32_t>&	crc = {},
 	     const abg_compat::optional<std::string>&	ns = {},
 	     bool		is_suppressed = false);
 
@@ -966,7 +966,7 @@  public:
 	 const version&	    ve,
 	 visibility	    vi,
 	 bool		    is_in_ksymtab = false,
-	 const abg_compat::optional<uint64_t>&		crc = {},
+	 const abg_compat::optional<uint32_t>&		crc = {},
 	 const abg_compat::optional<std::string>&	ns = {},
 	 bool		    is_suppressed = false);
 
@@ -1039,11 +1039,11 @@  public:
   void
   set_is_in_ksymtab(bool is_in_ksymtab);
 
-  const abg_compat::optional<uint64_t>&
+  const abg_compat::optional<uint32_t>&
   get_crc() const;
 
   void
-  set_crc(const abg_compat::optional<uint64_t>& crc);
+  set_crc(const abg_compat::optional<uint32_t>& crc);
 
   const abg_compat::optional<std::string>&
   get_namespace() const;
diff --git a/src/abg-ir.cc b/src/abg-ir.cc
index 5193934a..a043a532 100644
--- a/src/abg-ir.cc
+++ b/src/abg-ir.cc
@@ -1788,7 +1788,7 @@  struct elf_symbol::priv
   //     STT_COMMON definition of that name that has the largest size.
   bool			is_common_;
   bool			is_in_ksymtab_;
-  abg_compat::optional<uint64_t>	crc_;
+  abg_compat::optional<uint32_t>	crc_;
   abg_compat::optional<std::string>	namespace_;
   bool			is_suppressed_;
   elf_symbol_wptr	main_symbol_;
@@ -1822,7 +1822,7 @@  struct elf_symbol::priv
        const elf_symbol::version& ve,
        elf_symbol::visibility	  vi,
        bool			  is_in_ksymtab,
-       const abg_compat::optional<uint64_t>&	crc,
+       const abg_compat::optional<uint32_t>&	crc,
        const abg_compat::optional<std::string>&	ns,
        bool			  is_suppressed)
     : env_(e),
@@ -1896,7 +1896,7 @@  elf_symbol::elf_symbol(const environment* e,
 		       const version&	  ve,
 		       visibility	  vi,
 		       bool		  is_in_ksymtab,
-		       const abg_compat::optional<uint64_t>&	crc,
+		       const abg_compat::optional<uint32_t>&	crc,
 		       const abg_compat::optional<std::string>&	ns,
 		       bool		  is_suppressed)
   : priv_(new priv(e,
@@ -1971,7 +1971,7 @@  elf_symbol::create(const environment* e,
 		   const version&     ve,
 		   visibility	      vi,
 		   bool		      is_in_ksymtab,
-		   const abg_compat::optional<uint64_t>&	crc,
+		   const abg_compat::optional<uint32_t>&	crc,
 		   const abg_compat::optional<std::string>&	ns,
 		   bool		      is_suppressed)
 {
@@ -2208,7 +2208,7 @@  elf_symbol::set_is_in_ksymtab(bool is_in_ksymtab)
 /// Getter of the 'crc' property.
 ///
 /// @return the CRC (modversions) value for Linux Kernel symbols, if any
-const abg_compat::optional<uint64_t>&
+const abg_compat::optional<uint32_t>&
 elf_symbol::get_crc() const
 {return priv_->crc_;}
 
@@ -2216,7 +2216,7 @@  elf_symbol::get_crc() const
 ///
 /// @param crc the new CRC (modversions) value for Linux Kernel symbols
 void
-elf_symbol::set_crc(const abg_compat::optional<uint64_t>& crc)
+elf_symbol::set_crc(const abg_compat::optional<uint32_t>& crc)
 {priv_->crc_ = crc;}
 
 /// Getter of the 'namespace' property.
diff --git a/src/abg-reporter-priv.cc b/src/abg-reporter-priv.cc
index 2ecf8016..70839085 100644
--- a/src/abg-reporter-priv.cc
+++ b/src/abg-reporter-priv.cc
@@ -1149,8 +1149,8 @@  maybe_report_diff_for_symbol(const elf_symbol_sptr&	symbol1,
 	  << "\n";
     }
 
-  const abg_compat::optional<uint64_t>& crc1 = symbol1->get_crc();
-  const abg_compat::optional<uint64_t>& crc2 = symbol2->get_crc();
+  const abg_compat::optional<uint32_t>& crc1 = symbol1->get_crc();
+  const abg_compat::optional<uint32_t>& crc2 = symbol2->get_crc();
   if (crc1 != crc2)
     {
       const std::string none = "(none)";
diff --git a/src/abg-symtab-reader.cc b/src/abg-symtab-reader.cc
index 606c96a2..8a566f8d 100644
--- a/src/abg-symtab-reader.cc
+++ b/src/abg-symtab-reader.cc
@@ -265,7 +265,7 @@  symtab::load_(Elf*	       elf_handle,
 
   const bool is_kernel = elf_helpers::is_linux_kernel(elf_handle);
   std::unordered_set<std::string> exported_kernel_symbols;
-  std::unordered_map<std::string, uint64_t> crc_values;
+  std::unordered_map<std::string, uint32_t> crc_values;
   std::unordered_map<std::string, std::string> namespaces;
 
   for (size_t i = 0; i < number_syms; ++i)