Add support for the CTF debug format to libabigail.

Message ID 20211011084509.9044-1-jose.marchesi@oracle.com
State New
Series Add support for the CTF debug format to libabigail.

Commit Message

Jose E. Marchesi Oct. 11, 2021, 8:45 a.m. UTC
CTF (C Type Format) is a lightweight debugging format that provides
information about C types and the association between functions and
data symbols and types.  It is designed to be very compact and
simple.

This patch introduces support in libabigail to extract ABI information
from CTF stored in ELF files.

A few notes on this implementation:

- The implementation is complete in terms of CTF support.  Every CTF
  feature is processed and handled to generate libabigail IR.  This
  includes basic types, typedefs, pointer, array and struct types.
  The CTF records of data objects (variables) and functions are also
  used in order to generate the corresponding libabigail IR artifacts.

- The decoding of CTF data is done using the libctf library which is
  part of binutils.  In order to link with it, binutils shall be built
  with --enable-shared for libctf.so to become available.

- This initial implementation aims for simplicity.  We have not
  tried to resolve any and every corner case that may require special
  handling.  We have observed that the DWARF front-end (which is
  naturally way more complex as the scope is way bigger) is plagued
  with hacks to handle such situations.  However, for the CTF support
  we prefer to proceed in a simpler and more modest way: we will
  handle these problems if/when we find them.  The fact that CTF only
  supports C (currently) certainly helps there.

- Likewise, in this basic support we are not handling symbol
  suppressions or other goodies that libabigail provides.  We are new
  to libabigail and ABI analysis, and at this point we simply don't
  have a clear picture about what is most useful/relevant to support
  or not.  With the maintainer's blessing, we will tackle that
  functionality after this basic support is applied upstream.

- The implementation in abg-ctf-reader.{cc,h} is pretty much
  self-contained.  As a result there is some duplication in terms of
  ELF handling with the DWARF reader, but since that logic is very
  simple and can be easily implemented, we don't consider this to be a
  big deal (for now.)  Hopefully the maintainers agree.

- The libabigail tools assume that ELF means to always use DWARF to
  generate the ABI IR.  We added a new command-line option --ctf to
  the tools in order to make them use the CTF debug info instead.
  We are definitely not sure whether this is the best user interface.
  In fact I would be surprised if it was ;)

- We added support for --ctf to both abilint and abidiff.   We are not
  sure whether it would make sense to add support for CTF to the other
  tools.  Feedback welcome.

- We are pondering about what to do in terms of testing.  We have
  cursorily tested this implementation using abilint and abidiff.  We
  know we are generating IR corpora that seem to be ok.  It would be
  good however to be able to run the libabigail testsuites using CTF.
  However the testsuites may need some non-trivial changes in order to
  make this possible.  Let's talk about that :)
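
To make the --ctf bullets above more concrete, here is a minimal,
stand-alone sketch of how a tool could treat the flag.  It is not the
code from tools/abidiff.cc or tools/abilint.cc: only the `--ctf' option
and the `use_ctf' field name come from the patch, while `front_end',
`select_front_end' and the shape of `parse_command_line' are
hypothetical and kept deliberately small.

```cpp
// Illustrative sketch only; see the assumptions listed above.
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tag for the debug-info reader a tool would use.
enum class front_end { dwarf, ctf };

struct options
{
  bool use_ctf = false;            // set by --ctf
  std::vector<std::string> files;  // positional arguments
};

// Fill OPTS from ARGS.  Returns false if an unknown option is seen.
bool
parse_command_line (const std::vector<std::string> &args, options &opts)
{
  for (const std::string &arg : args)
    {
      if (arg == "--ctf")
        opts.use_ctf = true;
      else if (!arg.empty () && arg[0] == '-')
        return false;
      else
        opts.files.push_back (arg);
    }
  return true;
}

// DWARF stays the default for ELF inputs; CTF is strictly opt-in.
front_end
select_front_end (const options &opts)
{
  return opts.use_ctf ? front_end::ctf : front_end::dwarf;
}
```

Keeping the choice in one small function would make it easy to replace
the explicit flag later, for instance with auto-detection, if that
turns out to be a better user interface.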

Salud!
---
 ChangeLog                |  18 +
 configure.ac             |  32 +-
 include/Makefile.am      |   4 +
 include/abg-corpus.h     |   1 +
 include/abg-ctf-reader.h |  34 ++
 src/Makefile.am          |   4 +
 src/abg-ctf-reader.cc    | 999 +++++++++++++++++++++++++++++++++++++++++++++++
 tools/abidiff.cc         |  99 +++--
 tools/abilint.cc         |  37 +-
 9 files changed, 1183 insertions(+), 45 deletions(-)
 create mode 100644 include/abg-ctf-reader.h
 create mode 100644 src/abg-ctf-reader.cc

Comments

Giuliano Procida Oct. 11, 2021, 12:09 p.m. UTC | #1
Hi.

On Mon, 11 Oct 2021 at 09:45, Jose E. Marchesi via Libabigail
<libabigail@sourceware.org> wrote:
>
> CTF (C Type Format) is a lightweight debugging format that provides
> information about C types and the association between functions and
> data symbols and types.  It is designed to be very compact and
> simple.
>

It's nice to see you say "simple" here. I'm all in favour. However, a lot
of the https://github.com/oracle/binutils-gdb/wiki/libctf-todo items look
like they aim to reduce CTF binary size at the expense of greater
complexity. Will everything be abstracted away in libctf?

> This patch introduces support in libabigail to extract ABI information
> from CTF stored in ELF files.
>
> A few notes on this implementation:
>
> - The implementation is complete in terms of CTF support.  Every CTF
>   feature is processed and handled to generate libabigail IR.  This
>   includes basic types, typedefs, pointer, array and struct types.
>   The CTF record of data objects (variables) and functions are also
>   used in order to generate the corresponding libabigail IR artifacts.
>
> - The decoding of CTF data is done using the libctf library which is
>   part of binutils.  In order to link with it, binutils shall be built
>   with --enable-shared for libctf.so to become available.
>
> - This initial implementation aims for simplicity.  We have not
>   tried to resolve any and every corner case that may require special
>   handling.  We have observed that the DWARF front-end (which is
>   naturally way more complex as the scope is way bigger) is plagued
>   with hacks to handle such situations.  However, for the CTF support
>   we prefer to proceed in a simpler and more modest way: we will
>   handle these problems if/when we find them.  The fact that CTF only
>   supports C (currently) certainly helps there.
>
> - Likewise, in this basic support we are not handling symbol
>   suppressions or other goodies that libabigail provides.  We are new
>   to libabigail and ABI analysis, and at this point we simply don't
>   have a clear picture about what is most useful/relevant to support
>   or not.  With the maintainer's blessing, we will tackle that
>   functionality after this basic support is applied upstream.
>
> - The implementation in abg-ctf-reader.{cc,h} is pretty much
>   self-contained.  As a result there is some duplication in terms of
>   ELF handling with the DWARF reader, but since that logic is very
>   simple and can be easily implemented, we don't consider this to be a
>   big deal (for now.)  Hopefully the maintainers agree.
>

The implementation is short which is great.
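
For anyone skimming the patch, the part that keeps it short seems to be
one recursive translation function plus a cache keyed by CTF type ID,
with struct and union nodes registered in the cache before their members
are visited.  The following is a stand-alone sketch of that pattern, not
code from abg-ctf-reader.cc: `type_id', `source_type', `ir_type' and
`reader' are hypothetical stand-ins that need neither libctf nor
libabigail.

```cpp
// Illustrative sketch only; every name here is hypothetical.
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

using type_id = long;  // stands in for ctf_id_t

// Toy input: a type whose members refer to other types by ID,
// possibly to itself (think of `struct node { struct node *next; }').
struct source_type
{
  std::string name;
  std::vector<type_id> member_types;
};

// Toy IR node.  Members are non-owning; the cache owns all nodes.
struct ir_type
{
  std::string name;
  std::vector<ir_type *> members;
};

using ir_type_sptr = std::shared_ptr<ir_type>;

struct reader
{
  std::unordered_map<type_id, source_type> input;
  std::unordered_map<type_id, ir_type_sptr> cache;
  int nodes_built = 0;

  ir_type_sptr
  process_type (type_id id)
  {
    // Reuse an already generated node if there is one.
    auto found = cache.find (id);
    if (found != cache.end ())
      return found->second;

    auto src = input.find (id);
    if (src == input.end ())
      return nullptr;  // unknown type: the caller ignores it

    ir_type_sptr node = std::make_shared<ir_type> ();
    node->name = src->second.name;
    ++nodes_built;

    // Register the node before visiting its members, so that a member
    // referring back to this type hits the cache instead of recursing
    // forever.
    cache.emplace (id, node);

    for (type_id member : src->second.member_types)
      if (ir_type_sptr m = process_type (member))
        node->members.push_back (m.get ());

    return node;
  }
};
```

With a self-referential input type, the sketch builds one node per
distinct ID, and the back-reference resolves to the node that is still
under construction.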

> - The libabigail tools assume that ELF means to always use DWARF to
>   generate the ABI IR.  We added a new command-line option --ctf to
>   the tools in order to make them use the CTF debug info instead.
>   We are definitely not sure whether this is the best user interface.
>   In fact I would be surprised if it was ;)
>
> - We added support for --ctf to both abilint and abidiff.   We are not
>   sure whether it would make sense to add support for CTF to the other
>   tools.  Feedback welcome.
>

For ease of testing / building up a useful regression test suite, please do
consider adding --ctf to abidw (or adding abictf?) which would give a
CTF -> XML utility. Plain diff (rather than abidiff's ABI diff) can be used to
check for changes over time.

> - We are pondering about what to do in terms of testing.  We have
>   cursorily tested this implementation using abilint and abidiff.  We
>   know we are generating IR corpora that seem to be ok.  It would be
>   good however to be able to run the libabigail testsuites using CTF.
>   However the testsuites may need some non-trivial changes in order to
>   make this possible.  Let's talk about that :)
>

We created a small test suite for regression testing, initially when we
started working on BTF so that the developer could have something
to check their progress but also to track progress on certain libabigail
issues.

There is a simple Makefile that refreshes objects and reports from C
and C++ source code. Each test case consists of a pair of either C
or C++ source files. Everything is enumerated just by globbing.

Source is compiled with GCC 10 at present and BTF information is
obtained by running pahole -J on copies of the objects.

There are Python wrappers that replicate these ABI extraction and diff
steps to check for discrepancies during continuous integration builds.

The abidiff script is a bit special in that it expects comparing .o and
.xml in all 4 combinations to result in identical outcomes. stdout
and exit status are both captured and compared. As a result, there's
been a test we haven't been able to add for a while.

We don't attempt to assert identity of abidiff of abidw XML with BTF diff
for the same files. There are format and diff algorithm implementation
differences that make this an impossibility. This may be possible for you
with CTF though and you can abidiff test all 9 combinations of DWARF,
CTF and XML inputs. You probably won't be able to assert that DWARF
-> XML and CTF -> XML give identical results as text.
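
As a rough illustration of the "all combinations must agree" check, here
is a small stand-alone C++ sketch.  It is not our driver code (that is
written in Python and lives in the Google repo); `outcome',
`all_combinations' and `all_identical' are hypothetical names, and the
sketch only models the bookkeeping, not the actual abidiff invocations.

```cpp
// Illustrative sketch only; every name here is hypothetical.
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// What a driver would capture from one abidiff run.
struct outcome
{
  int exit_status;
  std::string standard_output;

  bool
  operator== (const outcome &other) const
  {
    return exit_status == other.exit_status
           && standard_output == other.standard_output;
  }
};

// Every ordered (first input, second input) pairing of formats:
// two formats give 4 combinations, three formats give 9.
std::vector<std::pair<std::string, std::string>>
all_combinations (const std::vector<std::string> &formats)
{
  std::vector<std::pair<std::string, std::string>> result;
  for (const std::string &first : formats)
    for (const std::string &second : formats)
      result.emplace_back (first, second);
  return result;
}

// True when every run produced the same exit status and output.
bool
all_identical (const std::vector<outcome> &outcomes)
{
  for (const outcome &o : outcomes)
    if (!(o == outcomes.front ()))
      return false;
  return true;
}
```

A driver along these lines would run abidiff once per pair returned by
all_combinations, collect one outcome per run, and fail the test case
when all_identical returns false.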

The driver code and Makefile are part of the Google repo, but I'd be
happy to share all the test cases. It's probably time I organised them
a bit better anyway.

> Salud!

Regards,
Giuliano.

> ---
>  ChangeLog                |  18 +
>  configure.ac             |  32 +-
>  include/Makefile.am      |   4 +
>  include/abg-corpus.h     |   1 +
>  include/abg-ctf-reader.h |  34 ++
>  src/Makefile.am          |   4 +
>  src/abg-ctf-reader.cc    | 999 +++++++++++++++++++++++++++++++++++++++++++++++
>  tools/abidiff.cc         |  99 +++--
>  tools/abilint.cc         |  37 +-
>  9 files changed, 1183 insertions(+), 45 deletions(-)
>  create mode 100644 include/abg-ctf-reader.h
>  create mode 100644 src/abg-ctf-reader.cc
>
> diff --git a/ChangeLog b/ChangeLog
> index 30918b49..385fe067 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,21 @@
> +2021-10-11  Jose E. Marchesi  <jose.marchesi@oracle.com>
> +
> +       * configure.ac: Check for libctf.
> +       * src/abg-ctf-reader.cc: New file.
> +       * include/abg-ctf-reader.h: Likewise.
> +       * src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
> +       conditionally.
> +       * include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
> +       conditionally.
> +       * tools/abilint.cc (struct options): New option `use_ctf'.
> +       (display_usage): Documentation for --ctf.
> +       (parse_command_line): Handle --ctf.
> +       (main): Honour --ctf.
> +       * tools/abidiff.cc (struct options): New option `use_ctf'.
> +       (display_usage): Documentation for --ctf.
> +       (parse_command_line): Handle --ctf.
> +       (main): Honour --ctf.
> +
>  2021-10-04  Dodji Seketeli <dodji@redhat.com>
>
>         Update NEWS file for 2.0
> diff --git a/configure.ac b/configure.ac
> index 9e91f496..1eb85008 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -138,6 +138,13 @@ AC_ARG_ENABLE(ubsan,
>               ENABLE_UBSAN=$enableval,
>               ENABLE_UBSAN=no)
>
> +dnl check if user has enabled CTF code
> +AC_ARG_ENABLE(ctf,
> +             AS_HELP_STRING([--enable-ctf=yes|no],
> +                            [enable support of CTF files]),
> +             ENABLE_CTF=$enableval,
> +             ENABLE_CTF=no)
> +
>  dnl *************************************************
>  dnl check for dependencies
>  dnl *************************************************
> @@ -244,6 +251,24 @@ fi
>  AC_SUBST(DW_LIBS)
>  AC_SUBST([ELF_LIBS])
>
> +dnl check for libctf presence if CTF code has been enabled by command line
> +dnl argument, and then define CTF flag (to build CTF file code) if libctf is
> +dnl found on the system
> +CTF_LIBS=
> +if test x$ENABLE_CTF = xyes; then
> +  LIBCTF=
> +  AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
> +  if test x$LIBCTF = xyes; then
> +    AC_MSG_NOTICE([activating CTF code])
> +    AC_DEFINE([CTF], 1,
> +            [Defined if user enables and system has the libctf library])
> +    CTF_LIBS=-lctf
> +  else
> +    AC_MSG_NOTICE([CTF enabled but no libctf found])
> +    ENABLE_CTF=no
> +  fi
> +fi
> +
>  dnl Check for dependency: libxml
>  LIBXML2_VERSION=2.6.22
>  PKG_CHECK_MODULES(XML, libxml-2.0 >= $LIBXML2_VERSION)
> @@ -593,7 +618,7 @@ AX_VALGRIND_CHECK
>
>  dnl Set the list of libraries libabigail depends on
>
> -DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS"
> +DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS"
>  AC_SUBST(DEPS_LIBS)
>
>  if test x$ABIGAIL_DEVEL != x; then
> @@ -631,6 +656,10 @@ if test x$ENABLE_UBSAN = xyes; then
>      CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
>  fi
>
> +dnl Set a few Automake conditionals
> +
> +AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])
> +
>  dnl Set the level of C++ standard we use.
>  CXXFLAGS="$CXXFLAGS -std=$CXX_STANDARD"
>
> @@ -936,6 +965,7 @@ AC_MSG_NOTICE([
>      Enable bash completion                        : ${ENABLE_BASH_COMPLETION}
>      Enable fedabipkgdiff                           : ${ENABLE_FEDABIPKGDIFF}
>      Enable python 3                               : ${ENABLE_PYTHON3}
> +    Enable CTF front-end                           : ${ENABLE_CTF}
>      Enable running tests under Valgrind            : ${enable_valgrind}
>      Enable build with -fsanitize=address          : ${ENABLE_ASAN}
>      Enable build with -fsanitize=memory           : ${ENABLE_MSAN}
> diff --git a/include/Makefile.am b/include/Makefile.am
> index 0f3b0936..9e5e037b 100644
> --- a/include/Makefile.am
> +++ b/include/Makefile.am
> @@ -27,4 +27,8 @@ abg-viz-dot.h         \
>  abg-viz-svg.h          \
>  abg-regex.h
>
> +if CTF_READER
> +pkginclude_HEADERS += abg-ctf-reader.h
> +endif
> +
>  EXTRA_DIST = abg-version.h.in
> diff --git a/include/abg-corpus.h b/include/abg-corpus.h
> index 136c348c..652a8294 100644
> --- a/include/abg-corpus.h
> +++ b/include/abg-corpus.h
> @@ -46,6 +46,7 @@ public:
>      ARTIFICIAL_ORIGIN = 0,
>      NATIVE_XML_ORIGIN,
>      DWARF_ORIGIN,
> +    CTF_ORIGIN,
>      LINUX_KERNEL_BINARY_ORIGIN
>    };
>
> diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
> new file mode 100644
> index 00000000..07eccec6
> --- /dev/null
> +++ b/include/abg-ctf-reader.h
> @@ -0,0 +1,34 @@
> +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
> +// -*- Mode: C++ -*-
> +//
> +// Copyright (C) 2021 Oracle, Inc.
> +//
> +// Author: Jose E. Marchesi
> +
> +/// @file
> +///
> +/// This file contains the declarations of the entry points to
> +/// de-serialize an instance of @ref abigail::corpus from a file in
> +/// elf format, containing CTF information.
> +
> +#ifndef __ABG_CTF_READER_H__
> +#define __ABG_CTF_READER_H__
> +
> +#include <ostream>
> +#include "abg-corpus.h"
> +#include "abg-suppression.h"
> +
> +namespace abigail
> +{
> +namespace ctf_reader
> +{
> +
> +class read_context;
> +read_context *create_read_context (std::string elf_path,
> +                                   ir::environment *env);
> +corpus_sptr read_corpus (read_context *ctxt);
> +
> +} // end namespace ctf_reader
> +} // end namespace abigail
> +
> +#endif // ! __ABG_CTF_READER_H__
> diff --git a/src/Makefile.am b/src/Makefile.am
> index 430ce98d..b60d74cb 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -41,6 +41,10 @@ abg-symtab-reader.h                  \
>  abg-symtab-reader.cc                   \
>  $(VIZ_SOURCES)
>
> +if CTF_READER
> +libabigail_la_SOURCES += abg-ctf-reader.cc
> +endif
> +
>  libabigail_la_LIBADD = $(DEPS_LIBS)
>  libabigail_la_LDFLAGS = -lpthread -Wl,--as-needed -no-undefined
>
> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
> new file mode 100644
> index 00000000..a121b3c8
> --- /dev/null
> +++ b/src/abg-ctf-reader.cc
> @@ -0,0 +1,999 @@
> +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
> +// -*- Mode: C++ -*-
> +//
> +// Copyright (C) 2021 Oracle, Inc.
> +//
> +// Author: Jose E. Marchesi
> +
> +/// @file
> +///
> +/// This file contains the definitions of the entry points to
> +/// de-serialize an instance of @ref abigail::corpus from a file in
> +/// ELF format, containing CTF information.
> +
> +#include "config.h"
> +
> +#include <fcntl.h> /* For open(3) */
> +#include <iostream>
> +
> +#include "ctf-api.h"
> +
> +#include "abg-internal.h"
> +#include "abg-ir-priv.h"
> +#include "abg-elf-helpers.h"
> +
> +// <headers defining libabigail's API go under here>
> +ABG_BEGIN_EXPORT_DECLARATIONS
> +
> +#include "abg-ctf-reader.h"
> +#include "abg-libxml-utils.h"
> +#include "abg-reader.h"
> +#include "abg-corpus.h"
> +#include "abg-symtab-reader.h"
> +#include "abg-tools-utils.h"
> +
> +ABG_END_EXPORT_DECLARATIONS
> +// </headers defining libabigail's API>
> +
> +namespace abigail
> +{
> +namespace ctf_reader
> +{
> +
> +class read_context
> +{
> +public:
> +  /// The name of the ELF file from which the CTF archive got
> +  /// extracted.
> +  string filename;
> +
> +  /// The IR environment.
> +  ir::environment *ir_env;
> +
> +  /// The CTF archive read from FILENAME.  If an archive couldn't
> +  /// be read from the file then this is NULL.
> +  ctf_archive_t *ctfa;
> +
> +  /// A map associating CTF type ids with libabigail IR types.  This
> +  /// is used to reuse already generated types.
> +  unordered_map<ctf_id_t,type_base_sptr> types_map;
> +
> +  /// Associate a given CTF type ID with a given libabigail IR type.
> +  void add_type (ctf_id_t ctf_type, type_base_sptr type)
> +  {
> +    types_map.insert (std::make_pair (ctf_type, type));
> +  }
> +
> +  /// Lookup a given CTF type ID in the types map.
> +  ///
> +  /// @param ctf_type the type ID of the type to lookup.
> +  type_base_sptr lookup_type (ctf_id_t ctf_type)
> +  {
> +    type_base_sptr result;
> +
> +    auto search = types_map.find (ctf_type);
> +    if (search != types_map.end())
> +      result = search->second;
> +
> +    return result;
> +  }
> +
> +  /// Constructor.
> +  ///
> +  /// @param elf_path the path to the ELF file.
> +  read_context (string elf_path, ir::environment *env)
> +  {
> +    int err;
> +
> +    types_map.clear ();
> +    filename = elf_path;
> +    ir_env = env;
> +    ctfa = ctf_open (filename.c_str(),
> +                     NULL /* BFD target */, &err);
> +
> +    if (ctfa == NULL)
> +      fprintf (stderr, "cannot open %s: %s\n", filename.c_str(), ctf_errmsg (err));
> +  }
> +
> +  /// Destructor of the @ref read_context type.
> +  ~read_context ()
> +  {
> +    ctf_close (ctfa);
> +  }
> +}; // end class read_context.
> +
> +/// Forward reference, needed because several of the process_ctf_*
> +/// functions below are indirectly recursive through this call.
> +static type_base_sptr process_ctf_type (read_context *ctxt, corpus_sptr corp,
> +                                        translation_unit_sptr tunit,
> +                                        ctf_dict_t *ctf_dictionary,
> +                                        ctf_id_t ctf_type);
> +
> +/// Build and return a typedef libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the typedef.
> +
> +static typedef_decl_sptr
> +process_ctf_typedef (read_context *ctxt,
> +                     corpus_sptr corp,
> +                     translation_unit_sptr tunit,
> +                     ctf_dict_t *ctf_dictionary,
> +                     ctf_id_t ctf_type)
> +{
> +  typedef_decl_sptr result;
> +
> +  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
> +  const char *typedef_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
> +  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
> +
> +  if (!utype)
> +    {
> +      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
> +      if (!utype)
> +        return result;
> +    }
> +
> +  result.reset (new typedef_decl (typedef_name, utype, location (),
> +                                  typedef_name /* mangled_name */));
> +  return result;
> +}
> +
> +/// Build and return an integer or float type declaration libabigail
> +/// IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the type.
> +
> +static type_decl_sptr
> +process_ctf_base_type (read_context *ctxt,
> +                       corpus_sptr corp,
> +                       ctf_dict_t *ctf_dictionary,
> +                       ctf_id_t ctf_type)
> +{
> +  type_decl_sptr result;
> +
> +  ssize_t type_alignment = ctf_type_align (ctf_dictionary, ctf_type);
> +  const char *type_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
> +
> +  /* Get the type encoding and extract some useful properties of
> +     the type from it.  In case of any error, just ignore the
> +     type.  */
> +  ctf_encoding_t type_encoding;
> +  if (ctf_type_encoding (ctf_dictionary,
> +                         ctf_type,
> +                         &type_encoding))
> +    return result;
> +
> +  /* Create the IR type corresponding to the CTF type.  */
> +  if (type_encoding.cte_bits == 0
> +      && type_encoding.cte_format == CTF_INT_SIGNED)
> +    {
> +      /* This is the `void' type.  */
> +      type_base_sptr void_type = ctxt->ir_env->get_void_type ();
> +      decl_base_sptr type_declaration = get_type_declaration (void_type);
> +      result = is_type_decl (type_declaration);
> +    }
> +  else
> +    {
> +      result = lookup_basic_type (type_name, *corp);
> +      if (!result)
> +        result.reset (new type_decl (ctxt->ir_env,
> +                                     type_name,
> +                                     type_encoding.cte_bits,
> +                                     type_alignment * 8 /* in bits */,
> +                                     location (),
> +                                     type_name /* mangled_name */));
> +
> +    }
> +
> +  return result;
> +}
> +
> +/// Build and return a function type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the function type.
> +
> +static function_type_sptr
> +process_ctf_function_type (read_context *ctxt,
> +                           corpus_sptr corp,
> +                           translation_unit_sptr tunit,
> +                           ctf_dict_t *ctf_dictionary,
> +                           ctf_id_t ctf_type)
> +{
> +  function_type_sptr result;
> +
> +  /* Fetch the function type info from the CTF type.  */
> +  ctf_funcinfo_t funcinfo;
> +  ctf_func_type_info (ctf_dictionary, ctf_type, &funcinfo);
> +  int vararg_p = funcinfo.ctc_flags & CTF_FUNC_VARARG;
> +
> +  /* Take care first of the result type.  */
> +  ctf_id_t ctf_ret_type = funcinfo.ctc_return;
> +  type_base_sptr ret_type = ctxt->lookup_type (ctf_ret_type);
> +
> +  if (!ret_type)
> +    {
> +      ret_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
> +                                   ctf_ret_type);
> +      if (!ret_type)
> +        return result;
> +    }
> +
> +  /* Now process the argument types.  */
> +  int argc = funcinfo.ctc_argc;
> +  std::vector<ctf_id_t> argv (argc);
> +  if (ctf_func_type_args (ctf_dictionary, ctf_type,
> +                          argc, argv.data ()) == CTF_ERR)
> +    return result;
> +
> +  function_decl::parameters function_parms;
> +  for (int i = 0; i < argc; i++)
> +    {
> +      ctf_id_t ctf_arg_type = argv[i];
> +      type_base_sptr arg_type = ctxt->lookup_type (ctf_arg_type);
> +
> +      if (!arg_type)
> +        {
> +          arg_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
> +                                       ctf_arg_type);
> +          if (!arg_type)
> +            return result;
> +        }
> +
> +      function_decl::parameter_sptr parm
> +        (new function_decl::parameter (arg_type, "",
> +                                       location (),
> +                                       vararg_p && (i == argc - 1),
> +                                       false /* is_artificial */));
> +      function_parms.push_back (parm);
> +    }
> +
> +
> +  /* Ok now the function type itself.  */
> +  result.reset (new function_type (ret_type,
> +                                   function_parms,
> +                                   tunit->get_address_size (),
> +                                   ctf_type_align (ctf_dictionary, ctf_type)));
> +
> +  tunit->bind_function_type_life_time (result);
> +  result->set_is_artificial (true);
> +  return result;
> +}
> +
> +static void
> +process_ctf_sou_members (read_context *ctxt,
> +                         corpus_sptr corp,
> +                         translation_unit_sptr tunit,
> +                         ctf_dict_t *ctf_dictionary,
> +                         ctf_id_t ctf_type,
> +                         class_or_union_sptr sou)
> +{
> +  ssize_t member_size;
> +  ctf_next_t *member_next = NULL;
> +  const char *member_name = NULL;
> +  ctf_id_t member_ctf_type;
> +
> +  while ((member_size = ctf_member_next (ctf_dictionary, ctf_type,
> +                                         &member_next, &member_name,
> +                                         &member_ctf_type,
> +                                         CTF_MN_RECURSE)) >= 0)
> +    {
> +      ctf_membinfo_t membinfo;
> +
> +      if (ctf_member_info (ctf_dictionary,
> +                           ctf_type,
> +                           member_name,
> +                           &membinfo) == CTF_ERR)
> +        return;
> +
> +      /* Build the IR for the member's type.  */
> +      type_base_sptr member_type = ctxt->lookup_type (member_ctf_type);
> +      if (!member_type)
> +        {
> +          member_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
> +                                          member_ctf_type);
> +          if (!member_type)
> +            /* Ignore this member.  */
> +            continue;
> +        }
> +
> +      /* Create a declaration IR node for the member and add it to the
> +         struct type.  */
> +      var_decl_sptr data_member_decl (new var_decl (member_name,
> +                                                    member_type,
> +                                                    location (),
> +                                                    member_name));
> +      sou->add_data_member (data_member_decl,
> +                            public_access,
> +                            true /* is_laid_out */,
> +                            false /* is_static */,
> +                            membinfo.ctm_offset);
> +    }
> +  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
> +    fprintf (stderr, "ERROR from ctf_member_next\n");
> +}
> +
> +/// Build and return a struct type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the struct type.
> +
> +static class_decl_sptr
> +process_ctf_struct_type (read_context *ctxt,
> +                         corpus_sptr corp,
> +                         translation_unit_sptr tunit,
> +                         ctf_dict_t *ctf_dictionary,
> +                         ctf_id_t ctf_type)
> +{
> +  class_decl_sptr result;
> +  std::string struct_type_name = ctf_type_name_raw (ctf_dictionary,
> +                                                 ctf_type);
> +  bool struct_type_is_anonymous = (struct_type_name == "");
> +
> +  /* The libabigail IR encodes C struct types in `class' IR nodes.  */
> +  result.reset (new class_decl (ctxt->ir_env,
> +                                struct_type_name,
> +                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
> +                                ctf_type_align (ctf_dictionary, ctf_type) * 8,
> +                                true /* is_struct */,
> +                                location (),
> +                                decl_base::VISIBILITY_DEFAULT,
> +                                struct_type_is_anonymous));
> +  if (!result)
> +    return result;
> +
> +  /* The C type system indirectly supports loops by means of
> +     pointers to structs or unions.  Since some contained type can
> +     refer to this struct, we have to make it available in the cache
> +     at this point even if the members haven't been added to the IR
> +     node yet.  */
> +  ctxt->add_type (ctf_type, result);
> +
> +  /* Now add the struct members as specified in the CTF type description.
> +     This is C, so named types can only be defined in the global
> +     scope.  */
> +  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
> +                           result);
> +
> +  return result;
> +}
> +
> +/// Build and return a union type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the union type.
> +
> +static union_decl_sptr
> +process_ctf_union_type (read_context *ctxt,
> +                        corpus_sptr corp,
> +                        translation_unit_sptr tunit,
> +                        ctf_dict_t *ctf_dictionary,
> +                        ctf_id_t ctf_type)
> +{
> +  union_decl_sptr result;
> +  std::string union_type_name = ctf_type_name_raw (ctf_dictionary,
> +                                                   ctf_type);
> +  bool union_type_is_anonymous = (union_type_name == "");
> +
> +  /* Create the corresponding libabigail union IR node.  */
> +  result.reset (new union_decl (ctxt->ir_env,
> +                                union_type_name,
> +                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
> +                                location (),
> +                                decl_base::VISIBILITY_DEFAULT,
> +                                union_type_is_anonymous));
> +  if (!result)
> +    return result;
> +
> +  /* The C type system indirectly supports loops by means of
> +     pointers to structs or unions.  Since some contained type can
> +     refer to this union, we have to make it available in the cache
> +     at this point even if the members haven't been added to the IR
> +     node yet.  */
> +  ctxt->add_type (ctf_type, result);
> +
> +  /* Now add the union members as specified in the CTF type description.
> +     This is C, so named types can only be defined in the global
> +     scope.  */
> +  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
> +                           result);
> +
> +  return result;
> +}
> +
> +/// Build and return an array type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the array type.
> +
> +static array_type_def_sptr
> +process_ctf_array_type (read_context *ctxt,
> +                        corpus_sptr corp,
> +                        translation_unit_sptr tunit,
> +                        ctf_dict_t *ctf_dictionary,
> +                        ctf_id_t ctf_type)
> +{
> +  array_type_def_sptr result;
> +  ctf_arinfo_t ctf_ainfo;
> +
> +  /* First, get the information about the CTF array.  */
> +  if (ctf_array_info (ctf_dictionary, ctf_type, &ctf_ainfo)
> +      == CTF_ERR)
> +    return result;
> +
> +  ctf_id_t ctf_element_type = ctf_ainfo.ctr_contents;
> +  ctf_id_t ctf_index_type = ctf_ainfo.ctr_index;
> +  uint64_t nelems = ctf_ainfo.ctr_nelems;
> +
> +  /* Make sure the element type is generated.  */
> +  type_base_sptr element_type = ctxt->lookup_type (ctf_element_type);
> +  if (!element_type)
> +    {
> +      element_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_element_type);
> +      if (!element_type)
> +        return result;
> +    }
> +
> +  /* Ditto for the index type.  */
> +  type_base_sptr index_type = ctxt->lookup_type (ctf_index_type);
> +  if (!index_type)
> +    {
> +      index_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_index_type);
> +      if (!index_type)
> +        return result;
> +    }
> +
> +  /* The number of elements of the array determines the IR subranges
> +     type to build.  */
> +  array_type_def::subranges_type subranges;
> +  array_type_def::subrange_sptr subrange;
> +  array_type_def::subrange_type::bound_value lower_bound;
> +  array_type_def::subrange_type::bound_value upper_bound;
> +
> +  lower_bound.set_unsigned (0); /* CTF supports C only.  */
> +  upper_bound.set_unsigned (nelems > 0 ? nelems - 1 : 0U);
> +
> +  subrange.reset (new array_type_def::subrange_type (ctxt->ir_env,
> +                                                     "",
> +                                                     lower_bound,
> +                                                     upper_bound,
> +                                                     index_type,
> +                                                     location (),
> +                                                     translation_unit::LANG_C));
> +  if (!subrange)
> +    return result;
> +
> +  add_decl_to_scope (subrange, tunit->get_global_scope());
> +  canonicalize (subrange);
> +  subranges.push_back (subrange);
> +
> +  /* Finally build the IR for the array type and return it.  */
> +  result.reset (new array_type_def (element_type, subranges, location ()));
> +  return result;
> +}
> +
> +/// Build and return a qualified type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +
> +static type_base_sptr
> +process_ctf_qualified_type (read_context *ctxt,
> +                            corpus_sptr corp,
> +                            translation_unit_sptr tunit,
> +                            ctf_dict_t *ctf_dictionary,
> +                            ctf_id_t ctf_type)
> +{
> +  type_base_sptr result;
> +  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
> +  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
> +  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
> +
> +  if (!utype)
> +    {
> +      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
> +      if (!utype)
> +        return result;
> +    }
> +
> +  qualified_type_def::CV qualifiers = qualified_type_def::CV_NONE;
> +  if (type_kind == CTF_K_CONST)
> +    qualifiers |= qualified_type_def::CV_CONST;
> +  else if (type_kind == CTF_K_VOLATILE)
> +    qualifiers |= qualified_type_def::CV_VOLATILE;
> +  else if (type_kind == CTF_K_RESTRICT)
> +    qualifiers |= qualified_type_def::CV_RESTRICT;
> +  else
> +    ABG_ASSERT_NOT_REACHED;
> +
> +  result.reset (new qualified_type_def (utype, qualifiers, location ()));
> +  return result;
> +}
> +
> +/// Build and return a pointer type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the pointer type.
> +
> +static pointer_type_def_sptr
> +process_ctf_pointer_type (read_context *ctxt,
> +                          corpus_sptr corp,
> +                          translation_unit_sptr tunit,
> +                          ctf_dict_t *ctf_dictionary,
> +                          ctf_id_t ctf_type)
> +{
> +  pointer_type_def_sptr result;
> +  ctf_id_t ctf_target_type = ctf_type_reference (ctf_dictionary, ctf_type);
> +  type_base_sptr target_type = ctxt->lookup_type (ctf_target_type);
> +
> +  if (!target_type)
> +    {
> +      target_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
> +                                      ctf_target_type);
> +      if (!target_type)
> +        return result;
> +    }
> +
> +  result.reset (new pointer_type_def (target_type,
> +                                      ctf_type_size (ctf_dictionary, ctf_type) * 8,
> +                                      ctf_type_align (ctf_dictionary, ctf_type) * 8,
> +                                      location ()));
> +  return result;
> +}
> +
> +/// Build and return an enum type libabigail IR.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// @return a shared pointer to the IR node for the enum type.
> +
> +static enum_type_decl_sptr
> +process_ctf_enum_type (read_context *ctxt,
> +                        corpus_sptr corp,
> +                        translation_unit_sptr tunit,
> +                        ctf_dict_t *ctf_dictionary,
> +                        ctf_id_t ctf_type)
> +{
> +  enum_type_decl_sptr result;
> +
> +  /* Build a signed integral type for the type of the enumerators, aka
> +     the underlying type.  The size of the enumerators in bytes is
> +     specified in the CTF enumeration type.  */
> +  size_t utype_size_in_bits = ctf_type_size (ctf_dictionary, ctf_type) * 8;
> +  type_decl_sptr utype;
> +
> +  utype.reset (new type_decl (ctxt->ir_env,
> +                              "",
> +                              utype_size_in_bits,
> +                              utype_size_in_bits,
> +                              location ()));
> +  if (!utype)
> +    return result;
> +  utype->set_is_anonymous (true);
> +  utype->set_is_artificial (true);
> +  add_decl_to_scope (utype, tunit->get_global_scope());
> +  canonicalize (utype);
> +
> +  /* Iterate over the enum entries.  */
> +  enum_type_decl::enumerators enms;
> +  ctf_next_t *enum_next = NULL;
> +  const char *ename;
> +  int evalue;
> +
> +  while ((ename = ctf_enum_next (ctf_dictionary, ctf_type, &enum_next, &evalue)))
> +    enms.push_back (enum_type_decl::enumerator (ctxt->ir_env, ename, evalue));
> +  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
> +    {
> +      fprintf (stderr, "ERROR from ctf_enum_next\n");
> +      return result;
> +    }
> +
> +  const char *enum_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
> +  result.reset (new enum_type_decl (enum_name, location (),
> +                                    utype, enms, enum_name));
> +  return result;
> +}
> +
> +/// Add a new type declaration to the given libabigail IR corpus CORP.
> +///
> +/// @param ctxt the read context.
> +/// @param corp the libabigail IR corpus being constructed.
> +/// @param tunit the current IR translation unit.
> +/// @param ctf_dictionary the CTF dictionary being read.
> +/// @param ctf_type the CTF type ID of the source type.
> +///
> +/// Note that if @p ctf_type can't reliably be translated to the IR
> +/// then it is simply ignored.
> +///
> +/// @return a shared pointer to the IR node for the type.
> +
> +static type_base_sptr
> +process_ctf_type (read_context *ctxt,
> +                  corpus_sptr corp,
> +                  translation_unit_sptr tunit,
> +                  ctf_dict_t *ctf_dictionary,
> +                  ctf_id_t ctf_type)
> +{
> +  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
> +  type_base_sptr result;
> +
> +  switch (type_kind)
> +    {
> +    case CTF_K_INTEGER:
> +    case CTF_K_FLOAT:
> +      {
> +        type_decl_sptr type_decl
> +          = process_ctf_base_type (ctxt, corp, ctf_dictionary, ctf_type);
> +
> +        if (type_decl)
> +          {
> +            add_decl_to_scope (type_decl, tunit->get_global_scope ());
> +            result = is_type (type_decl);
> +          }
> +        break;
> +      }
> +    case CTF_K_TYPEDEF:
> +      {
> +        typedef_decl_sptr typedef_decl
> +          = process_ctf_typedef (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (typedef_decl)
> +          {
> +            add_decl_to_scope (typedef_decl, tunit->get_global_scope ());
> +            result = is_type (typedef_decl);
> +          }
> +        break;
> +      }
> +    case CTF_K_POINTER:
> +      {
> +        pointer_type_def_sptr pointer_type
> +          = process_ctf_pointer_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (pointer_type)
> +          {
> +            add_decl_to_scope (pointer_type, tunit->get_global_scope ());
> +            result = pointer_type;
> +          }
> +        break;
> +      }
> +    case CTF_K_CONST:
> +    case CTF_K_VOLATILE:
> +    case CTF_K_RESTRICT:
> +      {
> +        type_base_sptr qualified_type
> +          = process_ctf_qualified_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (qualified_type)
> +          {
> +            decl_base_sptr qualified_type_decl = get_type_declaration (qualified_type);
> +
> +            add_decl_to_scope (qualified_type_decl, tunit->get_global_scope ());
> +            result = qualified_type;
> +          }
> +        break;
> +      }
> +    case CTF_K_ARRAY:
> +      {
> +        array_type_def_sptr array_type
> +          = process_ctf_array_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (array_type)
> +          {
> +            decl_base_sptr array_type_decl = get_type_declaration (array_type);
> +
> +            add_decl_to_scope (array_type_decl, tunit->get_global_scope ());
> +            result = array_type;
> +          }
> +        break;
> +      }
> +    case CTF_K_ENUM:
> +      {
> +        enum_type_decl_sptr enum_type
> +          = process_ctf_enum_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (enum_type)
> +          {
> +            add_decl_to_scope (enum_type, tunit->get_global_scope ());
> +            result = enum_type;
> +          }
> +
> +        break;
> +      }
> +    case CTF_K_FUNCTION:
> +      {
> +        function_type_sptr function_type
> +          = process_ctf_function_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (function_type)
> +          {
> +            decl_base_sptr function_type_decl = get_type_declaration (function_type);
> +
> +            add_decl_to_scope (function_type_decl, tunit->get_global_scope ());
> +            result = function_type;
> +          }
> +        break;
> +      }
> +    case CTF_K_STRUCT:
> +      {
> +        class_decl_sptr struct_decl
> +          = process_ctf_struct_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (struct_decl)
> +          {
> +            add_decl_to_scope (struct_decl, tunit->get_global_scope ());
> +            result = is_type (struct_decl);
> +          }
> +        break;
> +      }
> +    case CTF_K_UNION:
> +      {
> +        union_decl_sptr union_decl
> +          = process_ctf_union_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
> +
> +        if (union_decl)
> +          {
> +            add_decl_to_scope (union_decl, tunit->get_global_scope ());
> +            result = is_type (union_decl);
> +          }
> +        break;
> +      }
> +    case CTF_K_UNKNOWN:
> +      /* Unknown types are simply ignored.  */
> +    default:
> +      break;
> +    }
> +
> +  if (result)
> +    {
> +      decl_base_sptr result_decl = get_type_declaration (result);
> +
> +      canonicalize (result);
> +      ctxt->add_type (ctf_type, result);
> +    }
> +  else
> +    fprintf (stderr, "NOT PROCESSED TYPE %ld\n", ctf_type);
> +
> +  return result;
> +}
> +
> +/// Process a CTF archive and create libabigail IR for the types,
> +/// variables and function declarations found in the archive.  The IR
> +/// is added to the given corpus.
> +///
> +/// @param ctxt the read context containing the CTF archive to
> +/// process.
> +/// @param corp the IR corpus to which to add the new contents.
> +
> +static void
> +process_ctf_archive (read_context *ctxt, corpus_sptr corp)
> +{
> +  /* We only have one translation unit.  */
> +  translation_unit_sptr ir_translation_unit =
> +    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);
> +  ir_translation_unit->set_language (translation_unit::LANG_C);
> +  corp->add (ir_translation_unit);
> +
> +  /* Iterate over the CTF dictionaries in the archive.  */
> +  int ctf_err;
> +  ctf_dict_t *ctf_dict;
> +  ctf_next_t *dict_next = NULL;
> +  const char *archive_name;
> +
> +  while ((ctf_dict = ctf_archive_next (ctxt->ctfa, &dict_next, &archive_name,
> +                                       0 /* skip_parent */, &ctf_err)) != NULL)
> +    {
> +      /* Iterate over the CTF types stored in this dictionary.  */
> +      ctf_id_t ctf_type;
> +      int type_flag;
> +      ctf_next_t *type_next = NULL;
> +
> +      while ((ctf_type = ctf_type_next (ctf_dict, &type_next, &type_flag,
> +                                        1 /* want_hidden */)) != CTF_ERR)
> +        {
> +          process_ctf_type (ctxt, corp, ir_translation_unit,
> +                            ctf_dict, ctf_type);
> +        }
> +      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
> +        fprintf (stderr, "ERROR from ctf_type_next\n");
> +
> +      /* Iterate over the CTF variables stored in this dictionary.  */
> +      ctf_id_t ctf_var_type;
> +      ctf_next_t *var_next = NULL;
> +      const char *var_name;
> +
> +      while ((ctf_var_type = ctf_variable_next (ctf_dict, &var_next, &var_name))
> +             != CTF_ERR)
> +        {
> +          type_base_sptr var_type = ctxt->lookup_type (ctf_var_type);
> +
> +          if (!var_type)
> +            {
> +              var_type = process_ctf_type (ctxt, corp, ir_translation_unit,
> +                                           ctf_dict, ctf_var_type);
> +              if (!var_type)
> +                /* Ignore variable if its type can't be sorted out.  */
> +                continue;
> +            }
> +
> +          var_decl_sptr var_declaration;
> +          var_declaration.reset (new var_decl (var_name,
> +                                               var_type,
> +                                               location (),
> +                                               var_name));
> +
> +          add_decl_to_scope (var_declaration,
> +                             ir_translation_unit->get_global_scope ());
> +        }
> +      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
> +        fprintf (stderr, "ERROR from ctf_variable_next\n");
> +
> +      /* Iterate over the CTF functions stored in this dictionary.  */
> +      ctf_next_t *func_next = NULL;
> +      const char *func_name = NULL;
> +      ctf_id_t ctf_sym;
> +
> +      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
> +                                         1 /* function symbols only */)) != CTF_ERR)
> +      {
> +        ctf_id_t ctf_func_type = ctf_lookup_by_name (ctf_dict, func_name);
> +        type_base_sptr func_type = ctxt->lookup_type (ctf_func_type);
> +        if (!func_type)
> +          {
> +            func_type = process_ctf_type (ctxt, corp, ir_translation_unit,
> +                                          ctf_dict, ctf_func_type);
> +            if (!func_type)
> +              /* Ignore function if its type can't be sorted out.  */
> +              continue;
> +          }
> +
> +        function_decl_sptr func_declaration;
> +        func_declaration.reset (new function_decl (func_name,
> +                                                   func_type,
> +                                                   0 /* is_inline */,
> +                                                   location ()));
> +
> +        add_decl_to_scope (func_declaration,
> +                           ir_translation_unit->get_global_scope ());
> +      }
> +      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
> +        fprintf (stderr, "ERROR from ctf_symbol_next\n");
> +    }
> +  if (ctf_err != ECTF_NEXT_END)
> +    fprintf (stderr, "ERROR from ctf_archive_next\n");
> +
> +}
> +
> +/// Slurp certain information from the ELF file described by a given
> +/// read context and install it in a libabigail corpus.
> +///
> +/// @param ctxt the read context
> +/// @param corp the libabigail corpus in which to install the info.
> +///
> +/// @return 0 if there is an error.
> +/// @return 1 otherwise.
> +
> +static int
> +slurp_elf_info (read_context *ctxt, corpus_sptr corp)
> +{
> +  /* libelf requires negotiating/setting the ELF version first.  */
> +  if (elf_version (EV_CURRENT) == EV_NONE)
> +    return 0;
> +
> +  /* Open an ELF handler.  */
> +  int elf_fd = open (ctxt->filename.c_str(), O_RDONLY);
> +  if (elf_fd == -1)
> +    return 0;
> +
> +  Elf *elf_handler = elf_begin (elf_fd, ELF_C_READ, NULL);
> +  if (elf_handler == NULL)
> +    {
> +      fprintf (stderr, "cannot open %s: %s\n",
> +               ctxt->filename.c_str(), elf_errmsg (elf_errno ()));
> +      close (elf_fd);
> +      return 0;
> +    }
> +
> +  /* Set the ELF architecture.  */
> +  GElf_Ehdr eh_mem;
> +  GElf_Ehdr *ehdr = gelf_getehdr (elf_handler, &eh_mem);
> +  corp->set_architecture_name (elf_helpers::e_machine_to_string (ehdr->e_machine));
> +
> +  /* Read the symtab from the ELF file and set it in the corpus.  */
> +  symtab_reader::symtab_sptr symtab =
> +    symtab_reader::symtab::load (elf_handler, ctxt->ir_env,
> +                                 0 /* No suppressions.  */);
> +  corp->set_symtab(symtab);
> +
> +  /* Finish the ELF handler and close the associated file.  */
> +  elf_end (elf_handler);
> +  close (elf_fd);
> +
> +  return 1;
> +}
> +
> +/// Create and return a new read context to process CTF information
> +/// from a given ELF file.
> +///
> +/// @param elf_path the path to some ELF file.
> +/// @param env a libabigail IR environment.
> +
> +read_context *
> +create_read_context (std::string elf_path, ir::environment *env)
> +{
> +  return new read_context (elf_path, env);
> +}
> +
> +/// Read the CTF information from some source described by a given
> +/// read context and process it to create a libabigail IR corpus.
> +/// Store the corpus in the same read context.
> +///
> +/// @param ctxt the read context to use.
> +/// @return a shared pointer to the read corpus.
> +
> +corpus_sptr
> +read_corpus (read_context *ctxt)
> +{
> +  corpus_sptr corp
> +    = std::make_shared<corpus> (ctxt->ir_env, ctxt->filename);
> +
> +  /* Set some properties of the corpus first.  */
> +  corp->set_origin(corpus::CTF_ORIGIN);
> +  if (!slurp_elf_info (ctxt, corp))
> +    return corp;
> +
> +  /* Get out now if no CTF debug info is found.  */
> +  if (ctxt->ctfa == NULL)
> +    return corp;
> +
> +  /* Process the CTF archive in the read context, if any.  Information
> +     about the types, variables, functions, etc contained in the
> +     archive are added to the given corpus.  */
> +  process_ctf_archive (ctxt, corp);
> +  return corp;
> +}
> +
> +} // End of namespace ctf_reader
> +} // End of namespace abigail
> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
> index 21f7ff61..db021bf4 100644
> --- a/tools/abidiff.cc
> +++ b/tools/abidiff.cc
> @@ -19,6 +19,7 @@
>  #include "abg-tools-utils.h"
>  #include "abg-reader.h"
>  #include "abg-dwarf-reader.h"
> +#include "abg-ctf-reader.h"
>
>  using std::vector;
>  using std::string;
> @@ -104,6 +105,7 @@ struct options
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>    bool                 do_debug;
>  #endif
> +  bool                 use_ctf;
>    vector<char*> di_root_paths1;
>    vector<char*> di_root_paths2;
>    vector<char**> prepared_di_root_paths1;
> @@ -144,7 +146,8 @@ struct options
>        show_impacted_interfaces(),
>        dump_diff_tree(),
>        show_stats(),
> -      do_log()
> +      do_log(),
> +      use_ctf()
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>      ,
>      do_debug()
> @@ -233,6 +236,7 @@ display_usage(const string& prog_name, ostream& out)
>      << " --dump-diff-tree  emit a debug dump of the internal diff tree to "
>      "the error output stream\n"
>      <<  " --stats  show statistics about various internal stuff\n"
> +    << "  --ctf use CTF instead of DWARF in ELF files\n"
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>      << " --debug debug the process of comparing an ABI corpus against itself"
>  #endif
> @@ -579,6 +583,8 @@ parse_command_line(int argc, char* argv[], options& opts)
>         opts.show_stats = true;
>        else if (!strcmp(argv[i], "--verbose"))
>         opts.do_log = true;
> +      else if (!strcmp(argv[i], "--ctf"))
> +        opts.use_ctf = true;
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>        else if (!strcmp(argv[i], "--debug"))
>         opts.do_debug = true;
> @@ -1150,23 +1156,35 @@ main(int argc, char* argv[])
>         case abigail::tools_utils::FILE_TYPE_ELF: // fall through
>         case abigail::tools_utils::FILE_TYPE_AR:
>           {
> -           abigail::dwarf_reader::read_context_sptr ctxt =
> -             abigail::dwarf_reader::create_read_context
> -             (opts.file1, opts.prepared_di_root_paths1,
> -              env.get(), /*read_all_types=*/opts.show_all_types,
> -              opts.linux_kernel_mode);
> -           assert(ctxt);
> -
> -           abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
> -           set_suppressions(*ctxt, opts);
> -           abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> -           c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
> -           if (!c1
> -               || (opts.fail_no_debug_info
> -                   && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> -                   && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
> -             return handle_error(c1_status, ctxt.get(),
> -                                 argv[0], opts);
> +            if (opts.use_ctf)
> +              {
> +                abigail::ctf_reader::read_context *ctxt
> +                  = abigail::ctf_reader::create_read_context (opts.file1,
> +                                                              env.get());
> +
> +                assert (ctxt);
> +                c1 = abigail::ctf_reader::read_corpus (ctxt);
> +              }
> +            else
> +              {
> +                abigail::dwarf_reader::read_context_sptr ctxt =
> +                  abigail::dwarf_reader::create_read_context
> +                  (opts.file1, opts.prepared_di_root_paths1,
> +                   env.get(), /*read_all_types=*/opts.show_all_types,
> +                   opts.linux_kernel_mode);
> +                assert(ctxt);
> +
> +                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
> +                set_suppressions(*ctxt, opts);
> +                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> +                c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
> +                if (!c1
> +                    || (opts.fail_no_debug_info
> +                        && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> +                        && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
> +                  return handle_error(c1_status, ctxt.get(),
> +                                      argv[0], opts);
> +              }
>           }
>           break;
>         case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
> @@ -1219,23 +1237,34 @@ main(int argc, char* argv[])
>         case abigail::tools_utils::FILE_TYPE_ELF: // Fall through
>         case abigail::tools_utils::FILE_TYPE_AR:
>           {
> -           abigail::dwarf_reader::read_context_sptr ctxt =
> -             abigail::dwarf_reader::create_read_context
> -             (opts.file2, opts.prepared_di_root_paths2,
> -              env.get(), /*read_all_types=*/opts.show_all_types,
> -              opts.linux_kernel_mode);
> -           assert(ctxt);
> -           abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
> -           abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> -           set_suppressions(*ctxt, opts);
> -
> -           c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
> -           if (!c2
> -               || (opts.fail_no_debug_info
> -                   && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> -                   && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
> -             return handle_error(c2_status, ctxt.get(), argv[0], opts);
> -
> +            if (opts.use_ctf)
> +              {
> +                abigail::ctf_reader::read_context *ctxt
> +                  = abigail::ctf_reader::create_read_context (opts.file2,
> +                                                              env.get());
> +
> +                assert (ctxt);
> +                c2 = abigail::ctf_reader::read_corpus (ctxt);
> +              }
> +            else
> +              {
> +                abigail::dwarf_reader::read_context_sptr ctxt =
> +                  abigail::dwarf_reader::create_read_context
> +                  (opts.file2, opts.prepared_di_root_paths2,
> +                   env.get(), /*read_all_types=*/opts.show_all_types,
> +                   opts.linux_kernel_mode);
> +                assert(ctxt);
> +                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
> +                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> +                set_suppressions(*ctxt, opts);
> +
> +                c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
> +                if (!c2
> +                    || (opts.fail_no_debug_info
> +                        && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> +                        && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
> +                  return handle_error(c2_status, ctxt.get(), argv[0], opts);
> +              }
>           }
>           break;
>         case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
> diff --git a/tools/abilint.cc b/tools/abilint.cc
> index 856f935d..b1551ea3 100644
> --- a/tools/abilint.cc
> +++ b/tools/abilint.cc
> @@ -27,6 +27,7 @@
>  #include "abg-corpus.h"
>  #include "abg-reader.h"
>  #include "abg-dwarf-reader.h"
> +#include "abg-ctf-reader.h"
>  #include "abg-writer.h"
>  #include "abg-suppression.h"
>
> @@ -67,6 +68,7 @@ struct options
>    bool                         read_tu;
>    bool                         diff;
>    bool                         noout;
> +  bool                         use_ctf;
>    std::shared_ptr<char>        di_root_path;
>    vector<string>               suppression_paths;
>    string                       headers_dir;
> @@ -77,7 +79,8 @@ struct options
>        read_from_stdin(false),
>        read_tu(false),
>        diff(false),
> -      noout(false)
> +      noout(false),
> +      use_ctf(false)
>    {}
>  };//end struct options;
>
> @@ -99,7 +102,8 @@ display_usage(const string& prog_name, ostream& out)
>      "the input and the memory model saved back to disk\n"
>      << "  --noout  do not display anything on stdout\n"
>      << "  --stdin|--  read abi-file content from stdin\n"
> -    << "  --tu  expect a single translation unit file\n";
> +    << "  --tu  expect a single translation unit file\n"
> +    << "  --ctf use CTF instead of DWARF in ELF files\n";
>  }
>
>  bool
> @@ -173,6 +177,8 @@ parse_command_line(int argc, char* argv[], options& opts)
>           opts.read_from_stdin = true;
>         else if (!strcmp(argv[i], "--tu"))
>           opts.read_tu = true;
> +        else if (!strcmp(argv[i], "--ctf"))
> +          opts.use_ctf = true;
>         else if (!strcmp(argv[i], "--diff"))
>           opts.diff = true;
>         else if (!strcmp(argv[i], "--noout"))
> @@ -338,13 +344,26 @@ main(int argc, char* argv[])
>             di_root_path = opts.di_root_path.get();
>             vector<char**> di_roots;
>             di_roots.push_back(&di_root_path);
> -           abigail::dwarf_reader::read_context_sptr ctxt =
> -             abigail::dwarf_reader::create_read_context(opts.file_path,
> -                                                        di_roots, env.get(),
> -                                                        /*load_all_types=*/false);
> -           assert(ctxt);
> -           set_suppressions(*ctxt, opts);
> -           corp = read_corpus_from_elf(*ctxt, s);
> +
> +            if (opts.use_ctf)
> +              {
> +                abigail::ctf_reader::read_context *ctxt
> +                  = abigail::ctf_reader::create_read_context (opts.file_path,
> +                                                              env.get());
> +
> +                assert (ctxt);
> +                corp = abigail::ctf_reader::read_corpus (ctxt);
> +              }
> +            else
> +              {
> +                abigail::dwarf_reader::read_context_sptr ctxt =
> +                  abigail::dwarf_reader::create_read_context(opts.file_path,
> +                                                             di_roots, env.get(),
> +                                                             /*load_all_types=*/false);
> +                assert(ctxt);
> +                set_suppressions(*ctxt, opts);
> +                corp = read_corpus_from_elf(*ctxt, s);
> +              }
>           }
>           break;
>         case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
> --
> 2.11.0
>
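
For reference, the comment in process_ctf_union_type about making the
union available in the cache before its members are processed can be
illustrated in isolation.  What follows is only a minimal standalone
sketch, not the patch's code: it uses neither libabigail nor libctf,
and Node, Table, Cache and build are hypothetical names made up for
the illustration.  Each table entry lists the IDs of the types it
refers to, much like the members of a union.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for an IR node: an ID plus the nodes it refers to.
struct Node
{
  long id;
  std::vector<std::shared_ptr<Node>> refs;
};

// Hypothetical type table: maps a type ID to the IDs it refers to.
using Table = std::map<long, std::vector<long>>;
// Hypothetical cache: maps a type ID to the node already built for it.
using Cache = std::map<long, std::shared_ptr<Node>>;

static std::shared_ptr<Node>
build (long id, const Table &table, Cache &cache)
{
  assert (table.count (id) == 1);

  // Reuse the node if this ID has been seen already.
  auto hit = cache.find (id);
  if (hit != cache.end ())
    return hit->second;

  auto node = std::make_shared<Node> ();
  node->id = id;

  // Register the node *before* visiting what it refers to, so that a
  // reference back to this ID finds the cache entry and the recursion
  // stops instead of looping forever.
  cache[id] = node;

  for (long ref : table.at (id))
    node->refs.push_back (build (ref, table, cache));

  // Note: a cycle of shared_ptr is never freed; that is acceptable
  // for a sketch but not for real code.
  return node;
}
```

With a table in which type 1 refers to type 2 and type 2 refers back
to type 1, build (1, ...) terminates and the node reached through the
two references is the very node that was started from.  If the cache
entry were added only after the loop, the same input would recurse
without end.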
Jose E. Marchesi Oct. 11, 2021, 3:11 p.m. UTC | #2
Hi Giuliano.
Thanks for the feedback.

> On Mon, 11 Oct 2021 at 09:45, Jose E. Marchesi via Libabigail
> <libabigail@sourceware.org> wrote:
>>
>> CTF (C Type Format) is a lightweight debugging format that provides
>> information about C types and the association between functions and
>> data symbols and types.  It is designed to be very compact and
>> simple.
>>
>
> It's nice to see you say "simple" here. I'm all in favour. However, a lot of
> the https://github.com/oracle/binutils-gdb/wiki/libctf-todo items look
> like they aim to reduce CTF binary size at the expense of greater
> complexity. Will everything be abstracted away in libctf?

libctf shall be able to abstract that additional complexity, yes.

I agree the balance between compactness and simplicity is tricky: it is
also a concern of mine.  However, I think that provided we don't
change or lose sight of the aim and scope of the debugging format, we
shall be OK in the end.

Consider DWARF for example.  Some people say it is not compact.  They
are very wrong: DWARF is way more compact than, say, CTF, speaking in
relative terms.  It is just that the scope of DWARF is so wide that the
amount of information it encodes for typical programs is totally
massive.

>> This patch introduces support in libabigail to extract ABI information
>> from CTF stored in ELF files.
>>
>> A few notes on this implementation:
>>
>> - The implementation is complete in terms of CTF support.  Every CTF
>>   feature is processed and handled to generate libabigail IR.  This
>>   includes basic types, typedefs, pointer, array and struct types.
>>   The CTF records of data objects (variables) and functions are also
>>   used in order to generate the corresponding libabigail IR artifacts.
>>
>> - The decoding of CTF data is done using the libctf library which is
>>   part of binutils.  In order to link with it, binutils shall be built
>>   with --enable-shared for libctf.so to become available.
>>
>> - This initial implementation is aimed to simplicity.  We have not
>>   tried to resolve any and every corner case that may require special
>>   handling.  We have observed that the DWARF front-end (which is
>>   naturally way more complex as the scope is way bigger) is plagued
>>   with hacks to handle such situations.  However, for the CTF support
>>   we prefer to proceed in a simpler and more modest way: we will
>>   handle these problems if/when we find them.  The fact that CTF only
>>   supports C (currently) certainly helps there.
>>
>> - Likewise, in this basic support we are not handling symbol
>>   suppressions or other goodies that libabigail provides.  We are new
>>   to libabigail and ABI analysis, and at this point we simply don't
>>   have a clear picture about what is most useful/relevant to support
>>   or not.  With the maintainer's blesssing, we will tackle that
>>   functionaly after this basic support is applied upstream.
>>
>> - The implementation in abg-ctf-reader.{cc,h} is pretty much
>>   self-contained.  As a result there is some duplication in terms of
>>   ELF handling with the DWARF reader, but since that logic is very
>>   simple and can be easily implemented, we don't consider this to be a
>>   big deal (for now.)  Hopefully the maintainers agree.
>>
>
> The implementation is short which is great.
>
>> - The libabigail tools assume that ELF means to always use DWARF to
>>   generate the ABI IR.  We added a new command-line option --ctf to
>>   the tools in order to make them to use the CTF debug info instead.
>>   We are definitely not sure whether this is the best user interface.
>>   In fact I would be suprised if it was ;)
>>
>> - We added support for --ctf to both abilint and abidiff.   We are not
>>   sure whether it would make sense to add support for CTF to the other
>>   tools.  Feedback welcome.
>>
>
> For ease of testing / building up a useful regression test suite, please do
> consider adding --ctf to abidw (or adding abictf?) which would give a
> CTF -> XML utility. Plain diff (rather than abilint's ABI diff) can be used to
> check for changes over time.

What about renaming abidw to something like abielf?  Then --ctf would
fit well.

>> - We are pondering about what to do in terms of testing.  We have
>>   cursory tested this implementation using abilint and abidiff.  We
>>   know we are generating IR corpus that seem to be ok.  It would be
>>   good however to be able to run the libabigail testsuites using CTF.
>>   However the testsuites may need some non-trivial changes in order to
>>   make this possible.  Let's talk about that :)
>>
>
> We created a small test suite for regression testing, initially when we
> started working on BTF so that the developer could have something
> to check their progress but also to track progress on certain libabigail
> issues.

Sounds exactly what we need :)

> There is a simple Makefile that refreshes objects and reports from C
> and C++ source code. Each test case consists of a pair of either C
> or C++ source files. Everything is enumerated just by globbing.
>
> Source is compiled with GCC 10 at present and BTF information is
> obtained by running pahole -J on copies of the objects.

Note that GCC now supports generating BTF natively.  That will become
available in released form with GCC 12.

> There are Python wrappers that replicate these ABI extraction and diff
> steps to check for discrepancies during continuous integration builds.
>
> The abidiff script is a bit special in that it expects comparing .o and
> .xml in all 4 combinations to result in identical outcomes. stdout
> and exit status are both captured and compared. As a result, there's
> been a test we haven't been able to add for a while.
>
> We don't attempt to assert identity of abidiff of abidw XML with BTF diff
> for the same files. There are format and diff algorithm implementation
> differences that make this an impossibility. This may be possible for you
> with CTF though and you can abidiff test all 9 combinations of DWARF,
> CTF and XML inputs. You probably won't be able to assert that DWARF
>  -> XML and CTF -> XML give identical results as text.
>
> The driver code and Makefile are part of the Google repo, but I'd be
> happy to share all the test cases. It's probably time I organised them
> a bit better anyway.

Where is the Google repo?
Giuliano Procida Oct. 11, 2021, 3:26 p.m. UTC | #3
Hi.

On Mon, 11 Oct 2021 at 16:12, Jose E. Marchesi <jose.marchesi@oracle.com> wrote:
>
>
> Hi Giuliano.
> Thanks for the feedback.
>
> > On Mon, 11 Oct 2021 at 09:45, Jose E. Marchesi via Libabigail
> > <libabigail@sourceware.org> wrote:
> >>
> >> CTF (C Type Format) is a lightwieght debugging format that provices
> >> information about C types and the association between functions and
> >> data symbols and types.  It is designed to be very compact and
> >> simple.
> >>
> >
> > It's nice to see you say "simple" here. I'm all in favour. However, a lot of
> > the https://github.com/oracle/binutils-gdb/wiki/libctf-todo items look
> > like they aim to reduce CTF binary size at the expense of greater
> > complexity. Will everything be abstracted away in libctf?
>
> libctf shall be able to abstract that additional complexity, yes.
>
> I agree the balance between compactness and simplicity is tricky: it is
> also a concern of mine.  However, I think that provided we don't
> change/miss the aim/scope of the debugging format, we shall be ok at the
> end.
>
> Consider DWARF for example.  Some people say it is not compact.  They
> are very wrong: DWARF is way more compact than, say, CTF, speaking in
> relative terms.  It is just that the scope of DWARF is so wide that the
> amount of information it encodes for typical programs is totally
> massive.
>
> >> This patch introduces support in libabigail to extract ABI information
> >> from CTF stored in ELF files.
> >>
> >> A few notes on this implementation:
> >>
> >> - The implementation is complete in terms of CTF support.  Every CTF
> >>   feature is processed and handled to generate libabigail IR.  This
> >>   includes basic types, typedefs, pointer, array and struct types.
> >>   The CTF record of data objects (variables) and functions are also
> >>   used in order to generate the corresponding libabigail IR artifacts.
> >>
> >> - The decoding of CTF data is done using the libctf library which is
> >>   part of binutils.  In order to link with it, binutils shall be built
> >>   with --enable-shared for libctf.so to become available.
> >>
> >> - This initial implementation is aimed to simplicity.  We have not
> >>   tried to resolve any and every corner case that may require special
> >>   handling.  We have observed that the DWARF front-end (which is
> >>   naturally way more complex as the scope is way bigger) is plagued
> >>   with hacks to handle such situations.  However, for the CTF support
> >>   we prefer to proceed in a simpler and more modest way: we will
> >>   handle these problems if/when we find them.  The fact that CTF only
> >>   supports C (currently) certainly helps there.
> >>
> >> - Likewise, in this basic support we are not handling symbol
> >>   suppressions or other goodies that libabigail provides.  We are new
> >>   to libabigail and ABI analysis, and at this point we simply don't
> >>   have a clear picture about what is most useful/relevant to support
> >>   or not.  With the maintainer's blesssing, we will tackle that
> >>   functionaly after this basic support is applied upstream.
> >>
> >> - The implementation in abg-ctf-reader.{cc,h} is pretty much
> >>   self-contained.  As a result there is some duplication in terms of
> >>   ELF handling with the DWARF reader, but since that logic is very
> >>   simple and can be easily implemented, we don't consider this to be a
> >>   big deal (for now.)  Hopefully the maintainers agree.
> >>
> >
> > The implementation is short which is great.
> >
> >> - The libabigail tools assume that ELF means to always use DWARF to
> >>   generate the ABI IR.  We added a new command-line option --ctf to
> >>   the tools in order to make them to use the CTF debug info instead.
> >>   We are definitely not sure whether this is the best user interface.
> >>   In fact I would be suprised if it was ;)
> >>
> >> - We added support for --ctf to both abilint and abidiff.   We are not
> >>   sure whether it would make sense to add support for CTF to the other
> >>   tools.  Feedback welcome.
> >>
> >
> > For ease of testing / building up a useful regression test suite, please do
> > consider adding --ctf to abidw (or adding abictf?) which would give a
> > CTF -> XML utility. Plain diff (rather than abilint's ABI diff) can be used to
> > check for changes over time.
>
> What about renaming abidw to something like abielf?  Then --ctf would
> fit well.
>
> >> - We are pondering about what to do in terms of testing.  We have
> >>   cursory tested this implementation using abilint and abidiff.  We
> >>   know we are generating IR corpus that seem to be ok.  It would be
> >>   good however to be able to run the libabigail testsuites using CTF.
> >>   However the testsuites may need some non-trivial changes in order to
> >>   make this possible.  Let's talk about that :)
> >>
> >
> > We created a small test suite for regression testing, initially when we
> > started working on BTF so that the developer could have something
> > to check their progress but also to track progress on certain libabigail
> > issues.
>
> Sounds exactly what we need :)
>
> > There is a simple Makefile that refreshes objects and reports from C
> > and C++ source code. Each test case consists of a pair of either C
> > or C++ source files. Everything is enumerated just by globbing.
> >
> > Source is compiled with GCC 10 at present and BTF information is
> > obtained by running pahole -J on copies of the objects.
>
> Note that GCC now supports generating BTF natively.  That will become
> available in released form with GCC 12.
>

Good to know. At some point we may end up with 3 different BTF test
inputs (pahole, GCC and Clang).

> > There are Python wrappers that replicate these ABI extraction and diff
> > steps to check for discrepancies during continuous integration builds.
> >
> > The abidiff script is a bit special in that it expects comparing .o and
> > .xml in all 4 combinations to result in identical outcomes. stdout
> > and exit status are both captured and compared. As a result, there's
> > been a test we haven't been able to add for a while.
> >
> > We don't attempt to assert identity of abidiff of abidw XML with BTF diff
> > for the same files. There are format and diff algorithm implementation
> > differences that make this an impossibility. This may be possible for you
> > with CTF though and you can abidiff test all 9 combinations of DWARF,
> > CTF and XML inputs. You probably won't be able to assert that DWARF
> >  -> XML and CTF -> XML give identical results as text.
> >
> > The driver code and Makefile are part of the Google repo, but I'd be
> > happy to share all the test cases. It's probably time I organised them
> > a bit better anyway.
>
> Where is the Google repo?

That's the (private) Google monorepo. To be honest, we could probably
offer up the Makefile and Python scripts, but the latter assume
Google libraries.

Giuliano.
Dodji Seketeli Oct. 27, 2021, 8:59 a.m. UTC | #4
>     Add support for the CTF debug format to libabigail.

Thanks a lot for this work!  It's appreciated!

>     CTF (C Type Format) is a lightwieght debugging format that provices
>     information about C types and the association between functions and
>     data symbols and types.  It is designed to be very compact and
>     simple.

OK.

>
>     This patch introduces support in libabigail to extract ABI information
>     from CTF stored in ELF files.
>     
>     A few notes on this implementation:
>     
>     - The implementation is complete in terms of CTF support.  Every CTF
>       feature is processed and handled to generate libabigail IR.  This
>       includes basic types, typedefs, pointer, array and struct types.
>       The CTF record of data objects (variables) and functions are also
>       used in order to generate the corresponding libabigail IR artifacts.

Right.  I have some comments/questions about some CTF constructs, so I
have asked them in my review below.

Otherwise, I find your implementation super neat, thank you for that.


>     - The decoding of CTF data is done using the libctf library which is
>       part of binutils.  In order to link with it, binutils shall be built
>       with --enable-shared for libctf.so to become available.

OK.

>     
>     - This initial implementation is aimed to simplicity.  We have not
>       tried to resolve any and every corner case that may require special
>       handling.  We have observed that the DWARF front-end (which is
>       naturally way more complex as the scope is way bigger) is plagued
>       with hacks to handle such situations.  However, for the CTF support
>       we prefer to proceed in a simpler and more modest way: we will
>       handle these problems if/when we find them.  The fact that CTF only
>       supports C (currently) certainly helps there.

I see.  I have no objection with this incremental-and-down-to-earth
approach.

>     
>     - Likewise, in this basic support we are not handling symbol
>       suppressions or other goodies that libabigail provides.  We are new
>       to libabigail and ABI analysis, and at this point we simply don't
>       have a clear picture about what is most useful/relevant to support
>       or not.  With the maintainer's blesssing, we will tackle that
>       functionaly after this basic support is applied upstream.

Sounds fair to me.

>     - The implementation in abg-ctf-reader.{cc,h} is pretty much
>       self-contained.  As a result there is some duplication in terms of
>       ELF handling with the DWARF reader, but since that logic is very
>       simple and can be easily implemented, we don't consider this to be a
>       big deal (for now.)  Hopefully the maintainers agree.

I totally agree.

>     - The libabigail tools assume that ELF means to always use DWARF to
>       generate the ABI IR.  We added a new command-line option --ctf to
>       the tools in order to make them to use the CTF debug info instead.
>       We are definitely not sure whether this is the best user interface.
>       In fact I would be suprised if it was ;)

It's OK for me, so far.  We can come up with a better way.  In theory,
we should be able to automatically detect the debuginfo format
attached to a given binary, now that we are about to support more than
one ;-) We should just make sure we can do that even when said debuginfo
is split out into a separate file.
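
A minimal sketch of what such detection could look like, assuming CTF
lives in an ELF section named ".ctf" and DWARF in ".debug_info".  The
names debug_format and guess_debug_format are hypothetical and not
part of libabigail; the caller is assumed to have collected the
section names from the binary and, if any, from its separate
debuginfo file:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enum, not part of libabigail.
enum class debug_format { none, dwarf, ctf };

// Pick a front-end from the section names found in the binary (and,
// if the debuginfo is split out, in the separate debuginfo file).
// Assumption: CTF is stored in ".ctf" and DWARF in ".debug_info".
inline debug_format
guess_debug_format(const std::vector<std::string>& section_names,
                   bool prefer_ctf = false)
{
  bool has_dwarf = false;
  bool has_ctf = false;

  for (const std::string& name : section_names)
    {
      if (name == ".debug_info")
        has_dwarf = true;
      else if (name == ".ctf")
        has_ctf = true;
    }

  // When both formats are present, DWARF stays the default unless the
  // user explicitly asked for CTF (e.g. with --ctf).
  if (has_ctf && (prefer_ctf || !has_dwarf))
    return debug_format::ctf;
  if (has_dwarf)
    return debug_format::dwarf;
  return debug_format::none;
}
```

With something like this, --ctf would only be needed to break the tie
when a binary carries both kinds of debug info.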

>     - We added support for --ctf to both abilint and abidiff.

Thanks for doing that!

Oh, one thing that is missing is to add documentation to the
doc/manuals/abi{diff,lint}.rst files for the new --ctf to these tools.
Could you please add that in a subsequent iteration of this patch?

>       We are not sure whether it would make sense to add support for CTF
>       to the other tools.  Feedback welcome.

I think it would be important to add something similar to abidw, as
that tool is important to emit an abixml representation of a given
binary.  A lot of users use the abixml representation to serialize a
representation of the

>     - We are pondering about what to do in terms of testing.  We have
>       cursory tested this implementation using abilint and abidiff.  We
>       know we are generating IR corpus that seem to be ok.  It would be
>       good however to be able to run the libabigail testsuites using CTF.
>       However the testsuites may need some non-trivial changes in order to
>       make this possible.  Let's talk about that :)

I think it wouldn't be that hard.  We can discuss that separately if
you like.  But it just boils down to adding a new test similar to
tests/test-read-dwarf.cc, that would be called tests/test-read-ctf.cc.
That test would just read a binary built with ctf support, save its
abixml representation to disk and compare it with an expected output.
That would be a great start.  I can help with that, no problem.  But
let's maybe discuss this in a separate thread, even after the initial
patch is applied.
    
Please find my review below.

> diff --git a/ChangeLog b/ChangeLog
> index 30918b49..385fe067 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,21 @@
> +2021-10-11  Jose E. Marchesi  <jose.marchesi@oracle.com>
> +
> +	* configure.ac: Check for libctf.
> +	* src/abg-ctf-reader.cc: New file.
> +	* include/abg-ctf-reader.h: Likewise.
> +	* src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
> +	conditionally.
> +	* include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
> +	conditionally.
> +	* tools/abilint.cc (struct options): New option `use_ctf'.
> +	(display_usage): Documentation for --ctf.
> +	(parse_command_line): Handle --ctf.
> +	(main): Honour --ctf.
> +	* tools/abidiff.cc (struct options): New option `use_ctf'.
> +	(display_usage): Documentation for --ctf.
> +	(parse_command_line): Handle --ctf.
> +	(main): Honour --ctf.
> +

As explained in the COMMIT-LOG-GUIDELINES file from the source code at
https://sourceware.org/git/?p=libabigail.git;a=blob_plain;f=COMMIT-LOG-GUIDELINES;hb=HEAD,
the ChangeLog is automatically updated by a script before a Libabigail
release.

The content of the ChangeLog file is taken from the commit log message,
which has to contain a "paragraph" with the information you just wrote
above.  To provide an example of how to format it, I have created a
branch ctf-branch in the Git repository at
https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/ctf-branch.
In that branch, I applied this patch and I have amended it to add this
ChangeLog entry to the commit log message.  You can see that in the
commit
https://sourceware.org/git/?p=libabigail.git;a=commit;h=71ef732ebc7c6390245e33537af9f097e4032004.


Please note that you also need to provide a line, in the commit log,
with the "Signed-off-by: " mention.  To learn more about that, please
read the CONTRIBUTING file in the source code, especially the
paragraph entitled "Sign your work".


> diff --git a/configure.ac b/configure.ac

[...]

> +    AC_DEFINE([CTF], 1,
> +	     [Defined if user enables and system has the libctf library])

To comply with the implicit convention throughout the code, I renamed
the pre-processor macro used to enable the "CTF Support Feature" into
"WITH_CTF", because all the other "feature enabling" macros are named
WITH_XXX.  This produces the hunk below, which has been committed into
the ctf-branch:

@@ -266,7 +266,7 @@ if test x$ENABLE_CTF = xyes; then
   AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
   if test x$LIBCTF = xyes; then
     AC_MSG_NOTICE([activating CTF code])
-    AC_DEFINE([CTF], 1,
+    AC_DEFINE([WITH_CTF], 1,
 	     [Defined if user enables and system has the libctf library])
     CTF_LIBS=-lctf
   else

Of course, all uses of the "CTF" macro throughout have been updated accordingly.

> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc

[...]

> +  /// Associate a given CTF type ID with a given libabigail IR type.
> +  void add_type (ctf_id_t ctf_type, type_base_sptr type)

Throughout the existing source code, there is no space between the
name of a function and the opening parenthesis.  This is based on the
standard GNU C++ coding conventions followed, for instance, in
libstdc++.  It would be appreciated if this patch could comply with
the coding conventions of the rest of the source code.

Those conventions are loosely defined in the CONTRIBUTING file under
the chapter named "Coding language and style".  There is a
.clang-format specification file that you can use to format the source
code (semi?-)automatically using the clang formatter.  I don't use it
myself as I just do the formatting manually as I write the code in
Emacs but you can ask questions about that tool on the mailing list,
should you need any help about it.

[...]

> +static typedef_decl_sptr
> +process_ctf_typedef (read_context *ctxt,
> +                     corpus_sptr corp,
> +                     translation_unit_sptr tunit,
> +                     ctf_dict_t *ctf_dictionary,
> +                     ctf_id_t ctf_type)
> +{
> +  typedef_decl_sptr result;
> +
> +  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);

How about error handling here?  I mean, what if ctf_type is not a
typedef or a type that has a reference to another type?

Just so you know, in the other readers (DWARF and abixml) the code
base uses a kind of defensive programming where errors in this kind of
situation lead to an abort of the process.  Assert macros like
ABG_ASSERT() are used to ensure that the returned ctf_utype is not an
error code, for instance.  This eases the detection of expectation
violations at runtime, making it much easier to debug the issues that
might be reported in the future.

I am not requiring this per se; just making sure you are aware of that
aspect of things.
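
To illustrate the idea, here is a self-contained sketch that does not
use libctf or libabigail.  SKETCH_ASSERT is a hypothetical stand-in
for ABG_ASSERT, MOCK_ERR plays the role of an error id such as
CTF_ERR, and mock_type_reference stands in for a query like
ctf_type_reference; all of these names are assumptions made for the
example:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <map>

// Hypothetical stand-in for ABG_ASSERT: report the violated
// expectation and abort the process.
#define SKETCH_ASSERT(cond)                                     \
  do                                                            \
    {                                                           \
      if (!(cond))                                              \
        {                                                       \
          std::fprintf(stderr, "%s:%d: assertion failed: %s\n", \
                       __FILE__, __LINE__, #cond);              \
          std::abort();                                         \
        }                                                       \
    }                                                           \
  while (false)

using mock_type_id = long;
const mock_type_id MOCK_ERR = -1; // plays the role of an error id

// Mock of a "type reference" query: returns MOCK_ERR for ids that do
// not refer to another type.
inline mock_type_id
mock_type_reference(const std::map<mock_type_id, mock_type_id>& refs,
                    mock_type_id id)
{
  auto it = refs.find(id);
  return it == refs.end() ? MOCK_ERR : it->second;
}

// The caller states its expectation explicitly instead of silently
// carrying an error id further down the pipeline.
inline mock_type_id
underlying_type_of_typedef(const std::map<mock_type_id, mock_type_id>& refs,
                           mock_type_id typedef_id)
{
  mock_type_id utype = mock_type_reference(refs, typedef_id);
  SKETCH_ASSERT(utype != MOCK_ERR);
  return utype;
}
```

The point of the pattern is that a malformed input stops the process
right at the violated expectation, rather than surfacing later as a
confusing empty or wrong IR node.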

[...]

> +  result.reset (new typedef_decl (typedef_name, utype, location (),
> +                                  typedef_name /* mangled_name */));

I noticed that you are not setting the "location" information.  That
information is quite useful for accurate diagnostics emitted by, e.g,
abidiff.  For instance:

    $ build/tools/abidiff --harmless tests/data/test-diff-dwarf/test2-v0.o tests/data/test-diff-dwarf/test2-v1.o 
    Functions changes summary: 0 Removed, 1 Changed, 0 Added function
    Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

    1 function with some indirect sub-type change:

      [C] 'function void foo(int, char)' at test2-v1.cc:5:1 has some indirect sub-type changes:
	parameter 1 of type 'int' changed:
	  entity changed from 'int' to compatible type 'typedef Int' at test2-v1.cc:1:1
	parameter 2 of type 'char' changed:
	  entity changed from 'char' to compatible type 'typedef Char' at test2-v1.cc:2:1

    $ 

The location information that you see in the form of "at
test2-v1.cc:1:1" can be useful, I believe.

So why are you not retrieving location information from CTF? (honest
question).  To be fair, I haven't found (from the documentation) how
to get source location information from CTF.  But I am pretty sure I
must be missing the information.  In any case, I think it might be
interesting to add a comment in the code about the reason.

In any case, if the information is not yet available, it's not a
problem.  The patch will go in, regardless.  When source location is
later available from CTF then this code will be amended accordingly.


Also, there is something else that might be important that you are not
doing here: supporting Naming Typedefs.

Consider this construct:

typedef enum {one, two} Number;

Here, there is an anonymous enum that is created.  There is also a
typedef that is created to "name" that anonymous enum.  In concrete
terms, the enum will be referred-to as having the name "Number".
The Libabigail IR allows setting the typedef as the "naming typedef"
of the underlying type (the enum) by calling the
decl_base::set_naming_typedef() member function on that underlying
type.

So the code would look somewhat like this:

    if (is_anonymous_type(utype)
        && (is_enum_type(utype) || is_class_or_union_type(utype)))
      // So utype is an anonymous enum, union or struct.
      // So let's consider that the typedef 'result' is
      // a naming typedef for utype.
      utype->set_naming_typedef(result);

Does that make sense?
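
For readers unfamiliar with the concept, here is a self-contained
sketch of the effect, using mock types rather than the real IR
classes.  mock_typedef, mock_type and maybe_set_naming_typedef are
made-up names for this illustration only:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Mock IR nodes; these are illustrative, not the libabigail classes.
struct mock_typedef
{
  std::string name;
};

struct mock_type
{
  std::string name;                        // empty for anonymous types
  bool is_enum_or_class_or_union = false;
  std::shared_ptr<mock_typedef> naming_typedef;

  bool is_anonymous() const { return name.empty(); }

  // Name used when the type is reported to the user.
  std::string display_name() const
  {
    if (is_anonymous() && naming_typedef)
      return naming_typedef->name;
    return is_anonymous() ? "<anonymous>" : name;
  }
};

// Record 'td' as the naming typedef of 'utype' when 'utype' is an
// anonymous enum, struct or union that has no naming typedef yet.
inline void
maybe_set_naming_typedef(mock_type& utype,
                         const std::shared_ptr<mock_typedef>& td)
{
  if (utype.is_anonymous()
      && utype.is_enum_or_class_or_union
      && !utype.naming_typedef)
    utype.naming_typedef = td;
}
```

For "typedef enum {one, two} Number;" the anonymous enum would then be
reported under the name "Number", while named types are left alone.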

[...]

> +  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
> +
> +  if (!utype)
> +    {
> +      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
> +      if (!utype)
> +        return result;
> +    }

I noticed this pattern is used a lot throughout the code.
Maybe it could be factored out into a function and used throughout the
code base as, e.g.:

    type_base_sptr utype = process_or_lookup_type(ctxt, corp, tunit,
                                                  ctf_dictionary,
                                                  ctf_utype);

What do you think?
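
A self-contained sketch of the shape such a helper could take, with a
mock context instead of the real read_context.  ir_type, mock_context
and lookup_or_process_type are hypothetical names, and process_type
merely stands in for process_ctf_type:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct ir_type
{
  std::string name;
};
using ir_type_sptr = std::shared_ptr<ir_type>;

// Mock of the reader context: a map from type id to IR type, plus a
// counter so the caching behaviour can be observed.
struct mock_context
{
  std::map<long, ir_type_sptr> types;
  int processed = 0;

  ir_type_sptr lookup_type(long id) const
  {
    auto it = types.find(id);
    return it == types.end() ? ir_type_sptr() : it->second;
  }

  // Stand-in for process_ctf_type: builds the IR type and registers
  // it.  Negative ids model a type that cannot be processed.
  ir_type_sptr process_type(long id)
  {
    if (id < 0)
      return ir_type_sptr();
    ++processed;
    ir_type_sptr t = std::make_shared<ir_type>();
    t->name = "type#" + std::to_string(id);
    types[id] = t;
    return t;
  }
};

// Return the already-built IR type for 'id', building it on first
// use.  Returns a null pointer if the type cannot be processed.
inline ir_type_sptr
lookup_or_process_type(mock_context& ctxt, long id)
{
  if (ir_type_sptr t = ctxt.lookup_type(id))
    return t;
  return ctxt.process_type(id);
}
```

Each call site then shrinks to one call plus one null check, and the
caching policy lives in a single place.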

[...]

> +static void
> +process_ctf_sou_members (read_context *ctxt,
> +                         corpus_sptr corp,
> +                         translation_unit_sptr tunit,
> +                         ctf_dict_t *ctf_dictionary,
> +                         ctf_id_t ctf_type,
> +                         class_or_union_sptr sou)
> +{

This function definition lacks a doxygen comment, even if it's "just" a
sub-routine of process_ctf_struct_type and process_ctf_union_type.

[...]

+static void
+process_ctf_archive (read_context *ctxt, corpus_sptr corp)
+{
+  /* We only have a translation unit.  */
+  translation_unit_sptr ir_translation_unit =
+    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);

Interesting.  So, CTF doesn't keep the information about the
translation unit a decl was defined in?  I guess that must be related
to the fact that I haven't seen source location information so far.

[...]

+
+      /* Iterate over the CTF functions stored in this archive.  */
+      ctf_next_t *func_next = NULL;
+      const char *func_name = NULL;
+      ctf_id_t ctf_sym;
+
+      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
+                                         1 /* functions symbols only */) != CTF_ERR))
+      {

How about function symbol visibility?  Won't ctf_symbol_next return
the symbol of a function that is not necessarily "exported"?  In that
case, because the symbol is not exported, it's related to a "private"
function and so, it doesn't have ABI visibility.  So it should be
discarded.  Just curious.
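
To make the question concrete, here is a self-contained sketch of the
kind of check meant here.  It says nothing about what ctf_symbol_next
actually returns.  The elf_sketch namespace and its names are
hypothetical; the constant values mirror the ELF gABI definitions of
STB_*, STV_* and SHN_UNDEF, redefined locally so the example compiles
without <elf.h>:

```cpp
#include <cassert>
#include <cstdint>

namespace elf_sketch
{
// Symbol bindings (high nibble of st_info).
const unsigned stb_local = 0, stb_global = 1, stb_weak = 2;
// Symbol visibilities (low two bits of st_other).
const unsigned stv_default = 0, stv_internal = 1,
               stv_hidden = 2, stv_protected = 3;
// Section index of an undefined symbol.
const std::uint16_t shn_undef = 0;

struct symbol
{
  unsigned char info;   // binding << 4 | type
  unsigned char other;  // visibility in the low two bits
  std::uint16_t shndx;  // index of the defining section
};

inline symbol
make_symbol(unsigned bind, unsigned type, unsigned vis, std::uint16_t shndx)
{
  symbol s;
  s.info = static_cast<unsigned char>((bind << 4) | (type & 0xf));
  s.other = static_cast<unsigned char>(vis & 0x3);
  s.shndx = shndx;
  return s;
}

inline unsigned binding(const symbol& s) { return s.info >> 4; }
inline unsigned visibility(const symbol& s) { return s.other & 0x3; }

// A symbol is treated as part of the ABI when it is defined in this
// binary, has global or weak binding, and default or protected
// visibility.
inline bool
is_exported(const symbol& s)
{
  const bool defined = s.shndx != shn_undef;
  const bool public_binding =
    binding(s) == stb_global || binding(s) == stb_weak;
  const bool public_visibility =
    visibility(s) == stv_default || visibility(s) == stv_protected;
  return defined && public_binding && public_visibility;
}
} // namespace elf_sketch
```

Functions whose symbol fails such a test (static functions, hidden
symbols) would be skipped when building the corpus.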

diff --git a/tools/abidiff.cc b/tools/abidiff.cc

[...]

@@ -104,6 +105,7 @@ struct options
 #ifdef WITH_DEBUG_SELF_COMPARISON
   bool			do_debug;
 #endif
+  bool			use_ctf;

I have protected this CTF support hunk with a #ifdef WITH_CTF guard,
so that configuring with --disable-ctf leads to compiling the file
just fine.  I've done so with all the CTF-specific hunks in abidiff.cc
and abilint.cc.
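
As a reduced sketch of that guard pattern (sketch_options and
wants_ctf are made-up names for this example; in the real build
WITH_CTF would come from config.h):

```cpp
#include <cassert>

// Reduced stand-in for the tools' 'struct options'.
struct sketch_options
{
  bool show_symtab = false;
#ifdef WITH_CTF
  bool use_ctf = false;   // only exists when CTF support is built in
#endif
};

// Every use of the CTF-only member sits behind the same guard, so the
// file compiles whether or not WITH_CTF is defined.
inline bool
wants_ctf(const sketch_options& opts)
{
#ifdef WITH_CTF
  return opts.use_ctf;
#else
  (void) opts;   // silence the unused-parameter warning
  return false;
#endif
}
```

Built without WITH_CTF, the member simply does not exist and
wants_ctf always returns false.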

All in all, I really like the patch.  Thanks a lot for working on
this.

I have pushed my modified version to the ctf-branch at
https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/ctf-branch.
I just amended your patch.

Your original patch is available in the ctf-branch-original at
https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/ctf-branch-original
so you can diff both to see my changes.

I am attaching my updated patch below, so that you can apply it and
maybe do your modifications from that one.  You can just add your
'Signed-off-by' to that one.

Cheers,

From bdf04ae534defec12f74bcf2a63288b0ce297e4d Mon Sep 17 00:00:00 2001
From: "Jose E. Marchesi via Libabigail" <libabigail@sourceware.org>
Date: Mon, 11 Oct 2021 10:45:09 +0200
Subject: [PATCH] Add support for the CTF debug format to libabigail.

CTF (C Type Format) is a lightweight debugging format that provides
information about C types and the association between functions and
data symbols and types.  It is designed to be very compact and
simple.  More can be learned about it at https://ctfstd.org.

This patch introduces support in libabigail to extract ABI information
from CTF stored in ELF files.

A few notes on this implementation:

- The implementation is complete in terms of CTF support.  Every CTF
  feature is processed and handled to generate libabigail IR.  This
  includes basic types, typedefs, pointer, array and struct types.
  The CTF record of data objects (variables) and functions are also
  used in order to generate the corresponding libabigail IR artifacts.

- The decoding of CTF data is done using the libctf library which is
  part of binutils.  In order to link with it, binutils shall be built
  with --enable-shared for libctf.so to become available.

- This initial implementation is aimed to simplicity.  We have not
  tried to resolve any and every corner case that may require special
  handling.  We have observed that the DWARF front-end (which is
  naturally way more complex as the scope is way bigger) is plagued
  with hacks to handle such situations.  However, for the CTF support
  we prefer to proceed in a simpler and more modest way: we will
  handle these problems if/when we find them.  The fact that CTF only
  supports C (currently) certainly helps there.

- Likewise, in this basic support we are not handling symbol
  suppressions or other goodies that libabigail provides.  We are new
  to libabigail and ABI analysis, and at this point we simply don't
  have a clear picture about what is most useful/relevant to support
  or not.  With the maintainer's blessing, we will tackle that
  functionality after this basic support is applied upstream.

- The implementation in abg-ctf-reader.{cc,h} is pretty much
  self-contained.  As a result there is some duplication in terms of
  ELF handling with the DWARF reader, but since that logic is very
  simple and can be easily implemented, we don't consider this to be a
  big deal (for now.)  Hopefully the maintainers agree.

- The libabigail tools assume that ELF means to always use DWARF to
  generate the ABI IR.  We added a new command-line option --ctf to
  the tools in order to make them to use the CTF debug info instead.
  We are definitely not sure whether this is the best user interface.
  In fact I would be surprised if it was ;)

- We added support for --ctf to both abilint and abidiff.   We are not
  sure whether it would make sense to add support for CTF to the other
  tools.  Feedback welcome.

- We are pondering about what to do in terms of testing.  We have
  cursory tested this implementation using abilint and abidiff.  We
  know we are generating IR corpus that seem to be ok.  It would be
  good however to be able to run the libabigail testsuites using CTF.
  However the testsuites may need some non-trivial changes in order to
  make this possible.  Let's talk about that :)

Salud!

	* configure.ac: Check for libctf.
	* src/abg-ctf-reader.cc: New file.
	* include/abg-ctf-reader.h: Likewise.
	* src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
	conditionally.
	* include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
	conditionally.
	* tools/abilint.cc (struct options): New option `use_ctf'.
	(display_usage): Documentation for --ctf.
	(parse_command_line): Handle --ctf.
	(main): Honour --ctf.
	* tools/abidiff.cc (struct options): New option `use_ctf'.
	(display_usage): Documentation for --ctf.
	(parse_command_line): Handle --ctf.
	(main): Honour --ctf.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 configure.ac             |  32 +-
 include/Makefile.am      |   4 +
 include/abg-corpus.h     |   1 +
 include/abg-ctf-reader.h |  34 ++
 src/Makefile.am          |   4 +
 src/abg-ctf-reader.cc    | 999 +++++++++++++++++++++++++++++++++++++++
 tools/abidiff.cc         | 110 +++--
 tools/abilint.cc         |  49 +-
 8 files changed, 1190 insertions(+), 43 deletions(-)
 create mode 100644 include/abg-ctf-reader.h
 create mode 100644 src/abg-ctf-reader.cc

diff --git a/configure.ac b/configure.ac
index 950e2704..4f835d07 100644
--- a/configure.ac
+++ b/configure.ac
@@ -144,6 +144,13 @@ AC_ARG_ENABLE(ubsan,
 	      ENABLE_UBSAN=$enableval,
 	      ENABLE_UBSAN=no)
 
+dnl check if user has enabled CTF code
+AC_ARG_ENABLE(ctf,
+	      AS_HELP_STRING([--enable-ctf=yes|no],
+			     [disable support of ctf files)]),
+	      ENABLE_CTF=$enableval,
+	      ENABLE_CTF=no)
+
 dnl *************************************************
 dnl check for dependencies
 dnl *************************************************
@@ -250,6 +257,24 @@ fi
 AC_SUBST(DW_LIBS)
 AC_SUBST([ELF_LIBS])
 
+dnl check for libctf presence if CTF code has been enabled by command line
+dnl argument, and then define CTF flag (to build CTF file code) if libctf is
+dnl found on the system
+CTF_LIBS=
+if test x$ENABLE_CTF = xyes; then
+  LIBCTF=
+  AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
+  if test x$LIBCTF = xyes; then
+    AC_MSG_NOTICE([activating CTF code])
+    AC_DEFINE([WITH_CTF], 1,
+	     [Defined if user enables and system has the libctf library])
+    CTF_LIBS=-lctf
+  else
+    AC_MSG_NOTICE([CTF enabled but no libctf found])
+    ENABLE_CTF=no
+  fi
+fi
+
 dnl Check for dependency: libxml
 LIBXML2_VERSION=2.6.22
 PKG_CHECK_MODULES(XML, libxml-2.0 >= $LIBXML2_VERSION)
@@ -611,7 +636,7 @@ AX_VALGRIND_CHECK
 
 dnl Set the list of libraries libabigail depends on
 
-DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS"
+DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS"
 AC_SUBST(DEPS_LIBS)
 
 if test x$ABIGAIL_DEVEL != x; then
@@ -649,6 +674,10 @@ if test x$ENABLE_UBSAN = xyes; then
     CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
 fi
 
+dnl Set a few Automake conditionals
+
+AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])
+
 dnl Set the level of C++ standard we use.
 CXXFLAGS="$CXXFLAGS -std=$CXX_STANDARD"
 
@@ -955,6 +984,7 @@ AC_MSG_NOTICE([
     Enable bash completion	                   : ${ENABLE_BASH_COMPLETION}
     Enable fedabipkgdiff                           : ${ENABLE_FEDABIPKGDIFF}
     Enable python 3				   : ${ENABLE_PYTHON3}
+    Enable CTF front-end                           : ${ENABLE_CTF}
     Enable running tests under Valgrind            : ${enable_valgrind}
     Enable build with -fsanitize=address    	   : ${ENABLE_ASAN}
     Enable build with -fsanitize=memory    	   : ${ENABLE_MSAN}
diff --git a/include/Makefile.am b/include/Makefile.am
index 0f3b0936..9e5e037b 100644
--- a/include/Makefile.am
+++ b/include/Makefile.am
@@ -27,4 +27,8 @@ abg-viz-dot.h		\
 abg-viz-svg.h		\
 abg-regex.h
 
+if CTF_READER
+pkginclude_HEADERS += abg-ctf-reader.h
+endif
+
 EXTRA_DIST = abg-version.h.in
diff --git a/include/abg-corpus.h b/include/abg-corpus.h
index 136c348c..652a8294 100644
--- a/include/abg-corpus.h
+++ b/include/abg-corpus.h
@@ -46,6 +46,7 @@ public:
     ARTIFICIAL_ORIGIN = 0,
     NATIVE_XML_ORIGIN,
     DWARF_ORIGIN,
+    CTF_ORIGIN,
     LINUX_KERNEL_BINARY_ORIGIN
   };
 
diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
new file mode 100644
index 00000000..07eccec6
--- /dev/null
+++ b/include/abg-ctf-reader.h
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+// -*- Mode: C++ -*-
+//
+// Copyright (C) 2021 Oracle, Inc.
+//
+// Author: Jose E. Marchesi
+
+/// @file
+///
+/// This file contains the declarations of the entry points to
+/// de-serialize an instance of @ref abigail::corpus from a file in
+/// ELF format, containing CTF information.
+
+#ifndef __ABG_CTF_READER_H__
+#define __ABG_CTF_READER_H__
+
+#include <ostream>
+#include "abg-corpus.h"
+#include "abg-suppression.h"
+
+namespace abigail
+{
+namespace ctf_reader
+{
+
+class read_context;
+read_context *create_read_context (std::string elf_path,
+                                   ir::environment *env);
+corpus_sptr read_corpus (read_context *ctxt);
+
+} // end namespace ctf_reader
+} // end namespace abigail
+
+#endif // ! __ABG_CTF_READER_H__
diff --git a/src/Makefile.am b/src/Makefile.am
index 430ce98d..b60d74cb 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -41,6 +41,10 @@ abg-symtab-reader.h			\
 abg-symtab-reader.cc			\
 $(VIZ_SOURCES)
 
+if CTF_READER
+libabigail_la_SOURCES += abg-ctf-reader.cc
+endif
+
 libabigail_la_LIBADD = $(DEPS_LIBS)
 libabigail_la_LDFLAGS = -lpthread -Wl,--as-needed -no-undefined
 
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
new file mode 100644
index 00000000..8e2d6387
--- /dev/null
+++ b/src/abg-ctf-reader.cc
@@ -0,0 +1,999 @@
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+// -*- Mode: C++ -*-
+//
+// Copyright (C) 2021 Oracle, Inc.
+//
+// Author: Jose E. Marchesi
+
+/// @file
+///
+/// This file contains the definitions of the entry points to
+/// de-serialize an instance of @ref abigail::corpus from a file in
+/// ELF format, containing CTF information.
+
+#include "config.h"
+
+#include <fcntl.h> /* For open(2) */
+#include <iostream>
+
+#include "ctf-api.h"
+
+#include "abg-internal.h"
+#include "abg-ir-priv.h"
+#include "abg-elf-helpers.h"
+
+// <headers defining libabigail's API go under here>
+ABG_BEGIN_EXPORT_DECLARATIONS
+
+#include "abg-ctf-reader.h"
+#include "abg-libxml-utils.h"
+#include "abg-reader.h"
+#include "abg-corpus.h"
+#include "abg-symtab-reader.h"
+#include "abg-tools-utils.h"
+
+ABG_END_EXPORT_DECLARATIONS
+// </headers defining libabigail's API>
+
+namespace abigail
+{
+namespace ctf_reader
+{
+
+class read_context
+{
+public:
+  /// The name of the ELF file from which the CTF archive got
+  /// extracted.
+  string filename;
+
+  /// The IR environment.
+  ir::environment *ir_env;
+
+  /// The CTF archive read from FILENAME.  If an archive couldn't
+  /// be read from the file then this is NULL.
+  ctf_archive_t *ctfa;
+
+  /// A map associating CTF type ids with libabigail IR types.  This
+  /// is used to reuse already generated types.
+  unordered_map<ctf_id_t,type_base_sptr> types_map;
+
+  /// Associate a given CTF type ID with a given libabigail IR type.
+  void add_type (ctf_id_t ctf_type, type_base_sptr type)
+  {
+    types_map.insert (std::make_pair (ctf_type, type));
+  }
+
+  /// Look up a given CTF type ID in the types map.
+  ///
+  /// @param ctf_type the type ID of the type to look up.
+  type_base_sptr lookup_type (ctf_id_t ctf_type)
+  {
+    type_base_sptr result;
+
+    auto search = types_map.find (ctf_type);
+    if (search != types_map.end())
+      result = search->second;
+
+    return result;
+  }
+
+  /// Constructor.
+  ///
+  /// @param elf_path the path to the ELF file.
+  read_context (string elf_path, ir::environment *env)
+  {
+    int err;
+
+    types_map.clear ();
+    filename = elf_path;
+    ir_env = env;
+    ctfa = ctf_open (filename.c_str(),
+                     NULL /* BFD target */, &err);
+
+    if (ctfa == NULL)
+      fprintf (stderr, "cannot open %s: %s\n", filename.c_str(), ctf_errmsg (err));
+  }
+
+  /// Destructor of the @ref read_context type.
+  ~read_context ()
+  {
+    ctf_close (ctfa);
+  }
+}; // end class read_context.
+
+/// Forward declaration, needed because several of the process_ctf_*
+/// functions below are indirectly recursive through this call.
+static type_base_sptr process_ctf_type (read_context *ctxt, corpus_sptr corp,
+                                        translation_unit_sptr tunit,
+                                        ctf_dict_t *ctf_dictionary,
+                                        ctf_id_t ctf_type);
+
+/// Build and return a typedef libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the typedef.
+
+static typedef_decl_sptr
+process_ctf_typedef (read_context *ctxt,
+                     corpus_sptr corp,
+                     translation_unit_sptr tunit,
+                     ctf_dict_t *ctf_dictionary,
+                     ctf_id_t ctf_type)
+{
+  typedef_decl_sptr result;
+
+  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
+  const char *typedef_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
+
+  if (!utype)
+    {
+      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
+      if (!utype)
+        return result;
+    }
+
+  result.reset (new typedef_decl (typedef_name, utype, location (),
+                                  typedef_name /* mangled_name */));
+  return result;
+}
+
+/// Build and return an integer or float type declaration libabigail
+/// IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the type.
+
+static type_decl_sptr
+process_ctf_base_type (read_context *ctxt,
+                       corpus_sptr corp,
+                       ctf_dict_t *ctf_dictionary,
+                       ctf_id_t ctf_type)
+{
+  type_decl_sptr result;
+
+  ssize_t type_alignment = ctf_type_align (ctf_dictionary, ctf_type);
+  const char *type_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+
+  /* Get the type encoding and extract some useful properties of
+     the type from it.  In case of any error, just ignore the
+     type.  */
+  ctf_encoding_t type_encoding;
+  if (ctf_type_encoding (ctf_dictionary,
+                         ctf_type,
+                         &type_encoding))
+    return result;
+
+  /* Create the IR type corresponding to the CTF type.  */
+  if (type_encoding.cte_bits == 0
+      && type_encoding.cte_format == CTF_INT_SIGNED)
+    {
+      /* This is the `void' type.  */
+      type_base_sptr void_type = ctxt->ir_env->get_void_type ();
+      decl_base_sptr type_declaration = get_type_declaration (void_type);
+      result = is_type_decl (type_declaration);
+    }
+  else
+    {
+      result = lookup_basic_type (type_name, *corp);
+      if (!result)
+        result.reset (new type_decl (ctxt->ir_env,
+                                     type_name,
+                                     type_encoding.cte_bits,
+                                     type_alignment * 8 /* in bits */,
+                                     location (),
+                                     type_name /* mangled_name */));
+
+    }
+
+  return result;
+}
+
+/// Build and return a function type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the function type.
+
+static function_type_sptr
+process_ctf_function_type (read_context *ctxt,
+                           corpus_sptr corp,
+                           translation_unit_sptr tunit,
+                           ctf_dict_t *ctf_dictionary,
+                           ctf_id_t ctf_type)
+{
+  function_type_sptr result;
+
+  /* Fetch the function type info from the CTF type.  */
+  ctf_funcinfo_t funcinfo;
+  ctf_func_type_info (ctf_dictionary, ctf_type, &funcinfo);
+  int vararg_p = funcinfo.ctc_flags & CTF_FUNC_VARARG;
+
+  /* Take care first of the result type.  */
+  ctf_id_t ctf_ret_type = funcinfo.ctc_return;
+  type_base_sptr ret_type = ctxt->lookup_type (ctf_ret_type);
+
+  if (!ret_type)
+    {
+      ret_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                   ctf_ret_type);
+      if (!ret_type)
+        return result;
+    }
+
+  /* Now process the argument types.  */
+  int argc = funcinfo.ctc_argc;
+  std::vector<ctf_id_t> argv (argc);
+  if (static_cast<ctf_id_t>(ctf_func_type_args(ctf_dictionary, ctf_type,
+					       argc, argv.data ())) == CTF_ERR)
+    return result;
+
+  function_decl::parameters function_parms;
+  for (int i = 0; i < argc; i++)
+    {
+      ctf_id_t ctf_arg_type = argv[i];
+      type_base_sptr arg_type = ctxt->lookup_type (ctf_arg_type);
+
+      if (!arg_type)
+        {
+          arg_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                       ctf_arg_type);
+          if (!arg_type)
+            return result;
+        }
+
+      function_decl::parameter_sptr parm
+        (new function_decl::parameter (arg_type, "",
+                                       location (),
+                                       vararg_p && (i == argc - 1),
+                                       false /* is_artificial */));
+      function_parms.push_back (parm);
+    }
+
+
+  /* Ok now the function type itself.  */
+  result.reset (new function_type (ret_type,
+                                   function_parms,
+                                   tunit->get_address_size (),
+                                   ctf_type_align (ctf_dictionary, ctf_type)));
+
+  tunit->bind_function_type_life_time (result);
+  result->set_is_artificial (true);
+  return result;
+}
+
+static void
+process_ctf_sou_members (read_context *ctxt,
+                         corpus_sptr corp,
+                         translation_unit_sptr tunit,
+                         ctf_dict_t *ctf_dictionary,
+                         ctf_id_t ctf_type,
+                         class_or_union_sptr sou)
+{
+  ssize_t member_size;
+  ctf_next_t *member_next = NULL;
+  const char *member_name = NULL;
+  ctf_id_t member_ctf_type;
+
+  while ((member_size = ctf_member_next (ctf_dictionary, ctf_type,
+                                         &member_next, &member_name,
+                                         &member_ctf_type,
+                                         CTF_MN_RECURSE)) >= 0)
+    {
+      ctf_membinfo_t membinfo;
+
+      if (static_cast<ctf_id_t>(ctf_member_info(ctf_dictionary,
+						ctf_type,
+						member_name,
+						&membinfo)) == CTF_ERR)
+        return;
+
+      /* Build the IR for the member's type.  */
+      type_base_sptr member_type = ctxt->lookup_type (member_ctf_type);
+      if (!member_type)
+        {
+          member_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                          member_ctf_type);
+          if (!member_type)
+            /* Ignore this member.  */
+            continue;
+        }
+
+      /* Create a declaration IR node for the member and add it to the
+         struct type.  */
+      var_decl_sptr data_member_decl (new var_decl (member_name,
+                                                    member_type,
+                                                    location (),
+                                                    member_name));
+      sou->add_data_member (data_member_decl,
+                            public_access,
+                            true /* is_laid_out */,
+                            false /* is_static */,
+                            membinfo.ctm_offset);
+    }
+  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
+    fprintf (stderr, "ERROR from ctf_member_next\n");
+}
+
+/// Build and return a struct type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the struct type.
+
+static class_decl_sptr
+process_ctf_struct_type (read_context *ctxt,
+                         corpus_sptr corp,
+                         translation_unit_sptr tunit,
+                         ctf_dict_t *ctf_dictionary,
+                         ctf_id_t ctf_type)
+{
+  class_decl_sptr result;
+  std::string struct_type_name = ctf_type_name_raw (ctf_dictionary,
+                                                 ctf_type);
+  bool struct_type_is_anonymous = (struct_type_name == "");
+
+  /* The libabigail IR encodes C struct types in `class' IR nodes.  */
+  result.reset (new class_decl (ctxt->ir_env,
+                                struct_type_name,
+                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                ctf_type_align (ctf_dictionary, ctf_type) * 8,
+                                true /* is_struct */,
+                                location (),
+                                decl_base::VISIBILITY_DEFAULT,
+                                struct_type_is_anonymous));
+  if (!result)
+    return result;
+
+  /* The C type system indirectly supports loops by means of
+     pointers to structs or unions.  Since some contained type can
+     refer to this struct, we have to make it available in the cache
+     at this point even if the members haven't been added to the IR
+     node yet.  */
+  ctxt->add_type (ctf_type, result);
+
+  /* Now add the struct members as specified in the CTF type description.
+     This is C, so named types can only be defined in the global
+     scope.  */
+  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
+                           result);
+
+  return result;
+}
+
+/// Build and return a union type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the union type.
+
+static union_decl_sptr
+process_ctf_union_type (read_context *ctxt,
+                        corpus_sptr corp,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  union_decl_sptr result;
+  std::string union_type_name = ctf_type_name_raw (ctf_dictionary,
+                                                   ctf_type);
+  bool union_type_is_anonymous = (union_type_name == "");
+
+  /* Create the corresponding libabigail union IR node.  */
+  result.reset (new union_decl (ctxt->ir_env,
+                                union_type_name,
+                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                location (),
+                                decl_base::VISIBILITY_DEFAULT,
+                                union_type_is_anonymous));
+  if (!result)
+    return result;
+
+  /* The C type system indirectly supports loops by means of
+     pointers to structs or unions.  Since some contained type can
+     refer to this union, we have to make it available in the cache
+     at this point even if the members haven't been added to the IR
+     node yet.  */
+  ctxt->add_type (ctf_type, result);
+
+  /* Now add the union members as specified in the CTF type description.
+     This is C, so named types can only be defined in the global
+     scope.  */
+  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
+                           result);
+
+  return result;
+}
+
+/// Build and return an array type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the array type.
+
+static array_type_def_sptr
+process_ctf_array_type (read_context *ctxt,
+                        corpus_sptr corp,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  array_type_def_sptr result;
+  ctf_arinfo_t ctf_ainfo;
+
+  /* First, get the information about the CTF array.  */
+  if (static_cast<ctf_id_t>(ctf_array_info(ctf_dictionary,
+					   ctf_type,
+					   &ctf_ainfo)) == CTF_ERR)
+    return result;
+
+  ctf_id_t ctf_element_type = ctf_ainfo.ctr_contents;
+  ctf_id_t ctf_index_type = ctf_ainfo.ctr_index;
+  uint64_t nelems = ctf_ainfo.ctr_nelems;
+
+  /* Make sure the element type is generated.  */
+  type_base_sptr element_type = ctxt->lookup_type (ctf_element_type);
+  if (!element_type)
+    {
+      element_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_element_type);
+      if (!element_type)
+        return result;
+    }
+
+  /* Ditto for the index type.  */
+  type_base_sptr index_type = ctxt->lookup_type (ctf_index_type);
+  if (!index_type)
+    {
+      index_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_index_type);
+      if (!index_type)
+        return result;
+    }
+
+  /* The number of elements of the array determines the IR subranges
+     type to build.  */
+  array_type_def::subranges_type subranges;
+  array_type_def::subrange_sptr subrange;
+  array_type_def::subrange_type::bound_value lower_bound;
+  array_type_def::subrange_type::bound_value upper_bound;
+
+  lower_bound.set_unsigned (0); /* CTF supports C only.  */
+  upper_bound.set_unsigned (nelems > 0 ? nelems - 1 : 0U);
+
+  subrange.reset (new array_type_def::subrange_type (ctxt->ir_env,
+                                                     "",
+                                                     lower_bound,
+                                                     upper_bound,
+                                                     index_type,
+                                                     location (),
+                                                     translation_unit::LANG_C));
+  if (!subrange)
+    return result;
+
+  add_decl_to_scope (subrange, tunit->get_global_scope());
+  canonicalize (subrange);
+  subranges.push_back (subrange);
+
+  /* Finally build the IR for the array type and return it.  */
+  result.reset (new array_type_def (element_type, subranges, location ()));
+  return result;
+}
+
+/// Build and return a qualified type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+
+static type_base_sptr
+process_ctf_qualified_type (read_context *ctxt,
+                            corpus_sptr corp,
+                            translation_unit_sptr tunit,
+                            ctf_dict_t *ctf_dictionary,
+                            ctf_id_t ctf_type)
+{
+  type_base_sptr result;
+  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
+  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
+  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
+
+  if (!utype)
+    {
+      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
+      if (!utype)
+        return result;
+    }
+
+  qualified_type_def::CV qualifiers = qualified_type_def::CV_NONE;
+  if (type_kind == CTF_K_CONST)
+    qualifiers |= qualified_type_def::CV_CONST;
+  else if (type_kind == CTF_K_VOLATILE)
+    qualifiers |= qualified_type_def::CV_VOLATILE;
+  else if (type_kind == CTF_K_RESTRICT)
+    qualifiers |= qualified_type_def::CV_RESTRICT;
+  else
+    ABG_ASSERT_NOT_REACHED;
+
+  result.reset (new qualified_type_def (utype, qualifiers, location ()));
+  return result;
+}
+
+/// Build and return a pointer type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the pointer type.
+
+static pointer_type_def_sptr
+process_ctf_pointer_type (read_context *ctxt,
+                          corpus_sptr corp,
+                          translation_unit_sptr tunit,
+                          ctf_dict_t *ctf_dictionary,
+                          ctf_id_t ctf_type)
+{
+  pointer_type_def_sptr result;
+  ctf_id_t ctf_target_type = ctf_type_reference (ctf_dictionary, ctf_type);
+  type_base_sptr target_type = ctxt->lookup_type (ctf_target_type);
+
+  if (!target_type)
+    {
+      target_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                      ctf_target_type);
+      if (!target_type)
+        return result;
+    }
+
+  result.reset (new pointer_type_def (target_type,
+                                      ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                      ctf_type_align (ctf_dictionary, ctf_type) * 8,
+                                      location ()));
+  return result;
+}
+
+/// Build and return an enum type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the enum type.
+
+static enum_type_decl_sptr
+process_ctf_enum_type (read_context *ctxt,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  enum_type_decl_sptr result;
+
+  /* Build a signed integral type for the type of the enumerators, aka
+     the underlying type.  The size of the enumerators in bytes is
+     specified in the CTF enumeration type.  */
+  size_t utype_size_in_bits = ctf_type_size (ctf_dictionary, ctf_type) * 8;
+  type_decl_sptr utype;
+
+  utype.reset (new type_decl (ctxt->ir_env,
+                              "",
+                              utype_size_in_bits,
+                              utype_size_in_bits,
+                              location ()));
+  if (!utype)
+    return result;
+  utype->set_is_anonymous (true);
+  utype->set_is_artificial (true);
+  add_decl_to_scope (utype, tunit->get_global_scope());
+  canonicalize (utype);
+
+  /* Iterate over the enum entries.  */
+  enum_type_decl::enumerators enms;
+  ctf_next_t *enum_next = NULL;
+  const char *ename;
+  int evalue;
+
+  while ((ename = ctf_enum_next (ctf_dictionary, ctf_type, &enum_next, &evalue)))
+    enms.push_back (enum_type_decl::enumerator (ctxt->ir_env, ename, evalue));
+  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
+    {
+      fprintf (stderr, "ERROR from ctf_enum_next\n");
+      return result;
+    }
+
+  const char *enum_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+  result.reset (new enum_type_decl (enum_name, location (),
+                                    utype, enms, enum_name));
+  return result;
+}
+
+/// Add a new type declaration to the given libabigail IR corpus CORP.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// Note that if @p ctf_type can't reliably be translated to the IR
+/// then it is simply ignored.
+///
+/// @return a shared pointer to the IR node for the type.
+
+static type_base_sptr
+process_ctf_type (read_context *ctxt,
+                  corpus_sptr corp,
+                  translation_unit_sptr tunit,
+                  ctf_dict_t *ctf_dictionary,
+                  ctf_id_t ctf_type)
+{
+  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
+  type_base_sptr result;
+
+  switch (type_kind)
+    {
+    case CTF_K_INTEGER:
+    case CTF_K_FLOAT:
+      {
+        type_decl_sptr type_decl
+          = process_ctf_base_type (ctxt, corp, ctf_dictionary, ctf_type);
+
+        if (type_decl)
+          {
+            add_decl_to_scope (type_decl, tunit->get_global_scope ());
+            result = is_type (type_decl);
+          }
+        break;
+      }
+    case CTF_K_TYPEDEF:
+      {
+        typedef_decl_sptr typedef_decl
+          = process_ctf_typedef (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (typedef_decl)
+          {
+            add_decl_to_scope (typedef_decl, tunit->get_global_scope ());
+            result = is_type (typedef_decl);
+          }
+        break;
+      }
+    case CTF_K_POINTER:
+      {
+        pointer_type_def_sptr pointer_type
+          = process_ctf_pointer_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (pointer_type)
+          {
+            add_decl_to_scope (pointer_type, tunit->get_global_scope ());
+            result = pointer_type;
+          }
+        break;
+      }
+    case CTF_K_CONST:
+    case CTF_K_VOLATILE:
+    case CTF_K_RESTRICT:
+      {
+        type_base_sptr qualified_type
+          = process_ctf_qualified_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (qualified_type)
+          {
+            decl_base_sptr qualified_type_decl = get_type_declaration (qualified_type);
+
+            add_decl_to_scope (qualified_type_decl, tunit->get_global_scope ());
+            result = qualified_type;
+          }
+        break;
+      }
+    case CTF_K_ARRAY:
+      {
+        array_type_def_sptr array_type
+          = process_ctf_array_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (array_type)
+          {
+            decl_base_sptr array_type_decl = get_type_declaration (array_type);
+
+            add_decl_to_scope (array_type_decl, tunit->get_global_scope ());
+            result = array_type;
+          }
+        break;
+      }
+    case CTF_K_ENUM:
+      {
+        enum_type_decl_sptr enum_type
+          = process_ctf_enum_type (ctxt, tunit, ctf_dictionary, ctf_type);
+
+        if (enum_type)
+          {
+            add_decl_to_scope (enum_type, tunit->get_global_scope ());
+            result = enum_type;
+          }
+
+        break;
+      }
+    case CTF_K_FUNCTION:
+      {
+        function_type_sptr function_type
+          = process_ctf_function_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (function_type)
+          {
+            decl_base_sptr function_type_decl = get_type_declaration (function_type);
+
+            add_decl_to_scope (function_type_decl, tunit->get_global_scope ());
+            result = function_type;
+          }
+        break;
+      }
+    case CTF_K_STRUCT:
+      {
+        class_decl_sptr struct_decl
+          = process_ctf_struct_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (struct_decl)
+          {
+            add_decl_to_scope (struct_decl, tunit->get_global_scope ());
+            result = is_type (struct_decl);
+          }
+        break;
+      }
+    case CTF_K_UNION:
+      {
+        union_decl_sptr union_decl
+          = process_ctf_union_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (union_decl)
+          {
+            add_decl_to_scope (union_decl, tunit->get_global_scope ());
+            result = is_type (union_decl);
+          }
+        break;
+      }
+    case CTF_K_UNKNOWN:
+      /* Unknown types are simply ignored.  */
+    default:
+      break;
+    }
+
+  if (result)
+    {
+      decl_base_sptr result_decl = get_type_declaration (result);
+
+      canonicalize (result);
+      ctxt->add_type (ctf_type, result);
+    }
+  else
+    fprintf (stderr, "NOT PROCESSED TYPE %lu\n", ctf_type);
+
+  return result;
+}
+
+/// Process a CTF archive and create libabigail IR for the types,
+/// variables and function declarations found in the archive.  The IR
+/// is added to the given corpus.
+///
+/// @param ctxt the read context containing the CTF archive to
+/// process.
+/// @param corp the IR corpus to which to add the new contents.
+
+static void
+process_ctf_archive (read_context *ctxt, corpus_sptr corp)
+{
+  /* We only have one translation unit.  */
+  translation_unit_sptr ir_translation_unit =
+    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);
+  ir_translation_unit->set_language (translation_unit::LANG_C);
+  corp->add (ir_translation_unit);
+
+  /* Iterate over the CTF dictionaries in the archive.  */
+  int ctf_err;
+  ctf_dict_t *ctf_dict;
+  ctf_next_t *dict_next = NULL;
+  const char *archive_name;
+
+  while ((ctf_dict = ctf_archive_next (ctxt->ctfa, &dict_next, &archive_name,
+                                       0 /* skip_parent */, &ctf_err)) != NULL)
+    {
+      /* Iterate over the CTF types stored in this archive.  */
+      ctf_id_t ctf_type;
+      int type_flag;
+      ctf_next_t *type_next = NULL;
+
+      while ((ctf_type = ctf_type_next (ctf_dict, &type_next, &type_flag,
+                                        1 /* want_hidden */)) != CTF_ERR)
+        {
+          process_ctf_type (ctxt, corp, ir_translation_unit,
+                            ctf_dict, ctf_type);
+        }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_type_next\n");
+
+      /* Iterate over the CTF variables stored in this archive.  */
+      ctf_id_t ctf_var_type;
+      ctf_next_t *var_next = NULL;
+      const char *var_name;
+
+      while ((ctf_var_type = ctf_variable_next (ctf_dict, &var_next, &var_name))
+             != CTF_ERR)
+        {
+          type_base_sptr var_type = ctxt->lookup_type (ctf_var_type);
+
+          if (!var_type)
+            {
+              var_type = process_ctf_type (ctxt, corp, ir_translation_unit,
+                                           ctf_dict, ctf_var_type);
+              if (!var_type)
+                /* Ignore variable if its type can't be sorted out.  */
+                continue;
+            }
+
+          var_decl_sptr var_declaration;
+          var_declaration.reset (new var_decl (var_name,
+                                               var_type,
+                                               location (),
+                                               var_name));
+
+          add_decl_to_scope (var_declaration,
+                             ir_translation_unit->get_global_scope ());
+        }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_variable_next\n");
+
+      /* Iterate over the CTF functions stored in this archive.  */
+      ctf_next_t *func_next = NULL;
+      const char *func_name = NULL;
+      ctf_id_t ctf_sym;
+
+      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
+                                         1 /* functions symbols only */)) != CTF_ERR)
+      {
+        ctf_id_t ctf_func_type = ctf_lookup_by_name (ctf_dict, func_name);
+        type_base_sptr func_type = ctxt->lookup_type (ctf_func_type);
+        if (!func_type)
+          {
+            func_type = process_ctf_type (ctxt, corp, ir_translation_unit,
+                                          ctf_dict, ctf_func_type);
+            if (!func_type)
+              /* Ignore function if its type can't be sorted out.  */
+              continue;
+          }
+
+        function_decl_sptr func_declaration;
+        func_declaration.reset (new function_decl (func_name,
+                                                   func_type,
+                                                   0 /* is_inline */,
+                                                   location ()));
+
+        add_decl_to_scope (func_declaration,
+                           ir_translation_unit->get_global_scope ());
+      }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_symbol_next\n");
+    }
+  if (ctf_err != ECTF_NEXT_END)
+    fprintf (stderr, "ERROR from ctf_archive_next\n");
+
+}
+
+/// Slurp certain information from the ELF file described by a given
+/// read context and install it in a libabigail corpus.
+///
+/// @param ctxt the read context
+/// @param corp the libabigail corpus in which to install the info.
+///
+/// @return 0 if there is an error.
+/// @return 1 otherwise.
+
+static int
+slurp_elf_info (read_context *ctxt, corpus_sptr corp)
+{
+  /* libelf requires to negotiate/set the version of ELF.  */
+  if (elf_version (EV_CURRENT) == EV_NONE)
+    return 0;
+
+  /* Open an ELF handler.  */
+  int elf_fd = open (ctxt->filename.c_str(), O_RDONLY);
+  if (elf_fd == -1)
+    return 0;
+
+  Elf *elf_handler = elf_begin (elf_fd, ELF_C_READ, NULL);
+  if (elf_handler == NULL)
+    {
+      fprintf (stderr, "cannot open %s: %s\n",
+               ctxt->filename.c_str(), elf_errmsg (elf_errno ()));
+      close (elf_fd);
+      return 0;
+    }
+
+  /* Set the ELF architecture.  */
+  GElf_Ehdr eh_mem;
+  GElf_Ehdr *ehdr = gelf_getehdr (elf_handler, &eh_mem);
+  corp->set_architecture_name (elf_helpers::e_machine_to_string (ehdr->e_machine));
+
+  /* Read the symtab from the ELF file and set it in the corpus.  */
+  symtab_reader::symtab_sptr symtab =
+    symtab_reader::symtab::load (elf_handler, ctxt->ir_env,
+                                 0 /* No suppressions.  */);
+  corp->set_symtab(symtab);
+
+  /* Finish the ELF handler and close the associated file.  */
+  elf_end (elf_handler);
+  close (elf_fd);
+
+  return 1;
+}
+
+/// Create and return a new read context to process CTF information
+/// from a given ELF file.
+///
+/// @param elf_path the path of some ELF file.
+/// @param env a libabigail IR environment.
+
+read_context *
+create_read_context (std::string elf_path, ir::environment *env)
+{
+  return new read_context (elf_path, env);
+}
+
+/// Read the CTF information from some source described by a given
+/// read context and process it to create a libabigail IR corpus.
+/// Store the corpus in the same read context.
+///
+/// @param ctxt the read context to use.
+/// @return a shared pointer to the read corpus.
+
+corpus_sptr
+read_corpus (read_context *ctxt)
+{
+  corpus_sptr corp
+    = std::make_shared<corpus> (ctxt->ir_env, ctxt->filename);
+
+  /* Set some properties of the corpus first.  */
+  corp->set_origin(corpus::CTF_ORIGIN);
+  if (!slurp_elf_info (ctxt, corp))
+    return corp;
+
+  /* Get out now if no CTF debug info is found.  */
+  if (ctxt->ctfa == NULL)
+    return corp;
+
+  /* Process the CTF archive in the read context, if any.  Information
+     about the types, variables, functions, etc contained in the
+     archive are added to the given corpus.  */
+  process_ctf_archive (ctxt, corp);
+  return corp;
+}
+
+} // End of namespace ctf_reader
+} // End of namespace abigail
diff --git a/tools/abidiff.cc b/tools/abidiff.cc
index 21f7ff61..56f57448 100644
--- a/tools/abidiff.cc
+++ b/tools/abidiff.cc
@@ -19,6 +19,9 @@
 #include "abg-tools-utils.h"
 #include "abg-reader.h"
 #include "abg-dwarf-reader.h"
+#ifdef WITH_CTF
+#include "abg-ctf-reader.h"
+#endif
 
 using std::vector;
 using std::string;
@@ -103,6 +106,9 @@ struct options
   bool			do_log;
 #ifdef WITH_DEBUG_SELF_COMPARISON
   bool			do_debug;
+#endif
+#ifdef WITH_CTF
+  bool			use_ctf;
 #endif
   vector<char*> di_root_paths1;
   vector<char*> di_root_paths2;
@@ -145,6 +151,10 @@ struct options
       dump_diff_tree(),
       show_stats(),
       do_log()
+#ifdef WITH_CTF
+    ,
+      use_ctf()
+#endif
 #ifdef WITH_DEBUG_SELF_COMPARISON
     ,
     do_debug()
@@ -233,6 +243,7 @@ display_usage(const string& prog_name, ostream& out)
     << " --dump-diff-tree  emit a debug dump of the internal diff tree to "
     "the error output stream\n"
     <<  " --stats  show statistics about various internal stuff\n"
+    << "  --ctf use CTF instead of DWARF in ELF files\n"
 #ifdef WITH_DEBUG_SELF_COMPARISON
     << " --debug debug the process of comparing an ABI corpus against itself"
 #endif
@@ -579,6 +590,10 @@ parse_command_line(int argc, char* argv[], options& opts)
 	opts.show_stats = true;
       else if (!strcmp(argv[i], "--verbose"))
 	opts.do_log = true;
+#ifdef WITH_CTF
+      else if (!strcmp(argv[i], "--ctf"))
+        opts.use_ctf = true;
+#endif
 #ifdef WITH_DEBUG_SELF_COMPARISON
       else if (!strcmp(argv[i], "--debug"))
 	opts.do_debug = true;
@@ -1150,23 +1165,37 @@ main(int argc, char* argv[])
 	case abigail::tools_utils::FILE_TYPE_ELF: // fall through
 	case abigail::tools_utils::FILE_TYPE_AR:
 	  {
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context
-	      (opts.file1, opts.prepared_di_root_paths1,
-	       env.get(), /*read_all_types=*/opts.show_all_types,
-	       opts.linux_kernel_mode);
-	    assert(ctxt);
-
-	    abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
-	    set_suppressions(*ctxt, opts);
-	    abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
-	    c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
-	    if (!c1
-		|| (opts.fail_no_debug_info
-		    && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
-		    && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
-	      return handle_error(c1_status, ctxt.get(),
-				  argv[0], opts);
+#ifdef WITH_CTF
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file1,
+                                                              env.get());
+
+                assert (ctxt);
+                c1 = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+#endif
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context
+                  (opts.file1, opts.prepared_di_root_paths1,
+                   env.get(), /*read_all_types=*/opts.show_all_types,
+                   opts.linux_kernel_mode);
+                assert(ctxt);
+
+                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
+                set_suppressions(*ctxt, opts);
+                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
+                c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
+                if (!c1
+                    || (opts.fail_no_debug_info
+                        && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
+                        && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
+                  return handle_error(c1_status, ctxt.get(),
+                                      argv[0], opts);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
@@ -1219,23 +1248,36 @@ main(int argc, char* argv[])
 	case abigail::tools_utils::FILE_TYPE_ELF: // Fall through
 	case abigail::tools_utils::FILE_TYPE_AR:
 	  {
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context
-	      (opts.file2, opts.prepared_di_root_paths2,
-	       env.get(), /*read_all_types=*/opts.show_all_types,
-	       opts.linux_kernel_mode);
-	    assert(ctxt);
-	    abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
-	    abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
-	    set_suppressions(*ctxt, opts);
-
-	    c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
-	    if (!c2
-		|| (opts.fail_no_debug_info
-		    && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
-		    && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
-	      return handle_error(c2_status, ctxt.get(), argv[0], opts);
-
+#ifdef WITH_CTF
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file2,
+                                                              env.get());
+
+                assert (ctxt);
+                c2 = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+#endif
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context
+                  (opts.file2, opts.prepared_di_root_paths2,
+                   env.get(), /*read_all_types=*/opts.show_all_types,
+                   opts.linux_kernel_mode);
+                assert(ctxt);
+                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
+                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
+                set_suppressions(*ctxt, opts);
+
+                c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
+                if (!c2
+                    || (opts.fail_no_debug_info
+                        && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
+                        && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
+                  return handle_error(c2_status, ctxt.get(), argv[0], opts);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
diff --git a/tools/abilint.cc b/tools/abilint.cc
index 856f935d..8f354086 100644
--- a/tools/abilint.cc
+++ b/tools/abilint.cc
@@ -27,6 +27,9 @@
 #include "abg-corpus.h"
 #include "abg-reader.h"
 #include "abg-dwarf-reader.h"
+#ifdef WITH_CTF
+#include "abg-ctf-reader.h"
+#endif
 #include "abg-writer.h"
 #include "abg-suppression.h"
 
@@ -67,6 +70,9 @@ struct options
   bool				read_tu;
   bool				diff;
   bool				noout;
+#ifdef WITH_CTF
+  bool				use_ctf;
+#endif
   std::shared_ptr<char>	di_root_path;
   vector<string>		suppression_paths;
   string			headers_dir;
@@ -78,6 +84,10 @@ struct options
       read_tu(false),
       diff(false),
       noout(false)
+#ifdef WITH_CTF
+    ,
+      use_ctf(false)
+#endif
   {}
 };//end struct options;
 
@@ -99,7 +109,11 @@ display_usage(const string& prog_name, ostream& out)
     "the input and the memory model saved back to disk\n"
     << "  --noout  do not display anything on stdout\n"
     << "  --stdin|--  read abi-file content from stdin\n"
-    << "  --tu  expect a single translation unit file\n";
+    << "  --tu  expect a single translation unit file\n"
+#ifdef WITH_CTF
+    << "  --ctf use CTF instead of DWARF in ELF files\n"
+#endif
+    ;
 }
 
 bool
@@ -173,6 +187,10 @@ parse_command_line(int argc, char* argv[], options& opts)
 	  opts.read_from_stdin = true;
 	else if (!strcmp(argv[i], "--tu"))
 	  opts.read_tu = true;
+#ifdef WITH_CTF
+        else if (!strcmp(argv[i], "--ctf"))
+          opts.use_ctf = true;
+#endif
 	else if (!strcmp(argv[i], "--diff"))
 	  opts.diff = true;
 	else if (!strcmp(argv[i], "--noout"))
@@ -338,13 +356,28 @@ main(int argc, char* argv[])
 	    di_root_path = opts.di_root_path.get();
 	    vector<char**> di_roots;
 	    di_roots.push_back(&di_root_path);
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context(opts.file_path,
-							 di_roots, env.get(),
-							 /*load_all_types=*/false);
-	    assert(ctxt);
-	    set_suppressions(*ctxt, opts);
-	    corp = read_corpus_from_elf(*ctxt, s);
+
+#ifdef WITH_CTF
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file_path,
+                                                              env.get());
+
+                assert (ctxt);
+                corp = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+#endif
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context(opts.file_path,
+                                                             di_roots, env.get(),
+                                                             /*load_all_types=*/false);
+                assert(ctxt);
+                set_suppressions(*ctxt, opts);
+                corp = read_corpus_from_elf(*ctxt, s);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
Dodji Seketeli Oct. 27, 2021, 9:19 a.m. UTC | #5
[...]

"Jose E. Marchesi via Libabigail" <libabigail@sourceware.org> writes:

>>> - The libabigail tools assume that ELF means to always use DWARF to
>>>   generate the ABI IR.  We added a new command-line option --ctf to
>>>   the tools in order to make them to use the CTF debug info instead.
>>>   We are definitely not sure whether this is the best user interface.
>>>   In fact I would be surprised if it was ;)
>>>
>>> - We added support for --ctf to both abilint and abidiff.   We are not
>>>   sure whether it would make sense to add support for CTF to the other
>>>   tools.  Feedback welcome.

[...]

Giuliano Procida <gprocida@google.com> writes:

>> For ease of testing / building up a useful regression test suite, please do
>> consider adding --ctf to abidw (or adding abictf?) which would give a
>> CTF -> XML utility. Plain diff (rather than abilint's ABI diff) can be used to
>> check for changes over time.

"Jose E. Marchesi" <jose.marchesi@oracle.com> writes:

> What about renaming abidw to something like abielf?  Then --ctf would
> fit well.

abidw was also a word play for "abi data write".  So we can just leave it
as it is and add --ctf to it.  But if you feel strongly about it, we can
indeed change the name and provide an 'abidw' symlink for compatibility
purposes.  The new name could be 'abisrl' for 'abi serialize', or
something.

But we can discuss this later when the feature is in place.  For now, I
think the most important thing to do is to add the --ctf option to
abidw.

Ultimately, I think we can write some magic to detect the debuginfo
format associated with the binary and use it.  There will be cases where
it's hard to determine that.  For those cases, the --ctf or --dwarf (to
be added) options would be useful.

I hope this is helpful.

[...]

Cheers,
Jose E. Marchesi Oct. 27, 2021, 4:06 p.m. UTC | #6
Hi Dodji.

>>     - The libabigail tools assume that ELF means to always use DWARF to
>>       generate the ABI IR.  We added a new command-line option --ctf to
>>       the tools in order to make them to use the CTF debug info instead.
>>       We are definitely not sure whether this is the best user interface.
>>       In fact I would be surprised if it was ;)
>
> It's OK for me, so far.  We can come up with a better way.  In theory,
> we should be able to automatically detect the debuginfo format
> attached to a given binary, now that we are about to support more than one
> ;-) We should just make sure we can do that even when said debuginfo
> is split out into a separate file.

Right.  If both formats (DWARF and CTF) are present in a given binary
DWARF must have precedence, since it is capable of encoding more
information than CTF.
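
The precedence rule above can be sketched as follows.  This is only an
illustration under stated assumptions: debug_format and pick_debug_format
are hypothetical names, not libabigail API, and detection is reduced to
looking for the conventional .debug_info and .ctf section names in a list
that the caller has already collected (from the binary itself or from its
separate debuginfo file).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enum; not part of libabigail.
enum class debug_format { none, ctf, dwarf };

// Pick the debug format to read, given the names of the sections found
// in a binary (or in its separate debuginfo file).  DWARF wins when
// both formats are present.
debug_format
pick_debug_format(const std::vector<std::string>& section_names)
{
  bool has_dwarf = false, has_ctf = false;
  for (const std::string& name : section_names)
    {
      if (name == ".debug_info")
        has_dwarf = true;
      else if (name == ".ctf")
        has_ctf = true;
    }
  if (has_dwarf)
    return debug_format::dwarf;
  if (has_ctf)
    return debug_format::ctf;
  return debug_format::none;
}
```

An explicit --ctf (or a future --dwarf) option would simply bypass this
choice for the cases where detection is ambiguous.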

>
>>     - We added support for --ctf to both abilint and abidiff.
>
> Thanks for doing that!
>
> Oh, one thing that is missing is to add documentation to the
> doc/manuals/abi{diff,lint}.rst files for the new --ctf to these tools.
> Could you please add that in a subsequent iteration of this patch?

Will do for V2.

>>       We are not sure whether it would make sense to add support for CTF
>>       to the other tools.  Feedback welcome.
>
> I think it would be important to add something similar to abidw, as
> that tool is important to emit an abixml representation of a given
> binary.  A lot of users use the abixml representation to serialize a
> representation of the

I will add CTF support for abidw in V2.

>>     - We are pondering about what to do in terms of testing.  We have
>>       cursory tested this implementation using abilint and abidiff.  We
>>       know we are generating IR corpus that seem to be ok.  It would be
>>       good however to be able to run the libabigail testsuites using CTF.
>>       However the testsuites may need some non-trivial changes in order to
>>       make this possible.  Let's talk about that :)
>
> I think it wouldn't be that hard.  We can discuss that separately if
> you like.  But it just boils down to adding a new test similar to
> tests/test-read-dwarf.cc, that would be called tests/test-read-ctf.cc.
> That test would just read a binary built with ctf support, save its
> abixml representation to disk and compare it with an expected output.
> That would be a great start.  I can help with that, no problem.  But
> let's maybe discuss this in a separate thread, even after the initial
> patch is applied.

I asked Guillermo Rodriguez to look at that.  He is subscribed to this
list.
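
The comparison step of such a test (emitted abixml against an expected
output) could look roughly like the sketch below.  It is an assumption
about shape only: first_mismatch is a hypothetical helper, not code from
tests/test-read-dwarf.cc, and it works on generic text streams rather
than on libabigail's own test harness.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Compare two text streams line by line.  Return 0 when they match,
// otherwise the 1-based number of the first line that differs (a
// missing line counts as a difference).
unsigned
first_mismatch(std::istream& actual, std::istream& expected)
{
  std::string a, e;
  for (unsigned line = 1;; ++line)
    {
      bool got_a = static_cast<bool>(std::getline(actual, a));
      bool got_e = static_cast<bool>(std::getline(expected, e));
      if (!got_a && !got_e)
        return 0;
      if (got_a != got_e || a != e)
        return line;
    }
}
```

In a real test the two streams would be std::ifstream objects opened on
the freshly written abixml file and on the reference file checked into
the test data directory.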

>> diff --git a/ChangeLog b/ChangeLog
>> index 30918b49..385fe067 100644
>> --- a/ChangeLog
>> +++ b/ChangeLog
>> @@ -1,3 +1,21 @@
>> +2021-10-11  Jose E. Marchesi  <jose.marchesi@oracle.com>
>> +
>> +	* configure.ac: Check for libctf.
>> +	* src/abg-ctf-reader.cc: New file.
>> +	* include/abg-ctf-reader.h: Likewise.
>> +	* src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
>> +	conditionally.
>> +	* include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
>> +	conditionally.
>> +	* tools/abilint.cc (struct options): New option `use_ctf'.
>> +	(display_usage): Documentation for --ctf.
>> +	(parse_command_line): Handle --ctf.
>> +	(main): Honour --ctf.
>> +	* tools/abidiff.cc (struct options): New option `use_ctf'.
>> +	(display_usage): Documentation for --ctf.
>> +	(parse_command_line): Handle --ctf.
>> +	(main): Honour --ctf.
>> +
>
> As explained in the COMMIT-LOG-GUIDELINES file from the source code at
> https://sourceware.org/git/?p=libabigail.git;a=blob_plain;f=COMMIT-LOG-GUIDELINES;hb=HEAD,
> the ChangeLog is automatically updated by a script before a Libabigail
> release.

Ok... I will DTRT.
Will also add a Signed-off-by in V2.

>
>> diff --git a/configure.ac b/configure.ac
>
> [...]
>
>> +    AC_DEFINE([CTF], 1,
>> +	     [Defined if user enables and system has the libctf library])
>
> To comply with the implicit convention throughout the code, I renamed
> the pre-processor macro used to enable the "CTF Support Feature" into
> "WITH_CTF", because all the other "feature enabling" macros are named
> WITH_XXX.  This produces the hunk below, that has been committed into
> the ctf-branch:
>
> @@ -266,7 +266,7 @@ if test x$ENABLE_CTF = xyes; then
>    AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
>    if test x$LIBCTF = xyes; then
>      AC_MSG_NOTICE([activating CTF code])
> -    AC_DEFINE([CTF], 1,
> +    AC_DEFINE([WITH_CTF], 1,
>  	     [Defined if user enables and system has the libctf library])
>      CTF_LIBS=-lctf
>    else
>
> Of course, all uses of the "CTF" macro throughout have been updated accordingly.

Sorry, somehow I overlooked that.  Thanks for fixing it.

>> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
>
> [...]
>
>> +  /// Associate a given CTF type ID with a given libabigail IR type.
>> +  void add_type (ctf_id_t ctf_type, type_base_sptr type)
>
> Throughout the existing source code, there is no space between the
> name of a function and the opening parenthesis.  This is based on the
> standard GNU C++ coding conventions followed, for instance, in
> libstdc++.  It would be appreciated if this patch could comply with
> the coding conventions of the rest of the source code.
>
> Those conventions are loosely defined in the CONTRIBUTING file under
> the chapter named "Coding language and style".  There is a
> .clang-format specification file that you can use to format the source
> code (semi?-)automatically using the clang formatter.  I don't use it
> myself as I just do the formatting manually as I write the code in
> Emacs but you can ask questions about that tool on the mailing list,
> should you need any help about it.

Ok I will remove spaces between function names and opening parenthesis.

> [...]
>
>> +static typedef_decl_sptr
>> +process_ctf_typedef (read_context *ctxt,
>> +                     corpus_sptr corp,
>> +                     translation_unit_sptr tunit,
>> +                     ctf_dict_t *ctf_dictionary,
>> +                     ctf_id_t ctf_type)
>> +{
>> +  typedef_decl_sptr result;
>> +
>> +  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
>
> How about error handling here?  I mean, what if ctf_type is not a
> typedef or a type that has a reference to another type?

Hmmm, yes this code relies on CTF_TYPE to be a typedef type with a
reference to another type.

The first condition is guaranteed by the caller at the moment, which is
bad coupling. (bad bad jemarch.)

Will fix for V2.

>> +  result.reset (new typedef_decl (typedef_name, utype, location (),
>> +                                  typedef_name /* mangled_name */));
>
> I noticed that you are not setting the "location" information.  That
> information is quite useful for accurate diagnostics emitted by, e.g,
> abidiff.  For instance:
>
>     $ build/tools/abidiff --harmless tests/data/test-diff-dwarf/test2-v0.o tests/data/test-diff-dwarf/test2-v1.o 
>     Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>     Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
>     1 function with some indirect sub-type change:
>
>       [C] 'function void foo(int, char)' at test2-v1.cc:5:1 has some indirect sub-type changes:
> 	parameter 1 of type 'int' changed:
> 	  entity changed from 'int' to compatible type 'typedef Int' at test2-v1.cc:1:1
> 	parameter 2 of type 'char' changed:
> 	  entity changed from 'char' to compatible type 'typedef Char' at test2-v1.cc:2:1
>
>     $ 
>
> The location information that you see in the form of "at
> test2-v1.cc:1:1" can be useful, I believe.
>
> So why are you not retrieving location information from CTF? (honest
> question).  To be fair, I haven't found (from the documentation) how
> to get source location information from CTF.  But I am pretty sure I
> must be missing the information.  In any case, I think it might be
> interesting to add a comment in the code about the reason.
>
> In any case, if the information is not yet available, it's not a
> problem.  The patch will go in, regardless.  When source location is
> later available from CTF then this code will be amended accordingly.

CTF doesn't support location information.

> Also, there is something else that might be important that you are not
> doing here: supporting Naming Typedefs.
>
> Consider this construct:
>
> typedef enum {one, two} Number;
>
> Here, there is an anonymous enum that is created.  There is also a
> typedef that is created to "name" that anonymous enum.  In concrete
> terms, the enum will be referred-to as having the name "Number".
> Libabigail IR allows to set the typedef as being the "naming typedef"
> of the underlying type using the
> decl_base::set_naming_typedef() member function, on the underlying
> type (the enum).
>
> So the code would somewhat look like this:
>
>     if (is_anonymous_type(utype)
> 	&& (is_enum_type(utype) || is_class_or_union_type(utype)))
>       // So utype is an anonymous enum, union or struct.
>       // So let's consider that the typedef 'result' is 
>       // a naming typedef for utype.
>       utype->set_naming_typedef(result);
>
> Does that make sense?

Oook, from the snippet above I infer this applies only to anonymous
enum, class/struct and union underlying types?

Sure, easy to do.  Will be in V2.
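
For illustration, the naming-typedef rule discussed above can be modelled
with stand-in types.  Everything named toy_* below, as well as
make_typedef, is hypothetical; the real libabigail classes and the
is_anonymous_type / set_naming_typedef calls quoted in the review are not
reproduced here, only the decision they describe.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in IR types for illustration only; the real libabigail classes
// (decl_base, typedef_decl, ...) are richer than this.
enum class toy_kind { basic, enumeration, structure, union_, typedef_ };

struct toy_type
{
  toy_kind kind;
  std::string name;                       // empty means anonymous
  std::weak_ptr<toy_type> naming_typedef; // set for "typedef enum {...} N;"
  std::shared_ptr<toy_type> underlying;   // only used by typedefs
};
using toy_type_sptr = std::shared_ptr<toy_type>;

// Build a typedef and, when it names an anonymous enum, struct or
// union, record it as the naming typedef of that underlying type.
toy_type_sptr
make_typedef(const std::string& name, const toy_type_sptr& utype)
{
  assert(utype);
  toy_type_sptr result = std::make_shared<toy_type>();
  result->kind = toy_kind::typedef_;
  result->name = name;
  result->underlying = utype;

  bool anonymous = utype->name.empty();
  bool enum_or_sou = (utype->kind == toy_kind::enumeration
                      || utype->kind == toy_kind::structure
                      || utype->kind == toy_kind::union_);
  if (anonymous && enum_or_sou)
    utype->naming_typedef = result;
  return result;
}
```

With "typedef enum {one, two} Number;" the anonymous enum ends up pointing
back at Number, while a typedef of an already named struct leaves the
struct untouched.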

>> +  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
>> +
>> +  if (!utype)
>> +    {
>> +      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
>> +      if (!utype)
>> +        return result;
>> +    }
>
> I noticed this pattern is used a lot throughout the code.
> Maybe it could be factorized into a function and used throughout the
> code base as, e.g:
>
>     type_base_sptr utype = process_or_lookup_type(ctxt, corp, tunit,
> 						  ctf_dictionary,
> 						  ctf_utype);
> What do you think?

Yes, totally.  Will do.
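
A minimal sketch of that factorization, using stand-in types so that it
compiles on its own.  toy_context, toy_type and lookup_or_process_type
are hypothetical; in the patch the IDs would be ctf_id_t, the values
type_base_sptr, and the processing callback would be process_ctf_type
(which already registers its result in the read context, so the caching
line below would be redundant there).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>

// Stand-ins for illustration only.
using toy_id = long;
struct toy_type { toy_id id; };
using toy_type_sptr = std::shared_ptr<toy_type>;

struct toy_context
{
  std::map<toy_id, toy_type_sptr> types;

  toy_type_sptr
  lookup_type(toy_id id) const
  {
    auto it = types.find(id);
    return it == types.end() ? toy_type_sptr() : it->second;
  }
};

// Return the cached IR type for ID, or build it with PROCESS and cache
// it.  An empty pointer means the type could not be processed.
toy_type_sptr
lookup_or_process_type(toy_context& ctxt, toy_id id,
                       const std::function<toy_type_sptr(toy_id)>& process)
{
  assert(process);
  if (toy_type_sptr cached = ctxt.lookup_type(id))
    return cached;
  toy_type_sptr fresh = process(id);
  if (fresh)
    ctxt.types[id] = fresh;
  return fresh;
}
```

Each call site then shrinks to one call plus a single null check on the
returned pointer.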

>
>> +static void
>> +process_ctf_sou_members (read_context *ctxt,
>> +                         corpus_sptr corp,
>> +                         translation_unit_sptr tunit,
>> +                         ctf_dict_t *ctf_dictionary,
>> +                         ctf_id_t ctf_type,
>> +                         class_or_union_sptr sou)
>> +{
>
> This function definition lacks doxygen comment, even if it's "just" a
> sub-routine of process_ctf_struct_type and process_ctf_union_type.

Ok.

> +static void
> +process_ctf_archive (read_context *ctxt, corpus_sptr corp)
> +{
> +  /* We only have a translation unit.  */
> +  translation_unit_sptr ir_translation_unit =
> +    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);
>
> Interesting.  So, CTF doesn't keep the information about the
> translation unit a decl was defined in?  I guess that must be related
> to the fact that I haven't seen source location information so far.

Correct, no translation unit info in CTF.  What CTF has is deduplication
built into the linker (ld), so when several objects get combined the
corresponding .ctf sections also get merged and deduplicated.

> [...]
>
> +
> +      /* Iterate over the CTF functions stored in this archive.  */
> +      ctf_next_t *func_next = NULL;
> +      const char *func_name = NULL;
> +      ctf_id_t ctf_sym;
> +
> +      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
> +                                         1 /* functions symbols only */) != CTF_ERR))
> +      {
>
> How about function symbol visibility?  Won't ctf_symbol_next return
> the symbol of a function that is not necessarily "exported"?  In that
> case, because the symbol is not exported, it's related to a "private"
> function and so, it doesn't have ABI visibility.  So it should be
> discarded.  Just curious.

Hmmm, I think these symbols correspond to public symbols, but will
double-check and make sure of that.
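
For reference, the usual ELF notion of an "exported" function symbol that
the visibility question above refers to can be sketched like this.  The
toy_symbol type and is_exported are hypothetical stand-ins; real code
would read binding, visibility and section index from the ELF symbol
table, and libabigail's own symtab reader may apply additional rules.

```cpp
#include <cassert>

// Stand-ins for the few ELF symbol attributes that matter here.
enum class toy_binding { local, global, weak };
enum class toy_visibility { default_, internal, hidden, protected_ };

struct toy_symbol
{
  toy_binding binding;
  toy_visibility visibility;
  bool is_defined; // false for undefined (imported) symbols
};

// A function symbol is treated as part of the exported ABI when it is
// defined in this binary, has global or weak binding, and its
// visibility is default or protected.
bool
is_exported(const toy_symbol& s)
{
  bool public_binding = (s.binding == toy_binding::global
                         || s.binding == toy_binding::weak);
  bool public_visibility = (s.visibility == toy_visibility::default_
                            || s.visibility == toy_visibility::protected_);
  return s.is_defined && public_binding && public_visibility;
}
```

A function whose symbol fails this check would be skipped instead of
being added to the corpus as a function declaration.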

> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
>
> [...]
>
> @@ -104,6 +105,7 @@ struct options
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>    bool			do_debug;
>  #endif
> +  bool			use_ctf;
>
> I have protected this CTF support hunk with a #ifdef WITH_CTF guard,
> so that configuring with --disable-ctf leads to compiling the file
> just fine.  I've done so with all the CTF-specific hunks in abidiff.cc
> and abilint.cc.

Thanks for that.  I forgot :)

>
> All in all, I really like the patch.  Thanks a lot for working on
> this.

Well thanks for the review.
I will send a V2 soon.
Ben Woodard Oct. 27, 2021, 8:31 p.m. UTC | #7
> On Oct 27, 2021, at 1:59 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
> 
> It's OK for me, so far.  We can come up with a better way.  In theory,
> we should be able to automatically detect the debuginfo format
> attached to a given binary, now that we are about to support more than one
> ;-) We should just make sure we can do that even when said debuginfo
> is split out into a separate file.

While you are in that part of the code, I really would like a feature where it would return a non-zero if the binary didn’t have any debuginfo available. This can be optional. Something like:

--require-debuginfo

I’ve been caught too many times where I think that I have debuginfo but it doesn’t and I think everything is fine and there are no ABI conflicts but in reality libabigail isn’t really doing any useful testing.

-ben
Dodji Seketeli Oct. 29, 2021, 9:35 a.m. UTC | #8
Hello Ben,

Ben Woodard <woodard@redhat.com> writes:

> While you are in that part of the code, I really would like a feature
> where it would return a non-zero if the binary didn’t have any
> debuginfo available. This can be optional. Something like:
>
> --require-debuginfo
>
> I’ve been caught too many times where I think that I have debuginfo
> but it doesn’t and I think everything is fine and there are no ABI
> conflicts but in reality libabigail isn’t really doing any useful
> testing.

That's a fair request, I think. 

I have filed https://sourceware.org/bugzilla/show_bug.cgi?id=28515 so
that it's not lost.

Cheers,

Patch

diff --git a/ChangeLog b/ChangeLog
index 30918b49..385fe067 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,21 @@ 
+2021-10-11  Jose E. Marchesi  <jose.marchesi@oracle.com>
+
+	* configure.ac: Check for libctf.
+	* src/abg-ctf-reader.cc: New file.
+	* include/abg-ctf-reader.h: Likewise.
+	* src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
+	conditionally.
+	* include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
+	conditionally.
+	* tools/abilint.cc (struct options): New option `use_ctf'.
+	(display_usage): Documentation for --ctf.
+	(parse_command_line): Handle --ctf.
+	(main): Honour --ctf.
+	* tools/abidiff.cc (struct options): New option `use_ctf'.
+	(display_usage): Documentation for --ctf.
+	(parse_command_line): Handle --ctf.
+	(main): Honour --ctf.
+
 2021-10-04  Dodji Seketeli <dodji@redhat.com>
 
 	Update NEWS file for 2.0
diff --git a/configure.ac b/configure.ac
index 9e91f496..1eb85008 100644
--- a/configure.ac
+++ b/configure.ac
@@ -138,6 +138,13 @@  AC_ARG_ENABLE(ubsan,
 	      ENABLE_UBSAN=$enableval,
 	      ENABLE_UBSAN=no)
 
+dnl check if user has enabled CTF code
+AC_ARG_ENABLE(ctf,
+	      AS_HELP_STRING([--enable-ctf=yes|no],
+			     [enable support of CTF files (default is no)]),
+	      ENABLE_CTF=$enableval,
+	      ENABLE_CTF=no)
+
 dnl *************************************************
 dnl check for dependencies
 dnl *************************************************
@@ -244,6 +251,24 @@  fi
 AC_SUBST(DW_LIBS)
 AC_SUBST([ELF_LIBS])
 
+dnl check for libctf presence if CTF code has been enabled by command line
+dnl argument, and then define CTF flag (to build CTF file code) if libctf is
+dnl found on the system
+CTF_LIBS=
+if test x$ENABLE_CTF = xyes; then
+  LIBCTF=
+  AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
+  if test x$LIBCTF = xyes; then
+    AC_MSG_NOTICE([activating CTF code])
+    AC_DEFINE([CTF], 1,
+	     [Defined if user enables and system has the libctf library])
+    CTF_LIBS=-lctf
+  else
+    AC_MSG_NOTICE([CTF enabled but no libctf found])
+    ENABLE_CTF=no
+  fi
+fi
+
 dnl Check for dependency: libxml
 LIBXML2_VERSION=2.6.22
 PKG_CHECK_MODULES(XML, libxml-2.0 >= $LIBXML2_VERSION)
@@ -593,7 +618,7 @@  AX_VALGRIND_CHECK
 
 dnl Set the list of libraries libabigail depends on
 
-DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS"
+DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS"
 AC_SUBST(DEPS_LIBS)
 
 if test x$ABIGAIL_DEVEL != x; then
@@ -631,6 +656,10 @@  if test x$ENABLE_UBSAN = xyes; then
     CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
 fi
 
+dnl Set a few Automake conditionals
+
+AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])
+
 dnl Set the level of C++ standard we use.
 CXXFLAGS="$CXXFLAGS -std=$CXX_STANDARD"
 
@@ -936,6 +965,7 @@  AC_MSG_NOTICE([
     Enable bash completion	                   : ${ENABLE_BASH_COMPLETION}
     Enable fedabipkgdiff                           : ${ENABLE_FEDABIPKGDIFF}
     Enable python 3				   : ${ENABLE_PYTHON3}
+    Enable CTF front-end                           : ${ENABLE_CTF}
     Enable running tests under Valgrind            : ${enable_valgrind}
     Enable build with -fsanitize=address    	   : ${ENABLE_ASAN}
     Enable build with -fsanitize=memory    	   : ${ENABLE_MSAN}
diff --git a/include/Makefile.am b/include/Makefile.am
index 0f3b0936..9e5e037b 100644
--- a/include/Makefile.am
+++ b/include/Makefile.am
@@ -27,4 +27,8 @@  abg-viz-dot.h		\
 abg-viz-svg.h		\
 abg-regex.h
 
+if CTF_READER
+pkginclude_HEADERS += abg-ctf-reader.h
+endif
+
 EXTRA_DIST = abg-version.h.in
diff --git a/include/abg-corpus.h b/include/abg-corpus.h
index 136c348c..652a8294 100644
--- a/include/abg-corpus.h
+++ b/include/abg-corpus.h
@@ -46,6 +46,7 @@  public:
     ARTIFICIAL_ORIGIN = 0,
     NATIVE_XML_ORIGIN,
     DWARF_ORIGIN,
+    CTF_ORIGIN,
     LINUX_KERNEL_BINARY_ORIGIN
   };
 
diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
new file mode 100644
index 00000000..07eccec6
--- /dev/null
+++ b/include/abg-ctf-reader.h
@@ -0,0 +1,34 @@ 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+// -*- Mode: C++ -*-
+//
+// Copyright (C) 2021 Oracle, Inc.
+//
+// Author: Jose E. Marchesi
+
+/// @file
+///
+/// This file contains the declarations of the entry points to
+/// de-serialize an instance of @ref abigail::corpus from a file in
+/// ELF format, containing CTF information.
+
+#ifndef __ABG_CTF_READER_H__
+#define __ABG_CTF_READER_H__
+
+#include <ostream>
+#include "abg-corpus.h"
+#include "abg-suppression.h"
+
+namespace abigail
+{
+namespace ctf_reader
+{
+
+class read_context;
+read_context *create_read_context (std::string elf_path,
+                                   ir::environment *env);
+corpus_sptr read_corpus (read_context *ctxt);
+
+} // end namespace ctf_reader
+} // end namespace abigail
+
+#endif // ! __ABG_CTF_READER_H__
diff --git a/src/Makefile.am b/src/Makefile.am
index 430ce98d..b60d74cb 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -41,6 +41,10 @@  abg-symtab-reader.h			\
 abg-symtab-reader.cc			\
 $(VIZ_SOURCES)
 
+if CTF_READER
+libabigail_la_SOURCES += abg-ctf-reader.cc
+endif
+
 libabigail_la_LIBADD = $(DEPS_LIBS)
 libabigail_la_LDFLAGS = -lpthread -Wl,--as-needed -no-undefined
 
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
new file mode 100644
index 00000000..a121b3c8
--- /dev/null
+++ b/src/abg-ctf-reader.cc
@@ -0,0 +1,999 @@ 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+// -*- Mode: C++ -*-
+//
+// Copyright (C) 2021 Oracle, Inc.
+//
+// Author: Jose E. Marchesi
+
+/// @file
+///
+/// This file contains the definitions of the entry points to
+/// de-serialize an instance of @ref abigail::corpus from a file in
+/// ELF format, containing CTF information.
+
+#include "config.h"
+
+#include <fcntl.h> /* For open(2) */
+#include <iostream>
+
+#include "ctf-api.h"
+
+#include "abg-internal.h"
+#include "abg-ir-priv.h"
+#include "abg-elf-helpers.h"
+
+// <headers defining libabigail's API go under here>
+ABG_BEGIN_EXPORT_DECLARATIONS
+
+#include "abg-ctf-reader.h"
+#include "abg-libxml-utils.h"
+#include "abg-reader.h"
+#include "abg-corpus.h"
+#include "abg-symtab-reader.h"
+#include "abg-tools-utils.h"
+
+ABG_END_EXPORT_DECLARATIONS
+// </headers defining libabigail's API>
+
+namespace abigail
+{
+namespace ctf_reader
+{
+
+class read_context
+{
+public:
+  /// The name of the ELF file from which the CTF archive got
+  /// extracted.
+  string filename;
+
+  /// The IR environment.
+  ir::environment *ir_env;
+
+  /// The CTF archive read from FILENAME.  If an archive couldn't
+  /// be read from the file then this is NULL.
+  ctf_archive_t *ctfa;
+
+  /// A map associating CTF type ids with libabigail IR types.  This
+  /// is used to reuse already generated types.
+  unordered_map<ctf_id_t,type_base_sptr> types_map;
+
+  /// Associate a given CTF type ID with a given libabigail IR type.
+  void add_type (ctf_id_t ctf_type, type_base_sptr type)
+  {
+    types_map.insert (std::make_pair (ctf_type, type));
+  }
+
+  /// Lookup a given CTF type ID in the types map.
+  ///
+  /// @param ctf_type the type ID of the type to lookup.
+  type_base_sptr lookup_type (ctf_id_t ctf_type)
+  {
+    type_base_sptr result;
+
+    auto search = types_map.find (ctf_type);
+    if (search != types_map.end())
+      result = search->second;
+
+    return result;
+  }
+
+  /// Constructor.
+  ///
+  /// @param elf_path the path to the ELF file.
+  read_context (string elf_path, ir::environment *env)
+  {
+    int err;
+
+    types_map.clear ();
+    filename = elf_path;
+    ir_env = env;
+    ctfa = ctf_open (filename.c_str(),
+                     NULL /* BFD target */, &err);
+
+    if (ctfa == NULL)
+      fprintf (stderr, "cannot open %s: %s\n", filename.c_str(), ctf_errmsg (err));
+  }
+
+  /// Destructor of the @ref read_context type.
+  ~read_context ()
+  {
+    ctf_close (ctfa);
+  }
+}; // end class read_context.
+
+/// Forward declaration, needed because several of the process_ctf_*
+/// functions below are indirectly recursive through this call.
+static type_base_sptr process_ctf_type (read_context *ctxt, corpus_sptr corp,
+                                        translation_unit_sptr tunit,
+                                        ctf_dict_t *ctf_dictionary,
+                                        ctf_id_t ctf_type);
+
+/// Build and return a typedef libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the typedef.
+
+static typedef_decl_sptr
+process_ctf_typedef (read_context *ctxt,
+                     corpus_sptr corp,
+                     translation_unit_sptr tunit,
+                     ctf_dict_t *ctf_dictionary,
+                     ctf_id_t ctf_type)
+{
+  typedef_decl_sptr result;
+
+  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
+  const char *typedef_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
+
+  if (!utype)
+    {
+      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
+      if (!utype)
+        return result;
+    }
+
+  result.reset (new typedef_decl (typedef_name, utype, location (),
+                                  typedef_name /* mangled_name */));
+  return result;
+}
+
+/// Build and return an integer or float type declaration libabigail
+/// IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the type.
+
+static type_decl_sptr
+process_ctf_base_type (read_context *ctxt,
+                       corpus_sptr corp,
+                       ctf_dict_t *ctf_dictionary,
+                       ctf_id_t ctf_type)
+{
+  type_decl_sptr result;
+
+  ssize_t type_alignment = ctf_type_align (ctf_dictionary, ctf_type);
+  const char *type_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+
+  /* Get the type encoding and extract some useful properties of
+     the type from it.  In case of any error, just ignore the
+     type.  */
+  ctf_encoding_t type_encoding;
+  if (ctf_type_encoding (ctf_dictionary,
+                         ctf_type,
+                         &type_encoding))
+    return result;
+
+  /* Create the IR type corresponding to the CTF type.  */
+  if (type_encoding.cte_bits == 0
+      && type_encoding.cte_format == CTF_INT_SIGNED)
+    {
+      /* This is the `void' type.  */
+      type_base_sptr void_type = ctxt->ir_env->get_void_type ();
+      decl_base_sptr type_declaration = get_type_declaration (void_type);
+      result = is_type_decl (type_declaration);
+    }
+  else
+    {
+      result = lookup_basic_type (type_name, *corp);
+      if (!result)
+        result.reset (new type_decl (ctxt->ir_env,
+                                     type_name,
+                                     type_encoding.cte_bits,
+                                     type_alignment * 8 /* in bits */,
+                                     location (),
+                                     type_name /* mangled_name */));
+
+    }
+
+  return result;
+}
+
+/// Build and return a function type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the function type.
+
+static function_type_sptr
+process_ctf_function_type (read_context *ctxt,
+                           corpus_sptr corp,
+                           translation_unit_sptr tunit,
+                           ctf_dict_t *ctf_dictionary,
+                           ctf_id_t ctf_type)
+{
+  function_type_sptr result;
+
+  /* Fetch the function type info from the CTF type.  Bail out on error.  */
+  ctf_funcinfo_t funcinfo;
+  if (ctf_func_type_info (ctf_dictionary, ctf_type, &funcinfo) == CTF_ERR)
+    return result;
+  int vararg_p = funcinfo.ctc_flags & CTF_FUNC_VARARG;
+  /* Take care first of the result type.  */
+  ctf_id_t ctf_ret_type = funcinfo.ctc_return;
+  type_base_sptr ret_type = ctxt->lookup_type (ctf_ret_type);
+
+  if (!ret_type)
+    {
+      ret_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                   ctf_ret_type);
+      if (!ret_type)
+        return result;
+    }
+
+  /* Now process the argument types.  */
+  int argc = funcinfo.ctc_argc;
+  std::vector<ctf_id_t> argv (argc);
+  if (ctf_func_type_args (ctf_dictionary, ctf_type,
+                          argc, argv.data ()) == CTF_ERR)
+    return result;
+
+  function_decl::parameters function_parms;
+  for (int i = 0; i < argc; i++)
+    {
+      ctf_id_t ctf_arg_type = argv[i];
+      type_base_sptr arg_type = ctxt->lookup_type (ctf_arg_type);
+
+      if (!arg_type)
+        {
+          arg_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                       ctf_arg_type);
+          if (!arg_type)
+            return result;
+        }
+
+      function_decl::parameter_sptr parm
+        (new function_decl::parameter (arg_type, "",
+                                       location (),
+                                       vararg_p && (i == argc - 1),
+                                       false /* is_artificial */));
+      function_parms.push_back (parm);
+    }
+
+
+  /* Ok now the function type itself.  */
+  result.reset (new function_type (ret_type,
+                                   function_parms,
+                                   tunit->get_address_size (),
+                                   ctf_type_align (ctf_dictionary, ctf_type) * 8));
+
+  tunit->bind_function_type_life_time (result);
+  result->set_is_artificial (true);
+  return result;
+}
+
+static void
+process_ctf_sou_members (read_context *ctxt,
+                         corpus_sptr corp,
+                         translation_unit_sptr tunit,
+                         ctf_dict_t *ctf_dictionary,
+                         ctf_id_t ctf_type,
+                         class_or_union_sptr sou)
+{
+  ssize_t member_size;
+  ctf_next_t *member_next = NULL;
+  const char *member_name = NULL;
+  ctf_id_t member_ctf_type;
+
+  while ((member_size = ctf_member_next (ctf_dictionary, ctf_type,
+                                         &member_next, &member_name,
+                                         &member_ctf_type,
+                                         CTF_MN_RECURSE)) >= 0)
+    {
+      ctf_membinfo_t membinfo;
+
+      if (ctf_member_info (ctf_dictionary,
+                           ctf_type,
+                           member_name,
+                           &membinfo) == CTF_ERR)
+        return;
+
+      /* Build the IR for the member's type.  */
+      type_base_sptr member_type = ctxt->lookup_type (member_ctf_type);
+      if (!member_type)
+        {
+          member_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                          member_ctf_type);
+          if (!member_type)
+            /* Ignore this member.  */
+            continue;
+        }
+
+      /* Create a declaration IR node for the member and add it to the
+         struct type.  */
+      var_decl_sptr data_member_decl (new var_decl (member_name,
+                                                    member_type,
+                                                    location (),
+                                                    member_name));
+      sou->add_data_member (data_member_decl,
+                            public_access,
+                            true /* is_laid_out */,
+                            false /* is_static */,
+                            membinfo.ctm_offset);
+    }
+  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
+    fprintf (stderr, "ERROR from ctf_member_next\n");
+}
+
+/// Build and return a struct type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the struct type.
+
+static class_decl_sptr
+process_ctf_struct_type (read_context *ctxt,
+                         corpus_sptr corp,
+                         translation_unit_sptr tunit,
+                         ctf_dict_t *ctf_dictionary,
+                         ctf_id_t ctf_type)
+{
+  class_decl_sptr result;
+  std::string struct_type_name = ctf_type_name_raw (ctf_dictionary,
+                                                 ctf_type);
+  bool struct_type_is_anonymous = (struct_type_name == "");
+
+  /* The libabigail IR encodes C struct types in `class' IR nodes.  */
+  result.reset (new class_decl (ctxt->ir_env,
+                                struct_type_name,
+                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                ctf_type_align (ctf_dictionary, ctf_type) * 8,
+                                true /* is_struct */,
+                                location (),
+                                decl_base::VISIBILITY_DEFAULT,
+                                struct_type_is_anonymous));
+  if (!result)
+    return result;
+
+  /* The C type system indirectly supports loops by means of
+     pointers to structs or unions.  Since some contained type can
+     refer to this struct, we have to make it available in the cache
+     at this point even if the members haven't been added to the IR
+     node yet.  */
+  ctxt->add_type (ctf_type, result);
+
+  /* Now add the struct members as specified in the CTF type description.
+     This is C, so named types can only be defined in the global
+     scope.  */
+  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
+                           result);
+
+  return result;
+}
+
+/// Build and return a union type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the union type.
+
+static union_decl_sptr
+process_ctf_union_type (read_context *ctxt,
+                        corpus_sptr corp,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  union_decl_sptr result;
+  std::string union_type_name = ctf_type_name_raw (ctf_dictionary,
+                                                   ctf_type);
+  bool union_type_is_anonymous = (union_type_name == "");
+
+  /* Create the corresponding libabigail union IR node.  */
+  result.reset (new union_decl (ctxt->ir_env,
+                                union_type_name,
+                                ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                location (),
+                                decl_base::VISIBILITY_DEFAULT,
+                                union_type_is_anonymous));
+  if (!result)
+    return result;
+
+  /* The C type system indirectly supports loops by means of
+     pointers to structs or unions.  Since some contained type can
+     refer to this union, we have to make it available in the cache
+     at this point even if the members haven't been added to the IR
+     node yet.  */
+  ctxt->add_type (ctf_type, result);
+
+  /* Now add the union members as specified in the CTF type description.
+     This is C, so named types can only be defined in the global
+     scope.  */
+  process_ctf_sou_members (ctxt, corp, tunit, ctf_dictionary, ctf_type,
+                           result);
+
+  return result;
+}
+
+/// Build and return an array type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the array type.
+
+static array_type_def_sptr
+process_ctf_array_type (read_context *ctxt,
+                        corpus_sptr corp,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  array_type_def_sptr result;
+  ctf_arinfo_t ctf_ainfo;
+
+  /* First, get the information about the CTF array.  */
+  if (ctf_array_info (ctf_dictionary, ctf_type, &ctf_ainfo)
+      == CTF_ERR)
+    return result;
+
+  ctf_id_t ctf_element_type = ctf_ainfo.ctr_contents;
+  ctf_id_t ctf_index_type = ctf_ainfo.ctr_index;
+  uint64_t nelems = ctf_ainfo.ctr_nelems;
+
+  /* Make sure the element type is generated.  */
+  type_base_sptr element_type = ctxt->lookup_type (ctf_element_type);
+  if (!element_type)
+    {
+      element_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_element_type);
+      if (!element_type)
+        return result;
+    }
+
+  /* Ditto for the index type.  */
+  type_base_sptr index_type = ctxt->lookup_type (ctf_index_type);
+  if (!index_type)
+    {
+      index_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_index_type);
+      if (!index_type)
+        return result;
+    }
+
+  /* The number of elements of the array determines the IR subranges
+     type to build.  */
+  array_type_def::subranges_type subranges;
+  array_type_def::subrange_sptr subrange;
+  array_type_def::subrange_type::bound_value lower_bound;
+  array_type_def::subrange_type::bound_value upper_bound;
+
+  lower_bound.set_unsigned (0); /* CTF supports C only.  */
+  upper_bound.set_unsigned (nelems > 0 ? nelems - 1 : 0U);
+
+  subrange.reset (new array_type_def::subrange_type (ctxt->ir_env,
+                                                     "",
+                                                     lower_bound,
+                                                     upper_bound,
+                                                     index_type,
+                                                     location (),
+                                                     translation_unit::LANG_C));
+  if (!subrange)
+    return result;
+
+  add_decl_to_scope (subrange, tunit->get_global_scope());
+  canonicalize (subrange);
+  subranges.push_back (subrange);
+
+  /* Finally build the IR for the array type and return it.  */
+  result.reset (new array_type_def (element_type, subranges, location ()));
+  return result;
+}
+
+/// Build and return a qualified type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+
+static type_base_sptr
+process_ctf_qualified_type (read_context *ctxt,
+                            corpus_sptr corp,
+                            translation_unit_sptr tunit,
+                            ctf_dict_t *ctf_dictionary,
+                            ctf_id_t ctf_type)
+{
+  type_base_sptr result;
+  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
+  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
+  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
+
+  if (!utype)
+    {
+      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
+      if (!utype)
+        return result;
+    }
+
+  qualified_type_def::CV qualifiers = qualified_type_def::CV_NONE;
+  if (type_kind == CTF_K_CONST)
+    qualifiers |= qualified_type_def::CV_CONST;
+  else if (type_kind == CTF_K_VOLATILE)
+    qualifiers |= qualified_type_def::CV_VOLATILE;
+  else if (type_kind == CTF_K_RESTRICT)
+    qualifiers |= qualified_type_def::CV_RESTRICT;
+  else
+    ABG_ASSERT_NOT_REACHED;
+
+  result.reset (new qualified_type_def (utype, qualifiers, location ()));
+  return result;
+}
+
+/// Build and return a pointer type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the pointer type.
+
+static pointer_type_def_sptr
+process_ctf_pointer_type (read_context *ctxt,
+                          corpus_sptr corp,
+                          translation_unit_sptr tunit,
+                          ctf_dict_t *ctf_dictionary,
+                          ctf_id_t ctf_type)
+{
+  pointer_type_def_sptr result;
+  ctf_id_t ctf_target_type = ctf_type_reference (ctf_dictionary, ctf_type);
+  type_base_sptr target_type = ctxt->lookup_type (ctf_target_type);
+
+  if (!target_type)
+    {
+      target_type = process_ctf_type (ctxt, corp, tunit, ctf_dictionary,
+                                      ctf_target_type);
+      if (!target_type)
+        return result;
+    }
+
+  result.reset (new pointer_type_def (target_type,
+                                      ctf_type_size (ctf_dictionary, ctf_type) * 8,
+                                      ctf_type_align (ctf_dictionary, ctf_type) * 8,
+                                      location ()));
+  return result;
+}
+
+/// Build and return an enum type libabigail IR.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// @return a shared pointer to the IR node for the enum type.
+
+static enum_type_decl_sptr
+process_ctf_enum_type (read_context *ctxt,
+                        corpus_sptr corp,
+                        translation_unit_sptr tunit,
+                        ctf_dict_t *ctf_dictionary,
+                        ctf_id_t ctf_type)
+{
+  enum_type_decl_sptr result;
+
+  /* Build a signed integral type for the type of the enumerators, aka
+     the underlying type.  The size of the enumerators in bytes is
+     specified in the CTF enumeration type.  */
+  size_t utype_size_in_bits = ctf_type_size (ctf_dictionary, ctf_type) * 8;
+  type_decl_sptr utype;
+
+  utype.reset (new type_decl (ctxt->ir_env,
+                              "",
+                              utype_size_in_bits,
+                              utype_size_in_bits,
+                              location ()));
+  if (!utype)
+    return result;
+  utype->set_is_anonymous (true);
+  utype->set_is_artificial (true);
+  add_decl_to_scope (utype, tunit->get_global_scope());
+  canonicalize (utype);
+
+  /* Iterate over the enum entries.  */
+  enum_type_decl::enumerators enms;
+  ctf_next_t *enum_next = NULL;
+  const char *ename;
+  int evalue;
+
+  while ((ename = ctf_enum_next (ctf_dictionary, ctf_type, &enum_next, &evalue)))
+    enms.push_back (enum_type_decl::enumerator (ctxt->ir_env, ename, evalue));
+  if (ctf_errno (ctf_dictionary) != ECTF_NEXT_END)
+    {
+      fprintf (stderr, "ERROR from ctf_enum_next\n");
+      return result;
+    }
+
+  const char *enum_name = ctf_type_name_raw (ctf_dictionary, ctf_type);
+  result.reset (new enum_type_decl (enum_name, location (),
+                                    utype, enms, enum_name));
+  return result;
+}
+
+/// Add a new type declaration to the given libabigail IR corpus CORP.
+///
+/// @param ctxt the read context.
+/// @param corp the libabigail IR corpus being constructed.
+/// @param tunit the current IR translation unit.
+/// @param ctf_dictionary the CTF dictionary being read.
+/// @param ctf_type the CTF type ID of the source type.
+///
+/// Note that if @p ctf_type can't reliably be translated to the IR
+/// then it is simply ignored.
+///
+/// @return a shared pointer to the IR node for the type.
+
+static type_base_sptr
+process_ctf_type (read_context *ctxt,
+                  corpus_sptr corp,
+                  translation_unit_sptr tunit,
+                  ctf_dict_t *ctf_dictionary,
+                  ctf_id_t ctf_type)
+{
+  int type_kind = ctf_type_kind (ctf_dictionary, ctf_type);
+  type_base_sptr result;
+
+  switch (type_kind)
+    {
+    case CTF_K_INTEGER:
+    case CTF_K_FLOAT:
+      {
+        type_decl_sptr type_decl
+          = process_ctf_base_type (ctxt, corp, ctf_dictionary, ctf_type);
+
+        if (type_decl)
+          {
+            add_decl_to_scope (type_decl, tunit->get_global_scope ());
+            result = is_type (type_decl);
+          }
+        break;
+      }
+    case CTF_K_TYPEDEF:
+      {
+        typedef_decl_sptr typedef_decl
+          = process_ctf_typedef (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (typedef_decl)
+          {
+            add_decl_to_scope (typedef_decl, tunit->get_global_scope ());
+            result = is_type (typedef_decl);
+          }
+        break;
+      }
+    case CTF_K_POINTER:
+      {
+        pointer_type_def_sptr pointer_type
+          = process_ctf_pointer_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (pointer_type)
+          {
+            add_decl_to_scope (pointer_type, tunit->get_global_scope ());
+            result = pointer_type;
+          }
+        break;
+      }
+    case CTF_K_CONST:
+    case CTF_K_VOLATILE:
+    case CTF_K_RESTRICT:
+      {
+        type_base_sptr qualified_type
+          = process_ctf_qualified_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (qualified_type)
+          {
+            decl_base_sptr qualified_type_decl = get_type_declaration (qualified_type);
+
+            add_decl_to_scope (qualified_type_decl, tunit->get_global_scope ());
+            result = qualified_type;
+          }
+        break;
+      }
+    case CTF_K_ARRAY:
+      {
+        array_type_def_sptr array_type
+          = process_ctf_array_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (array_type)
+          {
+            decl_base_sptr array_type_decl = get_type_declaration (array_type);
+
+            add_decl_to_scope (array_type_decl, tunit->get_global_scope ());
+            result = array_type;
+          }
+        break;
+      }
+    case CTF_K_ENUM:
+      {
+        enum_type_decl_sptr enum_type
+          = process_ctf_enum_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (enum_type)
+          {
+            add_decl_to_scope (enum_type, tunit->get_global_scope ());
+            result = enum_type;
+          }
+
+        break;
+      }
+    case CTF_K_FUNCTION:
+      {
+        function_type_sptr function_type
+          = process_ctf_function_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (function_type)
+          {
+            decl_base_sptr function_type_decl = get_type_declaration (function_type);
+
+            add_decl_to_scope (function_type_decl, tunit->get_global_scope ());
+            result = function_type;
+          }
+        break;
+      }
+    case CTF_K_STRUCT:
+      {
+        class_decl_sptr struct_decl
+          = process_ctf_struct_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (struct_decl)
+          {
+            add_decl_to_scope (struct_decl, tunit->get_global_scope ());
+            result = is_type (struct_decl);
+          }
+        break;
+      }
+    case CTF_K_UNION:
+      {
+        union_decl_sptr union_decl
+          = process_ctf_union_type (ctxt, corp, tunit, ctf_dictionary, ctf_type);
+
+        if (union_decl)
+          {
+            add_decl_to_scope (union_decl, tunit->get_global_scope ());
+            result = is_type (union_decl);
+          }
+        break;
+      }
+    case CTF_K_UNKNOWN:
+      /* Unknown types are simply ignored.  */
+    default:
+      break;
+    }
+
+  if (result)
+    {
+      decl_base_sptr result_decl = get_type_declaration (result);
+
+      canonicalize (result);
+      ctxt->add_type (ctf_type, result);
+    }
+  else
+    fprintf (stderr, "NOT PROCESSED TYPE %ld\n", ctf_type);
+
+  return result;
+}
+
+/// Process a CTF archive and create libabigail IR for the types,
+/// variables and function declarations found in the archive.  The IR
+/// is added to the given corpus.
+///
+/// @param ctxt the read context containing the CTF archive to
+/// process.
+/// @param corp the IR corpus to which to add the new contents.
+
+static void
+process_ctf_archive (read_context *ctxt, corpus_sptr corp)
+{
+  /* We only have a single translation unit.  */
+  translation_unit_sptr ir_translation_unit =
+    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);
+  ir_translation_unit->set_language (translation_unit::LANG_C);
+  corp->add (ir_translation_unit);
+
+  /* Iterate over the CTF dictionaries in the archive.  */
+  int ctf_err;
+  ctf_dict_t *ctf_dict;
+  ctf_next_t *dict_next = NULL;
+  const char *archive_name;
+
+  while ((ctf_dict = ctf_archive_next (ctxt->ctfa, &dict_next, &archive_name,
+                                       0 /* skip_parent */, &ctf_err)) != NULL)
+    {
+      /* Iterate over the CTF types stored in this dictionary.  */
+      ctf_id_t ctf_type;
+      int type_flag;
+      ctf_next_t *type_next = NULL;
+
+      while ((ctf_type = ctf_type_next (ctf_dict, &type_next, &type_flag,
+                                        1 /* want_hidden */)) != CTF_ERR)
+        {
+          process_ctf_type (ctxt, corp, ir_translation_unit,
+                            ctf_dict, ctf_type);
+        }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_type_next\n");
+
+      /* Iterate over the CTF variables stored in this dictionary.  */
+      ctf_id_t ctf_var_type;
+      ctf_next_t *var_next = NULL;
+      const char *var_name;
+
+      while ((ctf_var_type = ctf_variable_next (ctf_dict, &var_next, &var_name))
+             != CTF_ERR)
+        {
+          type_base_sptr var_type = ctxt->lookup_type (ctf_var_type);
+
+          if (!var_type)
+            {
+              var_type = process_ctf_type (ctxt, corp, ir_translation_unit,
+                                           ctf_dict, ctf_var_type);
+              if (!var_type)
+                /* Ignore variable if its type can't be sorted out.  */
+                continue;
+            }
+
+          var_decl_sptr var_declaration;
+          var_declaration.reset (new var_decl (var_name,
+                                               var_type,
+                                               location (),
+                                               var_name));
+
+          add_decl_to_scope (var_declaration,
+                             ir_translation_unit->get_global_scope ());
+        }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_variable_next\n");
+
+      /* Iterate over the CTF functions stored in this dictionary.  */
+      ctf_next_t *func_next = NULL;
+      const char *func_name = NULL;
+      ctf_id_t ctf_sym;
+
+      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
+                                         1 /* function symbols only */)) != CTF_ERR)
+      {
+        ctf_id_t ctf_func_type = ctf_lookup_by_name (ctf_dict, func_name);
+        type_base_sptr func_type = ctxt->lookup_type (ctf_func_type);
+        if (!func_type)
+          {
+            func_type = process_ctf_type (ctxt, corp, ir_translation_unit,
+                                          ctf_dict, ctf_func_type);
+            if (!func_type)
+              /* Ignore function if its type can't be sorted out.  */
+              continue;
+          }
+
+        function_decl_sptr func_declaration;
+        func_declaration.reset (new function_decl (func_name,
+                                                   func_type,
+                                                   0 /* is_inline */,
+                                                   location ()));
+
+        add_decl_to_scope (func_declaration,
+                           ir_translation_unit->get_global_scope ());
+      }
+      if (ctf_errno (ctf_dict) != ECTF_NEXT_END)
+        fprintf (stderr, "ERROR from ctf_symbol_next\n");
+    }
+  if (ctf_err != ECTF_NEXT_END)
+    fprintf (stderr, "ERROR from ctf_archive_next\n");
+
+}
+
+/// Slurp certain information from the ELF file described by a given
+/// read context and install it in a libabigail corpus.
+///
+/// @param ctxt the read context
+/// @param corp the libabigail corpus in which to install the info.
+///
+/// @return 0 if there is an error.
+/// @return 1 otherwise.
+
+static int
+slurp_elf_info (read_context *ctxt, corpus_sptr corp)
+{
+  /* libelf requires negotiating/setting the ELF version first.  */
+  if (elf_version (EV_CURRENT) == EV_NONE)
+    return 0;
+
+  /* Open an ELF handler.  */
+  int elf_fd = open (ctxt->filename.c_str(), O_RDONLY);
+  if (elf_fd == -1)
+    return 0;
+
+  Elf *elf_handler = elf_begin (elf_fd, ELF_C_READ, NULL);
+  if (elf_handler == NULL)
+    {
+      fprintf (stderr, "cannot open %s: %s\n",
+               ctxt->filename.c_str(), elf_errmsg (elf_errno ()));
+      close (elf_fd);
+      return 0;
+    }
+
+  /* Set the ELF architecture.  */
+  GElf_Ehdr eh_mem;
+  if (GElf_Ehdr *ehdr = gelf_getehdr (elf_handler, &eh_mem))
+    corp->set_architecture_name (elf_helpers::e_machine_to_string (ehdr->e_machine));
+
+  /* Read the symtab from the ELF file and set it in the corpus.  */
+  symtab_reader::symtab_sptr symtab =
+    symtab_reader::symtab::load (elf_handler, ctxt->ir_env,
+                                 0 /* No suppressions.  */);
+  corp->set_symtab(symtab);
+
+  /* Finish the ELF handler and close the associated file.  */
+  elf_end (elf_handler);
+  close (elf_fd);
+
+  return 1;
+}
+
+/// Create and return a new read context to process CTF information
+/// from a given ELF file.
+///
+/// @param elf_path the path of some ELF file.
+/// @param env a libabigail IR environment.
+
+read_context *
+create_read_context (std::string elf_path, ir::environment *env)
+{
+  return new read_context (elf_path, env);
+}
+
+/// Read the CTF information from some source described by a given
+/// read context and process it to create a libabigail IR corpus.
+/// Store the corpus in the same read context.
+///
+/// @param ctxt the read context to use.
+/// @return a shared pointer to the read corpus.
+
+corpus_sptr
+read_corpus (read_context *ctxt)
+{
+  corpus_sptr corp
+    = std::make_shared<corpus> (ctxt->ir_env, ctxt->filename);
+
+  /* Set some properties of the corpus first.  */
+  corp->set_origin(corpus::CTF_ORIGIN);
+  if (!slurp_elf_info (ctxt, corp))
+    return corp;
+
+  /* Get out now if no CTF debug info is found.  */
+  if (ctxt->ctfa == NULL)
+    return corp;
+
+  /* Process the CTF archive in the read context, if any.  Information
+     about the types, variables, functions, etc. contained in the
+     archive is added to the given corpus.  */
+  process_ctf_archive (ctxt, corp);
+  return corp;
+}
+
+} // End of namespace ctf_reader
+} // End of namespace abigail
diff --git a/tools/abidiff.cc b/tools/abidiff.cc
index 21f7ff61..db021bf4 100644
--- a/tools/abidiff.cc
+++ b/tools/abidiff.cc
@@ -19,6 +19,7 @@ 
 #include "abg-tools-utils.h"
 #include "abg-reader.h"
 #include "abg-dwarf-reader.h"
+#include "abg-ctf-reader.h"
 
 using std::vector;
 using std::string;
@@ -104,6 +105,7 @@  struct options
 #ifdef WITH_DEBUG_SELF_COMPARISON
   bool			do_debug;
 #endif
+  bool			use_ctf;
   vector<char*> di_root_paths1;
   vector<char*> di_root_paths2;
   vector<char**> prepared_di_root_paths1;
@@ -144,7 +146,8 @@  struct options
       show_impacted_interfaces(),
       dump_diff_tree(),
       show_stats(),
-      do_log()
+      do_log(),
+      use_ctf()
 #ifdef WITH_DEBUG_SELF_COMPARISON
     ,
     do_debug()
@@ -233,6 +236,7 @@  display_usage(const string& prog_name, ostream& out)
     << " --dump-diff-tree  emit a debug dump of the internal diff tree to "
     "the error output stream\n"
     <<  " --stats  show statistics about various internal stuff\n"
+    << " --ctf  use CTF instead of DWARF in ELF files\n"
 #ifdef WITH_DEBUG_SELF_COMPARISON
     << " --debug debug the process of comparing an ABI corpus against itself"
 #endif
@@ -579,6 +583,8 @@  parse_command_line(int argc, char* argv[], options& opts)
 	opts.show_stats = true;
       else if (!strcmp(argv[i], "--verbose"))
 	opts.do_log = true;
+      else if (!strcmp(argv[i], "--ctf"))
+        opts.use_ctf = true;
 #ifdef WITH_DEBUG_SELF_COMPARISON
       else if (!strcmp(argv[i], "--debug"))
 	opts.do_debug = true;
@@ -1150,23 +1156,35 @@  main(int argc, char* argv[])
 	case abigail::tools_utils::FILE_TYPE_ELF: // fall through
 	case abigail::tools_utils::FILE_TYPE_AR:
 	  {
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context
-	      (opts.file1, opts.prepared_di_root_paths1,
-	       env.get(), /*read_all_types=*/opts.show_all_types,
-	       opts.linux_kernel_mode);
-	    assert(ctxt);
-
-	    abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
-	    set_suppressions(*ctxt, opts);
-	    abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
-	    c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
-	    if (!c1
-		|| (opts.fail_no_debug_info
-		    && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
-		    && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
-	      return handle_error(c1_status, ctxt.get(),
-				  argv[0], opts);
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file1,
+                                                              env.get());
+
+                assert (ctxt);
+                c1 = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context
+                  (opts.file1, opts.prepared_di_root_paths1,
+                   env.get(), /*read_all_types=*/opts.show_all_types,
+                   opts.linux_kernel_mode);
+                assert(ctxt);
+
+                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
+                set_suppressions(*ctxt, opts);
+                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
+                c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
+                if (!c1
+                    || (opts.fail_no_debug_info
+                        && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
+                        && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)))
+                  return handle_error(c1_status, ctxt.get(),
+                                      argv[0], opts);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
@@ -1219,23 +1237,34 @@  main(int argc, char* argv[])
 	case abigail::tools_utils::FILE_TYPE_ELF: // Fall through
 	case abigail::tools_utils::FILE_TYPE_AR:
 	  {
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context
-	      (opts.file2, opts.prepared_di_root_paths2,
-	       env.get(), /*read_all_types=*/opts.show_all_types,
-	       opts.linux_kernel_mode);
-	    assert(ctxt);
-	    abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
-	    abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
-	    set_suppressions(*ctxt, opts);
-
-	    c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
-	    if (!c2
-		|| (opts.fail_no_debug_info
-		    && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
-		    && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
-	      return handle_error(c2_status, ctxt.get(), argv[0], opts);
-
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file2,
+                                                              env.get());
+
+                assert (ctxt);
+                c2 = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context
+                  (opts.file2, opts.prepared_di_root_paths2,
+                   env.get(), /*read_all_types=*/opts.show_all_types,
+                   opts.linux_kernel_mode);
+                assert(ctxt);
+                abigail::dwarf_reader::set_show_stats(*ctxt, opts.show_stats);
+                abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
+                set_suppressions(*ctxt, opts);
+
+                c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
+                if (!c2
+                    || (opts.fail_no_debug_info
+                        && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
+                        && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
+                  return handle_error(c2_status, ctxt.get(), argv[0], opts);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
diff --git a/tools/abilint.cc b/tools/abilint.cc
index 856f935d..b1551ea3 100644
--- a/tools/abilint.cc
+++ b/tools/abilint.cc
@@ -27,6 +27,7 @@ 
 #include "abg-corpus.h"
 #include "abg-reader.h"
 #include "abg-dwarf-reader.h"
+#include "abg-ctf-reader.h"
 #include "abg-writer.h"
 #include "abg-suppression.h"
 
@@ -67,6 +68,7 @@  struct options
   bool				read_tu;
   bool				diff;
   bool				noout;
+  bool				use_ctf;
   std::shared_ptr<char>	di_root_path;
   vector<string>		suppression_paths;
   string			headers_dir;
@@ -77,7 +79,8 @@  struct options
       read_from_stdin(false),
       read_tu(false),
       diff(false),
-      noout(false)
+      noout(false),
+      use_ctf(false)
   {}
 };//end struct options;
 
@@ -99,7 +102,8 @@  display_usage(const string& prog_name, ostream& out)
     "the input and the memory model saved back to disk\n"
     << "  --noout  do not display anything on stdout\n"
     << "  --stdin|--  read abi-file content from stdin\n"
-    << "  --tu  expect a single translation unit file\n";
+    << "  --tu  expect a single translation unit file\n"
+    << "  --ctf  use CTF instead of DWARF in ELF files\n";
 }
 
 bool
@@ -173,6 +177,8 @@  parse_command_line(int argc, char* argv[], options& opts)
 	  opts.read_from_stdin = true;
 	else if (!strcmp(argv[i], "--tu"))
 	  opts.read_tu = true;
+        else if (!strcmp(argv[i], "--ctf"))
+          opts.use_ctf = true;
 	else if (!strcmp(argv[i], "--diff"))
 	  opts.diff = true;
 	else if (!strcmp(argv[i], "--noout"))
@@ -338,13 +344,26 @@  main(int argc, char* argv[])
 	    di_root_path = opts.di_root_path.get();
 	    vector<char**> di_roots;
 	    di_roots.push_back(&di_root_path);
-	    abigail::dwarf_reader::read_context_sptr ctxt =
-	      abigail::dwarf_reader::create_read_context(opts.file_path,
-							 di_roots, env.get(),
-							 /*load_all_types=*/false);
-	    assert(ctxt);
-	    set_suppressions(*ctxt, opts);
-	    corp = read_corpus_from_elf(*ctxt, s);
+
+            if (opts.use_ctf)
+              {
+                abigail::ctf_reader::read_context *ctxt
+                  = abigail::ctf_reader::create_read_context (opts.file_path,
+                                                              env.get());
+
+                assert (ctxt);
+                corp = abigail::ctf_reader::read_corpus (ctxt);
+              }
+            else
+              {
+                abigail::dwarf_reader::read_context_sptr ctxt =
+                  abigail::dwarf_reader::create_read_context(opts.file_path,
+                                                             di_roots, env.get(),
+                                                             /*load_all_types=*/false);
+                assert(ctxt);
+                set_suppressions(*ctxt, opts);
+                corp = read_corpus_from_elf(*ctxt, s);
+              }
 	  }
 	  break;
 	case abigail::tools_utils::FILE_TYPE_XML_CORPUS: