[libctf,00/22] more modifiable CTF dicts (and a few bugfixes)

Message ID 20240417202018.34966-1-nick.alcock@oracle.com

Nick Alcock April 17, 2024, 8:19 p.m. UTC
  A longstanding restriction of libctf is that open CTF dicts are divided
into two varieties: one that you can create, add stuff to and then write
out and throw away, and one that you can open but then never add
anything to: the dict is forever read-only.

This distinction is not entirely original sin.  Solaris libctf and its users
had remnants of code that suggested that it was intended to be possible to
read in CTF and at least modify it, but this was never properly implemented
and would at best have caused memory corruption.  Most attempts failed with
an ECTF_RDONLY error.

This was not at all helped by the design decision to split the set of
types libctf saw into 'dynamic' (added by ctf_add_*) and 'static', have
lookups work only on static types, and have ctf_update() work by
throwing the dict's static types away and reserializing them from the
dynamic types.  This stopped type lookup working until you did a
ctf_update() to reserialize the entire dict, at increasingly horrible
performance cost, and meant that libctf had to in effect handle dicts
that were mixtures of read-only and writable dicts while gaining none of
the benefits of doing that.

The performance cost and need to call ctf_update() have long been fixed,
and lookups now work on all types however added, but the restriction
that writable dicts came from ctf_create() and read-only ones came from
ctf_*open(), and that you couldn't save the latter, persisted.  Worse
yet, if you tried to save writable dicts more than once things often
went wrong (strtab corruption was commonplace), even if you did nothing
at all to them between the saves.

This series tries to clean all that up, in part so we can save dicts and
make transformations to what we save without affecting the dict itself, and
certainly without corrupting anything.

Ignoring a few commits that introduce a minor new option to objdump, fix an
unfortunate error in lookups of bitfield types by name, and fix typos and
leaks, this series is divided into two halves:

 - patches up to the reversion in the middle, which make the readonliness of
   dicts apply to *types* instead of the dict as a whole: in particular, you
   cannot add members to structs, enums, or unions that were read in from
   files.  You can add references to them, and add new types of any kind
   freely, which was more or less easy except for the symbol handling code,
   which needed a good bit of rejigging (and bugfixing) in the process.

 - the reversion and the patches beyond it discard an old internal strtab
   abstraction which proved to be much more trouble than it was worth
   ("pending refs") and replace it with a new scheme which fixes corruption
   of the string table if serialized more than once, drops any need to scan
   existing types for references to strings (so we can just blindly copy the
   existing static type table from a ctf_open()ed dict and append to it when
   saving it again), and redoes serialization and the writeout functions so
   that while it does make a few changes to the dict being read in (the
   strtab is regenerated), the types table is not affected, and there is no
   "replace the guts of this type table with a serialized copy" nonsense
   like libctf has always had before now: we just emit everything into a new
   buffer and return it.  Types already present when the dict was
   ctf_open()ed need not be traversed at all (though we do still have to
   traverse the symtypetabs and variables sections, because they are sorted,
   so any new entries probably appear in the middle).

   The result is noticeably simpler and avoids a lot of boilerplate where
   you had to remember to copy every field in the struct ctf_dict (and
   remember to augment this list when adding new fields, which was routinely
   forgotten, triggering different subtle bugs every time).  It also fixes a
   couple of completely broken API functions, notably ctf_gzwrite(), which
   while inconvenient and annoying to use should not completely fail to
   serialize the dict before writing it out...
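
To make the two halves above a bit more concrete, here is a minimal toy
model in C.  It is not libctf code, and every name in it (toy_dict_t,
toy_open, toy_add_type, toy_add_member, toy_serialize, the TOY_ERR_*
values) is hypothetical: real CTF types and string tables are far more
involved.  It only sketches two ideas, under those assumptions: types that
were read in are frozen while new types can still be added, and
serialization copies the existing type table blindly into a fresh buffer
and appends the new types, without touching the dict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of these names exist in libctf.  */

#define TOY_MAX_TYPES 16
#define TOY_ERR_RDONLY (-1)
#define TOY_ERR_FULL (-2)

typedef struct toy_type
{
  int kind;			/* Arbitrary tag standing in for a type kind.  */
  int n_members;		/* Member count for struct-like types.  */
} toy_type_t;

typedef struct toy_dict
{
  toy_type_t types[TOY_MAX_TYPES];
  size_t n_static;		/* Types read in at open time: frozen.  */
  size_t n_types;		/* Total types, read-in plus newly added.  */
} toy_dict_t;

/* "Open" a dict from a previously serialized type table.  */
int
toy_open (toy_dict_t *d, const toy_type_t *buf, size_t n)
{
  if (n > TOY_MAX_TYPES)
    return TOY_ERR_FULL;
  memset (d, 0, sizeof (*d));
  if (n > 0)
    memcpy (d->types, buf, n * sizeof (toy_type_t));
  d->n_static = d->n_types = n;
  return 0;
}

/* Adding a whole new type works even on an opened dict.
   Returns the new type's index, or a negative error.  */
int
toy_add_type (toy_dict_t *d, int kind)
{
  if (d->n_types == TOY_MAX_TYPES)
    return TOY_ERR_FULL;
  d->types[d->n_types].kind = kind;
  d->types[d->n_types].n_members = 0;
  return (int) d->n_types++;
}

/* Adding a member only works on types added since the open:
   read-onliness is a property of the type, not of the dict.  */
int
toy_add_member (toy_dict_t *d, size_t idx)
{
  assert (idx < d->n_types);
  if (idx < d->n_static)
    return TOY_ERR_RDONLY;
  d->types[idx].n_members++;
  return 0;
}

/* Serialize into a fresh buffer: copy the read-in table without
   inspecting it, then append the new types.  The dict is not modified.
   The caller frees the result.  */
toy_type_t *
toy_serialize (const toy_dict_t *d, size_t *n_out)
{
  size_t n_new = d->n_types - d->n_static;
  toy_type_t *out = malloc ((d->n_types ? d->n_types : 1)
			    * sizeof (toy_type_t));

  if (out == NULL)
    return NULL;
  if (d->n_static > 0)
    memcpy (out, d->types, d->n_static * sizeof (toy_type_t));
  if (n_new > 0)
    memcpy (out + d->n_static, d->types + d->n_static,
	    n_new * sizeof (toy_type_t));
  *n_out = d->n_types;
  return out;
}
```

With a table of two types opened via toy_open(), toy_add_member() on index
0 returns TOY_ERR_RDONLY, while a type added with toy_add_type() can gain
members.  Calling toy_serialize() twice in a row yields identical buffers,
because nothing in the dict changes between the calls.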

The last couple of patches, one due to Nicholas Vinson and the other very
similar to one he wrote, fix bugs that break building with recent LLD (LLD
is stricter than GNU ld with respect to version scripts these days).

The usual giant pile of tests has been run: all look happy. I'm going to
run the trybot over it shortly.

I'll apply it in a couple of days if nobody says otherwise.

Cc: Nicholas Vinson <nvinson234@gmail.com>

Nicholas Vinson (1):
  libctf: Remove undefined functions from ver. map

Nick Alcock (21):
  binutils, objdump: Add --ctf-parent-section
  libctf: don't leak the symbol name in the name->type cache
  libctf: remove static/dynamic name lookup distinction
  libctf: fix name lookup in dicts containing base-type bitfields
  libctf: support addition of types to dicts read via ctf_open()
  libctf: fix a comment
  libctf: delete LCTF_DIRTY
  libctf: fix a comment typo
  libctf: rename ctf_dict.ctf_{symtab,strtab}
  Revert "libctf: do not corrupt strings across ctf_serialize"
  libctf: replace 'pending refs' abstraction
  libctf: rethink strtab writeout
  libctf: make ctf_serialize() actually serialize
  libctf: fix tiny dumping error
  libctf: improve handling of type dumping errors
  libctf: make ctf_lookup of symbols by name work in more cases
  libctf: fix a debugging typo
  libctf: add rewriting tests
  libctf: fix leak in test
  libctf: don't pass errno into ctf_err_warn so often
  libctf: do not include undefined functions in libctf.ver

 binutils/doc/ctf.options.texi                 |  10 +
 binutils/objdump.c                            |  56 +-
 libctf/configure                              |  21 +-
 libctf/configure.ac                           |  21 +-
 libctf/ctf-archive.c                          |   9 +-
 libctf/ctf-create.c                           | 252 ++++---
 libctf/ctf-dedup.c                            |   8 +-
 libctf/ctf-dump.c                             |  10 +-
 libctf/ctf-hash.c                             | 112 +---
 libctf/ctf-impl.h                             | 116 ++--
 libctf/ctf-link.c                             |  38 +-
 libctf/ctf-lookup.c                           | 372 +++++++----
 libctf/ctf-open.c                             | 341 +++++-----
 libctf/ctf-serialize.c                        | 406 +++++-------
 libctf/ctf-string.c                           | 620 ++++++++++++------
 libctf/ctf-subr.c                             |   6 +-
 libctf/ctf-types.c                            |  46 +-
 libctf/ctf-util.c                             |  13 -
 libctf/libctf.ver                             |   5 +-
 .../libctf-lookup/add-to-opened-ctf.c         |  19 +
 .../testsuite/libctf-lookup/add-to-opened.c   | 148 +++++
 .../testsuite/libctf-lookup/add-to-opened.lk  |   3 +
 .../libctf-lookup/conflicting-type-syms.c     |   4 +
 .../libctf-regression/gzrewrite-ctf.c         |  19 +
 .../testsuite/libctf-regression/gzrewrite.c   | 165 +++++
 .../testsuite/libctf-regression/gzrewrite.lk  |   3 +
 libctf/testsuite/libctf-regression/zrewrite.c | 156 +++++
 .../testsuite/libctf-regression/zrewrite.lk   |   3 +
 .../libctf-bitfield-name-lookup.c             | 136 ++++
 .../libctf-bitfield-name-lookup.lk            |   1 +
 30 files changed, 1983 insertions(+), 1136 deletions(-)
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened-ctf.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.lk
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite-ctf.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.lk
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.lk
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk
  

Comments

Nick Alcock April 19, 2024, 3:51 p.m. UTC | #1
On 17 Apr 2024, Nick Alcock stated:

> I'll apply it in a couple of days if nobody says otherwise.
>
> Cc: Nicholas Vinson <nvinson234@gmail.com>
>
> Nicholas Vinson (1):
>   libctf: Remove undefined functions from ver. map
>
> Nick Alcock (21):
>   binutils, objdump: Add --ctf-parent-section
>   libctf: don't leak the symbol name in the name->type cache
>   libctf: remove static/dynamic name lookup distinction
>   libctf: fix name lookup in dicts containing base-type bitfields
>   libctf: support addition of types to dicts read via ctf_open()
>   libctf: fix a comment
>   libctf: delete LCTF_DIRTY
>   libctf: fix a comment typo
>   libctf: rename ctf_dict.ctf_{symtab,strtab}
>   Revert "libctf: do not corrupt strings across ctf_serialize"
>   libctf: replace 'pending refs' abstraction
>   libctf: rethink strtab writeout
>   libctf: make ctf_serialize() actually serialize
>   libctf: fix tiny dumping error
>   libctf: improve handling of type dumping errors
>   libctf: make ctf_lookup of symbols by name work in more cases
>   libctf: fix a debugging typo
>   libctf: add rewriting tests
>   libctf: fix leak in test
>   libctf: don't pass errno into ctf_err_warn so often
>   libctf: do not include undefined functions in libctf.ver

This is pushed now, exactly as posted here except for a couple of tiny GNU
style fixes (the one pointed out by Alan, and a few similar ones in a
couple of other commits).