[RFC] manual: Document how types change

Message ID 8734m4n1ij.fsf@oldenburg3.str.redhat.com
State Under Review
Delegated to: Arjun Shankar
Series [RFC] manual: Document how types change

Checks

Context                                           Check    Description
redhat-pt-bot/TryBot-apply_patch                  success  Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit                        success  Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-arm      success  Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm      success  Test passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64  success  Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64  success  Test passed

Commit Message

Florian Weimer Sept. 12, 2024, 6:57 p.m. UTC
  I said I would put together some writeup about extension mechanisms.
Here's what I got so far.  Do you think this is helpful?

Thanks,
Florian
---
 manual/changing-types.texi | 265 +++++++++++++++++++++++++++++++++++++++++++++
 manual/filesys.texi        |  18 +++
 manual/intro.texi          |   3 +
 manual/resource.texi       |   3 +
 manual/time.texi           |   8 +-
 5 files changed, 296 insertions(+), 1 deletion(-)


base-commit: c9154cad66aa0b11ede62cc9190d3485c5ef6941
  

Comments

Joseph Myers Sept. 12, 2024, 7:13 p.m. UTC | #1
I'm not sure if it should be mentioned in this change, but getsockopt is 
one of the interfaces with a size argument, with at least one case where 
this uses a structure that has grown (though not recently): struct 
tcp_info (where glibc hasn't expanded its definition since 2007 for more 
recent kernel changes, but I think adding the members from more recent 
kernels would be consistent with how this structure is expected to be 
used).
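
To illustrate the getsockopt point, here is a minimal sketch of how an
application might use the returned length to decide which struct tcp_info
members the kernel actually wrote.  The helper names
(tcp_info_member_present, TCP_INFO_HAS, query_tcp_info,
total_retrans_or_minus1) are hypothetical and not part of glibc or the
kernel; the sketch assumes Linux and only touches tcpi_total_retrans,
which is present in the current glibc definition.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: nonzero if the bytes [offset, offset + size)
   lie entirely within the first LEN bytes of the structure.  */
int
tcp_info_member_present (socklen_t len, size_t offset, size_t size)
{
  return (size_t) len >= offset + size;
}

/* Hypothetical convenience macro for a named struct tcp_info member.  */
#define TCP_INFO_HAS(len, member) \
  tcp_info_member_present ((len), offsetof (struct tcp_info, member), \
                           sizeof (((struct tcp_info *) 0)->member))

/* Query TCP_INFO for FD.  On input the length is the size of the
   structure known at compile time; on success *LEN holds the number
   of bytes the kernel actually wrote.  Returns the getsockopt result.  */
int
query_tcp_info (int fd, struct tcp_info *info, socklen_t *len)
{
  assert (info != NULL && len != NULL);
  memset (info, 0, sizeof (*info));
  *len = sizeof (*info);
  return getsockopt (fd, IPPROTO_TCP, TCP_INFO, info, len);
}

/* Return tcpi_total_retrans, or -1 if the kernel did not write it.  */
long long
total_retrans_or_minus1 (const struct tcp_info *info, socklen_t len)
{
  if (!TCP_INFO_HAS (len, tcpi_total_retrans))
    return -1;
  return info->tcpi_total_retrans;
}
```

A caller would pass a TCP socket to query_tcp_info and then consult
total_retrans_or_minus1 with the returned length.  The same check should
carry over unchanged to any members added to the glibc definition later,
since it depends only on the member offset and the length reported at
run time.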
  

Patch

diff --git a/manual/changing-types.texi b/manual/changing-types.texi
new file mode 100644
index 0000000000..f0366ee060
--- /dev/null
+++ b/manual/changing-types.texi
@@ -0,0 +1,265 @@ 
+@node Changing Types
+@subsection Types That Can Change Layout
+
+@cindex type compatibility
+@cindex type extension
+@cindex extensible types
+@cindex ABI changes
+
+Some types defined by @theglibc{} can change their layouts, impacting
+application compatibility.  There are three major causes of such
+changes.
+
+@enumerate 1
+@item
+A change due to the way application code is compiled, caused by
+preprocessor macros, compiler flags, and linker flags.
+
+@item
+A change over time as new features are developed, either within
+@theglibc{} itself, or by the kernel.
+
+@item
+At run time, some types are interpreted differently from their
+definition in @theglibc{} header files, technically leading to aliasing
+violations.
+@end enumerate
+
+The following sections describe these categories in more detail, and
+provide guidance to application and library authors on how to prepare
+for such type changes.
+
+@menu
+* Types and Build Configuration::   Compilation settings and type changes.
+* Extending Types::                 Types used in extensible interfaces.
+* Type-Based Aliasing Violations::  Type interpretation changing at run time.
+@end menu
+
+@node Types and Build Configuration
+@subsubsection Types depending on build configuration
+
+@Theglibc{} defines certain types differently depending on the build
+configuration.
+
+For application usage, this variance is usually not a problem, as long
+as the entire application is compiled with the same build flags.
+However, using impacted types in public headers (including those
+installed for system-wide use) can cause issues because different
+applications may well use different build configurations, and the
+non-header bits (static archives or dynamic shared objects) typically
+assume one specific definition of the type.
+
+Even if their types depend on the build configuration, variables and
+other objects can usually be copied freely.  The restrictions on copying
+resulting from the type changes described in the following sections do
+not apply.
+
+@subsubheading Time-dependent types and file system times
+
+The types @code{time_t} (@pxref{Time Types}), @code{off_t} (@pxref{File
+Position Primitive}), @code{ino_t} and @code{blkcnt_t} (@pxref{Attribute
+Meanings}) change size based on feature test macros.  @xref{Feature Test
+Macros}.
+
+Types that use these types change their layout as well.  Changes for
+@code{time_t} affect @code{struct timespec}, @code{struct timeval}, and
+@code{struct rusage}.  Changes to @code{off_t}, @code{ino_t}, and
+@code{blkcnt_t} further affect @code{struct stat} (@pxref{Attribute
+Meanings}), @code{struct dirent} (@pxref{Directory Entries}), and
+@code{struct flock} (@pxref{File Locks}).  The layout of @code{struct
+rlimit} changes as well (@pxref{Limits on Resources}).
+
+Instead of @code{time_t}, @code{off_t}, @code{ino_t}, or @code{blkcnt_t},
+use @code{uint64_t} or @code{int64_t} in public header
+files.  @xref{Integers}.  The type @code{struct statx_timestamp} defined
+in @code{<sys/stat.h>} could be used in place of @code{struct timespec}
+and @code{struct timeval}, or you could supply your own time type.  The
+type @code{struct flock64} is an invariant version of @code{struct flock}.
+
+@Theglibc{} does not define fully invariant replacements for
+@code{struct dirent} and @code{struct stat}.  The type @code{struct
+statx} is subject to extensions (@pxref{Extending Types}), and
+@code{struct stat64} still depends on the size of @code{time_t}.  The
+type @code{struct dirent64} has aliasing issues (@pxref{Type-Based
+Aliasing Violations}).  If data from these types need to be exported in
+public headers, it is recommended to define separate types.
+
+@subsubheading Floating point types
+
+On some architectures, @theglibc{} supports multiple variants of the
+@code{long double} type, typically @code{long double} in binary64 IEEE
+representation and @code{long double} in binary128 IEEE representation,
+and potentially an architecture-dependent format distinct from both.
+Variants of @code{long double} can be chosen by architecture-specific C
+and C++ compiler flags, and the @theglibc{} header files automatically
+adjust function declarations to redirect to the appropriate implementation
+where such a choice is supported.  On some architectures, @theglibc{}
+supports linking with @code{-lnldbl} (``no long double'') to select
+@code{double} (binary64) as the implementation type for @code{long double},
+regardless of compiler support.
+
+Using @code{long double} in installed headers is best avoided.  If
+@code{double} does not provide sufficient precision or range, consider
+using @code{_Float128} instead.  If @code{long double} support in
+installed headers is required, it is necessary to use
+architecture-specific compiler flags to adjust function declarations, so
+that references in applications are redirected to the appropriate
+implementations (similar to what @theglibc{} does in its header files).
+
+For C++, it is possible to use C++ name mangling to achieve the
+appropriate redirects.  (The @code{long double} variants have different
+mangling.)  To get different mangling if the use of @code{long double}
+is nested in another type, it may be necessary to pass the @code{long
+double} type as a template parameter.
+
+@subsubheading Unsupported type changes
+
+@Theglibc{} is incompatible with applying @samp{#pragma pack} before
+including system headers, and with compiler options that achieve the
+same thing (such as @option{-fpack-struct}).  Options that change sizes
+of enum types (@option{-fshort-enums}), @code{wchar_t}
+(@option{-fshort-wchar}), and similar ABI-changing options are
+unsupported as well.
+
+@node Extending Types
+@subsubsection Extension Mechanisms for Types
+
+For structure types, extending them means adding additional fields,
+usually at the end.
+
+Application code needs to be aware of the approach used to detect the
+presence of fields that are optional (either sizes, flags or type codes,
+as described below).  In general, it is safe to expose such extensible
+types in public, installed header files, as long as the data is
+referenced by a pointer to an extensible type, and not a copy.
+Applications using the pointer still need to follow the rules of the
+extension mechanism, and the library interface needs to provide enough
+context if the structure itself does not contain the required
+information.
+
+Embedding extensible types in the middle of structures must be avoided
+because future extensions will shift the following fields, changing the
+application binary interface (ABI).
+
+Copying extensible structures requires additional steps.  The copy needs
+to be adjusted so that according to the extension protocol in use, only
+the parts that have been copied (because they were known at compile
+time) are present.  In some cases, future versions of the type might
+include pointers, and copying code that is unaware of them will
+incorrectly create shallow copies only.  It may be preferable to avoid
+creating copies and use pointers instead.
+
+@subsubheading Managing extensions with explicit structure sizes
+
+Size information can show which members are present.  Such information
+serves two different purposes: specifying the amount of space that was
+actually allocated (given the definition of the structure used at
+compile time), and the part of the structure that is actually used at
+run time.  Size information can be kept separate, or directly included
+in the structure.
+
+An example of included size information is @code{struct sched_attr}.
+@xref{Extensible Scheduling}.  Separate size information is used with
+the @code{struct sockaddr} type family.  @xref{Address Formats} and
+@ref{Reading Address}.
+
+@subsubheading Managing extensions with flags
+
+Similar to the size indicator, flags can specify which fields are
+allocated, and which have been used.  An example is @code{struct statx},
+which uses a hybrid scheme: the @code{statx} function receives a mask
+argument that specifies which information is requested (and which
+structure fields therefore must have been allocated), and the
+@code{stx_mask} member shows which fields have actually been populated.
+
+@subsubheading Type codes and tagged unions
+
+@cindex tagged unions
+Some structures have a code or version field that provides information
+about which fields are available, or which union members are active.  The
+latter approach is sometimes referred to as @dfn{tagged unions}.
+
+Type codes alone do not provide full extensibility because any required
+size information cannot be derived from the type code alone.
+
+Type codes are used in @code{struct sockaddr} (the @code{sa_family}
+field indicates which concrete type to use for access) and
+@code{siginfo_t} (the @code{si_code} field indicates which union members
+are active).  The @code{_r_debug} structure variable (defined in
+@code{link.h}) has an @code{r_version} field that describes available
+extensions.
+
+@subsubheading Extensions as implementation internals
+
+Internal implementation details of @theglibc{} are exposed in installed
+header files, to allow access to these details from macros and inline
+functions, as a performance optimization, or for use in initializers.
+The @code{FILE} type is an example that is made available for use in
+inline functions.  @xref{Streams}.  The type @code{pthread_mutex_t} is
+extensible and supports initializers such as
+@code{PTHREAD_MUTEX_INITIALIZER}.
+
+Structure members that are reserved for such internal use (and for other
+reasons) have names starting with a @samp{_}.  The interpretation of
+these members can change from one @glibcadj{} version to the next.
+Furthermore, such structures cannot be copied directly (using
+@code{memcpy} or structure assignment) in such a way that the copied
+object can serve as a replacement for the original.
+
+Apart from avoiding undefined direct member access and struct copies, no
+further programmer action is required because the existing compatibility
+mechanisms in @theglibc{} automatically handle type changes.
+
+@subsubheading Storage managed by the library
+
+If @theglibc{} allocates objects of an extensible type, additional
+fields can be added at the end of structs without compatibility impact,
+as long as applications are provided a way to determine if the
+additional fields are actually present.  An example of this approach is
+@code{struct dl_phdr_info}, as used with the @code{dl_iterate_phdr}
+function.  The callback function passed to @code{dl_iterate_phdr}
+receives the current structure size in a separate argument.
+
+@subsubheading Type change management via symbol versions
+
+Some types (or their interpretation) changed over time, and @theglibc{}
+uses symbol versioning to maintain support for old applications that
+were compiled using the old type definition.  For example, this happened
+with @code{timer_t}, @code{cpu_set_t} (@pxref{CPU Affinity}), and on
+some architectures, @code{jmp_buf} (@pxref{Non-Local Details}).  While
+the existing compatibility symbol versions remain, it is expected that
+future type changes will not use this mechanism because the degree of
+compatibility that can be achieved in practice is quite limited.
+
+@node Type-Based Aliasing Violations
+@subsubsection Run-time Type Mismatches and Type-Based Aliasing Violations
+
+In some cases the dynamic type of an object created by @theglibc{} is
+different from the type revealed to the application, or vice versa.
+
+For example, the @code{_r_debug} variable can have type @code{struct
+r_debug_extension} instead of type @code{struct r_debug} if the value
+of the @code{r_version} field is 2 instead of 1.  The same applies to
+the @code{struct sockaddr} family of types (@pxref{Address Formats}):
+applications can reserve space using @code{struct sockaddr_storage}, and
+use @code{getsockname} to fill in data for a concrete socket address type.
+(Both cases go beyond the rules for common initial field sequences in
+structure members of unions, so they are technically type-based aliasing
+violations.)
+
+These mismatches also occur if an extensible type (@pxref{Extending
+Types}) has a different definition in an application or library on the
+one side, and in @theglibc{} on the other.  This can happen if software
+is built against one version of @theglibc{}, but runs on a different
+system that uses another version.  Technically, this is a violation of
+the C language's one definition rule.  Due to separate compilation, it
+works if the type extension mechanisms described before are used.
+
+A different case of a run-time type mismatch involves @code{struct
+dirent} (@pxref{Directory Entries}) and @code{struct dirent64}: the
+@code{d_name} field contents may overflow the specified fixed array
+length.  @xref{Reading/Closing Directory}.
+
+Like with types changed in size (@pxref{Extending Types}), it is
+complicated to copy objects of these types.  When sharing objects of
+such types, it may be necessary to pass around pointers instead.
diff --git a/manual/filesys.texi b/manual/filesys.texi
index aabb68385b..202bff7e35 100644
--- a/manual/filesys.texi
+++ b/manual/filesys.texi
@@ -409,6 +409,11 @@  entries.  It contains the following fields:
 This is the null-terminated file name component.  This is the only
 field you can count on in all POSIX systems.
 
+In @theglibc{}, the @code{d_name} field is defined with a fixed length,
+but the actual length used at run time may exceed that.
+@xref{Type-Based Aliasing Violations} and @ref{Reading/Closing
+Directory}.
+
 @item ino_t d_fileno
 This is the file serial number.  For BSD compatibility, you can also
 refer to this member as @code{d_ino}.  On @gnulinuxhurdsystems{} and most POSIX
@@ -473,6 +478,9 @@  This returns the @code{st_mode} value corresponding to @var{dtype}.
 @end deftypefun
 @end table
 
+This structure may change layout based on build configuration.
+@xref{Types and Build Configuration}.
+
 This structure may contain additional members in the future.  Their
 availability is always announced in the compilation environment by a
 macro named @code{_DIRENT_HAVE_D_@var{xxx}} where @var{xxx} is replaced
@@ -2011,6 +2019,9 @@  The optimal block size for reading or writing this file, in bytes.  You
 might use this size for allocating the buffer space for reading or
 writing the file.  (This is unrelated to @code{st_blocks}.)
 @end table
+
+The @code{struct stat} type may change layout based on build
+configuration.  @xref{Types and Build Configuration}.
 @end deftp
 
 The extensions for the Large File Support (LFS) require, even on 32-bit
@@ -2104,6 +2115,9 @@  Here is a list of them.
 This is an integer data type used to represent file modes.  In
 @theglibc{}, this is an unsigned type no narrower than @code{unsigned
 int}.
+
+The @code{struct stat64} type may change layout based on build
+configuration.  @xref{Types and Build Configuration}.
 @end deftp
 
 @cindex inode number
@@ -2115,6 +2129,7 @@  In @theglibc{}, this type is no narrower than @code{unsigned int}.
 
 If the source is compiled with @code{_FILE_OFFSET_BITS == 64} this type
 is transparently replaced by @code{ino64_t}.
+@xref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} ino64_t
@@ -2125,6 +2140,7 @@  for the use in LFS.  In @theglibc{}, this type is no narrower than
 
 When compiling with @code{_FILE_OFFSET_BITS == 64} this type is
 available under the name @code{ino_t}.
+@xref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} dev_t
@@ -2145,6 +2161,7 @@  In @theglibc{}, this type is no narrower than @code{int}.
 
 If the source is compiled with @code{_FILE_OFFSET_BITS == 64} this type
 is transparently replaced by @code{blkcnt64_t}.
+@xref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} blkcnt64_t
@@ -2154,6 +2171,7 @@  use in LFS.  In @theglibc{}, this type is no narrower than @code{int}.
 
 When compiling with @code{_FILE_OFFSET_BITS == 64} this type is
 available under the name @code{blkcnt_t}.
+@xref{Types and Build Configuration}.
 @end deftp
 
 @node Reading Attributes
diff --git a/manual/intro.texi b/manual/intro.texi
index 879c1b38d9..f17d687af7 100644
--- a/manual/intro.texi
+++ b/manual/intro.texi
@@ -990,6 +990,7 @@  This section describes some of the practical issues involved in using
 * Reserved Names::              The C standard reserves some names for
                                  the library, and some for users.
 * Feature Test Macros::         How to control what names are defined.
+* Changing Types::              Types that can change their layouts.
 @end menu
 
 @node Header Files, Macro Definitions,  , Using the Library
@@ -1292,6 +1293,8 @@  The header file @file{termios.h} reserves names prefixed with @samp{c_},
 @comment Include the section on Creature Nest Macros.
 @include creature.texi
 
+@include changing-types.texi
+
 @node Roadmap to the Manual,  , Using the Library, Introduction
 @section Roadmap to the Manual
 
diff --git a/manual/resource.texi b/manual/resource.texi
index 612520d4d9..ad260502eb 100644
--- a/manual/resource.texi
+++ b/manual/resource.texi
@@ -244,6 +244,8 @@  The maximum limit.
 
 For @code{getrlimit}, the structure is an output; it receives the current
 values.  For @code{setrlimit}, it specifies the new values.
+This type changes layout based on build configuration.
+@xref{Types and Build Configuration}.
 @end deftp
 
 For the LFS functions a similar type is defined in @file{sys/resource.h}.
@@ -980,6 +982,7 @@  to @code{struct sched_attr} at the end, so that the size of this data
 type changes.  Do not use it in places where this matters, such as
 structure fields in installed header files, where such a change could
 impact the application binary interface (ABI).
+@xref{Extending Types}.
 
 The following generic fields are available.
 
diff --git a/manual/time.texi b/manual/time.texi
index 64aad8fdc5..b3e16f900a 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -163,7 +163,7 @@  and UTC will become obsolete.
 Currently the @code{time_t} type is 64 bits wide on all platforms
 supported by @theglibc{}, except that it is 32 bits wide on a few
 older platforms unless you define @code{_TIME_BITS} to 64.
-@xref{Feature Test Macros}.
+@xref{Feature Test Macros} and @ref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} {struct timespec}
@@ -189,6 +189,9 @@  equal to zero, and less than 1,000,000,000.
 When @code{struct timespec} values are supplied to @glibcadj{}
 functions, the value in this field must be in the same range.
 @end table
+
+The type @code{struct timespec} varies with build configuration on some
+architectures.  @xref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} {struct timeval}
@@ -216,6 +219,9 @@  equal to zero, and less than 1,000,000.
 When @code{struct timeval} values are supplied to @glibcadj{}
 functions, the value in this field must be in the same range.
 @end table
+
+The type @code{struct timeval} varies with build configuration on some
+architectures.  @xref{Types and Build Configuration}.
 @end deftp
 
 @deftp {Data Type} {struct tm}