[v2] IPA: Provide a mechanism to register static DTORs via cxa_atexit.

Message ID 3846F6D3-7701-4162-89A4-FB7C252C802F@gmail.com
State New
Headers
Series [v2] IPA: Provide a mechanism to register static DTORs via cxa_atexit. |

Commit Message

Iain Sandoe Nov. 5, 2021, 1:08 p.m. UTC
  I tried enabling this on x86-64-linux (just for interest) and it seems to work
OK there too - but that testing revealed a thinko that didn’t show with a
a normal regstrap.

----
 
 Changes from original post:
 1. amended a comment
 2. fixed a thinko where I was not allowing for functions declared as
    *both* CTOR and DTOR.

For at least one target (Darwin) the platform convention is to
register static destructors (i.e. __attribute__((destructor)))
with __cxa_atexit rather than placing them into a list that is
run by some other mechanism.

This patch provides a target hook that allows a target to opt
into this and handling for the process in ipa_cdtor_merge ().

When the mode is enabled (dtors_from_cxa_atexit is set) we:

 * Generate new CTORs to register static destructors with
   __cxa_atexit and add them to the existing list of CTORs;
   we then process the revised CTORs list.

 * We sort the DTORs into priority and then TU order, this
   means that they are registered in that order with
   __cxa_atexit () and therefore will be run in the reverse
   order.

 * Likewise, CTORs are sorted into priority and then TU order,
   which means that they will run in that order.

This matches the behavior of using init/fini (or
mod_init_func/mod_term_func) sections.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
	* ipa.c (ipa_discover_variable_flags):
	(cgraph_build_static_cdtor_1): Return the built function
	decl.
	(build_cxa_atexit_decl): New.
	(build_dso_handle_decl): New.
	(build_cxa_dtor_registrations): New.
	(compare_cdtor_tu_order): New.
	(build_cxa_atexit_fns): New.
	(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
	process the DTORs/CTORs accordingly.
	(pass_ipa_cdtor_merge::gate): Also run if
	dtors_from_cxa_atexit is set.
	* target.def (dtors_from_cxa_atexit): New hook.
---
 gcc/config/darwin.h |   5 ++
 gcc/doc/tm.texi     |   8 ++
 gcc/doc/tm.texi.in  |   2 +
 gcc/ipa.c           | 201 +++++++++++++++++++++++++++++++++++++++++++-
 gcc/target.def      |  10 +++
 5 files changed, 222 insertions(+), 4 deletions(-)
  

Comments

Iain Sandoe Nov. 5, 2021, 1:25 p.m. UTC | #1
sheesh … EWRONGREVISEDPATCH

> On 5 Nov 2021, at 13:08, Iain Sandoe <iains.gcc@gmail.com> wrote:
> 
> I tried enabling this on x86-64-linux (just for interest) and it seems to work
> OK there too - but that testing revealed a thinko that didn’t show with a
> a normal regstrap.

… now with the correct patch.

[PATCH v2] IPA: Provide a mechanism to register static DTORs via
 cxa_atexit.

For at least one target (Darwin) the platform convention is to
register static destructors (i.e. __attribute__((destructor)))
with __cxa_atexit rather than placing them into a list that is
run by some other mechanism.

This patch provides a target hook that allows a target to opt
into this and handling for the process in ipa_cdtor_merge ().

When the mode is enabled (dtors_from_cxa_atexit is set) we:

 * Generate new CTORs to register static destructors with
   __cxa_atexit and add them to the existing list of CTORs;
   we then process the revised CTORs list.

 * We sort the DTORs into priority and then TU order, this
   means that they are registered in that order with
   __cxa_atexit () and therefore will be run in the reverse
   order.

 * Likewise, CTORs are sorted into priority and then TU order,
   which means that they will run in that order.

This matches the behavior of using init/fini (or
mod_init_func/mod_term_func) sections.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
	* ipa.c (ipa_discover_variable_flags):
	(cgraph_build_static_cdtor_1): Return the built function
	decl.
	(build_cxa_atexit_decl): New.
	(build_dso_handle_decl): New.
	(build_cxa_dtor_registrations): New.
	(compare_cdtor_tu_order): New.
	(build_cxa_atexit_fns): New.
	(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
	process the DTORs/CTORs accordingly.
	(pass_ipa_cdtor_merge::gate): Also run if
	dtors_from_cxa_atexit is set.
	* target.def (dtors_from_cxa_atexit): New hook.
---
 gcc/config/darwin.h |   7 +-
 gcc/doc/tm.texi     |   8 ++
 gcc/doc/tm.texi.in  |   2 +
 gcc/ipa.c           | 200 +++++++++++++++++++++++++++++++++++++++++++-
 gcc/target.def      |  10 +++
 5 files changed, 222 insertions(+), 5 deletions(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 27cb3e4bb30..2b19fb7c085 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -54,6 +54,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #define DO_GLOBAL_DTORS_BODY
 
+/* Register static destructors to run from __cxa_atexit instead of putting
+   them into a .mod_term_funcs section.  */
+
+#define TARGET_DTORS_FROM_CXA_ATEXIT true
+
 /* The string value for __SIZE_TYPE__.  */
 
 #ifndef SIZE_TYPE
@@ -1160,7 +1165,7 @@ extern void darwin_driver_init (unsigned int *,struct cl_decoded_option **);
 
 /* The Apple assembler and linker do not support constructor priorities.  */
 #undef SUPPORTS_INIT_PRIORITY
-#define SUPPORTS_INIT_PRIORITY 0
+#define SUPPORTS_INIT_PRIORITY 1
 
 #undef STACK_CHECK_STATIC_BUILTIN
 #define STACK_CHECK_STATIC_BUILTIN 1
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 990152f5b15..0a4df18b825 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9233,6 +9233,14 @@ collecting constructors and destructors to be run at startup and exit.
 It is false if we must use @command{collect2}.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_DTORS_FROM_CXA_ATEXIT
+This value is true if the target wants destructors to be queued to be
+run from __cxa_atexit.  If this is the case then, for each priority level,
+a new constructor will be entered that registers the destructors for that
+level with __cxa_atexit (and there will be no destructors emitted).
+It is false the method implied by @code{have_ctors_dtors} is used.
+@end deftypevr
+
 @deftypefn {Target Hook} void TARGET_ASM_CONSTRUCTOR (rtx @var{symbol}, int @var{priority})
 If defined, a function that outputs assembler code to arrange to call
 the function referenced by @var{symbol} at initialization time.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 193c9bdd853..c733f356fe4 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6021,6 +6021,8 @@ encountering an @code{init_priority} attribute.
 
 @hook TARGET_HAVE_CTORS_DTORS
 
+@hook TARGET_DTORS_FROM_CXA_ATEXIT
+
 @hook TARGET_ASM_CONSTRUCTOR
 
 @hook TARGET_ASM_DESTRUCTOR
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 4f62ac183ee..325b658b55e 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -837,7 +837,7 @@ ipa_discover_variable_flags (void)
    FINAL specify whether the externally visible name for collect2 should
    be produced. */
 
-static void
+static tree
 cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final,
 			     tree optimization,
 			     tree target)
@@ -916,6 +916,7 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final,
 
   set_cfun (NULL);
   current_function_decl = NULL;
+  return decl;
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
@@ -1022,6 +1023,124 @@ build_cdtor (bool ctor_p, const vec<tree> &cdtors)
     }
 }
 
+/* Helper functions for build_cxa_dtor_registrations ().
+   Build a decl for __cxa_atexit ().  */
+
+static tree
+build_cxa_atexit_decl ()
+{
+  /* The parameter to "__cxa_atexit" is "void (*)(void *)".  */
+  tree fn_type = build_function_type_list (void_type_node,
+					   ptr_type_node, NULL_TREE);
+  tree fn_ptr_type = build_pointer_type (fn_type);
+  /* The declaration for `__cxa_atexit' is:
+     int __cxa_atexit (void (*)(void *), void *, void *).  */
+  const char *name = "__cxa_atexit";
+  tree cxa_name = get_identifier (name);
+  fn_type = build_function_type_list (integer_type_node, fn_ptr_type,
+				      ptr_type_node, ptr_type_node, NULL_TREE);
+  tree atexit_fndecl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
+				   cxa_name, fn_type);
+  SET_DECL_ASSEMBLER_NAME (atexit_fndecl, cxa_name);
+  DECL_VISIBILITY (atexit_fndecl) = VISIBILITY_DEFAULT;
+  DECL_VISIBILITY_SPECIFIED (atexit_fndecl) = true;
+  set_call_expr_flags (atexit_fndecl, ECF_LEAF | ECF_NOTHROW);
+  TREE_PUBLIC (atexit_fndecl) = true;
+  DECL_EXTERNAL (atexit_fndecl) = true;
+  DECL_ARTIFICIAL (atexit_fndecl) = true;
+  return atexit_fndecl;
+}
+
+/* Build a decl for __dso_handle.  */
+
+static tree
+build_dso_handle_decl ()
+{
+  /* Declare the __dso_handle variable.  */
+  tree dso_handle_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+				     get_identifier ("__dso_handle"),
+				     ptr_type_node);
+  TREE_PUBLIC (dso_handle_decl) = true;
+  DECL_EXTERNAL (dso_handle_decl) = true;
+  DECL_ARTIFICIAL (dso_handle_decl) = true;
+#ifdef HAVE_GAS_HIDDEN
+  if (dso_handle_decl != error_mark_node)
+    {
+      DECL_VISIBILITY (dso_handle_decl) = VISIBILITY_HIDDEN;
+      DECL_VISIBILITY_SPECIFIED (dso_handle_decl) = true;
+    }
+#endif
+  return dso_handle_decl;
+}
+
+/*  This builds one or more constructor functions that register DTORs with
+    __cxa_atexit ().  Within a priority level, DTORs are registered in TU
+    order - which means that they will run in reverse TU order from cxa_atexit.
+    This is the same behavior as using a .fini / .mod_term_funcs section.
+    As the functions are built, they are appended to the CTORs vector.  */
+
+static void
+build_cxa_dtor_registrations (const vec<tree> &dtors, vec<tree> *ctors)
+{
+  size_t i,j;
+  size_t len = dtors.length ();
+
+  location_t sav_loc = input_location;
+  input_location = UNKNOWN_LOCATION;
+
+  tree atexit_fndecl = build_cxa_atexit_decl ();
+  tree dso_handle_decl = build_dso_handle_decl ();
+
+  /* We want &__dso_handle.  */
+  tree dso_ptr = build1_loc (UNKNOWN_LOCATION, ADDR_EXPR,
+			     ptr_type_node, dso_handle_decl);
+
+  i = 0;
+  while (i < len)
+    {
+      priority_type priority = 0;
+      tree body = NULL_TREE;
+      j = i;
+      do
+	{
+	  priority_type p;
+	  tree fn = dtors[j];
+	  p = DECL_FINI_PRIORITY (fn);
+	  if (j == i)
+	    priority = p;
+	  else if (p != priority)
+	    break;
+	  j++;
+	}
+      while (j < len);
+
+      /* Find the next batch of destructors with the same initialization
+	 priority.  */
+      for (;i < j; i++)
+	{
+	  tree fn = dtors[i];
+	  DECL_STATIC_DESTRUCTOR (fn) = 0;
+	  tree dtor_ptr = build1_loc (UNKNOWN_LOCATION, ADDR_EXPR,
+				      ptr_type_node, fn);
+	  tree call_cxa_atexit
+	    = build_call_expr_loc (UNKNOWN_LOCATION, atexit_fndecl, 3,
+				   dtor_ptr, null_pointer_node, dso_ptr);
+	  TREE_SIDE_EFFECTS (call_cxa_atexit) = 1;
+	  append_to_statement_list (call_cxa_atexit, &body);
+	}
+
+      gcc_assert (body != NULL_TREE);
+      /* Generate a function to register the DTORs at this priority.  */
+      tree new_ctor
+	= cgraph_build_static_cdtor_1 ('I', body, priority, true,
+				       DECL_FUNCTION_SPECIFIC_OPTIMIZATION (dtors[0]),
+				       DECL_FUNCTION_SPECIFIC_TARGET (dtors[0]));
+      /* Add this to the list of ctors.  */
+      ctors->safe_push (new_ctor);
+    }
+  input_location = sav_loc;
+}
+
 /* Comparison function for qsort.  P1 and P2 are actually of type
    "tree *" and point to static constructors.  DECL_INIT_PRIORITY is
    used to determine the sort order.  */
@@ -1071,7 +1190,46 @@ compare_dtor (const void *p1, const void *p2)
   else if (priority1 > priority2)
     return 1;
   else
-    /* Ensure a stable sort.  */
+    /* Ensure a stable sort - into TU order.  */
+    return DECL_UID (f1) - DECL_UID (f2);
+}
+
+/* Comparison function for qsort.  P1 and P2 are of type "tree *" and point to
+   a pair of static constructors or destructors.  We first sort on the basis of
+   priority and then into TU order (on the strict assumption that DECL_UIDs are
+   ordered in the same way as the original functions).  ???: this seems quite
+   fragile. */
+
+static int
+compare_cdtor_tu_order (const void *p1, const void *p2)
+{
+  tree f1;
+  tree f2;
+  int priority1;
+  int priority2;
+
+  f1 = *(const tree *)p1;
+  f2 = *(const tree *)p2;
+  /* We process the DTORs first, and then remove their flag, so this order
+     allows for functions that are declared as both CTOR and DTOR.  */
+  if (DECL_STATIC_DESTRUCTOR (f1))
+    {
+      gcc_checking_assert (DECL_STATIC_DESTRUCTOR (f2));
+      priority1 = DECL_FINI_PRIORITY (f1);
+      priority2 = DECL_FINI_PRIORITY (f2);
+    }
+  else
+    {
+      priority1 = DECL_INIT_PRIORITY (f1);
+      priority2 = DECL_INIT_PRIORITY (f2);
+    }
+
+  if (priority1 < priority2)
+    return -1;
+  else if (priority1 > priority2)
+    return 1;
+  else
+    /* For equal priority, sort into the order of definition in the TU.  */
     return DECL_UID (f1) - DECL_UID (f2);
 }
 
@@ -1097,6 +1255,37 @@ build_cdtor_fns (vec<tree> *ctors, vec<tree> *dtors)
     }
 }
 
+/* Generate new CTORs to register static destructors with __cxa_atexit and add
+   them to the existing list of CTORs; we then process the revised CTORs list.
+
+   We sort the DTORs into priority and then TU order, this means that they are
+   registered in that order with __cxa_atexit () and therefore will be run in
+   the reverse order.
+
+   Likewise, CTORs are sorted into priority and then TU order, which means that
+   they will run in that order.
+
+   This matches the behavior of using init/fini or mod_init_func/mod_term_func
+   sections.  */
+
+static void
+build_cxa_atexit_fns (vec<tree> *ctors, vec<tree> *dtors)
+{
+  if (!dtors->is_empty ())
+    {
+      gcc_assert (targetm.dtors_from_cxa_atexit);
+      dtors->qsort (compare_cdtor_tu_order);
+      build_cxa_dtor_registrations (*dtors, ctors);
+    }
+
+  if (!ctors->is_empty ())
+    {
+      gcc_assert (targetm.dtors_from_cxa_atexit);
+      ctors->qsort (compare_cdtor_tu_order);
+      build_cdtor (/*ctor_p=*/true, *ctors);
+    }
+}
+
 /* Look for constructors and destructors and produce function calling them.
    This is needed for targets not supporting ctors or dtors, but we perform the
    transformation also at linktime to merge possibly numerous
@@ -1115,7 +1304,10 @@ ipa_cdtor_merge (void)
     if (DECL_STATIC_CONSTRUCTOR (node->decl)
 	|| DECL_STATIC_DESTRUCTOR (node->decl))
        record_cdtor_fn (node, &ctors, &dtors);
-  build_cdtor_fns (&ctors, &dtors);
+  if (targetm.dtors_from_cxa_atexit)
+    build_cxa_atexit_fns (&ctors, &dtors);
+  else
+    build_cdtor_fns (&ctors, &dtors);
   return 0;
 }
 
@@ -1162,7 +1354,7 @@ pass_ipa_cdtor_merge::gate (function *)
   /* Perform the pass when we have no ctors/dtors support
      or at LTO time to merge multiple constructors into single
      function.  */
-  return !targetm.have_ctors_dtors || in_lto_p;
+  return !targetm.have_ctors_dtors || in_lto_p || targetm.dtors_from_cxa_atexit;
 }
 
 } // anon namespace
diff --git a/gcc/target.def b/gcc/target.def
index 87feeec2ea1..6fe1a02a5e8 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6819,6 +6819,16 @@ collecting constructors and destructors to be run at startup and exit.\n\
 It is false if we must use @command{collect2}.",
  bool, false)
 
+/* True if the target wants DTORs to be run from cxa_atexit.  */
+DEFHOOKPOD
+(dtors_from_cxa_atexit,
+ "This value is true if the target wants destructors to be queued to be\n\
+run from __cxa_atexit.  If this is the case then, for each priority level,\n\
+a new constructor will be entered that registers the destructors for that\n\
+level with __cxa_atexit (and there will be no destructors emitted).\n\
+It is false the method implied by @code{have_ctors_dtors} is used.",
+ bool, false)
+
 /* True if thread-local storage is supported.  */
 DEFHOOKPOD
 (have_tls,
  
Jan Hubicka Nov. 13, 2021, 3:16 p.m. UTC | #2
> sheesh … EWRONGREVISEDPATCH
> 
> > On 5 Nov 2021, at 13:08, Iain Sandoe <iains.gcc@gmail.com> wrote:
> > 
> > I tried enabling this on x86-64-linux (just for interest) and it seems to work
> > OK there too - but that testing revealed a thinko that didn’t show with a
> > a normal regstrap.
> 
> … now with the correct patch.
> 
> [PATCH v2] IPA: Provide a mechanism to register static DTORs via
>  cxa_atexit.
> 
> For at least one target (Darwin) the platform convention is to
> register static destructors (i.e. __attribute__((destructor)))
> with __cxa_atexit rather than placing them into a list that is
> run by some other mechanism.
> 
> This patch provides a target hook that allows a target to opt
> into this and handling for the process in ipa_cdtor_merge ().
> 
> When the mode is enabled (dtors_from_cxa_atexit is set) we:
> 
>  * Generate new CTORs to register static destructors with
>    __cxa_atexit and add them to the existing list of CTORs;
>    we then process the revised CTORs list.
> 
>  * We sort the DTORs into priority and then TU order, this
>    means that they are registered in that order with
>    __cxa_atexit () and therefore will be run in the reverse
>    order.
> 
>  * Likewise, CTORs are sorted into priority and then TU order,
>    which means that they will run in that order.
> 
> This matches the behavior of using init/fini (or
> mod_init_func/mod_term_func) sections.
> 
> Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
> 
> gcc/ChangeLog:
> 
> 	* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
> 	* doc/tm.texi: Regenerated.
> 	* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
> 	* ipa.c (ipa_discover_variable_flags):
> 	(cgraph_build_static_cdtor_1): Return the built function
> 	decl.
> 	(build_cxa_atexit_decl): New.
> 	(build_dso_handle_decl): New.
> 	(build_cxa_dtor_registrations): New.
> 	(compare_cdtor_tu_order): New.
> 	(build_cxa_atexit_fns): New.
> 	(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
> 	process the DTORs/CTORs accordingly.
> 	(pass_ipa_cdtor_merge::gate): Also run if
> 	dtors_from_cxa_atexit is set.
> 	* target.def (dtors_from_cxa_atexit): New hook.

OK, thanks!
Honza
  

Patch

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 27cb3e4bb30..5202903f5b2 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -54,6 +54,11 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #define DO_GLOBAL_DTORS_BODY
 
+/* Register static destructors to run from __cxa_atexit instead of putting
+   them into a .mod_term_funcs section.  */
+
+#define TARGET_DTORS_FROM_CXA_ATEXIT true
+
 /* The string value for __SIZE_TYPE__.  */
 
 #ifndef SIZE_TYPE
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 78a1af1ad4d..6ec1d50b3e4 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9210,6 +9210,14 @@  collecting constructors and destructors to be run at startup and exit.
 It is false if we must use @command{collect2}.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_DTORS_FROM_CXA_ATEXIT
+This value is true if the target wants destructors to be queued to be
+run from __cxa_atexit.  If this is the case then, for each priority level,
+a new constructor will be entered that registers the destructors for that
+level with __cxa_atexit (and there will be no destructors emitted).
+It is false the method implied by @code{have_ctors_dtors} is used.
+@end deftypevr
+
 @deftypefn {Target Hook} void TARGET_ASM_CONSTRUCTOR (rtx @var{symbol}, int @var{priority})
 If defined, a function that outputs assembler code to arrange to call
 the function referenced by @var{symbol} at initialization time.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4401550989e..2b9960b73d7 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6015,6 +6015,8 @@  encountering an @code{init_priority} attribute.
 
 @hook TARGET_HAVE_CTORS_DTORS
 
+@hook TARGET_DTORS_FROM_CXA_ATEXIT
+
 @hook TARGET_ASM_CONSTRUCTOR
 
 @hook TARGET_ASM_DESTRUCTOR
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 4f62ac183ee..d234a69b9fe 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -837,7 +837,7 @@  ipa_discover_variable_flags (void)
    FINAL specify whether the externally visible name for collect2 should
    be produced. */
 
-static void
+static tree
 cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final,
 			     tree optimization,
 			     tree target)
@@ -916,6 +916,7 @@  cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final,
 
   set_cfun (NULL);
   current_function_decl = NULL;
+  return decl;
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
@@ -1022,6 +1023,128 @@  build_cdtor (bool ctor_p, const vec<tree> &cdtors)
     }
 }
 
+/* Helper functions for build_cxa_dtor_registrations ().
+   Build a decl for __cxa_atexit ().  */
+
+static tree
+build_cxa_atexit_decl ()
+{
+  /* The parameter to "__cxa_atexit" is "void (*)(void *)".  */
+  tree fn_type = build_function_type_list (void_type_node,
+					   ptr_type_node, NULL_TREE);
+  tree fn_ptr_type = build_pointer_type (fn_type);
+  /* The declaration for `__cxa_atexit' is:
+     int __cxa_atexit (void (*)(void *), void *, void *).  */
+  const char *name = "__cxa_atexit";
+  tree cxa_name = get_identifier (name);
+  fn_type = build_function_type_list (integer_type_node, fn_ptr_type,
+				      ptr_type_node, ptr_type_node, NULL_TREE);
+  tree atexit_fndecl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
+				   cxa_name, fn_type);
+  SET_DECL_ASSEMBLER_NAME (atexit_fndecl, cxa_name);
+  DECL_VISIBILITY (atexit_fndecl) = VISIBILITY_DEFAULT;
+  DECL_VISIBILITY_SPECIFIED (atexit_fndecl) = true;
+  set_call_expr_flags (atexit_fndecl, ECF_LEAF | ECF_NOTHROW);
+  TREE_PUBLIC (atexit_fndecl) = true;
+  DECL_EXTERNAL (atexit_fndecl) = true;
+  DECL_ARTIFICIAL (atexit_fndecl) = true;
+  return atexit_fndecl;
+}
+
+/* Build a decl for __dso_handle.  */
+
+static tree
+build_dso_handle_decl ()
+{
+  /* Declare the __dso_handle variable.  */
+  tree dso_handle_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+				     get_identifier ("__dso_handle"),
+				     ptr_type_node);
+  TREE_PUBLIC (dso_handle_decl) = true;
+  DECL_EXTERNAL (dso_handle_decl) = true;
+  DECL_ARTIFICIAL (dso_handle_decl) = true;
+#ifdef HAVE_GAS_HIDDEN
+  if (dso_handle_decl != error_mark_node)
+    {
+      DECL_VISIBILITY (dso_handle_decl) = VISIBILITY_HIDDEN;
+      DECL_VISIBILITY_SPECIFIED (dso_handle_decl) = true;
+    }
+#endif
+  return dso_handle_decl;
+}
+
+/*  This builds one or more constructor functions that register DTORs with
+    __cxa_atexit ().  Within a priority level, DTORs are registered in TU
+    order - which means that they will run in reverse TU order from cxa_atexit.
+    This is the same behavior as using a .fini / .mod_term_funcs section.
+    As the functions are built, they are appended to the CTORs vector.  */
+
+static void
+build_cxa_dtor_registrations (const vec<tree> &dtors, vec<tree> *ctors)
+{
+  size_t i,j;
+  size_t len = dtors.length ();
+
+  location_t sav_loc = input_location;
+  input_location = UNKNOWN_LOCATION;
+
+  tree atexit_fndecl = build_cxa_atexit_decl ();
+  tree dso_handle_decl = build_dso_handle_decl ();
+
+  /* We want &__dso_handle.  */
+  tree dso_ptr = build1_loc (UNKNOWN_LOCATION, ADDR_EXPR,
+			     ptr_type_node, dso_handle_decl);
+
+  i = 0;
+  while (i < len)
+    {
+      priority_type priority = 0;
+      tree body = NULL_TREE;
+      j = i;
+      do
+	{
+	  priority_type p;
+	  tree fn = dtors[j];
+	  p = DECL_FINI_PRIORITY (fn);
+	  if (j == i)
+	    priority = p;
+	  else if (p != priority)
+	    break;
+	  j++;
+	}
+      while (j < len);
+
+      /* Find the next batch of constructors/destructors with the same
+	 initialization priority.  */
+      for (;i < j; i++)
+	{
+	  tree fn = dtors[i];
+	  DECL_STATIC_DESTRUCTOR (fn) = 0;
+	  tree dtor_ptr = build1_loc (UNKNOWN_LOCATION, ADDR_EXPR,
+				      ptr_type_node, fn);
+	  tree call_cxa_atexit
+	    = build_call_expr_loc (UNKNOWN_LOCATION, atexit_fndecl, 3,
+				   dtor_ptr, null_pointer_node, dso_ptr);
+
+	  /* We do not want to optimize away pure/const calls here.
+	     When optimizing, these should be already removed, when not
+	     optimizing, we want user to be able to breakpoint in them.  */
+	  TREE_SIDE_EFFECTS (call_cxa_atexit) = 1;
+	  append_to_statement_list (call_cxa_atexit, &body);
+	}
+
+      gcc_assert (body != NULL_TREE);
+      /* Generate a function to register the DTORs at this priority.  */
+      tree new_ctor
+	= cgraph_build_static_cdtor_1 ('I', body, priority, true,
+				       DECL_FUNCTION_SPECIFIC_OPTIMIZATION (dtors[0]),
+				       DECL_FUNCTION_SPECIFIC_TARGET (dtors[0]));
+      /* Add this to the list of ctors.  */
+      ctors->safe_push (new_ctor);
+    }
+  input_location = sav_loc;
+}
+
 /* Comparison function for qsort.  P1 and P2 are actually of type
    "tree *" and point to static constructors.  DECL_INIT_PRIORITY is
    used to determine the sort order.  */
@@ -1071,7 +1194,43 @@  compare_dtor (const void *p1, const void *p2)
   else if (priority1 > priority2)
     return 1;
   else
-    /* Ensure a stable sort.  */
+    /* Ensure a stable sort - into TU order.  */
+    return DECL_UID (f1) - DECL_UID (f2);
+}
+
+/* Comparison function for qsort.  P1 and P2 are of type "tree *" and point to
+   a pair of static constructors or destructors.  We first sort on the basis of
+   priority and then into TU order (on the strict assumption that DECL_UIDs are
+   ordered in the same way as the original functions).  ???: this seems quite
+   fragile. */
+
+static int
+compare_cdtor_tu_order (const void *p1, const void *p2)
+{
+  tree f1;
+  tree f2;
+  int priority1;
+  int priority2;
+
+  f1 = *(const tree *)p1;
+  f2 = *(const tree *)p2;
+  if (DECL_STATIC_CONSTRUCTOR (f1))
+    {
+      priority1 = DECL_INIT_PRIORITY (f1);
+      priority2 = DECL_INIT_PRIORITY (f2);
+    }
+  else
+    {
+      priority1 = DECL_FINI_PRIORITY (f1);
+      priority2 = DECL_FINI_PRIORITY (f2);
+    }
+
+  if (priority1 < priority2)
+    return -1;
+  else if (priority1 > priority2)
+    return 1;
+  else
+    /* For equal priority, sort into the order of definition in the TU.  */
     return DECL_UID (f1) - DECL_UID (f2);
 }
 
@@ -1097,6 +1256,37 @@  build_cdtor_fns (vec<tree> *ctors, vec<tree> *dtors)
     }
 }
 
+/* Generate new CTORs to register static destructors with __cxa_atexit and add
+   them to the existing list of CTORs; we then process the revised CTORs list.
+
+   We sort the DTORs into priority and then TU order, this means that they are
+   registered in that order with __cxa_atexit () and therefore will be run in
+   the reverse order.
+
+   Likewise, CTORs are sorted into priority and then TU order, which means that
+   they will run in that order.
+
+   This matches the behavior of using init/fini or mod_init_func/mod_term_func
+   sections.  */
+
+static void
+build_cxa_atexit_fns (vec<tree> *ctors, vec<tree> *dtors)
+{
+  if (!dtors->is_empty ())
+    {
+      gcc_assert (targetm.dtors_from_cxa_atexit);
+      dtors->qsort (compare_cdtor_tu_order);
+      build_cxa_dtor_registrations (*dtors, ctors);
+    }
+
+  if (!ctors->is_empty ())
+    {
+      gcc_assert (targetm.dtors_from_cxa_atexit);
+      ctors->qsort (compare_cdtor_tu_order);
+      build_cdtor (/*ctor_p=*/true, *ctors);
+    }
+}
+
 /* Look for constructors and destructors and produce function calling them.
    This is needed for targets not supporting ctors or dtors, but we perform the
    transformation also at linktime to merge possibly numerous
@@ -1115,7 +1305,10 @@  ipa_cdtor_merge (void)
     if (DECL_STATIC_CONSTRUCTOR (node->decl)
 	|| DECL_STATIC_DESTRUCTOR (node->decl))
        record_cdtor_fn (node, &ctors, &dtors);
-  build_cdtor_fns (&ctors, &dtors);
+  if (targetm.dtors_from_cxa_atexit)
+    build_cxa_atexit_fns (&ctors, &dtors);
+  else
+    build_cdtor_fns (&ctors, &dtors);
   return 0;
 }
 
@@ -1162,7 +1355,7 @@  pass_ipa_cdtor_merge::gate (function *)
   /* Perform the pass when we have no ctors/dtors support
      or at LTO time to merge multiple constructors into single
      function.  */
-  return !targetm.have_ctors_dtors || in_lto_p;
+  return !targetm.have_ctors_dtors || in_lto_p || targetm.dtors_from_cxa_atexit;
 }
 
 } // anon namespace
diff --git a/gcc/target.def b/gcc/target.def
index 51ea167172b..b803c582f01 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6778,6 +6778,16 @@  collecting constructors and destructors to be run at startup and exit.\n\
 It is false if we must use @command{collect2}.",
  bool, false)
 
+/* True if the target wants DTORs to be run from cxa_atexit.  */
+DEFHOOKPOD
+(dtors_from_cxa_atexit,
+ "This value is true if the target wants destructors to be queued to be\n\
+run from __cxa_atexit.  If this is the case then, for each priority level,\n\
+a new constructor will be entered that registers the destructors for that\n\
+level with __cxa_atexit (and there will be no destructors emitted).\n\
+It is false the method implied by @code{have_ctors_dtors} is used.",
+ bool, false)
+
 /* True if thread-local storage is supported.  */
 DEFHOOKPOD
 (have_tls,