[1/2] OpenMP: Add lang hooks + run-time filled map arrays for Fortran deep mapping of DT

Message ID bc27fa73-cb20-52f3-11bc-302a33157b34@codesourcery.com
State New
Headers
Series [1/2] OpenMP: Add lang hooks + run-time filled map arrays for Fortran deep mapping of DT |

Commit Message

Tobias Burnus Jan. 10, 2023, 11:35 a.m. UTC
  This patches is the ME part to support OpenMP 5.0's deep-mapping
feature, i.e. mapping allocatable components of Fortran's derived types
automatically. [Not the lang hooks but allocatate-array part will probably
also be useful when later adding 'iterator'-modifier support to the
'map'/'to'/'from' clauses.]

This is a belated real submission of the patch sent in March 2022,
   https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591144.html
(with FE fixes at https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593562.html
   (note to self: Bernhard did sent some comment fixes off list)
+ https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593704.html )
+ ME fix for OpenACC at https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603906.html
[which is in the attach patch]

As written, attached is the ME part. Below is a description how
it is supposed to get used; the patch links above show how it looks
in the real-code FE.


======================
BACKGROUND
======================

Fortran permits

type t
   integer, allocatable :: x, y(:)
end type t
type t2
   type(t2), allocatable :: previous_stack  ! Not valid in OMP 5.0
   integer, allocatable :: a
   type(t) :: b, c(:)
end type t2
type(t2) :: var1, var2(:)

!$omp target enter data(var1, var2)

Where all allocatable components need to be mapped alongside. The number of
mappings is only known at runtime, e.g. for 'var2' - the array size is only
known at runtime and then each allocatable component of each element of
'var2' needs to be mapped - both those can contain allocatable components as
well, which have to be mapped - but of course only if the parent component
is actually allocated.

  * * *

The current code puts 'kinds' with const values into an array, 'sizes' in
a fixed-size stack array (either with const or dynamic values) and 'addrs'
is a struct.

To support deep mapping, those all have to be dynamic; hence, the arrays
'sizes' and 'kinds' are turned into pointers - and the 'struct' gets a
tailing variable-size array, which is then filled with the dynamic content.

For this purpose, three lang hooks are added - all are called rather late,
i.e. during omp-low.cc, such that all previous operations (implicit mapping,
explicit mapping, OpenMP mapper) are already done.

* First one to check whether there is any allocatable component for a map-clause
   element (explicitly or implicitly added). If not, the current code is used.
   Otherwise, it uses dynamically allocated arrays
(Side note: As the size is now only known at runtime, TREE_VEC has now another
  element - the array size - hence the change to expand_omp_target, before it
  was known statically from the type.)

* Second hook to actually count how many allocations are done, required for
   the allocation.

* Third hook to actually fill the arrays.


Comments? Remarks?

Tobias

PS: There are two things to watch out in the future:
- 'mapper': I think it should work when the mapper is present as it comes rather
   late in the flow, but I have not checked with Julian's patches (pending review).
- Order: the dynamic items are added last to 'addrs' to permit keeping the 'struct'
   type. I think that's fine for allocatable components as they are added rather late
   and accessing them via 'is_device_ptr' is not possible.
   But there might be some issues with 'interator' in future; something to watch out.
   If so, we may need to partially or fully give up on putting all others mappings stillinto
   the struct.
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Patch

OpenMP: Add lang hooks + run-time filled map arrays for Fortran deep mapping of DT

This patch adds middle end support for mapping Fortran derived-types with
allocatable components. If those are present, the kinds/sizes arrays will be
allocated at run time and the addrs struct gets an variable-sized array at
the end. The newly added hooks are:
  * lhd_omp_deep_mapping_p: If true, use the new code.
  * lhd_omp_deep_mapping_cnt: Count the elements, needed for allocation.
  * lhd_omp_deep_mapping: Fill the allocated arrays.

gcc/ChangeLog:

	* langhooks-def.h (lhd_omp_deep_mapping_p,
	lhd_omp_deep_mapping_cnt, lhd_omp_deep_mapping): New.
	(LANG_HOOKS_OMP_DEEP_MAPPING_P, LANG_HOOKS_OMP_DEEP_MAPPING_CNT,
	LANG_HOOKS_OMP_DEEP_MAPPING): Define.
	(LANG_HOOKS_DECLS): Use it.
	* langhooks.cc (lhd_omp_deep_mapping_p, lhd_omp_deep_mapping_cnt,
	lhd_omp_deep_mapping): New stubs.
	* langhooks.h (struct lang_hooks_for_decls): Add new hooks
	* omp-expand.cc (expand_omp_target): Handle dynamic-size
	addr/sizes/kinds arrays.
	* omp-low.cc (build_sender_ref, fixup_child_record_type,
	scan_sharing_clauses, lower_omp_target): Update to handle
	new hooks and dynamic-size addr/sizes/kinds arrays.

 gcc/langhooks-def.h |  10 +++
 gcc/langhooks.cc    |  24 ++++++
 gcc/langhooks.h     |  15 ++++
 gcc/omp-expand.cc   |  18 ++++-
 gcc/omp-low.cc      | 224 ++++++++++++++++++++++++++++++++++++++++++++++------
 5 files changed, 265 insertions(+), 26 deletions(-)

diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index c6d18526360..75ba037f776 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -85,6 +85,10 @@  extern enum omp_clause_defaultmap_kind lhd_omp_predetermined_mapping (tree);
 extern tree lhd_omp_assignment (tree, tree, tree);
 extern void lhd_omp_finish_clause (tree, gimple_seq *, bool);
 extern tree lhd_omp_array_size (tree, gimple_seq *);
+extern bool lhd_omp_deep_mapping_p (const gimple *, tree);
+extern tree lhd_omp_deep_mapping_cnt (const gimple *, tree, gimple_seq *);
+extern void lhd_omp_deep_mapping (const gimple *, tree, unsigned HOST_WIDE_INT,
+				  tree, tree, tree, tree, tree, gimple_seq *);
 struct gimplify_omp_ctx;
 extern void lhd_omp_firstprivatize_type_sizes (struct gimplify_omp_ctx *,
 					       tree);
@@ -273,6 +277,9 @@  extern tree lhd_unit_size_without_reusable_padding (tree);
 #define LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR NULL
 #define LANG_HOOKS_OMP_CLAUSE_DTOR hook_tree_tree_tree_null
 #define LANG_HOOKS_OMP_FINISH_CLAUSE lhd_omp_finish_clause
+#define LANG_HOOKS_OMP_DEEP_MAPPING_P lhd_omp_deep_mapping_p
+#define LANG_HOOKS_OMP_DEEP_MAPPING_CNT lhd_omp_deep_mapping_cnt
+#define LANG_HOOKS_OMP_DEEP_MAPPING lhd_omp_deep_mapping
 #define LANG_HOOKS_OMP_ALLOCATABLE_P hook_bool_tree_false
 #define LANG_HOOKS_OMP_SCALAR_P lhd_omp_scalar_p
 #define LANG_HOOKS_OMP_SCALAR_TARGET_P hook_bool_tree_false
@@ -307,6 +314,9 @@  extern tree lhd_unit_size_without_reusable_padding (tree);
   LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR, \
   LANG_HOOKS_OMP_CLAUSE_DTOR, \
   LANG_HOOKS_OMP_FINISH_CLAUSE, \
+  LANG_HOOKS_OMP_DEEP_MAPPING_P, \
+  LANG_HOOKS_OMP_DEEP_MAPPING_CNT, \
+  LANG_HOOKS_OMP_DEEP_MAPPING, \
   LANG_HOOKS_OMP_ALLOCATABLE_P, \
   LANG_HOOKS_OMP_SCALAR_P, \
   LANG_HOOKS_OMP_SCALAR_TARGET_P, \
diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 9a1a9eccca9..5a0ec6bb8b8 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -642,6 +642,30 @@  lhd_omp_array_size (tree, gimple_seq *)
   return NULL_TREE;
 }
 
+/* Returns true when additional mappings for a decl are needed.  */
+
+bool
+lhd_omp_deep_mapping_p (const gimple *, tree)
+{
+  return false;
+}
+
+/* Returns number of additional mappings for a decl.  */
+
+tree
+lhd_omp_deep_mapping_cnt (const gimple *, tree, gimple_seq *)
+{
+  return NULL_TREE;
+}
+
+/* Do the additional mappings.  */
+
+void
+lhd_omp_deep_mapping (const gimple *, tree, unsigned HOST_WIDE_INT, tree, tree,
+		      tree, tree, tree, gimple_seq *)
+{
+}
+
 /* Return true if DECL is a scalar variable (for the purpose of
    implicit firstprivatization & mapping). Only if alloc_ptr_ok
    are allocatables and pointers accepted. */
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index cca75285fc2..97b5e888151 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -313,6 +313,21 @@  struct lang_hooks_for_decls
   /* Do language specific checking on an implicitly determined clause.  */
   void (*omp_finish_clause) (tree clause, gimple_seq *pre_p, bool);
 
+  /* Additional language-specific mappings for a decl; returns true
+     if those may occur.  */
+  bool (*omp_deep_mapping_p) (const gimple *ctx_stmt, tree clause);
+
+  /* Additional language-specific mappings for a decl; returns the
+     number of additional mappings needed.  */
+  tree (*omp_deep_mapping_cnt) (const gimple *ctx_stmt, tree clause,
+				gimple_seq *seq);
+
+  /* Do the actual additional language-specific mappings for a decl. */
+  void (*omp_deep_mapping) (const gimple *stmt, tree clause,
+			    unsigned HOST_WIDE_INT tkind,
+			    tree data, tree sizes, tree kinds,
+			    tree offset_data, tree offset, gimple_seq *seq);
+
   /* Return true if DECL is an allocatable variable (for the purpose of
      implicit mapping).  */
   bool (*omp_allocatable_p) (tree decl);
diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc
index c7ef41924d6..5554cc3171f 100644
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -9916,8 +9916,9 @@  expand_omp_target (struct omp_region *region)
 		  /* We're ignoring the subcode because we're
 		     effectively doing a STRIP_NOPS.  */
 
-		  if (TREE_CODE (arg) == ADDR_EXPR
-		      && TREE_OPERAND (arg, 0) == sender)
+		  if ((TREE_CODE (arg) == ADDR_EXPR
+		       && TREE_OPERAND (arg, 0) == sender)
+		      || arg == sender)
 		    {
 		      tgtcopy_stmt = stmt;
 		      break;
@@ -10365,7 +10366,7 @@  expand_omp_target (struct omp_region *region)
       t3 = t2;
       t4 = t2;
     }
-  else
+  else if (TREE_VEC_LENGTH (t) == 3 || is_gimple_omp_oacc (entry_stmt))
     {
       t1 = TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (TREE_VEC_ELT (t, 1))));
       t1 = size_binop (PLUS_EXPR, t1, size_int (1));
@@ -10373,6 +10374,17 @@  expand_omp_target (struct omp_region *region)
       t3 = build_fold_addr_expr (TREE_VEC_ELT (t, 1));
       t4 = build_fold_addr_expr (TREE_VEC_ELT (t, 2));
     }
+  else
+    {
+      t1 = force_gimple_operand_gsi (&gsi, TREE_VEC_ELT (t, 3), true, NULL_TREE,
+				     true, GSI_SAME_STMT);
+      t2 = force_gimple_operand_gsi (&gsi, TREE_VEC_ELT (t, 0), true, NULL_TREE,
+				     true, GSI_SAME_STMT);
+      t3 = force_gimple_operand_gsi (&gsi, TREE_VEC_ELT (t, 1), true, NULL_TREE,
+				     true, GSI_SAME_STMT);
+      t4 = force_gimple_operand_gsi (&gsi, TREE_VEC_ELT (t, 2), true, NULL_TREE,
+				     true, GSI_SAME_STMT);
+    }
 
   gimple *g;
   bool tagging = false;
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index fef41a013ec..cf6f0e14507 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -767,7 +767,10 @@  static tree
 build_sender_ref (splay_tree_key key, omp_context *ctx)
 {
   tree field = lookup_sfield (key, ctx);
-  return omp_build_component_ref (ctx->sender_decl, field);
+  tree tmp = ctx->sender_decl;
+  if (POINTER_TYPE_P (TREE_TYPE (tmp)))
+    tmp = build_fold_indirect_ref (tmp);
+  return omp_build_component_ref (tmp, field);
 }
 
 static tree
@@ -1138,7 +1141,9 @@  fixup_child_record_type (omp_context *ctx)
     type = build_qualified_type (type, TYPE_QUAL_CONST);
 
   TREE_TYPE (ctx->receiver_decl)
-    = build_qualified_type (build_reference_type (type), TYPE_QUAL_RESTRICT);
+    = build_qualified_type (flexible_array_type_p (type)
+			    ? build_pointer_type (type)
+			    : build_reference_type (type), TYPE_QUAL_RESTRICT);
 }
 
 /* Instantiate decls as necessary in CTX to satisfy the data sharing
@@ -1149,6 +1154,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 {
   tree c, decl;
   bool scan_array_reductions = false;
+  bool flex_array_ptr = false;
 
   for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
     if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ALLOCATE
@@ -1592,6 +1598,8 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 		  && !OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c))
 		break;
 	    }
+	  if (!flex_array_ptr)
+	    flex_array_ptr = lang_hooks.decls.omp_deep_mapping_p (ctx->stmt, c);
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      && DECL_P (decl)
 	      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
@@ -2001,6 +2009,18 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 		 && OMP_CLAUSE_LINEAR_GIMPLE_SEQ (c))
 	  scan_omp (&OMP_CLAUSE_LINEAR_GIMPLE_SEQ (c), ctx);
     }
+  if (flex_array_ptr)
+    {
+      tree field = build_range_type (size_type_node,
+				     build_int_cstu (size_type_node, 0),
+				     NULL_TREE);
+      field = build_array_type (ptr_type_node, field);
+      field = build_decl (UNKNOWN_LOCATION, FIELD_DECL, NULL_TREE, field);
+      SET_DECL_ALIGN (field, TYPE_ALIGN (ptr_type_node));
+      DECL_CONTEXT (field) = ctx->record_type;
+      DECL_CHAIN (field) = TYPE_FIELDS (ctx->record_type);
+      TYPE_FIELDS (ctx->record_type) = field;
+    }
 }
 
 /* Create a new name for omp child function.  Returns an identifier. */
@@ -12696,6 +12716,11 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   unsigned int map_cnt = 0;
   tree in_reduction_clauses = NULL_TREE;
 
+  tree deep_map_cnt = NULL_TREE;
+  tree deep_map_data = NULL_TREE;
+  tree deep_map_offset_data = NULL_TREE;
+  tree deep_map_offset = NULL_TREE;
+
   offloaded = is_gimple_omp_offloaded (stmt);
   switch (gimple_omp_target_kind (stmt))
     {
@@ -12774,6 +12799,8 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   push_gimplify_context ();
   fplist = NULL;
 
+  ilist = NULL;
+  olist = NULL;
   for (c = clauses; c ; c = OMP_CLAUSE_CHAIN (c))
     switch (OMP_CLAUSE_CODE (c))
       {
@@ -12827,6 +12854,16 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       case OMP_CLAUSE_FROM:
       oacc_firstprivate:
 	var = OMP_CLAUSE_DECL (c);
+	{
+	  tree extra = lang_hooks.decls.omp_deep_mapping_cnt (stmt, c, &ilist);
+	  if (extra != NULL_TREE && deep_map_cnt != NULL_TREE)
+	    deep_map_cnt = fold_build2_loc (OMP_CLAUSE_LOCATION (c), PLUS_EXPR,
+					    size_type_node, deep_map_cnt,
+					    extra);
+	  else if (extra != NULL_TREE)
+	    deep_map_cnt = extra;
+	}
+
 	if (!DECL_P (var))
 	  {
 	    if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP
@@ -13061,18 +13098,31 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       record_vars_into (gimple_bind_vars (tgt_bind), child_fn);
     }
 
-  olist = NULL;
-  ilist = NULL;
   if (ctx->record_type)
     {
+      if (deep_map_cnt && TREE_CODE (deep_map_cnt) == INTEGER_CST)
+	/* map_cnt = map_cnt + tree_to_hwi (deep_map_cnt); */
+	/* deep_map_cnt = NULL_TREE; */
+	gcc_unreachable ();
+      else if (deep_map_cnt)
+	{
+	  gcc_assert (flexible_array_type_p (ctx->record_type));
+	  tree n = create_tmp_var_raw (size_type_node, "nn_map");
+	  gimple_add_tmp_var (n);
+	  gimplify_assign (n, deep_map_cnt, &ilist);
+	  deep_map_cnt = n;
+	}
       ctx->sender_decl
-	= create_tmp_var (ctx->record_type, ".omp_data_arr");
+	= create_tmp_var (deep_map_cnt ? build_pointer_type (ctx->record_type)
+				       : ctx->record_type, ".omp_data_arr");
       DECL_NAMELESS (ctx->sender_decl) = 1;
       TREE_ADDRESSABLE (ctx->sender_decl) = 1;
-      t = make_tree_vec (3);
+      t = make_tree_vec (deep_map_cnt ? 4 : 3);
       TREE_VEC_ELT (t, 0) = ctx->sender_decl;
       TREE_VEC_ELT (t, 1)
-	= create_tmp_var (build_array_type_nelts (size_type_node, map_cnt),
+	= create_tmp_var (deep_map_cnt
+			  ? build_pointer_type (size_type_node)
+			  : build_array_type_nelts (size_type_node, map_cnt),
 			  ".omp_data_sizes");
       DECL_NAMELESS (TREE_VEC_ELT (t, 1)) = 1;
       TREE_ADDRESSABLE (TREE_VEC_ELT (t, 1)) = 1;
@@ -13080,13 +13130,65 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       tree tkind_type = short_unsigned_type_node;
       int talign_shift = 8;
       TREE_VEC_ELT (t, 2)
-	= create_tmp_var (build_array_type_nelts (tkind_type, map_cnt),
+	= create_tmp_var (deep_map_cnt
+			  ? build_pointer_type (tkind_type)
+			  : build_array_type_nelts (tkind_type, map_cnt),
 			  ".omp_data_kinds");
       DECL_NAMELESS (TREE_VEC_ELT (t, 2)) = 1;
       TREE_ADDRESSABLE (TREE_VEC_ELT (t, 2)) = 1;
       TREE_STATIC (TREE_VEC_ELT (t, 2)) = 1;
       gimple_omp_target_set_data_arg (stmt, t);
 
+      if (deep_map_cnt)
+	{
+	  tree tmp, size;
+	  size = create_tmp_var (size_type_node, NULL);
+	  DECL_NAMELESS (size) = 1;
+	  gimplify_assign (size,
+			   fold_build2_loc (UNKNOWN_LOCATION, PLUS_EXPR,
+					    size_type_node, deep_map_cnt,
+					    build_int_cst (size_type_node,
+							   map_cnt)), &ilist);
+	  TREE_VEC_ELT (t, 3) = size;
+
+	  tree call = builtin_decl_explicit (BUILT_IN_MALLOC);
+	  size = fold_build2_loc (UNKNOWN_LOCATION, MULT_EXPR,
+				  size_type_node, deep_map_cnt,
+				  TYPE_SIZE_UNIT (ptr_type_node));
+	  size = fold_build2_loc (UNKNOWN_LOCATION, PLUS_EXPR,
+				  size_type_node, size,
+				  TYPE_SIZE_UNIT (ctx->record_type));
+	  tmp = build_call_expr_loc (input_location, call, 1, size);
+	  gimplify_assign (ctx->sender_decl, tmp, &ilist);
+
+	  size = fold_build2_loc (UNKNOWN_LOCATION, MULT_EXPR,
+				  size_type_node, TREE_VEC_ELT (t, 3),
+				  TYPE_SIZE_UNIT (size_type_node));
+	  tmp = build_call_expr_loc (input_location, call, 1, size);
+	  gimplify_assign (TREE_VEC_ELT (t, 1), tmp, &ilist);
+
+	  size = fold_build2_loc (UNKNOWN_LOCATION, MULT_EXPR,
+				  size_type_node, TREE_VEC_ELT (t, 3),
+				  TYPE_SIZE_UNIT (tkind_type));
+	  tmp = build_call_expr_loc (input_location, call, 1, size);
+	  gimplify_assign (TREE_VEC_ELT (t, 2), tmp, &ilist);
+	  tree field = TYPE_FIELDS (TREE_TYPE (TREE_TYPE (ctx->sender_decl)));
+	  for ( ; DECL_CHAIN (field) != NULL_TREE; field = DECL_CHAIN (field))
+	    ;
+	  gcc_assert (TREE_CODE (TREE_TYPE (field)));
+	  tmp = build_fold_indirect_ref (ctx->sender_decl);
+	  deep_map_data = omp_build_component_ref (tmp, field);
+	  deep_map_offset_data = create_tmp_var_raw (size_type_node,
+						     "map_offset_data");
+	  deep_map_offset = create_tmp_var_raw (size_type_node, "map_offset");
+	  gimple_add_tmp_var (deep_map_offset_data);
+	  gimple_add_tmp_var (deep_map_offset);
+	  gimplify_assign (deep_map_offset_data, build_int_cst (size_type_node,
+								0), &ilist);
+	  gimplify_assign (deep_map_offset, build_int_cst (size_type_node,
+							   map_cnt), &ilist);
+	}
+
       vec<constructor_elt, va_gc> *vsize;
       vec<constructor_elt, va_gc> *vkind;
       vec_alloc (vsize, map_cnt);
@@ -13113,6 +13215,24 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		    || (OMP_CLAUSE_MAP_KIND (c)
 			== GOMP_MAP_FIRSTPRIVATE_REFERENCE)))
 	      break;
+	    if (deep_map_cnt)
+	      {
+		unsigned HOST_WIDE_INT tkind2;
+		switch (OMP_CLAUSE_CODE (c))
+		  {
+		  case OMP_CLAUSE_MAP: tkind2 = OMP_CLAUSE_MAP_KIND (c); break;
+		  case OMP_CLAUSE_FIRSTPRIVATE: tkind2 = GOMP_MAP_TO; break;
+		  case OMP_CLAUSE_TO: tkind2 = GOMP_MAP_TO; break;
+		  case OMP_CLAUSE_FROM: tkind2 = GOMP_MAP_FROM; break;
+		  default: gcc_unreachable ();
+		  }
+		lang_hooks.decls.omp_deep_mapping (stmt, c, tkind2,
+						   deep_map_data,
+						   TREE_VEC_ELT (t, 1),
+						   TREE_VEC_ELT (t, 2),
+						   deep_map_offset_data,
+						   deep_map_offset, &ilist);
+	      }
 	    if (!DECL_P (ovar))
 	      {
 		if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -13667,23 +13787,65 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
       gcc_assert (map_idx == map_cnt);
 
-      DECL_INITIAL (TREE_VEC_ELT (t, 1))
-	= build_constructor (TREE_TYPE (TREE_VEC_ELT (t, 1)), vsize);
-      DECL_INITIAL (TREE_VEC_ELT (t, 2))
-	= build_constructor (TREE_TYPE (TREE_VEC_ELT (t, 2)), vkind);
+      if (!deep_map_cnt)
+	{
+	  DECL_INITIAL (TREE_VEC_ELT (t, 1))
+	    = build_constructor (TREE_TYPE (TREE_VEC_ELT (t, 1)), vsize);
+	  DECL_INITIAL (TREE_VEC_ELT (t, 2))
+	    = build_constructor (TREE_TYPE (TREE_VEC_ELT (t, 2)), vkind);
+	}
       for (int i = 1; i <= 2; i++)
-	if (!TREE_STATIC (TREE_VEC_ELT (t, i)))
+	if (deep_map_cnt || !TREE_STATIC (TREE_VEC_ELT (t, i)))
 	  {
+	    tree tmp = TREE_VEC_ELT (t, i);
+	    if (deep_map_cnt)
+	      {
+		const char *prefix = (i == 1 ? ".omp_data_sizes0"
+					     : ".omp_data_kinds0");
+		tree type = (i == 1) ? size_type_node : tkind_type;
+		type = build_array_type_nelts (type, map_cnt);
+		tree var = create_tmp_var (type, prefix);
+		DECL_NAMELESS (var) = 1;
+		TREE_ADDRESSABLE (var) = 1;
+		TREE_STATIC (var) = TREE_STATIC (tmp);
+		DECL_INITIAL (var) = build_constructor (type, i == 1
+							      ? vsize : vkind);
+		tmp = var;
+		TREE_STATIC (TREE_VEC_ELT (t, i)) = 0;
+	      }
+
 	    gimple_seq initlist = NULL;
-	    force_gimple_operand (build1 (DECL_EXPR, void_type_node,
-					  TREE_VEC_ELT (t, i)),
+	    force_gimple_operand (build1 (DECL_EXPR, void_type_node, tmp),
 				  &initlist, true, NULL_TREE);
 	    gimple_seq_add_seq (&ilist, initlist);
 
-	    tree clobber = build_clobber (TREE_TYPE (TREE_VEC_ELT (t, i)));
-	    gimple_seq_add_stmt (&olist,
-				 gimple_build_assign (TREE_VEC_ELT (t, i),
-						      clobber));
+	    if (deep_map_cnt)
+	      {
+		tree tmp2;
+		tree call = builtin_decl_explicit (BUILT_IN_MEMCPY);
+		tmp2 = TYPE_SIZE_UNIT (TREE_TYPE (tmp));
+		call = build_call_expr_loc (input_location, call, 3,
+					    TREE_VEC_ELT (t, i),
+					    build_fold_addr_expr (tmp), tmp2);
+		gimplify_and_add (call, &ilist);
+	      }
+
+	    if (!TREE_STATIC (tmp))
+	      {
+		tree clobber = build_clobber (TREE_TYPE (tmp));
+		gimple_seq_add_stmt (&olist,
+				     gimple_build_assign (tmp, clobber));
+	      }
+	    if (deep_map_cnt)
+	      {
+		tmp = TREE_VEC_ELT (t, i);
+		tree call = builtin_decl_explicit (BUILT_IN_FREE);
+		call = build_call_expr_loc (input_location, call, 1, tmp);
+		gimplify_and_add (call, &olist);
+		tree clobber = build_clobber (TREE_TYPE (tmp));
+		gimple_seq_add_stmt (&olist,
+				     gimple_build_assign (tmp, clobber));
+	      }
 	  }
 	else if (omp_maybe_offloaded_ctx (ctx->outer))
 	  {
@@ -13703,7 +13865,18 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	      }
 	  }
 
-      tree clobber = build_clobber (ctx->record_type);
+      if (deep_map_cnt)
+	{
+	  tree call = builtin_decl_explicit (BUILT_IN_FREE);
+	  call = build_call_expr_loc (input_location, call, 1,
+				      TREE_VEC_ELT (t, 0));
+	  gimplify_and_add (call, &olist);
+
+	  gimplify_expr (&TREE_VEC_ELT (t, 1), &ilist, NULL, is_gimple_val,
+			 fb_rvalue);
+	}
+
+      tree clobber = build_clobber (TREE_TYPE (ctx->sender_decl));
       gimple_seq_add_stmt (&olist, gimple_build_assign (ctx->sender_decl,
 							clobber));
     }
@@ -13716,11 +13889,16 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (offloaded
       && ctx->record_type)
     {
-      t = build_fold_addr_expr_loc (loc, ctx->sender_decl);
+      t = ctx->sender_decl;
+      if (!deep_map_cnt)
+	t = build_fold_addr_expr_loc (loc, t);
       /* fixup_child_record_type might have changed receiver_decl's type.  */
       t = fold_convert_loc (loc, TREE_TYPE (ctx->receiver_decl), t);
-      gimple_seq_add_stmt (&new_body,
-	  		   gimple_build_assign (ctx->receiver_decl, t));
+      if (!AGGREGATE_TYPE_P (TREE_TYPE (ctx->sender_decl)))
+	gimplify_assign (ctx->receiver_decl, t, &new_body);
+      else
+	gimple_seq_add_stmt (&new_body,
+			     gimple_build_assign (ctx->receiver_decl, t));
     }
   gimple_seq_add_seq (&new_body, fplist);