[C,1/4] introduce ubsan checking for assignment of VM types 1/4

Message ID 93a1692e7f0e895379cb6847bfcb6e6d3dafadc3.camel@tugraz.at
State New
Series [C,1/4] introduce ubsan checking for assignment of VM types 1/4

Commit Message

Martin Uecker May 29, 2023, 10:19 a.m. UTC
  Hi Joseph and Martin,

this series adds UBSan checking for assignment of variably-modified
types, i.e. it checks that size expressions on both sides of the 
assignment match.

1. No functional change: adds a structure argument to the
comptypes family of functions in the C FE.

2. Checking for all assignments except function arguments,
including the libsanitizer changes (no upstream discussion so far).

3. Checking for function arguments, but only when the function is
referenced using its declaration.

4. Checking for functions called via a pointer.
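
To illustrate what kind of mismatch is meant, here is a minimal sketch
in plain C. It emulates by hand the comparison of bounds that the
instrumentation is meant to perform for initializations such as
"char (*qa)[v] = &a;". The names vm_bounds_match and count_mismatches
are hypothetical and exist only for this illustration; they are not
part of the patches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: nonzero when the bound
   of the destination array type equals the bound of the source.  */
static int
vm_bounds_match (size_t to_bound, size_t from_bound)
{
  return to_bound == from_bound;
}

/* Count how many of the two initializations sketched in the comments
   would have mismatching bounds for the given run-time values.  */
static int
count_mismatches (int u, int v)
{
  char a[4];                    /* element type is char, so sizeof a == 4 */
  int mismatches = 0;

  assert (u > 0 && v > 0);      /* VLA bounds must be positive */

  /* char (*pa)[u] = &a;   compares bound u against 4 */
  if (!vm_bounds_match ((size_t) u, sizeof a))
    mismatches++;

  /* char (*qa)[v] = &a;   compares bound v against 4 */
  if (!vm_bounds_match ((size_t) v, sizeof a))
    mismatches++;

  return mismatches;
}
```

With u = 4 and v = 3 only the second initialization has mismatching
bounds, which corresponds to the first expected message in the new
test vm-bounds-1.c.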


Q1: Should this be -fsanitize=vla-bound? I used it because it is
related and does not have much functionality.

Q2: I now have warnings when a function cannot be instrumented
because size expressions are too complicated or information was
lost earlier. Probably this needs to have a flag.

Martin



    c: introduce ubsan checking for assignment of VM types 1/4
    
    Reorganize recursive type checking to use a structure to
    store information collected during the recursion and
    returned to the caller (enum_and_int_p, different_types_p).
    
    gcc/c:
            * c-typeck.cc (struct comptypes_data): Add structure.
            (tagged_types_tu_compatible_p,
            function_types_compatible_p, type_lists_compatible_p,
            comptypes_internal): Add structure to interface and
            adapt calls.
            (comptypes, comptypes_check_enum_int,
            comptypes_check_different_types): Adapt calls.
  

Comments

Martin Uecker May 29, 2023, 10:20 a.m. UTC | #1
c: introduce ubsan checking for assignment of VM types 2/4
    
    When checking compatibility of types during assignment, collect
    all pairs of types where the outermost bound needs to match at
    run-time.  This list is then processed to add UBSan checks for
    each bound.
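
The two phases described above (collect pairs during the type
comparison, then process the list) can be sketched with a toy model.
This is only an illustration under simplifying assumptions: struct type,
struct pair_list, compatible, count_violations and demo are all
hypothetical stand-ins, whereas the patch works on GCC trees and emits
UBSan calls instead of counting.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a type node; GCC itself uses `tree'.  */
enum kind { K_CHAR, K_POINTER, K_ARRAY };

struct type
{
  enum kind kind;
  const struct type *inner;  /* element or pointee type, NULL for char */
  int is_variable;           /* bound is only known at run time */
  size_t bound;              /* bound (its run-time value if variable) */
};

struct pair
{
  const struct type *to;
  const struct type *from;
};

#define MAX_PAIRS 16

struct pair_list
{
  struct pair items[MAX_PAIRS];
  size_t len;
};

/* Phase 1: compare two types recursively.  Constant bounds are checked
   right away; if either bound is variable, the pair is recorded in OUT
   (when OUT is non-null) for a later run-time check.  */
static int
compatible (const struct type *a, const struct type *b, struct pair_list *out)
{
  if (a->kind != b->kind)
    return 0;
  if (a->kind == K_CHAR)
    return 1;
  if (a->kind == K_ARRAY)
    {
      if (a->is_variable || b->is_variable)
        {
          if (out && out->len < MAX_PAIRS)
            {
              out->items[out->len].to = a;
              out->items[out->len].from = b;
              out->len++;
            }
        }
      else if (a->bound != b->bound)
        return 0;
    }
  return compatible (a->inner, b->inner, out);
}

/* Phase 2: process the collected pairs.  The sketch only counts the
   pairs whose outermost bounds differ.  */
static size_t
count_violations (const struct pair_list *list)
{
  size_t n = 0;
  assert (list->len <= MAX_PAIRS);
  for (size_t i = 0; i < list->len; i++)
    if (list->items[i].to->bound != list->items[i].from->bound)
      n++;
  return n;
}

/* Models `char (*p)[u][v] = &c;' for `char c[4][3];'.  */
static size_t
demo (size_t u, size_t v)
{
  const struct type ch = { K_CHAR, NULL, 0, 0 };
  const struct type from_in = { K_ARRAY, &ch, 0, 3 };
  const struct type from_out = { K_ARRAY, &from_in, 0, 4 };
  const struct type from_ptr = { K_POINTER, &from_out, 0, 0 };
  const struct type to_in = { K_ARRAY, &ch, 1, v };
  const struct type to_out = { K_ARRAY, &to_in, 1, u };
  const struct type to_ptr = { K_POINTER, &to_out, 0, 0 };
  struct pair_list list = { .len = 0 };

  if (!compatible (&to_ptr, &from_ptr, &list))
    return (size_t) -1;
  return count_violations (&list);
}
```

Passing a null list to compatible turns the collection off, which is
roughly how the old interfaces in this patch call the new ones.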
    
    gcc/c-family:
            * c-ubsan.cc (ubsan_instrument_vm_assign): New function.
            * c-ubsan.h (ubsan_instrument_vm_assign): New function.
    
    gcc/c:
            * c-typeck.cc (struct instrument_data): New structure.
            (comp_target_types_instr, convert_for_assignment_instrument): New
            interfaces for existing functions.
            (struct comptypes_data): Add instrumentation.
            (comptypes_check_enum_int_instr): New interface.
            (comptypes_check_enum_int): Old interface (calls new).
            (comptypes_internal): Collect VLA types needed for UBSan.
            (comp_target_types_instr): New interface.
            (comp_target_types): Old interface (calls new).
            (function_types_compatible_p): No instrumentation for function
            arguments.
            (process_vm_constraints): New function.
            (convert_for_assignment_instrument): New interface.
            (convert_for_assignment): Instrument assignments.
            * sanitizer.def: Add sanitizer builtins.
    
    gcc/testsuite:
            * gcc.dg/ubsan/vm-bounds-1.c: New test.
            * gcc.dg/ubsan/vm-bounds-1b.c: New test.
            * gcc.dg/ubsan/vm-bounds-2.c: New test.
    
    libsanitizer/ubsan:
            * ubsan_checks.inc: Add UBSan check.
            * ubsan_handlers.cpp (handleVMBoundsMismatch): New function.
            * ubsan_handlers.h (struct VMBoundsMismatchData): New structure.
            (vm_bounds_mismatch): New handler.

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 51aa83a378d..59ef9708188 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -334,6 +334,48 @@ ubsan_instrument_vla (location_t loc, tree size)
   return t;
 }
 
+/* Instrument assignment of variably modified types.  */
+
+tree
+ubsan_instrument_vm_assign (location_t loc, tree a, tree b)
+{
+  tree t, tt;
+
+  gcc_assert (TREE_CODE (a) == ARRAY_TYPE);
+  gcc_assert (TREE_CODE (b) == ARRAY_TYPE);
+
+  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (a));
+  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (b));
+
+  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
+  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
+
+  t = build2 (NE_EXPR, boolean_type_node, as, bs);
+  if (flag_sanitize_trap & SANITIZE_VLA)
+    tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
+  else
+    {
+      tree data = ubsan_create_data ("__ubsan_vm_data", 1, &loc,
+				     ubsan_type_descriptor (a, UBSAN_PRINT_ARRAY),
+				     ubsan_type_descriptor (b, UBSAN_PRINT_ARRAY),
+				     ubsan_type_descriptor (sizetype),
+				     NULL_TREE, NULL_TREE);
+      data = build_fold_addr_expr_loc (loc, data);
+      enum built_in_function bcode
+	= (flag_sanitize_recover & SANITIZE_VLA)
+	  ? BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH
+	  : BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH_ABORT;
+      tt = builtin_decl_explicit (bcode);
+      tt = build_call_expr_loc (loc, tt, 3, data,
+				ubsan_encode_value (as),
+				ubsan_encode_value (bs));
+    }
+  t = build3 (COND_EXPR, void_type_node, t, tt, void_node);
+
+  return t;
+}
+
+
 /* Instrument missing return in C++ functions returning non-void.  */
 
 tree
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index fef1033e1e4..1b07b0645f2 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -26,6 +26,7 @@ extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
 extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
+extern tree ubsan_instrument_vm_assign (location_t, tree, tree);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
 extern void ubsan_maybe_instrument_reference (tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 2a1b7321b45..a8fccc6f6ed 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -94,6 +94,9 @@ struct comptypes_data;
 static int tagged_types_tu_compatible_p (const_tree, const_tree,
 					 struct comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
+struct instrument_data;
+static int comp_target_types_instr (location_t, tree, tree,
+				    vec<struct instrument_data, va_gc> *);
 static int function_types_compatible_p (const_tree, const_tree,
 					struct comptypes_data *);
 static int type_lists_compatible_p (const_tree, const_tree,
@@ -106,6 +109,9 @@ static tree pointer_diff (location_t, tree, tree, tree *);
 static tree convert_for_assignment (location_t, location_t, tree, tree, tree,
 				    enum impl_conv, bool, tree, tree, int,
 				    int = 0);
+static tree convert_for_assignment_instrument (location_t, location_t, tree, tree, tree,
+				    enum impl_conv, bool, tree, tree, int, int,
+				    vec<struct instrument_data, va_gc> *);
 static tree valid_compound_expr_initializer (tree, tree);
 static void push_string (const char *);
 static void push_member_name (tree);
@@ -1042,10 +1048,18 @@ common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+struct instrument_data {
+
+  tree t1;
+  tree t2;
+};
+
 struct comptypes_data {
 
   bool enum_and_int_p;
   bool different_types_p;
+
+  vec<struct instrument_data, va_gc>* instr_vec;
 };
 
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
@@ -1069,13 +1083,17 @@ comptypes (tree type1, tree type2)
 /* Like comptypes, but if it returns non-zero because enum and int are
    compatible, it sets *ENUM_AND_INT_P to true.  */
 
-int
-comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
+static int
+comptypes_check_enum_int_instr (tree type1, tree type2, bool *enum_and_int_p,
+				vec<struct instrument_data, va_gc> *instr_vec)
 {
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = tagged_tu_seen_base;
   int val;
 
   struct comptypes_data data = { };
+
+  data.instr_vec = instr_vec;
+
   val = comptypes_internal (type1, type2, &data);
   *enum_and_int_p = data.enum_and_int_p;
 
@@ -1084,6 +1102,12 @@ comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
   return val;
 }
 
+int
+comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
+{
+  return comptypes_check_enum_int_instr (type1, type2, enum_and_int_p, NULL);
+}
+
 /* Like comptypes, but if it returns nonzero for different types, it
    sets *DIFFERENT_TYPES_P to true.  */
 
@@ -1252,7 +1276,16 @@ comptypes_internal (const_tree type1, const_tree type2,
 	if (d1_variable != d2_variable)
 	  data->different_types_p = true;
 	if (d1_variable || d2_variable)
-	  break;
+	  {
+	    if (NULL != data->instr_vec)
+	      {
+		struct instrument_data id;
+		id.t1 = TYPE_MAIN_VARIANT (t2);
+		id.t2 = TYPE_MAIN_VARIANT (t1);
+		vec_safe_push(data->instr_vec, id);
+	      }
+	    break;
+	  }
 	if (d1_zero && d2_zero)
 	  break;
 	if (d1_zero || d2_zero
@@ -1299,7 +1332,8 @@ comptypes_internal (const_tree type1, const_tree type2,
    subset of the other.  */
 
 static int
-comp_target_types (location_t location, tree ttl, tree ttr)
+comp_target_types_instr (location_t location, tree ttl, tree ttr,
+			 vec<struct instrument_data, va_gc> *instr_vec)
 {
   int val;
   int val_ped;
@@ -1333,8 +1367,7 @@ comp_target_types (location_t location, tree ttl, tree ttr)
 	 ? c_build_qualified_type (TYPE_MAIN_VARIANT (mvr), TYPE_QUAL_ATOMIC)
 	 : TYPE_MAIN_VARIANT (mvr));
 
-  enum_and_int_p = false;
-  val = comptypes_check_enum_int (mvl, mvr, &enum_and_int_p);
+  val = comptypes_check_enum_int_instr (mvl, mvr, &enum_and_int_p, instr_vec);
 
   if (val == 1 && val_ped != 1)
     pedwarn_c11 (location, OPT_Wpedantic, "invalid use of pointers to arrays with different qualifiers "
@@ -1349,6 +1382,13 @@ comp_target_types (location_t location, tree ttl, tree ttr)
 
   return val;
 }
+
+static int
+comp_target_types (location_t location, tree ttl, tree ttr)
+{
+  return comp_target_types_instr (location, ttl, ttr, NULL);
+}
+
 
 /* Subroutines of `comptypes'.  */
 
@@ -1694,8 +1734,14 @@ function_types_compatible_p (const_tree f1, const_tree f2,
       return val;
     }
 
-  /* Both types have argument lists: compare them and propagate results.  */
+  /* Both types have argument lists: compare them and propagate results.
+     Turn off UBSan instrumentation for bounds as these are all arrays
+     of unspecified bound.  */
+  auto instr_vec_tmp = data->instr_vec;
+  data->instr_vec = NULL;
   val1 = type_lists_compatible_p (args1, args2, data);
+  data->instr_vec = instr_vec_tmp;
+
   return val1 != 1 ? val1 : val;
 }
 
@@ -3517,10 +3563,11 @@ convert_argument (location_t ploc, tree function, tree fundecl,
   if (excess_precision)
     val = build1 (EXCESS_PRECISION_EXPR, valtype, val);
 
-  tree parmval = convert_for_assignment (ploc, ploc, type,
-					 val, origtype, ic_argpass,
-					 npc, fundecl, function,
-					 parmnum + 1, warnopt);
+  tree parmval = convert_for_assignment_instrument (ploc, ploc, type,
+						    val, origtype, ic_argpass,
+						    npc, fundecl, function,
+						    parmnum + 1, warnopt,
+						    NULL);
 
   if (targetm.calls.promote_prototypes (fundecl ? TREE_TYPE (fundecl) : 0)
       && INTEGRAL_TYPE_P (type)
@@ -3530,6 +3577,26 @@ convert_argument (location_t ploc, tree function, tree fundecl,
   return parmval;
 }
 
+
+/* Process all constraints for variably-modified types.  */
+
+static tree
+process_vm_constraints (location_t location,
+			vec<struct instrument_data, va_gc> *instr_vec)
+{
+  unsigned int i;
+  struct instrument_data* d;
+  tree instr_expr = void_node;
+
+  FOR_EACH_VEC_SAFE_ELT (instr_vec, i, d)
+    {
+      tree in = ubsan_instrument_vm_assign (location, d->t1, d->t2);
+      instr_expr = fold_build2 (COMPOUND_EXPR, void_type_node, in, instr_expr);
+    }
+  return instr_expr;
+}
+
+
 /* Convert the argument expressions in the vector VALUES
    to the types in the list TYPELIST.
 
@@ -6858,7 +6925,44 @@ static tree
 convert_for_assignment (location_t location, location_t expr_loc, tree type,
 			tree rhs, tree origtype, enum impl_conv errtype,
 			bool null_pointer_constant, tree fundecl,
-			tree function, int parmnum, int warnopt /* = 0 */)
+			tree function, int parmnum, int warnopt)
+{
+  vec<struct instrument_data, va_gc> *instr_vec = NULL;
+
+  if (sanitize_flags_p (SANITIZE_VLA)
+      && (ic_init_const != errtype))
+    vec_alloc (instr_vec, 10);
+
+  tree ret = convert_for_assignment_instrument (location, expr_loc, type,
+						rhs, origtype, errtype,
+						null_pointer_constant, fundecl,
+						function, parmnum, warnopt,
+						instr_vec);
+  if (instr_vec)
+    {
+      if (ret && error_mark_node != ret && 0 < vec_safe_length (instr_vec))
+	{
+	  /* We have to make sure that the rhs is evaluated first,
+	     because we may use size expressions in it to check bounds.  */
+	  tree instr_expr = process_vm_constraints (location, instr_vec);
+	  if (void_node != instr_expr)
+	    {
+	      ret = save_expr (ret);
+	      instr_expr = fold_build2 (COMPOUND_EXPR, void_type_node, ret, instr_expr);
+	      ret = fold_build2 (COMPOUND_EXPR, TREE_TYPE (ret), instr_expr, ret);
+	    }
+	}
+      vec_free (instr_vec);
+    }
+  return ret;
+}
+
+static tree
+convert_for_assignment_instrument (location_t location, location_t expr_loc, tree type,
+			tree rhs, tree origtype, enum impl_conv errtype,
+			bool null_pointer_constant, tree fundecl,
+			tree function, int parmnum, int warnopt,
+			vec<struct instrument_data, va_gc>* instr_vec)
 {
   enum tree_code codel = TREE_CODE (type);
   tree orig_rhs = rhs;
@@ -7102,11 +7206,11 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
       rhs = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (rhs)), rhs);
       SET_EXPR_LOCATION (rhs, location);
 
-      rhs = convert_for_assignment (location, expr_loc,
-				    build_pointer_type (TREE_TYPE (type)),
-				    rhs, origtype, errtype,
-				    null_pointer_constant, fundecl, function,
-				    parmnum, warnopt);
+      rhs = convert_for_assignment_instrument (location, expr_loc,
+					       build_pointer_type (TREE_TYPE (type)),
+					       rhs, origtype, errtype,
+					       null_pointer_constant, fundecl, function,
+					       parmnum, warnopt, instr_vec);
       if (rhs == error_mark_node)
 	return error_mark_node;
 
@@ -7487,7 +7591,7 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
 	 Meanwhile, the lhs target must have all the qualifiers of the rhs.  */
       if ((VOID_TYPE_P (ttl) && !TYPE_ATOMIC (ttl))
 	  || (VOID_TYPE_P (ttr) && !TYPE_ATOMIC (ttr))
-	  || (target_cmp = comp_target_types (location, type, rhstype))
+	  || (target_cmp = comp_target_types_instr (location, type, rhstype, instr_vec))
 	  || is_opaque_pointer
 	  || ((c_common_unsigned_type (mvl)
 	       == c_common_unsigned_type (mvr))
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index d47cc7dd9d7..9a25f1db4bd 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -506,6 +506,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_MISSING_RETURN,
 		      "__ubsan_handle_missing_return",
 		      BT_FN_VOID_PTR,
 		      ATTR_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH,
+		      "__ubsan_handle_vm_bounds_mismatch",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NOTHROW_LEAF_LIST)
 DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE,
 		      "__ubsan_handle_vla_bound_not_positive",
 		      BT_FN_VOID_PTR_PTR,
@@ -542,6 +546,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT,
 		      "__ubsan_handle_divrem_overflow_abort",
 		      BT_FN_VOID_PTR_PTR_PTR,
 		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH_ABORT,
+		      "__ubsan_handle_vm_bounds_mismatch_abort",
+		      BT_FN_VOID_PTR_PTR_PTR,
+		      ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
 DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT,
 		      "__ubsan_handle_shift_out_of_bounds_abort",
 		      BT_FN_VOID_PTR_PTR_PTR,
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1.c
new file mode 100644
index 00000000000..003202ac9b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1.c
@@ -0,0 +1,153 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+
+/* test return types */
+
+int m, n;
+
+static char (*z0(void))[5][5] { char (*p)[m][n] = 0; return p; }
+static char (*z1(void))[5][5] { char (*p)[5][n] = 0; return p; }
+static char (*z2(void))[5][5] { char (*p)[m][5] = 0; return p; }
+
+
+int main()
+{
+	m = 4, n = 3;
+
+	int u = 4;
+	int v = 3;
+
+	/* initialization */
+
+	char a[4];
+	char (*pa)[u] = &a;
+	char (*qa)[v] = &a;
+	/* { dg-output "bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char b[u];
+	const char (*pb)[u] = &b;
+	char (*qb)[v] = &b;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char c[4][3];
+	char (*pc0)[u][v] = &c;
+	char (*qc0)[v][u] = &c;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char (*pc1)[u][3] = &c;
+	char (*qc1)[v][3] = &c;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[3\\\]' does not match bound 4 of type 'char \\\[4\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char (*pc2)[4][v] = &c;
+	char (*qc2)[4][u] = &c;
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char (*pc3)[][v] = &c;	
+	char (*qc3)[][u] = &c;
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char d[u][v];
+	char (*pd0)[4][3] = &d;
+	char (*qd0)[3][4] = &d;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[4\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	char (*pd1)[u][3] = &d;	
+	char (*qd1)[v][4] = &d;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[4\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	char (*pd2)[4][v] = &d;
+	char (*qd2)[3][u] = &d;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	char (*pd3)[u][v] = &d;
+	char (*qd3)[v][u] = &d;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	char e[4][v];
+	char (*pe0)[4][3] = &e;
+	char (*qe0)[4][4] = &e;	
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char f[u][3];
+	char (*pf)[4][3] = &f;
+	char (*qf)[3][3] = &f;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[3\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char (*g[u])[v];
+	char (*(*pg)[u])[v] = &g;
+	char (*(*qg)[v])[u] = &g;	
+	/* { dg-output "\[^\n\r]*bound 3 of type '\[^\]]*\\\[\\\*\\\]' does not match bound 4 of type '\[^\]]*\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	/* assignment */
+
+	pa = &a;
+	qa = &a;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	pb = &b;
+	qb = &b;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	pc0 = &c;
+	qc0 = &c;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	pc1 = &c;
+	qc1 = &c;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[3\\\]' does not match bound 4 of type 'char \\\[4\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	pc2 = &c;
+	qc2 = &c;
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	pd0 = &d;
+	qd0 = &d;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[4\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	pd1 = &d;
+	qd1 = &d;	
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[4\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	pd2 = &d;
+	qd2 = &d;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	pd3 = &d;
+	qd3 = &d;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	pe0 = &e;
+	qe0 = &e;
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	pf = &f;
+	qf = &f;
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[3\\\]\\\[3\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	/* return */
+	z0();		
+	/* { dg-output "\[^\n\r]*bound 5 of type 'char \\\[5\\\]\\\[5\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 5 of type 'char \\\[5\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	z1();		
+	/* { dg-output "\[^\n\r]*bound 5 of type 'char \\\[5\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	z2();	
+	/* { dg-output "\[^\n\r]*bound 5 of type 'char \\\[5\\\]\\\[5\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char (*(*p)(void))[u][v] = &z0;
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 5 of type 'char \\\[5\\\]\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 5 of type 'char \\\[5\\\]'" } */
+}
+
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1b.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1b.c
new file mode 100644
index 00000000000..5b51aeacbcb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-1b.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+/* { dg-require-effective-target trampolines } */
+
+
+static char bb[4][4];
+static char (*g())[4][4] { return &bb; }
+
+int main()
+{
+	int n = 3;
+	char b[4];
+	char (*f())[++n] { return &b; }
+
+	if (4 != sizeof(*f()))
+		__builtin_abort();
+
+	char (*(*p)())[++n] = &f;	/* { dg-output "bound 5 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+	if (5 != sizeof(*(*p)()))
+		__builtin_abort();
+
+	char (*(*q)())[++n][4] = &g;	/* { dg-output "\[^\n\r]*bound 6 of type 'char \\\[\\\*\\\]\\\[4\\\]' does not match bound 4 of type 'char \\\[4\\\]\\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+	if (6 * 4 != sizeof(*(*q)()))
+		__builtin_abort();
+
+	return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
new file mode 100644
index 00000000000..ebc63d32144
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+// make sure we do not ICE on any of these
+
+const char* name = "hallo";
+
+
+typedef void (*ht)(int n, int m, char x[n][m]);
+void e(ht) { }
+
+int n, m;
+static void f0(char a[n][m]) { }
+static void f1(int u, int v, char a[u][v]) { }
+static void f2(int u, int v, char a[u][v]) { }
+
+void f(void)
+{
+	int x = 1;
+	int (*m)[x] = 0;
+	m = ({ long* d2; (int (*)[d2[0]])(0); });
+
+	/* function pointer assignments */
+
+	void (*gp)(char x[4][3]) = f0;
+	void (*hp)(int n, int m, char x[n][m]) = f1;
+	ht hp2 = f1;
+	e(f1);
+
+	/* composite type */
+
+	int u = 3; int v = 4;
+	char a[u][v];
+	(1 ? f1 : f2)(u, v, a);
+}
+
+/* size expression in parameter */
+
+extern void a(long N, char (*a)[N]);
+
+static void b(void)
+{
+	a(1, ({ int d = 0; (char (*)[d])0; }) );
+}
+
+/* composite type */
+
+int c(int u, char (*a)[u]);
+int c(int u, char (*a)[u]) { }
+
+int d(void)
+{
+	char a[3];
+	c(3, &a);
+}
+
diff --git a/libsanitizer/ubsan/ubsan_checks.inc b/libsanitizer/ubsan/ubsan_checks.inc
index 846cd89ee19..3182a48a0e3 100644
--- a/libsanitizer/ubsan/ubsan_checks.inc
+++ b/libsanitizer/ubsan/ubsan_checks.inc
@@ -56,6 +56,7 @@ UBSAN_CHECK(OutOfBoundsIndex, "out-of-bounds-index", "bounds")
 UBSAN_CHECK(UnreachableCall, "unreachable-call", "unreachable")
 UBSAN_CHECK(MissingReturn, "missing-return", "return")
 UBSAN_CHECK(NonPositiveVLAIndex, "non-positive-vla-index", "vla-bound")
+UBSAN_CHECK(VMBoundsMismatch, "vm-bounds-mismatch", "vm-bounds")
 UBSAN_CHECK(FloatCastOverflow, "float-cast-overflow", "float-cast-overflow")
 UBSAN_CHECK(InvalidBoolLoad, "invalid-bool-load", "bool")
 UBSAN_CHECK(InvalidEnumLoad, "invalid-enum-load", "enum")
diff --git a/libsanitizer/ubsan/ubsan_handlers.cpp b/libsanitizer/ubsan/ubsan_handlers.cpp
index 970075e69a6..cbe03ca37e4 100644
--- a/libsanitizer/ubsan/ubsan_handlers.cpp
+++ b/libsanitizer/ubsan/ubsan_handlers.cpp
@@ -433,6 +433,33 @@ void __ubsan::__ubsan_handle_missing_return(UnreachableData *Data) {
   Die();
 }
 
+static void handleVMBoundsMismatch(VMBoundsMismatchData *Data, ValueHandle Bound1,
+				   ValueHandle Bound2, ReportOptions Opts) {
+  SourceLocation Loc = Data->Loc.acquire();
+  ErrorType ET = ErrorType::NonPositiveVLAIndex;
+
+  if (ignoreReport(Loc, Opts, ET))
+    return;
+
+  ScopedReport R(Opts, Loc, ET);
+
+  Diag(Loc, DL_Error, ET, "bound %0 of type %1 does not match bound %2 of type %3")
+      << Value(Data->IndexType, Bound2) << Data->ToType
+      << Value(Data->IndexType, Bound1) << Data->FromType;
+}
+
+void __ubsan::__ubsan_handle_vm_bounds_mismatch(VMBoundsMismatchData *Data,
+                                                ValueHandle Bound1, ValueHandle Bound2) {
+  GET_REPORT_OPTIONS(false);
+  handleVMBoundsMismatch(Data, Bound1, Bound2, Opts);
+}
+void __ubsan::__ubsan_handle_vm_bounds_mismatch_abort(VMBoundsMismatchData *Data,
+                                                          ValueHandle Bound1, ValueHandle Bound2) {
+  GET_REPORT_OPTIONS(true);
+  handleVMBoundsMismatch(Data, Bound1, Bound2, Opts);
+  Die();
+}
+
 static void handleVLABoundNotPositive(VLABoundData *Data, ValueHandle Bound,
                                       ReportOptions Opts) {
   SourceLocation Loc = Data->Loc.acquire();
diff --git a/libsanitizer/ubsan/ubsan_handlers.h b/libsanitizer/ubsan/ubsan_handlers.h
index 9f412353fc0..4765710e9f1 100644
--- a/libsanitizer/ubsan/ubsan_handlers.h
+++ b/libsanitizer/ubsan/ubsan_handlers.h
@@ -107,6 +107,17 @@ struct VLABoundData {
 /// \brief Handle a VLA with a non-positive bound.
 RECOVERABLE(vla_bound_not_positive, VLABoundData *Data, ValueHandle Bound)
 
+struct VMBoundsMismatchData {
+  SourceLocation Loc;
+  const TypeDescriptor &FromType;
+  const TypeDescriptor &ToType;
+  const TypeDescriptor &IndexType;
+};
+
+/// \brief Handle VM types with a run-time bounds mismatch.
+RECOVERABLE(vm_bounds_mismatch, VMBoundsMismatchData *Data, ValueHandle Bound1, ValueHandle Bound2)
+
+
 // Keeping this around for binary compatibility with (sanitized) programs
 // compiled with older compilers.
 struct FloatCastOverflowData {
  
Martin Uecker May 29, 2023, 10:22 a.m. UTC | #2
c: introduce ubsan checking for assignment of VM types 3/4
    
    Support instrumentation of function arguments for functions
    called via a declaration.  We can support only simple size
    expressions without side effects, because the UBSan
    instrumentation is done before the call, but the expressions
    are evaluated in the callee.
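
As a rough illustration of why the check sits on the caller side, the
following sketch performs by hand the comparison that would be placed
before a call. It assumes a simple size expression (just the parameter
n), so the caller can substitute the argument value for n without
evaluating anything twice. The functions callee and checked_call are
hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Callee with a variably modified parameter: the bound of *a is the
   value of the parameter n.  */
static size_t
callee (int n, char (*a)[n])
{
  return sizeof *a;
}

/* Hand-written caller-side check: compare the value passed for n with
   the bound of the array whose address is passed.  Returns -1 where
   the instrumentation would report a mismatch, else the callee's
   result.  */
static int
checked_call (int n_arg)
{
  char buf[4];

  assert (n_arg > 0);                 /* VLA bounds must be positive */
  if ((size_t) n_arg != sizeof buf)   /* element type is char */
    return -1;
  return (int) callee (n_arg, &buf);
}
```

A size expression with side effects, such as n++, could not be handled
this way, because evaluating it in the caller as well would change the
behavior of the program.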
    
    gcc/c-family:
            * c-ubsan.cc (ubsan_instrument_vm_assign): Add arguments
            for size expressions.
            * c-ubsan.h (ubsan_instrument_vm_assign): Ditto.
    
    gcc/c:
            * c-typeck.cc (process_vm_constraints): Add support
            for instrumenting function arguments.
    
    gcc/testsuite/gcc.dg:
            * ubsan/vm-bounds-2.c: Update.
            * ubsan/vm-bounds-3.c: New test.
            * ubsan/vm-bounds-4.c: New test.

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 59ef9708188..a8f95aa39e8 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -337,19 +337,13 @@ ubsan_instrument_vla (location_t loc, tree size)
 /* Instrument assignment of variably modified types.  */
 
 tree
-ubsan_instrument_vm_assign (location_t loc, tree a, tree b)
+ubsan_instrument_vm_assign (location_t loc, tree a, tree as, tree b, tree bs)
 {
   tree t, tt;
 
   gcc_assert (TREE_CODE (a) == ARRAY_TYPE);
   gcc_assert (TREE_CODE (b) == ARRAY_TYPE);
 
-  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (a));
-  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (b));
-
-  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
-  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
-
   t = build2 (NE_EXPR, boolean_type_node, as, bs);
   if (flag_sanitize_trap & SANITIZE_VLA)
     tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index 1b07b0645f2..42be1d691a8 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -26,7 +26,7 @@ extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
 extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
-extern tree ubsan_instrument_vm_assign (location_t, tree, tree);
+extern tree ubsan_instrument_vm_assign (location_t, tree, tree, tree, tree);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
 extern void ubsan_maybe_instrument_reference (tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index a8fccc6f6ed..aeddac315fc 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -3408,7 +3408,8 @@ static tree
 convert_argument (location_t ploc, tree function, tree fundecl,
 		  tree type, tree origtype, tree val, tree valtype,
 		  bool npc, tree rname, int parmnum, int argnum,
-		  bool excess_precision, int warnopt)
+		  bool excess_precision, int warnopt,
+		  vec<struct instrument_data, va_gc> *instr_vec)
 {
   /* Formal parm type is specified by a function prototype.  */
 
@@ -3567,7 +3568,7 @@ convert_argument (location_t ploc, tree function, tree fundecl,
 						    val, origtype, ic_argpass,
 						    npc, fundecl, function,
 						    parmnum + 1, warnopt,
-						    NULL);
+						    instr_vec);
 
   if (targetm.calls.promote_prototypes (fundecl ? TREE_TYPE (fundecl) : 0)
       && INTEGRAL_TYPE_P (type)
@@ -3582,15 +3583,111 @@ convert_argument (location_t ploc, tree function, tree fundecl,
 
 static tree
 process_vm_constraints (location_t location,
-			vec<struct instrument_data, va_gc> *instr_vec)
+			vec<struct instrument_data, va_gc> *instr_vec,
+			tree function, tree fundecl, vec<tree, va_gc> *values)
 {
   unsigned int i;
   struct instrument_data* d;
   tree instr_expr = void_node;
+  tree args = NULL;
+
+  /* Find the arguments for the function declaration / type.  */
+  if (function)
+    {
+      if (FUNCTION_DECL == TREE_CODE (function))
+	{
+	  fundecl = function;
+	  args = DECL_ARGUMENTS (fundecl);
+	}
+      else
+	{
+	  /* Functions called via pointers are not yet supported.  */
+	  return void_node;
+	}
+    }
 
   FOR_EACH_VEC_SAFE_ELT (instr_vec, i, d)
     {
-      tree in = ubsan_instrument_vm_assign (location, d->t1, d->t2);
+      tree t1 = d->t1;
+      tree t2 = d->t2;
+
+      gcc_assert (ARRAY_TYPE == TREE_CODE (t1));
+      gcc_assert (ARRAY_TYPE == TREE_CODE (t2));
+
+      tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (t1));
+      tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (t2));
+
+      gcc_assert (as);
+      gcc_assert (bs);
+
+      as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
+      bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
+
+      if (function)
+	{
+
+	  if (INTEGER_CST == TREE_CODE (bs))
+	    goto cont;
+
+	  if (NOP_EXPR == TREE_CODE (bs)
+	      && SAVE_EXPR == TREE_CODE (TREE_OPERAND (bs, 0)))
+	    {
+	      tree bs1 = TREE_OPERAND (bs, 0);
+	      tree bs2 = TREE_OPERAND (bs1, 0);
+
+	      /* Another parameter of the current functions.  */
+	      if (PARM_DECL == TREE_CODE (bs2)
+		  && (DECL_CONTEXT (bs2) == fundecl
+		      || DECL_CONTEXT (bs2) == NULL))
+		{
+		  tree arg = args;
+		  int pos = 0;
+		  while (arg)
+		   {
+		     if (arg == bs2)
+		       {
+			 bs = (*values)[pos];
+			 bs = save_expr (bs);
+			 bs = build1 (NOP_EXPR, sizetype, bs);
+			 break;
+		       }
+		     pos++;
+		     arg = DECL_CHAIN (arg);
+		   }
+		  if (!arg)
+		    goto giveup;
+		  goto cont;
+		}
+
+	      /*  A parameter of an enclosing function.  */
+	      if (PARM_DECL == TREE_CODE (bs2)
+		  && DECL_CONTEXT (bs2) != fundecl)
+		{
+		  bs2 = unshare_expr (bs2);
+		  bs1 = save_expr (bs2);
+		  bs = build1 (NOP_EXPR, sizetype, bs1);
+		  goto cont;
+		}
+
+	      /*  A variable with enclosing scope.  */
+	      if (VAR_DECL == TREE_CODE (bs2))
+		{
+		  bs2 = unshare_expr (bs2);
+		  bs1 = save_expr (bs2);
+		  bs = build1 (NOP_EXPR, sizetype, bs1);
+		  goto cont;
+		}
+	    }
+	giveup:
+	  /*  Give up.  If we do not understand a size expression, we can
+	      also not instrument any of the others because it may have
+	      side effects affecting them.  (We could restart and instrument
+	      the only the ones with integer constants.)   */
+	    warning_at (location, 0, "Function call not instrumented.");
+	    return void_node;
+	}
+cont:
+      tree in = ubsan_instrument_vm_assign (location, t1, as, t2, bs);
       instr_expr = fold_build2 (COMPOUND_EXPR, void_type_node, in, instr_expr);
     }
   return instr_expr;
@@ -3689,6 +3786,11 @@ convert_arguments (location_t loc, vec<location_t> arg_loc, tree typelist,
 	  }
     }
 
+  vec<struct instrument_data, va_gc> *instr_vec = NULL;
+
+  if (sanitize_flags_p (SANITIZE_VLA))
+    vec_alloc (instr_vec, 20);
+
   /* Scan the given expressions (VALUES) and types (TYPELIST), producing
      individual converted arguments.  */
 
@@ -3801,7 +3903,7 @@ convert_arguments (location_t loc, vec<location_t> arg_loc, tree typelist,
 	  tree origtype = (!origtypes) ? NULL_TREE : (*origtypes)[parmnum];
 	  parmval = convert_argument (ploc, function, fundecl, type, origtype,
 				      val, valtype, npc, rname, parmnum, argnum,
-				      excess_precision, 0);
+				      excess_precision, 0, instr_vec);
 	}
       else if (promote_float_arg)
         {
@@ -3854,7 +3956,7 @@ convert_arguments (location_t loc, vec<location_t> arg_loc, tree typelist,
 	  convert_argument (ploc, function, fundecl, builtin_type, origtype,
 			    val, valtype, npc, rname, parmnum, argnum,
 			    excess_precision,
-			    OPT_Wbuiltin_declaration_mismatch);
+			    OPT_Wbuiltin_declaration_mismatch, NULL);
 	}
 
       if (typetail)
@@ -3866,6 +3968,22 @@ convert_arguments (location_t loc, vec<location_t> arg_loc, tree typelist,
 
   gcc_assert (parmnum == vec_safe_length (values));
 
+  if (0 < parmnum && instr_vec && 0 < vec_safe_length (instr_vec))
+    {
+      tree instr_expr = process_vm_constraints (loc, instr_vec, function, fundecl, values);
+      /* We have to make sure that all parameters are evaluated first,
+	 because we may use size expressions in it to check bounds.  */
+      if (void_node != instr_expr)
+	{
+	  tree parmval = (*values)[0];
+	  parmval = save_expr (parmval);
+	  instr_expr = fold_build2 (COMPOUND_EXPR, void_type_node, parmval, instr_expr);
+	  parmval = fold_build2 (COMPOUND_EXPR, TREE_TYPE (parmval), instr_expr, parmval);
+	  (*values)[0] = parmval;
+	}
+      vec_free (instr_vec);
+    }
+
   if (typetail != NULL_TREE && TREE_VALUE (typetail) != void_type_node)
     {
       error_at (loc, "too few arguments to function %qE", function);
@@ -6944,7 +7062,7 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
 	{
 	  /* We have to make sure that the rhs is evaluated first,
 	     because we may use size expressions in it to check bounds.  */
-	  tree instr_expr = process_vm_constraints (location, instr_vec);
+	  tree instr_expr = process_vm_constraints (location, instr_vec, NULL, NULL, NULL);
 	  if (void_node != instr_expr)
 	    {
 	      ret = save_expr (ret);
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
index ebc63d32144..22f06231eaa 100644
--- a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
@@ -51,6 +51,6 @@ int c(int u, char (*a)[u]) { }
 int d(void)
 {
 	char a[3];
-	c(3, &a);
+	c(3, &a);		/* { dg-warning "Function call not instrumented." } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-3.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-3.c
new file mode 100644
index 00000000000..9ec95921fb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-3.c
@@ -0,0 +1,96 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+int m, n;
+
+static void a0(char (*a)[4]) { }
+static void b0(char (*a)[n]) { }
+static void c0(char (*a)[n][m]) { }
+static void d0(char (*a)[4][m]) { }
+static void e0(char (*a)[n][3]) { }
+static void f0(char a[n][m]) { }
+
+static void b1(int u, char (*a)[u]) { }
+static void c1(int u, int v, char (*a)[u][v]) { }
+static void d1(int v, char (*a)[4][v]) { }
+static void e1(int u, char (*a)[u][3]) { }
+static void f1(int u, int v, char a[u][v]) { }
+
+
+
+int main()
+{
+	m = 3, n = 4;
+
+	int u = 4;
+	int v = 3;
+
+	/* function arguments */
+
+	char A0[4];
+	char A1[u];
+	char B0[3];
+	char B1[v];
+
+	a0(&A0);
+	a0(&A1);
+	a0(&B0);	/* { dg-warning "incompatible pointer type" } */
+	a0(&B1);
+	/* { dg-output "bound 4 of type 'char \\\[4\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b0(&A0);
+	b0(&A1);
+	b0(&B0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b0(&B1);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b1(4, &A0);
+	b1(4, &B0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+	char C0[4][3];
+	char C1[u][3];
+	char C2[4][v];
+	char C3[u][v];
+	char D0[3][4];
+	char D1[v][4];
+	char D2[3][u];
+	char D3[v][u];
+
+	c0(&C0);
+	c0(&C1);
+	c0(&C2);
+	c0(&C3);
+	c0(&D0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]\\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	c0(&D1);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]\\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	c0(&D2);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	c0(&D3);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]' does not match bound 3 of type 'char \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	
+	d0(&C0);
+	d0(&D0);	/* { dg-warning "incompatible pointer type" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	d1(3, &C0);
+	d1(3, &D0);	/* { dg-warning "incompatible pointer type" } */
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	e0(&C0);
+	e0(&D0);	/* { dg-warning "incompatible pointer type" } */
+	e1(4, &C0);
+	e1(4, &D0);	/* { dg-warning "incompatible pointer type" } */
+
+	f0(C0);
+	f0(D0);
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	f1(4, 3, C0);
+	f1(4, 3, D0);
+	/* { dg-output "\[^\n\r]*bound 3 of type 'char \\\[\\\*\\\]' does not match bound 4 of type 'char \\\[4\\\]'" } */
+}
+
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-4.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-4.c
new file mode 100644
index 00000000000..e745b41d159
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-4.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+
+
+void b0(int n, char (*a)[n]) { }
+void b1(int m, char (*a)[m]);
+void b2(int m; char (*a)[m], int m) { }
+void b3(int m; char (*a)[m], int m);
+int n;
+void b4(char (*a)[n]) { }
+void b5(char (*a)[n]);
+
+int main()
+{
+	char A0[3];
+	b1(4, &A0);
+	/* { dg-output "bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b0(4, &A0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b2(&A0, 4);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b3(&A0, 4);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	n = 4;
+	b4(&A0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	b5(&A0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'" } */
+}
+
+
+void b1(int n, char (*a)[n]) { }
+void b3(int m; char (*a)[m], int m) { }
+void b5(char (*a)[n]) { }
+
+
  
Martin Uecker May 29, 2023, 10:22 a.m. UTC | #3
c: introduce ubsan checking for assignment of VM types 4/4
    
    Support instrumentation of functions called via pointers.  To do so,
    record the declaration with the parameter types, so that it can be
    retrieved later.
    
    gcc/c:
            * c-decl.cc (get_parm_info): Record function declaration
            for arguments.
            * c-typeck.cc (process_vm_constraints): Instrument functions
            called via pointers.
    
    gcc/testsuite/gcc.dg:
            * ubsan/vm-bounds-2.c: Add warning.
            * ubsan/vm-bounds-5.c: New test.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 1af51c4acfc..c33adf7e5fe 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -8410,6 +8410,9 @@ get_parm_info (bool ellipsis, tree expr)
 		 declared types.  The back end may override this later.  */
 	      DECL_ARG_TYPE (decl) = type;
 	      types = tree_cons (0, type, types);
+
+	      /* Record the decl for use of UBSan bounds checking.  */
+	      TREE_PURPOSE (types) = decl;
 	    }
 	  break;
 
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index aeddac315fc..43e7b96a55f 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -3601,9 +3601,20 @@ process_vm_constraints (location_t location,
 	}
       else
 	{
-	  /* Functions called via pointers are not yet supported.  */
-	  return void_node;
+	  while (FUNCTION_TYPE != TREE_CODE (function))
+	    function = TREE_TYPE (function);
+
+	  args = TREE_PURPOSE (TYPE_ARG_TYPES (function));
+
+	  if (!args)
+	    {
+	      /* FIXME: this can happen when forming composite types for the
+		 conditional operator.  */
+	      warning_at (location, 0, "Function call not instrumented.");
+	      return void_node;
+	    }
 	}
+      gcc_assert (PARM_DECL == TREE_CODE (args));
     }
 
   FOR_EACH_VEC_SAFE_ELT (instr_vec, i, d)
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
index 22f06231eaa..093cbddd2ea 100644
--- a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
@@ -31,7 +31,7 @@ void f(void)
 
 	int u = 3; int v = 4;
 	char a[u][v];
-	(1 ? f1 : f2)(u, v, a);
+	(1 ? f1 : f2)(u, v, a);	/* { dg-warning "Function call not instrumented." } */
 }
 
 /* size expression in parameter */
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c
new file mode 100644
index 00000000000..1a251e39deb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c
@@ -0,0 +1,72 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+
+void foo1(void (*p)(int n, char (*a)[n]))
+{
+	char A0[3];
+	(*p)(3, &A0);
+	(*p)(4, &A0);	/* */
+	/* { dg-output "bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void b0(int n, char (*a)[n]) { }
+
+
+int n;
+
+void foo2(void (*p)(int n, char (*a)[n]))
+{
+	n = 4;
+	char A0[3];
+	(*p)(3, &A0);
+	(*p)(4, &A0);
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void foo3(void (*p)(int n0, char (*a)[n]))
+{
+	n = 4;
+	char A0[3];
+	(*p)(3, &A0);	/* */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+	(*p)(4, &A0);	/* */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void foo4(void (*p)(int n, char (*a)[n]))
+{
+	n = 3;
+	char A0[3];
+	(*p)(3, &A0);
+	(*p)(4, &A0);	/* */
+	/* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not match bound 3 of type 'char \\\[3\\\]'" } */
+}
+
+
+void foo5(void (*p)(int n0, char (*a)[n]))
+{
+	n = 3;
+	char A0[3];
+	(*p)(3, &A0);
+	(*p)(4, &A0);
+}
+
+
+void b1(int n0, char (*a)[n]) { }
+
+
+
+int main()
+{
+	foo1(&b0);
+
+	foo2(&b1);
+	foo3(&b1); // we should diagnose mismatch and run-time discrepancies
+
+	foo4(&b1);
+	foo5(&b1); // we should diagnose mismatch and run-time discrepancies
+}
+
+
+
  
Joseph Myers May 30, 2023, 10:59 p.m. UTC | #4
On Mon, 29 May 2023, Martin Uecker via Gcc-patches wrote:

>     Support instrumentation of function arguments for functions
>     called via a declaration.  We can support only simple size

What do you mean by "via a declaration"?

If the *definition* is visible (and known to be the definition used at 
runtime rather than being interposed) then you can determine in some cases 
that there is UB from bad bounds.  If only some other declaration is 
visible, or the definition might be interposed, VLA sizes in the 
declaration are equivalent to [*]; it's suspicious if they don't match, 
but it's not UB and so it would seem rather questionable for UBSan to 
treat it as such (cf. the rejection in GCC of sanitization for some 
questionable cases of unsigned integer overflow that aren't UB either).

> +	  /*  Give up.  If we do not understand a size expression, we can
> +	      also not instrument any of the others because it may have
> +	      side effects affecting them.  (We could restart and instrument
> +	      the only the ones with integer constants.)   */
> +	    warning_at (location, 0, "Function call not instrumented.");
> +	    return void_node;

This is not a properly formatted diagnostic message (should start with a 
lowercase letter and not end with '.').
  
Martin Uecker May 31, 2023, 8:12 a.m. UTC | #5
On Tuesday, 30.05.2023 at 22:59 +0000, Joseph Myers wrote:
> On Mon, 29 May 2023, Martin Uecker via Gcc-patches wrote:
> 
> >     Support instrumentation of function arguments for functions
> >     called via a declaration.  We can support only simple size
> 
> What do you mean by "via a declaration"?
> 
> If the *definition* is visible (and known to be the definition used
> at runtime rather than being interposed) then you can determine in
> some cases  that there is UB from bad bounds.  If only some other
> declaration is visible, or the definition might be interposed, VLA
> sizes in the declaration are equivalent to [*]; it's suspicious if
> they don't match, but it's not UB and so it would seem rather
> questionable for UBSan to treat it as such (cf. the rejection in 
> GCC of sanitization for some questionable cases of unsigned integer
> overflow that aren't UB either).

You are right that it is UB only with the additional
assumption that the bounds in the seen declaration are
the same as the ones in the definition. But since GCC 11
we warn about any mismatch with -Wall, based on the
understanding that any such mismatch should be considered
a bug. There also does not seem to be any valid use case
for having mismatching bounds, and I think the intention
of WG14 is clearly that they can be used for checking
(cf. WG14 charter). So I think this is a different
situation from unsigned integer overflow.

From a practical point of view it is certainly very useful
for users to be able to verify these bounds at run-time. 
But we could make it a separate UBSan option if it is
really a concern.

BTW: There was a similar discussion years ago about making
certain bound checks for arrays part of UBSan because it is
not clear that the bounds in the type of 'x' in x[n] are
relevant rather than the ones of the underlying array
(which may be different).  In the end both GCC and clang
have these UBSan checks now and I think everybody is
happy about it.


> > +         /*  Give up.  If we do not understand a size expression, we can
> > +             also not instrument any of the others because it may have
> > +             side effects affecting them.  (We could restart and instrument
> > +             the only the ones with integer constants.)   */
> > +           warning_at (location, 0, "Function call not instrumented.");
> > +           return void_node;
> 
> This is not a properly formatted diagnostic message (should start
> with a lowercase letter and not end with '.').

Thanks. I would probably remove this warning and re-introduce it
with another patch that also adds an option for it.

Martin
  

Patch

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 22e240a3c2a..2a1b7321b45 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -90,12 +90,14 @@  static bool require_constant_elements;
 static bool require_constexpr_value;
 
 static tree qualify_type (tree, tree);
-static int tagged_types_tu_compatible_p (const_tree, const_tree, bool *,
-					 bool *);
+struct comptypes_data;
+static int tagged_types_tu_compatible_p (const_tree, const_tree,
+					 struct comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
-static int function_types_compatible_p (const_tree, const_tree, bool *,
-					bool *);
-static int type_lists_compatible_p (const_tree, const_tree, bool *, bool *);
+static int function_types_compatible_p (const_tree, const_tree,
+					struct comptypes_data *);
+static int type_lists_compatible_p (const_tree, const_tree,
+				    struct comptypes_data *);
 static tree lookup_field (tree, tree);
 static int convert_arguments (location_t, vec<location_t>, tree,
 			      vec<tree, va_gc> *, vec<tree, va_gc> *, tree,
@@ -125,7 +127,8 @@  static tree find_init_member (tree, struct obstack *);
 static void readonly_warning (tree, enum lvalue_use);
 static int lvalue_or_else (location_t, const_tree, enum lvalue_use);
 static void record_maybe_used_decl (tree);
-static int comptypes_internal (const_tree, const_tree, bool *, bool *);
+static int comptypes_internal (const_tree, const_tree,
+			       struct comptypes_data *data);
 
 /* Return true if EXP is a null pointer constant, false otherwise.  */
 
@@ -1039,6 +1042,12 @@  common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+struct comptypes_data {
+
+  bool enum_and_int_p;
+  bool different_types_p;
+};
+
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
    or various other operations.  Return 2 if they are compatible
    but a warning may be needed if you use them together.  */
@@ -1049,7 +1058,9 @@  comptypes (tree type1, tree type2)
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, NULL, NULL);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, &data);
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1064,7 +1075,10 @@  comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, enum_and_int_p, NULL);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, &data);
+  *enum_and_int_p = data.enum_and_int_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1080,7 +1094,10 @@  comptypes_check_different_types (tree type1, tree type2,
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, NULL, different_types_p);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, &data);
+  *different_types_p = data.different_types_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1089,19 +1106,18 @@  comptypes_check_different_types (tree type1, tree type2,
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
    or various other operations.  Return 2 if they are compatible
    but a warning may be needed if you use them together.  If
-   ENUM_AND_INT_P is not NULL, and one type is an enum and the other a
-   compatible integer type, then this sets *ENUM_AND_INT_P to true;
-   *ENUM_AND_INT_P is never set to false.  If DIFFERENT_TYPES_P is not
-   NULL, and the types are compatible but different enough not to be
+   one type is an enum and the other a compatible integer type, then
+   this sets 'enum_and_int_p' in DATA to true (it is never set to
+   false).  If the types are compatible but different enough not to be
    permitted in C11 typedef redeclarations, then this sets
-   *DIFFERENT_TYPES_P to true; *DIFFERENT_TYPES_P is never set to
+   'different_types_p' in DATA to true; it is never set to
    false, but may or may not be set if the types are incompatible.
    This differs from comptypes, in that we don't free the seen
    types.  */
 
 static int
-comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
-		    bool *different_types_p)
+comptypes_internal (const_tree type1, const_tree type2,
+		    struct comptypes_data *data)
 {
   const_tree t1 = type1;
   const_tree t2 = type2;
@@ -1124,10 +1140,8 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
       t1 = ENUM_UNDERLYING_TYPE (t1);
       if (TREE_CODE (t2) != VOID_TYPE)
 	{
-	  if (enum_and_int_p != NULL)
-	    *enum_and_int_p = true;
-	  if (different_types_p != NULL)
-	    *different_types_p = true;
+	  data->enum_and_int_p = true;
+	  data->different_types_p = true;
 	}
     }
   else if (TREE_CODE (t2) == ENUMERAL_TYPE
@@ -1137,10 +1151,8 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
       t2 = ENUM_UNDERLYING_TYPE (t2);
       if (TREE_CODE (t1) != VOID_TYPE)
 	{
-	  if (enum_and_int_p != NULL)
-	    *enum_and_int_p = true;
-	  if (different_types_p != NULL)
-	    *different_types_p = true;
+	  data->enum_and_int_p = true;
+	  data->different_types_p = true;
 	}
     }
 
@@ -1198,13 +1210,11 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
       if (TYPE_MODE (t1) != TYPE_MODE (t2))
 	break;
       val = (TREE_TYPE (t1) == TREE_TYPE (t2)
-	     ? 1 : comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2),
-				       enum_and_int_p, different_types_p));
+	     ? 1 : comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data));
       break;
 
     case FUNCTION_TYPE:
-      val = function_types_compatible_p (t1, t2, enum_and_int_p,
-					 different_types_p);
+      val = function_types_compatible_p (t1, t2, data);
       break;
 
     case ARRAY_TYPE:
@@ -1218,13 +1228,11 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
 	/* Target types must match incl. qualifiers.  */
 	if (TREE_TYPE (t1) != TREE_TYPE (t2)
 	    && (val = comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2),
-					  enum_and_int_p,
-					  different_types_p)) == 0)
+					  data)) == 0)
 	  return 0;
 
-	if (different_types_p != NULL
-	    && (d1 == NULL_TREE) != (d2 == NULL_TREE))
-	  *different_types_p = true;
+	if ((d1 == NULL_TREE) != (d2 == NULL_TREE))
+	  data->different_types_p = true;
 	/* Sizes must match unless one is missing or variable.  */
 	if (d1 == NULL_TREE || d2 == NULL_TREE || d1 == d2)
 	  break;
@@ -1241,9 +1249,8 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
 	d1_variable = d1_variable || (d1_zero && C_TYPE_VARIABLE_SIZE (t1));
 	d2_variable = d2_variable || (d2_zero && C_TYPE_VARIABLE_SIZE (t2));
 
-	if (different_types_p != NULL
-	    && d1_variable != d2_variable)
-	  *different_types_p = true;
+	if (d1_variable != d2_variable)
+	  data->different_types_p = true;
 	if (d1_variable || d2_variable)
 	  break;
 	if (d1_zero && d2_zero)
@@ -1268,8 +1275,7 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
 	      && ! attribute_list_contained (a2, a1))
 	    break;
 
-	  val = tagged_types_tu_compatible_p (t1, t2, enum_and_int_p,
-					      different_types_p);
+	  val = tagged_types_tu_compatible_p (t1, t2, data);
 
 	  if (attrval != 2)
 	    return val;
@@ -1278,8 +1284,7 @@  comptypes_internal (const_tree type1, const_tree type2, bool *enum_and_int_p,
 
     case VECTOR_TYPE:
       val = (known_eq (TYPE_VECTOR_SUBPARTS (t1), TYPE_VECTOR_SUBPARTS (t2))
-	     && comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2),
-				    enum_and_int_p, different_types_p));
+	     && comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data));
       break;
 
     default:
@@ -1395,14 +1400,11 @@  free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache *tu_til)
 
 /* Return 1 if two 'struct', 'union', or 'enum' types T1 and T2 are
    compatible.  If the two types are not the same (which has been
-   checked earlier), this can only happen when multiple translation
-   units are being compiled.  See C99 6.2.7 paragraph 1 for the exact
-   rules.  ENUM_AND_INT_P and DIFFERENT_TYPES_P are as in
-   comptypes_internal.  */
+   checked earlier).  */
 
 static int
 tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
-			      bool *enum_and_int_p, bool *different_types_p)
+			      struct comptypes_data* data)
 {
   tree s1, s2;
   bool needs_warning = false;
@@ -1513,8 +1515,7 @@  tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
 
 	    if (DECL_NAME (s1) != DECL_NAME (s2))
 	      break;
-	    result = comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2),
-					 enum_and_int_p, different_types_p);
+	    result = comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2), data);
 
 	    if (result != 1 && !DECL_NAME (s1))
 	      break;
@@ -1550,8 +1551,7 @@  tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
 		  int result;
 
 		  result = comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2),
-					       enum_and_int_p,
-					       different_types_p);
+					       data);
 
 		  if (result != 1 && !DECL_NAME (s1))
 		    continue;
@@ -1599,8 +1599,7 @@  tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
 	    if (TREE_CODE (s1) != TREE_CODE (s2)
 		|| DECL_NAME (s1) != DECL_NAME (s2))
 	      break;
-	    result = comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2),
-					 enum_and_int_p, different_types_p);
+	    result = comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2), data);
 	    if (result == 0)
 	      break;
 	    if (result == 2)
@@ -1633,7 +1632,7 @@  tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
 
 static int
 function_types_compatible_p (const_tree f1, const_tree f2,
-			     bool *enum_and_int_p, bool *different_types_p)
+			     struct comptypes_data *data)
 {
   tree args1, args2;
   /* 1 if no need for warning yet, 2 if warning cause has been seen.  */
@@ -1654,16 +1653,15 @@  function_types_compatible_p (const_tree f1, const_tree f2,
   if (TYPE_VOLATILE (ret2))
     ret2 = build_qualified_type (TYPE_MAIN_VARIANT (ret2),
 				 TYPE_QUALS (ret2) & ~TYPE_QUAL_VOLATILE);
-  val = comptypes_internal (ret1, ret2, enum_and_int_p, different_types_p);
+  val = comptypes_internal (ret1, ret2, data);
   if (val == 0)
     return 0;
 
   args1 = TYPE_ARG_TYPES (f1);
   args2 = TYPE_ARG_TYPES (f2);
 
-  if (different_types_p != NULL
-      && (args1 == NULL_TREE) != (args2 == NULL_TREE))
-    *different_types_p = true;
+  if ((args1 == NULL_TREE) != (args2 == NULL_TREE))
+    data->different_types_p = true;
 
   /* An unspecified parmlist matches any specified parmlist
      whose argument types don't need default promotions.  */
@@ -1679,7 +1677,7 @@  function_types_compatible_p (const_tree f1, const_tree f2,
 	 If they don't match, ask for a warning (but no error).  */
       if (TYPE_ACTUAL_ARG_TYPES (f1)
 	  && type_lists_compatible_p (args2, TYPE_ACTUAL_ARG_TYPES (f1),
-				      enum_and_int_p, different_types_p) != 1)
+				      data) != 1)
 	val = 2;
       return val;
     }
@@ -1691,14 +1689,13 @@  function_types_compatible_p (const_tree f1, const_tree f2,
 	return 0;
       if (TYPE_ACTUAL_ARG_TYPES (f2)
 	  && type_lists_compatible_p (args1, TYPE_ACTUAL_ARG_TYPES (f2),
-				      enum_and_int_p, different_types_p) != 1)
+				      data) != 1)
 	val = 2;
       return val;
     }
 
   /* Both types have argument lists: compare them and propagate results.  */
-  val1 = type_lists_compatible_p (args1, args2, enum_and_int_p,
-				  different_types_p);
+  val1 = type_lists_compatible_p (args1, args2, data);
   return val1 != 1 ? val1 : val;
 }
 
@@ -1709,7 +1706,7 @@  function_types_compatible_p (const_tree f1, const_tree f2,
 
 static int
 type_lists_compatible_p (const_tree args1, const_tree args2,
-			 bool *enum_and_int_p, bool *different_types_p)
+			 struct comptypes_data *data)
 {
   /* 1 if no need for warning yet, 2 if warning cause has been seen.  */
   int val = 1;
@@ -1740,9 +1737,8 @@  type_lists_compatible_p (const_tree args1, const_tree args2,
 	 means there is supposed to be an argument
 	 but nothing is specified about what type it has.
 	 So match anything that self-promotes.  */
-      if (different_types_p != NULL
-	  && (a1 == NULL_TREE) != (a2 == NULL_TREE))
-	*different_types_p = true;
+      if ((a1 == NULL_TREE) != (a2 == NULL_TREE))
+	data->different_types_p = true;
       if (a1 == NULL_TREE)
 	{
 	  if (c_type_promotes_to (a2) != a2)
@@ -1757,11 +1753,9 @@  type_lists_compatible_p (const_tree args1, const_tree args2,
       else if (TREE_CODE (a1) == ERROR_MARK
 	       || TREE_CODE (a2) == ERROR_MARK)
 	;
-      else if (!(newval = comptypes_internal (mv1, mv2, enum_and_int_p,
-					      different_types_p)))
+      else if (!(newval = comptypes_internal (mv1, mv2, data)))
 	{
-	  if (different_types_p != NULL)
-	    *different_types_p = true;
+	  data->different_types_p = true;
 	  /* Allow  wait (union {union wait *u; int *i} *)
 	     and  wait (union wait *)  to be compatible.  */
 	  if (TREE_CODE (a1) == UNION_TYPE
@@ -1782,8 +1776,7 @@  type_lists_compatible_p (const_tree args1, const_tree args2,
 			   ? c_build_qualified_type (TYPE_MAIN_VARIANT (mv3),
 						     TYPE_QUAL_ATOMIC)
 			   : TYPE_MAIN_VARIANT (mv3));
-		  if (comptypes_internal (mv3, mv2, enum_and_int_p,
-					  different_types_p))
+		  if (comptypes_internal (mv3, mv2, data))
 		    break;
 		}
 	      if (memb == NULL_TREE)
@@ -1807,8 +1800,7 @@  type_lists_compatible_p (const_tree args1, const_tree args2,
 			   ? c_build_qualified_type (TYPE_MAIN_VARIANT (mv3),
 						     TYPE_QUAL_ATOMIC)
 			   : TYPE_MAIN_VARIANT (mv3));
-		  if (comptypes_internal (mv3, mv1, enum_and_int_p,
-					  different_types_p))
+		  if (comptypes_internal (mv3, mv1, data))
 		    break;
 		}
 	      if (memb == NULL_TREE)