Allow (void *) 0xdeadbeef accesses without warnings [PR99578]

Message ID YjS6xjDiHD3ebBca@tucnak
State New
Series Allow (void *) 0xdeadbeef accesses without warnings [PR99578]

Commit Message

Jakub Jelinek March 18, 2022, 5 p.m. UTC
  Hi!

Starting with GCC 11 we keep emitting false positive -Warray-bounds,
-Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
style accesses (or on memory/string routines given such pointers).
This is a standard programming style supported by all C/C++ compilers
I've ever tried, used mostly in kernel or DSP programming, but sometimes
also together with mmap MAP_FIXED, when certain things, often I/O
registers but possibly anything else, are known to be present at fixed
addresses.
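
For illustration, a minimal sketch of the kind of code in question (the
address and the register layout here are of course made up):

    /* Hypothetical memory-mapped I/O registers at a fixed address.  */
    struct uart_regs { volatile unsigned int data, status; };
    #define UART ((struct uart_regs *) 0x12345000)

    void
    uart_putc (char c)
    {
      while (!(UART->status & 1))   /* Wait for room in the TX FIFO.  */
        ;
      UART->data = (unsigned char) c;   /* Accesses like these got the
                                           false positive warnings.  */
    }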

Such INTEGER_CST addresses can appear in code either because a user
wrote them like that (in which case they are fine) or because somebody
used pointer arithmetics (including &((struct whatever *)NULL)->field)
on a NULL pointer.  The middle-end warning code wrongly assumes that the
latter case is by far the most likely one, while the former is unlikely
and users should change their code.
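
To make the distinction concrete (the struct and field names are made up):

    struct whatever { int a; int field; };

    void
    example (void)
    {
      /* Fine: the user explicitly asked for a fixed address.  */
      int *fixed = (int *) 0x12345000;

      /* The very same kind of INTEGER_CST address can also be the result
         of pointer arithmetics on a NULL pointer, which is the invalid
         case the warnings are really after.  */
      int *from_null = &((struct whatever *) 0)->field;

      (void) fixed;
      (void) from_null;
    }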

The following patch attempts to differentiate between the two, but
because we are in stage4, it does so in a temporary, compromise way.
For GCC 13, the aim is to represent in the IL the user-written
(type *)0x12345000 style addresses as INTEGER_CSTs as before, and to
represent the addresses derived from pointer arithmetics on a NULL
pointer as &MEM_REF[(void*)0B + offset].  When we see say a POINTER_PLUS
of NULL and a non-zero offset, I think we should fold it to this form.
We IMNSHO need to keep supporting the poor man's offsetof (sketched
below), which I'm afraid is still in wide use, so we should IMHO fold
casts of such addresses to integers into the INTEGER_CST offset.  SCCVN
IMHO shouldn't treat the &MEM_REF[(void*)0B + offset] and (void*)offset
forms as equivalent (yes, it will be a small missed optimization, but
RTL will see them the same), but e.g. the patch already does ==/!=
folding between the two forms.
As doing this fully seems to be more of a stage1 project, the following
patch keeps doing what GCC 11/GCC 12 did until now for pointer constants
smaller than a parameter, by default equal to 4KB.  That means PR102037
will remain unfixed for now (unless users lower the new param manually)
and for the most common cases nothing changes right now, but for higher
addresses we'll assume that such INTEGER_CSTs are valid direct address
accesses and use the &MEM_REF[(void*)0B + offset] form for the invalid
code on which we'll warn.
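
The poor man's offsetof mentioned above is the old pre-offsetof idiom
along these lines (a sketch, not any particular project's macro):

    #define my_offsetof(TYPE, MEMBER) \
      ((__SIZE_TYPE__) &((TYPE *) 0)->MEMBER)

i.e. a cast of a NULL-based field address back to an integer, which is
why casts of such addresses to integers need to keep folding to the
INTEGER_CST offset.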

Yet another alternative would be to apply just the params.opt and
pointer-query.cc (compute_objsize_r) hunks, which stop the warnings
altogether once we are more than 4KB away from a NULL pointer, and to do
the full set of changes only in GCC 13 (I've bootstrapped/regtested such
a patch last night on x86_64-linux and i686-linux).

So far tested with make check-gcc/check-g++ on x86_64-linux; it was also
tested with the parameter's default changed from 4096 to 16.
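
For reference, the threshold can be lowered (or raised) with the usual
--param syntax; e.g. a hypothetical invocation matching the 16-byte
testing above:

    gcc -O2 -Warray-bounds --param min-pagesize=16 test.c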

Ok for trunk if it passes full bootstrap/regtest?

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/99578
	PR middle-end/100680
	PR tree-optimization/100834
	* params.opt (--param=min-pagesize=): New parameter.
	* gimple-expr.cc (is_gimple_invariant_address): Allow ADDR_EXPR of
	MEM_REF with integer_zerop as first operand as invariant.
	(is_gimple_ip_invariant_address): Likewise.
	* match.pd (&MEM[0 + off] == off): New simplification.
	* gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset):
	Ignore MEM_REFs with integer_zerop as address.
	* gimple-fold.cc (maybe_canonicalize_mem_ref_addr): Don't fold
	&MEM[0 + off] into off if off is larger than min-pagesize.
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Return
	false for MEM_REFs with integer_zerop as address.
	* pointer-query.cc (handle_mem_ref): Handle MEM_REFs with integer_zerop
	as address specially.
	(compute_objsize_r) <case ARRAY_REF>: Formatting fix.
	(compute_objsize_r) <case INTEGER_CST>: Use maximum object size instead
	of zero for pointer constants equal or larger than min-pagesize.
	* fold-const.cc (build_fold_addr_expr_with_type_loc): Don't fold
	MEM_REFs with zero address and offset equal or larger than
	min-pagesize into POINTER_PLUS_EXPRs.

	* gcc.dg/tree-ssa/pr99578-1.c: New test.
	* gcc.dg/pr99578-1.c: New test.
	* gcc.dg/pr99578-2.c: New test.
	* gcc.dg/pr99578-3.c: New test.
	* gcc.dg/pr100680.c: New test.
	* gcc.dg/pr100834.c: New test.



	Jakub
  

Comments

Richard Biener March 18, 2022, 5:38 p.m. UTC | #1
> On 18.03.2022 at 18:01, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> [...]
> 
> Yet another alternative would be to apply just the params.opt and
> pointer-query.cc (compute_objsize_r) hunks, which stop the warnings
> altogether once we are more than 4KB away from a NULL pointer, and to do
> the full set of changes only in GCC 13 (I've bootstrapped/regtested such
> a patch last night on x86_64-linux and i686-linux).

I think I’d prefer this simpler approach at this point.  Thus that one is OK.

Richard.

  
Martin Liška March 21, 2022, 9:48 a.m. UTC | #2
Hi.

I'm installing the following patch that documents the newly introduced parameter.

Martin
  
Martin Liška March 30, 2022, 8:04 a.m. UTC | #3
Hello.

I noticed that the change that was backported to gcc-11 breaks LTO
bytecode compatibility due to the addition of the new parameter:

./lto1 -quiet -dumpbase ./a.wpa -mtune=generic -march=x86-64 -O2 -version -fno-openmp -fno-openacc -fno-pie -fcf-protection=none -fltrans-output-list=./a.ltrans.out -fwpa -fresolution=a.res -flinker-output=exec @./a.wpa.args.0
GNU GIMPLE (GCC) version 11.2.1 20220329 (x86_64-pc-linux-gnu)
	compiled by GNU C version 11.2.1 20220316 [revision 6a1150d1524aeda3381b2171712e1a6611d441d6], GMP version 6.2.1, MPFR version 4.1.0-p7, MPC version 1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU GIMPLE (GCC) version 11.2.1 20220329 (x86_64-pc-linux-gnu)
	compiled by GNU C version 11.2.1 20220316 [revision 6a1150d1524aeda3381b2171712e1a6611d441d6], GMP version 6.2.1, MPFR version 4.1.0-p7, MPC version 1.2.1, isl version isl-0.24-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
lto1: internal compiler error: Segmentation fault
0x10c6e9e crash_signal
	/home/marxin/Programming/gcc2/gcc/toplev.c:327
0x7ffff78b93cf ???
	/usr/src/debug/glibc-2.35-2.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xb2664b bp_unpack_string(data_in*, bitpack_d*)
	/home/marxin/Programming/gcc2/gcc/data-streamer-in.c:112
0xf68898 cl_optimization_stream_in(data_in*, bitpack_d*, cl_optimization*)
	/dev/shm/objdir2/gcc/options-save.c:12787
0x1392f33 streamer_read_tree_bitfields(lto_input_block*, data_in*, tree_node*)
	/home/marxin/Programming/gcc2/gcc/tree-streamer-in.c:552
0xeb7456 lto_read_tree_1
...

I'm going to bump the LTO minor version.

Cheers,
Martin
  

Patch

--- gcc/params.opt.jj	2022-03-17 18:48:06.723406664 +0100
+++ gcc/params.opt	2022-03-18 11:59:31.518452714 +0100
@@ -613,6 +613,10 @@  The maximum number of insns in loop head
 Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) Param Optimization
 The maximum number of backtrack attempts the scheduler should make when modulo scheduling a loop.
 
+-param=min-pagesize=
+Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization
+Minimum page size for warning purposes.
+
 -param=max-partial-antic-length=
 Common Joined UInteger Var(param_max_partial_antic_length) Init(100) Param Optimization
 Maximum length of partial antic set when performing tree pre optimization.
--- gcc/gimple-expr.cc.jj	2022-03-17 18:48:06.691407102 +0100
+++ gcc/gimple-expr.cc	2022-03-18 11:59:31.484453184 +0100
@@ -690,9 +690,13 @@  is_gimple_invariant_address (const_tree
   if (TREE_CODE (op) == MEM_REF)
     {
       const_tree op0 = TREE_OPERAND (op, 0);
-      return (TREE_CODE (op0) == ADDR_EXPR
-	      && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
-		  || decl_address_invariant_p (TREE_OPERAND (op0, 0))));
+      return ((TREE_CODE (op0) == ADDR_EXPR
+	       && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
+		   || decl_address_invariant_p (TREE_OPERAND (op0, 0))))
+	      /* Allow &MEM <int[4]> [(void *)0B + 8192B] as representation
+		 of &NULL->field_at_offset_8192 to differentiate for
+		 warning purposes those cases from (type *)0x2000.  */
+	      || integer_zerop (op0));
     }
 
   return CONSTANT_CLASS_P (op) || decl_address_invariant_p (op);
@@ -716,9 +720,10 @@  is_gimple_ip_invariant_address (const_tr
   if (TREE_CODE (op) == MEM_REF)
     {
       const_tree op0 = TREE_OPERAND (op, 0);
-      return (TREE_CODE (op0) == ADDR_EXPR
-	      && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
-		  || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0))));
+      return ((TREE_CODE (op0) == ADDR_EXPR
+	       && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
+		   || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0))))
+	      || integer_zerop (op0));
     }
 
   return CONSTANT_CLASS_P (op) || decl_address_ip_invariant_p (op);
--- gcc/match.pd.jj	2022-03-16 10:56:18.995730484 +0100
+++ gcc/match.pd	2022-03-18 14:53:55.726876501 +0100
@@ -5606,6 +5606,25 @@  (define_operator_list SYNC_FETCH_AND_AND
       (if (cmp == NE_EXPR)
        { constant_boolean_node (true, type); })))))))
 
+/* Simplify pointer equality compares of &MEM_REF[0 + offset] and integer.  */
+(for neeq (ne eq)
+ (simplify
+  (neeq (convert? addr@0) INTEGER_CST@1)
+   (if (TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (TREE_TYPE (@0))
+	&& TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (sizetype))
+    (with { poly_int64 off;
+	    HOST_WIDE_INT coff;
+	    tree base = get_addr_base_and_unit_offset (TREE_OPERAND (@0, 0),
+						       &off); }
+     (if (base
+	  && TREE_CODE (base) == MEM_REF
+	  && integer_zerop (TREE_OPERAND (base, 0))
+	  && TREE_CODE (TREE_OPERAND (base, 1)) == INTEGER_CST
+	  && off.is_constant (&coff))
+      { constant_boolean_node ((wi::to_wide (TREE_OPERAND (base, 1)) + coff
+				== wi::to_wide (@1))
+			       ^ (neeq != EQ_EXPR), type); })))))
+
 /* Simplify pointer equality compares using PTA.  */
 (for neeq (ne eq)
  (simplify
--- gcc/gimple-ssa-warn-restrict.cc.jj	2022-02-04 14:36:55.000000000 +0100
+++ gcc/gimple-ssa-warn-restrict.cc	2022-03-18 13:42:05.590650262 +0100
@@ -521,7 +521,7 @@  builtin_memref::set_base_and_offset (tre
 	offrange[1] += maxobjsize;
     }
 
-  if (TREE_CODE (base) == MEM_REF)
+  if (TREE_CODE (base) == MEM_REF && !integer_zerop (TREE_OPERAND (base, 0)))
     {
       tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 1));
       extend_offset_range (memrefoff);
--- gcc/gimple-fold.cc.jj	2022-03-17 18:48:06.709406855 +0100
+++ gcc/gimple-fold.cc	2022-03-18 11:59:31.500452963 +0100
@@ -6018,7 +6018,25 @@  maybe_canonicalize_mem_ref_addr (tree *t
 	      if (wi::to_poly_wide (TREE_OPERAND (base, 0)).to_shwi (&moffset))
 		{
 		  coffset += moffset;
-		  *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset);
+		  /* Don't fold &NULL->field into INTEGER_CSTs, so that we
+		     for warning purposes keep the distinction between such
+		     invalid cases and valid (type *)0x7e124000.  */
+		  if (integer_zerop (TREE_OPERAND (*t, 0))
+		      && maybe_ge (coffset, param_min_pagesize))
+		    {
+		      if (t == &TREE_OPERAND (*orig_t, 0))
+			return res;
+		      location_t loc = EXPR_LOCATION (*orig_t);
+		      tree ptype = TREE_TYPE (*orig_t);
+		      tree tem = build2_loc (loc, MEM_REF, TREE_TYPE (ptype),
+					     TREE_OPERAND (*t, 0),
+					     build_int_cst (ptr_type_node,
+							    coffset));
+		      *orig_t
+			= build_fold_addr_expr_with_type_loc (loc, tem, ptype);
+		    }
+		  else
+		    *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset);
 		  return true;
 		}
 	    }
--- gcc/gimple-array-bounds.cc.jj	2022-03-17 18:48:06.691407102 +0100
+++ gcc/gimple-array-bounds.cc	2022-03-18 11:59:31.509452838 +0100
@@ -393,7 +393,8 @@  bool
 array_bounds_checker::check_mem_ref (location_t location, tree ref,
 				     bool ignore_off_by_one)
 {
-  if (warning_suppressed_p (ref, OPT_Warray_bounds))
+  if (warning_suppressed_p (ref, OPT_Warray_bounds)
+      || integer_zerop (TREE_OPERAND (ref, 0)))
     return false;
 
   /* The statement used to allocate the array or null.  */
--- gcc/pointer-query.cc.jj	2022-03-17 18:48:06.724406650 +0100
+++ gcc/pointer-query.cc	2022-03-18 16:14:38.184691128 +0100
@@ -1968,6 +1968,17 @@  handle_mem_ref (tree mref, gimple *stmt,
     }
 
   tree mrefop = TREE_OPERAND (mref, 0);
+  if (integer_zerop (mrefop))
+    {
+      addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (mref));
+      if (targetm.addr_space.zero_address_valid (as))
+	pref->set_max_size_range ();
+      else
+	pref->sizrng[0] = pref->sizrng[1] = 0;
+      pref->ref = null_pointer_node;
+      return true;
+    }
+
   if (!compute_objsize_r (mrefop, stmt, false, ostype, pref, snlim, qry))
     return false;
 
@@ -2243,7 +2254,7 @@  compute_objsize_r (tree ptr, gimple *stm
       }
 
     case ARRAY_REF:
-	return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
+      return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
 
     case COMPONENT_REF:
       return handle_component_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
@@ -2264,12 +2275,14 @@  compute_objsize_r (tree ptr, gimple *stm
       }
 
     case INTEGER_CST:
-      /* Pointer constants other than null are most likely the result
-	 of erroneous null pointer addition/subtraction.  Unless zero
-	 is a valid address set size to zero.  For null pointers, set
-	 size to the maximum for now since those may be the result of
-	 jump threading.  */
-      if (integer_zerop (ptr))
+      /* Pointer constants other than null smaller than param_min_pagesize
+	 might be the result of erroneous null pointer addition/subtraction.
+	 Unless zero is a valid address set size to zero.  For null pointers,
+	 set size to the maximum for now since those may be the result of
+	 jump threading.  Similarly, for values >= param_min_pagesize in
+	 order to support (type *) 0x7cdeab00.  */
+      if (integer_zerop (ptr)
+	  || wi::to_widest (ptr) >= param_min_pagesize)
 	pref->set_max_size_range ();
       else if (POINTER_TYPE_P (TREE_TYPE (ptr)))
 	{
--- gcc/fold-const.cc.jj	2022-03-17 18:48:06.675407320 +0100
+++ gcc/fold-const.cc	2022-03-18 11:59:31.541452395 +0100
@@ -9206,7 +9206,12 @@  build_fold_addr_expr_with_type_loc (loca
 	t = fold_convert_loc (loc, ptrtype, t);
     }
   else if (TREE_CODE (t) == MEM_REF
-	   && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
+	   && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST
+	   /* Don't fold &MEM <int[4]> [(void *)0B + 8192B]
+	      as a representation of &((type *)NULL)->field_at_offset_8192.  */
+	   && (!integer_zerop (TREE_OPERAND (t, 0))
+	       || TREE_CODE (TREE_OPERAND (t, 1)) != INTEGER_CST
+	       || wi::to_widest (TREE_OPERAND (t, 1)) < param_min_pagesize))
     return fold_binary (POINTER_PLUS_EXPR, ptrtype,
 			TREE_OPERAND (t, 0),
 			convert_to_ptrofftype (TREE_OPERAND (t, 1)));
--- gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c.jj	2022-03-18 15:12:47.470156316 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c	2022-03-18 15:18:30.249401754 +0100
@@ -0,0 +1,22 @@ 
+/* PR middle-end/99578 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "&MEM" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "PHI <-?1\\\(\[0-9\]+\\\), -?1\\\(\[0-9\]+\\\)>" 2 "optimized" } } */
+
+struct S { int a, b[4]; };
+struct T { int a, b[8192], c[4]; };
+
+int
+foo (struct S *p)
+{
+  if (p) return -1;
+  return p->b == (void *)4;
+}
+
+int
+bar (struct T *p)
+{
+  if (p) return -1;
+  return p->c == (void *)32772;
+}
--- gcc/testsuite/gcc.dg/pr99578-1.c.jj	2022-03-18 15:04:22.825163042 +0100
+++ gcc/testsuite/gcc.dg/pr99578-1.c	2022-03-18 15:09:26.693941208 +0100
@@ -0,0 +1,26 @@ 
+/* PR middle-end/99578 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-O2 -Warray-bounds" } */
+
+struct S { int a, b[4]; };
+struct T { int a, b[8192], c[4]; };
+
+void
+foo (struct S *p)
+{
+  if (p) return;
+  __builtin_memset (p->b, 0, sizeof p->b);	/* { dg-warning "offset \\\[0, 15\\\] is out of the bounds \\\[0, 0\\\]" } */
+}
+
+void
+bar (struct T *p)
+{
+  if (p) return;
+  __builtin_memset (p->c, 0, sizeof p->c);	/* { dg-warning "offset \\\[0, 15\\\] is out of the bounds \\\[0, 0\\\]" } */
+}
+
+void
+baz (void)
+{
+  __builtin_memset ((void *) 0x8004, 0, 16);	/* { dg-bogus "is out of the bounds" } */
+}
--- gcc/testsuite/gcc.dg/pr99578-2.c.jj	2022-03-18 15:09:58.339502263 +0100
+++ gcc/testsuite/gcc.dg/pr99578-2.c	2022-03-18 15:10:33.308017231 +0100
@@ -0,0 +1,26 @@ 
+/* PR middle-end/99578 */
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-O2 -Wstringop-overflow" } */
+
+struct S { int a, b[4]; };
+struct T { int a, b[8192], c[4]; };
+
+void
+foo (struct S *p)
+{
+  if (p) return;
+  __builtin_memset (p->b, 0, sizeof p->b);	/* { dg-warning "writing 16 bytes into a region of size 0 overflows the destination" } */
+}
+
+void
+bar (struct T *p)
+{
+  if (p) return;
+  __builtin_memset (p->c, 0, sizeof p->c);	/* { dg-warning "writing 16 bytes into a region of size 0 overflows the destination" } */
+}
+
+void
+baz (void)
+{
+  __builtin_memset ((void *) 0x8004, 0, 16);	/* { dg-bogus "overflows the destination" } */
+}
--- gcc/testsuite/gcc.dg/pr99578-3.c.jj	2022-03-18 16:43:56.303435806 +0100
+++ gcc/testsuite/gcc.dg/pr99578-3.c	2022-03-18 16:43:51.379503346 +0100
@@ -0,0 +1,13 @@ 
+/* PR middle-end/99578 */
+/* { dg-do compile { target size32plus } } */
+/* { dg-options "-O2 -Wstringop-overread" } */
+
+struct S { unsigned int s; };
+extern struct S v;
+extern void *memcpy (void *, const void *, __SIZE_TYPE__);
+
+void
+foo (void)
+{
+  memcpy (&v, (void *)(0xe8ffc000), sizeof (struct S));	/* { dg-bogus "from a region of size 0" } */
+}
--- gcc/testsuite/gcc.dg/pr100680.c.jj	2022-03-18 16:48:39.136556295 +0100
+++ gcc/testsuite/gcc.dg/pr100680.c	2022-03-18 16:55:37.512876159 +0100
@@ -0,0 +1,31 @@ 
+/* PR middle-end/100680 */
+/* { dg-do compile { target size32plus } } */
+/* { dg-options "-O2 -Wstringop-overread" } */
+
+struct s {
+  char a[8];
+  int i;
+  long l;
+};
+
+extern char ea[8];
+static char sa[8] = { 1, 2, 3, 4 };
+
+int
+test (void)
+{
+  const struct s *ps = (const struct s *) 0x12345678L;
+  if (__builtin_memcmp (ps->a, ps->a, 8))
+    return 0;
+
+  if (__builtin_memcmp (ps->a, ea, 8))		/* { dg-bogus "exceeds source size 0" } */
+    return 0;
+
+  if (__builtin_memcmp (ps->a, sa, 8))		/* { dg-bogus "exceeds source size 0" } */
+    return 0;
+
+  if (__builtin_memcmp (ps->a, "abcdABCD", 8))	/* { dg-bogus "exceeds source size 0" } */
+    return 0;
+
+  return 1;
+}
--- gcc/testsuite/gcc.dg/pr100834.c.jj	2022-03-18 17:23:13.802552517 +0100
+++ gcc/testsuite/gcc.dg/pr100834.c	2022-03-18 17:24:21.985630103 +0100
@@ -0,0 +1,42 @@ 
+/* PR tree-optimization/100834 */
+/* { dg-do compile { target size32plus } } */
+/* { dg-options "-O2 -Wall" } */
+
+#define PAGE_SIZE 	4096
+#define STACK_SIZE 	PAGE_SIZE
+
+union registers
+{
+  struct
+  {
+    unsigned long r15, r14, r13, r12, r11, r10, r9, r8;
+    unsigned long rdi, rsi, rbp, unused, rbx, rdx, rcx, rax;
+  };
+  unsigned long by_index[16];
+};
+
+struct per_cpu
+{
+  union
+  {
+    unsigned char stack[STACK_SIZE];
+    struct
+    {
+      unsigned char __fill[STACK_SIZE - sizeof (union registers)];
+      union registers guest_regs;
+    };
+  };
+} __attribute__((aligned (PAGE_SIZE)));
+
+static inline struct per_cpu *
+this_cpu_data (void)
+{
+  return (struct per_cpu *) 0xdeadbeef;
+}
+
+void
+foo (void)
+{
+  struct per_cpu *cpu_data = this_cpu_data ();
+  __builtin_memset (&cpu_data->guest_regs, 0, sizeof (cpu_data->guest_regs));	/* { dg-bogus "is out of the bounds" } */
+}