Basic kill analysis for modref

Message ID 20211111125851.GC17431@kam.mff.cuni.cz

Commit Message

Jan Hubicka Nov. 11, 2021, 12:58 p.m. UTC
  Hi,
This patch enables optimization of stores that are killed by calls.
The modref summary is extended by an array containing the list of access
ranges, relative to function parameters, that are known to be killed by the
function.  This array is collected during local analysis and optimized (so
separate stores are glued together).
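
For example (a variant of the new modref-dse-2.c testcase below; the
comments sketch the intended kill summary):

__attribute__ ((noinline))
void write (int *a)
{
  *a = 1;     /* kill of parm 0, bytes 0..3 ...                   */
  a[1] = 2;   /* ... glued with the previous one into bytes 0..7  */
}

int test ()
{
  int a[2];
  a[0] = 2;   /* dead store: write always overwrites it  */
  write (a);
  return a[0];
}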

Kill analysis in ipa-modref.c is quite simplistic.  In particular no WPA
propagation is done and we also take a very simple approach to proving that
a given store is executed on each invocation of the function.  I simply
require it to be in the first basic block and before anything that can
throw externally.  I have fancier code for that, but with this patch I
primarily want to discuss the interface to tree-ssa-alias.c.  I wonder if
there are some helpers I can re-use?
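
That is, with this simple scheme only the first store below is recorded
as a kill:

extern void g (void);

void
f (int *p, int *q)
{
  *p = 0;   /* first BB, before anything that can throw: a kill     */
  g ();     /* can throw externally ...                             */
  *q = 0;   /* ... so this store is not considered always executed  */
}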

From the GCC LTO link I get 814 functions with a non-empty kill vector.

Modref stats:
  modref kill: 39 kills, 7162 queries
  modref use: 25169 disambiguations, 697722 queries
  modref clobber: 2290122 disambiguations, 22750147 queries
  5240008 tbaa queries (0.230329 per modref query)
  806190 base compares (0.035437 per modref query)

(note that more kills happen at early optimization, where we have not
inlined that much yet).

For tramp3d (non-lto -O3 build):

Modref stats:
  modref kill: 45 kills, 630 queries
  modref use: 750 disambiguations, 10061 queries
  modref clobber: 35253 disambiguations, 543262 queries
  85347 tbaa queries (0.157101 per modref query)
  18727 base compares (0.034471 per modref query)

So it is not that high, but it gets better after improving the analysis
side, with -Os and/or PGO (where we offline cdtors), and after wiring in
same_addr_size_stores_p, which I want to discuss incrementally.

But at least there are not that many queries to slow down compile times
noticeably :)

Honza

gcc/ChangeLog:

	* ipa-modref-tree.h (struct modref_access_node): New member
	functions merge_for_kills and update_for_kills.
	* ipa-modref.c (modref_summary::useful_p): Kills are not useful when
	we can not analyze loads.
	(struct modref_summary_lto): Add kills.
	(modref_summary::dump): Dump kills.
	(record_access): Take access node as parameter.
	(record_access_lto): Likewise.
	(add_kill): New function.
	(merge_call_side_effects): Merge kills.
	(analyze_call): Pass around always_executed.
	(struct summary_ptrs): Add always_executed flag.
	(analyze_load): Update.
	(analyze_store): Handle kills.
	(analyze_stmt): Pass around always_executed flag; handle kills from
	clobbers.
	(analyze_function): Compute always_executed.
	(modref_summaries::duplicate): Copy kills.
	(update_signature): Release kills.
	* ipa-modref.h (struct modref_summary): Add kills.
	* tree-ssa-alias.c (dump_alias_stats): Dump kills.
	(stmt_kills_ref_p): Handle modref kills.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/modref-dse-2.c: New test.
  

Comments

Richard Biener Nov. 12, 2021, 10:38 a.m. UTC | #1
On Thu, 11 Nov 2021, Jan Hubicka wrote:

> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 17ff6bb582c..6f8caa331a6 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -120,6 +120,8 @@ static struct {
>    unsigned HOST_WIDE_INT modref_use_no_alias;
>    unsigned HOST_WIDE_INT modref_clobber_may_alias;
>    unsigned HOST_WIDE_INT modref_clobber_no_alias;
> +  unsigned HOST_WIDE_INT modref_kill_no;
> +  unsigned HOST_WIDE_INT modref_kill_yes;
>    unsigned HOST_WIDE_INT modref_tests;
>    unsigned HOST_WIDE_INT modref_baseptr_tests;
>  } alias_stats;
> @@ -169,6 +171,12 @@ dump_alias_stats (FILE *s)
>  	   + alias_stats.aliasing_component_refs_p_may_alias);
>    dump_alias_stats_in_alias_c (s);
>    fprintf (s, "\nModref stats:\n");
> +  fprintf (s, "  modref kill: "
> +	   HOST_WIDE_INT_PRINT_DEC" kills, "
> +	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> +	   alias_stats.modref_kill_yes,
> +	   alias_stats.modref_kill_yes
> +	   + alias_stats.modref_kill_no);
>    fprintf (s, "  modref use: "
>  	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
>  	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> @@ -3373,6 +3381,107 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
>    if (is_gimple_call (stmt))
>      {
>        tree callee = gimple_call_fndecl (stmt);
> +      struct cgraph_node *node;
> +      modref_summary *summary;
> +
> +      /* Try to disambiguate using modref summary.  Modref records a vector
> +	 of stores with known offsets relative to function parameters that must
> +	 happen on every execution of the function.  Find if we have a matching
> +	 store and verify that the function can not use the value.  */
> +      if (callee != NULL_TREE
> +	  && (node = cgraph_node::get (callee)) != NULL
> +	  && node->binds_to_current_def_p ()
> +	  && (summary = get_modref_function_summary (node)) != NULL

I wonder why we bother producing summaries for things that do not
bind locally?  The summary->kills.length () has an upper bound?

> +	  && summary->kills.length ())
> +	{
> +	  tree base = ao_ref_base (ref);
> +	  for (unsigned int i = 0; i < summary->kills.length (); i++)
> +	    {
> +	      modref_access_node &a = summary->kills[i];
> +	      tree op;
> +	      poly_offset_int off1_adj = 0, off2_adj = 0;
> +	      poly_int64 off1, off2;
> +	      tree base_ptr = NULL;
> +	      tree base_decl = NULL;
> +
> +	      if (a.parm_index >= 0)
> +		op = gimple_call_arg (stmt, a.parm_index);
> +	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
> +		op = gimple_call_chain (stmt);
> +	      else
> +		gcc_unreachable ();

I wonder if we can abstract this to a modref_access_node method?

> +
> +	      off2_adj += a.parm_offset * BITS_PER_UNIT;

wasn't there a parm_offset unknown? ...

> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +	      if (TREE_CODE (base) == MEM_REF)
> +		{
> +		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
> +		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> +		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);

'base' will be the decl in this case, apart from when the constant
offset doesn't fit ao_ref.offset, so I think you can spare this
special-case and give up on non-SSA base_ptr

> +		  else
> +		    base_ptr = TREE_OPERAND (base, 0);
> +		}
> +	      /* Give up on TMRs for now.  */
> +	      else if (TREE_CODE (base) == TARGET_MEM_REF)
> +		break;
> +	      else
> +		base_decl = base;
> +
> +	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
> +	      gcc_checking_assert (!base_ptr
> +				   || TREE_CODE (base_ptr) == SSA_NAME);
> +
> +	      /* OP is a pointer and we have access range from its
> +		 dereference.  */
> +	      if (TREE_CODE (op) == ADDR_EXPR)
> +		{
> +		  poly_int64 size, offset, max_size;
> +		  bool reverse;
> +		  tree op_base = get_ref_base_and_extent
> +			  (TREE_OPERAND (op, 0), &offset, &size,
> +			   &max_size, &reverse);

I think you want get_addr_base_and_unit_offset here.  All
variable indexed addresses are in separate stmts.  That also means
you can eventually work with just byte sizes/offsets?

> +		  if (!known_size_p (size) || !known_eq (size, max_size))
> +		    continue;
> +		  off2_adj += offset;
> +		  /* &str->foo are not passed as gimple operands directly,
> +		     would need to look up the def stmt.  */
> +		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
> +		  if (!base_decl
> +		      || compare_base_decls (op_base, base_decl) != 1)
> +		    continue;
> +		}
> +	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
> +		continue;
> +
> +	      if (!(off1_adj + ref->offset).to_shwi (&off1))
> +		continue;
> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +
> +	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
> +		  && dbg_cnt (ipa_mod_ref))
> +		{
> +		  /* For store to be killed it needs to not be used earlier.  */
> +		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
> +						  true))

Hmm, so modref says p->x is killed when we have

foo (struct X *p)
{
  int tem = p->x;
  p->x = 0;
  return tem;
}

?  Or even

foo (struct X *p)
{
  bar ();
  p->x = 0;
}

?

Otherwise it looks sensible.

Richard.

> +		    break;
> +		  if (dump_file && (dump_flags & TDF_DETAILS))
> +		    {
> +		      fprintf (dump_file,
> +			       "ipa-modref: call stmt ");
> +		      print_gimple_stmt (dump_file, stmt, 0);
> +		      fprintf (dump_file,
> +			       "ipa-modref: call to %s kills ",
> +			       node->dump_name ());
> +		      print_generic_expr (dump_file, ref->base);
> +		    }
> +		  ++alias_stats.modref_kill_yes;
> +		  return true;
> +		}
> +	    }
> +	  ++alias_stats.modref_kill_no;
> +	}
>        if (callee != NULL_TREE
>  	  && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
>  	switch (DECL_FUNCTION_CODE (callee))
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index 3e213b23d79..dc3b910801d 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -287,6 +287,55 @@ struct GTY(()) modref_access_node
>  	      size, max_size, record_adjustments);
>        return true;
>      }
> +  /* Merge in access A if it is possible to do without losing
> +     precision.  Return true if successful.
> +     Unlike merge, assume that both accesses are always executed
> +     and merge size the same way as max_size.  */
> +  bool merge_for_kills (const modref_access_node &a)
> +    {
> +      poly_int64 offset1 = 0;
> +      poly_int64 aoffset1 = 0;
> +      poly_int64 new_parm_offset = 0;
> +
> +      /* We assume that containment was tested earlier.  */
> +      gcc_checking_assert (!contains (a) && !a.contains (*this));
> +      /* For kills we always need to know the parameter.  */
> +      gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
> +			   && a.parm_index != MODREF_UNKNOWN_PARM);
> +      if (parm_index != a.parm_index)
> +	return false;
> +      /* Unknown offsets are useless.  */
> +      gcc_checking_assert (parm_offset_known && a.parm_offset_known);
> +      if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
> +	return false;
> +      /* Ranges must be known and sizes matching max sizes.  */
> +      gcc_checking_assert (range_info_useful_p ()
> +			  && a.range_info_useful_p ()
> +			  && known_size_p (size) && known_eq (size, max_size)
> +			  && known_size_p (a.size)
> +			  && known_eq (a.size, a.max_size));
> +      if (known_le (offset1, aoffset1))
> +	{
> +	  if (!known_size_p (max_size)
> +	      || known_ge (offset1 + max_size, aoffset1))
> +	    {
> +	      update_for_kills (new_parm_offset, offset1, max_size,
> +				aoffset1, a.max_size);
> +	      return true;
> +	    }
> +	}
> +      else if (known_le (aoffset1, offset1))
> +	{
> +	  if (!known_size_p (a.max_size)
> +	      || known_ge (aoffset1 + a.max_size, offset1))
> +	    {
> +	      update_for_kills (new_parm_offset, offset1, max_size,
> +				aoffset1, a.max_size);
> +	      return true;
> +	    }
> +	}
> +      return false;
> +    }
>    /* Return true if A1 and B1 can be merged with lower information
>       loss than A2 and B2.
>       Assume that no containment or lossless merging is possible.  */
> @@ -430,6 +479,33 @@ private:
>        update (parm_offset1, offset1,
>  	      new_size, new_max_size, record_adjustments);
>      }
> +  /* Merge two ranges both starting at parm_offset1 and update THIS
> +     with result.  */
> +  void update_for_kills (poly_int64 parm_offset1,
> +			 poly_int64 offset1,
> +			 poly_int64 max_size1,
> +			 poly_int64 offset2,
> +			 poly_int64 max_size2)
> +    {
> +      if (known_le (offset1, offset2))
> +	;
> +      else if (known_le (offset2, offset1))
> +	{
> +	  std::swap (offset1, offset2);
> +	  std::swap (max_size1, max_size2);
> +	}
> +      else
> +	gcc_unreachable ();
> +
> +      poly_int64 new_max_size = max_size2 + offset2 - offset1;
> +      if (known_le (new_max_size, max_size1))
> +	new_max_size = max_size1;
> +
> +      parm_offset = parm_offset1;
> +      offset = offset1;
> +      max_size = new_max_size;
> +      size = new_max_size;
> +    }
>    /* Given access nodes THIS and A, return true if they
>       can be done with common parm_offsets.  In this case
>       return parm offset in new_parm_offset, new_offset
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index f8b7b900527..d9859daa18e 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -334,6 +334,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>      return false;
>    if (loads && !loads->every_base)
>      return true;
> +  else
> +    kills.release ();
>    if (ecf_flags & ECF_PURE)
>      return false;
>    return stores && !stores->every_base;
> @@ -370,6 +372,7 @@ struct GTY(()) modref_summary_lto
>       more verbose and thus more likely to hit the limits.  */
>    modref_records_lto *loads;
>    modref_records_lto *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> @@ -617,6 +620,12 @@ modref_summary::dump (FILE *out)
>        fprintf (out, "  stores:\n");
>        dump_records (stores, out);
>      }
> +  if (kills.length ())
> +    {
> +      fprintf (out, "  kills:\n");
> +      for (unsigned int i = 0; i < kills.length (); i++)
> +	dump_access (&kills[i], out);
> +    }
>    if (writes_errno)
>      fprintf (out, "  Writes errno\n");
>    if (side_effects)
> @@ -750,13 +759,12 @@ get_access (ao_ref *ref)
>  /* Record access into the modref_records data structure.  */
>  
>  static void
> -record_access (modref_records *tt, ao_ref *ref)
> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>  {
>    alias_set_type base_set = !flag_strict_aliasing ? 0
>  			    : ao_ref_base_alias_set (ref);
>    alias_set_type ref_set = !flag_strict_aliasing ? 0
>  			    : (ao_ref_alias_set (ref));
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>         fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
> @@ -769,7 +777,7 @@ record_access (modref_records *tt, ao_ref *ref)
>  /* IPA version of record_access_tree.  */
>  
>  static void
> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
>  {
>    /* get_alias_set sometimes use different type to compute the alias set
>       than TREE_TYPE (base).  Do same adjustments.  */
> @@ -816,7 +824,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
>  		       || variably_modified_type_p (ref_type, NULL_TREE)))
>  	ref_type = NULL_TREE;
>      }
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>        fprintf (dump_file, "   - Recording base type:");
> @@ -910,6 +917,82 @@ parm_map_for_arg (tree op)
>    return parm_map;
>  }
>  
> +/* Insert new kill A into KILLS.  */
> +
> +static void
> +add_kill (vec<modref_access_node> &kills, modref_access_node &a)
> +{
> +  size_t index;
> +  modref_access_node *a2;
> +  bool merge = false;
> +
> +  /* See if we have corresponding entry already or we can merge with
> +     neighbouring entry.  */
> +  FOR_EACH_VEC_ELT (kills, index, a2)
> +    {
> +      if (a2->contains (a))
> +	return;
> +      if (a.contains (*a2))
> +	{
> +	  *a2 = a;
> +	  merge = true;
> +	  break;
> +	}
> +      if (a2->merge_for_kills (a))
> +	{
> +	  merge = true;
> +	  break;
> +	}
> +    }
> +  /* If entry was not found, insert it.  */
> +  if (!merge)
> +    {
> +      if ((int)kills.length () >= param_modref_max_accesses)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file,
> +		     "--param param=modref-max-accesses limit reached:");
> +	  return;
> +	}
> +      kills.safe_push (a);
> +      return;
> +    }
> +  /* Extending range in an entry may make it possible to merge it with
> +     other entries.  */
> +    {
> +      size_t i;
> +
> +      for (i = 0; i < kills.length ();)
> +	if (i != index)
> +	  {
> +	    bool found = false, restart = false;
> +	    modref_access_node *a = &kills[i];
> +	    modref_access_node *n = &kills[index];
> +
> +	    if (n->contains (*a))
> +	      found = true;
> +	    if (!found && n->merge_for_kills (*a))
> +	      found = restart = true;
> +	    gcc_checking_assert (found || !a->merge_for_kills (*n));
> +	    if (found)
> +	      {
> +		kills.unordered_remove (i);
> +		if (index == kills.length ())
> +		  {
> +		    index = i;
> +		    i++;
> +		  }
> +		if (restart)
> +		  i = 0;
> +	      }
> +	    else
> +	      i++;
> +	  }
> +	else
> +	  i++;
> +    }
> +}
> +
>  /* Merge side effects of call STMT to function with CALLEE_SUMMARY
>     into CUR_SUMMARY.  Return true if something changed.
>     If IGNORE_STORES is true, do not merge stores.
> @@ -920,7 +1003,7 @@ bool
>  merge_call_side_effects (modref_summary *cur_summary,
>  			 gimple *stmt, modref_summary *callee_summary,
>  			 bool ignore_stores, cgraph_node *callee_node,
> -			 bool record_adjustments)
> +			 bool record_adjustments, bool always_executed)
>  {
>    auto_vec <modref_parm_map, 32> parm_map;
>    modref_parm_map chain_map;
> @@ -988,6 +1071,29 @@ merge_call_side_effects (modref_summary *cur_summary,
>  	  changed = true;
>  	}
>      }
> +  if (always_executed)
> +    {
> +      size_t i;
> +      modref_access_node *a2;
> +      FOR_EACH_VEC_ELT (callee_summary->kills, i, a2)
> +	{
> +	  modref_access_node n = *a2;
> +	  if (n.parm_index >= (int)parm_map.length ())
> +	    continue;
> +	  modref_parm_map &m
> +		  = n.parm_index == MODREF_STATIC_CHAIN_PARM
> +		    ? chain_map
> +		    : parm_map[n.parm_index];
> +	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
> +	      || m.parm_index == MODREF_UNKNOWN_PARM
> +	      || m.parm_index == MODREF_RETSLOT_PARM
> +	      || !m.parm_offset_known)
> +	    continue;
> +	  n.parm_index = m.parm_index;
> +	  n.parm_offset += m.parm_offset;
> +	  add_kill (cur_summary->kills, n);
> +	}
> +    }
>    if (!cur_summary->side_effects
>        && callee_summary->side_effects)
>      {
> @@ -1198,7 +1304,8 @@ process_fnspec (modref_summary *cur_summary,
>  
>  static bool
>  analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
> -	      gcall *stmt, vec <gimple *> *recursive_calls)
> +	      gcall *stmt, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* Check flags on the function call.  In certain cases, analysis can be
>       simplified.  */
> @@ -1284,7 +1391,7 @@ analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
>      }
>  
>    merge_call_side_effects (cur_summary, stmt, callee_summary, ignore_stores,
> -			   callee_node, false);
> +			   callee_node, false, always_executed);
>  
>    return true;
>  }
> @@ -1295,6 +1402,7 @@ struct summary_ptrs
>  {
>    struct modref_summary *nolto;
>    struct modref_summary_lto *lto;
> +  bool always_executed;
>  };
>  
>  /* Helper for analyze_stmt.  */
> @@ -1329,11 +1437,12 @@ analyze_load (gimple *, tree, tree op, void *data)
>  
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>  
>    if (summary)
> -    record_access (summary->loads, &r);
> +    record_access (summary->loads, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->loads, &r);
> +    record_access_lto (summary_lto->loads, &r, a);
>    return false;
>  }
>  
> @@ -1369,11 +1478,24 @@ analyze_store (gimple *, tree, tree op, void *data)
>  
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>  
>    if (summary)
> -    record_access (summary->stores, &r);
> +    record_access (summary->stores, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->stores, &r);
> +    record_access_lto (summary_lto->stores, &r, a);
> +  if (summary
> +      && ((summary_ptrs *)data)->always_executed
> +      && a.parm_offset_known == true
> +      && a.parm_index != MODREF_UNKNOWN_PARM
> +      && a.parm_index != MODREF_RETSLOT_PARM
> +      && known_size_p (r.size)
> +      && known_eq (r.max_size, r.size))
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, "   - Recording kill\n");
> +      add_kill (summary->kills, a);
> +    }
>    return false;
>  }
>  
> @@ -1382,16 +1504,36 @@ analyze_store (gimple *, tree, tree op, void *data)
>  
>  static bool
>  analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
> -	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls)
> +	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* In general we can not ignore clobbers because they are barriers for code
>       motion, however after inlining it is safe to do because local optimization
>       passes do not consider clobbers from other functions.
>       Similar logic is in ipa-pure-const.c.  */
>    if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
> -    return true;
> +    {
> +      if (summary
> +	  && always_executed && record_access_p (gimple_assign_lhs (stmt)))
> +	{
> +	  ao_ref r;
> +	  ao_ref_init (&r, gimple_assign_lhs (stmt));
> +	  modref_access_node a = get_access (&r);
> +	  if (a.parm_offset_known == true
> +	      && a.parm_index != MODREF_UNKNOWN_PARM
> +	      && a.parm_index != MODREF_RETSLOT_PARM
> +	      && known_size_p (r.size)
> +	      && known_eq (r.max_size, r.size))
> +	    {
> +	      if (dump_file)
> +		fprintf (dump_file, "   - Recording kill\n");
> +	      add_kill (summary->kills, a);
> +	    }
> +	}
> +      return true;
> +    }
>  
> -  struct summary_ptrs sums = {summary, summary_lto};
> +  struct summary_ptrs sums = {summary, summary_lto, always_executed};
>  
>    /* Analyze all loads and stores in STMT.  */
>    walk_stmt_load_store_ops (stmt, &sums,
> @@ -1420,7 +1562,8 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
>     case GIMPLE_CALL:
>       if (!ipa || gimple_call_internal_p (stmt))
>         return analyze_call (summary, summary_lto,
> -			    as_a <gcall *> (stmt), recursive_calls);
> +			    as_a <gcall *> (stmt), recursive_calls,
> +			    always_executed);
>       else
>        {
>  	attr_fnspec fnspec = gimple_call_fnspec (as_a <gcall *>(stmt));
> @@ -2729,11 +2872,15 @@ analyze_function (function *f, bool ipa)
>    FOR_EACH_BB_FN (bb, f)
>      {
>        gimple_stmt_iterator si;
> +      bool always_executed
> +	      = bb == single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->dest;
> +
>        for (si = gsi_start_nondebug_after_labels_bb (bb);
>  	   !gsi_end_p (si); gsi_next_nondebug (&si))
>  	{
>  	  if (!analyze_stmt (summary, summary_lto,
> -			     gsi_stmt (si), ipa, &recursive_calls)
> +			     gsi_stmt (si), ipa, &recursive_calls,
> +			     always_executed)
>  	      || ((!summary || !summary->useful_p (ecf_flags, false))
>  		  && (!summary_lto
>  		      || !summary_lto->useful_p (ecf_flags, false))))
> @@ -2742,6 +2889,9 @@ analyze_function (function *f, bool ipa)
>  	      collapse_stores (summary, summary_lto);
>  	      break;
>  	    }
> +	  if (always_executed
> +	      && stmt_can_throw_external (cfun, gsi_stmt (si)))
> +	    always_executed = false;
>  	}
>      }
>  
> @@ -2761,7 +2911,7 @@ analyze_function (function *f, bool ipa)
>  			   ignore_stores_p (current_function_decl,
>  					    gimple_call_flags
>  						 (recursive_calls[i])),
> -			   fnode, !first);
> +			   fnode, !first, false);
>  	      if (!summary->useful_p (ecf_flags, false))
>  		{
>  		  remove_summary (lto, nolto, ipa);
> @@ -2993,6 +3143,8 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *dst,
>  			 src_data->loads->max_refs,
>  			 src_data->loads->max_accesses);
>    dst_data->loads->copy_from (src_data->loads);
> +  dst_data->kills.reserve_exact (src_data->kills.length ());
> +  dst_data->kills.splice (src_data->kills);
>    dst_data->writes_errno = src_data->writes_errno;
>    dst_data->side_effects = src_data->side_effects;
>    if (src_data->arg_flags.length ())
> @@ -3640,6 +3792,8 @@ update_signature (struct cgraph_node *node)
>      {
>        r->loads->remap_params (&map);
>        r->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r->kills.release ();
>        if (r->arg_flags.length ())
>  	remap_arg_flags (r->arg_flags, info);
>      }
> @@ -3647,6 +3801,8 @@ update_signature (struct cgraph_node *node)
>      {
>        r_lto->loads->remap_params (&map);
>        r_lto->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r_lto->kills.release ();
>        if (r_lto->arg_flags.length ())
>  	remap_arg_flags (r_lto->arg_flags, info);
>      }
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 49c99f263a7..b77c1aa7400 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -30,6 +30,7 @@ struct GTY(()) modref_summary
>    /* Load and stores in function (transitively closed to all callees)  */
>    modref_records *loads;
>    modref_records *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> new file mode 100644
> index 00000000000..2a23a9e8ccb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse1"  } */
> +__attribute__ ((noinline))
> +void write (int *a)
> +{
> +	*a=1;
> +	a[1]=2;
> +}
> +int test ()
> +{
> +	int a;
> +	a=2;
> +	write (&a);
> +	return a;
> +}
> +int test2 (int *a)
> +{
> +	*a=2;
> +	write (a);
> +	return *a;
> +}
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1"} } */
>
  
Jan Hubicka Nov. 12, 2021, 10:47 a.m. UTC | #2
> 
> I wonder why we bother producing summaries for things that do not
> bind locally?  The summary->kills.length () has an upper bound?

Because of local aliases.
The size of the array is capped by --param modref-max-accesses, which
defaults to 16.
> 
> > +	  && summary->kills.length ())
> > +	{
> > +	  tree base = ao_ref_base (ref);
> > +	  for (unsigned int i = 0; i < summary->kills.length (); i++)
> > +	    {
> > +	      modref_access_node &a = summary->kills[i];
> > +	      tree op;
> > +	      poly_offset_int off1_adj = 0, off2_adj = 0;
> > +	      poly_int64 off1, off2;
> > +	      tree base_ptr = NULL;
> > +	      tree base_decl = NULL;
> > +
> > +	      if (a.parm_index >= 0)
> > +		op = gimple_call_arg (stmt, a.parm_index);
> > +	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
> > +		op = gimple_call_chain (stmt);
> > +	      else
> > +		gcc_unreachable ();
> 
> I wonder if we can abstract this to a modref_access_node method?

Something like get_param (stmt)? Yes, it looks like a good idea.
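
A rough sketch of what I have in mind (calling it get_call_arg here;
untested, names are provisional):

tree
modref_access_node::get_call_arg (const gcall *stmt) const
{
  /* Map the access' parm_index to the corresponding operand of STMT.  */
  if (parm_index >= 0)
    return gimple_call_arg (stmt, parm_index);
  if (parm_index == MODREF_STATIC_CHAIN_PARM)
    return gimple_call_chain (stmt);
  /* Unknown or synthetic parameters have no operand.  */
  return NULL_TREE;
}
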
> 
> > +
> > +	      off2_adj += a.parm_offset * BITS_PER_UNIT;
> 
> wasn't there a parm_offset unknown? ...
Yes, but we do not insert those accesses to kills since they are
unknown.
> 
> > +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> > +		continue;
> > +	      if (TREE_CODE (base) == MEM_REF)
> > +		{
> > +		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
> > +		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> > +		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
> 
> 'base' will be the decl in this case, apart from when the constant
> offset doesn't fit ao_ref.offset, so I think you can spare this
> special-case and give up on non-SSA base_ptr

I thought we wrap decls in MEM_REFs during LTO streaming when type merging
fails?
> 
> > +		  else
> > +		    base_ptr = TREE_OPERAND (base, 0);
> > +		}
> > +	      /* Give up on TMRs for now.  */
> > +	      else if (TREE_CODE (base) == TARGET_MEM_REF)
> > +		break;
> > +	      else
> > +		base_decl = base;
> > +
> > +	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
> > +	      gcc_checking_assert (!base_ptr
> > +				   || TREE_CODE (base_ptr) == SSA_NAME);
> > +
> > +	      /* OP is a pointer and we have access range from its
> > +		 dereference.  */
> > +	      if (TREE_CODE (op) == ADDR_EXPR)
> > +		{
> > +		  poly_int64 size, offset, max_size;
> > +		  bool reverse;
> > +		  tree op_base = get_ref_base_and_extent
> > +			  (TREE_OPERAND (op, 0), &offset, &size,
> > +			   &max_size, &reverse);
> 
> I think you want get_addr_base_and_unit_offset here.  All
> variable indexed addresses are in separate stmts.  That also means
> you can eventually work with just byte sizes/offsets?

Will do.  The access range in the modref summary is bit based (since we want
to disambiguate bitfields like we do in the rest of the alias oracle) but
indeed this part can be in bytes.
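
Roughly, for the ADDR_EXPR case the matching would then become (untested
sketch of the adjustment):

poly_int64 byte_offset;
tree op_base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
					      &byte_offset);
/* NULL if the offset is not constant; variable-index addresses sit in
   separate stmts anyway.  */
if (!op_base || !DECL_P (op_base) || !base_decl
    || compare_base_decls (op_base, base_decl) != 1)
  continue;
off2_adj += byte_offset * BITS_PER_UNIT;
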
> 
> > +		  if (!known_size_p (size) || !known_eq (size, max_size))
> > +		    continue;
> > +		  off2_adj += offset;
> > +		  /* &str->foo are not passed as gimple operands directly,
> > +		     would need to look up the def stmt.  */
> > +		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
> > +		  if (!base_decl
> > +		      || compare_base_decls (op_base, base_decl) != 1)
> > +		    continue;
> > +		}
> > +	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
> > +		continue;
> > +
> > +	      if (!(off1_adj + ref->offset).to_shwi (&off1))
> > +		continue;
> > +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> > +		continue;
> > +
> > +	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
> > +		  && dbg_cnt (ipa_mod_ref))
> > +		{
> > +		  /* For store to be killed it needs to not be used earlier.  */
> > +		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
> > +						  true))
> 
> Hmm, so modref says p->x is killed when we have
> 
> foo (struct X *p)
> {
>   int tem = p->x;
>   p->x = 0;
>   return tem;
> }
> 
> ?  Or even
Yep, this will currently land in kills.  I can add pruning of kills by
known load ranges incrementally.
> 
> foo (struct X *p)
> {
>   bar ();
>   p->x = 0;
> }
> 
> ?
Here we will end up reading global memory and that will make the kills
vector empty in modref.

The check is still needed to verify that the ref is not passed as an
aggregate parameter.
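
For example (hand-written illustration, not from the testsuite):

struct X { int x; };

void foo (struct X *p, struct X byval);  /* modref: always stores to p->x  */

int
user (struct X *p)
{
  p->x = 1;      /* not dead: the call below reads *p when building the
		    by-value argument, which only the oracle check sees  */
  foo (p, *p);
  return 0;
}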

I will update the patch.
Thanks,
Honza
> 
> Otherwise it looks sensible.
> 
> Richard.
  
Richard Biener Nov. 12, 2021, 11 a.m. UTC | #3
On Fri, 12 Nov 2021, Jan Hubicka wrote:

> > 
> > I wonder why we bother producing summaries for things that do not
> > bind locally?  The summary->kills.length () has an upper bound?
> 
> Because of local aliases.
> The size of the array is capped by --param modref-max-accesses, which
> defaults to 16.
> > 
> > > +	  && summary->kills.length ())
> > > +	{
> > > +	  tree base = ao_ref_base (ref);
> > > +	  for (unsigned int i = 0; i < summary->kills.length (); i++)
> > > +	    {
> > > +	      modref_access_node &a = summary->kills[i];
> > > +	      tree op;
> > > +	      poly_offset_int off1_adj = 0, off2_adj = 0;
> > > +	      poly_int64 off1, off2;
> > > +	      tree base_ptr = NULL;
> > > +	      tree base_decl = NULL;
> > > +
> > > +	      if (a.parm_index >= 0)
> > > +		op = gimple_call_arg (stmt, a.parm_index);
> > > +	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
> > > +		op = gimple_call_chain (stmt);
> > > +	      else
> > > +		gcc_unreachable ();
> > 
> > I wonder if we can abstract this to a modref_access_node method?
> 
> Something like get_param (stmt)? Yes, it looks like a good idea.
> > 
> > > +
> > > +	      off2_adj += a.parm_offset * BITS_PER_UNIT;
> > 
> > wasn't there a parm_offset unknown? ...
> Yes, but we do not insert those accesses to kills since they are
> unknown.
> > 
> > > +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> > > +		continue;
> > > +	      if (TREE_CODE (base) == MEM_REF)
> > > +		{
> > > +		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
> > > +		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> > > +		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
> > 
> > 'base' will be the decl in this case, apart from when the constant
> > offset doesn't fit ao_ref.offset, so I think you can spare this
> > special-case and give up on non-SSA base_ptr
> 
> I thought we wrap decls in MEM_REFs during LTO streaming when type merging
> fails?

ao_ref_base returns the result of get_ref_base_and_extent, which
unwraps this.

> > 
> > > +		  else
> > > +		    base_ptr = TREE_OPERAND (base, 0);
> > > +		}
> > > +	      /* Give up on TMRs for now.  */
> > > +	      else if (TREE_CODE (base) == TARGET_MEM_REF)
> > > +		break;
> > > +	      else
> > > +		base_decl = base;
> > > +
> > > +	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
> > > +	      gcc_checking_assert (!base_ptr
> > > +				   || TREE_CODE (base_ptr) == SSA_NAME);
> > > +
> > > +	      /* OP is a pointer and we have access range from its
> > > +		 dereference.  */
> > > +	      if (TREE_CODE (op) == ADDR_EXPR)
> > > +		{
> > > +		  poly_int64 size, offset, max_size;
> > > +		  bool reverse;
> > > +		  tree op_base = get_ref_base_and_extent
> > > +			  (TREE_OPERAND (op, 0), &offset, &size,
> > > +			   &max_size, &reverse);
> > 
> > I think you want get_addr_base_and_unit_offset here.  All
> > variable indexed addresses are in separate stmts.  That also means
> > you can eventually work with just byte sizes/offsets?
> 
> Will do.  The access range in the modref summary is bit based (since we want
> to disambiguate bitfields like we do in the rest of the alias oracle) but
> indeed this part can be in bytes.
> > 
> > > +		  if (!known_size_p (size) || !known_eq (size, max_size))
> > > +		    continue;
> > > +		  off2_adj += offset;
> > > +		  /* &str->foo are not passed as gimple operands directly,
> > > +		     would need to look up the def stmt.  */
> > > +		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
> > > +		  if (!base_decl
> > > +		      || compare_base_decls (op_base, base_decl) != 1)
> > > +		    continue;
> > > +		}
> > > +	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
> > > +		continue;
> > > +
> > > +	      if (!(off1_adj + ref->offset).to_shwi (&off1))
> > > +		continue;
> > > +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> > > +		continue;
> > > +
> > > +	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
> > > +		  && dbg_cnt (ipa_mod_ref))
> > > +		{
> > > +		  /* For store to be killed it needs to not be used earlier.  */
> > > +		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
> > > +						  true))
> > 
> > Hmm, so modref says p->x is killed when we have
> > 
> > foo (struct X *p)
> > {
> >   int tem = p->x;
> >   p->x = 0;
> >   return tem;
> > }
> > 
> > ?  Or even
> Yep, this will currently land in kills.  I can add pruning of kills by
> known load ranges incrementally.
> > 
> > foo (struct X *p)
> > {
> >   bar ();
> >   p->x = 0;
> > }
> > 
> > ?
> Here we will end up reading global memory and that will make the kills
> vector empty in modref.
> 
> The check is still needed to verify that the ref is not passed as an
> aggregate parameter.
> 
> I will update the patch.
> Thanks,
> Honza
> > 
> > Otherwise it looks sensible.
> > 
> > Richard.
>
  
Jan Hubicka Nov. 14, 2021, 6:53 p.m. UTC | #4
> > 
> > I think you want get_addr_base_and_unit_offset here.  All
> > variable indexed addresses are in separate stmts.  That also means
> > you can eventually work with just byte sizes/offsets?
> 
> Will do.  The access range in the modref summary is bit based (since we want
> to disambiguate bitfields like we do in the rest of the alias oracle) but
> indeed this part can be in bytes.

Actually after the unification I can just use get_ao_ref, which will call
ao_ref_init_from_ptr_and_range that has all the logic I need in there.
I also noticed that I ended up duplicating the code matching bases
and ranges, which is already done twice in the function - once for store
targets and once for MEMSET and friends.  The latter copy lacked overflow
checks, so I took the first copy and moved it into a helper function.  This
makes the gimple part of the patch really straightforward: just build an
ao_ref if possible and then pass it to this function.
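
The gimple side is then roughly (paraphrased sketch; the tree-ssa-alias.c
hunk itself is not included in the diff below):

for (auto kill : summary->kills)
  {
    ao_ref dref;

    /* We can only do useful compares if the access range is known
       precisely.  */
    if (!kill.get_ao_ref (as_a <gcall *> (stmt), &dref))
      continue;
    if (store_kills_ref_p (ao_ref_base (&dref), dref.offset,
			   dref.size, dref.max_size, ref)
	&& dbg_cnt (ipa_mod_ref))
      {
	++alias_stats.modref_kill_yes;
	return true;
      }
  }
++alias_stats.modref_kill_no;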

I also added statistics.

I have bootstrapped/regtested the updated patch on x86_64-linux and
committed it so I can break out the patches that depend on it.
I have a patch improving the kill tracking on the modref side, and the
kill oracle itself can also use fnspec so it does not need to special-case
mem* functions.

For cc1plus LTO link I now get:

Alias oracle query stats:
  refs_may_alias_p: 76106130 disambiguations, 100928932 queries
  ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
  call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
  stmt_kills_ref_p: 107503 kills, 5699589 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
  nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
  aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
  TBAA oracle: 28248334 disambiguations 104710521 queries
               15220245 are in alias set 0
               8905994 queries asked about the same object
               98 queries asked about the same alias set
               0 access volatile
               50371110 are dependent in the DAG
               1964740 are artificially in conflict with void *

Modref stats:  
  modref kill: 52 kills, 6655 queries
  modref use: 25204 disambiguations, 692151 queries
  modref clobber: 2309709 disambiguations, 21877806 queries
  5320532 tbaa queries (0.243193 per modref query)
  761785 base compares (0.034820 per modref query)

PTA query stats:
  pt_solution_includes: 12539931 disambiguations, 39864841 queries
  pt_solutions_intersect: 1713075 disambiguations, 14023484 queries

We now also get stats for the kill oracle itself:
  stmt_kills_ref_p: 107503 kills, 5699589 queries
and for the modref part:
  modref kill: 52 kills, 6655 queries
So an improvement over the 1 kill using modref I had before.  Still not
really great.

Honza

gcc/ChangeLog:

	* ipa-modref-tree.c (modref_access_node::update_for_kills): New
	member function.
	(modref_access_node::merge_for_kills): Likewise.
	(modref_access_node::insert_kill): Likewise.
	* ipa-modref-tree.h (modref_access_node::update_for_kills,
	modref_access_node::merge_for_kills,
	modref_access_node::insert_kill): Declare.
	(modref_access_node::useful_for_kill_p): New member function.
	* ipa-modref.c (modref_summary::useful_p): Release useless kills.
	(struct modref_summary_lto): Add kills.
	(modref_summary::dump): Dump kills.
	(record_access): Add modref_access_node parameter.
	(record_access_lto): Likewise.
	(merge_call_side_effects): Merge kills.
	(analyze_call): Add ALWAYS_EXECUTED param and pass it around.
	(struct summary_ptrs): Add always_executed field.
	(analyze_load): Update.
	(analyze_store): Update; record kills.
	(analyze_stmt): Add always_executed; record kills in clobbers.
	(analyze_function): Track always_executed.
	(modref_summaries::duplicate): Duplicate kills.
	(update_signature): Release kills.
	* ipa-modref.h (struct modref_summary): Add kills.
	* tree-ssa-alias.c (alias_stats): Add kill stats.
	(dump_alias_stats): Dump kill stats.
	(store_kills_ref_p): Break out from ...
	(stmt_kills_ref_p): Use it; handle modref info based kills.

gcc/testsuite/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.dg/tree-ssa/modref-dse-3.c: New test.


diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
index 6fc2b7298f4..bbe23a5a211 100644
--- a/gcc/ipa-modref-tree.c
+++ b/gcc/ipa-modref-tree.c
@@ -638,6 +638,185 @@ modref_access_node::get_ao_ref (const gcall *stmt, ao_ref *ref) const
   return true;
 }
 
+/* Return true if A is a subkill of THIS.  */
+bool
+modref_access_node::contains_for_kills (const modref_access_node &a) const
+{
+  poly_int64 aoffset_adj = 0;
+
+  gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
+		       && a.parm_index != MODREF_UNKNOWN_PARM);
+  if (parm_index != a.parm_index)
+    return false;
+  gcc_checking_assert (parm_offset_known && a.parm_offset_known);
+  aoffset_adj = (a.parm_offset - parm_offset)
+		* BITS_PER_UNIT;
+  gcc_checking_assert (range_info_useful_p () && a.range_info_useful_p ());
+  return known_subrange_p (a.offset + aoffset_adj,
+			   a.max_size, offset, max_size);
+}
+
+/* Merge two ranges both starting at parm_offset1 and update THIS
+   with result.  */
+bool
+modref_access_node::update_for_kills (poly_int64 parm_offset1,
+				      poly_int64 offset1,
+				      poly_int64 max_size1,
+				      poly_int64 offset2,
+				      poly_int64 max_size2,
+				      bool record_adjustments)
+{
+  if (known_le (offset1, offset2))
+    ;
+  else if (known_le (offset2, offset1))
+    {
+      std::swap (offset1, offset2);
+      std::swap (max_size1, max_size2);
+    }
+  else
+    gcc_unreachable ();
+
+  poly_int64 new_max_size = max_size2 + offset2 - offset1;
+  if (known_le (new_max_size, max_size1))
+    new_max_size = max_size1;
+  if (known_eq (parm_offset, parm_offset1)
+      && known_eq (offset, offset1)
+      && known_eq (size, new_max_size)
+      && known_eq (max_size, new_max_size))
+    return false;
+
+  if (!record_adjustments
+      || (++adjustments) < param_modref_max_adjustments)
+    {
+      parm_offset = parm_offset1;
+      offset = offset1;
+      max_size = new_max_size;
+      size = new_max_size;
+      gcc_checking_assert (useful_for_kill_p ());
+      return true;
+    }
+  return false;
+}
+
+/* Merge in access A if it is possible to do without losing
+   precision.  Return true if successful.
+   Unlike merge, assume that both accesses are always executed
+   and merge size the same way as max_size.  */
+bool
+modref_access_node::merge_for_kills (const modref_access_node &a,
+				     bool record_adjustments)
+{
+  poly_int64 offset1 = 0;
+  poly_int64 aoffset1 = 0;
+  poly_int64 new_parm_offset = 0;
+
+  /* We assume that containment was tested earlier.  */
+  gcc_checking_assert (!contains_for_kills (a) && !a.contains_for_kills (*this)
+		       && useful_for_kill_p () && a.useful_for_kill_p ());
+
+  if (parm_index != a.parm_index
+      || !combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
+    return false;
+
+  if (known_le (offset1, aoffset1))
+   {
+     if (!known_size_p (max_size)
+	 || known_ge (offset1 + max_size, aoffset1))
+       return update_for_kills (new_parm_offset, offset1, max_size,
+				aoffset1, a.max_size, record_adjustments);
+   }
+  else if (known_le (aoffset1, offset1))
+   {
+     if (!known_size_p (a.max_size)
+	 || known_ge (aoffset1 + a.max_size, offset1))
+       return update_for_kills (new_parm_offset, offset1, max_size,
+				aoffset1, a.max_size, record_adjustments);
+   }
+  return false;
+}
+
+/* Insert new kill A into KILLS.  If RECORD_ADJUSTMENTS is true limit number
+   of changes to each entry.  Return true if something changed.  */
+
+bool
+modref_access_node::insert_kill (vec<modref_access_node> &kills,
+				 modref_access_node &a, bool record_adjustments)
+{
+  size_t index;
+  modref_access_node *a2;
+  bool merge = false;
+
+  gcc_checking_assert (a.useful_for_kill_p ());
+
+  /* See if we have corresponding entry already or we can merge with
+     neighbouring entry.  */
+  FOR_EACH_VEC_ELT (kills, index, a2)
+    {
+      if (a2->contains_for_kills (a))
+	return false;
+      if (a.contains_for_kills (*a2))
+	{
+	  a.adjustments = 0;
+	  *a2 = a;
+	  merge = true;
+	  break;
+	}
+      if (a2->merge_for_kills (a, record_adjustments))
+	{
+	  merge = true;
+	  break;
+	}
+    }
+  /* If entry was not found, insert it.  */
+  if (!merge)
+    {
+      if ((int)kills.length () >= param_modref_max_accesses)
+	{
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "--param param=modref-max-accesses limit reached:");
+	  return false;
+	}
+      a.adjustments = 0;
+      kills.safe_push (a);
+      return true;
+    }
+  /* Extending range in an entry may make it possible to merge it with
+     other entries.  */
+  size_t i;
+
+  for (i = 0; i < kills.length ();)
+    if (i != index)
+      {
+	bool found = false, restart = false;
+	modref_access_node *a = &kills[i];
+	modref_access_node *n = &kills[index];
+
+	if (n->contains_for_kills (*a))
+	  found = true;
+	if (!found && n->merge_for_kills (*a, false))
+	  found = restart = true;
+	gcc_checking_assert (found || !a->merge_for_kills (*n, false));
+	if (found)
+	  {
+	    kills.unordered_remove (i);
+	    if (index == kills.length ())
+	      {
+		index = i;
+		i++;
+	      }
+	    if (restart)
+	      i = 0;
+	  }
+	else
+	  i++;
+      }
+    else
+      i++;
+  return true;
+}
+
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index 2fcabe480bd..1bf2aa8460e 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -82,6 +82,13 @@ struct GTY(()) modref_access_node
     {
       return parm_index != MODREF_UNKNOWN_PARM;
     }
+  /* Return true if access can be used to determine a kill.  */
+  bool useful_for_kill_p () const
+    {
+      return parm_offset_known && parm_index != MODREF_UNKNOWN_PARM
+	     && parm_index != MODREF_RETSLOT_PARM && known_size_p (size)
+	     && known_eq (max_size, size);
+    }
   /* Dump range to debug OUT.  */
   void dump (FILE *out);
   /* Return true if both accesses are the same.  */
@@ -98,10 +105,18 @@ struct GTY(()) modref_access_node
   static int insert (vec <modref_access_node, va_gc> *&accesses,
 		     modref_access_node a, size_t max_accesses,
 		     bool record_adjustments);
+  /* Same as insert but for kills where we are conservative the other way
+     around: if information is lost, the kill is lost.  */
+  static bool insert_kill (vec<modref_access_node> &kills,
+			   modref_access_node &a, bool record_adjustments);
 private:
   bool contains (const modref_access_node &) const;
+  bool contains_for_kills (const modref_access_node &) const;
   void update (poly_int64, poly_int64, poly_int64, poly_int64, bool);
+  bool update_for_kills (poly_int64, poly_int64, poly_int64,
+			 poly_int64, poly_int64, bool);
   bool merge (const modref_access_node &, bool);
+  bool merge_for_kills (const modref_access_node &, bool);
   static bool closer_pair_p (const modref_access_node &,
 			     const modref_access_node &,
 			     const modref_access_node &,
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index b75ed84135b..df4612bbff9 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -335,6 +335,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
     return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
   if (loads && !loads->every_base)
     return true;
+  else
+    kills.release ();
   if (ecf_flags & ECF_PURE)
     return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
   return stores && !stores->every_base;
@@ -351,6 +353,7 @@ struct GTY(()) modref_summary_lto
      more verbose and thus more likely to hit the limits.  */
   modref_records_lto *loads;
   modref_records_lto *stores;
+  auto_vec<modref_access_node> GTY((skip)) kills;
   auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
   eaf_flags_t retslot_flags;
   eaf_flags_t static_chain_flags;
@@ -570,6 +573,15 @@ modref_summary::dump (FILE *out)
       fprintf (out, "  stores:\n");
       dump_records (stores, out);
     }
+  if (kills.length ())
+    {
+      fprintf (out, "  kills:\n");
+      for (auto kill : kills)
+	{
+	  fprintf (out, "    ");
+	  kill.dump (out);
+	}
+    }
   if (writes_errno)
     fprintf (out, "  Writes errno\n");
   if (side_effects)
@@ -762,13 +774,12 @@ get_access (ao_ref *ref)
 /* Record access into the modref_records data structure.  */
 
 static void
-record_access (modref_records *tt, ao_ref *ref)
+record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
 {
   alias_set_type base_set = !flag_strict_aliasing ? 0
 			    : ao_ref_base_alias_set (ref);
   alias_set_type ref_set = !flag_strict_aliasing ? 0
 			    : (ao_ref_alias_set (ref));
-  modref_access_node a = get_access (ref);
   if (dump_file)
     {
        fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
@@ -781,7 +792,7 @@ record_access (modref_records *tt, ao_ref *ref)
 /* IPA version of record_access_tree.  */
 
 static void
-record_access_lto (modref_records_lto *tt, ao_ref *ref)
+record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
 {
   /* get_alias_set sometimes use different type to compute the alias set
      than TREE_TYPE (base).  Do same adjustments.  */
@@ -828,7 +839,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
 		       || variably_modified_type_p (ref_type, NULL_TREE)))
 	ref_type = NULL_TREE;
     }
-  modref_access_node a = get_access (ref);
   if (dump_file)
     {
       fprintf (dump_file, "   - Recording base type:");
@@ -932,13 +942,17 @@ bool
 merge_call_side_effects (modref_summary *cur_summary,
 			 gimple *stmt, modref_summary *callee_summary,
 			 bool ignore_stores, cgraph_node *callee_node,
-			 bool record_adjustments)
+			 bool record_adjustments, bool always_executed)
 {
   auto_vec <modref_parm_map, 32> parm_map;
   modref_parm_map chain_map;
   bool changed = false;
   int flags = gimple_call_flags (stmt);
 
+  if ((flags & (ECF_CONST | ECF_NOVOPS))
+      && !(flags & ECF_LOOPING_CONST_OR_PURE))
+    return changed;
+
   if (!cur_summary->side_effects && callee_summary->side_effects)
     {
       if (dump_file)
@@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (flags & (ECF_CONST | ECF_NOVOPS))
     return changed;
 
+  if (always_executed
+      && callee_summary->kills.length ()
+      && (!cfun->can_throw_non_call_exceptions
+	  || !stmt_could_throw_p (cfun, stmt)))
+    {
+      /* Watch for self recursive updates.  */
+      auto_vec<modref_access_node, 32> saved_kills;
+
+      saved_kills.reserve_exact (callee_summary->kills.length ());
+      saved_kills.splice (callee_summary->kills);
+      for (auto kill : saved_kills)
+	{
+	  if (kill.parm_index >= (int)parm_map.length ())
+	    continue;
+	  modref_parm_map &m
+		  = kill.parm_index == MODREF_STATIC_CHAIN_PARM
+		    ? chain_map
+		    : parm_map[kill.parm_index];
+	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+	      || m.parm_index == MODREF_UNKNOWN_PARM
+	      || m.parm_index == MODREF_RETSLOT_PARM
+	      || !m.parm_offset_known)
+	    continue;
+	  modref_access_node n = kill;
+	  n.parm_index = m.parm_index;
+	  n.parm_offset += m.parm_offset;
+	  if (modref_access_node::insert_kill (cur_summary->kills, n,
+					       record_adjustments))
+	    changed = true;
+	}
+    }
+
   /* We can not safely optimize based on summary of callee if it does
      not always bind to current def: it is possible that memory load
      was optimized out earlier which may not happen in the interposed
@@ -1218,7 +1264,8 @@ process_fnspec (modref_summary *cur_summary,
 
 static bool
 analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
-	      gcall *stmt, vec <gimple *> *recursive_calls)
+	      gcall *stmt, vec <gimple *> *recursive_calls,
+	      bool always_executed)
 {
   /* Check flags on the function call.  In certain cases, analysis can be
      simplified.  */
@@ -1305,7 +1352,7 @@ analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
     }
 
   merge_call_side_effects (cur_summary, stmt, callee_summary, ignore_stores,
-			   callee_node, false);
+			   callee_node, false, always_executed);
 
   return true;
 }
@@ -1316,6 +1363,7 @@ struct summary_ptrs
 {
   struct modref_summary *nolto;
   struct modref_summary_lto *lto;
+  bool always_executed;
 };
 
 /* Helper for analyze_stmt.  */
@@ -1350,18 +1398,19 @@ analyze_load (gimple *, tree, tree op, void *data)
 
   ao_ref r;
   ao_ref_init (&r, op);
+  modref_access_node a = get_access (&r);
 
   if (summary)
-    record_access (summary->loads, &r);
+    record_access (summary->loads, &r, a);
   if (summary_lto)
-    record_access_lto (summary_lto->loads, &r);
+    record_access_lto (summary_lto->loads, &r, a);
   return false;
 }
 
 /* Helper for analyze_stmt.  */
 
 static bool
-analyze_store (gimple *, tree, tree op, void *data)
+analyze_store (gimple *stmt, tree, tree op, void *data)
 {
   modref_summary *summary = ((summary_ptrs *)data)->nolto;
   modref_summary_lto *summary_lto = ((summary_ptrs *)data)->lto;
@@ -1390,11 +1439,22 @@ analyze_store (gimple *, tree, tree op, void *data)
 
   ao_ref r;
   ao_ref_init (&r, op);
+  modref_access_node a = get_access (&r);
 
   if (summary)
-    record_access (summary->stores, &r);
+    record_access (summary->stores, &r, a);
   if (summary_lto)
-    record_access_lto (summary_lto->stores, &r);
+    record_access_lto (summary_lto->stores, &r, a);
+  if (summary
+      && ((summary_ptrs *)data)->always_executed
+      && a.useful_for_kill_p ()
+      && (!cfun->can_throw_non_call_exceptions
+	  || !stmt_could_throw_p (cfun, stmt)))
+    {
+      if (dump_file)
+	fprintf (dump_file, "   - Recording kill\n");
+      modref_access_node::insert_kill (summary->kills, a, false);
+    }
   return false;
 }
 
@@ -1403,16 +1463,32 @@ analyze_store (gimple *, tree, tree op, void *data)
 
 static bool
 analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
-	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls)
+	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls,
+	      bool always_executed)
 {
   /* In general we can not ignore clobbers because they are barriers for code
      motion, however after inlining it is safe to do because local optimization
      passes do not consider clobbers from other functions.
      Similar logic is in ipa-pure-const.c.  */
   if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
-    return true;
+    {
+      if (summary
+	  && always_executed && record_access_p (gimple_assign_lhs (stmt)))
+	{
+	  ao_ref r;
+	  ao_ref_init (&r, gimple_assign_lhs (stmt));
+	  modref_access_node a = get_access (&r);
+	  if (a.useful_for_kill_p ())
+	    {
+	      if (dump_file)
+		fprintf (dump_file, "   - Recording kill\n");
+	      modref_access_node::insert_kill (summary->kills, a, false);
+	    }
+	}
+      return true;
+    }
 
-  struct summary_ptrs sums = {summary, summary_lto};
+  struct summary_ptrs sums = {summary, summary_lto, always_executed};
 
   /* Analyze all loads and stores in STMT.  */
   walk_stmt_load_store_ops (stmt, &sums,
@@ -1441,7 +1517,8 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
    case GIMPLE_CALL:
      if (!ipa || gimple_call_internal_p (stmt))
        return analyze_call (summary, summary_lto,
-			    as_a <gcall *> (stmt), recursive_calls);
+			    as_a <gcall *> (stmt), recursive_calls,
+			    always_executed);
      else
       {
 	attr_fnspec fnspec = gimple_call_fnspec (as_a <gcall *>(stmt));
@@ -2779,11 +2856,15 @@ analyze_function (function *f, bool ipa)
   FOR_EACH_BB_FN (bb, f)
     {
       gimple_stmt_iterator si;
+      bool always_executed
+	      = bb == single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->dest;
+
       for (si = gsi_start_nondebug_after_labels_bb (bb);
 	   !gsi_end_p (si); gsi_next_nondebug (&si))
 	{
 	  if (!analyze_stmt (summary, summary_lto,
-			     gsi_stmt (si), ipa, &recursive_calls)
+			     gsi_stmt (si), ipa, &recursive_calls,
+			     always_executed)
 	      || ((!summary || !summary->useful_p (ecf_flags, false))
 		  && (!summary_lto
 		      || !summary_lto->useful_p (ecf_flags, false))))
@@ -2792,6 +2873,9 @@ analyze_function (function *f, bool ipa)
 	      collapse_stores (summary, summary_lto);
 	      break;
 	    }
+	  if (always_executed
+	      && stmt_can_throw_external (cfun, gsi_stmt (si)))
+	    always_executed = false;
 	}
     }
 
@@ -2811,7 +2895,7 @@ analyze_function (function *f, bool ipa)
 			   ignore_stores_p (current_function_decl,
 					    gimple_call_flags
 						 (recursive_calls[i])),
-			   fnode, !first);
+			   fnode, !first, false);
 	      if (!summary->useful_p (ecf_flags, false))
 		{
 		  remove_summary (lto, nolto, ipa);
@@ -3061,6 +3145,8 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *dst,
 			 src_data->loads->max_refs,
 			 src_data->loads->max_accesses);
   dst_data->loads->copy_from (src_data->loads);
+  dst_data->kills.reserve_exact (src_data->kills.length ());
+  dst_data->kills.splice (src_data->kills);
   dst_data->writes_errno = src_data->writes_errno;
   dst_data->side_effects = src_data->side_effects;
   if (src_data->arg_flags.length ())
@@ -3710,6 +3796,8 @@ update_signature (struct cgraph_node *node)
     {
       r->loads->remap_params (&map);
       r->stores->remap_params (&map);
+      /* TODO: Once we do IPA kills analysis, update the table here.  */
+      r->kills.release ();
       if (r->arg_flags.length ())
 	remap_arg_flags (r->arg_flags, info);
     }
@@ -3717,6 +3805,8 @@ update_signature (struct cgraph_node *node)
     {
       r_lto->loads->remap_params (&map);
       r_lto->stores->remap_params (&map);
+      /* TODO: Once we do IPA kills analysis, update the table here.  */
+      r_lto->kills.release ();
       if (r_lto->arg_flags.length ())
 	remap_arg_flags (r_lto->arg_flags, info);
     }
diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
index 118dc5f2abf..9e8a30fd80a 100644
--- a/gcc/ipa-modref.h
+++ b/gcc/ipa-modref.h
@@ -30,6 +30,7 @@ struct GTY(()) modref_summary
   /* Load and stores in function (transitively closed to all callees)  */
   modref_records *loads;
   modref_records *stores;
+  auto_vec<modref_access_node> GTY((skip)) kills;
   auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
   eaf_flags_t retslot_flags;
   eaf_flags_t static_chain_flags;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-3.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-3.c
new file mode 100644
index 00000000000..c69e423c6fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-3.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1-details"  } */
+__attribute__ ((noinline))
+void write (int *a)
+{
+	*a=1;
+	a[1]=2;
+}
+int test ()
+{
+	int a;
+	a=2;
+	write (&a);
+	return a;
+}
+int test2 (int *a)
+{
+	*a=2;
+	write (a);
+	return *a;
+}
+/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1"} } */
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 29be1f848b5..093d65cc003 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -116,10 +116,14 @@ static struct {
   unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_may_alias;
   unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_must_overlap;
   unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_no_alias;
+  unsigned HOST_WIDE_INT stmt_kills_ref_p_no;
+  unsigned HOST_WIDE_INT stmt_kills_ref_p_yes;
   unsigned HOST_WIDE_INT modref_use_may_alias;
   unsigned HOST_WIDE_INT modref_use_no_alias;
   unsigned HOST_WIDE_INT modref_clobber_may_alias;
   unsigned HOST_WIDE_INT modref_clobber_no_alias;
+  unsigned HOST_WIDE_INT modref_kill_no;
+  unsigned HOST_WIDE_INT modref_kill_yes;
   unsigned HOST_WIDE_INT modref_tests;
   unsigned HOST_WIDE_INT modref_baseptr_tests;
 } alias_stats;
@@ -146,6 +150,12 @@ dump_alias_stats (FILE *s)
 	   alias_stats.call_may_clobber_ref_p_no_alias,
 	   alias_stats.call_may_clobber_ref_p_no_alias
 	   + alias_stats.call_may_clobber_ref_p_may_alias);
+  fprintf (s, "  stmt_kills_ref_p: "
+	   HOST_WIDE_INT_PRINT_DEC" kills, "
+	   HOST_WIDE_INT_PRINT_DEC" queries\n",
+	   alias_stats.stmt_kills_ref_p_yes + alias_stats.modref_kill_yes,
+	   alias_stats.stmt_kills_ref_p_yes + alias_stats.modref_kill_yes
+	   + alias_stats.stmt_kills_ref_p_no + alias_stats.modref_kill_no);
   fprintf (s, "  nonoverlapping_component_refs_p: "
 	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
 	   HOST_WIDE_INT_PRINT_DEC" queries\n",
@@ -169,6 +179,12 @@ dump_alias_stats (FILE *s)
 	   + alias_stats.aliasing_component_refs_p_may_alias);
   dump_alias_stats_in_alias_c (s);
   fprintf (s, "\nModref stats:\n");
+  fprintf (s, "  modref kill: "
+	   HOST_WIDE_INT_PRINT_DEC" kills, "
+	   HOST_WIDE_INT_PRINT_DEC" queries\n",
+	   alias_stats.modref_kill_yes,
+	   alias_stats.modref_kill_yes
+	   + alias_stats.modref_kill_no);
   fprintf (s, "  modref use: "
 	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
 	   HOST_WIDE_INT_PRINT_DEC" queries\n",
@@ -3220,6 +3236,51 @@ same_addr_size_stores_p (tree base1, poly_int64 offset1, poly_int64 size1,
 	  && known_eq (wi::to_poly_offset (DECL_SIZE (obj)), size1));
 }
 
+/* Return true if REF is killed by a store described by
+   BASE, OFFSET, SIZE and MAX_SIZE.  */
+
+static bool
+store_kills_ref_p (tree base, poly_int64 offset, poly_int64 size,
+		   poly_int64 max_size, ao_ref *ref)
+{
+  poly_int64 ref_offset = ref->offset;
+  /* We can get MEM[symbol: sZ, index: D.8862_1] here,
+     so base == ref->base does not always hold.  */
+  if (base != ref->base)
+    {
+      /* Try using points-to info.  */
+      if (same_addr_size_stores_p (base, offset, size, max_size, ref->base,
+				   ref->offset, ref->size, ref->max_size))
+	return true;
+
+      /* If both base and ref->base are MEM_REFs, only compare the
+	 first operand, and if the second operand isn't equal constant,
+	 try to add the offsets into offset and ref_offset.  */
+      if (TREE_CODE (base) == MEM_REF && TREE_CODE (ref->base) == MEM_REF
+	  && TREE_OPERAND (base, 0) == TREE_OPERAND (ref->base, 0))
+	{
+	  if (!tree_int_cst_equal (TREE_OPERAND (base, 1),
+				   TREE_OPERAND (ref->base, 1)))
+	    {
+	      poly_offset_int off1 = mem_ref_offset (base);
+	      off1 <<= LOG2_BITS_PER_UNIT;
+	      off1 += offset;
+	      poly_offset_int off2 = mem_ref_offset (ref->base);
+	      off2 <<= LOG2_BITS_PER_UNIT;
+	      off2 += ref_offset;
+	      if (!off1.to_shwi (&offset) || !off2.to_shwi (&ref_offset))
+		size = -1;
+	    }
+	}
+      else
+	size = -1;
+    }
+  /* For a must-alias check we need to be able to constrain
+     the access properly.  */
+  return (known_eq (size, max_size)
+	  && known_subrange_p (ref_offset, ref->max_size, offset, size));
+}
+
 /* If STMT kills the memory reference REF return true, otherwise
    return false.  */
 
@@ -3293,7 +3354,10 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 		      && operand_equal_p (lhs, base,
 					  OEP_ADDRESS_OF
 					  | OEP_MATCH_SIDE_EFFECTS))))
-	    return true;
+	    {
+	      ++alias_stats.stmt_kills_ref_p_yes;
+	      return true;
+	    }
 	}
 
       /* Now look for non-literal equal bases with the restriction of
@@ -3301,52 +3365,72 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
       /* For a must-alias check we need to be able to constrain
 	 the access properly.  */
       if (!ref->max_size_known_p ())
-	return false;
-      poly_int64 size, offset, max_size, ref_offset = ref->offset;
+	{
+	  ++alias_stats.stmt_kills_ref_p_no;
+	  return false;
+	}
+      poly_int64 size, offset, max_size;
       bool reverse;
       tree base = get_ref_base_and_extent (lhs, &offset, &size, &max_size,
 					   &reverse);
-      /* We can get MEM[symbol: sZ, index: D.8862_1] here,
-	 so base == ref->base does not always hold.  */
-      if (base != ref->base)
+      if (store_kills_ref_p (base, offset, size, max_size, ref))
 	{
-	  /* Try using points-to info.  */
-	  if (same_addr_size_stores_p (base, offset, size, max_size, ref->base,
-				       ref->offset, ref->size, ref->max_size))
-	    return true;
-
-	  /* If both base and ref->base are MEM_REFs, only compare the
-	     first operand, and if the second operand isn't equal constant,
-	     try to add the offsets into offset and ref_offset.  */
-	  if (TREE_CODE (base) == MEM_REF && TREE_CODE (ref->base) == MEM_REF
-	      && TREE_OPERAND (base, 0) == TREE_OPERAND (ref->base, 0))
-	    {
-	      if (!tree_int_cst_equal (TREE_OPERAND (base, 1),
-				       TREE_OPERAND (ref->base, 1)))
-		{
-		  poly_offset_int off1 = mem_ref_offset (base);
-		  off1 <<= LOG2_BITS_PER_UNIT;
-		  off1 += offset;
-		  poly_offset_int off2 = mem_ref_offset (ref->base);
-		  off2 <<= LOG2_BITS_PER_UNIT;
-		  off2 += ref_offset;
-		  if (!off1.to_shwi (&offset) || !off2.to_shwi (&ref_offset))
-		    size = -1;
-		}
-	    }
-	  else
-	    size = -1;
+	  ++alias_stats.stmt_kills_ref_p_yes;
+	  return true;
 	}
-      /* For a must-alias check we need to be able to constrain
-	 the access properly.  */
-      if (known_eq (size, max_size)
-	  && known_subrange_p (ref_offset, ref->max_size, offset, size))
-	return true;
     }
 
   if (is_gimple_call (stmt))
     {
       tree callee = gimple_call_fndecl (stmt);
+      struct cgraph_node *node;
+      modref_summary *summary;
+
+      /* Try to disambiguate using the modref summary.  Modref records a
+	 vector of stores with known offsets relative to function parameters
+	 that must happen on every execution of the function.  Find out if we
+	 have a matching store and verify that the function can not use the
+	 value.  */
+      if (callee != NULL_TREE
+	  && (node = cgraph_node::get (callee)) != NULL
+	  && node->binds_to_current_def_p ()
+	  && (summary = get_modref_function_summary (node)) != NULL
+	  && summary->kills.length ()
+	  && (!cfun->can_throw_non_call_exceptions
+	      || !stmt_can_throw_internal (cfun, stmt)))
+	{
+	  for (auto kill : summary->kills)
+	    {
+	      ao_ref dref;
+
+	      /* We can only do useful compares if we know the access range
+		 precisely.  */
+	      if (!kill.get_ao_ref (as_a <gcall *> (stmt), &dref))
+		continue;
+	      if (store_kills_ref_p (ao_ref_base (&dref), dref.offset,
+				     dref.size, dref.max_size, ref))
+		{
+		  /* For the store to be killed it must not be used
+		     earlier.  */
+		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
+						  true)
+		      || !dbg_cnt (ipa_mod_ref))
+		    break;
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    {
+		      fprintf (dump_file,
+			       "ipa-modref: call stmt ");
+		      print_gimple_stmt (dump_file, stmt, 0);
+		      fprintf (dump_file,
+			       "ipa-modref: call to %s kills ",
+			       node->dump_name ());
+		      print_generic_expr (dump_file, ref->base);
+		    }
+		  ++alias_stats.modref_kill_yes;
+		  return true;
+		}
+	    }
+	  ++alias_stats.modref_kill_no;
+	}
       if (callee != NULL_TREE
 	  && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 	switch (DECL_FUNCTION_CODE (callee))
@@ -3357,7 +3441,10 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 	      tree base = ao_ref_base (ref);
 	      if (base && TREE_CODE (base) == MEM_REF
 		  && TREE_OPERAND (base, 0) == ptr)
-		return true;
+		{
+		  ++alias_stats.stmt_kills_ref_p_yes;
+		  return true;
+		}
 	      break;
 	    }
 
@@ -3376,7 +3463,10 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 	      /* For a must-alias check we need to be able to constrain
 		 the access properly.  */
 	      if (!ref->max_size_known_p ())
-		return false;
+		{
+		  ++alias_stats.stmt_kills_ref_p_no;
+		  return false;
+		}
 	      tree dest;
 	      tree len;
 
@@ -3391,11 +3481,17 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 		  tree arg1 = gimple_call_arg (stmt, 1);
 		  if (TREE_CODE (arg0) != INTEGER_CST
 		      || TREE_CODE (arg1) != INTEGER_CST)
-		    return false;
+		    {
+		      ++alias_stats.stmt_kills_ref_p_no;
+		      return false;
+		    }
 
 		  dest = gimple_call_lhs (stmt);
 		  if (!dest)
-		    return false;
+		    {
+		      ++alias_stats.stmt_kills_ref_p_no;
+		      return false;
+		    }
 		  len = fold_build2 (MULT_EXPR, TREE_TYPE (arg0), arg0, arg1);
 		}
 	      else
@@ -3405,29 +3501,14 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 		}
 	      if (!poly_int_tree_p (len))
 		return false;
-	      tree rbase = ref->base;
-	      poly_offset_int roffset = ref->offset;
 	      ao_ref dref;
 	      ao_ref_init_from_ptr_and_size (&dref, dest, len);
-	      tree base = ao_ref_base (&dref);
-	      poly_offset_int offset = dref.offset;
-	      if (!base || !known_size_p (dref.size))
-		return false;
-	      if (TREE_CODE (base) == MEM_REF)
+	      if (store_kills_ref_p (ao_ref_base (&dref), dref.offset,
+				     dref.size, dref.max_size, ref))
 		{
-		  if (TREE_CODE (rbase) != MEM_REF)
-		    return false;
-		  // Compare pointers.
-		  offset += mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
-		  roffset += mem_ref_offset (rbase) << LOG2_BITS_PER_UNIT;
-		  base = TREE_OPERAND (base, 0);
-		  rbase = TREE_OPERAND (rbase, 0);
+		  ++alias_stats.stmt_kills_ref_p_yes;
+		  return true;
 		}
-	      if (base == rbase
-		  && known_subrange_p (roffset, ref->max_size, offset,
-				       wi::to_poly_offset (len)
-				       << LOG2_BITS_PER_UNIT))
-		return true;
 	      break;
 	    }
 
@@ -3438,7 +3519,10 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 		{
 		  tree base = ao_ref_base (ref);
 		  if (TREE_OPERAND (ptr, 0) == base)
-		    return true;
+		    {
+		      ++alias_stats.stmt_kills_ref_p_yes;
+		      return true;
+		    }
 		}
 	      break;
 	    }
@@ -3446,6 +3530,7 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
 	  default:;
 	  }
     }
+  ++alias_stats.stmt_kills_ref_p_no;
   return false;
 }
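
For readers following the tree-ssa-alias.c part: the must-kill test at the
tail of store_kills_ref_p is just "the store has an exact size and the
reference's maximal extent is a subrange of it".  Here is a tiny standalone
sketch of that test (plain C, ordinary integers standing in for poly_int64,
both accesses already reduced to a common base, everything in bit units;
an illustration, not the actual GCC code):

#include <stdbool.h>
#include <stdio.h>

/* Plain-integer analogue of known_subrange_p: [off1, off1 + size1) must
   lie within [off2, off2 + size2); -1 stands for an unknown size as in
   tree-ssa-alias.c.  */
static bool
subrange_p (long long off1, long long size1, long long off2, long long size2)
{
  return size1 >= 0 && size2 >= 0
	 && off1 >= off2 && off1 + size1 <= off2 + size2;
}

/* Analogue of the final test in store_kills_ref_p.  */
static bool
store_kills_ref (long long store_off, long long store_size,
		 long long store_max_size,
		 long long ref_off, long long ref_max_size)
{
  return store_size == store_max_size
	 && subrange_p (ref_off, ref_max_size, store_off, store_size);
}

int
main (void)
{
  /* A 64-bit store at bit 0 covers a 32-bit ref at bit 32.  */
  printf ("%d\n", store_kills_ref (0, 64, 64, 32, 32));  /* 1 */
  /* A ref with unknown extent can never be proven covered.  */
  printf ("%d\n", store_kills_ref (0, 64, 64, 32, -1));  /* 0 */
  /* A ref overhanging the store is not killed.  */
  printf ("%d\n", store_kills_ref (0, 64, 64, 0, 96));   /* 0 */
  return 0;
}

The second case is also why stmt_kills_ref_p bails out early when
!ref->max_size_known_p (): a reference without a known maximal extent
cannot be fully covered by any store.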
  
H.J. Lu Nov. 15, 2021, 6:51 p.m. UTC | #5
On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > >
> > > I think you want get_addr_base_and_unit_offset here.  All
> > > variable indexed addresses are in separate stmts.  That also means
> > > you can eventually work with just byte sizes/offsets?
> >
> > Will do.  The access range in the modref summary is bit based (since we
> > want to disambiguate bitfields like we do in the rest of the alias
> > oracle) but indeed this part can be in bytes.
>
> Actually after the unification I can just use get_ao_ref, which will call
> ao_ref_init_from_ptr_and_range that has all the logic I need in there.
> I also noticed that I ended up duplicating the code matching bases
> and ranges, which is already done twice in the function - once for store
> targets and once for MEMSET and friends.  The latter copy lacked overflow
> checks, so I took the first copy and moved it into a helper function.  This
> makes the gimple part of the patch really straightforward: just build an
> ao_ref if possible and then pass it to this function.
>
> I also added statistics.
>
> I have bootstrapped/regtested the updated patch on x86_64-linux and
> committed it so I can break out the patches that depend on it.
> I have a patch improving the kill tracking on the modref side, and the
> kill oracle itself can also use fnspec and no longer needs to special-case
> mem* functions.
>
> For cc1plus LTO link I now get:
>
> Alias oracle query stats:
>   refs_may_alias_p: 76106130 disambiguations, 100928932 queries
>   ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
>   call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
>   nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
>   aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
>   TBAA oracle: 28248334 disambiguations 104710521 queries
>                15220245 are in alias set 0
>                8905994 queries asked about the same object
>                98 queries asked about the same alias set
>                0 access volatile
>                50371110 are dependent in the DAG
>                1964740 are artificially in conflict with void *
>
> Modref stats:
>   modref kill: 52 kills, 6655 queries
>   modref use: 25204 disambiguations, 692151 queries
>   modref clobber: 2309709 disambiguations, 21877806 queries
>   5320532 tbaa queries (0.243193 per modref query)
>   761785 base compares (0.034820 per modref query)
>
> PTA query stats:
>   pt_solution_includes: 12539931 disambiguations, 39864841 queries
>   pt_solutions_intersect: 1713075 disambiguations, 14023484 queries
>
> Newly we get stats for the kill oracle itself:
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
> and the modref part:
>   modref kill: 52 kills, 6655 queries
> So an improvement over the 1 kill using modref I had before.  Still not
> really great.
>
> Honza
>
> gcc/ChangeLog:
>
>             * ipa-modref-tree.c (modref_access_node::update_for_kills): New
>             member function.
>             (modref_access_node::merge_for_kills): Likewise.
>             (modref_access_node::insert_kill): Likewise.
>             * ipa-modref-tree.h (modref_access_node::update_for_kills,
>             modref_access_node::merge_for_kills, modref_access_node::insert_kill):
>             Declare.
>             (modref_access_node::useful_for_kill_p): New member function.
>             * ipa-modref.c (modref_summary::useful_p): Release useless kills.
>             (lto_modref_summary): Add kills.
>             (modref_summary::dump): Dump kills.
>             (record_access): Add modref_access_node parameter.
>             (record_access_lto): Likewise.
>             (merge_call_side_effects): Merge kills.
>             (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
>             (struct summary_ptrs): Add always_executed field.
>             (analyze_load): Update.
>             (analyze_store): Update; record kills.
>             (analyze_stmt): Add always_executed; record kills in clobbers.
>             (analyze_function): Track always_executed.
>             (modref_summaries::duplicate): Duplicate kills.
>             (update_signature): Release kills.
>             * ipa-modref.h (struct modref_summary): Add kills.
>             * tree-ssa-alias.c (alias_stats): Add kill stats.
>             (dump_alias_stats): Dump kill stats.
>             (store_kills_ref_p): Break out from ...
>             (stmt_kills_ref_p): Use it; handle modref info based kills.
>
>     gcc/testsuite/ChangeLog:
>
>     2021-11-14  Jan Hubicka  <hubicka@ucw.cz>
>
>             * gcc.dg/tree-ssa/modref-dse-3.c: New test.
>
>
> [...]
> @@ -932,13 +942,17 @@ bool
>  merge_call_side_effects (modref_summary *cur_summary,
>                          gimple *stmt, modref_summary *callee_summary,
>                          bool ignore_stores, cgraph_node *callee_node,
> -                        bool record_adjustments)
> +                        bool record_adjustments, bool always_executed)
>  {
>    auto_vec <modref_parm_map, 32> parm_map;
>    modref_parm_map chain_map;
>    bool changed = false;
>    int flags = gimple_call_flags (stmt);
>
> +  if ((flags & (ECF_CONST | ECF_NOVOPS))
> +      && !(flags & ECF_LOOPING_CONST_OR_PURE))
> +    return changed;
> +
>    if (!cur_summary->side_effects && callee_summary->side_effects)
>      {
>        if (dump_file)
> @@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
>    if (flags & (ECF_CONST | ECF_NOVOPS))
>      return changed;
>
> +  if (always_executed
> +      && callee_summary->kills.length ()
> +      && (!cfun->can_throw_non_call_exceptions
> +         || !stmt_could_throw_p (cfun, stmt)))
> +    {
> +      /* Watch for self recursive updates.  */
> +      auto_vec<modref_access_node, 32> saved_kills;
> +
> +      saved_kills.reserve_exact (callee_summary->kills.length ());
> +      saved_kills.splice (callee_summary->kills);
> +      for (auto kill : saved_kills)
> +       {
> +         if (kill.parm_index >= (int)parm_map.length ())
> +           continue;
> +         modref_parm_map &m
> +                 = kill.parm_index == MODREF_STATIC_CHAIN_PARM
> +                   ? chain_map
                     ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

> +                   : parm_map[kill.parm_index];
> +         if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
> +             || m.parm_index == MODREF_UNKNOWN_PARM
> +             || m.parm_index == MODREF_RETSLOT_PARM
> +             || !m.parm_offset_known)
> +           continue;
> +         modref_access_node n = kill;
> +         n.parm_index = m.parm_index;
> +         n.parm_offset += m.parm_offset;
> +         if (modref_access_node::insert_kill (cur_summary->kills, n,
> +                                              record_adjustments))
> +           changed = true;
> +       }
> +    }
> +
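
To make the flagged read concrete, here is a minimal C sketch of the
pattern (stand-in types and hypothetical names, not the actual GCC
structures): chain_map is a local that merge_call_side_effects only fills
in further down, next to the parm_map computation, so a kill loop placed
above that point reads an indeterminate value whenever a kill refers to
the static chain.

struct parm_map_sketch
{
  int parm_index;
  int parm_offset_known;
};

int
kill_translation_sketch (int kill_refers_to_static_chain)
{
  struct parm_map_sketch chain_map;		/* initialized only later */
  struct parm_map_sketch parm_map = { 1, 1 };	/* already computed */

  /* Mirrors the flagged ternary: an indeterminate read (undefined
     behavior) whenever the first arm is taken.  */
  struct parm_map_sketch m
    = kill_refers_to_static_chain ? chain_map : parm_map;
  return m.parm_index;
}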
  
Jeff Law Nov. 15, 2021, 7 p.m. UTC | #6
On 11/15/2021 11:51 AM, H.J. Lu via Gcc-patches wrote:
> On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> [...]
>> @@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
>>     if (flags & (ECF_CONST | ECF_NOVOPS))
>>       return changed;
>>
>> +  if (always_executed
>> +      && callee_summary->kills.length ()
>> +      && (!cfun->can_throw_non_call_exceptions
>> +         || !stmt_could_throw_p (cfun, stmt)))
>> +    {
>> +      /* Watch for self recursive updates.  */
>> +      auto_vec<modref_access_node, 32> saved_kills;
>> +
>> +      saved_kills.reserve_exact (callee_summary->kills.length ());
>> +      saved_kills.splice (callee_summary->kills);
>> +      for (auto kill : saved_kills)
>> +       {
>> +         if (kill.parm_index >= (int)parm_map.length ())
>> +           continue;
>> +         modref_parm_map &m
>> +                 = kill.parm_index == MODREF_STATIC_CHAIN_PARM
>> +                   ? chain_map
>                       ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
Yup.  It's causing heartburn in various ways in the tester.  I was just 
tracking it down with valgrind...
jeff
  
Jan Hubicka Nov. 15, 2021, 8:56 p.m. UTC | #7
> > > +  if (always_executed
> > > +      && callee_summary->kills.length ()
> > > +      && (!cfun->can_throw_non_call_exceptions
> > > +         || !stmt_could_throw_p (cfun, stmt)))
> > > +    {
> > > +      /* Watch for self recursive updates.  */
> > > +      auto_vec<modref_access_node, 32> saved_kills;
> > > +
> > > +      saved_kills.reserve_exact (callee_summary->kills.length ());
> > > +      saved_kills.splice (callee_summary->kills);
> > > +      for (auto kill : saved_kills)
> > > +       {
> > > +         if (kill.parm_index >= (int)parm_map.length ())
> > > +           continue;
> > > +         modref_parm_map &m
> > > +                 = kill.parm_index == MODREF_STATIC_CHAIN_PARM
> > > +                   ? chain_map
> >                       ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.
> > 
> > This caused:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
> Yup.  It's causing heartburn in various ways in the tester.  I was just
> tracking it down with valgrind...
> jeff

Oops, either I or the patch must have mislocated the change within the
function when updating to the new tree.  I am testing the following fix and
will cook up a testcase verifying that merging of kills works as
expected.

Thanks!
Honza

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index df4612bbff9..4784f68f585 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -964,38 +980,6 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (flags & (ECF_CONST | ECF_NOVOPS))
     return changed;
 
-  if (always_executed
-      && callee_summary->kills.length ()
-      && (!cfun->can_throw_non_call_exceptions
-	  || !stmt_could_throw_p (cfun, stmt)))
-    {
-      /* Watch for self recursive updates.  */
-      auto_vec<modref_access_node, 32> saved_kills;
-
-      saved_kills.reserve_exact (callee_summary->kills.length ());
-      saved_kills.splice (callee_summary->kills);
-      for (auto kill : saved_kills)
-	{
-	  if (kill.parm_index >= (int)parm_map.length ())
-	    continue;
-	  modref_parm_map &m
-		  = kill.parm_index == MODREF_STATIC_CHAIN_PARM
-		    ? chain_map
-		    : parm_map[kill.parm_index];
-	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
-	      || m.parm_index == MODREF_UNKNOWN_PARM
-	      || m.parm_index == MODREF_RETSLOT_PARM
-	      || !m.parm_offset_known)
-	    continue;
-	  modref_access_node n = kill;
-	  n.parm_index = m.parm_index;
-	  n.parm_offset += m.parm_offset;
-	  if (modref_access_node::insert_kill (cur_summary->kills, n,
-					       record_adjustments))
-	    changed = true;
-	}
-    }
-
   /* We can not safely optimize based on summary of callee if it does
      not always bind to current def: it is possible that memory load
      was optimized out earlier which may not happen in the interposed
@@ -1043,6 +1027,38 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (dump_file)
     fprintf (dump_file, "\n");
 
+  if (always_executed
+      && callee_summary->kills.length ()
+      && (!cfun->can_throw_non_call_exceptions
+	  || !stmt_could_throw_p (cfun, stmt)))
+    {
+      /* Watch for self recursive updates.  */
+      auto_vec<modref_access_node, 32> saved_kills;
+
+      saved_kills.reserve_exact (callee_summary->kills.length ());
+      saved_kills.splice (callee_summary->kills);
+      for (auto kill : saved_kills)
+	{
+	  if (kill.parm_index >= (int)parm_map.length ())
+	    continue;
+	  modref_parm_map &m
+		  = kill.parm_index == MODREF_STATIC_CHAIN_PARM
+		    ? chain_map
+		    : parm_map[kill.parm_index];
+	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+	      || m.parm_index == MODREF_UNKNOWN_PARM
+	      || m.parm_index == MODREF_RETSLOT_PARM
+	      || !m.parm_offset_known)
+	    continue;
+	  modref_access_node n = kill;
+	  n.parm_index = m.parm_index;
+	  n.parm_offset += m.parm_offset;
+	  if (modref_access_node::insert_kill (cur_summary->kills, n,
+					       record_adjustments))
+	    changed = true;
+	}
+    }
+
   /* Merge with callee's summary.  */
   changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map,
 					&chain_map, record_adjustments);
  
Jan Hubicka Nov. 16, 2021, 8:19 a.m. UTC | #8
>                      ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.
> 
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
> 

Hi,
this is the patch I committed that moves the misplaced hunk.

gcc/ChangeLog:

	PR ipa/103262
	* ipa-modref.c (merge_call_side_effects): Fix uninitialized
	access.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/modref-dse-5.c: New test.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index df4612bbff9..4784f68f585 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -964,38 +980,6 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (flags & (ECF_CONST | ECF_NOVOPS))
     return changed;
 
-  if (always_executed
-      && callee_summary->kills.length ()
-      && (!cfun->can_throw_non_call_exceptions
-	  || !stmt_could_throw_p (cfun, stmt)))
-    {
-      /* Watch for self recursive updates.  */
-      auto_vec<modref_access_node, 32> saved_kills;
-
-      saved_kills.reserve_exact (callee_summary->kills.length ());
-      saved_kills.splice (callee_summary->kills);
-      for (auto kill : saved_kills)
-	{
-	  if (kill.parm_index >= (int)parm_map.length ())
-	    continue;
-	  modref_parm_map &m
-		  = kill.parm_index == MODREF_STATIC_CHAIN_PARM
-		    ? chain_map
-		    : parm_map[kill.parm_index];
-	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
-	      || m.parm_index == MODREF_UNKNOWN_PARM
-	      || m.parm_index == MODREF_RETSLOT_PARM
-	      || !m.parm_offset_known)
-	    continue;
-	  modref_access_node n = kill;
-	  n.parm_index = m.parm_index;
-	  n.parm_offset += m.parm_offset;
-	  if (modref_access_node::insert_kill (cur_summary->kills, n,
-					       record_adjustments))
-	    changed = true;
-	}
-    }
-
   /* We can not safely optimize based on summary of callee if it does
      not always bind to current def: it is possible that memory load
      was optimized out earlier which may not happen in the interposed
@@ -1043,6 +1027,38 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (dump_file)
     fprintf (dump_file, "\n");
 
+  if (always_executed
+      && callee_summary->kills.length ()
+      && (!cfun->can_throw_non_call_exceptions
+	  || !stmt_could_throw_p (cfun, stmt)))
+    {
+      /* Watch for self recursive updates.  */
+      auto_vec<modref_access_node, 32> saved_kills;
+
+      saved_kills.reserve_exact (callee_summary->kills.length ());
+      saved_kills.splice (callee_summary->kills);
+      for (auto kill : saved_kills)
+	{
+	  if (kill.parm_index >= (int)parm_map.length ())
+	    continue;
+	  modref_parm_map &m
+		  = kill.parm_index == MODREF_STATIC_CHAIN_PARM
+		    ? chain_map
+		    : parm_map[kill.parm_index];
+	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+	      || m.parm_index == MODREF_UNKNOWN_PARM
+	      || m.parm_index == MODREF_RETSLOT_PARM
+	      || !m.parm_offset_known)
+	    continue;
+	  modref_access_node n = kill;
+	  n.parm_index = m.parm_index;
+	  n.parm_offset += m.parm_offset;
+	  if (modref_access_node::insert_kill (cur_summary->kills, n,
+					       record_adjustments))
+	    changed = true;
+	}
+    }
+
   /* Merge with callee's summary.  */
   changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map,
 					&chain_map, record_adjustments);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c
new file mode 100644
index 00000000000..ad35b70136f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-5.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse2-details"  } */
+struct a {int a,b,c;};
+__attribute__ ((noinline))
+void
+kill_me (struct a *a)
+{
+  a->a=0;
+  a->b=0;
+  a->c=0;
+}
+__attribute__ ((noinline))
+int
+wrap(int b, struct a *a)
+{
+   kill_me (a);
+   return b;
+}
+__attribute__ ((noinline))
+void
+my_pleasure (struct a *a)
+{
+  a->a=1;
+  a->c=2;
+}
+__attribute__ ((noinline))
+int
+wrap2(int b, struct a *a)
+{
+   my_pleasure (a);
+   return b;
+}
+
+int
+set (struct a *a)
+{
+  wrap (0, a);
+  int ret = wrap2 (0, a);
+  //int ret = my_pleasure (a);
+  a->b=1;
+  return ret;
+}
+/* { dg-final { scan-tree-dump "Deleted dead store: wrap" "dse2" } } */
  

Patch

diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 17ff6bb582c..6f8caa331a6 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -120,6 +120,8 @@  static struct {
   unsigned HOST_WIDE_INT modref_use_no_alias;
   unsigned HOST_WIDE_INT modref_clobber_may_alias;
   unsigned HOST_WIDE_INT modref_clobber_no_alias;
+  unsigned HOST_WIDE_INT modref_kill_no;
+  unsigned HOST_WIDE_INT modref_kill_yes;
   unsigned HOST_WIDE_INT modref_tests;
   unsigned HOST_WIDE_INT modref_baseptr_tests;
 } alias_stats;
@@ -169,6 +171,12 @@  dump_alias_stats (FILE *s)
 	   + alias_stats.aliasing_component_refs_p_may_alias);
   dump_alias_stats_in_alias_c (s);
   fprintf (s, "\nModref stats:\n");
+  fprintf (s, "  modref kill: "
+	   HOST_WIDE_INT_PRINT_DEC" kills, "
+	   HOST_WIDE_INT_PRINT_DEC" queries\n",
+	   alias_stats.modref_kill_yes,
+	   alias_stats.modref_kill_yes
+	   + alias_stats.modref_kill_no);
   fprintf (s, "  modref use: "
 	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
 	   HOST_WIDE_INT_PRINT_DEC" queries\n",
@@ -3373,6 +3381,107 @@  stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
   if (is_gimple_call (stmt))
     {
       tree callee = gimple_call_fndecl (stmt);
+      struct cgraph_node *node;
+      modref_summary *summary;
+
+      /* Try to disambiguate using modref summary.  Modref records a vector
+	 of stores with known offsets relative to function parameters that
+	 must happen on every execution of the function.  Find out if we have
+	 a matching store and verify that the function can not use the value.  */
+      if (callee != NULL_TREE
+	  && (node = cgraph_node::get (callee)) != NULL
+	  && node->binds_to_current_def_p ()
+	  && (summary = get_modref_function_summary (node)) != NULL
+	  && summary->kills.length ())
+	{
+	  tree base = ao_ref_base (ref);
+	  for (unsigned int i = 0; i < summary->kills.length (); i++)
+	    {
+	      modref_access_node &a = summary->kills[i];
+	      tree op;
+	      poly_offset_int off1_adj = 0, off2_adj = 0;
+	      poly_int64 off1, off2;
+	      tree base_ptr = NULL;
+	      tree base_decl = NULL;
+
+	      if (a.parm_index >= 0)
+		op = gimple_call_arg (stmt, a.parm_index);
+	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
+		op = gimple_call_chain (stmt);
+	      else
+		gcc_unreachable ();
+
+	      off2_adj += a.parm_offset * BITS_PER_UNIT;
+	      if (!(off2_adj + a.offset).to_shwi (&off2))
+		continue;
+	      if (TREE_CODE (base) == MEM_REF)
+		{
+		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
+		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
+		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
+		  else
+		    base_ptr = TREE_OPERAND (base, 0);
+		}
+	      /* Give up on TMRs for now.  */
+	      else if (TREE_CODE (base) == TARGET_MEM_REF)
+		break;
+	      else
+		base_decl = base;
+
+	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
+	      gcc_checking_assert (!base_ptr
+				   || TREE_CODE (base_ptr) == SSA_NAME);
+
+	      /* OP is a pointer and we have access range from its
+		 dereference.  */
+	      if (TREE_CODE (op) == ADDR_EXPR)
+		{
+		  poly_int64 size, offset, max_size;
+		  bool reverse;
+		  tree op_base = get_ref_base_and_extent
+			  (TREE_OPERAND (op, 0), &offset, &size,
+			   &max_size, &reverse);
+		  if (!known_size_p (size) || !known_eq (size, max_size))
+		    continue;
+		  off2_adj += offset;
+		  /* &str->foo are not passed as gimple operands directly,
+		     would need to look up the def stmt.  */
+		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
+		  if (!base_decl
+		      || compare_base_decls (op_base, base_decl) != 1)
+		    continue;
+		}
+	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
+		continue;
+
+	      if (!(off1_adj + ref->offset).to_shwi (&off1))
+		continue;
+	      if (!(off2_adj + a.offset).to_shwi (&off2))
+		continue;
+
+	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
+		  && dbg_cnt (ipa_mod_ref))
+		{
+		  /* For store to be killed it needs to not be used earlier.  */
+		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
+						  true))
+		    break;
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    {
+		      fprintf (dump_file,
+			       "ipa-modref: call stmt ");
+		      print_gimple_stmt (dump_file, stmt, 0);
+		      fprintf (dump_file,
+			       "ipa-modref: call to %s kills ",
+			       node->dump_name ());
+		      print_generic_expr (dump_file, ref->base);
+		    }
+		  ++alias_stats.modref_kill_yes;
+		  return true;
+		}
+	    }
+	  ++alias_stats.modref_kill_no;
+	}
       if (callee != NULL_TREE
 	  && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 	switch (DECL_FUNCTION_CODE (callee))
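
Once the offset folding in the new stmt_kills_ref_p code succeeds, the
decision reduces to a plain subrange test; a sketch in C (helper name made
up, offsets assumed already folded to constant bit offsets):

    #include <stdbool.h>

    /* The earlier store to [ref_off, ref_off + ref_size) is killed when
       it lies entirely inside the callee's killed range
       [kill_off, kill_off + kill_size) -- the job of known_subrange_p --
       and ref_maybe_used_by_call_p_1 verified the callee does not read
       the location first.  */

    static bool
    store_killed_p (long ref_off, long ref_size,
                    long kill_off, long kill_size)
    {
      return kill_off <= ref_off
             && ref_off + ref_size <= kill_off + kill_size;
    }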
diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index 3e213b23d79..dc3b910801d 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -287,6 +287,55 @@  struct GTY(()) modref_access_node
 	      size, max_size, record_adjustments);
       return true;
     }
+  /* Merge in access A if it is possible to do without losing
+     precision.  Return true if successful.
+     Unlike merge, assume that both accesses are always executed
+     and merge size the same way as max_size.  */
+  bool merge_for_kills (const modref_access_node &a)
+    {
+      poly_int64 offset1 = 0;
+      poly_int64 aoffset1 = 0;
+      poly_int64 new_parm_offset = 0;
+
+      /* We assume that containment was tested earlier.  */
+      gcc_checking_assert (!contains (a) && !a.contains (*this));
+      /* For kills we always need to know the parameter.  */
+      gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
+			   && a.parm_index != MODREF_UNKNOWN_PARM);
+      if (parm_index != a.parm_index)
+	return false;
+      /* Unknown offsets are useless.  */
+      gcc_checking_assert (parm_offset_known && a.parm_offset_known);
+      if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
+	return false;
+      /* Ranges must be known and sizes must match max sizes.  */
+      gcc_checking_assert (range_info_useful_p ()
+			  && a.range_info_useful_p ()
+			  && known_size_p (size) && known_eq (size, max_size)
+			  && known_size_p (a.size)
+			  && known_eq (a.size, a.max_size));
+      if (known_le (offset1, aoffset1))
+	{
+	  if (!known_size_p (max_size)
+	      || known_ge (offset1 + max_size, aoffset1))
+	    {
+	      update_for_kills (new_parm_offset, offset1, max_size,
+				aoffset1, a.max_size);
+	      return true;
+	    }
+	}
+      else if (known_le (aoffset1, offset1))
+	{
+	  if (!known_size_p (a.max_size)
+	      || known_ge (aoffset1 + a.max_size, offset1))
+	    {
+	      update_for_kills (new_parm_offset, offset1, max_size,
+				aoffset1, a.max_size);
+	      return true;
+	    }
+	}
+      return false;
+    }
   /* Return true if A1 and B1 can be merged with less information loss
      than A2 and B2.
      Assume that no containment or lossless merging is possible.  */
@@ -430,6 +479,33 @@  private:
       update (parm_offset1, offset1,
 	      new_size, new_max_size, record_adjustments);
     }
+  /* Merge two ranges both starting at parm_offset1 and update THIS
+     with result.  */
+  void update_for_kills (poly_int64 parm_offset1,
+			 poly_int64 offset1,
+			 poly_int64 max_size1,
+			 poly_int64 offset2,
+			 poly_int64 max_size2)
+    {
+      if (known_le (offset1, offset2))
+	;
+      else if (known_le (offset2, offset1))
+	{
+	  std::swap (offset1, offset2);
+	  std::swap (max_size1, max_size2);
+	}
+      else
+	gcc_unreachable ();
+
+      poly_int64 new_max_size = max_size2 + offset2 - offset1;
+      if (known_le (new_max_size, max_size1))
+	new_max_size = max_size1;
+
+      parm_offset = parm_offset1;
+      offset = offset1;
+      max_size = new_max_size;
+      size = new_max_size;
+    }
   /* Given access nodes THIS and A, return true if they
      can be done with common parm_offsets.  In this case
      return parm offset in new_parm_offset, new_offset
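
In effect update_for_kills computes the union of two overlapping or
touching ranges; a sketch with plain integers (assumes off1 <= off2, as
the swap guarantees, and that merge_for_kills already checked the ranges
overlap or touch):

    static void
    union_ranges (long off1, long size1, long off2, long size2,
                  long *res_off, long *res_size)
    {
      long new_size = off2 - off1 + size2;  /* extent measured from off1 */
      if (new_size < size1)
        new_size = size1;                   /* first range reaches further */
      *res_off = off1;
      *res_size = new_size;
    }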
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index f8b7b900527..d9859daa18e 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -334,6 +334,8 @@  modref_summary::useful_p (int ecf_flags, bool check_flags)
     return false;
   if (loads && !loads->every_base)
     return true;
+  else
+    kills.release ();
   if (ecf_flags & ECF_PURE)
     return false;
   return stores && !stores->every_base;
@@ -370,6 +372,7 @@  struct GTY(()) modref_summary_lto
      more verbose and thus more likely to hit the limits.  */
   modref_records_lto *loads;
   modref_records_lto *stores;
+  auto_vec<modref_access_node> GTY((skip)) kills;
   auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
   eaf_flags_t retslot_flags;
   eaf_flags_t static_chain_flags;
@@ -617,6 +620,12 @@  modref_summary::dump (FILE *out)
       fprintf (out, "  stores:\n");
       dump_records (stores, out);
     }
+  if (kills.length ())
+    {
+      fprintf (out, "  kills:\n");
+      for (unsigned int i = 0; i < kills.length (); i++)
+	dump_access (&kills[i], out);
+    }
   if (writes_errno)
     fprintf (out, "  Writes errno\n");
   if (side_effects)
@@ -750,13 +759,12 @@  get_access (ao_ref *ref)
 /* Record access into the modref_records data structure.  */
 
 static void
-record_access (modref_records *tt, ao_ref *ref)
+record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
 {
   alias_set_type base_set = !flag_strict_aliasing ? 0
 			    : ao_ref_base_alias_set (ref);
   alias_set_type ref_set = !flag_strict_aliasing ? 0
 			    : (ao_ref_alias_set (ref));
-  modref_access_node a = get_access (ref);
   if (dump_file)
     {
        fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
@@ -769,7 +777,7 @@  record_access (modref_records *tt, ao_ref *ref)
 /* IPA version of record_access_tree.  */
 
 static void
-record_access_lto (modref_records_lto *tt, ao_ref *ref)
+record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
 {
   /* get_alias_set sometimes use different type to compute the alias set
      than TREE_TYPE (base).  Do same adjustments.  */
@@ -816,7 +824,6 @@  record_access_lto (modref_records_lto *tt, ao_ref *ref)
 		       || variably_modified_type_p (ref_type, NULL_TREE)))
 	ref_type = NULL_TREE;
     }
-  modref_access_node a = get_access (ref);
   if (dump_file)
     {
       fprintf (dump_file, "   - Recording base type:");
@@ -910,6 +917,82 @@  parm_map_for_arg (tree op)
   return parm_map;
 }
 
+/* Insert new kill A into KILLS.  */
+
+static void
+add_kill (vec<modref_access_node> &kills, modref_access_node &a)
+{
+  size_t index;
+  modref_access_node *a2;
+  bool merge = false;
+
+  /* See if we have a corresponding entry already or we can merge with
+     neighbouring entry.  */
+  FOR_EACH_VEC_ELT (kills, index, a2)
+    {
+      if (a2->contains (a))
+	return;
+      if (a.contains (*a2))
+	{
+	  *a2 = a;
+	  merge = true;
+	  break;
+	}
+      if (a2->merge_for_kills (a))
+	{
+	  merge = true;
+	  break;
+	}
+    }
+  /* If entry was not found, insert it.  */
+  if (!merge)
+    {
+      if ((int)kills.length () >= param_modref_max_accesses)
+	{
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "--param param=modref-max-accesses limit reached:");
+	  return;
+	}
+      kills.safe_push (a);
+      return;
+    }
+  /* Extending range in an entry may make it possible to merge it with
+     other entries.  */
+    {
+      size_t i;
+
+      for (i = 0; i < kills.length ();)
+	if (i != index)
+	  {
+	    bool found = false, restart = false;
+	    modref_access_node *a = &kills[i];
+	    modref_access_node *n = &kills[index];
+
+	    if (n->contains (*a))
+	      found = true;
+	    if (!found && n->merge_for_kills (*a))
+	      found = restart = true;
+	    gcc_checking_assert (found || !a->merge_for_kills (*n));
+	    if (found)
+	      {
+		kills.unordered_remove (i);
+		if (index == kills.length ())
+		  {
+		    index = i;
+		    i++;
+		  }
+		if (restart)
+		  i = 0;
+	      }
+	    else
+	      i++;
+	  }
+	else
+	  i++;
+    }
+}
+
 /* Merge side effects of call STMT to function with CALLEE_SUMMARY
   into CUR_SUMMARY.  Return true if something changed.
    If IGNORE_STORES is true, do not merge stores.
@@ -920,7 +1003,7 @@  bool
 merge_call_side_effects (modref_summary *cur_summary,
 			 gimple *stmt, modref_summary *callee_summary,
 			 bool ignore_stores, cgraph_node *callee_node,
-			 bool record_adjustments)
+			 bool record_adjustments, bool always_executed)
 {
   auto_vec <modref_parm_map, 32> parm_map;
   modref_parm_map chain_map;
@@ -988,6 +1071,29 @@  merge_call_side_effects (modref_summary *cur_summary,
 	  changed = true;
 	}
     }
+  if (always_executed)
+    {
+      size_t i;
+      modref_access_node *a2;
+      FOR_EACH_VEC_ELT (callee_summary->kills, i, a2)
+	{
+	  modref_access_node n = *a2;
+	  if (n.parm_index >= (int)parm_map.length ())
+	    continue;
+	  modref_parm_map &m
+		  = n.parm_index == MODREF_STATIC_CHAIN_PARM
+		    ? chain_map
+		    : parm_map[n.parm_index];
+	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+	      || m.parm_index == MODREF_UNKNOWN_PARM
+	      || m.parm_index == MODREF_RETSLOT_PARM
+	      || !m.parm_offset_known)
+	    continue;
+	  n.parm_index = m.parm_index;
+	  n.parm_offset += m.parm_offset;
+	  add_kill (cur_summary->kills, n);
+	}
+    }
   if (!cur_summary->side_effects
       && callee_summary->side_effects)
     {
@@ -1198,7 +1304,8 @@  process_fnspec (modref_summary *cur_summary,
 
 static bool
 analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
-	      gcall *stmt, vec <gimple *> *recursive_calls)
+	      gcall *stmt, vec <gimple *> *recursive_calls,
+	      bool always_executed)
 {
   /* Check flags on the function call.  In certain cases, analysis can be
      simplified.  */
@@ -1284,7 +1391,7 @@  analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
     }
 
   merge_call_side_effects (cur_summary, stmt, callee_summary, ignore_stores,
-			   callee_node, false);
+			   callee_node, false, always_executed);
 
   return true;
 }
@@ -1295,6 +1402,7 @@  struct summary_ptrs
 {
   struct modref_summary *nolto;
   struct modref_summary_lto *lto;
+  bool always_executed;
 };
 
 /* Helper for analyze_stmt.  */
@@ -1329,11 +1437,12 @@  analyze_load (gimple *, tree, tree op, void *data)
 
   ao_ref r;
   ao_ref_init (&r, op);
+  modref_access_node a = get_access (&r);
 
   if (summary)
-    record_access (summary->loads, &r);
+    record_access (summary->loads, &r, a);
   if (summary_lto)
-    record_access_lto (summary_lto->loads, &r);
+    record_access_lto (summary_lto->loads, &r, a);
   return false;
 }
 
@@ -1369,11 +1478,24 @@  analyze_store (gimple *, tree, tree op, void *data)
 
   ao_ref r;
   ao_ref_init (&r, op);
+  modref_access_node a = get_access (&r);
 
   if (summary)
-    record_access (summary->stores, &r);
+    record_access (summary->stores, &r, a);
   if (summary_lto)
-    record_access_lto (summary_lto->stores, &r);
+    record_access_lto (summary_lto->stores, &r, a);
+  if (summary
+      && ((summary_ptrs *)data)->always_executed
+      && a.parm_offset_known == true
+      && a.parm_index != MODREF_UNKNOWN_PARM
+      && a.parm_index != MODREF_RETSLOT_PARM
+      && known_size_p (r.size)
+      && known_eq (r.max_size, r.size))
+    {
+      if (dump_file)
+	fprintf (dump_file, "   - Recording kill\n");
+      add_kill (summary->kills, a);
+    }
   return false;
 }
 
@@ -1382,16 +1504,36 @@  analyze_store (gimple *, tree, tree op, void *data)
 
 static bool
 analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
-	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls)
+	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls,
+	      bool always_executed)
 {
   /* In general we can not ignore clobbers because they are barriers for code
      motion, however after inlining it is safe to do because local optimization
      passes do not consider clobbers from other functions.
      Similar logic is in ipa-pure-const.c.  */
   if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
-    return true;
+    {
+      if (summary
+	  && always_executed && record_access_p (gimple_assign_lhs (stmt)))
+	{
+	  ao_ref r;
+	  ao_ref_init (&r, gimple_assign_lhs (stmt));
+	  modref_access_node a = get_access (&r);
+	  if (a.parm_offset_known == true
+	      && a.parm_index != MODREF_UNKNOWN_PARM
+	      && a.parm_index != MODREF_RETSLOT_PARM
+	      && known_size_p (r.size)
+	      && known_eq (r.max_size, r.size))
+	    {
+	      if (dump_file)
+		fprintf (dump_file, "   - Recording kill\n");
+	      add_kill (summary->kills, a);
+	    }
+	}
+      return true;
+    }
 
-  struct summary_ptrs sums = {summary, summary_lto};
+  struct summary_ptrs sums = {summary, summary_lto, always_executed};
 
   /* Analyze all loads and stores in STMT.  */
   walk_stmt_load_store_ops (stmt, &sums,
@@ -1420,7 +1562,8 @@  analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
    case GIMPLE_CALL:
      if (!ipa || gimple_call_internal_p (stmt))
        return analyze_call (summary, summary_lto,
-			    as_a <gcall *> (stmt), recursive_calls);
+			    as_a <gcall *> (stmt), recursive_calls,
+			    always_executed);
      else
       {
 	attr_fnspec fnspec = gimple_call_fnspec (as_a <gcall *>(stmt));
@@ -2729,11 +2872,15 @@  analyze_function (function *f, bool ipa)
   FOR_EACH_BB_FN (bb, f)
     {
       gimple_stmt_iterator si;
+      bool always_executed
+	      = bb == single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->dest;
+
       for (si = gsi_start_nondebug_after_labels_bb (bb);
 	   !gsi_end_p (si); gsi_next_nondebug (&si))
 	{
 	  if (!analyze_stmt (summary, summary_lto,
-			     gsi_stmt (si), ipa, &recursive_calls)
+			     gsi_stmt (si), ipa, &recursive_calls,
+			     always_executed)
 	      || ((!summary || !summary->useful_p (ecf_flags, false))
 		  && (!summary_lto
 		      || !summary_lto->useful_p (ecf_flags, false))))
@@ -2742,6 +2889,9 @@  analyze_function (function *f, bool ipa)
 	      collapse_stores (summary, summary_lto);
 	      break;
 	    }
+	  if (always_executed
+	      && stmt_can_throw_external (cfun, gsi_stmt (si)))
+	    always_executed = false;
 	}
     }
 
@@ -2761,7 +2911,7 @@  analyze_function (function *f, bool ipa)
 			   ignore_stores_p (current_function_decl,
 					    gimple_call_flags
 						 (recursive_calls[i])),
-			   fnode, !first);
+			   fnode, !first, false);
 	      if (!summary->useful_p (ecf_flags, false))
 		{
 		  remove_summary (lto, nolto, ipa);
@@ -2993,6 +3143,8 @@  modref_summaries::duplicate (cgraph_node *, cgraph_node *dst,
 			 src_data->loads->max_refs,
 			 src_data->loads->max_accesses);
   dst_data->loads->copy_from (src_data->loads);
+  dst_data->kills.reserve_exact (src_data->kills.length ());
+  dst_data->kills.splice (src_data->kills);
   dst_data->writes_errno = src_data->writes_errno;
   dst_data->side_effects = src_data->side_effects;
   if (src_data->arg_flags.length ())
@@ -3640,6 +3792,8 @@  update_signature (struct cgraph_node *node)
     {
       r->loads->remap_params (&map);
       r->stores->remap_params (&map);
+      /* TODO: Once we do IPA kills analysis, update the table here.  */
+      r->kills.release ();
       if (r->arg_flags.length ())
 	remap_arg_flags (r->arg_flags, info);
     }
@@ -3647,6 +3801,8 @@  update_signature (struct cgraph_node *node)
     {
       r_lto->loads->remap_params (&map);
       r_lto->stores->remap_params (&map);
+      /* TODO: Once we do IPA kills analysis, update the table here.  */
+      r_lto->kills.release ();
       if (r_lto->arg_flags.length ())
 	remap_arg_flags (r_lto->arg_flags, info);
     }
diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
index 49c99f263a7..b77c1aa7400 100644
--- a/gcc/ipa-modref.h
+++ b/gcc/ipa-modref.h
@@ -30,6 +30,7 @@  struct GTY(()) modref_summary
   /* Load and stores in function (transitively closed to all callees)  */
   modref_records *loads;
   modref_records *stores;
+  auto_vec<modref_access_node> GTY((skip)) kills;
   auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
   eaf_flags_t retslot_flags;
   eaf_flags_t static_chain_flags;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
new file mode 100644
index 00000000000..2a23a9e8ccb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1"  } */
+__attribute__ ((noinline))
+void write (int *a)
+{
+	*a=1;
+	a[1]=2;
+}
+int test ()
+{
+	int a;
+	a=2;
+	write (&a);
+	return a;
+}
+int test2 (int *a)
+{
+	*a=2;
+	write (a);
+	return *a;
+}
+/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1"} } */