[v2,1/5] fix up compute_objsize: move bndrng into access_data

Message ID c9b3a8f9-303f-5265-b199-13088198c131@gmail.com
State New
Headers
Series [v2,1/5] fix up compute_objsize: move bndrng into access_data |

Commit Message

Martin Sebor Dec. 6, 2021, 5:31 p.m. UTC
  Attached is the subset of the patch in part (1) below:  Move
bndrng from access_ref to access_data.

On 12/3/21 5:00 PM, Jeff Law wrote:
> 
> 
> On 11/8/2021 7:34 PM, Martin Sebor via Gcc-patches wrote:
>> The pointer-query code that implements compute_objsize() that's
>> in turn used by most middle end access warnings now has a few
>> warts in it and (at least) one bug.  With the exception of
>> the bug the warts aren't behind any user-visible bugs that
>> I know of but they do cause problems in new code I've been
>> implementing on top of it.  Besides fixing the one bug (just
>> a typo) the attached patch cleans up these latent issues:
>>
>> 1) It moves the bndrng member from the access_ref class to
>>    access_data.  As a FIXME in the code notes, the member never
>>    did belong in the former and only takes up space in the cache.
>>
>> 2) The compute_objsize_r() function is big, unwieldy, and tedious
>>    to step through because of all the if statements that are better
>>    coded as one switch statement.  This change factors out more
>>    of its code into smaller handler functions as has been suggested
>>    and done a few times before.
>>
>> 3) (2) exposed a few places where I fail to pass the current
>>    GIMPLE statement down to ranger.  This leads to worse quality
>>    range info, including possible false positives and negatives.
>>    I just spotted these problems in code review but I haven't
>>    taken the time to come up with test cases.  This change fixes
>>    these oversights as well.
>>
>> 4) The handling of PHI statements is also in one big, hard-to-
>>    follow function.  This change moves the handling of each PHI
>>    argument into its own handler which merges it into the previous
>>    argument.  This makes the code easier to work with and opens it
>>    to reuse also for MIN_EXPR and MAX_EXPR.  (This is primarily
>>    used to print informational notes after warnings.)
>>
>> 5) Finally, the patch factors code to dump each access_ref
>>    cached by the pointer_query cache out of pointer_query::dump
>>    and into access_ref::dump.  This helps with debugging.
>>
>> These changes should have no user-visible effect and other than
>> a regression test for the typo (PR 103143) come with no tests.
>> They've been tested on x86_64-linux.
> Sigh.  You've identified 6 distinct changes above.  The 5 you've 
> enumerated plus a typo fix somewhere.  There's no reason why they need 
> to be a single patch and many reasons why they should be a series of 
> independent patches.    Combining them into a single patch isn't how we 
> do things and it hides the actual bugfix in here.
> 
> Please send a fix for the typo first since that should be able to 
> trivially go forward.  Then  a patch for item #1.  That should be 
> trivial to review when it's pulled out from teh rest of the patch. 
> Beyond that, your choice on ordering, but you need to break this down.
> 
> 
> 
> 
> Jeff
>
  

Comments

Jeff Law Dec. 8, 2021, 6:47 p.m. UTC | #1
On 12/6/2021 10:31 AM, Martin Sebor wrote:
> Attached is the subset of the patch in part (1) below:  Move
> bndrng from access_ref to access_data.
>
> On 12/3/21 5:00 PM, Jeff Law wrote:
>>
>>
>> On 11/8/2021 7:34 PM, Martin Sebor via Gcc-patches wrote:
>>> The pointer-query code that implements compute_objsize() that's
>>> in turn used by most middle end access warnings now has a few
>>> warts in it and (at least) one bug.  With the exception of
>>> the bug the warts aren't behind any user-visible bugs that
>>> I know of but they do cause problems in new code I've been
>>> implementing on top of it.  Besides fixing the one bug (just
>>> a typo) the attached patch cleans up these latent issues:
>>>
>>> 1) It moves the bndrng member from the access_ref class to
>>>    access_data.  As a FIXME in the code notes, the member never
>>>    did belong in the former and only takes up space in the cache.
>>>
>>> 2) The compute_objsize_r() function is big, unwieldy, and tedious
>>>    to step through because of all the if statements that are better
>>>    coded as one switch statement.  This change factors out more
>>>    of its code into smaller handler functions as has been suggested
>>>    and done a few times before.
>>>
>>> 3) (2) exposed a few places where I fail to pass the current
>>>    GIMPLE statement down to ranger.  This leads to worse quality
>>>    range info, including possible false positives and negatives.
>>>    I just spotted these problems in code review but I haven't
>>>    taken the time to come up with test cases.  This change fixes
>>>    these oversights as well.
>>>
>>> 4) The handling of PHI statements is also in one big, hard-to-
>>>    follow function.  This change moves the handling of each PHI
>>>    argument into its own handler which merges it into the previous
>>>    argument.  This makes the code easier to work with and opens it
>>>    to reuse also for MIN_EXPR and MAX_EXPR.  (This is primarily
>>>    used to print informational notes after warnings.)
>>>
>>> 5) Finally, the patch factors code to dump each access_ref
>>>    cached by the pointer_query cache out of pointer_query::dump
>>>    and into access_ref::dump.  This helps with debugging.
>>>
>>> These changes should have no user-visible effect and other than
>>> a regression test for the typo (PR 103143) come with no tests.
>>> They've been tested on x86_64-linux.
>> Sigh.  You've identified 6 distinct changes above.  The 5 you've 
>> enumerated plus a typo fix somewhere.  There's no reason why they 
>> need to be a single patch and many reasons why they should be a 
>> series of independent patches.    Combining them into a single patch 
>> isn't how we do things and it hides the actual bugfix in here.
>>
>> Please send a fix for the typo first since that should be able to 
>> trivially go forward.  Then  a patch for item #1.  That should be 
>> trivial to review when it's pulled out from teh rest of the patch. 
>> Beyond that, your choice on ordering, but you need to break this down.
>>
>>
>>
>>
>> Jeff
>>
>
>
> gcc-pointer_query-refactor-1.diff
>
> commit 1062c1154beeeae26e86f053946ea733ffb5d136
> Author: Martin Sebor <msebor@redhat.com>
> Date:   Sat Dec 4 16:46:17 2021 -0700
>
>      Move bndrng from access_ref to access_data.
>      
>      gcc/ChangeLog:
>      
>              * gimple-ssa-warn-access.cc (check_access): Adjust to member name
>              change.
>              (pass_waccess::check_strncmp): Same.
>              * pointer-query.cc (access_ref::access_ref): Remove arguments.
>              Simpilfy.
>              (access_data::access_data): Define new ctors.
>              (access_data::set_bound): Define new member function.
>              (compute_objsize_r): Remove unnecessary code.
>              * pointer-query.h (struct access_ref): Remove ctor arguments.
>              (struct access_data): Declare ctor overloads.
>              (access_data::dst_bndrng): New member.
>              (access_data::src_bndrng): New member.
OK.  Thanks for breaking this out.

jeff
  

Patch

commit 1062c1154beeeae26e86f053946ea733ffb5d136
Author: Martin Sebor <msebor@redhat.com>
Date:   Sat Dec 4 16:46:17 2021 -0700

    Move bndrng from access_ref to access_data.
    
    gcc/ChangeLog:
    
            * gimple-ssa-warn-access.cc (check_access): Adjust to member name
            change.
            (pass_waccess::check_strncmp): Same.
            * pointer-query.cc (access_ref::access_ref): Remove arguments.
            Simpilfy.
            (access_data::access_data): Define new ctors.
            (access_data::set_bound): Define new member function.
            (compute_objsize_r): Remove unnecessary code.
            * pointer-query.h (struct access_ref): Remove ctor arguments.
            (struct access_data): Declare ctor overloads.
            (access_data::dst_bndrng): New member.
            (access_data::src_bndrng): New member.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 48bf8aaff50..05b8d91b71d 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -1337,10 +1337,10 @@  check_access (GimpleOrTree exp, tree dstwrite,
   if (!dstsize)
     dstsize = maxobjsize;
 
-  /* Set RANGE to that of DSTWRITE if non-null, bounded by PAD->DST.BNDRNG
+  /* Set RANGE to that of DSTWRITE if non-null, bounded by PAD->DST_BNDRNG
      if valid.  */
   gimple *stmt = pad ? pad->stmt : nullptr;
-  get_size_range (rvals, dstwrite, stmt, range, pad ? pad->dst.bndrng : NULL);
+  get_size_range (rvals, dstwrite, stmt, range, pad ? pad->dst_bndrng : NULL);
 
   tree func = get_callee_fndecl (exp);
   /* Read vs write access by built-ins can be determined from the const
@@ -1432,9 +1432,9 @@  check_access (GimpleOrTree exp, tree dstwrite,
      of an object.  */
   if (maxread)
     {
-      /* Set RANGE to that of MAXREAD, bounded by PAD->SRC.BNDRNG if
+      /* Set RANGE to that of MAXREAD, bounded by PAD->SRC_BNDRNG if
 	 PAD is nonnull and BNDRNG is valid.  */
-      get_size_range (rvals, maxread, stmt, range, pad ? pad->src.bndrng : NULL);
+      get_size_range (rvals, maxread, stmt, range, pad ? pad->src_bndrng : NULL);
 
       location_t loc = get_location (exp);
       tree size = dstsize;
@@ -1479,12 +1479,12 @@  check_access (GimpleOrTree exp, tree dstwrite,
       && (pad->src.offrng[1] < 0
 	  || pad->src.offrng[0] <= pad->src.offrng[1]))
     {
-      /* Set RANGE to that of MAXREAD, bounded by PAD->SRC.BNDRNG if
+      /* Set RANGE to that of MAXREAD, bounded by PAD->SRC_BNDRNG if
 	 PAD is nonnull and BNDRNG is valid.  */
-      get_size_range (rvals, maxread, stmt, range, pad ? pad->src.bndrng : NULL);
+      get_size_range (rvals, maxread, stmt, range, pad ? pad->src_bndrng : NULL);
       /* Set OVERREAD for reads starting just past the end of an object.  */
-      overread = pad->src.sizrng[1] - pad->src.offrng[0] < pad->src.bndrng[0];
-      range[0] = wide_int_to_tree (sizetype, pad->src.bndrng[0]);
+      overread = pad->src.sizrng[1] - pad->src.offrng[0] < pad->src_bndrng[0];
+      range[0] = wide_int_to_tree (sizetype, pad->src_bndrng[0]);
       slen = size_zero_node;
     }
 
@@ -2592,7 +2592,7 @@  pass_waccess::check_strncmp (gcall *stmt)
   /* Determine the range of the bound first and bail if it fails; it's
      cheaper than computing the size of the objects.  */
   tree bndrng[2] = { NULL_TREE, NULL_TREE };
-  get_size_range (m_ptr_qry.rvals, bound, stmt, bndrng, adata1.src.bndrng);
+  get_size_range (m_ptr_qry.rvals, bound, stmt, bndrng, adata1.src_bndrng);
   if (!bndrng[0] || integer_zerop (bndrng[0]))
     return;
 
diff --git a/gcc/pointer-query.cc b/gcc/pointer-query.cc
index 25ce4303849..a8c9671e3ba 100644
--- a/gcc/pointer-query.cc
+++ b/gcc/pointer-query.cc
@@ -596,15 +596,9 @@  gimple_parm_array_size (tree ptr, wide_int rng[2],
   return var;
 }
 
-/* Given a statement STMT, set the bounds of the reference to at most
-   as many bytes as BOUND or unknown when null, and at least one when
-   the MINACCESS is true unless BOUND is a constant zero.  STMT is
-   used for context to get accurate range info.  */
-
-access_ref::access_ref (range_query *qry /* = nullptr */,
-			tree bound /* = NULL_TREE */,
-			gimple *stmt /* = nullptr */,
-			bool minaccess /* = false */)
+/* Initialize the object.  */
+
+access_ref::access_ref ()
   : ref (), eval ([](tree x){ return x; }), deref (), trail1special (true),
     base0 (true), parmarray ()
 {
@@ -613,21 +607,6 @@  access_ref::access_ref (range_query *qry /* = nullptr */,
   offmax[0] = offmax[1] = 0;
   /* Invalidate.   */
   sizrng[0] = sizrng[1] = -1;
-
-  /* Set the default bounds of the access and adjust below.  */
-  bndrng[0] = minaccess ? 1 : 0;
-  bndrng[1] = HOST_WIDE_INT_M1U;
-
-  /* When BOUND is nonnull and a range can be extracted from it,
-     set the bounds of the access to reflect both it and MINACCESS.
-     BNDRNG[0] is the size of the minimum access.  */
-  tree rng[2];
-  if (bound && get_size_range (qry, bound, stmt, rng, SR_ALLOW_ZERO))
-    {
-      bndrng[0] = wi::to_offset (rng[0]);
-      bndrng[1] = wi::to_offset (rng[1]);
-      bndrng[0] = bndrng[0] > 0 && minaccess ? 1 : 0;
-    }
 }
 
 /* Return the PHI node REF refers to or null if it doesn't.  */
@@ -1199,6 +1178,54 @@  access_ref::inform_access (access_mode mode) const
 	    sizestr, allocfn);
 }
 
+/* Set the access to at most MAXWRITE and MAXREAD bytes, and at least 1
+   when MINWRITE or MINREAD, respectively, is set.  */
+access_data::access_data (range_query *query, gimple *stmt, access_mode mode,
+			  tree maxwrite /* = NULL_TREE */,
+			  bool minwrite /* = false */,
+			  tree maxread /* = NULL_TREE */,
+			  bool minread /* = false */)
+  : stmt (stmt), call (), dst (), src (), mode (mode)
+{
+  set_bound (dst_bndrng, maxwrite, minwrite, query, stmt);
+  set_bound (src_bndrng, maxread, minread, query, stmt);
+}
+
+/* Set the access to at most MAXWRITE and MAXREAD bytes, and at least 1
+   when MINWRITE or MINREAD, respectively, is set.  */
+access_data::access_data (range_query *query, tree expr, access_mode mode,
+			  tree maxwrite /* = NULL_TREE */,
+			  bool minwrite /* = false */,
+			  tree maxread /* = NULL_TREE */,
+			  bool minread /* = false */)
+  : stmt (), call (expr),  dst (), src (), mode (mode)
+{
+  set_bound (dst_bndrng, maxwrite, minwrite, query, stmt);
+  set_bound (src_bndrng, maxread, minread, query, stmt);
+}
+
+/* Set BNDRNG to the range of BOUND for the statement STMT.  */
+
+void
+access_data::set_bound (offset_int bndrng[2], tree bound, bool minaccess,
+			range_query *query, gimple *stmt)
+{
+  /* Set the default bounds of the access and adjust below.  */
+  bndrng[0] = minaccess ? 1 : 0;
+  bndrng[1] = HOST_WIDE_INT_M1U;
+
+  /* When BOUND is nonnull and a range can be extracted from it,
+     set the bounds of the access to reflect both it and MINACCESS.
+     BNDRNG[0] is the size of the minimum access.  */
+  tree rng[2];
+  if (bound && get_size_range (query, bound, stmt, rng, SR_ALLOW_ZERO))
+    {
+      bndrng[0] = wi::to_offset (rng[0]);
+      bndrng[1] = wi::to_offset (rng[1]);
+      bndrng[0] = bndrng[0] > 0 && minaccess ? 1 : 0;
+    }
+}
+
 /* Set a bit for the PHI in VISITED and return true if it wasn't
    already set.  */
 
@@ -1948,14 +1975,8 @@  compute_objsize_r (tree ptr, gimple *stmt, int ostype, access_ref *pref,
 	  if (const access_ref *cache_ref = qry->get_ref (ptr))
 	    {
 	      /* If the pointer is in the cache set *PREF to what it refers
-		 to and return success.
-		 FIXME: BNDRNG is determined by each access and so it doesn't
-		 belong in access_ref.  Until the design is changed, keep it
-		 unchanged here.  */
-	      const offset_int bndrng[2] = { pref->bndrng[0], pref->bndrng[1] };
+		 to and return success.  */
 	      *pref = *cache_ref;
-	      pref->bndrng[0] = bndrng[0];
-	      pref->bndrng[1] = bndrng[1];
 	      return true;
 	    }
 	}
diff --git a/gcc/pointer-query.h b/gcc/pointer-query.h
index fbea3316f14..82cb81b3987 100644
--- a/gcc/pointer-query.h
+++ b/gcc/pointer-query.h
@@ -61,8 +61,7 @@  class pointer_query;
 struct access_ref
 {
   /* Set the bounds of the reference.  */
-  access_ref (range_query *query = nullptr, tree = NULL_TREE,
-	      gimple * = nullptr, bool = false);
+  access_ref ();
 
   /* Return the PHI node REF refers to or null if it doesn't.  */
   gphi *phi () const;
@@ -129,11 +128,6 @@  struct access_ref
   offset_int sizrng[2];
   /* The minimum and maximum offset computed.  */
   offset_int offmax[2];
-  /* Range of the bound of the access: denotes that the access
-     is at least BNDRNG[0] bytes but no more than BNDRNG[1].
-     For string functions the size of the actual access is
-     further constrained by the length of the string.  */
-  offset_int bndrng[2];
 
   /* Used to fold integer expressions when called from front ends.  */
   tree (*eval)(tree);
@@ -206,23 +200,18 @@  struct access_data
 {
   /* Set the access to at most MAXWRITE and MAXREAD bytes, and
      at least 1 when MINWRITE or MINREAD, respectively, is set.  */
-  access_data (range_query *query, gimple *stmt, access_mode mode,
-	       tree maxwrite = NULL_TREE, bool minwrite = false,
-	       tree maxread = NULL_TREE, bool minread = false)
-    : stmt (stmt), call (),
-      dst (query, maxwrite, stmt, minwrite),
-      src (query, maxread, stmt, minread),
-      mode (mode) { }
+  access_data (range_query *, gimple *, access_mode,
+	       tree = NULL_TREE, bool = false,
+	       tree = NULL_TREE, bool = false);
 
   /* Set the access to at most MAXWRITE and MAXREAD bytes, and
      at least 1 when MINWRITE or MINREAD, respectively, is set.  */
-  access_data (range_query *query, tree expr, access_mode mode,
-	       tree maxwrite = NULL_TREE, bool minwrite = false,
-	       tree maxread = NULL_TREE, bool minread = false)
-    : stmt (), call (expr),
-      dst (query, maxwrite, nullptr, minwrite),
-      src (query, maxread, nullptr, minread),
-      mode (mode) { }
+  access_data (range_query *, tree, access_mode,
+	       tree = NULL_TREE, bool = false,
+	       tree = NULL_TREE, bool = false);
+
+  /* Constructor helper.  */
+  static void set_bound (offset_int[2], tree, bool, range_query *, gimple *);
 
   /* Access statement.  */
   gimple *stmt;
@@ -230,6 +219,14 @@  struct access_data
   tree call;
   /* Destination and source of the access.  */
   access_ref dst, src;
+
+  /* Range of the bound of the access: denotes that the access is at
+     least XXX_BNDRNG[0] bytes but no more than XXX_BNDRNG[1].  For
+     string functions the size of the actual access is further
+     constrained by the length of the string.  */
+  offset_int dst_bndrng[2];
+  offset_int src_bndrng[2];
+
   /* Read-only for functions like memcmp or strlen, write-only
      for memset, read-write for memcpy or strcat.  */
   access_mode mode;