[03/10] tree-object-size: Use tree instead of HOST_WIDE_INT

Message ID 20211109190137.1107736-4-siddhesh@gotplt.org
State Superseded
Series __builtin_dynamic_object_size

Commit Message

Siddhesh Poyarekar Nov. 9, 2021, 7:01 p.m. UTC
  Transform tree-object-size to operate on tree objects instead of host
wide integers.  This makes it easier to extend to dynamic expressions
for object sizes.

The compute_builtin_object_size interface also now returns a tree
expression instead of HOST_WIDE_INT, so callers have been adjusted to
account for that.

gcc/ChangeLog:

	* tree-object-size.h (compute_builtin_object_size): Return tree
	instead of HOST_WIDE_INT.
	* builtins.c (fold_builtin_object_size): Adjust.
	* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
	* ubsan.c (instrument_object_size): Likewise.
	* tree-object-size.c (object_sizes): Change type to vec<tree>.
	(initval): New function.
	(unknown): Use it.
	(size_unknown_p, size_initval, size_unknown): New functions.
	(object_sizes_unknown_p): Use it.
	(object_sizes_get): Return tree.
	(object_sizes_initialize): Rename from object_sizes_set_force
	and set VAL parameter type as tree.
	(object_sizes_set): Set VAL parameter type as tree and adjust
	implementation.
	(size_for_offset): New function.
	(addr_object_size): Change PSIZE parameter to tree and adjust
	implementation.
	(alloc_object_size): Return tree.
	(compute_builtin_object_size): Return tree in PSIZE.
	(expr_object_size, call_object_size, unknown_object_size):
	Adjust for object_sizes_set change.
	(merge_object_sizes): Set OFFSET type to tree and adjust
	implementation.
	(plus_stmt_object_size, cond_expr_object_size,
	collect_object_sizes_for, check_for_plus_in_loops_1): Adjust for
	change of type from HOST_WIDE_INT to tree.
	(object_sizes_execute): Adjust for type change to tree.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
---
 gcc/builtins.c         |  10 +-
 gcc/gimple-fold.c      |   9 +-
 gcc/tree-object-size.c | 253 ++++++++++++++++++++++-------------------
 gcc/tree-object-size.h |   2 +-
 gcc/ubsan.c            |  46 ++++----
 5 files changed, 168 insertions(+), 152 deletions(-)
  

Comments

Jakub Jelinek Nov. 19, 2021, 5:06 p.m. UTC | #1
On Wed, Nov 10, 2021 at 12:31:29AM +0530, Siddhesh Poyarekar wrote:
> 	* tree-object-size.h (compute_builtin_object_size): Return tree
> 	instead of HOST_WIDE_INT.
> 	* builtins.c (fold_builtin_object_size): Adjust.
> 	* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
> 	* ubsan.c (instrument_object_size): Likewise.
> 	* tree-object-size.c (object_sizes): Change type to vec<tree>.
> 	(initval): New function.
> 	(unknown): Use it.
> 	(size_unknown_p, size_initval, size_unknown): New functions.
> 	(object_sizes_unknown_p): Use it.
> 	(object_sizes_get): Return tree.
> 	(object_sizes_initialize): Rename from object_sizes_set_force
> 	and set VAL parameter type as tree.
> 	(object_sizes_set): Set VAL parameter type as tree and adjust
> 	implementation.
> 	(size_for_offset): New function.
> 	(addr_object_size): Change PSIZE parameter to tree and adjust
> 	implementation.
> 	(alloc_object_size): Return tree.
> 	(compute_builtin_object_size): Return tree in PSIZE.
> 	(expr_object_size, call_object_size, unknown_object_size):
> 	Adjust for object_sizes_set change.
> 	(merge_object_sizes): Set OFFSET type to tree and adjust
> 	implementation.
> 	(plus_stmt_object_size, cond_expr_object_size,
> 	collect_object_sizes_for, check_for_plus_in_loops_1): Adjust for
> 	change of type from HOST_WIDE_INT to tree.
> 	(object_sizes_execute): Adjust for type change to tree.

> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -10226,7 +10226,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum built_in_function fcode)
>  static tree
>  fold_builtin_object_size (tree ptr, tree ost)
>  {
> -  unsigned HOST_WIDE_INT bytes;
> +  tree bytes;
>    int object_size_type;
>  
>    if (!validate_arg (ptr, POINTER_TYPE)
> @@ -10251,17 +10251,15 @@ fold_builtin_object_size (tree ptr, tree ost)
>    if (TREE_CODE (ptr) == ADDR_EXPR)
>      {
>        compute_builtin_object_size (ptr, object_size_type, &bytes);
> -      if (wi::fits_to_tree_p (bytes, size_type_node))
> -	return build_int_cstu (size_type_node, bytes);
> +      return fold_convert (size_type_node, bytes);
>      }
>    else if (TREE_CODE (ptr) == SSA_NAME)
>      {
>        /* If object size is not known yet, delay folding until
>         later.  Maybe subsequent passes will help determining
>         it.  */
> -      if (compute_builtin_object_size (ptr, object_size_type, &bytes)
> -	  && wi::fits_to_tree_p (bytes, size_type_node))
> -	return build_int_cstu (size_type_node, bytes);
> +      if (compute_builtin_object_size (ptr, object_size_type, &bytes))
> +	return fold_convert (size_type_node, bytes);

Neither of these are equivalent to what it used to do before.
If some target has e.g. pointers wider than size_t, then previously we could
compute bytes that doesn't fit into size_t and would return NULL which
eventually would result in object_sizes_execute or expand_builtin_object_size
resolving it to the unknown value.  But fold_convert will perform modulo
arithmetics on it, so say if size_t is 24-bit and bytes is 0x1000001, it
will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
Or do we ensure that what it returns always has sizetype type?
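
E.g. keep the guard, just on trees now (a sketch, assuming bytes comes
back as an INTEGER_CST in sizetype):

      compute_builtin_object_size (ptr, object_size_type, &bytes);
      /* Punt unless the result is a constant that fits in size_t.  */
      if (TREE_CODE (bytes) == INTEGER_CST
          && wi::fits_to_tree_p (wi::to_wide (bytes), size_type_node))
        return fold_convert (size_type_node, bytes);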

> @@ -2486,13 +2486,14 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
>    if (cmpsrc < 0)
>      return false;
>  
> -  unsigned HOST_WIDE_INT dstsize;
> +  tree dstsize;
>  
>    bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
>  
> -  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))
> +  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
> +      && tree_fits_uhwi_p (dstsize))
>      {
> -      int cmpdst = compare_tree_int (len, dstsize);
> +      int cmpdst = compare_tree_int (len, tree_to_uhwi (dstsize));

Why jump from tree to UHWI and back?
Check that TREE_CODE (dstsize) == INTEGER_CST and do
tree_int_cst_compare instead?

>  
>        if (cmpdst >= 0)
>  	{
> @@ -2509,7 +2510,7 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
>  				    "destination size")
>  			       : G_("%qD specified bound %E exceeds "
>  				    "destination size %wu"),
> -			       fndecl, len, dstsize);
> +			       fndecl, len, tree_to_uhwi (dstsize));

Similarly, use %E and pass dstsize to it.
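
I.e. something along these lines (sketch):

  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
      && TREE_CODE (dstsize) == INTEGER_CST)
    {
      int cmpdst = tree_int_cst_compare (len, dstsize);
      ...
                               : G_("%qD specified bound %E exceeds "
                                    "destination size %E"),
                               fndecl, len, dstsize);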

> +/* Initial value of object sizes; zero for maximum and SIZE_MAX for minimum
> +   object size.  */
> +
> +static inline unsigned HOST_WIDE_INT
> +initval (int object_size_type)
> +{
> +  return (unsigned HOST_WIDE_INT) -popcount_hwi (object_size_type
> +						 & OST_MINIMUM);

This makes the code quite unreadable and on some arches it will be also
much worse (popcount is a libcall on many, and even if not, it is inline
function using a builtin inside of it).  Can't you use a simple readable
  return (object_size_type & OST_MINIMUM) ? HOST_WIDE_INT_M1U : 0;
and let the compiler optimize as much as it wants?

> +/* Bytes at end of the object with SZ from offset OFFSET. */
> +
> +static tree
> +size_for_offset (tree sz, tree offset)
> +{
> +  return size_binop (MINUS_EXPR, size_binop (MAX_EXPR, sz, offset), offset);
> +}
>  
> @@ -314,27 +356,22 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  			    SSA_NAME_VERSION (var)))
>  	    sz = object_sizes_get (osi, SSA_NAME_VERSION (var));
>  	  else
> -	    sz = unknown (object_size_type);
> +	    sz = size_unknown (object_size_type);
>  	}
> -      if (sz != unknown (object_size_type))
> +      if (!size_unknown_p (sz, object_size_type))
>  	{
> -	  offset_int mem_offset;
> -	  if (mem_ref_offset (pt_var).is_constant (&mem_offset))
> -	    {
> -	      offset_int dsz = wi::sub (sz, mem_offset);
> -	      if (wi::neg_p (dsz))
> -		sz = 0;
> -	      else if (wi::fits_uhwi_p (dsz))
> -		sz = dsz.to_uhwi ();
> -	      else
> -		sz = unknown (object_size_type);
> -	    }
> +	  tree offset = TREE_OPERAND (pt_var, 1);
> +	  if (TREE_CODE (offset) != INTEGER_CST
> +	      || TREE_CODE (sz) != INTEGER_CST)
> +	    sz = size_unknown (object_size_type);
>  	  else
> -	    sz = unknown (object_size_type);
> +	    sz = size_for_offset (sz, offset);

This doesn't match what the code did and I'm surprised if it works at all.
TREE_OPERAND (pt_var, 1), while it is an INTEGER_CST or POLY_INT_CST,
has in its type encoded the type for aliasing, so the type is some pointer
type.  Performing size_binop etc. on such values can misbehave, the code
assumes that it is sizetype or at least some integral type compatible with
it.  Also, mem_ref_offset is signed and offset_int has bigger precision
than pointers on the target such that it can be always signed.  So
e.g. if MEM_REF's second operand is bigger or equal than half of the
address space, in the old code it would appear to be negative, wi::sub would
result in a value bigger than sz.  Not really sure right now if that is
exactly how we want to treat it, would be nice to try some testcase.
In any case, size_for_offset should be called with the offset converted
to sizetype if it does size_binop on it.
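
E.g. (a sketch of just the conversion):

          tree offset = TREE_OPERAND (pt_var, 1);
          if (TREE_CODE (offset) != INTEGER_CST
              || TREE_CODE (sz) != INTEGER_CST)
            sz = size_unknown (object_size_type);
          else
            sz = size_for_offset (sz, fold_convert (sizetype, offset));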
> +    reexamine |= merge_object_sizes (osi, var, then_, size_int (0));

size_zero_node ?
>    else
>      expr_object_size (osi, var, then_);
>  
> @@ -952,7 +970,7 @@ cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt)
>      return reexamine;
>  
>    if (TREE_CODE (else_) == SSA_NAME)
> -    reexamine |= merge_object_sizes (osi, var, else_, 0);
> +    reexamine |= merge_object_sizes (osi, var, else_, size_int (0));

Likewise.  Many times.

> -  unsigned HOST_WIDE_INT size;
> -  if (compute_builtin_object_size (base_addr, 0, &size))
> -    sizet = build_int_cst (sizetype, size);
> -  else if (optimize)

I'd prefer not to reindent it to avoid useless churn, just do
  if (compute_builtin_object_size (base_addr, 0, &sizet))
    ;
  else if (optimize)
...

	Jakub
  
Siddhesh Poyarekar Nov. 19, 2021, 7:01 p.m. UTC | #2
On 11/19/21 22:36, Jakub Jelinek wrote:
> On Wed, Nov 10, 2021 at 12:31:29AM +0530, Siddhesh Poyarekar wrote:
>> 	* tree-object-size.h (compute_builtin_object_size): Return tree
>> 	instead of HOST_WIDE_INT.
>> 	* builtins.c (fold_builtin_object_size): Adjust.
>> 	* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
>> 	* ubsan.c (instrument_object_size): Likewise.
>> 	* tree-object-size.c (object_sizes): Change type to vec<tree>.
>> 	(initval): New function.
>> 	(unknown): Use it.
>> 	(size_unknown_p, size_initval, size_unknown): New functions.
>> 	(object_sizes_unknown_p): Use it.
>> 	(object_sizes_get): Return tree.
>> 	(object_sizes_initialize): Rename from object_sizes_set_force
>> 	and set VAL parameter type as tree.
>> 	(object_sizes_set): Set VAL parameter type as tree and adjust
>> 	implementation.
>> 	(size_for_offset): New function.
>> 	(addr_object_size): Change PSIZE parameter to tree and adjust
>> 	implementation.
>> 	(alloc_object_size): Return tree.
>> 	(compute_builtin_object_size): Return tree in PSIZE.
>> 	(expr_object_size, call_object_size, unknown_object_size):
>> 	Adjust for object_sizes_set change.
>> 	(merge_object_sizes): Set OFFSET type to tree and adjust
>> 	implementation.
>> 	(plus_stmt_object_size, cond_expr_object_size,
>> 	collect_object_sizes_for, check_for_plus_in_loops_1): Adjust for
>> 	change of type from HOST_WIDE_INT to tree.
>> 	(object_sizes_execute): Adjust for type change to tree.
> 
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -10226,7 +10226,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum built_in_function fcode)
>>   static tree
>>   fold_builtin_object_size (tree ptr, tree ost)
>>   {
>> -  unsigned HOST_WIDE_INT bytes;
>> +  tree bytes;
>>     int object_size_type;
>>   
>>     if (!validate_arg (ptr, POINTER_TYPE)
>> @@ -10251,17 +10251,15 @@ fold_builtin_object_size (tree ptr, tree ost)
>>     if (TREE_CODE (ptr) == ADDR_EXPR)
>>       {
>>         compute_builtin_object_size (ptr, object_size_type, &bytes);
>> -      if (wi::fits_to_tree_p (bytes, size_type_node))
>> -	return build_int_cstu (size_type_node, bytes);
>> +      return fold_convert (size_type_node, bytes);
>>       }
>>     else if (TREE_CODE (ptr) == SSA_NAME)
>>       {
>>         /* If object size is not known yet, delay folding until
>>          later.  Maybe subsequent passes will help determining
>>          it.  */
>> -      if (compute_builtin_object_size (ptr, object_size_type, &bytes)
>> -	  && wi::fits_to_tree_p (bytes, size_type_node))
>> -	return build_int_cstu (size_type_node, bytes);
>> +      if (compute_builtin_object_size (ptr, object_size_type, &bytes))
>> +	return fold_convert (size_type_node, bytes);
> 
> Neither of these are equivalent to what it used to do before.
> If some target has e.g. pointers wider than size_t, then previously we could
> compute bytes that doesn't fit into size_t and would return NULL which
> eventually would result in object_sizes_execute or expand_builtin_object_size
> resolving it to the unknown value.  But fold_convert will perform modulo
> arithmetics on it, so say if size_t is 24-bit and bytes is 0x1000001, it
> will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
> Or do we ensure that what it returns always has sizetype type?

compute_builtin_object_size should always return sizetype.  I probably 
haven't strictly ensured that but I could do that.  Would it be 
necessary to check for fit in sizetype -> size_type_node conversion too? 
  I've been assuming that they'd have the same precision.

> 
>> @@ -2486,13 +2486,14 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
>>     if (cmpsrc < 0)
>>       return false;
>>   
>> -  unsigned HOST_WIDE_INT dstsize;
>> +  tree dstsize;
>>   
>>     bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
>>   
>> -  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))
>> +  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
>> +      && tree_fits_uhwi_p (dstsize))
>>       {
>> -      int cmpdst = compare_tree_int (len, dstsize);
>> +      int cmpdst = compare_tree_int (len, tree_to_uhwi (dstsize));
> 
> Why jump from tree to UHWI and back?
> Check that TREE_CODE (dstsize) == INTEGER_CST and do
> tree_int_cst_compare instead?
> 
>>   
>>         if (cmpdst >= 0)
>>   	{
>> @@ -2509,7 +2510,7 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
>>   				    "destination size")
>>   			       : G_("%qD specified bound %E exceeds "
>>   				    "destination size %wu"),
>> -			       fndecl, len, dstsize);
>> +			       fndecl, len, tree_to_uhwi (dstsize));
> 
> Similarly, use %E and pass dstsize to it.
> 

I noticed it while doing the gimple fold patches; I'll fix it.  I need 
to rebase this bit anyway.

>> +/* Initial value of object sizes; zero for maximum and SIZE_MAX for minimum
>> +   object size.  */
>> +
>> +static inline unsigned HOST_WIDE_INT
>> +initval (int object_size_type)
>> +{
>> +  return (unsigned HOST_WIDE_INT) -popcount_hwi (object_size_type
>> +						 & OST_MINIMUM);
> 
> This makes the code quite unreadable and on some arches it will be also
> much worse (popcount is a libcall on many, and even if not, it is inline
> function using a builtin inside of it).  Can't you use a simple readable
>    return (object_size_type & OST_MINIMUM) ? HOST_WIDE_INT_M1U : 0;
> and let the compiler optimize as much as it wants?

OK, I'll do that; I was trying too hard there I suppose :)

> 
>> +/* Bytes at end of the object with SZ from offset OFFSET. */
>> +
>> +static tree
>> +size_for_offset (tree sz, tree offset)
>> +{
>> +  return size_binop (MINUS_EXPR, size_binop (MAX_EXPR, sz, offset), offset);
>> +}
>>   
>> @@ -314,27 +356,22 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>   			    SSA_NAME_VERSION (var)))
>>   	    sz = object_sizes_get (osi, SSA_NAME_VERSION (var));
>>   	  else
>> -	    sz = unknown (object_size_type);
>> +	    sz = size_unknown (object_size_type);
>>   	}
>> -      if (sz != unknown (object_size_type))
>> +      if (!size_unknown_p (sz, object_size_type))
>>   	{
>> -	  offset_int mem_offset;
>> -	  if (mem_ref_offset (pt_var).is_constant (&mem_offset))
>> -	    {
>> -	      offset_int dsz = wi::sub (sz, mem_offset);
>> -	      if (wi::neg_p (dsz))
>> -		sz = 0;
>> -	      else if (wi::fits_uhwi_p (dsz))
>> -		sz = dsz.to_uhwi ();
>> -	      else
>> -		sz = unknown (object_size_type);
>> -	    }
>> +	  tree offset = TREE_OPERAND (pt_var, 1);
>> +	  if (TREE_CODE (offset) != INTEGER_CST
>> +	      || TREE_CODE (sz) != INTEGER_CST)
>> +	    sz = size_unknown (object_size_type);
>>   	  else
>> -	    sz = unknown (object_size_type);
>> +	    sz = size_for_offset (sz, offset);
> 
> This doesn't match what the code did and I'm surprised if it works at all.
> TREE_OPERAND (pt_var, 1), while it is an INTEGER_CST or POLY_INT_CST,
> has in its type encoded the type for aliasing, so the type is some pointer
> type.  Performing size_binop etc. on such values can misbehave, the code
> assumes that it is sizetype or at least some integral type compatible with
> it.  Also, mem_ref_offset is signed and offset_int has bigger precision
> than pointers on the target such that it can be always signed.  So
> e.g. if MEM_REF's second operand is bigger or equal than half of the
> address space, in the old code it would appear to be negative, wi::sub would
> result in a value bigger than sz.  Not really sure right now if that is
> exactly how we want to treat it, would be nice to try some testcase.

Let me try coming up with a test case for it.

> In any case, size_for_offset should be called with the offset converted
> to sizetype if it does size_binop on it.

I'll fix this.

>> +    reexamine |= merge_object_sizes (osi, var, then_, size_int (0));
> 
> size_zero_node ?
>>     else
>>       expr_object_size (osi, var, then_);
>>   
>> @@ -952,7 +970,7 @@ cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt)
>>       return reexamine;
>>   
>>     if (TREE_CODE (else_) == SSA_NAME)
>> -    reexamine |= merge_object_sizes (osi, var, else_, 0);
>> +    reexamine |= merge_object_sizes (osi, var, else_, size_int (0));
> 
> Likewise.  Many times.

OK.

>> -  unsigned HOST_WIDE_INT size;
>> -  if (compute_builtin_object_size (base_addr, 0, &size))
>> -    sizet = build_int_cst (sizetype, size);
>> -  else if (optimize)
> 
> I'd prefer not to reindent it to avoid useless churn, just do
>    if (compute_builtin_object_size (base_addr, 0, &sizet))
>      ;
>    else if (optimize)

OK.

> ...
> 
> 	Jakub
> 

Thanks,
Siddhesh
  
Jakub Jelinek Nov. 19, 2021, 7:16 p.m. UTC | #3
On Sat, Nov 20, 2021 at 12:31:19AM +0530, Siddhesh Poyarekar wrote:
> > Neither of these are equivalent to what it used to do before.
> > If some target has e.g. pointers wider than size_t, then previously we could
> > compute bytes that doesn't fit into size_t and would return NULL which
> > eventually would result in object_sizes_execute or expand_builtin_object_size
> > resolving it to the unknown value.  But fold_convert will perform modulo
> > arithmetics on it, so say if size_t is 24-bit and bytes is 0x1000001, it
> > will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
> > Or do we ensure that what it returns always has sizetype type?
> 
> compute_builtin_object_size should always return sizetype.  I probably
> haven't strictly ensured that but I could do that.  Would it be necessary to
> check for fit in sizetype -> size_type_node conversion too?  I've been
> assuming that they'd have the same precision.

sizetype and size_type_node should be indeed always the same precision.

	Jakub
  
Richard Biener Nov. 22, 2021, 8:41 a.m. UTC | #4
On Fri, Nov 19, 2021 at 8:17 PM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Sat, Nov 20, 2021 at 12:31:19AM +0530, Siddhesh Poyarekar wrote:
> > > Neither of these are equivalent to what it used to do before.
> > > If some target has e.g. pointers wider than size_t, then previously we could
> > > compute bytes that doesn't fit into size_t and would return NULL which
> > > eventually would result in object_sizes_execute or expand_builtin_object_size
> > > resolving it to the unknown value.  But fold_convert will perform modulo
> > > arithmetics on it, so say if size_t is 24-bit and bytes is 0x1000001, it
> > > will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
> > > Or do we ensure that what it returns always has sizetype type?
> >
> > compute_builtin_object_size should always return sizetype.  I probably
> > haven't strictly ensured that but I could do that.  Would it be necessary to
> > check for fit in sizetype -> size_type_node conversion too?  I've been
> > assuming that they'd have the same precision.
>
> sizetype and size_type_node should be indeed always the same precision.

I don't think so.  vms.h has

/* Always a 32 bit type.  */
#undef SIZE_TYPE
#define SIZE_TYPE  "unsigned int"

...

#define SIZETYPE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? \
                  "unsigned int" : "long long unsigned int")

interestingly that's the _only_ SIZETYPE override we have, so it does look like
we might want to try removing the 'SIZETYPE' target macro.  I'm not sure the
above does anything good - it looks more like a ptr_mode vs Pmode thing.

Richard.

>         Jakub
>
  
Siddhesh Poyarekar Nov. 22, 2021, 10:11 a.m. UTC | #5
On 11/20/21 00:31, Siddhesh Poyarekar wrote:
>> This doesn't match what the code did and I'm surprised if it works at 
>> all.
>> TREE_OPERAND (pt_var, 1), while it is an INTEGER_CST or POLY_INT_CST,
>> has in its type encoded the type for aliasing, so the type is some 
>> pointer
>> type.  Performing size_binop etc. on such values can misbehave, the code
>> assumes that it is sizetype or at least some integral type compatible 
>> with
>> it.  Also, mem_ref_offset is signed and offset_int has bigger precision
>> than pointers on the target such that it can be always signed.  So
>> e.g. if MEM_REF's second operand is bigger or equal than half of the
>> address space, in the old code it would appear to be negative, wi::sub 
>> would
>> result in a value bigger than sz.  Not really sure right now if that is
>> exactly how we want to treat it, would be nice to try some testcase.
> 
> Let me try coming up with a test case for it.
> 

So I played around a bit with this.  Basically:

char buf[8];

__SIZE_TYPE__ test (void)
{
   char *p = &buf[0x90000004];
   return __builtin_object_size (p + 2, 0);
}

when built with -m32 returns 0x70000002 but on 64-bit, returns 0 as 
expected.  Of course, with subscript as 0x9000000000000004, 64-bit gives 
0x7000000000000002 as the result.

With the tree conversion, this is at least partly taken care of since 
offset larger than size, to the extent that it fits into sizetype, 
returns a size of zero.  So in the above example, the size returned is 
zero in both -m32 as well as -m64.  Likewise for negative offset, i.e. 
&buf[-4]; the old code returns 10 while with trees it returns 0, which 
seems correct to me since it is an underflow.

It's only partly taken care of because, e.g.

   char *p = &buf[0x100000004];

ends up truncating the offset, returning object size of 2.  This however 
is an unrelated problem; it's the folding of offsets that is responsible 
for this since it ends up truncating the offset to shwi bounds.  Perhaps 
there's an opportunity in get_addr_base_and_unit_offset to warn of 
overflow if offset goes above pointer precision before truncating it.

So for this patch, may I simply ensure that offset is converted to 
sizetype and keep everything else the same?  it appears to demonstrate 
better behaviour than the older code.  I'll also add these tests.

Thanks,
Siddhesh
  
Jakub Jelinek Nov. 22, 2021, 10:31 a.m. UTC | #6
On Mon, Nov 22, 2021 at 03:41:57PM +0530, Siddhesh Poyarekar wrote:
> So I played around a bit with this.  Basically:
> 
> char buf[8];
> 
> __SIZE_TYPE__ test (void)
> {
>   char *p = &buf[0x90000004];
>   return __builtin_object_size (p + 2, 0);
> }
> 
> when built with -m32 returns 0x70000002 but on 64-bit, returns 0 as
> expected.  Of course, with subscript as 0x9000000000000004, 64-bit gives
> 0x7000000000000002 as the result.

That is one case, I think when the base is a ADDR_EXPR VAR_DECL, offsets
larger than half of the address space will be UB, though of course the
question is what __bos should return for that because it wants to prevent
UB exploiting.

Another case is where the base is some pointer, something like
  &MEM_REF [ptr, -16].a
etc., where the MEM_REF offset negative makes a lot of sense,
there could be
  ptr = &var + 32;
earlier etc.
That gives us to the general question on what to do with POINTER_PLUS_EXPR
if the offset is constant but "negative" (as it is sizetype, it is never
negative, but usually we treat it or should treat it as signed in the end).
If the offset is "negative", for e.g. __bos 0/1 one option is to use a
conservative maximum, i.e. subtract the offset from the value (make it even
larger, of course make sure it doesn't wrap around).  Another one would be
to consider also __bos 2 in that case, if the negation of the offset is
bigger than __bos 2, then the pointer points to before the start of the
object and we could return 0.
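
For illustration, a hypothetical testcase for that "negative" constant
offset case (not part of the patch):

  char var[40];

  __SIZE_TYPE__ test (void)
  {
    char *ptr = &var[32];  /* bytes until the end of var: 8 */
    char *q = ptr - 16;    /* POINTER_PLUS_EXPR, sizetype offset is "negative" */
    /* Conservative maximum: 8 + 16 == 24.  The __bos 2 variant would
       instead check whether 16 can place q before the object's start.  */
    return __builtin_object_size (q, 0);
  }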

	Jakub
  
Siddhesh Poyarekar Nov. 22, 2021, noon UTC | #7
On 11/22/21 16:01, Jakub Jelinek wrote:
> On Mon, Nov 22, 2021 at 03:41:57PM +0530, Siddhesh Poyarekar wrote:
>> So I played around a bit with this.  Basically:
>>
>> char buf[8];
>>
>> __SIZE_TYPE__ test (void)
>> {
>>    char *p = &buf[0x90000004];
>>    return __builtin_object_size (p + 2, 0);
>> }
>>
>> when built with -m32 returns 0x70000002 but on 64-bit, returns 0 as
>> expected.  Of course, with subscript as 0x9000000000000004, 64-bit gives
>> 0x7000000000000002 as the result.
> 
> That is one case, I think when the base is a ADDR_EXPR VAR_DECL, offsets
> larger than half of the address space will be UB, though of course the
> question is what __bos should return for that because it wants to prevent
> UB exploiting.
> 
> Another case is where the base is some pointer, something like
>    &MEM_REF [ptr, -16].a
> etc., where the MEM_REF offset negative makes a lot of sense,
> there could be
>    ptr = &var + 32;
> earlier etc.
> That gives us to the general question on what to do with POINTER_PLUS_EXPR
> if the offset is constant but "negative" (as it is sizetype, it is never
> negative, but usually we treat it or should treat it as signed in the end).
> If the offset is "negative", for e.g. __bos 0/1 one option is to use a
> conservative maximum, i.e. subtract the offset from the value (make it even
> larger, of course make sure it doesn't wrap around).  Another one would be
> to consider also __bos 2 in that case, if the negation of the offset is
> bigger than __bos 2, then the pointer points to before the start of the
> object and we could return 0.

So I've got patch 10/10, which handles dynamic (and consequently 
negative) offsets.  It basically computes a "whole size", which then 
gives the extent to which a negative offset is valid, making the 
estimates a bit more precise.  I didn't do it for static object sizes 
because I didn't have time then, but I could add a patch 11/10 if the 
idea sounds OK to you.
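
Roughly, for the var/ptr example above (a sketch of the idea, not the
patch itself):

  /* ptr == &var[32], char var[40]: remaining size 8, whole size 40.
     A negative offset of up to 32 bytes provably stays inside var, so
     ptr - 16 can get size 8 + 16 == 24 instead of unknown.  */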

Siddhesh
  
Siddhesh Poyarekar Nov. 22, 2021, 12:31 p.m. UTC | #8
On 11/22/21 17:30, Siddhesh Poyarekar wrote:
> So I've got patch 10/10, which handles dynamic (and consequently 
> negative) offsets.  It basically computes a "whole size", which then 
> gives the extent to which a negative offset is valid, making the 
> estimates a bit more precise.  I didn't do it for static object sizes 
> because I didn't have time then, but I could add a patch 11/10 if the 
> idea sounds OK to you.

... or alternatively, I could bring the whole size idea into this tree 
conversion patch so that it handles all kinds of offsets.  That might 
even eliminate patch 10/10.  What would you prefer?

Siddhesh
  
Jakub Jelinek Nov. 22, 2021, 12:32 p.m. UTC | #9
On Mon, Nov 22, 2021 at 06:01:08PM +0530, Siddhesh Poyarekar wrote:
> On 11/22/21 17:30, Siddhesh Poyarekar wrote:
> > So I've got patch 10/10, which handles dynamic (and consequently
> > negative) offsets.  It basically computes a "whole size", which then
> > gives the extent to which a negative offset is valid, making the
> > estimates a bit more precise.  I didn't do it for static object sizes
> > because I didn't have time then, but I could add a patch 11/10 if the
> > idea sounds OK to you.
> 
> ... or alternatively, I could bring the whole size idea into this tree
> conversion patch so that it handles all kinds of offsets.  That might even
> eliminate patch 10/10.  What would you prefer?

Into this patch.

	Jakub
  
Jakub Jelinek Nov. 23, 2021, 11:58 a.m. UTC | #10
On Mon, Nov 22, 2021 at 01:32:22PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Nov 22, 2021 at 06:01:08PM +0530, Siddhesh Poyarekar wrote:
> > On 11/22/21 17:30, Siddhesh Poyarekar wrote:
> > > So I've got patch 10/10, which handles dynamic (and consequently
> > > negative) offsets.  It basically computes a "whole size", which then
> > > gives the extent to which a negative offset is valid, making the
> > > estimates a bit more precise.  I didn't do it for static object sizes
> > > because I didn't have time then, but I could add a patch 11/10 if the
> > > idea sounds OK to you.
> > 
> > ... or alternatively, I could bring the whole size idea into this tree
> > conversion patch so that it handles all kinds of offsets.  That might even
> > eliminate patch 10/10.  What would you prefer?
> 
> Into this patch.

BTW, seems the current behavior is to punt on those "negative" values,
we trigger
  if (offset >= offset_limit)
case for it and return unknown.

	Jakub
  
Siddhesh Poyarekar Nov. 23, 2021, 1:33 p.m. UTC | #11
On 11/23/21 17:28, Jakub Jelinek wrote:
> On Mon, Nov 22, 2021 at 01:32:22PM +0100, Jakub Jelinek via Gcc-patches wrote:
>> On Mon, Nov 22, 2021 at 06:01:08PM +0530, Siddhesh Poyarekar wrote:
>>> On 11/22/21 17:30, Siddhesh Poyarekar wrote:
>>>> So I've got patch 10/10, which handles dynamic (and consequently
>>>> negative) offsets.  It basically computes a "whole size", which then
>>>> gives the extent to which a negative offset is valid, making the
>>>> estimates a bit more precise.  I didn't do it for static object sizes
>>>> because I didn't have time then, but I could add a patch 11/10 if the
>>>> idea sounds OK to you.
>>>
>>> ... or alternatively, I could bring the whole size idea into this tree
>>> conversion patch so that it handles all kinds of offsets.  That might even
>>> eliminate patch 10/10.  What would you prefer?
>>
>> Into this patch.
> 
> BTW, seems the current behavior is to punt on those "negative" values,
> we trigger
>    if (offset >= offset_limit)
> case for it and return unknown.

The current behaviour is actually inconsistent; for SSA names it punts 
for sizes greater than offset limit and for addr_expr it ends up with 
larger sizes.
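
E.g. with -m32 (a sketch; results as per the discussion above):

  char buf[8];

  __SIZE_TYPE__ f1 (void)
  {
    char *p = &buf[0x90000004];  /* ADDR_EXPR path: yields 0x70000002 */
    return __builtin_object_size (p + 2, 0);
  }

  __SIZE_TYPE__ f2 (int cond)
  {
    char *p = cond ? &buf[0] : &buf[4];  /* PHI keeps p an SSA name */
    p += 0x90000004;  /* offset >= offset_limit: punts to unknown */
    return __builtin_object_size (p, 0);  /* (size_t) -1 */
  }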

Siddhesh
  

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 384864bfb3a..d2e6d95a175 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -10226,7 +10226,7 @@  maybe_emit_sprintf_chk_warning (tree exp, enum built_in_function fcode)
 static tree
 fold_builtin_object_size (tree ptr, tree ost)
 {
-  unsigned HOST_WIDE_INT bytes;
+  tree bytes;
   int object_size_type;
 
   if (!validate_arg (ptr, POINTER_TYPE)
@@ -10251,17 +10251,15 @@  fold_builtin_object_size (tree ptr, tree ost)
   if (TREE_CODE (ptr) == ADDR_EXPR)
     {
       compute_builtin_object_size (ptr, object_size_type, &bytes);
-      if (wi::fits_to_tree_p (bytes, size_type_node))
-	return build_int_cstu (size_type_node, bytes);
+      return fold_convert (size_type_node, bytes);
     }
   else if (TREE_CODE (ptr) == SSA_NAME)
     {
       /* If object size is not known yet, delay folding until
        later.  Maybe subsequent passes will help determining
        it.  */
-      if (compute_builtin_object_size (ptr, object_size_type, &bytes)
-	  && wi::fits_to_tree_p (bytes, size_type_node))
-	return build_int_cstu (size_type_node, bytes);
+      if (compute_builtin_object_size (ptr, object_size_type, &bytes))
+	return fold_convert (size_type_node, bytes);
     }
 
   return NULL_TREE;
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..cbd0fb08f80 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2486,13 +2486,14 @@  gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
   if (cmpsrc < 0)
     return false;
 
-  unsigned HOST_WIDE_INT dstsize;
+  tree dstsize;
 
   bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
 
-  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))
+  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
+      && tree_fits_uhwi_p (dstsize))
     {
-      int cmpdst = compare_tree_int (len, dstsize);
+      int cmpdst = compare_tree_int (len, tree_to_uhwi (dstsize));
 
       if (cmpdst >= 0)
 	{
@@ -2509,7 +2510,7 @@  gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
 				    "destination size")
 			       : G_("%qD specified bound %E exceeds "
 				    "destination size %wu"),
-			       fndecl, len, dstsize);
+			       fndecl, len, tree_to_uhwi (dstsize));
 	  if (nowarn)
 	    suppress_warning (stmt, OPT_Wstringop_overflow_);
 	}
diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index 7a22539cc43..4b9c45c6af2 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -54,13 +54,12 @@  enum
 
 static tree compute_object_offset (const_tree, const_tree);
 static bool addr_object_size (struct object_size_info *,
-			      const_tree, int, unsigned HOST_WIDE_INT *);
-static unsigned HOST_WIDE_INT alloc_object_size (const gcall *, int);
+			      const_tree, int, tree *);
+static tree alloc_object_size (const gcall *, int);
 static tree pass_through_call (const gcall *);
 static void collect_object_sizes_for (struct object_size_info *, tree);
 static void expr_object_size (struct object_size_info *, tree, tree);
-static bool merge_object_sizes (struct object_size_info *, tree, tree,
-				unsigned HOST_WIDE_INT);
+static bool merge_object_sizes (struct object_size_info *, tree, tree, tree);
 static bool plus_stmt_object_size (struct object_size_info *, tree, gimple *);
 static bool cond_expr_object_size (struct object_size_info *, tree, gimple *);
 static void init_offset_limit (void);
@@ -74,7 +73,7 @@  static void check_for_plus_in_loops_1 (struct object_size_info *, tree,
    the subobject (innermost array or field with address taken).
    object_sizes[2] is lower bound for number of bytes till the end of
    the object and object_sizes[3] lower bound for subobject.  */
-static vec<unsigned HOST_WIDE_INT> object_sizes[OST_END];
+static vec<tree> object_sizes[OST_END];
 
 /* Bitmaps what object sizes have been computed already.  */
 static bitmap computed[OST_END];
@@ -82,10 +81,47 @@  static bitmap computed[OST_END];
 /* Maximum value of offset we consider to be addition.  */
 static unsigned HOST_WIDE_INT offset_limit;
 
+/* Initial value of object sizes; zero for maximum and SIZE_MAX for minimum
+   object size.  */
+
+static inline unsigned HOST_WIDE_INT
+initval (int object_size_type)
+{
+  return (unsigned HOST_WIDE_INT) -popcount_hwi (object_size_type
+						 & OST_MINIMUM);
+}
+
+/* Unknown object size value; it's the opposite of initval.  */
+
 static inline unsigned HOST_WIDE_INT
 unknown (int object_size_type)
 {
-  return ((unsigned HOST_WIDE_INT) -((object_size_type >> 1) ^ 1));
+  return ~initval (object_size_type);
+}
+
+/* Return true if VAL represents an unknown size for OBJECT_SIZE_TYPE.  */
+
+static inline bool
+size_unknown_p (tree val, int object_size_type)
+{
+  return (tree_fits_uhwi_p (val)
+	  && tree_to_uhwi (val) == unknown (object_size_type));
+}
+
+/* Return a tree with initial value for OBJECT_SIZE_TYPE.  */
+
+static inline tree
+size_initval (int object_size_type)
+{
+  return size_int (initval (object_size_type));
+}
+
+/* Return a tree with unknown value for OBJECT_SIZE_TYPE.  */
+
+static inline tree
+size_unknown (int object_size_type)
+{
+  return size_int (unknown (object_size_type));
 }
 
 /* Grow object_sizes[OBJECT_SIZE_TYPE] to num_ssa_names.  */
@@ -110,13 +146,13 @@  object_sizes_release (int object_size_type)
 static inline bool
 object_sizes_unknown_p (int object_size_type, unsigned varno)
 {
-  return (object_sizes[object_size_type][varno]
-	  == unknown (object_size_type));
+  return size_unknown_p (object_sizes[object_size_type][varno],
+			 object_size_type);
 }
 
 /* Return size for VARNO corresponding to OSI.  */
 
-static inline unsigned HOST_WIDE_INT
+static inline tree
 object_sizes_get (struct object_size_info *osi, unsigned varno)
 {
   return object_sizes[osi->object_size_type][varno];
@@ -124,33 +160,32 @@  object_sizes_get (struct object_size_info *osi, unsigned varno)
 
 /* Set size for VARNO corresponding to OSI to VAL.  */
 
-static inline bool
-object_sizes_set_force (struct object_size_info *osi, unsigned varno,
-			unsigned HOST_WIDE_INT val)
+static inline void
+object_sizes_initialize (struct object_size_info *osi, unsigned varno,
+			 tree val = NULL_TREE)
 {
-  object_sizes[osi->object_size_type][varno] = val;
-  return true;
+  int object_size_type = osi->object_size_type;
+
+  if (!val)
+    val = size_initval (object_size_type);
+  object_sizes[object_size_type][varno] = val;
 }
 
 /* Set size for VARNO corresponding to OSI to VAL if it is the new minimum or
    maximum.  */
 
 static inline bool
-object_sizes_set (struct object_size_info *osi, unsigned varno,
-		  unsigned HOST_WIDE_INT val)
+object_sizes_set (struct object_size_info *osi, unsigned varno, tree val)
 {
   int object_size_type = osi->object_size_type;
-  if ((object_size_type & OST_MINIMUM) == 0)
-    {
-      if (object_sizes[object_size_type][varno] < val)
-	return object_sizes_set_force (osi, varno, val);
-    }
-  else
-    {
-      if (object_sizes[object_size_type][varno] > val)
-	return object_sizes_set_force (osi, varno, val);
-    }
-  return false;
+  tree oldval = object_sizes[object_size_type][varno];
+
+  enum tree_code code = object_size_type & OST_MINIMUM ? MIN_EXPR : MAX_EXPR;
+
+  if (compare_tree_int (oldval, initval (object_size_type)) != 0)
+    val = size_binop (code, val, oldval);
+  object_sizes[object_size_type][varno] = val;
+  return tree_int_cst_compare (val, oldval) != 0;
 }
 
 /* Initialize OFFSET_LIMIT variable.  */
@@ -164,6 +199,13 @@  init_offset_limit (void)
   offset_limit /= 2;
 }
 
+/* Bytes at end of the object with SZ from offset OFFSET. */
+
+static tree
+size_for_offset (tree sz, tree offset)
+{
+  return size_binop (MINUS_EXPR, size_binop (MAX_EXPR, sz, offset), offset);
+}
 
 /* Compute offset of EXPR within VAR.  Return error_mark_node
    if unknown.  */
@@ -274,11 +316,11 @@  decl_init_size (tree decl, bool min)
 
 /* Compute __builtin_object_size for PTR, which is a ADDR_EXPR.
    OBJECT_SIZE_TYPE is the second argument from __builtin_object_size.
-   If unknown, return unknown (object_size_type).  */
+   If unknown, return size_unknown (object_size_type).  */
 
 static bool
 addr_object_size (struct object_size_info *osi, const_tree ptr,
-		  int object_size_type, unsigned HOST_WIDE_INT *psize)
+		  int object_size_type, tree *psize)
 {
   tree pt_var, pt_var_size = NULL_TREE, var_size, bytes;
 
@@ -286,7 +328,7 @@  addr_object_size (struct object_size_info *osi, const_tree ptr,
 
   /* Set to unknown and overwrite just before returning if the size
      could be determined.  */
-  *psize = unknown (object_size_type);
+  *psize = size_unknown (object_size_type);
 
   pt_var = TREE_OPERAND (ptr, 0);
   while (handled_component_p (pt_var))
@@ -297,7 +339,7 @@  addr_object_size (struct object_size_info *osi, const_tree ptr,
 
   if (TREE_CODE (pt_var) == MEM_REF)
     {
-      unsigned HOST_WIDE_INT sz;
+      tree sz;
 
       if (!osi || (object_size_type & OST_SUBOBJECT) != 0
 	  || TREE_CODE (TREE_OPERAND (pt_var, 0)) != SSA_NAME)
@@ -314,27 +356,22 @@  addr_object_size (struct object_size_info *osi, const_tree ptr,
 			    SSA_NAME_VERSION (var)))
 	    sz = object_sizes_get (osi, SSA_NAME_VERSION (var));
 	  else
-	    sz = unknown (object_size_type);
+	    sz = size_unknown (object_size_type);
 	}
-      if (sz != unknown (object_size_type))
+      if (!size_unknown_p (sz, object_size_type))
 	{
-	  offset_int mem_offset;
-	  if (mem_ref_offset (pt_var).is_constant (&mem_offset))
-	    {
-	      offset_int dsz = wi::sub (sz, mem_offset);
-	      if (wi::neg_p (dsz))
-		sz = 0;
-	      else if (wi::fits_uhwi_p (dsz))
-		sz = dsz.to_uhwi ();
-	      else
-		sz = unknown (object_size_type);
-	    }
+	  tree offset = TREE_OPERAND (pt_var, 1);
+	  if (TREE_CODE (offset) != INTEGER_CST
+	      || TREE_CODE (sz) != INTEGER_CST)
+	    sz = size_unknown (object_size_type);
 	  else
-	    sz = unknown (object_size_type);
+	    sz = size_for_offset (sz, offset);
 	}
 
-      if (sz != unknown (object_size_type) && sz < offset_limit)
-	pt_var_size = size_int (sz);
+      if (!size_unknown_p (sz, object_size_type)
+	  && TREE_CODE (sz) == INTEGER_CST
+	  && compare_tree_int (sz, offset_limit) < 0)
+	pt_var_size = sz;
     }
   else if (DECL_P (pt_var))
     {
@@ -350,8 +387,7 @@  addr_object_size (struct object_size_info *osi, const_tree ptr,
   if (pt_var_size)
     {
       /* Validate the size determined above.  */
-      if (!tree_fits_uhwi_p (pt_var_size)
-	  || tree_to_uhwi (pt_var_size) >= offset_limit)
+      if (compare_tree_int (pt_var_size, offset_limit) >= 0)
 	return false;
     }
 
@@ -502,22 +538,17 @@  addr_object_size (struct object_size_info *osi, const_tree ptr,
   else
     bytes = pt_var_size;
 
-  if (tree_fits_uhwi_p (bytes))
-    {
-      *psize = tree_to_uhwi (bytes);
-      return true;
-    }
-
-  return false;
+  *psize = bytes;
+  return true;
 }
 
 
 /* Compute __builtin_object_size for CALL, which is a GIMPLE_CALL.
    Handles calls to functions declared with attribute alloc_size.
    OBJECT_SIZE_TYPE is the second argument from __builtin_object_size.
-   If unknown, return unknown (object_size_type).  */
+   If unknown, return size_unknown (object_size_type).  */
 
-static unsigned HOST_WIDE_INT
+static tree
 alloc_object_size (const gcall *call, int object_size_type)
 {
   gcc_assert (is_gimple_call (call));
@@ -529,7 +560,7 @@  alloc_object_size (const gcall *call, int object_size_type)
     calltype = gimple_call_fntype (call);
 
   if (!calltype)
-    return unknown (object_size_type);
+    return size_unknown (object_size_type);
 
   /* Set to positions of alloc_size arguments.  */
   int arg1 = -1, arg2 = -1;
@@ -549,7 +580,7 @@  alloc_object_size (const gcall *call, int object_size_type)
       || (arg2 >= 0
 	  && (arg2 >= (int)gimple_call_num_args (call)
 	      || TREE_CODE (gimple_call_arg (call, arg2)) != INTEGER_CST)))
-    return unknown (object_size_type);
+    return size_unknown (object_size_type);
 
   tree bytes = NULL_TREE;
   if (arg2 >= 0)
@@ -559,10 +590,7 @@  alloc_object_size (const gcall *call, int object_size_type)
   else if (arg1 >= 0)
     bytes = fold_convert (sizetype, gimple_call_arg (call, arg1));
 
-  if (bytes && tree_fits_uhwi_p (bytes))
-    return tree_to_uhwi (bytes);
-
-  return unknown (object_size_type);
+  return bytes;
 }
 
 
@@ -598,13 +626,13 @@  pass_through_call (const gcall *call)
 
 bool
 compute_builtin_object_size (tree ptr, int object_size_type,
-			     unsigned HOST_WIDE_INT *psize)
+			     tree *psize)
 {
   gcc_assert (object_size_type >= 0 && object_size_type < OST_END);
 
   /* Set to unknown and overwrite just before returning if the size
      could be determined.  */
-  *psize = unknown (object_size_type);
+  *psize = size_unknown (object_size_type);
 
   if (! offset_limit)
     init_offset_limit ();
@@ -638,8 +666,7 @@  compute_builtin_object_size (tree ptr, int object_size_type,
 						  psize))
 		{
 		  /* Return zero when the offset is out of bounds.  */
-		  unsigned HOST_WIDE_INT off = tree_to_shwi (offset);
-		  *psize = off < *psize ? *psize - off : 0;
+		  *psize = size_for_offset (*psize, offset);
 		  return true;
 		}
 	    }
@@ -747,12 +774,13 @@  compute_builtin_object_size (tree ptr, int object_size_type,
 		print_generic_expr (dump_file, ssa_name (i),
 				    dump_flags);
 		fprintf (dump_file,
-			 ": %s %sobject size "
-			 HOST_WIDE_INT_PRINT_UNSIGNED "\n",
+			 ": %s %sobject size ",
 			 ((object_size_type & OST_MINIMUM) ? "minimum"
 			  : "maximum"),
-			 (object_size_type & OST_SUBOBJECT) ? "sub" : "",
-			 object_sizes_get (&osi, i));
+			 (object_size_type & OST_SUBOBJECT) ? "sub" : "");
+		print_generic_expr (dump_file, object_sizes_get (&osi, i),
+				    dump_flags);
+		fprintf (dump_file, "\n");
 	      }
 	}
 
@@ -761,7 +789,7 @@  compute_builtin_object_size (tree ptr, int object_size_type,
     }
 
   *psize = object_sizes_get (&osi, SSA_NAME_VERSION (ptr));
-  return *psize != unknown (object_size_type);
+  return !size_unknown_p (*psize, object_size_type);
 }
 
 /* Compute object_sizes for PTR, defined to VALUE, which is not an SSA_NAME.  */
@@ -771,7 +799,7 @@  expr_object_size (struct object_size_info *osi, tree ptr, tree value)
 {
   int object_size_type = osi->object_size_type;
   unsigned int varno = SSA_NAME_VERSION (ptr);
-  unsigned HOST_WIDE_INT bytes;
+  tree bytes;
 
   gcc_assert (!object_sizes_unknown_p (object_size_type, varno));
   gcc_assert (osi->pass == 0);
@@ -786,7 +814,7 @@  expr_object_size (struct object_size_info *osi, tree ptr, tree value)
   if (TREE_CODE (value) == ADDR_EXPR)
     addr_object_size (osi, value, object_size_type, &bytes);
   else
-    bytes = unknown (object_size_type);
+    bytes = size_unknown (object_size_type);
 
   object_sizes_set (osi, varno, bytes);
 }
@@ -799,16 +827,13 @@  call_object_size (struct object_size_info *osi, tree ptr, gcall *call)
 {
   int object_size_type = osi->object_size_type;
   unsigned int varno = SSA_NAME_VERSION (ptr);
-  unsigned HOST_WIDE_INT bytes;
 
   gcc_assert (is_gimple_call (call));
 
   gcc_assert (!object_sizes_unknown_p (object_size_type, varno));
   gcc_assert (osi->pass == 0);
 
-  bytes = alloc_object_size (call, object_size_type);
-
-  object_sizes_set (osi, varno, bytes);
+  object_sizes_set (osi, varno, alloc_object_size (call, object_size_type));
 }
 
 
@@ -823,7 +848,7 @@  unknown_object_size (struct object_size_info *osi, tree ptr)
   gcc_checking_assert (!object_sizes_unknown_p (object_size_type, varno));
   gcc_checking_assert (osi->pass == 0);
 
-  object_sizes_set (osi, varno, unknown (object_size_type));
+  object_sizes_set (osi, varno, size_unknown (object_size_type));
 }
 
 
@@ -832,17 +857,17 @@  unknown_object_size (struct object_size_info *osi, tree ptr)
 
 static bool
 merge_object_sizes (struct object_size_info *osi, tree dest, tree orig,
-		    unsigned HOST_WIDE_INT offset)
+		    tree offset)
 {
   int object_size_type = osi->object_size_type;
   unsigned int varno = SSA_NAME_VERSION (dest);
-  unsigned HOST_WIDE_INT orig_bytes;
+  tree orig_bytes;
 
   if (object_sizes_unknown_p (object_size_type, varno))
     return false;
-  if (offset >= offset_limit)
+  if (compare_tree_int (offset, offset_limit) >= 0)
     {
-      object_sizes_set (osi, varno, unknown (object_size_type));
+      object_sizes_set (osi, varno, size_unknown (object_size_type));
       return false;
     }
 
@@ -850,9 +875,8 @@  merge_object_sizes (struct object_size_info *osi, tree dest, tree orig,
     collect_object_sizes_for (osi, orig);
 
   orig_bytes = object_sizes_get (osi, SSA_NAME_VERSION (orig));
-  if (orig_bytes != unknown (object_size_type))
-    orig_bytes = (offset > orig_bytes)
-		 ? HOST_WIDE_INT_0U : orig_bytes - offset;
+  if (!size_unknown_p (orig_bytes, object_size_type))
+    orig_bytes = size_for_offset (orig_bytes, offset);
 
   osi->changed = object_sizes_set (osi, varno, orig_bytes);
 
@@ -869,7 +893,7 @@  plus_stmt_object_size (struct object_size_info *osi, tree var, gimple *stmt)
 {
   int object_size_type = osi->object_size_type;
   unsigned int varno = SSA_NAME_VERSION (var);
-  unsigned HOST_WIDE_INT bytes;
+  tree bytes;
   tree op0, op1;
 
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
@@ -895,28 +919,22 @@  plus_stmt_object_size (struct object_size_info *osi, tree var, gimple *stmt)
       && (TREE_CODE (op0) == SSA_NAME
 	  || TREE_CODE (op0) == ADDR_EXPR))
     {
-      if (! tree_fits_uhwi_p (op1))
-	bytes = unknown (object_size_type);
-      else if (TREE_CODE (op0) == SSA_NAME)
-	return merge_object_sizes (osi, var, op0, tree_to_uhwi (op1));
+      if (TREE_CODE (op0) == SSA_NAME)
+	return merge_object_sizes (osi, var, op0, op1);
       else
 	{
-	  unsigned HOST_WIDE_INT off = tree_to_uhwi (op1);
-
           /* op0 will be ADDR_EXPR here.  */
 	  addr_object_size (osi, op0, object_size_type, &bytes);
-	  if (bytes == unknown (object_size_type))
+	  if (size_unknown_p (bytes, object_size_type))
 	    ;
-	  else if (off > offset_limit)
-	    bytes = unknown (object_size_type);
-	  else if (off > bytes)
-	    bytes = 0;
+	  else if (compare_tree_int (op1, offset_limit) > 0)
+	    bytes = size_unknown (object_size_type);
 	  else
-	    bytes -= off;
+	    bytes = size_for_offset (bytes, op1);
 	}
     }
   else
-    bytes = unknown (object_size_type);
+    bytes = size_unknown (object_size_type);
 
   object_sizes_set (osi, varno, bytes);
   return false;
@@ -944,7 +962,7 @@  cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt)
   else_ = gimple_assign_rhs3 (stmt);
 
   if (TREE_CODE (then_) == SSA_NAME)
-    reexamine |= merge_object_sizes (osi, var, then_, 0);
+    reexamine |= merge_object_sizes (osi, var, then_, size_int (0));
   else
     expr_object_size (osi, var, then_);
 
@@ -952,7 +970,7 @@  cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt)
     return reexamine;
 
   if (TREE_CODE (else_) == SSA_NAME)
-    reexamine |= merge_object_sizes (osi, var, else_, 0);
+    reexamine |= merge_object_sizes (osi, var, else_, size_int (0));
   else
     expr_object_size (osi, var, else_);
 
@@ -969,11 +987,11 @@  cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt)
    object size is object size of the first operand minus the constant.
    If the constant is bigger than the number of remaining bytes until the
    end of the object, object size is 0, but if it is instead a pointer
-   subtraction, object size is unknown (object_size_type).
+   subtraction, object size is size_unknown (object_size_type).
    To differentiate addition from subtraction, ADDR_EXPR returns
-   unknown (object_size_type) for all objects bigger than half of the address
-   space, and constants less than half of the address space are considered
-   addition, while bigger constants subtraction.
+   size_unknown (object_size_type) for all objects bigger than half of the
+   address space, and constants less than half of the address space are
+   considered addition, while bigger constants subtraction.
    For a memcpy like GIMPLE_CALL that always returns one of its arguments, the
    object size is object size of that argument.
    Otherwise, object size is the maximum of object sizes of variables
@@ -996,7 +1014,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
 	{
 	  /* Initialize to 0 for maximum size and M1U for minimum size so that
 	     it gets immediately overridden.  */
-	  object_sizes_set_force (osi, varno, unknown (object_size_type ^ 2));
+	  object_sizes_initialize (osi, varno);
 	}
       else
 	{
@@ -1039,7 +1057,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
           {
             if (TREE_CODE (rhs) == SSA_NAME
                 && POINTER_TYPE_P (TREE_TYPE (rhs)))
-              reexamine = merge_object_sizes (osi, var, rhs, 0);
+	      reexamine = merge_object_sizes (osi, var, rhs, size_int (0));
             else
               expr_object_size (osi, var, rhs);
           }
@@ -1056,7 +1074,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
           {
             if (TREE_CODE (arg) == SSA_NAME
                 && POINTER_TYPE_P (TREE_TYPE (arg)))
-              reexamine = merge_object_sizes (osi, var, arg, 0);
+	      reexamine = merge_object_sizes (osi, var, arg, size_int (0));
             else
               expr_object_size (osi, var, arg);
           }
@@ -1067,7 +1085,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
 
     case GIMPLE_ASM:
       /* Pointers defined by __asm__ statements can point anywhere.  */
-      object_sizes_set (osi, varno, unknown (object_size_type));
+      object_sizes_set (osi, varno, size_unknown (object_size_type));
       break;
 
     case GIMPLE_NOP:
@@ -1076,7 +1094,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
 	expr_object_size (osi, var, SSA_NAME_VAR (var));
       else
 	/* Uninitialized SSA names point nowhere.  */
-	object_sizes_set (osi, varno, unknown (object_size_type));
+	object_sizes_set (osi, varno, size_unknown (object_size_type));
       break;
 
     case GIMPLE_PHI:
@@ -1091,7 +1109,7 @@  collect_object_sizes_for (struct object_size_info *osi, tree var)
 	      break;
 
 	    if (TREE_CODE (rhs) == SSA_NAME)
-	      reexamine |= merge_object_sizes (osi, var, rhs, 0);
+	      reexamine |= merge_object_sizes (osi, var, rhs, size_int (0));
 	    else if (osi->pass == 0)
 	      expr_object_size (osi, var, rhs);
 	  }
@@ -1142,7 +1160,7 @@  check_for_plus_in_loops_1 (struct object_size_info *osi, tree var,
 	      --sp;
 	      bitmap_clear_bit (osi->reexamine, *sp);
 	      bitmap_set_bit (computed[osi->object_size_type], *sp);
-	      object_sizes_set_force (osi, *sp, 0);
+	      object_sizes_set (osi, *sp, size_int (0));
 	      if (*sp == varno)
 		break;
 	    }
@@ -1333,17 +1351,16 @@  object_sizes_execute (function *fun, bool insert_min_max_p)
 			  || TREE_CODE (ptr) == SSA_NAME))
 		    {
 		      tree type = TREE_TYPE (lhs);
-		      unsigned HOST_WIDE_INT bytes;
+		      tree bytes;
 		      if (compute_builtin_object_size (ptr, object_size_type,
-						       &bytes)
-			  && wi::fits_to_tree_p (bytes, type))
+						       &bytes))
 			{
 			  tree tem = make_ssa_name (type);
 			  gimple_call_set_lhs (call, tem);
 			  enum tree_code code
 			    = (object_size_type & OST_MINIMUM
 			       ? MAX_EXPR : MIN_EXPR);
-			  tree cst = build_int_cstu (type, bytes);
+			  tree cst = fold_convert (type, bytes);
 			  gimple *g
 			    = gimple_build_assign (lhs, code, tem, cst);
 			  gsi_insert_after (&i, g, GSI_NEW_STMT);
diff --git a/gcc/tree-object-size.h b/gcc/tree-object-size.h
index ef18aea50db..b2d6a58324c 100644
--- a/gcc/tree-object-size.h
+++ b/gcc/tree-object-size.h
@@ -22,7 +22,7 @@  along with GCC; see the file COPYING3.  If not see
 
 extern void init_object_sizes (void);
 extern void fini_object_sizes (void);
-extern bool compute_builtin_object_size (tree, int, unsigned HOST_WIDE_INT *);
+extern bool compute_builtin_object_size (tree, int, tree *);
 extern tree decl_init_size (tree, bool);
 
 #endif  // GCC_TREE_OBJECT_SIZE_H
diff --git a/gcc/ubsan.c b/gcc/ubsan.c
index ba9adf0ad96..2b1b7571e25 100644
--- a/gcc/ubsan.c
+++ b/gcc/ubsan.c
@@ -2160,33 +2160,33 @@  instrument_object_size (gimple_stmt_iterator *gsi, tree t, bool is_lhs)
   if (decl_p)
     base_addr = build1 (ADDR_EXPR,
 			build_pointer_type (TREE_TYPE (base)), base);
-  unsigned HOST_WIDE_INT size;
-  if (compute_builtin_object_size (base_addr, 0, &size))
-    sizet = build_int_cst (sizetype, size);
-  else if (optimize)
-    {
-      if (LOCATION_LOCUS (loc) == UNKNOWN_LOCATION)
-	loc = input_location;
-      /* Generate __builtin_object_size call.  */
-      sizet = builtin_decl_explicit (BUILT_IN_OBJECT_SIZE);
-      sizet = build_call_expr_loc (loc, sizet, 2, base_addr,
-				   integer_zero_node);
-      sizet = force_gimple_operand_gsi (gsi, sizet, false, NULL_TREE, true,
-					GSI_SAME_STMT);
-      /* If the call above didn't end up being an integer constant, go one
-	 statement back and get the __builtin_object_size stmt.  Save it,
-	 we might need it later.  */
-      if (SSA_VAR_P (sizet))
+  if (!compute_builtin_object_size (base_addr, 0, &sizet))
+    {
+      if (optimize)
 	{
-	  gsi_prev (gsi);
-	  bos_stmt = gsi_stmt (*gsi);
+	  if (LOCATION_LOCUS (loc) == UNKNOWN_LOCATION)
+	    loc = input_location;
+	  /* Generate __builtin_object_size call.  */
+	  sizet = builtin_decl_explicit (BUILT_IN_OBJECT_SIZE);
+	  sizet = build_call_expr_loc (loc, sizet, 2, base_addr,
+				       integer_zero_node);
+	  sizet = force_gimple_operand_gsi (gsi, sizet, false, NULL_TREE, true,
+					    GSI_SAME_STMT);
+	  /* If the call above didn't end up being an integer constant, go one
+	     statement back and get the __builtin_object_size stmt.  Save it,
+	     we might need it later.  */
+	  if (SSA_VAR_P (sizet))
+	    {
+	      gsi_prev (gsi);
+	      bos_stmt = gsi_stmt (*gsi);
 
-	  /* Move on to where we were.  */
-	  gsi_next (gsi);
+	      /* Move on to where we were.  */
+	      gsi_next (gsi);
+	    }
 	}
+      else
+	return;
     }
-  else
-    return;
 
   /* Generate UBSAN_OBJECT_SIZE (ptr, ptr+sizeof(*ptr)-base, objsize, ckind)
      call.  */