[v2] constrain conservative string lengths to array sizes [PR104119]

Message ID b578e87f-9e7a-4105-4436-e311271c1eca@gmail.com
State Committed
Commit 3c9f762ad02f398c27275688c3494332f69237f5
Headers
Series [v2] constrain conservative string lengths to array sizes [PR104119] |

Commit Message

Martin Sebor Jan. 20, 2022, 10:54 p.m. UTC
  The updated patch ensures the tighter bound isn't used to compute
the sprintf result and adds a test to verify that.  (This is messy
in the strlen/sprintf pass and should be cleaned up to avoid this
mistake in the future.)

Rested on x86_64-linux.

On 1/19/22 18:20, Martin Sebor wrote:
> The attached patch suppresses a class of unexpected -Wformat-overflow
> (and -truncation) warnings introduced as a result of better range info
> with the integration of the strlen pass with Ranger.
> 
> The sprintf warning code relies on the strlen pass data to determine
> the lengths of string arguments to %s directives.  When the data for
> a string are present, such as after a strlen call, the length can be
> either a constant or, in the case of interest, a range (including
> [N, PTRDIFF_MAX - 2] for a string of unbounded length).  When absent
> because no string call has been seen yet, the string length is
> considered to be bounded by the size of the array it's stored in.
> This constrains the maximum number of bytes output by the %s directive
> and reduces false positives.
> 
> The problem this patch addresses is that in the interesting case there
> is no logic similar to the last ("no data") case, and so the maximum
> number of bytes can be in excess of the size of the array.  The patch
> does it by computing the size of the object (or member) in which
> the string is stored and using its size minus 1 as the upper bound
> on the length.  To do that, I had to adjust the APIs to pass in
> the pointer_query instance of the range_query.  The meat of the change
> is in the new get_maxbound() function.
> 
> There might be opportunities to do better still.  I'll try to look
> into them if I still have time.
> 
> Tested on x86_64-linux.
> 
> Martin
  

Comments

Jason Merrill Feb. 3, 2022, 2:44 p.m. UTC | #1
On 1/20/22 17:54, Martin Sebor wrote:
> The updated patch ensures the tighter bound isn't used to compute
> the sprintf result and adds a test to verify that.  (This is messy
> in the strlen/sprintf pass and should be cleaned up to avoid this
> mistake in the future.)
> 
> Rested on x86_64-linux.

OK, thanks.

> On 1/19/22 18:20, Martin Sebor wrote:
>> The attached patch suppresses a class of unexpected -Wformat-overflow
>> (and -truncation) warnings introduced as a result of better range info
>> with the integration of the strlen pass with Ranger.
>>
>> The sprintf warning code relies on the strlen pass data to determine
>> the lengths of string arguments to %s directives.  When the data for
>> a string are present, such as after a strlen call, the length can be
>> either a constant or, in the case of interest, a range (including
>> [N, PTRDIFF_MAX - 2] for a string of unbounded length).  When absent
>> because no string call has been seen yet, the string length is
>> considered to be bounded by the size of the array it's stored in.
>> This constrains the maximum number of bytes output by the %s directive
>> and reduces false positives.
>>
>> The problem this patch addresses is that in the interesting case there
>> is no logic similar to the last ("no data") case, and so the maximum
>> number of bytes can be in excess of the size of the array.  The patch
>> does it by computing the size of the object (or member) in which
>> the string is stored and using its size minus 1 as the upper bound
>> on the length.  To do that, I had to adjust the APIs to pass in
>> the pointer_query instance of the range_query.  The meat of the change
>> is in the new get_maxbound() function.
>>
>> There might be opportunities to do better still.  I'll try to look
>> into them if I still have time.
>>
>> Tested on x86_64-linux.
>>
>> Martin
  

Patch

Constrain conservative string lengths to array sizes [PR104119].

Resolves:
PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration

gcc/ChangeLog:

	PR tree-optimization/104119
	* gimple-ssa-sprintf.cc (struct directive): Change argument type.
	(format_none): Same.
	(format_percent): Same.
	(format_integer): Same.
	(format_floating): Same.
	(get_string_length): Same.
	(format_character): Same.
	(format_string): Same.
	(format_plain): Same.
	(format_directive): Same.
	(compute_format_length): Same.
	(handle_printf_call): Same.
	* tree-ssa-strlen.cc (get_range_strlen_dynamic): Same.   Call
	get_maxbound.
	(get_range_strlen_phi): Same.
	(get_maxbound): New function.
	(strlen_pass::get_len_or_size): Adjust to parameter change.
	* tree-ssa-strlen.h (get_range_strlen_dynamic): Change argument type.

gcc/testsuite/ChangeLog:

	PR tree-optimization/104119
	* gcc.dg/tree-ssa/builtin-snprintf-13.c: New test.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-29.c: New test.

diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc
index 98ab563a01b..c93f12f90b5 100644
--- a/gcc/gimple-ssa-sprintf.cc
+++ b/gcc/gimple-ssa-sprintf.cc
@@ -600,7 +600,7 @@  struct directive
 
   /* Format conversion function that given a directive and an argument
      returns the formatting result.  */
-  fmtresult (*fmtfunc) (const directive &, tree, range_query *);
+  fmtresult (*fmtfunc) (const directive &, tree, pointer_query &);
 
   /* Return True when the format flag CHR has been used.  */
   bool get_flag (char chr) const
@@ -968,7 +968,7 @@  directive::set_precision (tree arg, range_query *query)
 /* Return the result of formatting a no-op directive (such as '%n').  */
 
 static fmtresult
-format_none (const directive &, tree, range_query *)
+format_none (const directive &, tree, pointer_query &)
 {
   fmtresult res (0);
   return res;
@@ -977,7 +977,7 @@  format_none (const directive &, tree, range_query *)
 /* Return the result of formatting the '%%' directive.  */
 
 static fmtresult
-format_percent (const directive &, tree, range_query *)
+format_percent (const directive &, tree, pointer_query &)
 {
   fmtresult res (1);
   return res;
@@ -1199,7 +1199,7 @@  adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
    used when the directive argument or its value isn't known.  */
 
 static fmtresult
-format_integer (const directive &dir, tree arg, range_query *query)
+format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 {
   tree intmax_type_node;
   tree uintmax_type_node;
@@ -1383,7 +1383,7 @@  format_integer (const directive &dir, tree arg, range_query *query)
       /* Try to determine the range of values of the integer argument
 	 (range information is not available for pointers).  */
       value_range vr;
-      query->range_of_expr (vr, arg, dir.info->callstmt);
+      ptr_qry.rvals->range_of_expr (vr, arg, dir.info->callstmt);
 
       if (!vr.varying_p () && !vr.undefined_p ())
 	{
@@ -1414,7 +1414,7 @@  format_integer (const directive &dir, tree arg, range_query *query)
 	      if (code == INTEGER_CST)
 		{
 		  arg = gimple_assign_rhs1 (def);
-		  return format_integer (dir, arg, query);
+		  return format_integer (dir, arg, ptr_qry);
 		}
 
 	      if (code == NOP_EXPR)
@@ -1459,16 +1459,16 @@  format_integer (const directive &dir, tree arg, range_query *query)
       /* For unsigned conversions/directives or signed when
 	 the minimum is positive, use the minimum and maximum to compute
 	 the shortest and longest output, respectively.  */
-      res.range.min = format_integer (dir, argmin, query).range.min;
-      res.range.max = format_integer (dir, argmax, query).range.max;
+      res.range.min = format_integer (dir, argmin, ptr_qry).range.min;
+      res.range.max = format_integer (dir, argmax, ptr_qry).range.max;
     }
   else if (tree_int_cst_sgn (argmax) < 0)
     {
       /* For signed conversions/directives if maximum is negative,
 	 use the minimum as the longest output and maximum as the
 	 shortest output.  */
-      res.range.min = format_integer (dir, argmax, query).range.min;
-      res.range.max = format_integer (dir, argmin, query).range.max;
+      res.range.min = format_integer (dir, argmax, ptr_qry).range.min;
+      res.range.max = format_integer (dir, argmin, ptr_qry).range.max;
     }
   else
     {
@@ -1477,11 +1477,11 @@  format_integer (const directive &dir, tree arg, range_query *query)
 	 length of the output of both minimum and maximum and pick the
 	 longer.  */
       unsigned HOST_WIDE_INT max1
-	= format_integer (dir, argmin, query).range.max;
+	= format_integer (dir, argmin, ptr_qry).range.max;
       unsigned HOST_WIDE_INT max2
-	= format_integer (dir, argmax, query).range.max;
+	= format_integer (dir, argmax, ptr_qry).range.max;
       res.range.min
-	= format_integer (dir, integer_zero_node, query).range.min;
+	= format_integer (dir, integer_zero_node, ptr_qry).range.min;
       res.range.max = MAX (max1, max2);
     }
 
@@ -1830,7 +1830,7 @@  format_floating (const directive &dir, const HOST_WIDE_INT prec[2])
    ARG.  */
 
 static fmtresult
-format_floating (const directive &dir, tree arg, range_query *)
+format_floating (const directive &dir, tree arg, pointer_query &)
 {
   HOST_WIDE_INT prec[] = { dir.prec[0], dir.prec[1] };
   tree type = (dir.modifier == FMT_LEN_L || dir.modifier == FMT_LEN_ll
@@ -2025,7 +2025,7 @@  format_floating (const directive &dir, tree arg, range_query *)
 
 static fmtresult
 get_string_length (tree str, gimple *stmt, unsigned HOST_WIDE_INT max_size,
-		   unsigned eltsize, range_query *query)
+		   unsigned eltsize, pointer_query &ptr_qry)
 {
   if (!str)
     return fmtresult ();
@@ -2036,7 +2036,7 @@  get_string_length (tree str, gimple *stmt, unsigned HOST_WIDE_INT max_size,
   c_strlen_data lendata = { };
   lendata.maxbound = str;
   if (eltsize == 1)
-    get_range_strlen_dynamic (str, stmt, &lendata, query);
+    get_range_strlen_dynamic (str, stmt, &lendata, ptr_qry);
   else
     {
       /* Determine the length of the shortest and longest string referenced
@@ -2084,17 +2084,30 @@  get_string_length (tree str, gimple *stmt, unsigned HOST_WIDE_INT max_size,
       return res;
     }
 
+  /* The minimum length of the string.  */
   HOST_WIDE_INT min
     = (tree_fits_uhwi_p (lendata.minlen)
        ? tree_to_uhwi (lendata.minlen)
        : 0);
 
+  /* The maximum length of the string; initially set to MAXBOUND which
+     may be less than MAXLEN, but may be adjusted up below.  */
   HOST_WIDE_INT max
     = (lendata.maxbound && tree_fits_uhwi_p (lendata.maxbound)
        ? tree_to_uhwi (lendata.maxbound)
        : HOST_WIDE_INT_M1U);
 
-  const bool unbounded = integer_all_onesp (lendata.maxlen);
+  /* True if either the maximum length is unknown or (conservatively)
+     the array bound is less than the maximum length.  That can happen
+     when the length of the string is unknown but the array in which
+     the string is stored is a member of a struct.  The warning uses
+     the size of the member as the upper bound but the optimization
+     doesn't.  The optimization could still use the size of
+     enclosing object as the upper bound but that's not done here.  */
+  const bool unbounded
+    = (integer_all_onesp (lendata.maxlen)
+       || (lendata.maxbound
+	   && tree_int_cst_lt (lendata.maxbound, lendata.maxlen)));
 
   /* Set the max/likely counters to unbounded when a minimum is known
      but the maximum length isn't bounded.  This implies that STR is
@@ -2147,7 +2160,7 @@  get_string_length (tree str, gimple *stmt, unsigned HOST_WIDE_INT max_size,
    vsprinf).  */
 
 static fmtresult
-format_character (const directive &dir, tree arg, range_query *query)
+format_character (const directive &dir, tree arg, pointer_query &ptr_qry)
 {
   fmtresult res;
 
@@ -2160,7 +2173,8 @@  format_character (const directive &dir, tree arg, range_query *query)
       res.range.min = 0;
 
       HOST_WIDE_INT min, max;
-      if (get_int_range (arg, dir.info->callstmt, &min, &max, false, 0, query))
+      if (get_int_range (arg, dir.info->callstmt, &min, &max, false, 0,
+			 ptr_qry.rvals))
 	{
 	  if (min == 0 && max == 0)
 	    {
@@ -2457,7 +2471,7 @@  alias_offset (tree arg, HOST_WIDE_INT *arg_size,
    vsprinf).  */
 
 static fmtresult
-format_string (const directive &dir, tree arg, range_query *query)
+format_string (const directive &dir, tree arg, pointer_query &ptr_qry)
 {
   fmtresult res;
 
@@ -2495,7 +2509,7 @@  format_string (const directive &dir, tree arg, range_query *query)
     }
 
   fmtresult slen =
-    get_string_length (arg, dir.info->callstmt, arg_size, count_by, query);
+    get_string_length (arg, dir.info->callstmt, arg_size, count_by, ptr_qry);
   if (slen.range.min == slen.range.max
       && slen.range.min < HOST_WIDE_INT_MAX)
     {
@@ -2667,7 +2681,7 @@  format_string (const directive &dir, tree arg, range_query *query)
 /* Format plain string (part of the format string itself).  */
 
 static fmtresult
-format_plain (const directive &dir, tree, range_query *)
+format_plain (const directive &dir, tree, pointer_query &)
 {
   fmtresult res (dir.len);
   return res;
@@ -3063,7 +3077,7 @@  bytes_remaining (unsigned HOST_WIDE_INT navail, const format_result &res)
 static bool
 format_directive (const call_info &info,
 		  format_result *res, const directive &dir,
-		  range_query *query)
+		  pointer_query &ptr_qry)
 {
   /* Offset of the beginning of the directive from the beginning
      of the format string.  */
@@ -3088,7 +3102,7 @@  format_directive (const call_info &info,
     return false;
 
   /* Compute the range of lengths of the formatted output.  */
-  fmtresult fmtres = dir.fmtfunc (dir, dir.arg, query);
+  fmtresult fmtres = dir.fmtfunc (dir, dir.arg, ptr_qry);
 
   /* Record whether the output of all directives is known to be
      bounded by some maximum, implying that their arguments are
@@ -3990,7 +4004,8 @@  maybe_warn_overlap (call_info &info, format_result *res)
    that caused the processing to be terminated early).  */
 
 static bool
-compute_format_length (call_info &info, format_result *res, range_query *query)
+compute_format_length (call_info &info, format_result *res,
+		       pointer_query &ptr_qry)
 {
   if (dump_file)
     {
@@ -4027,10 +4042,10 @@  compute_format_length (call_info &info, format_result *res, range_query *query)
     {
       directive dir (&info, dirno);
 
-      size_t n = parse_directive (info, dir, res, pf, &argno, query);
+      size_t n = parse_directive (info, dir, res, pf, &argno, ptr_qry.rvals);
 
       /* Return failure if the format function fails.  */
-      if (!format_directive (info, res, dir, query))
+      if (!format_directive (info, res, dir, ptr_qry))
 	return false;
 
       /* Return success when the directive is zero bytes long and it's
@@ -4700,7 +4715,7 @@  handle_printf_call (gimple_stmt_iterator *gsi, pointer_query &ptr_qry)
      never set to true again).  */
   res.posunder4k = posunder4k && dstptr;
 
-  bool success = compute_format_length (info, &res, ptr_qry.rvals);
+  bool success = compute_format_length (info, &res, ptr_qry);
   if (res.warned)
     suppress_warning (info.callstmt, info.warnopt ());
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-13.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-13.c
new file mode 100644
index 00000000000..c5b5f0c6b65
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-13.c
@@ -0,0 +1,131 @@ 
+/* PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen
+   in ILP32 since Ranger integration
+   Verify that unlike -Wformat-overflow the sprintf optimization doesn't
+   assume the length of a string isn't bounded by the size of the array
+   member it's stored in.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+void* memcpy (void*, const void*, size_t);
+int snprintf (char*, size_t, const char*, ...);
+char* strcpy (char*, const char*);
+size_t strlen (const char*);
+
+extern void keep_call_on_line (int);
+extern void elim_call_on_line (int);
+
+void sink (void*, ...);
+
+struct __attribute__ ((packed)) S
+{
+  char a4[4], b4[4], ax[];
+};
+
+extern struct S es;
+
+void test_extern_decl_memcpy (void)
+{
+  struct S *p = &es;
+
+  /* Set strlen (P->A4) to [3, PTRDIFF - 2].   */
+  memcpy (p->a4, "123", 3);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+}
+
+void test_extern_decl_strcpy_3 (void)
+{
+  struct S *p = &es;
+
+  /* Set strlen (P->A4) to 3.   */
+  strcpy (p->a4, "123");
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    elim_call_on_line (__LINE__);
+}
+
+void test_extern_decl_strcpy_X (const char *s)
+{
+  struct S *p = &es;
+
+  /* Set strlen (P->A4) to [0, PTRDIFF_MAX - 2].   */
+  strcpy (p->a4, s);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+}
+
+size_t test_extern_decl_strlen (void)
+{
+  struct S *p = &es;
+
+  /* Set strlen (P->A4) to [0, PTRDIFF - 2].   */
+  size_t n = strlen (p->a4);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+  return n;
+}
+
+
+static struct S ss;
+
+/* Store and read SS to prevent optimizers from assuming it's unchanged.  */
+
+extern void set_ss (struct S *p)
+{
+  if (ss.a4[(unsigned char)*p->a4])
+    __builtin_memcpy (&ss, p, sizeof ss);
+}
+
+
+void test_static_decl_memcpy (void)
+{
+  struct S *p = &ss;
+
+  /* Set strlen (P->A4) to [3, PTRDIFF - 2].   */
+  memcpy (p->a4, "123", 3);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+}
+
+void test_static_decl_strcpy_3 (void)
+{
+  struct S *p = &ss;
+
+  /* Set strlen (P->A4) to 3.   */
+  strcpy (p->a4, "123");
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    elim_call_on_line (__LINE__);
+}
+
+void test_static_decl_strcpy_X (const char *s)
+{
+  struct S *p = &ss;
+
+  /* Set strlen (P->A4) to [0, PTRDIFF_MAX - 2].   */
+  strcpy (p->a4, s);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+}
+
+size_t test_static_decl_strlen (void)
+{
+  struct S *p = &ss;
+
+  /* Set strlen (P->A4) to [0, PTRDIFF - 2].   */
+  size_t n = strlen (p->a4);
+  int i = snprintf (0, 0, "%s", p->a4);
+  if (i > 4)
+    keep_call_on_line (__LINE__);
+  return n;
+}
+
+/* { dg-final { scan-tree-dump-times "keep_call_on_line" 6 "optimized" } }
+   { dg-final { scan-tree-dump-not "elim_call_on_line" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-29.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-29.c
new file mode 100644
index 00000000000..3591f4f1851
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-29.c
@@ -0,0 +1,179 @@ 
+/* PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen
+   in ILP32 since Ranger integration
+   { dg-do compile }
+   { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+void* malloc (size_t);
+int sprintf (char*, const char*, ...);
+size_t strlen (const char*);
+
+void sink (void*, ...);
+
+struct __attribute__ ((packed)) S
+{
+  char a3[3], a4[4], a5[5], a6[6], a7[7], a8[8], a9[9], ax[];
+};
+
+extern struct S s;
+extern char a4[4], a7[7], a8[8];
+
+
+void test_decl (void)
+{
+  struct S *p = &s;
+
+  {
+    size_t n = strlen (p->a3);
+    sprintf (a4, "%s", p->a3);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a4);
+    sprintf (a4, "%s", p->a4);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a5);
+    sprintf (a4, "%s", p->a5);    // { dg-warning "may write a terminating nul past the end" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a7);
+    sprintf (a8, "%s", p->a7);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a8);
+    sprintf (a8, "%s", p->a8);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a9);
+    sprintf (a8, "%s", p->a9);    // { dg-warning "may write a terminating nul past the end " }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->ax);
+    sprintf (a7, "%s", p->ax);    // { dg-bogus "-Wformat-overflow" "pr??????" { xfail ilp32 } }
+    sink (a7, n);
+  }
+}
+
+
+/* Verify the warning with a pointer to an allocated object with nonstant
+   size in known range.  */
+
+void test_alloc_5_8 (int n)
+{
+  if (n < 5 || 8 < n)
+    n = 5;
+
+  struct S *p = (struct S*)malloc (sizeof *p + n);
+  sink (p);   // initialize *p
+
+  {
+    size_t n = strlen (p->a3);
+    sprintf (a4, "%s", p->a3);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a4);
+    sprintf (a4, "%s", p->a4);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a5);
+    sprintf (a4, "%s", p->a5);    // { dg-warning "may write a terminating nul past the end" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a7);
+    sprintf (a8, "%s", p->a7);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a8);
+    sprintf (a8, "%s", p->a8);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a9);
+    sprintf (a8, "%s", p->a9);    // { dg-warning "may write a terminating nul past the end " }
+    sink (a8, n);
+  }
+
+  {
+    /* The size of the flexible array member p->ax is between 5 and 8
+       bytes so the length of the string stored in it is at most 7.
+       Verify the warning triggers based on its size and also gets
+       the length right.  */
+    size_t n = strlen (p->ax);
+    sprintf (a4, "%s", p->ax);    // { dg-warning "writing up to 7 bytes " }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->ax);
+    sprintf (a8, "%s", p->ax);
+    sink (a8, n);
+  }
+}
+
+
+void test_ptr (struct S *p)
+{
+  {
+    size_t n = strlen (p->a3);
+    sprintf (a4, "%s", p->a3);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a4);
+    sprintf (a4, "%s", p->a4);    // { dg-bogus "-Wformat-overflow" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a5);
+    sprintf (a4, "%s", p->a5);    // { dg-warning "may write a terminating nul past the end" }
+    sink (a4, n);
+  }
+
+  {
+    size_t n = strlen (p->a7);
+    sprintf (a8, "%s", p->a7);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a8);
+    sprintf (a8, "%s", p->a8);    // { dg-bogus "-Wformat-overflow" }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->a9);
+    sprintf (a8, "%s", p->a9);    // { dg-warning "may write a terminating nul past the end " }
+    sink (a8, n);
+  }
+
+  {
+    size_t n = strlen (p->ax);
+    sprintf (a8, "%s", p->ax);    // { dg-bogus "-Wformat-overflow" "pr??????" { xfail ilp32 } }
+    sink (a8, n);
+  }
+}
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index df97b86d5f8..b5f800e73ab 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -193,8 +193,8 @@  struct laststmt_struct
 } laststmt;
 
 static int get_stridx_plus_constant (strinfo *, unsigned HOST_WIDE_INT, tree);
-static bool get_range_strlen_dynamic (tree, gimple *s, c_strlen_data *,
-				      bitmap, range_query *, unsigned *);
+static bool get_range_strlen_dynamic (tree, gimple *, c_strlen_data *,
+				      bitmap, pointer_query *, unsigned *);
 
 /* Sets MINMAX to either the constant value or the range VAL is in
    and returns either the constant value or VAL on success or null
@@ -1094,7 +1094,7 @@  dump_strlen_info (FILE *fp, gimple *stmt, range_query *rvals)
 static bool
 get_range_strlen_phi (tree src, gphi *phi,
 		      c_strlen_data *pdata, bitmap visited,
-		      range_query *rvals, unsigned *pssa_def_max)
+		      pointer_query *ptr_qry, unsigned *pssa_def_max)
 {
   if (!bitmap_set_bit (visited, SSA_NAME_VERSION (src)))
     return true;
@@ -1113,7 +1113,7 @@  get_range_strlen_phi (tree src, gphi *phi,
 	continue;
 
       c_strlen_data argdata = { };
-      if (!get_range_strlen_dynamic (arg, phi, &argdata, visited, rvals,
+      if (!get_range_strlen_dynamic (arg, phi, &argdata, visited, ptr_qry,
 				     pssa_def_max))
 	{
 	  pdata->maxlen = build_all_ones_cst (size_type_node);
@@ -1159,6 +1159,48 @@  get_range_strlen_phi (tree src, gphi *phi,
   return true;
 }
 
+/* Return the maximum possible length of the string PTR that's less
+   than MAXLEN given the size of the object of subobject it points
+   to at the given STMT.  MAXLEN is the maximum length of the string
+   determined so far.  Return null when no such maximum can be
+   determined.  */
+
+static tree
+get_maxbound (tree ptr, gimple *stmt, offset_int maxlen,
+	      pointer_query *ptr_qry)
+{
+  access_ref aref;
+  if (!ptr_qry->get_ref (ptr, stmt, &aref))
+    return NULL_TREE;
+
+  offset_int sizrem = aref.size_remaining ();
+  if (sizrem <= 0)
+    return NULL_TREE;
+
+  if (sizrem < maxlen)
+    maxlen = sizrem - 1;
+
+  /* Try to determine the maximum from the subobject at the offset.
+     This handles MEM [&some-struct, member-offset] that's often
+     the result of folding COMPONENT_REF [some-struct, member].  */
+  tree reftype = TREE_TYPE (aref.ref);
+  if (!RECORD_OR_UNION_TYPE_P (reftype)
+      || aref.offrng[0] != aref.offrng[1]
+      || !wi::fits_shwi_p (aref.offrng[0]))
+    return wide_int_to_tree (size_type_node, maxlen);
+
+  HOST_WIDE_INT off = aref.offrng[0].to_shwi ();
+  tree fld = field_at_offset (reftype, NULL_TREE, off);
+  if (!fld || !DECL_SIZE_UNIT (fld))
+    return wide_int_to_tree (size_type_node, maxlen);
+
+  offset_int size = wi::to_offset (DECL_SIZE_UNIT (fld));
+  if (maxlen < size)
+    return wide_int_to_tree (size_type_node, maxlen);
+
+  return wide_int_to_tree (size_type_node, size - 1);
+}
+
 /* Attempt to determine the length of the string SRC.  On success, store
    the length in *PDATA and return true.  Otherwise, return false.
    VISITED is a bitmap of visited PHI nodes.  RVALS points to the valuation
@@ -1168,7 +1210,7 @@  get_range_strlen_phi (tree src, gphi *phi,
 static bool
 get_range_strlen_dynamic (tree src, gimple *stmt,
 			  c_strlen_data *pdata, bitmap visited,
-			  range_query *rvals, unsigned *pssa_def_max)
+			  pointer_query *ptr_qry, unsigned *pssa_def_max)
 {
   int idx = get_stridx (src, stmt);
   if (!idx)
@@ -1177,7 +1219,7 @@  get_range_strlen_dynamic (tree src, gimple *stmt,
 	{
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (src);
 	  if (gphi *phi = dyn_cast<gphi *>(def_stmt))
-	    return get_range_strlen_phi (src, phi, pdata, visited, rvals,
+	    return get_range_strlen_phi (src, phi, pdata, visited, ptr_qry,
 					 pssa_def_max);
 	}
 
@@ -1206,7 +1248,7 @@  get_range_strlen_dynamic (tree src, gimple *stmt,
 	  else if (TREE_CODE (si->nonzero_chars) == SSA_NAME)
 	    {
 	      value_range vr;
-	      rvals->range_of_expr (vr, si->nonzero_chars, si->stmt);
+	      ptr_qry->rvals->range_of_expr (vr, si->nonzero_chars, si->stmt);
 	      if (range_int_cst_p (&vr))
 		{
 		  pdata->minlen = vr.min ();
@@ -1250,12 +1292,16 @@  get_range_strlen_dynamic (tree src, gimple *stmt,
       else if (pdata->minlen && TREE_CODE (pdata->minlen) == SSA_NAME)
 	{
 	  value_range vr;
-	  rvals->range_of_expr (vr, si->nonzero_chars, stmt);
+	  ptr_qry->rvals->range_of_expr (vr, si->nonzero_chars, stmt);
 	  if (range_int_cst_p (&vr))
 	    {
 	      pdata->minlen = vr.min ();
 	      pdata->maxlen = vr.max ();
-	      pdata->maxbound = pdata->maxlen;
+	      offset_int max = offset_int::from (vr.upper_bound (0), SIGNED);
+	      if (tree maxbound = get_maxbound (si->ptr, stmt, max, ptr_qry))
+		pdata->maxbound = maxbound;
+	      else
+		pdata->maxbound = pdata->maxlen;
 	    }
 	  else
 	    {
@@ -1293,13 +1339,13 @@  get_range_strlen_dynamic (tree src, gimple *stmt,
 
 void
 get_range_strlen_dynamic (tree src, gimple *stmt, c_strlen_data *pdata,
-			  range_query *rvals)
+			  pointer_query &ptr_qry)
 {
   auto_bitmap visited;
   tree maxbound = pdata->maxbound;
 
   unsigned limit = param_ssa_name_def_chain_limit;
-  if (!get_range_strlen_dynamic (src, stmt, pdata, visited, rvals, &limit))
+  if (!get_range_strlen_dynamic (src, stmt, pdata, visited, &ptr_qry, &limit))
     {
       /* On failure extend the length range to an impossible maximum
 	 (a valid MAXLEN must be less than PTRDIFF_MAX - 1).  Other
@@ -4030,7 +4076,7 @@  strlen_pass::get_len_or_size (gimple *stmt, tree arg, int idx,
   /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
      to have it set to the length of the longest string in a PHI.  */
   lendata.maxbound = arg;
-  get_range_strlen_dynamic (arg, stmt, &lendata, ptr_qry.rvals);
+  get_range_strlen_dynamic (arg, stmt, &lendata, ptr_qry);
 
   unsigned HOST_WIDE_INT maxbound = HOST_WIDE_INT_M1U;
   if (tree_fits_uhwi_p (lendata.maxbound)
diff --git a/gcc/tree-ssa-strlen.h b/gcc/tree-ssa-strlen.h
index eca14814570..8d155450db8 100644
--- a/gcc/tree-ssa-strlen.h
+++ b/gcc/tree-ssa-strlen.h
@@ -33,7 +33,7 @@  extern tree get_range (tree, gimple *, wide_int[2],
 
 struct c_strlen_data;
 extern void get_range_strlen_dynamic (tree, gimple *, c_strlen_data *,
-				      class range_query *);
+				      pointer_query &);
 
 /* APIs internal to strlen pass.  Defined in gimple-ssa-sprintf.cc.  */
 extern bool handle_printf_call (gimple_stmt_iterator *, pointer_query &);