[trunk,PR104308,analyzer] handle memmove like memcpy

Message ID 20221202191104.3812885-1-dmalcolm@redhat.com
State Committed
Delegated to: David Malcolm
Headers
Series [trunk,PR104308,analyzer] handle memmove like memcpy |

Commit Message

David Malcolm Dec. 2, 2022, 7:11 p.m. UTC
  On Fri, 2022-12-02 at 06:45 -0300, Alexandre Oliva wrote:
> Hello, David,
> 
> I'd written this patch for gcc-12, but you've worked too much on the
> analyzer ;-) for it to apply in the trunk.  I wonder if there's any
> use
> you can make of it, or of the observations in it, or whether you'd
> prefer me to try to port it.
> 
> I've regstrapped it on x86_64-linux-gnu, and also tested with crosses
> to
> riscv64-elf and arm-eabi, but with gcc-12 only, so I'm hesitant to
> ask
> whether it's ok to install.  Trunk still fails to issue the warning
> for
> memmove on riscv64-elf.
> 
> 
> The testcase expects analyzer warnings for memmove just like for
> memcpy.  We get them when memmove is open-coded, but not when it
> remains a call.
> 
> The analyzer has code to handle memcpy calls, whose logic can be
> trivially reused for memmove, but we don't get the expected warnings
> and notes for memmove on riscv64-*-elf.  They wouldn't be issued for
> memcpy either, if it wasn't open-coded.
> 
> I've managed to find out how to get the warning for the
> remaining-call
> variants, but not to get the note issued for the point of creation of
> the uninitialized buffer, so this patch also adds an xfail for the
> note on riscv*-*-*.

Hi Alex,

There are various functions that the analyzer "knows" about, but in
gcc 12 and earlier, the logic for handling them was rather messy with
lots of inconsistencies and places for bugs to hide.

In trunk, I've tidied this up by adding a
  class known_function
so that the handling for "foo" that was:
  region_model::impl_call_foo
becomes a
  class kf_foo : public known_function
subclass instead, which has responsibility for determining the
outcome(s) of a particular call.  Instances get registered into a
known_function_manager, which has responsibility for identifying which
known_function instance is relevant at the call site.

This makes the logic simpler and more consistent.

I had a go at porting your patch to trunk; here's the result.
I added explicit testing of memmove, which I like to do any time I add
a new known_function registration.

Only lightly tested so far, on x86_64-pc-linux-gnu.

Does this do what you wanted?

Ultimately I want to add a warning for memcpy of overlapping regions, of
course.

Thanks
Dave

gcc/analyzer/ChangeLog:
	* region-model-impl-calls.cc (class kf_memcpy): Rename to...
	(class kf_memcpy_memmove): ...this.
	(kf_memcpy::impl_call_pre): Rename to...
	(kf_memcpy_memmove::impl_call_pre): ...this, and check the src for
	poison.
	(register_known_functions): Update for above renaming, and
	register BUILT_IN_MEMMOVE and BUILT_IN_MEMMOVE_CHK.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/memmove-1.c: New test, based on memcpy-1.c
	* gcc.dg/analyzer/pr104308.c (test_memmove_within_uninit):
	Expect creation point note to be missing on riscv*-*-*.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/analyzer/region-model-impl-calls.cc   |  18 ++-
 gcc/testsuite/gcc.dg/analyzer/memmove-1.c | 168 ++++++++++++++++++++++
 gcc/testsuite/gcc.dg/analyzer/pr104308.c  |   2 +-
 3 files changed, 181 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/memmove-1.c
  

Comments

Alexandre Oliva Dec. 8, 2022, 10:36 a.m. UTC | #1
Hello again, David,

On Dec  2, 2022, David Malcolm <dmalcolm@redhat.com> wrote:

> I had a go at porting your patch to trunk; here's the result.

Oh, wow, nice!  Thank you so much.

I confirm it works on riscv64-elf too.
  

Patch

diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc
index 8ba644c33cd..103aed58138 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -772,10 +772,12 @@  kf_malloc::impl_call_pre (const call_details &cd) const
     }
 }
 
-/* Handler for "memcpy" and "__builtin_memcpy".  */
-// TODO: complain about overlapping src and dest.
+/* Handler for "memcpy" and "__builtin_memcpy",
+   "memmove", and "__builtin_memmove".  */
+/* TODO: complain about overlapping src and dest for the memcpy
+   variants.  */
 
-class kf_memcpy : public known_function
+class kf_memcpy_memmove : public known_function
 {
 public:
   bool matches_call_types_p (const call_details &cd) const final override
@@ -789,7 +791,7 @@  public:
 };
 
 void
-kf_memcpy::impl_call_pre (const call_details &cd) const
+kf_memcpy_memmove::impl_call_pre (const call_details &cd) const
 {
   const svalue *dest_ptr_sval = cd.get_arg_svalue (0);
   const svalue *src_ptr_sval = cd.get_arg_svalue (1);
@@ -811,6 +813,8 @@  kf_memcpy::impl_call_pre (const call_details &cd) const
     = mgr->get_sized_region (dest_reg, NULL_TREE, num_bytes_sval);
   const svalue *src_contents_sval
     = model->get_store_value (sized_src_reg, cd.get_ctxt ());
+  model->check_for_poison (src_contents_sval, cd.get_arg_tree (1),
+			   cd.get_ctxt ());
   model->set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
 }
 
@@ -1505,8 +1509,10 @@  register_known_functions (known_function_manager &kfm)
     kfm.add (BUILT_IN_EXPECT_WITH_PROBABILITY, make_unique<kf_expect> ());
     kfm.add (BUILT_IN_FREE, make_unique<kf_free> ());
     kfm.add (BUILT_IN_MALLOC, make_unique<kf_malloc> ());
-    kfm.add (BUILT_IN_MEMCPY, make_unique<kf_memcpy> ());
-    kfm.add (BUILT_IN_MEMCPY_CHK, make_unique<kf_memcpy> ());
+    kfm.add (BUILT_IN_MEMCPY, make_unique<kf_memcpy_memmove> ());
+    kfm.add (BUILT_IN_MEMCPY_CHK, make_unique<kf_memcpy_memmove> ());
+    kfm.add (BUILT_IN_MEMMOVE, make_unique<kf_memcpy_memmove> ());
+    kfm.add (BUILT_IN_MEMMOVE_CHK, make_unique<kf_memcpy_memmove> ());
     kfm.add (BUILT_IN_MEMSET, make_unique<kf_memset> ());
     kfm.add (BUILT_IN_MEMSET_CHK, make_unique<kf_memset> ());
     kfm.add (BUILT_IN_REALLOC, make_unique<kf_realloc> ());
diff --git a/gcc/testsuite/gcc.dg/analyzer/memmove-1.c b/gcc/testsuite/gcc.dg/analyzer/memmove-1.c
new file mode 100644
index 00000000000..d97a5d6b55c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/memmove-1.c
@@ -0,0 +1,168 @@ 
+#include <string.h>
+#include "analyzer-decls.h"
+
+/* Function for thwarting expansion of memmove by optimizer.  */
+
+typedef void * (*memmove_t) (void *dst, const void *src, size_t n);
+  
+static memmove_t __attribute__((noinline))
+get_memmove (void)
+{
+  return memmove;
+}
+
+void *test_1 (void *dst, void *src, size_t n)
+{
+  void *result = memmove (dst, src, n);
+  __analyzer_eval (result == dst); /* { dg-warning "TRUE" } */
+  return result;
+}
+
+void *test_1a (void *dst, void *src, size_t n)
+{
+  void *result = __memmove_chk (dst, src, n, -1);
+  __analyzer_eval (result == dst); /* { dg-warning "TRUE" } */
+  return result;
+}
+
+void *test_1b (void *dst, void *src, size_t n)
+{
+  memmove_t fn = get_memmove ();
+  void *result = fn (dst, src, n);
+  __analyzer_eval (result == dst); /* { dg-warning "TRUE" } */
+  return result;
+}
+
+void test_2 (int i)
+{
+  int j;
+  memmove (&j, &i, sizeof (int));
+  __analyzer_eval (i == j); /* { dg-warning "TRUE" } */
+}
+
+void test_2a (int i)
+{
+  int j;
+  __memmove_chk (&j, &i, sizeof (int), sizeof (int));
+  __analyzer_eval (i == j);  /* { dg-warning "TRUE" } */
+}
+
+void test_2b (int i)
+{
+  int j;
+  memmove_t fn = get_memmove ();
+  fn (&j, &i, sizeof (int));
+  __analyzer_eval (i == j); /* { dg-warning "TRUE" } */
+}
+
+void test_3 (void *src, size_t n)
+{
+  char buf[40], other[40];
+  buf[0] = 'a';
+  other[0] = 'b';
+  __analyzer_eval (buf[0] == 'a');    /* { dg-warning "TRUE" } */
+  __analyzer_eval (other[0] == 'b');  /* { dg-warning "TRUE" } */
+
+  memmove (buf, src, n);
+  __analyzer_eval (buf[0] == 'a');    /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (other[0] == 'b');  /* { dg-warning "TRUE" } */
+}
+
+void test_3b (void *src, size_t n)
+{
+  char buf[40], other[40];
+  memmove_t fn = get_memmove ();
+  buf[0] = 'a';
+  other[0] = 'b';
+  __analyzer_eval (buf[0] == 'a');    /* { dg-warning "TRUE" } */
+  __analyzer_eval (other[0] == 'b');  /* { dg-warning "TRUE" } */
+
+  fn (buf, src, n);
+  __analyzer_eval (buf[0] == 'a');    /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (other[0] == 'b');  /* { dg-warning "TRUE" } */
+}
+
+/* Overwriting a zeroed buffer, then memmove of the result.  */
+
+void test_4 (int a, int b)
+{
+  int src[1024];
+  int dst[1024];
+  memset (src, 0, sizeof (src));
+  src[42] = a;
+  src[100] = b;
+  __analyzer_eval (src[0] == 0);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[42] == a);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[100] == b);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[1023] == 0);    /* { dg-warning "TRUE" } */
+
+  memmove (dst, src, sizeof (src));
+  __analyzer_eval (dst[0] == 0);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[42] == a);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[100] == b);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[1023] == 0);    /* { dg-warning "TRUE" } */  
+}
+
+void test_4b (int a, int b)
+{
+  int src[1024];
+  int dst[1024];
+  memmove_t fn = get_memmove ();
+  memset (src, 0, sizeof (src));
+  src[42] = a;
+  src[100] = b;
+  __analyzer_eval (src[0] == 0);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[42] == a);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[100] == b);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (src[1023] == 0);    /* { dg-warning "TRUE" } */
+
+  fn (dst, src, sizeof (src));
+  __analyzer_eval (dst[0] == 0);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[42] == a);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[100] == b);    /* { dg-warning "TRUE" } */
+  __analyzer_eval (dst[1023] == 0);    /* { dg-warning "TRUE" } */  
+}
+
+/* Populating a buffer from an unknown buffer.  */
+
+void test_5 (void *src, size_t sz)
+{
+  char dst[1024];
+  memmove (dst, src, sizeof (dst));
+  __analyzer_eval (dst[0] == 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (dst[1023] == 0); /* { dg-warning "UNKNOWN" } */
+}
+
+void test_5b (void *src, size_t sz)
+{
+  char dst[1024];
+  memmove_t fn = get_memmove ();
+  fn (dst, src, sizeof (dst));
+  __analyzer_eval (dst[0] == 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (dst[1023] == 0); /* { dg-warning "UNKNOWN" } */
+}
+
+/* Zero-sized memmove.  */
+
+void test_6 (void *dst, void *src)
+{
+  memmove (dst, src, 0);
+}
+
+void test_6b (void *dst, void *src)
+{
+  memmove_t fn = get_memmove ();
+  fn (dst, src, 0);
+}
+
+/* memmove to string literal.  */
+
+void test_7 (void *src, size_t sz)
+{
+  memmove ((void *)"hello world", src, sz); /* { dg-warning "write to string literal" } */
+}
+
+void test_7b (void *src, size_t sz)
+{
+  memmove ((void *)"hello world", src, sz); /* { dg-warning "write to string literal" } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104308.c b/gcc/testsuite/gcc.dg/analyzer/pr104308.c
index a3a0cbb7317..e6a2c8821bf 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr104308.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr104308.c
@@ -6,7 +6,7 @@ 
 
 int test_memmove_within_uninit (void)
 {
-  char s[5]; /* { dg-message "region created on stack here" } */
+  char s[5]; /* { dg-message "region created on stack here" "" { xfail riscv*-*-* } } */
   memmove(s, s + 1, 2); /* { dg-warning "use of uninitialized value" } */
   return 0;
 }