[committed] Add __attribute__ ((tainted_args))

Message ID 20220114012522.1676434-1-dmalcolm@redhat.com
State Committed
Commit b31cec9c22b8dfa40baefd4c2dd774477e8e04c5
Headers
Series [committed] Add __attribute__ ((tainted_args)) |

Commit Message

David Malcolm Jan. 14, 2022, 1:25 a.m. UTC
  On Thu, 2022-01-13 at 14:08 -0500, Jason Merrill wrote:
> On 1/12/22 10:33, David Malcolm wrote:
> > On Tue, 2022-01-11 at 23:36 -0500, Jason Merrill wrote:
> > > On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
> > > > On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
> > > > > On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
> > > > > > This patch adds a new __attribute__ ((tainted)) to the
> > > > > > C/C++
> > > > > > frontends.
> > > > > 
> > > > > Ping for GCC C/C++ mantainers for review of the C/C++ FE
> > > > > parts of
> > > > > this
> > > > > patch (attribute registration, documentation, the name of the
> > > > > attribute, etc).
> > > > > 
> > > > > (I believe it's independent of the rest of the patch kit, in
> > > > > that
> > > > > it
> > > > > could go into trunk without needing the prior patches)
> > > > > 
> > > > > Thanks
> > > > > Dave
> > > > 
> > > > Getting close to end of stage 3 for GCC 12, so pinging this
> > > > patch
> > > > again...
> > > > 
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html
> > > 
> > > The c-family change is OK.
> > 
> > Thanks.
> > 
> > I'm retesting the patch now, but it now seems to me that
> >    __attribute__((tainted_args))
> > would lead to more readable code than:
> >    __attribute__((tainted))
> > 
> > in that the name "tainted_args" better conveys the idea that all
> > arguments are under attacker-control (as opposed to the body of the
> > function or the function pointer being under attacker-control).
> > 
> > Looking at
> >   
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
> > we already have some attributes with underscores in their names.
> > 
> > Does this sound good?
> 
> Makes sense to me.

Thanks.

I updated the patch to use the name "tainted_args" for the attribute,
and there were a few other changes needed due to splitting it out from
the rest of the kit.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk for gcc 12 as b31cec9c22b8dfa40baefd4c2dd774477e8e04c5.

The following is what I committed, for reference:

This patch adds a new __attribute__ ((tainted_args)) to the C/C++ frontends.

It can be used on function decls: the analyzer will treat as tainted
all parameters to the function and all buffers pointed to by parameters
to the function.  Adding this in one place to the Linux kernel's
__SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
having tainted inputs.  This gives some coverage of system calls without
needing to "teach" the analyzer about "__user" - an example of the use
of this can be seen in CVE-2011-2210, where given:

 SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
                 unsigned long, nbytes, int __user *, start, void __user *, arg)

the analyzer will treat the nbytes param as under attacker control, and
can complain accordingly:

taint-CVE-2011-2210-1.c: In function 'sys_osf_getsysinfo':
taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled value
  'nbytes' as size without upper-bounds checking [CWE-129] [-Wanalyzer-tainted-size]
   69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Additionally, the patch allows the attribute to be used on field decls:
specifically function pointers.  Any function used as an initializer
for such a field gets treated as being called with tainted arguments.
An example can be seen in CVE-2020-13143, where adding
__attribute__((tainted_args)) to the "store" callback of
configfs_attribute:

  struct configfs_attribute {
    /* [...snip...] */
    ssize_t (*store)(struct config_item *, const char *, size_t)
      __attribute__((tainted_args));
    /* [...snip...] */
  };

allows the analyzer to see:

 CONFIGFS_ATTR(gadget_dev_desc_, UDC);

and treat gadget_dev_desc_UDC_store as having tainted arguments, so that
it complains:

taint-CVE-2020-13143-1.c: In function 'gadget_dev_desc_UDC_store':
taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled value
  'len + 18446744073709551615' as offset without upper-bounds checking [CWE-823] [-Wanalyzer-tainted-offset]
   33 |         if (name[len - 1] == '\n')
      |             ~~~~^~~~~~~~~

As before this currently still needs -fanalyzer-checker=taint (in
addition to -fanalyzer).

gcc/analyzer/ChangeLog:
	* engine.cc: Include "stringpool.h", "attribs.h", and
	"tree-dfa.h".
	(mark_params_as_tainted): New.
	(class tainted_args_function_custom_event): New.
	(class tainted_args_function_info): New.
	(exploded_graph::add_function_entry): Handle functions with
	"tainted_args" attribute.
	(class tainted_args_field_custom_event): New.
	(class tainted_args_callback_custom_event): New.
	(class tainted_args_call_info): New.
	(add_tainted_args_callback): New.
	(add_any_callbacks): New.
	(exploded_graph::build_initial_worklist): Likewise.
	(exploded_graph::build_initial_worklist): Find callbacks that are
	reachable from global initializers, calling add_any_callbacks on
	them.

gcc/c-family/ChangeLog:
	* c-attribs.c (c_common_attribute_table): Add "tainted_args".
	(handle_tainted_args_attribute): New.

gcc/ChangeLog:
	* doc/extend.texi (Function Attributes): Note that "tainted_args" can
	be used on field decls.
	(Common Function Attributes): Add entry on "tainted_args" attribute.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/attr-tainted_args-1.c: New test.
	* gcc.dg/analyzer/attr-tainted_args-misuses.c: New test.
	* gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
	* gcc.dg/analyzer/taint-alloc-3.c: New test.
	* gcc.dg/analyzer/taint-alloc-4.c: New test.
	* gcc.dg/analyzer/test-uaccess.h: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/analyzer/engine.cc                        | 320 +++++++++++++++++-
 gcc/c-family/c-attribs.c                      |  36 ++
 gcc/doc/extend.texi                           |  23 +-
 .../gcc.dg/analyzer/attr-tainted_args-1.c     |  88 +++++
 .../analyzer/attr-tainted_args-misuses.c      |   6 +
 .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
 .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
 .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
 .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
 gcc/testsuite/gcc.dg/analyzer/test-uaccess.h  |  15 +
 12 files changed, 791 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-misuses.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
  

Patch

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 8b6f4c83f0f..243235e4cd4 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -68,6 +68,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "plugin.h"
 #include "target.h"
 #include <memory>
+#include "stringpool.h"
+#include "attribs.h"
+#include "tree-dfa.h"
 
 /* For an overview, see gcc/doc/analyzer.texi.  */
 
@@ -2287,6 +2290,116 @@  exploded_graph::~exploded_graph ()
     delete (*iter).second;
 }
 
+/* Subroutine for use when implementing __attribute__((tainted_args))
+   on functions and on function pointer fields in structs.
+
+   Called on STATE representing a call to FNDECL.
+   Mark all params of FNDECL in STATE as "tainted".  Mark the value of all
+   regions pointed to by params of FNDECL as "tainted".
+
+   Return true if successful; return false if the "taint" state machine
+   was not found.  */
+
+static bool
+mark_params_as_tainted (program_state *state, tree fndecl,
+			const extrinsic_state &ext_state)
+{
+  unsigned taint_sm_idx;
+  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
+    return false;
+  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
+
+  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
+  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
+
+  region_model_manager *mgr = ext_state.get_model_manager ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
+       iter_parm = DECL_CHAIN (iter_parm))
+    {
+      tree param = iter_parm;
+      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
+	param = parm_default_ssa;
+      const region *param_reg = state->m_region_model->get_lvalue (param, NULL);
+      const svalue *init_sval = mgr->get_or_create_initial_value (param_reg);
+      smap->set_state (state->m_region_model, init_sval,
+		       tainted, NULL /*origin_new_sval*/, ext_state);
+      if (POINTER_TYPE_P (TREE_TYPE (param)))
+	{
+	  const region *pointee_reg = mgr->get_symbolic_region (init_sval);
+	  /* Mark "*param" as tainted.  */
+	  const svalue *init_pointee_sval
+	    = mgr->get_or_create_initial_value (pointee_reg);
+	  smap->set_state (state->m_region_model, init_pointee_sval,
+			   tainted, NULL /*origin_new_sval*/, ext_state);
+	}
+    }
+
+  return true;
+}
+
+/* Custom event for use by tainted_args_function_info when a function
+   has been marked with __attribute__((tainted_args)).  */
+
+class tainted_args_function_custom_event : public custom_event
+{
+public:
+  tainted_args_function_custom_event (location_t loc, tree fndecl, int depth)
+  : custom_event (loc, fndecl, depth),
+    m_fndecl (fndecl)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text
+      (can_colorize,
+       "function %qE marked with %<__attribute__((tainted_args))%>",
+       m_fndecl);
+  }
+
+private:
+  tree m_fndecl;
+};
+
+/* Custom exploded_edge info for top-level calls to a function
+   marked with __attribute__((tainted_args)).  */
+
+class tainted_args_function_info : public custom_edge_info
+{
+public:
+  tainted_args_function_info (tree fndecl)
+  : m_fndecl (fndecl)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted_args function");
+  };
+
+  bool update_model (region_model *,
+		     const exploded_edge *,
+		     region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+			   const exploded_edge &) const FINAL OVERRIDE
+  {
+    emission_path->add_event
+      (new tainted_args_function_custom_event
+       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
+  }
+
+private:
+  tree m_fndecl;
+};
+
 /* Ensure that there is an exploded_node representing an external call to
    FUN, adding it to the worklist if creating it.
 
@@ -2313,14 +2426,25 @@  exploded_graph::add_function_entry (function *fun)
   program_state state (m_ext_state);
   state.push_frame (m_ext_state, fun);
 
+  custom_edge_info *edge_info = NULL;
+
+  if (lookup_attribute ("tainted_args", DECL_ATTRIBUTES (fun->decl)))
+    {
+      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
+	edge_info = new tainted_args_function_info (fun->decl);
+    }
+
   if (!state.m_valid)
     return NULL;
 
   exploded_node *enode = get_or_create_node (point, state, NULL);
   if (!enode)
-    return NULL;
+    {
+      delete edge_info;
+      return NULL;
+    }
 
-  add_edge (m_origin, enode, NULL);
+  add_edge (m_origin, enode, NULL, edge_info);
 
   m_functions_with_enodes.add (fun);
 
@@ -2634,6 +2758,187 @@  toplevel_function_p (function *fun, logger *logger)
   return true;
 }
 
+/* Custom event for use by tainted_call_info when a callback field has been
+   marked with __attribute__((tainted_args)), for labelling the field.  */
+
+class tainted_args_field_custom_event : public custom_event
+{
+public:
+  tainted_args_field_custom_event (tree field)
+  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+			    "field %qE of %qT"
+			    " is marked with %<__attribute__((tainted_args))%>",
+			    m_field, DECL_CONTEXT (m_field));
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom event for use by tainted_call_info when a callback field has been
+   marked with __attribute__((tainted_args)), for labelling the function used
+   in that callback.  */
+
+class tainted_args_callback_custom_event : public custom_event
+{
+public:
+  tainted_args_callback_custom_event (location_t loc, tree fndecl, int depth,
+				 tree field)
+  : custom_event (loc, fndecl, depth),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+			    "function %qE used as initializer for field %qE"
+			    " marked with %<__attribute__((tainted_args))%>",
+			    m_fndecl, m_field);
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom edge info for use when adding a function used by a callback field
+   marked with '__attribute__((tainted_args))'.   */
+
+class tainted_args_call_info : public custom_edge_info
+{
+public:
+  tainted_args_call_info (tree field, tree fndecl, location_t loc)
+  : m_field (field), m_fndecl (fndecl), m_loc (loc)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted field");
+  };
+
+  bool update_model (region_model *,
+		     const exploded_edge *,
+		     region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+			   const exploded_edge &) const FINAL OVERRIDE
+  {
+    /* Show the field in the struct declaration, e.g.
+       "(1) field 'store' is marked with '__attribute__((tainted_args))'"  */
+    emission_path->add_event
+      (new tainted_args_field_custom_event (m_field));
+
+    /* Show the callback in the initializer
+       e.g.
+       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
+       for field 'store' marked with '__attribute__((tainted_args))'".  */
+    emission_path->add_event
+      (new tainted_args_callback_custom_event (m_loc, m_fndecl, 0, m_field));
+  }
+
+private:
+  tree m_field;
+  tree m_fndecl;
+  location_t m_loc;
+};
+
+/* Given an initializer at LOC for FIELD marked with
+   '__attribute__((tainted_args))' initialized with FNDECL, add an
+   entrypoint to FNDECL to EG (and to its worklist) where the params to
+   FNDECL are marked as tainted.  */
+
+static void
+add_tainted_args_callback (exploded_graph *eg, tree field, tree fndecl,
+			   location_t loc)
+{
+  logger *logger = eg->get_logger ();
+
+  LOG_SCOPE (logger);
+
+  if (!gimple_has_body_p (fndecl))
+    return;
+
+  const extrinsic_state &ext_state = eg->get_ext_state ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  program_point point
+    = program_point::from_function_entry (eg->get_supergraph (), fun);
+  program_state state (ext_state);
+  state.push_frame (ext_state, fun);
+
+  if (!mark_params_as_tainted (&state, fndecl, ext_state))
+    return;
+
+  if (!state.m_valid)
+    return;
+
+  exploded_node *enode = eg->get_or_create_node (point, state, NULL);
+  if (logger)
+    {
+      if (enode)
+	logger->log ("created EN %i for tainted_args %qE entrypoint",
+		     enode->m_index, fndecl);
+      else
+	{
+	  logger->log ("did not create enode for tainted_args %qE entrypoint",
+		       fndecl);
+	  return;
+	}
+    }
+
+  tainted_args_call_info *info
+    = new tainted_args_call_info (field, fndecl, loc);
+  eg->add_edge (eg->get_origin (), enode, NULL, info);
+}
+
+/* Callback for walk_tree for finding callbacks within initializers;
+   ensure that any callback initializer where the corresponding field is
+   marked with '__attribute__((tainted_args))' is treated as an entrypoint
+   to the analysis, special-casing that the inputs to the callback are
+   untrustworthy.  */
+
+static tree
+add_any_callbacks (tree *tp, int *, void *data)
+{
+  exploded_graph *eg = (exploded_graph *)data;
+  if (TREE_CODE (*tp) == CONSTRUCTOR)
+    {
+      /* Find fields with the "tainted_args" attribute.
+	 walk_tree only walks the values, not the index values;
+	 look at the index values.  */
+      unsigned HOST_WIDE_INT idx;
+      constructor_elt *ce;
+
+      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx, &ce);
+	   idx++)
+	if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
+	  if (lookup_attribute ("tainted_args", DECL_ATTRIBUTES (ce->index)))
+	    {
+	      tree value = ce->value;
+	      if (TREE_CODE (value) == ADDR_EXPR
+		  && TREE_CODE (TREE_OPERAND (value, 0)) == FUNCTION_DECL)
+		add_tainted_args_callback (eg, ce->index,
+					   TREE_OPERAND (value, 0),
+					   EXPR_LOCATION (value));
+	    }
+    }
+
+  return NULL_TREE;
+}
+
 /* Add initial nodes to EG, with entrypoints for externally-callable
    functions.  */
 
@@ -2659,6 +2964,17 @@  exploded_graph::build_initial_worklist ()
 	  logger->log ("did not create enode for %qE entrypoint", fun->decl);
       }
   }
+
+  /* Find callbacks that are reachable from global initializers.  */
+  varpool_node *vpnode;
+  FOR_EACH_VARIABLE (vpnode)
+    {
+      tree decl = vpnode->decl;
+      tree init = DECL_INITIAL (decl);
+      if (!init)
+	continue;
+      walk_tree (&init, add_any_callbacks, this, NULL);
+    }
 }
 
 /* The main loop of the analysis.
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index bdf72ce385c..4fb5dbd1409 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -117,6 +117,7 @@  static tree handle_no_profile_instrument_function_attribute (tree *, tree,
 							     tree, int, bool *);
 static tree handle_malloc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_dealloc_attribute (tree *, tree, tree, int, bool *);
+static tree handle_tainted_args_attribute (tree *, tree, tree, int, bool *);
 static tree handle_returns_twice_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_limit_stack_attribute (tree *, tree, tree, int,
 					     bool *);
@@ -548,6 +549,8 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_objc_nullability_attribute, NULL },
   { "*dealloc",                1, 2, true, false, false, false,
 			      handle_dealloc_attribute, NULL },
+  { "tainted_args",	      0, 0, true,  false, false, false,
+			      handle_tainted_args_attribute, NULL },
   { NULL,                     0, 0, false, false, false, false, NULL, NULL }
 };
 
@@ -5774,6 +5777,39 @@  handle_objc_nullability_attribute (tree *node, tree name, tree args,
   return NULL_TREE;
 }
 
+/* Handle a "tainted_args" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_tainted_args_attribute (tree *node, tree name, tree, int,
+			       bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL
+      && TREE_CODE (*node) != FIELD_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
+	       "for functions and function pointer fields",
+	       name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  if (TREE_CODE (*node) == FIELD_DECL
+      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
+	   && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) == FUNCTION_TYPE))
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored;"
+	       " field must be a function pointer",
+	       name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  *no_add_attrs = false; /* OK */
+
+  return NULL_TREE;
+}
+
 /* Attempt to partially validate a single attribute ATTR as if
    it were to be applied to an entity OPER.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 637124a7172..20a5944256a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2512,7 +2512,8 @@  variable declarations (@pxref{Variable Attributes}),
 labels (@pxref{Label Attributes}),
 enumerators (@pxref{Enumerator Attributes}),
 statements (@pxref{Statement Attributes}),
-and types (@pxref{Type Attributes}).
+types (@pxref{Type Attributes}),
+and on field declarations (for @code{tainted_args}).
 
 There is some overlap between the purposes of attributes and pragmas
 (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
@@ -4009,6 +4010,26 @@  addition to creating a symbol version (as if
 @code{"@var{name2}@@@var{nodename}"} was used) the version will be also used
 to resolve @var{name2} by the linker.
 
+@item tainted_args
+@cindex @code{tainted_args} function attribute
+The @code{tainted_args} attribute is used to specify that a function is called
+in a way that requires sanitization of its arguments, such as a system
+call in an operating system kernel.  Such a function can be considered part
+of the ``attack surface'' of the program.  The attribute can be used both
+on function declarations, and on field declarations containing function
+pointers.  In the latter case, any function used as an initializer of
+such a callback field will be treated as being called with tainted
+arguments.
+
+The analyzer will pay particular attention to such functions when both
+@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are supplied,
+potentially issuing warnings guarded by
+@option{-Wanalyzer-tainted-allocation-size},
+@option{-Wanalyzer-tainted-array-index},
+@option{-Wanalyzer-tainted-divisor},
+@option{-Wanalyzer-tainted-offset},
+and @option{-Wanalyzer-tainted-size}.
+
 @item target_clones (@var{options})
 @cindex @code{target_clones} function attribute
 The @code{target_clones} attribute is used to specify that a function
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-1.c b/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-1.c
new file mode 100644
index 00000000000..e1d87c9cece
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-1.c
@@ -0,0 +1,88 @@ 
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+
+struct arg_buf
+{
+  int i;
+  int j;
+};
+
+/* Example of marking a function as tainted.  */
+
+void __attribute__((tainted_args))
+test_1 (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", *q); /* { dg-warning "state: 'tainted'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state: 'tainted'" } */  
+}
+
+/* Example of marking a callback field as tainted.  */
+
+struct s2
+{
+  void (*cb) (int, void *, char *)
+    __attribute__((tainted_args));
+};
+
+/* Function not marked as tainted.  */
+
+void
+test_2a (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the normal entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'start'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state: 'start'" } */  
+}
+
+/* Function referenced via t2b.cb, marked as "tainted".  */
+
+void
+test_2b (int i, void *p, char *q)
+{
+  /* There should be two enodes
+     for the direct call, and the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed enodes" } */
+}
+
+/* Callback used via t2c.cb, marked as "tainted".  */
+void
+__analyzer_test_2c (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'tainted'" } */
+}
+
+struct s2 t2b =
+{
+  .cb = test_2b
+};
+
+struct s2 t2c =
+{
+  .cb = __analyzer_test_2c
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-misuses.c b/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-misuses.c
new file mode 100644
index 00000000000..4b0dc915059
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted_args-misuses.c
@@ -0,0 +1,6 @@ 
+int not_a_fn __attribute__ ((tainted_args)); /* { dg-warning "'tainted_args' attribute ignored; valid only for functions and function pointer fields" } */
+
+struct s
+{
+  int f __attribute__ ((tainted_args)); /* { dg-warning "'tainted_args' attribute ignored; field must be a function pointer" } */
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
new file mode 100644
index 00000000000..b44be993568
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
@@ -0,0 +1,93 @@ 
+/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in the
+   Linux kernel before 2.6.39.4 on the Alpha platform does not properly
+   restrict the data size for GSI_GET_HWRPB operations, which allows
+   local users to obtain sensitive information from kernel memory via
+   a crafted call."
+
+   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-2.6.39.y
+   in linux-stable.  */
+
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include "test-uaccess.h"
+
+/* Adapted from include/linux/linkage.h.  */
+
+#define asmlinkage
+
+/* Adapted from include/linux/syscalls.h.  */
+
+#define __SC_DECL1(t1, a1)	t1 a1
+#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
+#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
+#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
+#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
+#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
+
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+
+#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
+#define __SYSCALL_DEFINEx(x, name, ...)					\
+	asmlinkage __attribute__((tainted_args)) \
+	long sys##name(__SC_DECL##x(__VA_ARGS__))
+
+#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
+
+/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
+
+struct hwrpb_struct {
+	unsigned long phys_addr;	/* check: physical address of the hwrpb */
+	unsigned long id;		/* check: "HWRPB\0\0\0" */
+	unsigned long revision;
+	unsigned long size;		/* size of hwrpb */
+	/* [...snip...] */
+};
+
+extern struct hwrpb_struct *hwrpb;
+
+/* Adapted from arch/alpha/kernel/osf_sys.c.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	/* [...snip...] */
+
+	__analyzer_dump_state ("taint", nbytes);  /* { dg-warning "tainted" } */
+
+	/* TODO: should have an event explaining why "nbytes" is treated as
+	   attacker-controlled.  */
+
+	/* case GSI_GET_HWRPB: */
+		if (nbytes < sizeof(*hwrpb))
+			return -1;
+
+		__analyzer_dump_state ("taint", nbytes);  /* { dg-warning "has_lb" } */
+
+		if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* { dg-warning "use of attacker-controlled value 'nbytes' as size without upper-bounds checking" } */
+			return -2;
+
+		return 1;
+
+	/* [...snip...] */
+}
+
+/* With the fix for the sense of the size comparison.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	/* [...snip...] */
+
+	/* case GSI_GET_HWRPB: */
+		if (nbytes > sizeof(*hwrpb))
+			return -1;
+		if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* { dg-bogus "attacker-controlled" } */
+			return -2;
+
+		return 1;
+
+	/* [...snip...] */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
new file mode 100644
index 00000000000..328c5799145
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
@@ -0,0 +1,38 @@ 
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+	/* [...snip...] */
+	ssize_t (*store)(struct config_item *, const char *, size_t) /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute' is marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
+		__attribute__((tainted_args)); /* (this is added).  */
+};
+static inline struct gadget_info *to_gadget_info(struct config_item *item)
+{
+	 return container_of(to_config_group(item), struct gadget_info, group);
+}
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+		const char *page, size_t len)
+{
+	struct gadget_info *gi = to_gadget_info(item);
+	char *name;
+	int ret;
+
+#if 0
+	/* FIXME: this is the fix.  */
+	if (strlen(page) < len)
+		return -EOVERFLOW;
+#endif
+
+	name = kstrdup(page, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+	if (name[len - 1] == '\n') /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+		name[len - 1] = '\0'; /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+	/* [...snip...] */				\
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\) function 'gadget_dev_desc_UDC_store' used as initializer for field 'store' marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
new file mode 100644
index 00000000000..c74a460b01e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
@@ -0,0 +1,32 @@ 
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+	/* [...snip...] */
+	ssize_t (*store)(struct config_item *, const char *, size_t) /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute' is marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
+		__attribute__((tainted_args)); /* (this is added).  */
+};
+
+/* Highly simplified version.  */
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+		const char *page, size_t len)
+{
+	/* TODO: ought to have state_change_event talking about where the tainted value comes from.  */
+
+	char *name;
+	/* [...snip...] */
+
+	name = kstrdup(page, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+	if (name[len - 1] == '\n') /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+		name[len - 1] = '\0';  /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+	/* [...snip...] */
+	return 0;
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\) function 'gadget_dev_desc_UDC_store' used as initializer for field 'store' marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
new file mode 100644
index 00000000000..0ba023539af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
@@ -0,0 +1,91 @@ 
+/* Shared header for the various taint-CVE-2020-13143.h tests.
+   
+   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in the
+   Linux kernel 3.16 through 5.6.13 relies on kstrdup without considering
+   the possibility of an internal '\0' value, which allows attackers to
+   trigger an out-of-bounds read, aka CID-15753588bcd4."
+
+   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
+   in linux-stable.  */
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include <stddef.h>
+
+/* Adapted from include/uapi/asm-generic/posix_types.h  */
+
+typedef unsigned int     __kernel_size_t;
+typedef int              __kernel_ssize_t;
+
+/* Adapted from include/linux/types.h  */
+
+//typedef __kernel_size_t		size_t;
+typedef __kernel_ssize_t	ssize_t;
+
+/* Adapted from include/linux/kernel.h  */
+
+#define container_of(ptr, type, member) ({				\
+	void *__mptr = (void *)(ptr);					\
+	/* [...snip...] */						\
+	((type *)(__mptr - offsetof(type, member))); })
+
+/* Adapted from include/linux/configfs.h  */
+
+struct config_item {
+	/* [...snip...] */
+};
+
+struct config_group {
+	struct config_item		cg_item;
+	/* [...snip...] */
+};
+
+static inline struct config_group *to_config_group(struct config_item *item)
+{
+	return item ? container_of(item,struct config_group,cg_item) : NULL;
+}
+
+#define CONFIGFS_ATTR(_pfx, _name)				\
+static struct configfs_attribute _pfx##attr_##_name = {	\
+	/* [...snip...] */				\
+	.store		= _pfx##_name##_store,		\
+}
+
+/* Adapted from include/linux/compiler.h  */
+
+#define __force
+
+/* Adapted from include/asm-generic/errno-base.h  */
+
+#define	ENOMEM		12	/* Out of memory */
+
+/* Adapted from include/linux/types.h  */
+
+#define __bitwise__
+typedef unsigned __bitwise__ gfp_t;
+
+/* Adapted from include/linux/gfp.h  */
+
+#define ___GFP_WAIT		0x10u
+#define ___GFP_IO		0x40u
+#define ___GFP_FS		0x80u
+#define __GFP_WAIT	((__force gfp_t)___GFP_WAIT)
+#define __GFP_IO	((__force gfp_t)___GFP_IO)
+#define __GFP_FS	((__force gfp_t)___GFP_FS)
+#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
+
+/* Adapted from include/linux/compiler_attributes.h  */
+
+#define __malloc                        __attribute__((__malloc__))
+
+/* Adapted from include/linux/string.h  */
+
+extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
+
+/* Adapted from drivers/usb/gadget/configfs.c  */
+
+struct gadget_info {
+	struct config_group group;
+	/* [...snip...] */				\
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
new file mode 100644
index 00000000000..80d8f0b8247
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
@@ -0,0 +1,21 @@ 
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+void *p;
+
+void __attribute__((tainted_args))
+test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
+{
+  /* TODO: should have a message saying why "sz" is tainted, e.g.
+     "treating 'sz' as attacker-controlled because 'test_1' is marked with '__attribute__((tainted_args))'"  */
+
+  p = malloc (sz); /* { dg-warning "use of attacker-controlled value 'sz' as allocation size without upper-bounds checking" "warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-1 } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
new file mode 100644
index 00000000000..bd47097b1d5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
@@ -0,0 +1,31 @@ 
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+struct arg_buf
+{
+  size_t sz;
+};
+
+void *p;
+
+void __attribute__((tainted_args))
+test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1' marked with '__attribute__\\(\\(tainted_args\\)\\)'" } */
+{
+  /* we should treat pointed-to-structs as tainted.  */
+  __analyzer_dump_state ("taint", data); /* { dg-warning "state: 'tainted'" } */
+  
+  struct arg_buf *args = data;
+
+  __analyzer_dump_state ("taint", args); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state: 'tainted'" } */
+  
+  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled value '\\*args.sz' as allocation size without upper-bounds checking" "warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value '\\*args.sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-1 } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
new file mode 100644
index 00000000000..70c9d6309ef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/test-uaccess.h
@@ -0,0 +1,15 @@ 
+/* Shared header for testcases for copy_from_user/copy_to_user.  */
+
+/* Adapted from include/linux/compiler.h  */
+
+#define __user
+
+/* Adapted from include/asm-generic/uaccess.h  */
+
+extern int copy_from_user(void *to, const void __user *from, long n)
+  __attribute__((access (write_only, 1, 3),
+		 access (read_only, 2, 3)));
+
+extern long copy_to_user(void __user *to, const void *from, unsigned long n)
+  __attribute__((access (write_only, 1, 3),
+		 access (read_only, 2, 3)));