[6/6] Add __attribute__ ((tainted))

Message ID 20211113203732.2098220-9-dmalcolm@redhat.com
State New
Headers
Series RFC: adding support to GCC for detecting trust boundaries |

Commit Message

David Malcolm Nov. 13, 2021, 8:37 p.m. UTC
  This patch adds a new __attribute__ ((tainted)) to the C/C++ frontends.

It can be used on function decls: the analyzer will treat as tainted
all parameters to the function and all buffers pointed to by parameters
to the function.  Adding this in one place to the Linux kernel's
__SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
having tainted inputs.  This gives additional testing beyond e.g. __user
pointers added by earlier patches - an example of the use of this can be
seen in CVE-2011-2210, where given:

 SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
                 unsigned long, nbytes, int __user *, start, void __user *, arg)

the analyzer will treat the nbytes param as under attacker control, and
can complain accordingly:

taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled value
  ‘nbytes’ as size without upper-bounds checking [CWE-129] [-Wanalyzer-tainted-size]
   69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Additionally, the patch allows the attribute to be used on field decls:
specifically function pointers.  Any function used as an initializer
for such a field gets treated as tainted.  An example can be seen in
CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
callback of configfs_attribute:

  struct configfs_attribute {
     /* [...snip...] */
     ssize_t (*store)(struct config_item *, const char *, size_t)
       __attribute__((tainted));
     /* [...snip...] */
  };

allows the analyzer to see:

 CONFIGFS_ATTR(gadget_dev_desc_, UDC);

and treat gadget_dev_desc_UDC_store as tainted, so that it complains:

taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled value
  ‘len + 18446744073709551615’ as offset without upper-bounds checking [CWE-823] [-Wanalyzer-tainted-offset]
   33 |         if (name[len - 1] == '\n')
      |             ~~~~^~~~~~~~~

Similarly, the attribute could be used on the ioctl callback field,
USB device callbacks, network-handling callbacks etc.  This potentially
gives a lot of test coverage with relatively little code annotation, and
without necessarily needing link-time analysis (which -fanalyzer can
only do at present on trivial examples).

I believe this is the first time we've had an attribute on a field.
If that's an issue, I could prepare a version of the patch that
merely allowed it on functions themselves.

As before this currently still needs -fanalyzer-checker=taint (in
addition to -fanalyzer).

gcc/analyzer/ChangeLog:
	* engine.cc: Include "stringpool.h", "attribs.h", and
	"tree-dfa.h".
	(mark_params_as_tainted): New.
	(class tainted_function_custom_event): New.
	(class tainted_function_info): New.
	(exploded_graph::add_function_entry): Handle functions with
	"tainted" attribute.
	(class tainted_field_custom_event): New.
	(class tainted_callback_custom_event): New.
	(class tainted_call_info): New.
	(add_tainted_callback): New.
	(add_any_callbacks): New.
	(exploded_graph::build_initial_worklist): Find callbacks that are
	reachable from global initializers, calling add_any_callbacks on
	them.

gcc/c-family/ChangeLog:
	* c-attribs.c (c_common_attribute_table): Add "tainted".
	(handle_tainted_attribute): New.

gcc/ChangeLog:
	* doc/extend.texi (Function Attributes): Note that "tainted" can
	be used on field decls.
	(Common Function Attributes): Add entry on "tainted" attribute.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/attr-tainted-1.c: New test.
	* gcc.dg/analyzer/attr-tainted-misuses.c: New test.
	* gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
	* gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
	* gcc.dg/analyzer/taint-alloc-3.c: New test.
	* gcc.dg/analyzer/taint-alloc-4.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/analyzer/engine.cc                        | 317 +++++++++++++++++-
 gcc/c-family/c-attribs.c                      |  36 ++
 gcc/doc/extend.texi                           |  22 +-
 .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
 .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
 .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
 .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
 .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
 .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
 11 files changed, 772 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
  

Comments

David Malcolm Jan. 6, 2022, 2:08 p.m. UTC | #1
On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
> This patch adds a new __attribute__ ((tainted)) to the C/C++
> frontends.

Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of this
patch (attribute registration, documentation, the name of the
attribute, etc).

(I believe it's independent of the rest of the patch kit, in that it
could go into trunk without needing the prior patches)

Thanks
Dave


> 
> It can be used on function decls: the analyzer will treat as tainted
> all parameters to the function and all buffers pointed to by
> parameters
> to the function.  Adding this in one place to the Linux kernel's
> __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
> having tainted inputs.  This gives additional testing beyond e.g.
> __user
> pointers added by earlier patches - an example of the use of this can
> be
> seen in CVE-2011-2210, where given:
> 
>  SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
> buffer,
>                  unsigned long, nbytes, int __user *, start, void
> __user *, arg)
> 
> the analyzer will treat the nbytes param as under attacker control,
> and
> can complain accordingly:
> 
> taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
> taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled
> value
>   ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
> Wanalyzer-tainted-size]
>    69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
>       |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Additionally, the patch allows the attribute to be used on field
> decls:
> specifically function pointers.  Any function used as an initializer
> for such a field gets treated as tainted.  An example can be seen in
> CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
> callback of configfs_attribute:
> 
>   struct configfs_attribute {
>      /* [...snip...] */
>      ssize_t (*store)(struct config_item *, const char *, size_t)
>        __attribute__((tainted));
>      /* [...snip...] */
>   };
> 
> allows the analyzer to see:
> 
>  CONFIGFS_ATTR(gadget_dev_desc_, UDC);
> 
> and treat gadget_dev_desc_UDC_store as tainted, so that it complains:
> 
> taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
> taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled
> value
>   ‘len + 18446744073709551615’ as offset without upper-bounds
> checking [CWE-823] [-Wanalyzer-tainted-offset]
>    33 |         if (name[len - 1] == '\n')
>       |             ~~~~^~~~~~~~~
> 
> Similarly, the attribute could be used on the ioctl callback field,
> USB device callbacks, network-handling callbacks etc.  This
> potentially
> gives a lot of test coverage with relatively little code annotation,
> and
> without necessarily needing link-time analysis (which -fanalyzer can
> only do at present on trivial examples).
> 
> I believe this is the first time we've had an attribute on a field.
> If that's an issue, I could prepare a version of the patch that
> merely allowed it on functions themselves.
> 
> As before this currently still needs -fanalyzer-checker=taint (in
> addition to -fanalyzer).
> 
> gcc/analyzer/ChangeLog:
>         * engine.cc: Include "stringpool.h", "attribs.h", and
>         "tree-dfa.h".
>         (mark_params_as_tainted): New.
>         (class tainted_function_custom_event): New.
>         (class tainted_function_info): New.
>         (exploded_graph::add_function_entry): Handle functions with
>         "tainted" attribute.
>         (class tainted_field_custom_event): New.
>         (class tainted_callback_custom_event): New.
>         (class tainted_call_info): New.
>         (add_tainted_callback): New.
>         (add_any_callbacks): New.
>         (exploded_graph::build_initial_worklist): Find callbacks that
> are
>         reachable from global initializers, calling add_any_callbacks
> on
>         them.
> 
> gcc/c-family/ChangeLog:
>         * c-attribs.c (c_common_attribute_table): Add "tainted".
>         (handle_tainted_attribute): New.
> 
> gcc/ChangeLog:
>         * doc/extend.texi (Function Attributes): Note that "tainted"
> can
>         be used on field decls.
>         (Common Function Attributes): Add entry on "tainted"
> attribute.
> 
> gcc/testsuite/ChangeLog:
>         * gcc.dg/analyzer/attr-tainted-1.c: New test.
>         * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
>         * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
>         * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
>         * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
>         * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
>         * gcc.dg/analyzer/taint-alloc-3.c: New test.
>         * gcc.dg/analyzer/taint-alloc-4.c: New test.
> 
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
>  gcc/analyzer/engine.cc                        | 317
> +++++++++++++++++-
>  gcc/c-family/c-attribs.c                      |  36 ++
>  gcc/doc/extend.texi                           |  22 +-
>  .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
>  .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
>  .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
>  .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
>  .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
>  .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
>  gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
>  gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
>  11 files changed, 772 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
> misuses.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-
> 2210-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> 13143-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> 13143-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> 13143.h
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> 
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index 096e219392d..5fab41daf93 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "plugin.h"
>  #include "target.h"
>  #include <memory>
> +#include "stringpool.h"
> +#include "attribs.h"
> +#include "tree-dfa.h"
>  
>  /* For an overview, see gcc/doc/analyzer.texi.  */
>  
> @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
>      delete (*iter).second;
>  }
>  
> +/* Subroutine for use when implementing __attribute__((tainted))
> +   on functions and on function pointer fields in structs.
> +
> +   Called on STATE representing a call to FNDECL.
> +   Mark all params of FNDECL in STATE as "tainted".  Mark the value
> of all
> +   regions pointed to by params of FNDECL as "tainted".
> +
> +   Return true if successful; return false if the "taint" state
> machine
> +   was not found.  */
> +
> +static bool
> +mark_params_as_tainted (program_state *state, tree fndecl,
> +                       const extrinsic_state &ext_state)
> +{
> +  unsigned taint_sm_idx;
> +  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
> +    return false;
> +  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
> +
> +  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
> +  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
> +
> +  region_model_manager *mgr = ext_state.get_model_manager ();
> +
> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> +  gcc_assert (fun);
> +
> +  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
> +       iter_parm = DECL_CHAIN (iter_parm))
> +    {
> +      tree param = iter_parm;
> +      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
> +       param = parm_default_ssa;
> +      const region *param_reg = state->m_region_model->get_lvalue
> (param, NULL);
> +      const svalue *init_sval = mgr->get_or_create_initial_value
> (param_reg);
> +      smap->set_state (state->m_region_model, init_sval,
> +                      tainted, NULL /*origin_new_sval*/, ext_state);
> +      if (POINTER_TYPE_P (TREE_TYPE (param)))
> +       {
> +         const region *pointee_reg = mgr->get_symbolic_region
> (init_sval);
> +         /* Mark "*param" as tainted.  */
> +         const svalue *init_pointee_sval
> +           = mgr->get_or_create_initial_value (pointee_reg);
> +         smap->set_state (state->m_region_model, init_pointee_sval,
> +                          tainted, NULL /*origin_new_sval*/,
> ext_state);
> +       }
> +    }
> +
> +  return true;
> +}
> +
> +/* Custom event for use by tainted_function_info when a function
> +   has been marked with __attribute__((tainted)).  */
> +
> +class tainted_function_custom_event : public custom_event
> +{
> +public:
> +  tainted_function_custom_event (location_t loc, tree fndecl, int
> depth)
> +  : custom_event (loc, fndecl, depth),
> +    m_fndecl (fndecl)
> +  {
> +  }
> +
> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> +  {
> +    return make_label_text
> +      (can_colorize,
> +       "function %qE marked with %<__attribute__((tainted))%>",
> +       m_fndecl);
> +  }
> +
> +private:
> +  tree m_fndecl;
> +};
> +
> +/* Custom exploded_edge info for top-level calls to a function
> +   marked with __attribute__((tainted)).  */
> +
> +class tainted_function_info : public custom_edge_info
> +{
> +public:
> +  tainted_function_info (tree fndecl)
> +  : m_fndecl (fndecl)
> +  {}
> +
> +  void print (pretty_printer *pp) const FINAL OVERRIDE
> +  {
> +    pp_string (pp, "call to tainted function");
> +  };
> +
> +  bool update_model (region_model *,
> +                    const exploded_edge *,
> +                    region_model_context *) const FINAL OVERRIDE
> +  {
> +    /* No-op.  */
> +    return true;
> +  }
> +
> +  void add_events_to_path (checker_path *emission_path,
> +                          const exploded_edge &) const FINAL
> OVERRIDE
> +  {
> +    emission_path->add_event
> +      (new tainted_function_custom_event
> +       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
> +  }
> +
> +private:
> +  tree m_fndecl;
> +};
> +
>  /* Ensure that there is an exploded_node representing an external
> call to
>     FUN, adding it to the worklist if creating it.
>  
> @@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry (function
> *fun)
>    program_state state (m_ext_state);
>    state.push_frame (m_ext_state, fun);
>  
> +  custom_edge_info *edge_info = NULL;
> +
> +  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
> +    {
> +      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
> +       edge_info = new tainted_function_info (fun->decl);
> +    }
> +
>    if (!state.m_valid)
>      return NULL;
>  
>    exploded_node *enode = get_or_create_node (point, state, NULL);
>    if (!enode)
> -    return NULL;
> +    {
> +      delete edge_info;
> +      return NULL;
> +    }
>  
> -  add_edge (m_origin, enode, NULL);
> +  add_edge (m_origin, enode, NULL, edge_info);
>  
>    m_functions_with_enodes.add (fun);
>  
> @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun, logger
> *logger)
>    return true;
>  }
>  
> +/* Custom event for use by tainted_call_info when a callback field
> has been
> +   marked with __attribute__((tainted)), for labelling the field. 
> */
> +
> +class tainted_field_custom_event : public custom_event
> +{
> +public:
> +  tainted_field_custom_event (tree field)
> +  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
> +    m_field (field)
> +  {
> +  }
> +
> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> +  {
> +    return make_label_text (can_colorize,
> +                           "field %qE of %qT"
> +                           " is marked with
> %<__attribute__((tainted))%>",
> +                           m_field, DECL_CONTEXT (m_field));
> +  }
> +
> +private:
> +  tree m_field;
> +};
> +
> +/* Custom event for use by tainted_call_info when a callback field
> has been
> +   marked with __attribute__((tainted)), for labelling the function
> used
> +   in that callback.  */
> +
> +class tainted_callback_custom_event : public custom_event
> +{
> +public:
> +  tainted_callback_custom_event (location_t loc, tree fndecl, int
> depth,
> +                                tree field)
> +  : custom_event (loc, fndecl, depth),
> +    m_field (field)
> +  {
> +  }
> +
> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> +  {
> +    return make_label_text (can_colorize,
> +                           "function %qE used as initializer for
> field %qE"
> +                           " marked with
> %<__attribute__((tainted))%>",
> +                           m_fndecl, m_field);
> +  }
> +
> +private:
> +  tree m_field;
> +};
> +
> +/* Custom edge info for use when adding a function used by a
> callback field
> +   marked with '__attribute__((tainted))'.   */
> +
> +class tainted_call_info : public custom_edge_info
> +{
> +public:
> +  tainted_call_info (tree field, tree fndecl, location_t loc)
> +  : m_field (field), m_fndecl (fndecl), m_loc (loc)
> +  {}
> +
> +  void print (pretty_printer *pp) const FINAL OVERRIDE
> +  {
> +    pp_string (pp, "call to tainted field");
> +  };
> +
> +  bool update_model (region_model *,
> +                    const exploded_edge *,
> +                    region_model_context *) const FINAL OVERRIDE
> +  {
> +    /* No-op.  */
> +    return true;
> +  }
> +
> +  void add_events_to_path (checker_path *emission_path,
> +                          const exploded_edge &) const FINAL
> OVERRIDE
> +  {
> +    /* Show the field in the struct declaration
> +       e.g. "(1) field 'store' is marked with
> '__attribute__((tainted))'"  */
> +    emission_path->add_event
> +      (new tainted_field_custom_event (m_field));
> +
> +    /* Show the callback in the initializer
> +       e.g.
> +       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
> +       for field 'store' marked with '__attribute__((tainted))'". 
> */
> +    emission_path->add_event
> +      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
> m_field));
> +  }
> +
> +private:
> +  tree m_field;
> +  tree m_fndecl;
> +  location_t m_loc;
> +};
> +
> +/* Given an initializer at LOC for FIELD marked with
> '__attribute__((tainted))'
> +   initialized with FNDECL, add an entrypoint to FNDECL to EG (and
> to its
> +   worklist) where the params to FNDECL are marked as tainted.  */
> +
> +static void
> +add_tainted_callback (exploded_graph *eg, tree field, tree fndecl,
> +                     location_t loc)
> +{
> +  logger *logger = eg->get_logger ();
> +
> +  LOG_SCOPE (logger);
> +
> +  if (!gimple_has_body_p (fndecl))
> +    return;
> +
> +  const extrinsic_state &ext_state = eg->get_ext_state ();
> +
> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> +  gcc_assert (fun);
> +
> +  program_point point
> +    = program_point::from_function_entry (eg->get_supergraph (),
> fun);
> +  program_state state (ext_state);
> +  state.push_frame (ext_state, fun);
> +
> +  if (!mark_params_as_tainted (&state, fndecl, ext_state))
> +    return;
> +
> +  if (!state.m_valid)
> +    return;
> +
> +  exploded_node *enode = eg->get_or_create_node (point, state,
> NULL);
> +  if (logger)
> +    {
> +      if (enode)
> +       logger->log ("created EN %i for tainted %qE entrypoint",
> +                    enode->m_index, fndecl);
> +      else
> +       {
> +         logger->log ("did not create enode for tainted %qE
> entrypoint",
> +                      fndecl);
> +         return;
> +       }
> +    }
> +
> +  tainted_call_info *info = new tainted_call_info (field, fndecl,
> loc);
> +  eg->add_edge (eg->get_origin (), enode, NULL, info);
> +}
> +
> +/* Callback for walk_tree for finding callbacks within initializers;
> +   ensure that any callback initializer where the corresponding
> field is
> +   marked with '__attribute__((tainted))' is treated as an
> entrypoint to the
> +   analysis, special-casing that the inputs to the callback are
> +   untrustworthy.  */
> +
> +static tree
> +add_any_callbacks (tree *tp, int *, void *data)
> +{
> +  exploded_graph *eg = (exploded_graph *)data;
> +  if (TREE_CODE (*tp) == CONSTRUCTOR)
> +    {
> +      /* Find fields with the "tainted" attribute.
> +        walk_tree only walks the values, not the index values;
> +        look at the index values.  */
> +      unsigned HOST_WIDE_INT idx;
> +      constructor_elt *ce;
> +
> +      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx,
> &ce);
> +          idx++)
> +       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
> +         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
> >index)))
> +           {
> +             tree value = ce->value;
> +             if (TREE_CODE (value) == ADDR_EXPR
> +                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
> FUNCTION_DECL)
> +               add_tainted_callback (eg, ce->index, TREE_OPERAND
> (value, 0),
> +                                     EXPR_LOCATION (value));
> +           }
> +    }
> +
> +  return NULL_TREE;
> +}
> +
>  /* Add initial nodes to EG, with entrypoints for externally-callable
>     functions.  */
>  
> @@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
>           logger->log ("did not create enode for %qE entrypoint",
> fun->decl);
>        }
>    }
> +
> +  /* Find callbacks that are reachable from global initializers.  */
> +  varpool_node *vpnode;
> +  FOR_EACH_VARIABLE (vpnode)
> +    {
> +      tree decl = vpnode->decl;
> +      tree init = DECL_INITIAL (decl);
> +      if (!init)
> +       continue;
> +      walk_tree (&init, add_any_callbacks, this, NULL);
> +    }
>  }
>  
>  /* The main loop of the analysis.
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index 9e03156de5e..835ba6e0e8c 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -117,6 +117,7 @@ static tree
> handle_no_profile_instrument_function_attribute (tree *, tree,
>                                                              tree,
> int, bool *);
>  static tree handle_malloc_attribute (tree *, tree, tree, int, bool
> *);
>  static tree handle_dealloc_attribute (tree *, tree, tree, int, bool
> *);
> +static tree handle_tainted_attribute (tree *, tree, tree, int, bool
> *);
>  static tree handle_returns_twice_attribute (tree *, tree, tree, int,
> bool *);
>  static tree handle_no_limit_stack_attribute (tree *, tree, tree,
> int,
>                                              bool *);
> @@ -569,6 +570,8 @@ const struct attribute_spec
> c_common_attribute_table[] =
>                               handle_objc_nullability_attribute, NULL
> },
>    { "*dealloc",                1, 2, true, false, false, false,
>                               handle_dealloc_attribute, NULL },
> +  { "tainted",               0, 0, true,  false, false, false,
> +                             handle_tainted_attribute, NULL },
>    { NULL,                     0, 0, false, false, false, false,
> NULL, NULL }
>  };
>  
> @@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree *node,
> tree name, tree args,
>    return NULL_TREE;
>  }
>  
> +/* Handle a "tainted" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_tainted_attribute (tree *node, tree name, tree, int,
> +                         bool *no_add_attrs)
> +{
> +  if (TREE_CODE (*node) != FUNCTION_DECL
> +      && TREE_CODE (*node) != FIELD_DECL)
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
> +              "for functions and function pointer fields",
> +              name);
> +      *no_add_attrs = true;
> +      return NULL_TREE;
> +    }
> +
> +  if (TREE_CODE (*node) == FIELD_DECL
> +      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
> +          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
> FUNCTION_TYPE))
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored;"
> +              " field must be a function pointer",
> +              name);
> +      *no_add_attrs = true;
> +      return NULL_TREE;
> +    }
> +
> +  *no_add_attrs = false; /* OK */
> +
> +  return NULL_TREE;
> +}
> +
>  /* Attempt to partially validate a single attribute ATTR as if
>     it were to be applied to an entity OPER.  */
>  
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 5a6ef464779..826bbd48e7e 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
> Attributes}),
>  labels (@pxref{Label Attributes}),
>  enumerators (@pxref{Enumerator Attributes}),
>  statements (@pxref{Statement Attributes}),
> -and types (@pxref{Type Attributes}).
> +types (@pxref{Type Attributes}),
> +and on field declarations (for @code{tainted}).
>  
>  There is some overlap between the purposes of attributes and pragmas
>  (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
> @@ -3977,6 +3978,25 @@ addition to creating a symbol version (as if
>  @code{"@var{name2}@@@var{nodename}"} was used) the version will be
> also used
>  to resolve @var{name2} by the linker.
>  
> +@item tainted
> +@cindex @code{tainted} function attribute
> +The @code{tainted} attribute is used to specify that a function is
> called
> +in a way that requires sanitization of its arguments, such as a
> system
> +call in an operating system kernel.  Such a function can be
> considered part
> +of the ``attack surface'' of the program.  The attribute can be used
> both
> +on function declarations, and on field declarations containing
> function
> +pointers.  In the latter case, any function used as an initializer
> of
> +such a callback field will be treated as tainted.
> +
> +The analyzer will pay particular attention to such functions when
> both
> +@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
> supplied,
> +potentially issuing warnings guarded by
> +@option{-Wanalyzer-exposure-through-uninit-copy},
> +@option{-Wanalyzer-tainted-allocation-size},
> +@option{-Wanalyzer-tainted-array-index},
> +@option{Wanalyzer-tainted-offset},
> +and @option{Wanalyzer-tainted-size}.
> +
>  @item target_clones (@var{options})
>  @cindex @code{target_clones} function attribute
>  The @code{target_clones} attribute is used to specify that a
> function
> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> new file mode 100644
> index 00000000000..cc4d5900372
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> @@ -0,0 +1,88 @@
> +// TODO: remove need for this option
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +#include "analyzer-decls.h"
> +
> +struct arg_buf
> +{
> +  int i;
> +  int j;
> +};
> +
> +/* Example of marking a function as tainted.  */
> +
> +void __attribute__((tainted))
> +test_1 (int i, void *p, char *q)
> +{
> +  /* There should be a single enode,
> +     for the "tainted" entry to the function.  */
> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> enode" } */
> +
> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
> 'tainted'" } */
> +
> +  struct arg_buf *args = p;
> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
> 'tainted'" } */  
> +}
> +
> +/* Example of marking a callback field as tainted.  */
> +
> +struct s2
> +{
> +  void (*cb) (int, void *, char *)
> +    __attribute__((tainted));
> +};
> +
> +/* Function not marked as tainted.  */
> +
> +void
> +test_2a (int i, void *p, char *q)
> +{
> +  /* There should be a single enode,
> +     for the normal entry to the function.  */
> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> enode" } */
> +
> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> 'start'" } */
> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> 'start'" } */
> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> 'start'" } */
> +
> +  struct arg_buf *args = p;
> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
> 'start'" } */
> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
> 'start'" } */  
> +}
> +
> +/* Function referenced via t2b.cb, marked as "tainted".  */
> +
> +void
> +test_2b (int i, void *p, char *q)
> +{
> +  /* There should be two enodes
> +     for the direct call, and the "tainted" entry to the function. 
> */
> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed
> enodes" } */
> +}
> +
> +/* Callback used via t2c.cb, marked as "tainted".  */
> +void
> +__analyzer_test_2c (int i, void *p, char *q)
> +{
> +  /* There should be a single enode,
> +     for the "tainted" entry to the function.  */
> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> enode" } */
> +
> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> 'tainted'" } */
> +}
> +
> +struct s2 t2b =
> +{
> +  .cb = test_2b
> +};
> +
> +struct s2 t2c =
> +{
> +  .cb = __analyzer_test_2c
> +};
> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> new file mode 100644
> index 00000000000..6f4cbc82efb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> @@ -0,0 +1,6 @@
> +int not_a_fn __attribute__ ((tainted)); /* { dg-warning "'tainted'
> attribute ignored; valid only for functions and function pointer
> fields" } */
> +
> +struct s
> +{
> +  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
> attribute ignored; field must be a function pointer" } */
> +};
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> new file mode 100644
> index 00000000000..fe6c7ebbb1f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> @@ -0,0 +1,93 @@
> +/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in
> the
> +   Linux kernel before 2.6.39.4 on the Alpha platform does not
> properly
> +   restrict the data size for GSI_GET_HWRPB operations, which allows
> +   local users to obtain sensitive information from kernel memory
> via
> +   a crafted call."
> +
> +   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
> 2.6.39.y
> +   in linux-stable.  */
> +
> +// TODO: remove need for this option:
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +#include "analyzer-decls.h"
> +#include "test-uaccess.h"
> +
> +/* Adapted from include/linux/linkage.h.  */
> +
> +#define asmlinkage
> +
> +/* Adapted from include/linux/syscalls.h.  */
> +
> +#define __SC_DECL1(t1, a1)     t1 a1
> +#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
> +#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
> +#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
> +#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
> +#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
> +
> +#define SYSCALL_DEFINEx(x, sname, ...)                         \
> +       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> +
> +#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
> +#define __SYSCALL_DEFINEx(x, name,
> ...)                                        \
> +       asmlinkage __attribute__((tainted)) \
> +       long sys##name(__SC_DECL##x(__VA_ARGS__))
> +
> +#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
> __VA_ARGS__)
> +
> +/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
> +
> +struct hwrpb_struct {
> +       unsigned long phys_addr;        /* check: physical address of
> the hwrpb */
> +       unsigned long id;               /* check: "HWRPB\0\0\0" */
> +       unsigned long revision;
> +       unsigned long size;             /* size of hwrpb */
> +       /* [...snip...] */
> +};
> +
> +extern struct hwrpb_struct *hwrpb;
> +
> +/* Adapted from arch/alpha/kernel/osf_sys.c.  */
> +
> +SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
> buffer,
> +               unsigned long, nbytes, int __user *, start, void
> __user *, arg)
> +{
> +       /* [...snip...] */
> +
> +       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
> "tainted" } */
> +
> +       /* TODO: should have an event explaining why "nbytes" is
> treated as
> +          attacker-controlled.  */
> +
> +       /* case GSI_GET_HWRPB: */
> +               if (nbytes < sizeof(*hwrpb))
> +                       return -1;
> +
> +               __analyzer_dump_state ("taint", nbytes);  /* { dg-
> warning "has_lb" } */
> +
> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
> dg-warning "use of attacker-controlled value 'nbytes' as size without
> upper-bounds checking" } */
> +                       return -2;
> +
> +               return 1;
> +
> +       /* [...snip...] */
> +}
> +
> +/* With the fix for the sense of the size comparison.  */
> +
> +SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user
> *, buffer,
> +               unsigned long, nbytes, int __user *, start, void
> __user *, arg)
> +{
> +       /* [...snip...] */
> +
> +       /* case GSI_GET_HWRPB: */
> +               if (nbytes > sizeof(*hwrpb))
> +                       return -1;
> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
> dg-bogus "attacker-controlled" } */
> +                       return -2;
> +
> +               return 1;
> +
> +       /* [...snip...] */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> new file mode 100644
> index 00000000000..0b9a94a8d6c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> @@ -0,0 +1,38 @@
> +/* See notes in this header.  */
> +#include "taint-CVE-2020-13143.h"
> +
> +// TODO: remove need for this option
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +struct configfs_attribute {
> +       /* [...snip...] */
> +       ssize_t (*store)(struct config_item *, const char *, size_t)
> /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> +               __attribute__((tainted)); /* (this is added).  */
> +};
> +static inline struct gadget_info *to_gadget_info(struct config_item
> *item)
> +{
> +        return container_of(to_config_group(item), struct
> gadget_info, group);
> +}
> +
> +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
> +               const char *page, size_t len)
> +{
> +       struct gadget_info *gi = to_gadget_info(item);
> +       char *name;
> +       int ret;
> +
> +#if 0
> +       /* FIXME: this is the fix.  */
> +       if (strlen(page) < len)
> +               return -EOVERFLOW;
> +#endif
> +
> +       name = kstrdup(page, GFP_KERNEL);
> +       if (!name)
> +               return -ENOMEM;
> +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> checking" } */
> +               name[len - 1] = '\0'; /* { dg-warning "use of
> attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
> bounds checking" } */
> +       /* [...snip...] */                              \
> +}
> +
> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> function 'gadget_dev_desc_UDC_store' used as initializer for field
> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> new file mode 100644
> index 00000000000..e05da9276c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> @@ -0,0 +1,32 @@
> +/* See notes in this header.  */
> +#include "taint-CVE-2020-13143.h"
> +
> +// TODO: remove need for this option
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +struct configfs_attribute {
> +       /* [...snip...] */
> +       ssize_t (*store)(struct config_item *, const char *, size_t)
> /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> +               __attribute__((tainted)); /* (this is added).  */
> +};
> +
> +/* Highly simplified version.  */
> +
> +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
> +               const char *page, size_t len)
> +{
> +       /* TODO: ought to have state_change_event talking about where
> the tainted value comes from.  */
> +
> +       char *name;
> +       /* [...snip...] */
> +
> +       name = kstrdup(page, GFP_KERNEL);
> +       if (!name)
> +               return -ENOMEM;
> +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> checking" } */
> +               name[len - 1] = '\0';  /* { dg-warning "use of
> attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
> bounds checking" } */
> +       /* [...snip...] */
> +       return 0;
> +}
> +
> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> function 'gadget_dev_desc_UDC_store' used as initializer for field
> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> new file mode 100644
> index 00000000000..0ba023539af
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> @@ -0,0 +1,91 @@
> +/* Shared header for the various taint-CVE-2020-13143.h tests.
> +   
> +   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in
> the
> +   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
> considering
> +   the possibility of an internal '\0' value, which allows attackers
> to
> +   trigger an out-of-bounds read, aka CID-15753588bcd4."
> +
> +   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
> +   in linux-stable.  */
> +
> +// TODO: remove need for this option
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +#include <stddef.h>
> +
> +/* Adapted from include/uapi/asm-generic/posix_types.h  */
> +
> +typedef unsigned int     __kernel_size_t;
> +typedef int              __kernel_ssize_t;
> +
> +/* Adapted from include/linux/types.h  */
> +
> +//typedef __kernel_size_t              size_t;
> +typedef __kernel_ssize_t       ssize_t;
> +
> +/* Adapted from include/linux/kernel.h  */
> +
> +#define container_of(ptr, type, member)
> ({                             \
> +       void *__mptr = (void
> *)(ptr);                                   \
> +       /* [...snip...]
> */                                              \
> +       ((type *)(__mptr - offsetof(type, member))); })
> +
> +/* Adapted from include/linux/configfs.h  */
> +
> +struct config_item {
> +       /* [...snip...] */
> +};
> +
> +struct config_group {
> +       struct config_item              cg_item;
> +       /* [...snip...] */
> +};
> +
> +static inline struct config_group *to_config_group(struct
> config_item *item)
> +{
> +       return item ? container_of(item,struct config_group,cg_item)
> : NULL;
> +}
> +
> +#define CONFIGFS_ATTR(_pfx, _name)                             \
> +static struct configfs_attribute _pfx##attr_##_name = {        \
> +       /* [...snip...] */                              \
> +       .store          = _pfx##_name##_store,          \
> +}
> +
> +/* Adapted from include/linux/compiler.h  */
> +
> +#define __force
> +
> +/* Adapted from include/asm-generic/errno-base.h  */
> +
> +#define        ENOMEM          12      /* Out of memory */
> +
> +/* Adapted from include/linux/types.h  */
> +
> +#define __bitwise__
> +typedef unsigned __bitwise__ gfp_t;
> +
> +/* Adapted from include/linux/gfp.h  */
> +
> +#define ___GFP_WAIT            0x10u
> +#define ___GFP_IO              0x40u
> +#define ___GFP_FS              0x80u
> +#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
> +#define __GFP_IO       ((__force gfp_t)___GFP_IO)
> +#define __GFP_FS       ((__force gfp_t)___GFP_FS)
> +#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
> +
> +/* Adapted from include/linux/compiler_attributes.h  */
> +
> +#define __malloc                        __attribute__((__malloc__))
> +
> +/* Adapted from include/linux/string.h  */
> +
> +extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
> +
> +/* Adapted from drivers/usb/gadget/configfs.c  */
> +
> +struct gadget_info {
> +       struct config_group group;
> +       /* [...snip...] */                              \
> +};
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> new file mode 100644
> index 00000000000..4c567b2ffdf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> @@ -0,0 +1,21 @@
> +// TODO: remove need for this option:
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +#include "analyzer-decls.h"
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +/* malloc with tainted size from a syscall.  */
> +
> +void *p;
> +
> +void __attribute__((tainted))
> +test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked
> with '__attribute__\\(\\(tainted\\)\\)'" } */
> +{
> +  /* TODO: should have a message saying why "sz" is tainted, e.g.
> +     "treating 'sz' as attacker-controlled because 'test_1' is
> marked with '__attribute__((tainted))'"  */
> +
> +  p = malloc (sz); /* { dg-warning "use of attacker-controlled value
> 'sz' as allocation size without upper-bounds checking" "warning" } */
> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
> 'sz' as allocation size without upper-bounds checking" "final event"
> { target *-*-* } .-1 } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> new file mode 100644
> index 00000000000..f52cafcd71d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> @@ -0,0 +1,31 @@
> +// TODO: remove need for this option:
> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> +
> +#include "analyzer-decls.h"
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +/* malloc with tainted size from a syscall.  */
> +
> +struct arg_buf
> +{
> +  size_t sz;
> +};
> +
> +void *p;
> +
> +void __attribute__((tainted))
> +test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
> marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> +{
> +  /* we should treat pointed-to-structs as tainted.  */
> +  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
> 'tainted'" } */
> +  
> +  struct arg_buf *args = data;
> +
> +  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
> 'tainted'" } */
> +  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state:
> 'tainted'" } */
> +  
> +  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled
> value '\\*args.sz' as allocation size without upper-bounds checking"
> "warning" } */
> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
> '\\*args.sz' as allocation size without upper-bounds checking" "final
> event" { target *-*-* } .-1 } */
> +}
  
David Malcolm Jan. 10, 2022, 9:36 p.m. UTC | #2
On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
> On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
> > This patch adds a new __attribute__ ((tainted)) to the C/C++
> > frontends.
> 
> Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of this
> patch (attribute registration, documentation, the name of the
> attribute, etc).
> 
> (I believe it's independent of the rest of the patch kit, in that it
> could go into trunk without needing the prior patches)
> 
> Thanks
> Dave

Getting close to end of stage 3 for GCC 12, so pinging this patch
again...

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html

Thanks
Dave

> 
> 
> > 
> > It can be used on function decls: the analyzer will treat as tainted
> > all parameters to the function and all buffers pointed to by
> > parameters
> > to the function.  Adding this in one place to the Linux kernel's
> > __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
> > having tainted inputs.  This gives additional testing beyond e.g.
> > __user
> > pointers added by earlier patches - an example of the use of this can
> > be
> > seen in CVE-2011-2210, where given:
> > 
> >  SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
> > buffer,
> >                  unsigned long, nbytes, int __user *, start, void
> > __user *, arg)
> > 
> > the analyzer will treat the nbytes param as under attacker control,
> > and
> > can complain accordingly:
> > 
> > taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
> > taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled
> > value
> >   ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
> > Wanalyzer-tainted-size]
> >    69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
> >       |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > Additionally, the patch allows the attribute to be used on field
> > decls:
> > specifically function pointers.  Any function used as an initializer
> > for such a field gets treated as tainted.  An example can be seen in
> > CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
> > callback of configfs_attribute:
> > 
> >   struct configfs_attribute {
> >      /* [...snip...] */
> >      ssize_t (*store)(struct config_item *, const char *, size_t)
> >        __attribute__((tainted));
> >      /* [...snip...] */
> >   };
> > 
> > allows the analyzer to see:
> > 
> >  CONFIGFS_ATTR(gadget_dev_desc_, UDC);
> > 
> > and treat gadget_dev_desc_UDC_store as tainted, so that it complains:
> > 
> > taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
> > taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled
> > value
> >   ‘len + 18446744073709551615’ as offset without upper-bounds
> > checking [CWE-823] [-Wanalyzer-tainted-offset]
> >    33 |         if (name[len - 1] == '\n')
> >       |             ~~~~^~~~~~~~~
> > 
> > Similarly, the attribute could be used on the ioctl callback field,
> > USB device callbacks, network-handling callbacks etc.  This
> > potentially
> > gives a lot of test coverage with relatively little code annotation,
> > and
> > without necessarily needing link-time analysis (which -fanalyzer can
> > only do at present on trivial examples).
> > 
> > I believe this is the first time we've had an attribute on a field.
> > If that's an issue, I could prepare a version of the patch that
> > merely allowed it on functions themselves.
> > 
> > As before this currently still needs -fanalyzer-checker=taint (in
> > addition to -fanalyzer).
> > 
> > gcc/analyzer/ChangeLog:
> >         * engine.cc: Include "stringpool.h", "attribs.h", and
> >         "tree-dfa.h".
> >         (mark_params_as_tainted): New.
> >         (class tainted_function_custom_event): New.
> >         (class tainted_function_info): New.
> >         (exploded_graph::add_function_entry): Handle functions with
> >         "tainted" attribute.
> >         (class tainted_field_custom_event): New.
> >         (class tainted_callback_custom_event): New.
> >         (class tainted_call_info): New.
> >         (add_tainted_callback): New.
> >         (add_any_callbacks): New.
> >         (exploded_graph::build_initial_worklist): Find callbacks that
> > are
> >         reachable from global initializers, calling add_any_callbacks
> > on
> >         them.
> > 
> > gcc/c-family/ChangeLog:
> >         * c-attribs.c (c_common_attribute_table): Add "tainted".
> >         (handle_tainted_attribute): New.
> > 
> > gcc/ChangeLog:
> >         * doc/extend.texi (Function Attributes): Note that "tainted"
> > can
> >         be used on field decls.
> >         (Common Function Attributes): Add entry on "tainted"
> > attribute.
> > 
> > gcc/testsuite/ChangeLog:
> >         * gcc.dg/analyzer/attr-tainted-1.c: New test.
> >         * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
> >         * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
> >         * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
> >         * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
> >         * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
> >         * gcc.dg/analyzer/taint-alloc-3.c: New test.
> >         * gcc.dg/analyzer/taint-alloc-4.c: New test.
> > 
> > Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> > ---
> >  gcc/analyzer/engine.cc                        | 317
> > +++++++++++++++++-
> >  gcc/c-family/c-attribs.c                      |  36 ++
> >  gcc/doc/extend.texi                           |  22 +-
> >  .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
> >  .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
> >  .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
> >  .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
> >  .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
> >  .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
> >  gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
> >  gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
> >  11 files changed, 772 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
> > misuses.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-
> > 2210-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> > 13143-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> > 13143-2.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
> > 13143.h
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > 
> > diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> > index 096e219392d..5fab41daf93 100644
> > --- a/gcc/analyzer/engine.cc
> > +++ b/gcc/analyzer/engine.cc
> > @@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "plugin.h"
> >  #include "target.h"
> >  #include <memory>
> > +#include "stringpool.h"
> > +#include "attribs.h"
> > +#include "tree-dfa.h"
> >  
> >  /* For an overview, see gcc/doc/analyzer.texi.  */
> >  
> > @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
> >      delete (*iter).second;
> >  }
> >  
> > +/* Subroutine for use when implementing __attribute__((tainted))
> > +   on functions and on function pointer fields in structs.
> > +
> > +   Called on STATE representing a call to FNDECL.
> > +   Mark all params of FNDECL in STATE as "tainted".  Mark the value
> > of all
> > +   regions pointed to by params of FNDECL as "tainted".
> > +
> > +   Return true if successful; return false if the "taint" state
> > machine
> > +   was not found.  */
> > +
> > +static bool
> > +mark_params_as_tainted (program_state *state, tree fndecl,
> > +                       const extrinsic_state &ext_state)
> > +{
> > +  unsigned taint_sm_idx;
> > +  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
> > +    return false;
> > +  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
> > +
> > +  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
> > +  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
> > +
> > +  region_model_manager *mgr = ext_state.get_model_manager ();
> > +
> > +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> > +  gcc_assert (fun);
> > +
> > +  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
> > +       iter_parm = DECL_CHAIN (iter_parm))
> > +    {
> > +      tree param = iter_parm;
> > +      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
> > +       param = parm_default_ssa;
> > +      const region *param_reg = state->m_region_model->get_lvalue
> > (param, NULL);
> > +      const svalue *init_sval = mgr->get_or_create_initial_value
> > (param_reg);
> > +      smap->set_state (state->m_region_model, init_sval,
> > +                      tainted, NULL /*origin_new_sval*/, ext_state);
> > +      if (POINTER_TYPE_P (TREE_TYPE (param)))
> > +       {
> > +         const region *pointee_reg = mgr->get_symbolic_region
> > (init_sval);
> > +         /* Mark "*param" as tainted.  */
> > +         const svalue *init_pointee_sval
> > +           = mgr->get_or_create_initial_value (pointee_reg);
> > +         smap->set_state (state->m_region_model, init_pointee_sval,
> > +                          tainted, NULL /*origin_new_sval*/,
> > ext_state);
> > +       }
> > +    }
> > +
> > +  return true;
> > +}
> > +
> > +/* Custom event for use by tainted_function_info when a function
> > +   has been marked with __attribute__((tainted)).  */
> > +
> > +class tainted_function_custom_event : public custom_event
> > +{
> > +public:
> > +  tainted_function_custom_event (location_t loc, tree fndecl, int
> > depth)
> > +  : custom_event (loc, fndecl, depth),
> > +    m_fndecl (fndecl)
> > +  {
> > +  }
> > +
> > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > +  {
> > +    return make_label_text
> > +      (can_colorize,
> > +       "function %qE marked with %<__attribute__((tainted))%>",
> > +       m_fndecl);
> > +  }
> > +
> > +private:
> > +  tree m_fndecl;
> > +};
> > +
> > +/* Custom exploded_edge info for top-level calls to a function
> > +   marked with __attribute__((tainted)).  */
> > +
> > +class tainted_function_info : public custom_edge_info
> > +{
> > +public:
> > +  tainted_function_info (tree fndecl)
> > +  : m_fndecl (fndecl)
> > +  {}
> > +
> > +  void print (pretty_printer *pp) const FINAL OVERRIDE
> > +  {
> > +    pp_string (pp, "call to tainted function");
> > +  };
> > +
> > +  bool update_model (region_model *,
> > +                    const exploded_edge *,
> > +                    region_model_context *) const FINAL OVERRIDE
> > +  {
> > +    /* No-op.  */
> > +    return true;
> > +  }
> > +
> > +  void add_events_to_path (checker_path *emission_path,
> > +                          const exploded_edge &) const FINAL
> > OVERRIDE
> > +  {
> > +    emission_path->add_event
> > +      (new tainted_function_custom_event
> > +       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
> > +  }
> > +
> > +private:
> > +  tree m_fndecl;
> > +};
> > +
> >  /* Ensure that there is an exploded_node representing an external
> > call to
> >     FUN, adding it to the worklist if creating it.
> >  
> > @@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry (function
> > *fun)
> >    program_state state (m_ext_state);
> >    state.push_frame (m_ext_state, fun);
> >  
> > +  custom_edge_info *edge_info = NULL;
> > +
> > +  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
> > +    {
> > +      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
> > +       edge_info = new tainted_function_info (fun->decl);
> > +    }
> > +
> >    if (!state.m_valid)
> >      return NULL;
> >  
> >    exploded_node *enode = get_or_create_node (point, state, NULL);
> >    if (!enode)
> > -    return NULL;
> > +    {
> > +      delete edge_info;
> > +      return NULL;
> > +    }
> >  
> > -  add_edge (m_origin, enode, NULL);
> > +  add_edge (m_origin, enode, NULL, edge_info);
> >  
> >    m_functions_with_enodes.add (fun);
> >  
> > @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun, logger
> > *logger)
> >    return true;
> >  }
> >  
> > +/* Custom event for use by tainted_call_info when a callback field
> > has been
> > +   marked with __attribute__((tainted)), for labelling the field. 
> > */
> > +
> > +class tainted_field_custom_event : public custom_event
> > +{
> > +public:
> > +  tainted_field_custom_event (tree field)
> > +  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
> > +    m_field (field)
> > +  {
> > +  }
> > +
> > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > +  {
> > +    return make_label_text (can_colorize,
> > +                           "field %qE of %qT"
> > +                           " is marked with
> > %<__attribute__((tainted))%>",
> > +                           m_field, DECL_CONTEXT (m_field));
> > +  }
> > +
> > +private:
> > +  tree m_field;
> > +};
> > +
> > +/* Custom event for use by tainted_call_info when a callback field
> > has been
> > +   marked with __attribute__((tainted)), for labelling the function
> > used
> > +   in that callback.  */
> > +
> > +class tainted_callback_custom_event : public custom_event
> > +{
> > +public:
> > +  tainted_callback_custom_event (location_t loc, tree fndecl, int
> > depth,
> > +                                tree field)
> > +  : custom_event (loc, fndecl, depth),
> > +    m_field (field)
> > +  {
> > +  }
> > +
> > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > +  {
> > +    return make_label_text (can_colorize,
> > +                           "function %qE used as initializer for
> > field %qE"
> > +                           " marked with
> > %<__attribute__((tainted))%>",
> > +                           m_fndecl, m_field);
> > +  }
> > +
> > +private:
> > +  tree m_field;
> > +};
> > +
> > +/* Custom edge info for use when adding a function used by a
> > callback field
> > +   marked with '__attribute__((tainted))'.   */
> > +
> > +class tainted_call_info : public custom_edge_info
> > +{
> > +public:
> > +  tainted_call_info (tree field, tree fndecl, location_t loc)
> > +  : m_field (field), m_fndecl (fndecl), m_loc (loc)
> > +  {}
> > +
> > +  void print (pretty_printer *pp) const FINAL OVERRIDE
> > +  {
> > +    pp_string (pp, "call to tainted field");
> > +  };
> > +
> > +  bool update_model (region_model *,
> > +                    const exploded_edge *,
> > +                    region_model_context *) const FINAL OVERRIDE
> > +  {
> > +    /* No-op.  */
> > +    return true;
> > +  }
> > +
> > +  void add_events_to_path (checker_path *emission_path,
> > +                          const exploded_edge &) const FINAL
> > OVERRIDE
> > +  {
> > +    /* Show the field in the struct declaration
> > +       e.g. "(1) field 'store' is marked with
> > '__attribute__((tainted))'"  */
> > +    emission_path->add_event
> > +      (new tainted_field_custom_event (m_field));
> > +
> > +    /* Show the callback in the initializer
> > +       e.g.
> > +       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
> > +       for field 'store' marked with '__attribute__((tainted))'". 
> > */
> > +    emission_path->add_event
> > +      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
> > m_field));
> > +  }
> > +
> > +private:
> > +  tree m_field;
> > +  tree m_fndecl;
> > +  location_t m_loc;
> > +};
> > +
> > +/* Given an initializer at LOC for FIELD marked with
> > '__attribute__((tainted))'
> > +   initialized with FNDECL, add an entrypoint to FNDECL to EG (and
> > to its
> > +   worklist) where the params to FNDECL are marked as tainted.  */
> > +
> > +static void
> > +add_tainted_callback (exploded_graph *eg, tree field, tree fndecl,
> > +                     location_t loc)
> > +{
> > +  logger *logger = eg->get_logger ();
> > +
> > +  LOG_SCOPE (logger);
> > +
> > +  if (!gimple_has_body_p (fndecl))
> > +    return;
> > +
> > +  const extrinsic_state &ext_state = eg->get_ext_state ();
> > +
> > +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> > +  gcc_assert (fun);
> > +
> > +  program_point point
> > +    = program_point::from_function_entry (eg->get_supergraph (),
> > fun);
> > +  program_state state (ext_state);
> > +  state.push_frame (ext_state, fun);
> > +
> > +  if (!mark_params_as_tainted (&state, fndecl, ext_state))
> > +    return;
> > +
> > +  if (!state.m_valid)
> > +    return;
> > +
> > +  exploded_node *enode = eg->get_or_create_node (point, state,
> > NULL);
> > +  if (logger)
> > +    {
> > +      if (enode)
> > +       logger->log ("created EN %i for tainted %qE entrypoint",
> > +                    enode->m_index, fndecl);
> > +      else
> > +       {
> > +         logger->log ("did not create enode for tainted %qE
> > entrypoint",
> > +                      fndecl);
> > +         return;
> > +       }
> > +    }
> > +
> > +  tainted_call_info *info = new tainted_call_info (field, fndecl,
> > loc);
> > +  eg->add_edge (eg->get_origin (), enode, NULL, info);
> > +}
> > +
> > +/* Callback for walk_tree for finding callbacks within initializers;
> > +   ensure that any callback initializer where the corresponding
> > field is
> > +   marked with '__attribute__((tainted))' is treated as an
> > entrypoint to the
> > +   analysis, special-casing that the inputs to the callback are
> > +   untrustworthy.  */
> > +
> > +static tree
> > +add_any_callbacks (tree *tp, int *, void *data)
> > +{
> > +  exploded_graph *eg = (exploded_graph *)data;
> > +  if (TREE_CODE (*tp) == CONSTRUCTOR)
> > +    {
> > +      /* Find fields with the "tainted" attribute.
> > +        walk_tree only walks the values, not the index values;
> > +        look at the index values.  */
> > +      unsigned HOST_WIDE_INT idx;
> > +      constructor_elt *ce;
> > +
> > +      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx,
> > &ce);
> > +          idx++)
> > +       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
> > +         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
> > > index)))
> > +           {
> > +             tree value = ce->value;
> > +             if (TREE_CODE (value) == ADDR_EXPR
> > +                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
> > FUNCTION_DECL)
> > +               add_tainted_callback (eg, ce->index, TREE_OPERAND
> > (value, 0),
> > +                                     EXPR_LOCATION (value));
> > +           }
> > +    }
> > +
> > +  return NULL_TREE;
> > +}
> > +
> >  /* Add initial nodes to EG, with entrypoints for externally-callable
> >     functions.  */
> >  
> > @@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
> >           logger->log ("did not create enode for %qE entrypoint",
> > fun->decl);
> >        }
> >    }
> > +
> > +  /* Find callbacks that are reachable from global initializers.  */
> > +  varpool_node *vpnode;
> > +  FOR_EACH_VARIABLE (vpnode)
> > +    {
> > +      tree decl = vpnode->decl;
> > +      tree init = DECL_INITIAL (decl);
> > +      if (!init)
> > +       continue;
> > +      walk_tree (&init, add_any_callbacks, this, NULL);
> > +    }
> >  }
> >  
> >  /* The main loop of the analysis.
> > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> > index 9e03156de5e..835ba6e0e8c 100644
> > --- a/gcc/c-family/c-attribs.c
> > +++ b/gcc/c-family/c-attribs.c
> > @@ -117,6 +117,7 @@ static tree
> > handle_no_profile_instrument_function_attribute (tree *, tree,
> >                                                              tree,
> > int, bool *);
> >  static tree handle_malloc_attribute (tree *, tree, tree, int, bool
> > *);
> >  static tree handle_dealloc_attribute (tree *, tree, tree, int, bool
> > *);
> > +static tree handle_tainted_attribute (tree *, tree, tree, int, bool
> > *);
> >  static tree handle_returns_twice_attribute (tree *, tree, tree, int,
> > bool *);
> >  static tree handle_no_limit_stack_attribute (tree *, tree, tree,
> > int,
> >                                              bool *);
> > @@ -569,6 +570,8 @@ const struct attribute_spec
> > c_common_attribute_table[] =
> >                               handle_objc_nullability_attribute, NULL
> > },
> >    { "*dealloc",                1, 2, true, false, false, false,
> >                               handle_dealloc_attribute, NULL },
> > +  { "tainted",               0, 0, true,  false, false, false,
> > +                             handle_tainted_attribute, NULL },
> >    { NULL,                     0, 0, false, false, false, false,
> > NULL, NULL }
> >  };
> >  
> > @@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree *node,
> > tree name, tree args,
> >    return NULL_TREE;
> >  }
> >  
> > +/* Handle a "tainted" attribute; arguments as in
> > +   struct attribute_spec.handler.  */
> > +
> > +static tree
> > +handle_tainted_attribute (tree *node, tree name, tree, int,
> > +                         bool *no_add_attrs)
> > +{
> > +  if (TREE_CODE (*node) != FUNCTION_DECL
> > +      && TREE_CODE (*node) != FIELD_DECL)
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
> > +              "for functions and function pointer fields",
> > +              name);
> > +      *no_add_attrs = true;
> > +      return NULL_TREE;
> > +    }
> > +
> > +  if (TREE_CODE (*node) == FIELD_DECL
> > +      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
> > +          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
> > FUNCTION_TYPE))
> > +    {
> > +      warning (OPT_Wattributes, "%qE attribute ignored;"
> > +              " field must be a function pointer",
> > +              name);
> > +      *no_add_attrs = true;
> > +      return NULL_TREE;
> > +    }
> > +
> > +  *no_add_attrs = false; /* OK */
> > +
> > +  return NULL_TREE;
> > +}
> > +
> >  /* Attempt to partially validate a single attribute ATTR as if
> >     it were to be applied to an entity OPER.  */
> >  
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 5a6ef464779..826bbd48e7e 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
> > Attributes}),
> >  labels (@pxref{Label Attributes}),
> >  enumerators (@pxref{Enumerator Attributes}),
> >  statements (@pxref{Statement Attributes}),
> > -and types (@pxref{Type Attributes}).
> > +types (@pxref{Type Attributes}),
> > +and on field declarations (for @code{tainted}).
> >  
> >  There is some overlap between the purposes of attributes and pragmas
> >  (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
> > @@ -3977,6 +3978,25 @@ addition to creating a symbol version (as if
> >  @code{"@var{name2}@@@var{nodename}"} was used) the version will be
> > also used
> >  to resolve @var{name2} by the linker.
> >  
> > +@item tainted
> > +@cindex @code{tainted} function attribute
> > +The @code{tainted} attribute is used to specify that a function is
> > called
> > +in a way that requires sanitization of its arguments, such as a
> > system
> > +call in an operating system kernel.  Such a function can be
> > considered part
> > +of the ``attack surface'' of the program.  The attribute can be used
> > both
> > +on function declarations, and on field declarations containing
> > function
> > +pointers.  In the latter case, any function used as an initializer
> > of
> > +such a callback field will be treated as tainted.
> > +
> > +The analyzer will pay particular attention to such functions when
> > both
> > +@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
> > supplied,
> > +potentially issuing warnings guarded by
> > +@option{-Wanalyzer-exposure-through-uninit-copy},
> > +@option{-Wanalyzer-tainted-allocation-size},
> > +@option{-Wanalyzer-tainted-array-index},
> > +@option{Wanalyzer-tainted-offset},
> > +and @option{Wanalyzer-tainted-size}.
> > +
> >  @item target_clones (@var{options})
> >  @cindex @code{target_clones} function attribute
> >  The @code{target_clones} attribute is used to specify that a
> > function
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > new file mode 100644
> > index 00000000000..cc4d5900372
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > @@ -0,0 +1,88 @@
> > +// TODO: remove need for this option
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +#include "analyzer-decls.h"
> > +
> > +struct arg_buf
> > +{
> > +  int i;
> > +  int j;
> > +};
> > +
> > +/* Example of marking a function as tainted.  */
> > +
> > +void __attribute__((tainted))
> > +test_1 (int i, void *p, char *q)
> > +{
> > +  /* There should be a single enode,
> > +     for the "tainted" entry to the function.  */
> > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> > enode" } */
> > +
> > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
> > 'tainted'" } */
> > +
> > +  struct arg_buf *args = p;
> > +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
> > 'tainted'" } */  
> > +}
> > +
> > +/* Example of marking a callback field as tainted.  */
> > +
> > +struct s2
> > +{
> > +  void (*cb) (int, void *, char *)
> > +    __attribute__((tainted));
> > +};
> > +
> > +/* Function not marked as tainted.  */
> > +
> > +void
> > +test_2a (int i, void *p, char *q)
> > +{
> > +  /* There should be a single enode,
> > +     for the normal entry to the function.  */
> > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> > enode" } */
> > +
> > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > 'start'" } */
> > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > 'start'" } */
> > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > 'start'" } */
> > +
> > +  struct arg_buf *args = p;
> > +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
> > 'start'" } */
> > +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
> > 'start'" } */  
> > +}
> > +
> > +/* Function referenced via t2b.cb, marked as "tainted".  */
> > +
> > +void
> > +test_2b (int i, void *p, char *q)
> > +{
> > +  /* There should be two enodes
> > +     for the direct call, and the "tainted" entry to the function. 
> > */
> > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed
> > enodes" } */
> > +}
> > +
> > +/* Callback used via t2c.cb, marked as "tainted".  */
> > +void
> > +__analyzer_test_2c (int i, void *p, char *q)
> > +{
> > +  /* There should be a single enode,
> > +     for the "tainted" entry to the function.  */
> > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
> > enode" } */
> > +
> > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > 'tainted'" } */
> > +}
> > +
> > +struct s2 t2b =
> > +{
> > +  .cb = test_2b
> > +};
> > +
> > +struct s2 t2c =
> > +{
> > +  .cb = __analyzer_test_2c
> > +};
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > new file mode 100644
> > index 00000000000..6f4cbc82efb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > @@ -0,0 +1,6 @@
> > +int not_a_fn __attribute__ ((tainted)); /* { dg-warning "'tainted'
> > attribute ignored; valid only for functions and function pointer
> > fields" } */
> > +
> > +struct s
> > +{
> > +  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
> > attribute ignored; field must be a function pointer" } */
> > +};
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> > new file mode 100644
> > index 00000000000..fe6c7ebbb1f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> > @@ -0,0 +1,93 @@
> > +/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in
> > the
> > +   Linux kernel before 2.6.39.4 on the Alpha platform does not
> > properly
> > +   restrict the data size for GSI_GET_HWRPB operations, which allows
> > +   local users to obtain sensitive information from kernel memory
> > via
> > +   a crafted call."
> > +
> > +   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
> > 2.6.39.y
> > +   in linux-stable.  */
> > +
> > +// TODO: remove need for this option:
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +#include "analyzer-decls.h"
> > +#include "test-uaccess.h"
> > +
> > +/* Adapted from include/linux/linkage.h.  */
> > +
> > +#define asmlinkage
> > +
> > +/* Adapted from include/linux/syscalls.h.  */
> > +
> > +#define __SC_DECL1(t1, a1)     t1 a1
> > +#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
> > +#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
> > +#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
> > +#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
> > +#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
> > +
> > +#define SYSCALL_DEFINEx(x, sname, ...)                         \
> > +       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> > +
> > +#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
> > +#define __SYSCALL_DEFINEx(x, name,
> > ...)                                        \
> > +       asmlinkage __attribute__((tainted)) \
> > +       long sys##name(__SC_DECL##x(__VA_ARGS__))
> > +
> > +#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
> > __VA_ARGS__)
> > +
> > +/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
> > +
> > +struct hwrpb_struct {
> > +       unsigned long phys_addr;        /* check: physical address of
> > the hwrpb */
> > +       unsigned long id;               /* check: "HWRPB\0\0\0" */
> > +       unsigned long revision;
> > +       unsigned long size;             /* size of hwrpb */
> > +       /* [...snip...] */
> > +};
> > +
> > +extern struct hwrpb_struct *hwrpb;
> > +
> > +/* Adapted from arch/alpha/kernel/osf_sys.c.  */
> > +
> > +SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
> > buffer,
> > +               unsigned long, nbytes, int __user *, start, void
> > __user *, arg)
> > +{
> > +       /* [...snip...] */
> > +
> > +       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
> > "tainted" } */
> > +
> > +       /* TODO: should have an event explaining why "nbytes" is
> > treated as
> > +          attacker-controlled.  */
> > +
> > +       /* case GSI_GET_HWRPB: */
> > +               if (nbytes < sizeof(*hwrpb))
> > +                       return -1;
> > +
> > +               __analyzer_dump_state ("taint", nbytes);  /* { dg-
> > warning "has_lb" } */
> > +
> > +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
> > dg-warning "use of attacker-controlled value 'nbytes' as size without
> > upper-bounds checking" } */
> > +                       return -2;
> > +
> > +               return 1;
> > +
> > +       /* [...snip...] */
> > +}
> > +
> > +/* With the fix for the sense of the size comparison.  */
> > +
> > +SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user
> > *, buffer,
> > +               unsigned long, nbytes, int __user *, start, void
> > __user *, arg)
> > +{
> > +       /* [...snip...] */
> > +
> > +       /* case GSI_GET_HWRPB: */
> > +               if (nbytes > sizeof(*hwrpb))
> > +                       return -1;
> > +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
> > dg-bogus "attacker-controlled" } */
> > +                       return -2;
> > +
> > +               return 1;
> > +
> > +       /* [...snip...] */
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> > new file mode 100644
> > index 00000000000..0b9a94a8d6c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> > @@ -0,0 +1,38 @@
> > +/* See notes in this header.  */
> > +#include "taint-CVE-2020-13143.h"
> > +
> > +// TODO: remove need for this option
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +struct configfs_attribute {
> > +       /* [...snip...] */
> > +       ssize_t (*store)(struct config_item *, const char *, size_t)
> > /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
> > is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > +               __attribute__((tainted)); /* (this is added).  */
> > +};
> > +static inline struct gadget_info *to_gadget_info(struct config_item
> > *item)
> > +{
> > +        return container_of(to_config_group(item), struct
> > gadget_info, group);
> > +}
> > +
> > +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
> > +               const char *page, size_t len)
> > +{
> > +       struct gadget_info *gi = to_gadget_info(item);
> > +       char *name;
> > +       int ret;
> > +
> > +#if 0
> > +       /* FIXME: this is the fix.  */
> > +       if (strlen(page) < len)
> > +               return -EOVERFLOW;
> > +#endif
> > +
> > +       name = kstrdup(page, GFP_KERNEL);
> > +       if (!name)
> > +               return -ENOMEM;
> > +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
> > controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> > checking" } */
> > +               name[len - 1] = '\0'; /* { dg-warning "use of
> > attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
> > bounds checking" } */
> > +       /* [...snip...] */                              \
> > +}
> > +
> > +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> > function 'gadget_dev_desc_UDC_store' used as initializer for field
> > 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> > new file mode 100644
> > index 00000000000..e05da9276c1
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> > @@ -0,0 +1,32 @@
> > +/* See notes in this header.  */
> > +#include "taint-CVE-2020-13143.h"
> > +
> > +// TODO: remove need for this option
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +struct configfs_attribute {
> > +       /* [...snip...] */
> > +       ssize_t (*store)(struct config_item *, const char *, size_t)
> > /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
> > is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > +               __attribute__((tainted)); /* (this is added).  */
> > +};
> > +
> > +/* Highly simplified version.  */
> > +
> > +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
> > +               const char *page, size_t len)
> > +{
> > +       /* TODO: ought to have state_change_event talking about where
> > the tainted value comes from.  */
> > +
> > +       char *name;
> > +       /* [...snip...] */
> > +
> > +       name = kstrdup(page, GFP_KERNEL);
> > +       if (!name)
> > +               return -ENOMEM;
> > +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
> > controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> > checking" } */
> > +               name[len - 1] = '\0';  /* { dg-warning "use of
> > attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
> > bounds checking" } */
> > +       /* [...snip...] */
> > +       return 0;
> > +}
> > +
> > +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> > function 'gadget_dev_desc_UDC_store' used as initializer for field
> > 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > new file mode 100644
> > index 00000000000..0ba023539af
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > @@ -0,0 +1,91 @@
> > +/* Shared header for the various taint-CVE-2020-13143.h tests.
> > +   
> > +   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in
> > the
> > +   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
> > considering
> > +   the possibility of an internal '\0' value, which allows attackers
> > to
> > +   trigger an out-of-bounds read, aka CID-15753588bcd4."
> > +
> > +   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
> > +   in linux-stable.  */
> > +
> > +// TODO: remove need for this option
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +#include <stddef.h>
> > +
> > +/* Adapted from include/uapi/asm-generic/posix_types.h  */
> > +
> > +typedef unsigned int     __kernel_size_t;
> > +typedef int              __kernel_ssize_t;
> > +
> > +/* Adapted from include/linux/types.h  */
> > +
> > +//typedef __kernel_size_t              size_t;
> > +typedef __kernel_ssize_t       ssize_t;
> > +
> > +/* Adapted from include/linux/kernel.h  */
> > +
> > +#define container_of(ptr, type, member)
> > ({                             \
> > +       void *__mptr = (void
> > *)(ptr);                                   \
> > +       /* [...snip...]
> > */                                              \
> > +       ((type *)(__mptr - offsetof(type, member))); })
> > +
> > +/* Adapted from include/linux/configfs.h  */
> > +
> > +struct config_item {
> > +       /* [...snip...] */
> > +};
> > +
> > +struct config_group {
> > +       struct config_item              cg_item;
> > +       /* [...snip...] */
> > +};
> > +
> > +static inline struct config_group *to_config_group(struct
> > config_item *item)
> > +{
> > +       return item ? container_of(item,struct config_group,cg_item)
> > : NULL;
> > +}
> > +
> > +#define CONFIGFS_ATTR(_pfx, _name)                             \
> > +static struct configfs_attribute _pfx##attr_##_name = {        \
> > +       /* [...snip...] */                              \
> > +       .store          = _pfx##_name##_store,          \
> > +}
> > +
> > +/* Adapted from include/linux/compiler.h  */
> > +
> > +#define __force
> > +
> > +/* Adapted from include/asm-generic/errno-base.h  */
> > +
> > +#define        ENOMEM          12      /* Out of memory */
> > +
> > +/* Adapted from include/linux/types.h  */
> > +
> > +#define __bitwise__
> > +typedef unsigned __bitwise__ gfp_t;
> > +
> > +/* Adapted from include/linux/gfp.h  */
> > +
> > +#define ___GFP_WAIT            0x10u
> > +#define ___GFP_IO              0x40u
> > +#define ___GFP_FS              0x80u
> > +#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
> > +#define __GFP_IO       ((__force gfp_t)___GFP_IO)
> > +#define __GFP_FS       ((__force gfp_t)___GFP_FS)
> > +#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
> > +
> > +/* Adapted from include/linux/compiler_attributes.h  */
> > +
> > +#define __malloc                        __attribute__((__malloc__))
> > +
> > +/* Adapted from include/linux/string.h  */
> > +
> > +extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
> > +
> > +/* Adapted from drivers/usb/gadget/configfs.c  */
> > +
> > +struct gadget_info {
> > +       struct config_group group;
> > +       /* [...snip...] */                              \
> > +};
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > new file mode 100644
> > index 00000000000..4c567b2ffdf
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > @@ -0,0 +1,21 @@
> > +// TODO: remove need for this option:
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +#include "analyzer-decls.h"
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +/* malloc with tainted size from a syscall.  */
> > +
> > +void *p;
> > +
> > +void __attribute__((tainted))
> > +test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked
> > with '__attribute__\\(\\(tainted\\)\\)'" } */
> > +{
> > +  /* TODO: should have a message saying why "sz" is tainted, e.g.
> > +     "treating 'sz' as attacker-controlled because 'test_1' is
> > marked with '__attribute__((tainted))'"  */
> > +
> > +  p = malloc (sz); /* { dg-warning "use of attacker-controlled value
> > 'sz' as allocation size without upper-bounds checking" "warning" } */
> > +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
> > 'sz' as allocation size without upper-bounds checking" "final event"
> > { target *-*-* } .-1 } */
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > new file mode 100644
> > index 00000000000..f52cafcd71d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > @@ -0,0 +1,31 @@
> > +// TODO: remove need for this option:
> > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > +
> > +#include "analyzer-decls.h"
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +/* malloc with tainted size from a syscall.  */
> > +
> > +struct arg_buf
> > +{
> > +  size_t sz;
> > +};
> > +
> > +void *p;
> > +
> > +void __attribute__((tainted))
> > +test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
> > marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > +{
> > +  /* we should treat pointed-to-structs as tainted.  */
> > +  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
> > 'tainted'" } */
> > +  
> > +  struct arg_buf *args = data;
> > +
> > +  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
> > 'tainted'" } */
> > +  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state:
> > 'tainted'" } */
> > +  
> > +  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled
> > value '\\*args.sz' as allocation size without upper-bounds checking"
> > "warning" } */
> > +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
> > '\\*args.sz' as allocation size without upper-bounds checking" "final
> > event" { target *-*-* } .-1 } */
> > +}
>
  
Jason Merrill Jan. 12, 2022, 4:36 a.m. UTC | #3
On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
> On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
>> On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
>>> This patch adds a new __attribute__ ((tainted)) to the C/C++
>>> frontends.
>>
>> Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of this
>> patch (attribute registration, documentation, the name of the
>> attribute, etc).
>>
>> (I believe it's independent of the rest of the patch kit, in that it
>> could go into trunk without needing the prior patches)
>>
>> Thanks
>> Dave
> 
> Getting close to end of stage 3 for GCC 12, so pinging this patch
> again...
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html

The c-family change is OK.

> Thanks
> Dave
> 
>>
>>
>>>
>>> It can be used on function decls: the analyzer will treat as tainted
>>> all parameters to the function and all buffers pointed to by
>>> parameters
>>> to the function.  Adding this in one place to the Linux kernel's
>>> __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
>>> having tainted inputs.  This gives additional testing beyond e.g.
>>> __user
>>> pointers added by earlier patches - an example of the use of this can
>>> be
>>> seen in CVE-2011-2210, where given:
>>>
>>>   SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
>>> buffer,
>>>                   unsigned long, nbytes, int __user *, start, void
>>> __user *, arg)
>>>
>>> the analyzer will treat the nbytes param as under attacker control,
>>> and
>>> can complain accordingly:
>>>
>>> taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
>>> taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled
>>> value
>>>    ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
>>> Wanalyzer-tainted-size]
>>>     69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
>>>        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> Additionally, the patch allows the attribute to be used on field
>>> decls:
>>> specifically function pointers.  Any function used as an initializer
>>> for such a field gets treated as tainted.  An example can be seen in
>>> CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
>>> callback of configfs_attribute:
>>>
>>>    struct configfs_attribute {
>>>       /* [...snip...] */
>>>       ssize_t (*store)(struct config_item *, const char *, size_t)
>>>         __attribute__((tainted));
>>>       /* [...snip...] */
>>>    };
>>>
>>> allows the analyzer to see:
>>>
>>>   CONFIGFS_ATTR(gadget_dev_desc_, UDC);
>>>
>>> and treat gadget_dev_desc_UDC_store as tainted, so that it complains:
>>>
>>> taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
>>> taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled
>>> value
>>>    ‘len + 18446744073709551615’ as offset without upper-bounds
>>> checking [CWE-823] [-Wanalyzer-tainted-offset]
>>>     33 |         if (name[len - 1] == '\n')
>>>        |             ~~~~^~~~~~~~~
>>>
>>> Similarly, the attribute could be used on the ioctl callback field,
>>> USB device callbacks, network-handling callbacks etc.  This
>>> potentially
>>> gives a lot of test coverage with relatively little code annotation,
>>> and
>>> without necessarily needing link-time analysis (which -fanalyzer can
>>> only do at present on trivial examples).
>>>
>>> I believe this is the first time we've had an attribute on a field.
>>> If that's an issue, I could prepare a version of the patch that
>>> merely allowed it on functions themselves.
>>>
>>> As before this currently still needs -fanalyzer-checker=taint (in
>>> addition to -fanalyzer).
>>>
>>> gcc/analyzer/ChangeLog:
>>>          * engine.cc: Include "stringpool.h", "attribs.h", and
>>>          "tree-dfa.h".
>>>          (mark_params_as_tainted): New.
>>>          (class tainted_function_custom_event): New.
>>>          (class tainted_function_info): New.
>>>          (exploded_graph::add_function_entry): Handle functions with
>>>          "tainted" attribute.
>>>          (class tainted_field_custom_event): New.
>>>          (class tainted_callback_custom_event): New.
>>>          (class tainted_call_info): New.
>>>          (add_tainted_callback): New.
>>>          (add_any_callbacks): New.
>>>          (exploded_graph::build_initial_worklist): Find callbacks that
>>> are
>>>          reachable from global initializers, calling add_any_callbacks
>>> on
>>>          them.
>>>
>>> gcc/c-family/ChangeLog:
>>>          * c-attribs.c (c_common_attribute_table): Add "tainted".
>>>          (handle_tainted_attribute): New.
>>>
>>> gcc/ChangeLog:
>>>          * doc/extend.texi (Function Attributes): Note that "tainted"
>>> can
>>>          be used on field decls.
>>>          (Common Function Attributes): Add entry on "tainted"
>>> attribute.
>>>
>>> gcc/testsuite/ChangeLog:
>>>          * gcc.dg/analyzer/attr-tainted-1.c: New test.
>>>          * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
>>>          * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
>>>          * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
>>>          * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
>>>          * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
>>>          * gcc.dg/analyzer/taint-alloc-3.c: New test.
>>>          * gcc.dg/analyzer/taint-alloc-4.c: New test.
>>>
>>> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
>>> ---
>>>   gcc/analyzer/engine.cc                        | 317
>>> +++++++++++++++++-
>>>   gcc/c-family/c-attribs.c                      |  36 ++
>>>   gcc/doc/extend.texi                           |  22 +-
>>>   .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
>>>   .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
>>>   .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
>>>   .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
>>>   .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
>>>   .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
>>>   gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
>>>   gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
>>>   11 files changed, 772 insertions(+), 3 deletions(-)
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
>>> misuses.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-
>>> 2210-1.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
>>> 13143-1.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
>>> 13143-2.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
>>> 13143.h
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>>
>>> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
>>> index 096e219392d..5fab41daf93 100644
>>> --- a/gcc/analyzer/engine.cc
>>> +++ b/gcc/analyzer/engine.cc
>>> @@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not see
>>>   #include "plugin.h"
>>>   #include "target.h"
>>>   #include <memory>
>>> +#include "stringpool.h"
>>> +#include "attribs.h"
>>> +#include "tree-dfa.h"
>>>   
>>>   /* For an overview, see gcc/doc/analyzer.texi.  */
>>>   
>>> @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
>>>       delete (*iter).second;
>>>   }
>>>   
>>> +/* Subroutine for use when implementing __attribute__((tainted))
>>> +   on functions and on function pointer fields in structs.
>>> +
>>> +   Called on STATE representing a call to FNDECL.
>>> +   Mark all params of FNDECL in STATE as "tainted".  Mark the value
>>> of all
>>> +   regions pointed to by params of FNDECL as "tainted".
>>> +
>>> +   Return true if successful; return false if the "taint" state
>>> machine
>>> +   was not found.  */
>>> +
>>> +static bool
>>> +mark_params_as_tainted (program_state *state, tree fndecl,
>>> +                       const extrinsic_state &ext_state)
>>> +{
>>> +  unsigned taint_sm_idx;
>>> +  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
>>> +    return false;
>>> +  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
>>> +
>>> +  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
>>> +  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
>>> +
>>> +  region_model_manager *mgr = ext_state.get_model_manager ();
>>> +
>>> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
>>> +  gcc_assert (fun);
>>> +
>>> +  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
>>> +       iter_parm = DECL_CHAIN (iter_parm))
>>> +    {
>>> +      tree param = iter_parm;
>>> +      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
>>> +       param = parm_default_ssa;
>>> +      const region *param_reg = state->m_region_model->get_lvalue
>>> (param, NULL);
>>> +      const svalue *init_sval = mgr->get_or_create_initial_value
>>> (param_reg);
>>> +      smap->set_state (state->m_region_model, init_sval,
>>> +                      tainted, NULL /*origin_new_sval*/, ext_state);
>>> +      if (POINTER_TYPE_P (TREE_TYPE (param)))
>>> +       {
>>> +         const region *pointee_reg = mgr->get_symbolic_region
>>> (init_sval);
>>> +         /* Mark "*param" as tainted.  */
>>> +         const svalue *init_pointee_sval
>>> +           = mgr->get_or_create_initial_value (pointee_reg);
>>> +         smap->set_state (state->m_region_model, init_pointee_sval,
>>> +                          tainted, NULL /*origin_new_sval*/,
>>> ext_state);
>>> +       }
>>> +    }
>>> +
>>> +  return true;
>>> +}
>>> +
>>> +/* Custom event for use by tainted_function_info when a function
>>> +   has been marked with __attribute__((tainted)).  */
>>> +
>>> +class tainted_function_custom_event : public custom_event
>>> +{
>>> +public:
>>> +  tainted_function_custom_event (location_t loc, tree fndecl, int
>>> depth)
>>> +  : custom_event (loc, fndecl, depth),
>>> +    m_fndecl (fndecl)
>>> +  {
>>> +  }
>>> +
>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>> +  {
>>> +    return make_label_text
>>> +      (can_colorize,
>>> +       "function %qE marked with %<__attribute__((tainted))%>",
>>> +       m_fndecl);
>>> +  }
>>> +
>>> +private:
>>> +  tree m_fndecl;
>>> +};
>>> +
>>> +/* Custom exploded_edge info for top-level calls to a function
>>> +   marked with __attribute__((tainted)).  */
>>> +
>>> +class tainted_function_info : public custom_edge_info
>>> +{
>>> +public:
>>> +  tainted_function_info (tree fndecl)
>>> +  : m_fndecl (fndecl)
>>> +  {}
>>> +
>>> +  void print (pretty_printer *pp) const FINAL OVERRIDE
>>> +  {
>>> +    pp_string (pp, "call to tainted function");
>>> +  };
>>> +
>>> +  bool update_model (region_model *,
>>> +                    const exploded_edge *,
>>> +                    region_model_context *) const FINAL OVERRIDE
>>> +  {
>>> +    /* No-op.  */
>>> +    return true;
>>> +  }
>>> +
>>> +  void add_events_to_path (checker_path *emission_path,
>>> +                          const exploded_edge &) const FINAL
>>> OVERRIDE
>>> +  {
>>> +    emission_path->add_event
>>> +      (new tainted_function_custom_event
>>> +       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
>>> +  }
>>> +
>>> +private:
>>> +  tree m_fndecl;
>>> +};
>>> +
>>>   /* Ensure that there is an exploded_node representing an external
>>> call to
>>>      FUN, adding it to the worklist if creating it.
>>>   
>>> @@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry (function
>>> *fun)
>>>     program_state state (m_ext_state);
>>>     state.push_frame (m_ext_state, fun);
>>>   
>>> +  custom_edge_info *edge_info = NULL;
>>> +
>>> +  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
>>> +    {
>>> +      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
>>> +       edge_info = new tainted_function_info (fun->decl);
>>> +    }
>>> +
>>>     if (!state.m_valid)
>>>       return NULL;
>>>   
>>>     exploded_node *enode = get_or_create_node (point, state, NULL);
>>>     if (!enode)
>>> -    return NULL;
>>> +    {
>>> +      delete edge_info;
>>> +      return NULL;
>>> +    }
>>>   
>>> -  add_edge (m_origin, enode, NULL);
>>> +  add_edge (m_origin, enode, NULL, edge_info);
>>>   
>>>     m_functions_with_enodes.add (fun);
>>>   
>>> @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun, logger
>>> *logger)
>>>     return true;
>>>   }
>>>   
>>> +/* Custom event for use by tainted_call_info when a callback field
>>> has been
>>> +   marked with __attribute__((tainted)), for labelling the field.
>>> */
>>> +
>>> +class tainted_field_custom_event : public custom_event
>>> +{
>>> +public:
>>> +  tainted_field_custom_event (tree field)
>>> +  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
>>> +    m_field (field)
>>> +  {
>>> +  }
>>> +
>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>> +  {
>>> +    return make_label_text (can_colorize,
>>> +                           "field %qE of %qT"
>>> +                           " is marked with
>>> %<__attribute__((tainted))%>",
>>> +                           m_field, DECL_CONTEXT (m_field));
>>> +  }
>>> +
>>> +private:
>>> +  tree m_field;
>>> +};
>>> +
>>> +/* Custom event for use by tainted_call_info when a callback field
>>> has been
>>> +   marked with __attribute__((tainted)), for labelling the function
>>> used
>>> +   in that callback.  */
>>> +
>>> +class tainted_callback_custom_event : public custom_event
>>> +{
>>> +public:
>>> +  tainted_callback_custom_event (location_t loc, tree fndecl, int
>>> depth,
>>> +                                tree field)
>>> +  : custom_event (loc, fndecl, depth),
>>> +    m_field (field)
>>> +  {
>>> +  }
>>> +
>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>> +  {
>>> +    return make_label_text (can_colorize,
>>> +                           "function %qE used as initializer for
>>> field %qE"
>>> +                           " marked with
>>> %<__attribute__((tainted))%>",
>>> +                           m_fndecl, m_field);
>>> +  }
>>> +
>>> +private:
>>> +  tree m_field;
>>> +};
>>> +
>>> +/* Custom edge info for use when adding a function used by a
>>> callback field
>>> +   marked with '__attribute__((tainted))'.   */
>>> +
>>> +class tainted_call_info : public custom_edge_info
>>> +{
>>> +public:
>>> +  tainted_call_info (tree field, tree fndecl, location_t loc)
>>> +  : m_field (field), m_fndecl (fndecl), m_loc (loc)
>>> +  {}
>>> +
>>> +  void print (pretty_printer *pp) const FINAL OVERRIDE
>>> +  {
>>> +    pp_string (pp, "call to tainted field");
>>> +  };
>>> +
>>> +  bool update_model (region_model *,
>>> +                    const exploded_edge *,
>>> +                    region_model_context *) const FINAL OVERRIDE
>>> +  {
>>> +    /* No-op.  */
>>> +    return true;
>>> +  }
>>> +
>>> +  void add_events_to_path (checker_path *emission_path,
>>> +                          const exploded_edge &) const FINAL
>>> OVERRIDE
>>> +  {
>>> +    /* Show the field in the struct declaration
>>> +       e.g. "(1) field 'store' is marked with
>>> '__attribute__((tainted))'"  */
>>> +    emission_path->add_event
>>> +      (new tainted_field_custom_event (m_field));
>>> +
>>> +    /* Show the callback in the initializer
>>> +       e.g.
>>> +       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
>>> +       for field 'store' marked with '__attribute__((tainted))'".
>>> */
>>> +    emission_path->add_event
>>> +      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
>>> m_field));
>>> +  }
>>> +
>>> +private:
>>> +  tree m_field;
>>> +  tree m_fndecl;
>>> +  location_t m_loc;
>>> +};
>>> +
>>> +/* Given an initializer at LOC for FIELD marked with
>>> '__attribute__((tainted))'
>>> +   initialized with FNDECL, add an entrypoint to FNDECL to EG (and
>>> to its
>>> +   worklist) where the params to FNDECL are marked as tainted.  */
>>> +
>>> +static void
>>> +add_tainted_callback (exploded_graph *eg, tree field, tree fndecl,
>>> +                     location_t loc)
>>> +{
>>> +  logger *logger = eg->get_logger ();
>>> +
>>> +  LOG_SCOPE (logger);
>>> +
>>> +  if (!gimple_has_body_p (fndecl))
>>> +    return;
>>> +
>>> +  const extrinsic_state &ext_state = eg->get_ext_state ();
>>> +
>>> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
>>> +  gcc_assert (fun);
>>> +
>>> +  program_point point
>>> +    = program_point::from_function_entry (eg->get_supergraph (),
>>> fun);
>>> +  program_state state (ext_state);
>>> +  state.push_frame (ext_state, fun);
>>> +
>>> +  if (!mark_params_as_tainted (&state, fndecl, ext_state))
>>> +    return;
>>> +
>>> +  if (!state.m_valid)
>>> +    return;
>>> +
>>> +  exploded_node *enode = eg->get_or_create_node (point, state,
>>> NULL);
>>> +  if (logger)
>>> +    {
>>> +      if (enode)
>>> +       logger->log ("created EN %i for tainted %qE entrypoint",
>>> +                    enode->m_index, fndecl);
>>> +      else
>>> +       {
>>> +         logger->log ("did not create enode for tainted %qE
>>> entrypoint",
>>> +                      fndecl);
>>> +         return;
>>> +       }
>>> +    }
>>> +
>>> +  tainted_call_info *info = new tainted_call_info (field, fndecl,
>>> loc);
>>> +  eg->add_edge (eg->get_origin (), enode, NULL, info);
>>> +}
>>> +
>>> +/* Callback for walk_tree for finding callbacks within initializers;
>>> +   ensure that any callback initializer where the corresponding
>>> field is
>>> +   marked with '__attribute__((tainted))' is treated as an
>>> entrypoint to the
>>> +   analysis, special-casing that the inputs to the callback are
>>> +   untrustworthy.  */
>>> +
>>> +static tree
>>> +add_any_callbacks (tree *tp, int *, void *data)
>>> +{
>>> +  exploded_graph *eg = (exploded_graph *)data;
>>> +  if (TREE_CODE (*tp) == CONSTRUCTOR)
>>> +    {
>>> +      /* Find fields with the "tainted" attribute.
>>> +        walk_tree only walks the values, not the index values;
>>> +        look at the index values.  */
>>> +      unsigned HOST_WIDE_INT idx;
>>> +      constructor_elt *ce;
>>> +
>>> +      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx,
>>> &ce);
>>> +          idx++)
>>> +       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
>>> +         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
>>>> index)))
>>> +           {
>>> +             tree value = ce->value;
>>> +             if (TREE_CODE (value) == ADDR_EXPR
>>> +                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
>>> FUNCTION_DECL)
>>> +               add_tainted_callback (eg, ce->index, TREE_OPERAND
>>> (value, 0),
>>> +                                     EXPR_LOCATION (value));
>>> +           }
>>> +    }
>>> +
>>> +  return NULL_TREE;
>>> +}
>>> +
>>>   /* Add initial nodes to EG, with entrypoints for externally-callable
>>>      functions.  */
>>>   
>>> @@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
>>>            logger->log ("did not create enode for %qE entrypoint",
>>> fun->decl);
>>>         }
>>>     }
>>> +
>>> +  /* Find callbacks that are reachable from global initializers.  */
>>> +  varpool_node *vpnode;
>>> +  FOR_EACH_VARIABLE (vpnode)
>>> +    {
>>> +      tree decl = vpnode->decl;
>>> +      tree init = DECL_INITIAL (decl);
>>> +      if (!init)
>>> +       continue;
>>> +      walk_tree (&init, add_any_callbacks, this, NULL);
>>> +    }
>>>   }
>>>   
>>>   /* The main loop of the analysis.
>>> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
>>> index 9e03156de5e..835ba6e0e8c 100644
>>> --- a/gcc/c-family/c-attribs.c
>>> +++ b/gcc/c-family/c-attribs.c
>>> @@ -117,6 +117,7 @@ static tree
>>> handle_no_profile_instrument_function_attribute (tree *, tree,
>>>                                                               tree,
>>> int, bool *);
>>>   static tree handle_malloc_attribute (tree *, tree, tree, int, bool
>>> *);
>>>   static tree handle_dealloc_attribute (tree *, tree, tree, int, bool
>>> *);
>>> +static tree handle_tainted_attribute (tree *, tree, tree, int, bool
>>> *);
>>>   static tree handle_returns_twice_attribute (tree *, tree, tree, int,
>>> bool *);
>>>   static tree handle_no_limit_stack_attribute (tree *, tree, tree,
>>> int,
>>>                                               bool *);
>>> @@ -569,6 +570,8 @@ const struct attribute_spec
>>> c_common_attribute_table[] =
>>>                                handle_objc_nullability_attribute, NULL
>>> },
>>>     { "*dealloc",                1, 2, true, false, false, false,
>>>                                handle_dealloc_attribute, NULL },
>>> +  { "tainted",               0, 0, true,  false, false, false,
>>> +                             handle_tainted_attribute, NULL },
>>>     { NULL,                     0, 0, false, false, false, false,
>>> NULL, NULL }
>>>   };
>>>   
>>> @@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree *node,
>>> tree name, tree args,
>>>     return NULL_TREE;
>>>   }
>>>   
>>> +/* Handle a "tainted" attribute; arguments as in
>>> +   struct attribute_spec.handler.  */
>>> +
>>> +static tree
>>> +handle_tainted_attribute (tree *node, tree name, tree, int,
>>> +                         bool *no_add_attrs)
>>> +{
>>> +  if (TREE_CODE (*node) != FUNCTION_DECL
>>> +      && TREE_CODE (*node) != FIELD_DECL)
>>> +    {
>>> +      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
>>> +              "for functions and function pointer fields",
>>> +              name);
>>> +      *no_add_attrs = true;
>>> +      return NULL_TREE;
>>> +    }
>>> +
>>> +  if (TREE_CODE (*node) == FIELD_DECL
>>> +      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
>>> +          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
>>> FUNCTION_TYPE))
>>> +    {
>>> +      warning (OPT_Wattributes, "%qE attribute ignored;"
>>> +              " field must be a function pointer",
>>> +              name);
>>> +      *no_add_attrs = true;
>>> +      return NULL_TREE;
>>> +    }
>>> +
>>> +  *no_add_attrs = false; /* OK */
>>> +
>>> +  return NULL_TREE;
>>> +}
>>> +
>>>   /* Attempt to partially validate a single attribute ATTR as if
>>>      it were to be applied to an entity OPER.  */
>>>   
>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>> index 5a6ef464779..826bbd48e7e 100644
>>> --- a/gcc/doc/extend.texi
>>> +++ b/gcc/doc/extend.texi
>>> @@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
>>> Attributes}),
>>>   labels (@pxref{Label Attributes}),
>>>   enumerators (@pxref{Enumerator Attributes}),
>>>   statements (@pxref{Statement Attributes}),
>>> -and types (@pxref{Type Attributes}).
>>> +types (@pxref{Type Attributes}),
>>> +and on field declarations (for @code{tainted}).
>>>   
>>>   There is some overlap between the purposes of attributes and pragmas
>>>   (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
>>> @@ -3977,6 +3978,25 @@ addition to creating a symbol version (as if
>>>   @code{"@var{name2}@@@var{nodename}"} was used) the version will be
>>> also used
>>>   to resolve @var{name2} by the linker.
>>>   
>>> +@item tainted
>>> +@cindex @code{tainted} function attribute
>>> +The @code{tainted} attribute is used to specify that a function is
>>> called
>>> +in a way that requires sanitization of its arguments, such as a
>>> system
>>> +call in an operating system kernel.  Such a function can be
>>> considered part
>>> +of the ``attack surface'' of the program.  The attribute can be used
>>> both
>>> +on function declarations, and on field declarations containing
>>> function
>>> +pointers.  In the latter case, any function used as an initializer
>>> of
>>> +such a callback field will be treated as tainted.
>>> +
>>> +The analyzer will pay particular attention to such functions when
>>> both
>>> +@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
>>> supplied,
>>> +potentially issuing warnings guarded by
>>> +@option{-Wanalyzer-exposure-through-uninit-copy},
>>> +@option{-Wanalyzer-tainted-allocation-size},
>>> +@option{-Wanalyzer-tainted-array-index},
>>> +@option{Wanalyzer-tainted-offset},
>>> +and @option{Wanalyzer-tainted-size}.
>>> +
>>>   @item target_clones (@var{options})
>>>   @cindex @code{target_clones} function attribute
>>>   The @code{target_clones} attribute is used to specify that a
>>> function
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>> new file mode 100644
>>> index 00000000000..cc4d5900372
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>> @@ -0,0 +1,88 @@
>>> +// TODO: remove need for this option
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +#include "analyzer-decls.h"
>>> +
>>> +struct arg_buf
>>> +{
>>> +  int i;
>>> +  int j;
>>> +};
>>> +
>>> +/* Example of marking a function as tainted.  */
>>> +
>>> +void __attribute__((tainted))
>>> +test_1 (int i, void *p, char *q)
>>> +{
>>> +  /* There should be a single enode,
>>> +     for the "tainted" entry to the function.  */
>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
>>> enode" } */
>>> +
>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +
>>> +  struct arg_buf *args = p;
>>> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +}
>>> +
>>> +/* Example of marking a callback field as tainted.  */
>>> +
>>> +struct s2
>>> +{
>>> +  void (*cb) (int, void *, char *)
>>> +    __attribute__((tainted));
>>> +};
>>> +
>>> +/* Function not marked as tainted.  */
>>> +
>>> +void
>>> +test_2a (int i, void *p, char *q)
>>> +{
>>> +  /* There should be a single enode,
>>> +     for the normal entry to the function.  */
>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
>>> enode" } */
>>> +
>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>> 'start'" } */
>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>> 'start'" } */
>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>> 'start'" } */
>>> +
>>> +  struct arg_buf *args = p;
>>> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
>>> 'start'" } */
>>> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
>>> 'start'" } */
>>> +}
>>> +
>>> +/* Function referenced via t2b.cb, marked as "tainted".  */
>>> +
>>> +void
>>> +test_2b (int i, void *p, char *q)
>>> +{
>>> +  /* There should be two enodes
>>> +     for the direct call, and the "tainted" entry to the function.
>>> */
>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed
>>> enodes" } */
>>> +}
>>> +
>>> +/* Callback used via t2c.cb, marked as "tainted".  */
>>> +void
>>> +__analyzer_test_2c (int i, void *p, char *q)
>>> +{
>>> +  /* There should be a single enode,
>>> +     for the "tainted" entry to the function.  */
>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
>>> enode" } */
>>> +
>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +}
>>> +
>>> +struct s2 t2b =
>>> +{
>>> +  .cb = test_2b
>>> +};
>>> +
>>> +struct s2 t2c =
>>> +{
>>> +  .cb = __analyzer_test_2c
>>> +};
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>> new file mode 100644
>>> index 00000000000..6f4cbc82efb
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>> @@ -0,0 +1,6 @@
>>> +int not_a_fn __attribute__ ((tainted)); /* { dg-warning "'tainted'
>>> attribute ignored; valid only for functions and function pointer
>>> fields" } */
>>> +
>>> +struct s
>>> +{
>>> +  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
>>> attribute ignored; field must be a function pointer" } */
>>> +};
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
>>> new file mode 100644
>>> index 00000000000..fe6c7ebbb1f
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
>>> @@ -0,0 +1,93 @@
>>> +/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in
>>> the
>>> +   Linux kernel before 2.6.39.4 on the Alpha platform does not
>>> properly
>>> +   restrict the data size for GSI_GET_HWRPB operations, which allows
>>> +   local users to obtain sensitive information from kernel memory
>>> via
>>> +   a crafted call."
>>> +
>>> +   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
>>> 2.6.39.y
>>> +   in linux-stable.  */
>>> +
>>> +// TODO: remove need for this option:
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +#include "analyzer-decls.h"
>>> +#include "test-uaccess.h"
>>> +
>>> +/* Adapted from include/linux/linkage.h.  */
>>> +
>>> +#define asmlinkage
>>> +
>>> +/* Adapted from include/linux/syscalls.h.  */
>>> +
>>> +#define __SC_DECL1(t1, a1)     t1 a1
>>> +#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
>>> +#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
>>> +#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
>>> +#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
>>> +#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
>>> +
>>> +#define SYSCALL_DEFINEx(x, sname, ...)                         \
>>> +       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>> +
>>> +#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
>>> +#define __SYSCALL_DEFINEx(x, name,
>>> ...)                                        \
>>> +       asmlinkage __attribute__((tainted)) \
>>> +       long sys##name(__SC_DECL##x(__VA_ARGS__))
>>> +
>>> +#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
>>> __VA_ARGS__)
>>> +
>>> +/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
>>> +
>>> +struct hwrpb_struct {
>>> +       unsigned long phys_addr;        /* check: physical address of
>>> the hwrpb */
>>> +       unsigned long id;               /* check: "HWRPB\0\0\0" */
>>> +       unsigned long revision;
>>> +       unsigned long size;             /* size of hwrpb */
>>> +       /* [...snip...] */
>>> +};
>>> +
>>> +extern struct hwrpb_struct *hwrpb;
>>> +
>>> +/* Adapted from arch/alpha/kernel/osf_sys.c.  */
>>> +
>>> +SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
>>> buffer,
>>> +               unsigned long, nbytes, int __user *, start, void
>>> __user *, arg)
>>> +{
>>> +       /* [...snip...] */
>>> +
>>> +       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
>>> "tainted" } */
>>> +
>>> +       /* TODO: should have an event explaining why "nbytes" is
>>> treated as
>>> +          attacker-controlled.  */
>>> +
>>> +       /* case GSI_GET_HWRPB: */
>>> +               if (nbytes < sizeof(*hwrpb))
>>> +                       return -1;
>>> +
>>> +               __analyzer_dump_state ("taint", nbytes);  /* { dg-
>>> warning "has_lb" } */
>>> +
>>> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
>>> dg-warning "use of attacker-controlled value 'nbytes' as size without
>>> upper-bounds checking" } */
>>> +                       return -2;
>>> +
>>> +               return 1;
>>> +
>>> +       /* [...snip...] */
>>> +}
>>> +
>>> +/* With the fix for the sense of the size comparison.  */
>>> +
>>> +SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user
>>> *, buffer,
>>> +               unsigned long, nbytes, int __user *, start, void
>>> __user *, arg)
>>> +{
>>> +       /* [...snip...] */
>>> +
>>> +       /* case GSI_GET_HWRPB: */
>>> +               if (nbytes > sizeof(*hwrpb))
>>> +                       return -1;
>>> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
>>> dg-bogus "attacker-controlled" } */
>>> +                       return -2;
>>> +
>>> +               return 1;
>>> +
>>> +       /* [...snip...] */
>>> +}
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
>>> new file mode 100644
>>> index 00000000000..0b9a94a8d6c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
>>> @@ -0,0 +1,38 @@
>>> +/* See notes in this header.  */
>>> +#include "taint-CVE-2020-13143.h"
>>> +
>>> +// TODO: remove need for this option
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +struct configfs_attribute {
>>> +       /* [...snip...] */
>>> +       ssize_t (*store)(struct config_item *, const char *, size_t)
>>> /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
>>> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> +               __attribute__((tainted)); /* (this is added).  */
>>> +};
>>> +static inline struct gadget_info *to_gadget_info(struct config_item
>>> *item)
>>> +{
>>> +        return container_of(to_config_group(item), struct
>>> gadget_info, group);
>>> +}
>>> +
>>> +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
>>> +               const char *page, size_t len)
>>> +{
>>> +       struct gadget_info *gi = to_gadget_info(item);
>>> +       char *name;
>>> +       int ret;
>>> +
>>> +#if 0
>>> +       /* FIXME: this is the fix.  */
>>> +       if (strlen(page) < len)
>>> +               return -EOVERFLOW;
>>> +#endif
>>> +
>>> +       name = kstrdup(page, GFP_KERNEL);
>>> +       if (!name)
>>> +               return -ENOMEM;
>>> +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
>>> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
>>> checking" } */
>>> +               name[len - 1] = '\0'; /* { dg-warning "use of
>>> attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
>>> bounds checking" } */
>>> +       /* [...snip...] */                              \
>>> +}
>>> +
>>> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
>>> function 'gadget_dev_desc_UDC_store' used as initializer for field
>>> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
>>> new file mode 100644
>>> index 00000000000..e05da9276c1
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
>>> @@ -0,0 +1,32 @@
>>> +/* See notes in this header.  */
>>> +#include "taint-CVE-2020-13143.h"
>>> +
>>> +// TODO: remove need for this option
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +struct configfs_attribute {
>>> +       /* [...snip...] */
>>> +       ssize_t (*store)(struct config_item *, const char *, size_t)
>>> /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
>>> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> +               __attribute__((tainted)); /* (this is added).  */
>>> +};
>>> +
>>> +/* Highly simplified version.  */
>>> +
>>> +static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
>>> +               const char *page, size_t len)
>>> +{
>>> +       /* TODO: ought to have state_change_event talking about where
>>> the tainted value comes from.  */
>>> +
>>> +       char *name;
>>> +       /* [...snip...] */
>>> +
>>> +       name = kstrdup(page, GFP_KERNEL);
>>> +       if (!name)
>>> +               return -ENOMEM;
>>> +       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
>>> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
>>> checking" } */
>>> +               name[len - 1] = '\0';  /* { dg-warning "use of
>>> attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
>>> bounds checking" } */
>>> +       /* [...snip...] */
>>> +       return 0;
>>> +}
>>> +
>>> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
>>> function 'gadget_dev_desc_UDC_store' used as initializer for field
>>> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>> new file mode 100644
>>> index 00000000000..0ba023539af
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>> @@ -0,0 +1,91 @@
>>> +/* Shared header for the various taint-CVE-2020-13143.h tests.
>>> +
>>> +   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in
>>> the
>>> +   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
>>> considering
>>> +   the possibility of an internal '\0' value, which allows attackers
>>> to
>>> +   trigger an out-of-bounds read, aka CID-15753588bcd4."
>>> +
>>> +   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
>>> +   in linux-stable.  */
>>> +
>>> +// TODO: remove need for this option
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +#include <stddef.h>
>>> +
>>> +/* Adapted from include/uapi/asm-generic/posix_types.h  */
>>> +
>>> +typedef unsigned int     __kernel_size_t;
>>> +typedef int              __kernel_ssize_t;
>>> +
>>> +/* Adapted from include/linux/types.h  */
>>> +
>>> +//typedef __kernel_size_t              size_t;
>>> +typedef __kernel_ssize_t       ssize_t;
>>> +
>>> +/* Adapted from include/linux/kernel.h  */
>>> +
>>> +#define container_of(ptr, type, member)
>>> ({                             \
>>> +       void *__mptr = (void
>>> *)(ptr);                                   \
>>> +       /* [...snip...]
>>> */                                              \
>>> +       ((type *)(__mptr - offsetof(type, member))); })
>>> +
>>> +/* Adapted from include/linux/configfs.h  */
>>> +
>>> +struct config_item {
>>> +       /* [...snip...] */
>>> +};
>>> +
>>> +struct config_group {
>>> +       struct config_item              cg_item;
>>> +       /* [...snip...] */
>>> +};
>>> +
>>> +static inline struct config_group *to_config_group(struct
>>> config_item *item)
>>> +{
>>> +       return item ? container_of(item,struct config_group,cg_item)
>>> : NULL;
>>> +}
>>> +
>>> +#define CONFIGFS_ATTR(_pfx, _name)                             \
>>> +static struct configfs_attribute _pfx##attr_##_name = {        \
>>> +       /* [...snip...] */                              \
>>> +       .store          = _pfx##_name##_store,          \
>>> +}
>>> +
>>> +/* Adapted from include/linux/compiler.h  */
>>> +
>>> +#define __force
>>> +
>>> +/* Adapted from include/asm-generic/errno-base.h  */
>>> +
>>> +#define        ENOMEM          12      /* Out of memory */
>>> +
>>> +/* Adapted from include/linux/types.h  */
>>> +
>>> +#define __bitwise__
>>> +typedef unsigned __bitwise__ gfp_t;
>>> +
>>> +/* Adapted from include/linux/gfp.h  */
>>> +
>>> +#define ___GFP_WAIT            0x10u
>>> +#define ___GFP_IO              0x40u
>>> +#define ___GFP_FS              0x80u
>>> +#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
>>> +#define __GFP_IO       ((__force gfp_t)___GFP_IO)
>>> +#define __GFP_FS       ((__force gfp_t)___GFP_FS)
>>> +#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
>>> +
>>> +/* Adapted from include/linux/compiler_attributes.h  */
>>> +
>>> +#define __malloc                        __attribute__((__malloc__))
>>> +
>>> +/* Adapted from include/linux/string.h  */
>>> +
>>> +extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
>>> +
>>> +/* Adapted from drivers/usb/gadget/configfs.c  */
>>> +
>>> +struct gadget_info {
>>> +       struct config_group group;
>>> +       /* [...snip...] */                              \
>>> +};
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>> new file mode 100644
>>> index 00000000000..4c567b2ffdf
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>> @@ -0,0 +1,21 @@
>>> +// TODO: remove need for this option:
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +#include "analyzer-decls.h"
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <string.h>
>>> +
>>> +/* malloc with tainted size from a syscall.  */
>>> +
>>> +void *p;
>>> +
>>> +void __attribute__((tainted))
>>> +test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked
>>> with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> +{
>>> +  /* TODO: should have a message saying why "sz" is tainted, e.g.
>>> +     "treating 'sz' as attacker-controlled because 'test_1' is
>>> marked with '__attribute__((tainted))'"  */
>>> +
>>> +  p = malloc (sz); /* { dg-warning "use of attacker-controlled value
>>> 'sz' as allocation size without upper-bounds checking" "warning" } */
>>> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
>>> 'sz' as allocation size without upper-bounds checking" "final event"
>>> { target *-*-* } .-1 } */
>>> +}
>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>> new file mode 100644
>>> index 00000000000..f52cafcd71d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>> @@ -0,0 +1,31 @@
>>> +// TODO: remove need for this option:
>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>> +
>>> +#include "analyzer-decls.h"
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <string.h>
>>> +
>>> +/* malloc with tainted size from a syscall.  */
>>> +
>>> +struct arg_buf
>>> +{
>>> +  size_t sz;
>>> +};
>>> +
>>> +void *p;
>>> +
>>> +void __attribute__((tainted))
>>> +test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
>>> marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>> +{
>>> +  /* we should treat pointed-to-structs as tainted.  */
>>> +  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +
>>> +  struct arg_buf *args = data;
>>> +
>>> +  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state:
>>> 'tainted'" } */
>>> +
>>> +  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled
>>> value '\\*args.sz' as allocation size without upper-bounds checking"
>>> "warning" } */
>>> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
>>> '\\*args.sz' as allocation size without upper-bounds checking" "final
>>> event" { target *-*-* } .-1 } */
>>> +}
>>
> 
>
  
David Malcolm Jan. 12, 2022, 3:33 p.m. UTC | #4
On Tue, 2022-01-11 at 23:36 -0500, Jason Merrill wrote:
> On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
> > On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
> > > On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
> > > > This patch adds a new __attribute__ ((tainted)) to the C/C++
> > > > frontends.
> > > 
> > > Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of
> > > this
> > > patch (attribute registration, documentation, the name of the
> > > attribute, etc).
> > > 
> > > (I believe it's independent of the rest of the patch kit, in that
> > > it
> > > could go into trunk without needing the prior patches)
> > > 
> > > Thanks
> > > Dave
> > 
> > Getting close to end of stage 3 for GCC 12, so pinging this patch
> > again...
> > 
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html
> 
> The c-family change is OK.

Thanks.

I'm retesting the patch now, but it now seems to me that
  __attribute__((tainted_args))
would lead to more readable code than:
  __attribute__((tainted))

in that the name "tainted_args" better conveys the idea that all
arguments are under attacker-control (as opposed to the body of the
function or the function pointer being under attacker-control).

Looking at
  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
we already have some attributes with underscores in their names.

Does this sound good?
Dave

> 
> > Thanks
> > Dave
> > 
> > > 
> > > 
> > > > 
> > > > It can be used on function decls: the analyzer will treat as
> > > > tainted
> > > > all parameters to the function and all buffers pointed to by
> > > > parameters
> > > > to the function.  Adding this in one place to the Linux kernel's
> > > > __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls
> > > > as
> > > > having tainted inputs.  This gives additional testing beyond e.g.
> > > > __user
> > > > pointers added by earlier patches - an example of the use of this
> > > > can
> > > > be
> > > > seen in CVE-2011-2210, where given:
> > > > 
> > > >   SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user
> > > > *,
> > > > buffer,
> > > >                   unsigned long, nbytes, int __user *, start,
> > > > void
> > > > __user *, arg)
> > > > 
> > > > the analyzer will treat the nbytes param as under attacker
> > > > control,
> > > > and
> > > > can complain accordingly:
> > > > 
> > > > taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
> > > > taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-
> > > > controlled
> > > > value
> > > >    ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
> > > > Wanalyzer-tainted-size]
> > > >     69 |                 if (copy_to_user(buffer, hwrpb, nbytes)
> > > > != 0)
> > > >        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > 
> > > > Additionally, the patch allows the attribute to be used on field
> > > > decls:
> > > > specifically function pointers.  Any function used as an
> > > > initializer
> > > > for such a field gets treated as tainted.  An example can be seen
> > > > in
> > > > CVE-2020-13143, where adding __attribute__((tainted)) to the
> > > > "store"
> > > > callback of configfs_attribute:
> > > > 
> > > >    struct configfs_attribute {
> > > >       /* [...snip...] */
> > > >       ssize_t (*store)(struct config_item *, const char *,
> > > > size_t)
> > > >         __attribute__((tainted));
> > > >       /* [...snip...] */
> > > >    };
> > > > 
> > > > allows the analyzer to see:
> > > > 
> > > >   CONFIGFS_ATTR(gadget_dev_desc_, UDC);
> > > > 
> > > > and treat gadget_dev_desc_UDC_store as tainted, so that it
> > > > complains:
> > > > 
> > > > taint-CVE-2020-13143-1.c: In function
> > > > ‘gadget_dev_desc_UDC_store’:
> > > > taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-
> > > > controlled
> > > > value
> > > >    ‘len + 18446744073709551615’ as offset without upper-bounds
> > > > checking [CWE-823] [-Wanalyzer-tainted-offset]
> > > >     33 |         if (name[len - 1] == '\n')
> > > >        |             ~~~~^~~~~~~~~
> > > > 
> > > > Similarly, the attribute could be used on the ioctl callback
> > > > field,
> > > > USB device callbacks, network-handling callbacks etc.  This
> > > > potentially
> > > > gives a lot of test coverage with relatively little code
> > > > annotation,
> > > > and
> > > > without necessarily needing link-time analysis (which -fanalyzer
> > > > can
> > > > only do at present on trivial examples).
> > > > 
> > > > I believe this is the first time we've had an attribute on a
> > > > field.
> > > > If that's an issue, I could prepare a version of the patch that
> > > > merely allowed it on functions themselves.
> > > > 
> > > > As before this currently still needs -fanalyzer-checker=taint (in
> > > > addition to -fanalyzer).
> > > > 
> > > > gcc/analyzer/ChangeLog:
> > > >          * engine.cc: Include "stringpool.h", "attribs.h", and
> > > >          "tree-dfa.h".
> > > >          (mark_params_as_tainted): New.
> > > >          (class tainted_function_custom_event): New.
> > > >          (class tainted_function_info): New.
> > > >          (exploded_graph::add_function_entry): Handle functions
> > > > with
> > > >          "tainted" attribute.
> > > >          (class tainted_field_custom_event): New.
> > > >          (class tainted_callback_custom_event): New.
> > > >          (class tainted_call_info): New.
> > > >          (add_tainted_callback): New.
> > > >          (add_any_callbacks): New.
> > > >          (exploded_graph::build_initial_worklist): Find callbacks
> > > > that
> > > > are
> > > >          reachable from global initializers, calling
> > > > add_any_callbacks
> > > > on
> > > >          them.
> > > > 
> > > > gcc/c-family/ChangeLog:
> > > >          * c-attribs.c (c_common_attribute_table): Add "tainted".
> > > >          (handle_tainted_attribute): New.
> > > > 
> > > > gcc/ChangeLog:
> > > >          * doc/extend.texi (Function Attributes): Note that
> > > > "tainted"
> > > > can
> > > >          be used on field decls.
> > > >          (Common Function Attributes): Add entry on "tainted"
> > > > attribute.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > >          * gcc.dg/analyzer/attr-tainted-1.c: New test.
> > > >          * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
> > > >          * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
> > > >          * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
> > > >          * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
> > > >          * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
> > > >          * gcc.dg/analyzer/taint-alloc-3.c: New test.
> > > >          * gcc.dg/analyzer/taint-alloc-4.c: New test.
> > > > 
> > > > Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> > > > ---
> > > >   gcc/analyzer/engine.cc                        | 317
> > > > +++++++++++++++++-
> > > >   gcc/c-family/c-attribs.c                      |  36 ++
> > > >   gcc/doc/extend.texi                           |  22 +-
> > > >   .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
> > > >   .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
> > > >   .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
> > > >   .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
> > > >   .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
> > > >   .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
> > > >   gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
> > > >   gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
> > > >   11 files changed, 772 insertions(+), 3 deletions(-)
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
> > > > 1.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
> > > > misuses.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
> > > > 2011-
> > > > 2210-1.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
> > > > 2020-
> > > > 13143-1.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
> > > > 2020-
> > > > 13143-2.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
> > > > 2020-
> > > > 13143.h
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-
> > > > 3.c
> > > >   create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-
> > > > 4.c
> > > > 
> > > > diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> > > > index 096e219392d..5fab41daf93 100644
> > > > --- a/gcc/analyzer/engine.cc
> > > > +++ b/gcc/analyzer/engine.cc
> > > > @@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not
> > > > see
> > > >   #include "plugin.h"
> > > >   #include "target.h"
> > > >   #include <memory>
> > > > +#include "stringpool.h"
> > > > +#include "attribs.h"
> > > > +#include "tree-dfa.h"
> > > >   
> > > >   /* For an overview, see gcc/doc/analyzer.texi.  */
> > > >   
> > > > @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
> > > >       delete (*iter).second;
> > > >   }
> > > >   
> > > > +/* Subroutine for use when implementing __attribute__((tainted))
> > > > +   on functions and on function pointer fields in structs.
> > > > +
> > > > +   Called on STATE representing a call to FNDECL.
> > > > +   Mark all params of FNDECL in STATE as "tainted".  Mark the
> > > > value
> > > > of all
> > > > +   regions pointed to by params of FNDECL as "tainted".
> > > > +
> > > > +   Return true if successful; return false if the "taint" state
> > > > machine
> > > > +   was not found.  */
> > > > +
> > > > +static bool
> > > > +mark_params_as_tainted (program_state *state, tree fndecl,
> > > > +                       const extrinsic_state &ext_state)
> > > > +{
> > > > +  unsigned taint_sm_idx;
> > > > +  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
> > > > +    return false;
> > > > +  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
> > > > +
> > > > +  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
> > > > +  state_machine::state_t tainted = sm.get_state_by_name
> > > > ("tainted");
> > > > +
> > > > +  region_model_manager *mgr = ext_state.get_model_manager ();
> > > > +
> > > > +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> > > > +  gcc_assert (fun);
> > > > +
> > > > +  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
> > > > +       iter_parm = DECL_CHAIN (iter_parm))
> > > > +    {
> > > > +      tree param = iter_parm;
> > > > +      if (tree parm_default_ssa = ssa_default_def (fun,
> > > > iter_parm))
> > > > +       param = parm_default_ssa;
> > > > +      const region *param_reg = state->m_region_model-
> > > > >get_lvalue
> > > > (param, NULL);
> > > > +      const svalue *init_sval = mgr->get_or_create_initial_value
> > > > (param_reg);
> > > > +      smap->set_state (state->m_region_model, init_sval,
> > > > +                      tainted, NULL /*origin_new_sval*/,
> > > > ext_state);
> > > > +      if (POINTER_TYPE_P (TREE_TYPE (param)))
> > > > +       {
> > > > +         const region *pointee_reg = mgr->get_symbolic_region
> > > > (init_sval);
> > > > +         /* Mark "*param" as tainted.  */
> > > > +         const svalue *init_pointee_sval
> > > > +           = mgr->get_or_create_initial_value (pointee_reg);
> > > > +         smap->set_state (state->m_region_model,
> > > > init_pointee_sval,
> > > > +                          tainted, NULL /*origin_new_sval*/,
> > > > ext_state);
> > > > +       }
> > > > +    }
> > > > +
> > > > +  return true;
> > > > +}
> > > > +
> > > > +/* Custom event for use by tainted_function_info when a function
> > > > +   has been marked with __attribute__((tainted)).  */
> > > > +
> > > > +class tainted_function_custom_event : public custom_event
> > > > +{
> > > > +public:
> > > > +  tainted_function_custom_event (location_t loc, tree fndecl,
> > > > int
> > > > depth)
> > > > +  : custom_event (loc, fndecl, depth),
> > > > +    m_fndecl (fndecl)
> > > > +  {
> > > > +  }
> > > > +
> > > > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > > > +  {
> > > > +    return make_label_text
> > > > +      (can_colorize,
> > > > +       "function %qE marked with %<__attribute__((tainted))%>",
> > > > +       m_fndecl);
> > > > +  }
> > > > +
> > > > +private:
> > > > +  tree m_fndecl;
> > > > +};
> > > > +
> > > > +/* Custom exploded_edge info for top-level calls to a function
> > > > +   marked with __attribute__((tainted)).  */
> > > > +
> > > > +class tainted_function_info : public custom_edge_info
> > > > +{
> > > > +public:
> > > > +  tainted_function_info (tree fndecl)
> > > > +  : m_fndecl (fndecl)
> > > > +  {}
> > > > +
> > > > +  void print (pretty_printer *pp) const FINAL OVERRIDE
> > > > +  {
> > > > +    pp_string (pp, "call to tainted function");
> > > > +  };
> > > > +
> > > > +  bool update_model (region_model *,
> > > > +                    const exploded_edge *,
> > > > +                    region_model_context *) const FINAL OVERRIDE
> > > > +  {
> > > > +    /* No-op.  */
> > > > +    return true;
> > > > +  }
> > > > +
> > > > +  void add_events_to_path (checker_path *emission_path,
> > > > +                          const exploded_edge &) const FINAL
> > > > OVERRIDE
> > > > +  {
> > > > +    emission_path->add_event
> > > > +      (new tainted_function_custom_event
> > > > +       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
> > > > +  }
> > > > +
> > > > +private:
> > > > +  tree m_fndecl;
> > > > +};
> > > > +
> > > >   /* Ensure that there is an exploded_node representing an
> > > > external
> > > > call to
> > > >      FUN, adding it to the worklist if creating it.
> > > >   
> > > > @@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry
> > > > (function
> > > > *fun)
> > > >     program_state state (m_ext_state);
> > > >     state.push_frame (m_ext_state, fun);
> > > >   
> > > > +  custom_edge_info *edge_info = NULL;
> > > > +
> > > > +  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
> > > > +    {
> > > > +      if (mark_params_as_tainted (&state, fun->decl,
> > > > m_ext_state))
> > > > +       edge_info = new tainted_function_info (fun->decl);
> > > > +    }
> > > > +
> > > >     if (!state.m_valid)
> > > >       return NULL;
> > > >   
> > > >     exploded_node *enode = get_or_create_node (point, state,
> > > > NULL);
> > > >     if (!enode)
> > > > -    return NULL;
> > > > +    {
> > > > +      delete edge_info;
> > > > +      return NULL;
> > > > +    }
> > > >   
> > > > -  add_edge (m_origin, enode, NULL);
> > > > +  add_edge (m_origin, enode, NULL, edge_info);
> > > >   
> > > >     m_functions_with_enodes.add (fun);
> > > >   
> > > > @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun,
> > > > logger
> > > > *logger)
> > > >     return true;
> > > >   }
> > > >   
> > > > +/* Custom event for use by tainted_call_info when a callback
> > > > field
> > > > has been
> > > > +   marked with __attribute__((tainted)), for labelling the
> > > > field.
> > > > */
> > > > +
> > > > +class tainted_field_custom_event : public custom_event
> > > > +{
> > > > +public:
> > > > +  tainted_field_custom_event (tree field)
> > > > +  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
> > > > +    m_field (field)
> > > > +  {
> > > > +  }
> > > > +
> > > > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > > > +  {
> > > > +    return make_label_text (can_colorize,
> > > > +                           "field %qE of %qT"
> > > > +                           " is marked with
> > > > %<__attribute__((tainted))%>",
> > > > +                           m_field, DECL_CONTEXT (m_field));
> > > > +  }
> > > > +
> > > > +private:
> > > > +  tree m_field;
> > > > +};
> > > > +
> > > > +/* Custom event for use by tainted_call_info when a callback
> > > > field
> > > > has been
> > > > +   marked with __attribute__((tainted)), for labelling the
> > > > function
> > > > used
> > > > +   in that callback.  */
> > > > +
> > > > +class tainted_callback_custom_event : public custom_event
> > > > +{
> > > > +public:
> > > > +  tainted_callback_custom_event (location_t loc, tree fndecl,
> > > > int
> > > > depth,
> > > > +                                tree field)
> > > > +  : custom_event (loc, fndecl, depth),
> > > > +    m_field (field)
> > > > +  {
> > > > +  }
> > > > +
> > > > +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> > > > +  {
> > > > +    return make_label_text (can_colorize,
> > > > +                           "function %qE used as initializer for
> > > > field %qE"
> > > > +                           " marked with
> > > > %<__attribute__((tainted))%>",
> > > > +                           m_fndecl, m_field);
> > > > +  }
> > > > +
> > > > +private:
> > > > +  tree m_field;
> > > > +};
> > > > +
> > > > +/* Custom edge info for use when adding a function used by a
> > > > callback field
> > > > +   marked with '__attribute__((tainted))'.   */
> > > > +
> > > > +class tainted_call_info : public custom_edge_info
> > > > +{
> > > > +public:
> > > > +  tainted_call_info (tree field, tree fndecl, location_t loc)
> > > > +  : m_field (field), m_fndecl (fndecl), m_loc (loc)
> > > > +  {}
> > > > +
> > > > +  void print (pretty_printer *pp) const FINAL OVERRIDE
> > > > +  {
> > > > +    pp_string (pp, "call to tainted field");
> > > > +  };
> > > > +
> > > > +  bool update_model (region_model *,
> > > > +                    const exploded_edge *,
> > > > +                    region_model_context *) const FINAL OVERRIDE
> > > > +  {
> > > > +    /* No-op.  */
> > > > +    return true;
> > > > +  }
> > > > +
> > > > +  void add_events_to_path (checker_path *emission_path,
> > > > +                          const exploded_edge &) const FINAL
> > > > OVERRIDE
> > > > +  {
> > > > +    /* Show the field in the struct declaration
> > > > +       e.g. "(1) field 'store' is marked with
> > > > '__attribute__((tainted))'"  */
> > > > +    emission_path->add_event
> > > > +      (new tainted_field_custom_event (m_field));
> > > > +
> > > > +    /* Show the callback in the initializer
> > > > +       e.g.
> > > > +       "(2) function 'gadget_dev_desc_UDC_store' used as
> > > > initializer
> > > > +       for field 'store' marked with
> > > > '__attribute__((tainted))'".
> > > > */
> > > > +    emission_path->add_event
> > > > +      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
> > > > m_field));
> > > > +  }
> > > > +
> > > > +private:
> > > > +  tree m_field;
> > > > +  tree m_fndecl;
> > > > +  location_t m_loc;
> > > > +};
> > > > +
> > > > +/* Given an initializer at LOC for FIELD marked with
> > > > '__attribute__((tainted))'
> > > > +   initialized with FNDECL, add an entrypoint to FNDECL to EG
> > > > (and
> > > > to its
> > > > +   worklist) where the params to FNDECL are marked as tainted. 
> > > > */
> > > > +
> > > > +static void
> > > > +add_tainted_callback (exploded_graph *eg, tree field, tree
> > > > fndecl,
> > > > +                     location_t loc)
> > > > +{
> > > > +  logger *logger = eg->get_logger ();
> > > > +
> > > > +  LOG_SCOPE (logger);
> > > > +
> > > > +  if (!gimple_has_body_p (fndecl))
> > > > +    return;
> > > > +
> > > > +  const extrinsic_state &ext_state = eg->get_ext_state ();
> > > > +
> > > > +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
> > > > +  gcc_assert (fun);
> > > > +
> > > > +  program_point point
> > > > +    = program_point::from_function_entry (eg->get_supergraph (),
> > > > fun);
> > > > +  program_state state (ext_state);
> > > > +  state.push_frame (ext_state, fun);
> > > > +
> > > > +  if (!mark_params_as_tainted (&state, fndecl, ext_state))
> > > > +    return;
> > > > +
> > > > +  if (!state.m_valid)
> > > > +    return;
> > > > +
> > > > +  exploded_node *enode = eg->get_or_create_node (point, state,
> > > > NULL);
> > > > +  if (logger)
> > > > +    {
> > > > +      if (enode)
> > > > +       logger->log ("created EN %i for tainted %qE entrypoint",
> > > > +                    enode->m_index, fndecl);
> > > > +      else
> > > > +       {
> > > > +         logger->log ("did not create enode for tainted %qE
> > > > entrypoint",
> > > > +                      fndecl);
> > > > +         return;
> > > > +       }
> > > > +    }
> > > > +
> > > > +  tainted_call_info *info = new tainted_call_info (field,
> > > > fndecl,
> > > > loc);
> > > > +  eg->add_edge (eg->get_origin (), enode, NULL, info);
> > > > +}
> > > > +
> > > > +/* Callback for walk_tree for finding callbacks within
> > > > initializers;
> > > > +   ensure that any callback initializer where the corresponding
> > > > field is
> > > > +   marked with '__attribute__((tainted))' is treated as an
> > > > entrypoint to the
> > > > +   analysis, special-casing that the inputs to the callback are
> > > > +   untrustworthy.  */
> > > > +
> > > > +static tree
> > > > +add_any_callbacks (tree *tp, int *, void *data)
> > > > +{
> > > > +  exploded_graph *eg = (exploded_graph *)data;
> > > > +  if (TREE_CODE (*tp) == CONSTRUCTOR)
> > > > +    {
> > > > +      /* Find fields with the "tainted" attribute.
> > > > +        walk_tree only walks the values, not the index values;
> > > > +        look at the index values.  */
> > > > +      unsigned HOST_WIDE_INT idx;
> > > > +      constructor_elt *ce;
> > > > +
> > > > +      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp),
> > > > idx,
> > > > &ce);
> > > > +          idx++)
> > > > +       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
> > > > +         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
> > > > > index)))
> > > > +           {
> > > > +             tree value = ce->value;
> > > > +             if (TREE_CODE (value) == ADDR_EXPR
> > > > +                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
> > > > FUNCTION_DECL)
> > > > +               add_tainted_callback (eg, ce->index, TREE_OPERAND
> > > > (value, 0),
> > > > +                                     EXPR_LOCATION (value));
> > > > +           }
> > > > +    }
> > > > +
> > > > +  return NULL_TREE;
> > > > +}
> > > > +
> > > >   /* Add initial nodes to EG, with entrypoints for externally-
> > > > callable
> > > >      functions.  */
> > > >   
> > > > @@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
> > > >            logger->log ("did not create enode for %qE
> > > > entrypoint",
> > > > fun->decl);
> > > >         }
> > > >     }
> > > > +
> > > > +  /* Find callbacks that are reachable from global
> > > > initializers.  */
> > > > +  varpool_node *vpnode;
> > > > +  FOR_EACH_VARIABLE (vpnode)
> > > > +    {
> > > > +      tree decl = vpnode->decl;
> > > > +      tree init = DECL_INITIAL (decl);
> > > > +      if (!init)
> > > > +       continue;
> > > > +      walk_tree (&init, add_any_callbacks, this, NULL);
> > > > +    }
> > > >   }
> > > >   
> > > >   /* The main loop of the analysis.
> > > > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> > > > index 9e03156de5e..835ba6e0e8c 100644
> > > > --- a/gcc/c-family/c-attribs.c
> > > > +++ b/gcc/c-family/c-attribs.c
> > > > @@ -117,6 +117,7 @@ static tree
> > > > handle_no_profile_instrument_function_attribute (tree *, tree,
> > > >                                                              
> > > > tree,
> > > > int, bool *);
> > > >   static tree handle_malloc_attribute (tree *, tree, tree, int,
> > > > bool
> > > > *);
> > > >   static tree handle_dealloc_attribute (tree *, tree, tree, int,
> > > > bool
> > > > *);
> > > > +static tree handle_tainted_attribute (tree *, tree, tree, int,
> > > > bool
> > > > *);
> > > >   static tree handle_returns_twice_attribute (tree *, tree, tree,
> > > > int,
> > > > bool *);
> > > >   static tree handle_no_limit_stack_attribute (tree *, tree,
> > > > tree,
> > > > int,
> > > >                                               bool *);
> > > > @@ -569,6 +570,8 @@ const struct attribute_spec
> > > > c_common_attribute_table[] =
> > > >                                handle_objc_nullability_attribute,
> > > > NULL
> > > > },
> > > >     { "*dealloc",                1, 2, true, false, false, false,
> > > >                                handle_dealloc_attribute, NULL },
> > > > +  { "tainted",               0, 0, true,  false, false, false,
> > > > +                             handle_tainted_attribute, NULL },
> > > >     { NULL,                     0, 0, false, false, false, false,
> > > > NULL, NULL }
> > > >   };
> > > >   
> > > > @@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree
> > > > *node,
> > > > tree name, tree args,
> > > >     return NULL_TREE;
> > > >   }
> > > >   
> > > > +/* Handle a "tainted" attribute; arguments as in
> > > > +   struct attribute_spec.handler.  */
> > > > +
> > > > +static tree
> > > > +handle_tainted_attribute (tree *node, tree name, tree, int,
> > > > +                         bool *no_add_attrs)
> > > > +{
> > > > +  if (TREE_CODE (*node) != FUNCTION_DECL
> > > > +      && TREE_CODE (*node) != FIELD_DECL)
> > > > +    {
> > > > +      warning (OPT_Wattributes, "%qE attribute ignored; valid
> > > > only "
> > > > +              "for functions and function pointer fields",
> > > > +              name);
> > > > +      *no_add_attrs = true;
> > > > +      return NULL_TREE;
> > > > +    }
> > > > +
> > > > +  if (TREE_CODE (*node) == FIELD_DECL
> > > > +      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
> > > > +          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
> > > > FUNCTION_TYPE))
> > > > +    {
> > > > +      warning (OPT_Wattributes, "%qE attribute ignored;"
> > > > +              " field must be a function pointer",
> > > > +              name);
> > > > +      *no_add_attrs = true;
> > > > +      return NULL_TREE;
> > > > +    }
> > > > +
> > > > +  *no_add_attrs = false; /* OK */
> > > > +
> > > > +  return NULL_TREE;
> > > > +}
> > > > +
> > > >   /* Attempt to partially validate a single attribute ATTR as if
> > > >      it were to be applied to an entity OPER.  */
> > > >   
> > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > > index 5a6ef464779..826bbd48e7e 100644
> > > > --- a/gcc/doc/extend.texi
> > > > +++ b/gcc/doc/extend.texi
> > > > @@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
> > > > Attributes}),
> > > >   labels (@pxref{Label Attributes}),
> > > >   enumerators (@pxref{Enumerator Attributes}),
> > > >   statements (@pxref{Statement Attributes}),
> > > > -and types (@pxref{Type Attributes}).
> > > > +types (@pxref{Type Attributes}),
> > > > +and on field declarations (for @code{tainted}).
> > > >   
> > > >   There is some overlap between the purposes of attributes and
> > > > pragmas
> > > >   (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
> > > > @@ -3977,6 +3978,25 @@ addition to creating a symbol version (as
> > > > if
> > > >   @code{"@var{name2}@@@var{nodename}"} was used) the version will
> > > > be
> > > > also used
> > > >   to resolve @var{name2} by the linker.
> > > >   
> > > > +@item tainted
> > > > +@cindex @code{tainted} function attribute
> > > > +The @code{tainted} attribute is used to specify that a function
> > > > is
> > > > called
> > > > +in a way that requires sanitization of its arguments, such as a
> > > > system
> > > > +call in an operating system kernel.  Such a function can be
> > > > considered part
> > > > +of the ``attack surface'' of the program.  The attribute can be
> > > > used
> > > > both
> > > > +on function declarations, and on field declarations containing
> > > > function
> > > > +pointers.  In the latter case, any function used as an
> > > > initializer
> > > > of
> > > > +such a callback field will be treated as tainted.
> > > > +
> > > > +The analyzer will pay particular attention to such functions
> > > > when
> > > > both
> > > > +@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
> > > > supplied,
> > > > +potentially issuing warnings guarded by
> > > > +@option{-Wanalyzer-exposure-through-uninit-copy},
> > > > +@option{-Wanalyzer-tainted-allocation-size},
> > > > +@option{-Wanalyzer-tainted-array-index},
> > > > +@option{Wanalyzer-tainted-offset},
> > > > +and @option{Wanalyzer-tainted-size}.
> > > > +
> > > >   @item target_clones (@var{options})
> > > >   @cindex @code{target_clones} function attribute
> > > >   The @code{target_clones} attribute is used to specify that a
> > > > function
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > > > new file mode 100644
> > > > index 00000000000..cc4d5900372
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
> > > > @@ -0,0 +1,88 @@
> > > > +// TODO: remove need for this option
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +#include "analyzer-decls.h"
> > > > +
> > > > +struct arg_buf
> > > > +{
> > > > +  int i;
> > > > +  int j;
> > > > +};
> > > > +
> > > > +/* Example of marking a function as tainted.  */
> > > > +
> > > > +void __attribute__((tainted))
> > > > +test_1 (int i, void *p, char *q)
> > > > +{
> > > > +  /* There should be a single enode,
> > > > +     for the "tainted" entry to the function.  */
> > > > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
> > > > processed
> > > > enode" } */
> > > > +
> > > > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +
> > > > +  struct arg_buf *args = p;
> > > > +  __analyzer_dump_state ("taint", args->i); /* { dg-warning
> > > > "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", args->j); /* { dg-warning
> > > > "state:
> > > > 'tainted'" } */
> > > > +}
> > > > +
> > > > +/* Example of marking a callback field as tainted.  */
> > > > +
> > > > +struct s2
> > > > +{
> > > > +  void (*cb) (int, void *, char *)
> > > > +    __attribute__((tainted));
> > > > +};
> > > > +
> > > > +/* Function not marked as tainted.  */
> > > > +
> > > > +void
> > > > +test_2a (int i, void *p, char *q)
> > > > +{
> > > > +  /* There should be a single enode,
> > > > +     for the normal entry to the function.  */
> > > > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
> > > > processed
> > > > enode" } */
> > > > +
> > > > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > > > 'start'" } */
> > > > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > > > 'start'" } */
> > > > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > > > 'start'" } */
> > > > +
> > > > +  struct arg_buf *args = p;
> > > > +  __analyzer_dump_state ("taint", args->i); /* { dg-warning
> > > > "state:
> > > > 'start'" } */
> > > > +  __analyzer_dump_state ("taint", args->j); /* { dg-warning
> > > > "state:
> > > > 'start'" } */
> > > > +}
> > > > +
> > > > +/* Function referenced via t2b.cb, marked as "tainted".  */
> > > > +
> > > > +void
> > > > +test_2b (int i, void *p, char *q)
> > > > +{
> > > > +  /* There should be two enodes
> > > > +     for the direct call, and the "tainted" entry to the
> > > > function.
> > > > */
> > > > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2
> > > > processed
> > > > enodes" } */
> > > > +}
> > > > +
> > > > +/* Callback used via t2c.cb, marked as "tainted".  */
> > > > +void
> > > > +__analyzer_test_2c (int i, void *p, char *q)
> > > > +{
> > > > +  /* There should be a single enode,
> > > > +     for the "tainted" entry to the function.  */
> > > > +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
> > > > processed
> > > > enode" } */
> > > > +
> > > > +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +}
> > > > +
> > > > +struct s2 t2b =
> > > > +{
> > > > +  .cb = test_2b
> > > > +};
> > > > +
> > > > +struct s2 t2c =
> > > > +{
> > > > +  .cb = __analyzer_test_2c
> > > > +};
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > > > new file mode 100644
> > > > index 00000000000..6f4cbc82efb
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
> > > > @@ -0,0 +1,6 @@
> > > > +int not_a_fn __attribute__ ((tainted)); /* { dg-warning
> > > > "'tainted'
> > > > attribute ignored; valid only for functions and function pointer
> > > > fields" } */
> > > > +
> > > > +struct s
> > > > +{
> > > > +  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
> > > > attribute ignored; field must be a function pointer" } */
> > > > +};
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-
> > > > 1.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> > > > new file mode 100644
> > > > index 00000000000..fe6c7ebbb1f
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
> > > > @@ -0,0 +1,93 @@
> > > > +/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c
> > > > in
> > > > the
> > > > +   Linux kernel before 2.6.39.4 on the Alpha platform does not
> > > > properly
> > > > +   restrict the data size for GSI_GET_HWRPB operations, which
> > > > allows
> > > > +   local users to obtain sensitive information from kernel
> > > > memory
> > > > via
> > > > +   a crafted call."
> > > > +
> > > > +   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
> > > > 2.6.39.y
> > > > +   in linux-stable.  */
> > > > +
> > > > +// TODO: remove need for this option:
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +#include "analyzer-decls.h"
> > > > +#include "test-uaccess.h"
> > > > +
> > > > +/* Adapted from include/linux/linkage.h.  */
> > > > +
> > > > +#define asmlinkage
> > > > +
> > > > +/* Adapted from include/linux/syscalls.h.  */
> > > > +
> > > > +#define __SC_DECL1(t1, a1)     t1 a1
> > > > +#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
> > > > +#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
> > > > +#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
> > > > +#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
> > > > +#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
> > > > +
> > > > +#define SYSCALL_DEFINEx(x, sname, ...)                         \
> > > > +       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> > > > +
> > > > +#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
> > > > +#define __SYSCALL_DEFINEx(x, name,
> > > > ...)                                        \
> > > > +       asmlinkage __attribute__((tainted)) \
> > > > +       long sys##name(__SC_DECL##x(__VA_ARGS__))
> > > > +
> > > > +#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
> > > > __VA_ARGS__)
> > > > +
> > > > +/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
> > > > +
> > > > +struct hwrpb_struct {
> > > > +       unsigned long phys_addr;        /* check: physical
> > > > address of
> > > > the hwrpb */
> > > > +       unsigned long id;               /* check: "HWRPB\0\0\0"
> > > > */
> > > > +       unsigned long revision;
> > > > +       unsigned long size;             /* size of hwrpb */
> > > > +       /* [...snip...] */
> > > > +};
> > > > +
> > > > +extern struct hwrpb_struct *hwrpb;
> > > > +
> > > > +/* Adapted from arch/alpha/kernel/osf_sys.c.  */
> > > > +
> > > > +SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user
> > > > *,
> > > > buffer,
> > > > +               unsigned long, nbytes, int __user *, start, void
> > > > __user *, arg)
> > > > +{
> > > > +       /* [...snip...] */
> > > > +
> > > > +       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
> > > > "tainted" } */
> > > > +
> > > > +       /* TODO: should have an event explaining why "nbytes" is
> > > > treated as
> > > > +          attacker-controlled.  */
> > > > +
> > > > +       /* case GSI_GET_HWRPB: */
> > > > +               if (nbytes < sizeof(*hwrpb))
> > > > +                       return -1;
> > > > +
> > > > +               __analyzer_dump_state ("taint", nbytes);  /* {
> > > > dg-
> > > > warning "has_lb" } */
> > > > +
> > > > +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /*
> > > > {
> > > > dg-warning "use of attacker-controlled value 'nbytes' as size
> > > > without
> > > > upper-bounds checking" } */
> > > > +                       return -2;
> > > > +
> > > > +               return 1;
> > > > +
> > > > +       /* [...snip...] */
> > > > +}
> > > > +
> > > > +/* With the fix for the sense of the size comparison.  */
> > > > +
> > > > +SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void
> > > > __user
> > > > *, buffer,
> > > > +               unsigned long, nbytes, int __user *, start, void
> > > > __user *, arg)
> > > > +{
> > > > +       /* [...snip...] */
> > > > +
> > > > +       /* case GSI_GET_HWRPB: */
> > > > +               if (nbytes > sizeof(*hwrpb))
> > > > +                       return -1;
> > > > +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /*
> > > > {
> > > > dg-bogus "attacker-controlled" } */
> > > > +                       return -2;
> > > > +
> > > > +               return 1;
> > > > +
> > > > +       /* [...snip...] */
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-
> > > > 1.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> > > > new file mode 100644
> > > > index 00000000000..0b9a94a8d6c
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
> > > > @@ -0,0 +1,38 @@
> > > > +/* See notes in this header.  */
> > > > +#include "taint-CVE-2020-13143.h"
> > > > +
> > > > +// TODO: remove need for this option
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +struct configfs_attribute {
> > > > +       /* [...snip...] */
> > > > +       ssize_t (*store)(struct config_item *, const char *,
> > > > size_t)
> > > > /* { dg-message "\\(1\\) field 'store' of 'struct
> > > > configfs_attribute'
> > > > is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > +               __attribute__((tainted)); /* (this is added).  */
> > > > +};
> > > > +static inline struct gadget_info *to_gadget_info(struct
> > > > config_item
> > > > *item)
> > > > +{
> > > > +        return container_of(to_config_group(item), struct
> > > > gadget_info, group);
> > > > +}
> > > > +
> > > > +static ssize_t gadget_dev_desc_UDC_store(struct config_item
> > > > *item,
> > > > +               const char *page, size_t len)
> > > > +{
> > > > +       struct gadget_info *gi = to_gadget_info(item);
> > > > +       char *name;
> > > > +       int ret;
> > > > +
> > > > +#if 0
> > > > +       /* FIXME: this is the fix.  */
> > > > +       if (strlen(page) < len)
> > > > +               return -EOVERFLOW;
> > > > +#endif
> > > > +
> > > > +       name = kstrdup(page, GFP_KERNEL);
> > > > +       if (!name)
> > > > +               return -ENOMEM;
> > > > +       if (name[len - 1] == '\n') /* { dg-warning "use of
> > > > attacker-
> > > > controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> > > > checking" } */
> > > > +               name[len - 1] = '\0'; /* { dg-warning "use of
> > > > attacker-controlled value 'len \[^\n\r\]+' as offset without
> > > > upper-
> > > > bounds checking" } */
> > > > +       /* [...snip...] */                              \
> > > > +}
> > > > +
> > > > +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> > > > function 'gadget_dev_desc_UDC_store' used as initializer for
> > > > field
> > > > 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-
> > > > 2.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> > > > new file mode 100644
> > > > index 00000000000..e05da9276c1
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
> > > > @@ -0,0 +1,32 @@
> > > > +/* See notes in this header.  */
> > > > +#include "taint-CVE-2020-13143.h"
> > > > +
> > > > +// TODO: remove need for this option
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +struct configfs_attribute {
> > > > +       /* [...snip...] */
> > > > +       ssize_t (*store)(struct config_item *, const char *,
> > > > size_t)
> > > > /* { dg-message "\\(1\\) field 'store' of 'struct
> > > > configfs_attribute'
> > > > is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > +               __attribute__((tainted)); /* (this is added).  */
> > > > +};
> > > > +
> > > > +/* Highly simplified version.  */
> > > > +
> > > > +static ssize_t gadget_dev_desc_UDC_store(struct config_item
> > > > *item,
> > > > +               const char *page, size_t len)
> > > > +{
> > > > +       /* TODO: ought to have state_change_event talking about
> > > > where
> > > > the tainted value comes from.  */
> > > > +
> > > > +       char *name;
> > > > +       /* [...snip...] */
> > > > +
> > > > +       name = kstrdup(page, GFP_KERNEL);
> > > > +       if (!name)
> > > > +               return -ENOMEM;
> > > > +       if (name[len - 1] == '\n') /* { dg-warning "use of
> > > > attacker-
> > > > controlled value 'len \[^\n\r\]+' as offset without upper-bounds
> > > > checking" } */
> > > > +               name[len - 1] = '\0';  /* { dg-warning "use of
> > > > attacker-controlled value 'len \[^\n\r\]+' as offset without
> > > > upper-
> > > > bounds checking" } */
> > > > +       /* [...snip...] */
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
> > > > function 'gadget_dev_desc_UDC_store' used as initializer for
> > > > field
> > > > 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > > > new file mode 100644
> > > > index 00000000000..0ba023539af
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
> > > > @@ -0,0 +1,91 @@
> > > > +/* Shared header for the various taint-CVE-2020-13143.h tests.
> > > > +
> > > > +   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c
> > > > in
> > > > the
> > > > +   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
> > > > considering
> > > > +   the possibility of an internal '\0' value, which allows
> > > > attackers
> > > > to
> > > > +   trigger an out-of-bounds read, aka CID-15753588bcd4."
> > > > +
> > > > +   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-
> > > > 5.7.y
> > > > +   in linux-stable.  */
> > > > +
> > > > +// TODO: remove need for this option
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +#include <stddef.h>
> > > > +
> > > > +/* Adapted from include/uapi/asm-generic/posix_types.h  */
> > > > +
> > > > +typedef unsigned int     __kernel_size_t;
> > > > +typedef int              __kernel_ssize_t;
> > > > +
> > > > +/* Adapted from include/linux/types.h  */
> > > > +
> > > > +//typedef __kernel_size_t              size_t;
> > > > +typedef __kernel_ssize_t       ssize_t;
> > > > +
> > > > +/* Adapted from include/linux/kernel.h  */
> > > > +
> > > > +#define container_of(ptr, type, member)
> > > > ({                             \
> > > > +       void *__mptr = (void
> > > > *)(ptr);                                   \
> > > > +       /* [...snip...]
> > > > */                                              \
> > > > +       ((type *)(__mptr - offsetof(type, member))); })
> > > > +
> > > > +/* Adapted from include/linux/configfs.h  */
> > > > +
> > > > +struct config_item {
> > > > +       /* [...snip...] */
> > > > +};
> > > > +
> > > > +struct config_group {
> > > > +       struct config_item              cg_item;
> > > > +       /* [...snip...] */
> > > > +};
> > > > +
> > > > +static inline struct config_group *to_config_group(struct
> > > > config_item *item)
> > > > +{
> > > > +       return item ? container_of(item,struct
> > > > config_group,cg_item)
> > > > : NULL;
> > > > +}
> > > > +
> > > > +#define CONFIGFS_ATTR(_pfx, _name)                             \
> > > > +static struct configfs_attribute _pfx##attr_##_name = {        \
> > > > +       /* [...snip...] */                              \
> > > > +       .store          = _pfx##_name##_store,          \
> > > > +}
> > > > +
> > > > +/* Adapted from include/linux/compiler.h  */
> > > > +
> > > > +#define __force
> > > > +
> > > > +/* Adapted from include/asm-generic/errno-base.h  */
> > > > +
> > > > +#define        ENOMEM          12      /* Out of memory */
> > > > +
> > > > +/* Adapted from include/linux/types.h  */
> > > > +
> > > > +#define __bitwise__
> > > > +typedef unsigned __bitwise__ gfp_t;
> > > > +
> > > > +/* Adapted from include/linux/gfp.h  */
> > > > +
> > > > +#define ___GFP_WAIT            0x10u
> > > > +#define ___GFP_IO              0x40u
> > > > +#define ___GFP_FS              0x80u
> > > > +#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
> > > > +#define __GFP_IO       ((__force gfp_t)___GFP_IO)
> > > > +#define __GFP_FS       ((__force gfp_t)___GFP_FS)
> > > > +#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
> > > > +
> > > > +/* Adapted from include/linux/compiler_attributes.h  */
> > > > +
> > > > +#define __malloc                       
> > > > __attribute__((__malloc__))
> > > > +
> > > > +/* Adapted from include/linux/string.h  */
> > > > +
> > > > +extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
> > > > +
> > > > +/* Adapted from drivers/usb/gadget/configfs.c  */
> > > > +
> > > > +struct gadget_info {
> > > > +       struct config_group group;
> > > > +       /* [...snip...] */                              \
> > > > +};
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > > > new file mode 100644
> > > > index 00000000000..4c567b2ffdf
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
> > > > @@ -0,0 +1,21 @@
> > > > +// TODO: remove need for this option:
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +#include "analyzer-decls.h"
> > > > +#include <stdio.h>
> > > > +#include <stdlib.h>
> > > > +#include <string.h>
> > > > +
> > > > +/* malloc with tainted size from a syscall.  */
> > > > +
> > > > +void *p;
> > > > +
> > > > +void __attribute__((tainted))
> > > > +test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1'
> > > > marked
> > > > with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > +{
> > > > +  /* TODO: should have a message saying why "sz" is tainted,
> > > > e.g.
> > > > +     "treating 'sz' as attacker-controlled because 'test_1' is
> > > > marked with '__attribute__((tainted))'"  */
> > > > +
> > > > +  p = malloc (sz); /* { dg-warning "use of attacker-controlled
> > > > value
> > > > 'sz' as allocation size without upper-bounds checking" "warning"
> > > > } */
> > > > +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled
> > > > value
> > > > 'sz' as allocation size without upper-bounds checking" "final
> > > > event"
> > > > { target *-*-* } .-1 } */
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > > > b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > > > new file mode 100644
> > > > index 00000000000..f52cafcd71d
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
> > > > @@ -0,0 +1,31 @@
> > > > +// TODO: remove need for this option:
> > > > +/* { dg-additional-options "-fanalyzer-checker=taint" } */
> > > > +
> > > > +#include "analyzer-decls.h"
> > > > +#include <stdio.h>
> > > > +#include <stdlib.h>
> > > > +#include <string.h>
> > > > +
> > > > +/* malloc with tainted size from a syscall.  */
> > > > +
> > > > +struct arg_buf
> > > > +{
> > > > +  size_t sz;
> > > > +};
> > > > +
> > > > +void *p;
> > > > +
> > > > +void __attribute__((tainted))
> > > > +test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
> > > > marked with '__attribute__\\(\\(tainted\\)\\)'" } */
> > > > +{
> > > > +  /* we should treat pointed-to-structs as tainted.  */
> > > > +  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +
> > > > +  struct arg_buf *args = data;
> > > > +
> > > > +  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
> > > > 'tainted'" } */
> > > > +  __analyzer_dump_state ("taint", args->sz); /* { dg-warning
> > > > "state:
> > > > 'tainted'" } */
> > > > +
> > > > +  p = malloc (args->sz); /* { dg-warning "use of attacker-
> > > > controlled
> > > > value '\\*args.sz' as allocation size without upper-bounds
> > > > checking"
> > > > "warning" } */
> > > > +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled
> > > > value
> > > > '\\*args.sz' as allocation size without upper-bounds checking"
> > > > "final
> > > > event" { target *-*-* } .-1 } */
> > > > +}
> > > 
> > 
> > 
>
  
Jason Merrill Jan. 13, 2022, 7:08 p.m. UTC | #5
On 1/12/22 10:33, David Malcolm wrote:
> On Tue, 2022-01-11 at 23:36 -0500, Jason Merrill wrote:
>> On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
>>> On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
>>>> On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
>>>>> This patch adds a new __attribute__ ((tainted)) to the C/C++
>>>>> frontends.
>>>>
>>>> Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of
>>>> this
>>>> patch (attribute registration, documentation, the name of the
>>>> attribute, etc).
>>>>
>>>> (I believe it's independent of the rest of the patch kit, in that
>>>> it
>>>> could go into trunk without needing the prior patches)
>>>>
>>>> Thanks
>>>> Dave
>>>
>>> Getting close to end of stage 3 for GCC 12, so pinging this patch
>>> again...
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html
>>
>> The c-family change is OK.
> 
> Thanks.
> 
> I'm retesting the patch now, but it now seems to me that
>    __attribute__((tainted_args))
> would lead to more readable code than:
>    __attribute__((tainted))
> 
> in that the name "tainted_args" better conveys the idea that all
> arguments are under attacker-control (as opposed to the body of the
> function or the function pointer being under attacker-control).
> 
> Looking at
>    https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
> we already have some attributes with underscores in their names.
> 
> Does this sound good?

Makes sense to me.

>>
>>> Thanks
>>> Dave
>>>
>>>>
>>>>
>>>>>
>>>>> It can be used on function decls: the analyzer will treat as
>>>>> tainted
>>>>> all parameters to the function and all buffers pointed to by
>>>>> parameters
>>>>> to the function.  Adding this in one place to the Linux kernel's
>>>>> __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls
>>>>> as
>>>>> having tainted inputs.  This gives additional testing beyond e.g.
>>>>> __user
>>>>> pointers added by earlier patches - an example of the use of this
>>>>> can
>>>>> be
>>>>> seen in CVE-2011-2210, where given:
>>>>>
>>>>>    SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user
>>>>> *,
>>>>> buffer,
>>>>>                    unsigned long, nbytes, int __user *, start,
>>>>> void
>>>>> __user *, arg)
>>>>>
>>>>> the analyzer will treat the nbytes param as under attacker
>>>>> control,
>>>>> and
>>>>> can complain accordingly:
>>>>>
>>>>> taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
>>>>> taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-
>>>>> controlled
>>>>> value
>>>>>     ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
>>>>> Wanalyzer-tainted-size]
>>>>>      69 |                 if (copy_to_user(buffer, hwrpb, nbytes)
>>>>> != 0)
>>>>>         |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>>>>> Additionally, the patch allows the attribute to be used on field
>>>>> decls:
>>>>> specifically function pointers.  Any function used as an
>>>>> initializer
>>>>> for such a field gets treated as tainted.  An example can be seen
>>>>> in
>>>>> CVE-2020-13143, where adding __attribute__((tainted)) to the
>>>>> "store"
>>>>> callback of configfs_attribute:
>>>>>
>>>>>     struct configfs_attribute {
>>>>>        /* [...snip...] */
>>>>>        ssize_t (*store)(struct config_item *, const char *,
>>>>> size_t)
>>>>>          __attribute__((tainted));
>>>>>        /* [...snip...] */
>>>>>     };
>>>>>
>>>>> allows the analyzer to see:
>>>>>
>>>>>    CONFIGFS_ATTR(gadget_dev_desc_, UDC);
>>>>>
>>>>> and treat gadget_dev_desc_UDC_store as tainted, so that it
>>>>> complains:
>>>>>
>>>>> taint-CVE-2020-13143-1.c: In function
>>>>> ‘gadget_dev_desc_UDC_store’:
>>>>> taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-
>>>>> controlled
>>>>> value
>>>>>     ‘len + 18446744073709551615’ as offset without upper-bounds
>>>>> checking [CWE-823] [-Wanalyzer-tainted-offset]
>>>>>      33 |         if (name[len - 1] == '\n')
>>>>>         |             ~~~~^~~~~~~~~
>>>>>
>>>>> Similarly, the attribute could be used on the ioctl callback
>>>>> field,
>>>>> USB device callbacks, network-handling callbacks etc.  This
>>>>> potentially
>>>>> gives a lot of test coverage with relatively little code
>>>>> annotation,
>>>>> and
>>>>> without necessarily needing link-time analysis (which -fanalyzer
>>>>> can
>>>>> only do at present on trivial examples).
>>>>>
>>>>> I believe this is the first time we've had an attribute on a
>>>>> field.
>>>>> If that's an issue, I could prepare a version of the patch that
>>>>> merely allowed it on functions themselves.
>>>>>
>>>>> As before this currently still needs -fanalyzer-checker=taint (in
>>>>> addition to -fanalyzer).
>>>>>
>>>>> gcc/analyzer/ChangeLog:
>>>>>           * engine.cc: Include "stringpool.h", "attribs.h", and
>>>>>           "tree-dfa.h".
>>>>>           (mark_params_as_tainted): New.
>>>>>           (class tainted_function_custom_event): New.
>>>>>           (class tainted_function_info): New.
>>>>>           (exploded_graph::add_function_entry): Handle functions
>>>>> with
>>>>>           "tainted" attribute.
>>>>>           (class tainted_field_custom_event): New.
>>>>>           (class tainted_callback_custom_event): New.
>>>>>           (class tainted_call_info): New.
>>>>>           (add_tainted_callback): New.
>>>>>           (add_any_callbacks): New.
>>>>>           (exploded_graph::build_initial_worklist): Find callbacks
>>>>> that
>>>>> are
>>>>>           reachable from global initializers, calling
>>>>> add_any_callbacks
>>>>> on
>>>>>           them.
>>>>>
>>>>> gcc/c-family/ChangeLog:
>>>>>           * c-attribs.c (c_common_attribute_table): Add "tainted".
>>>>>           (handle_tainted_attribute): New.
>>>>>
>>>>> gcc/ChangeLog:
>>>>>           * doc/extend.texi (Function Attributes): Note that
>>>>> "tainted"
>>>>> can
>>>>>           be used on field decls.
>>>>>           (Common Function Attributes): Add entry on "tainted"
>>>>> attribute.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>           * gcc.dg/analyzer/attr-tainted-1.c: New test.
>>>>>           * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
>>>>>           * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
>>>>>           * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
>>>>>           * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
>>>>>           * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
>>>>>           * gcc.dg/analyzer/taint-alloc-3.c: New test.
>>>>>           * gcc.dg/analyzer/taint-alloc-4.c: New test.
>>>>>
>>>>> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
>>>>> ---
>>>>>    gcc/analyzer/engine.cc                        | 317
>>>>> +++++++++++++++++-
>>>>>    gcc/c-family/c-attribs.c                      |  36 ++
>>>>>    gcc/doc/extend.texi                           |  22 +-
>>>>>    .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
>>>>>    .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
>>>>>    .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
>>>>>    .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
>>>>>    .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
>>>>>    .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
>>>>>    gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
>>>>>    gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
>>>>>    11 files changed, 772 insertions(+), 3 deletions(-)
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
>>>>> 1.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
>>>>> misuses.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
>>>>> 2011-
>>>>> 2210-1.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
>>>>> 2020-
>>>>> 13143-1.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
>>>>> 2020-
>>>>> 13143-2.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-
>>>>> 2020-
>>>>> 13143.h
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-
>>>>> 3.c
>>>>>    create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-
>>>>> 4.c
>>>>>
>>>>> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
>>>>> index 096e219392d..5fab41daf93 100644
>>>>> --- a/gcc/analyzer/engine.cc
>>>>> +++ b/gcc/analyzer/engine.cc
>>>>> @@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not
>>>>> see
>>>>>    #include "plugin.h"
>>>>>    #include "target.h"
>>>>>    #include <memory>
>>>>> +#include "stringpool.h"
>>>>> +#include "attribs.h"
>>>>> +#include "tree-dfa.h"
>>>>>    
>>>>>    /* For an overview, see gcc/doc/analyzer.texi.  */
>>>>>    
>>>>> @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
>>>>>        delete (*iter).second;
>>>>>    }
>>>>>    
>>>>> +/* Subroutine for use when implementing __attribute__((tainted))
>>>>> +   on functions and on function pointer fields in structs.
>>>>> +
>>>>> +   Called on STATE representing a call to FNDECL.
>>>>> +   Mark all params of FNDECL in STATE as "tainted".  Mark the
>>>>> value
>>>>> of all
>>>>> +   regions pointed to by params of FNDECL as "tainted".
>>>>> +
>>>>> +   Return true if successful; return false if the "taint" state
>>>>> machine
>>>>> +   was not found.  */
>>>>> +
>>>>> +static bool
>>>>> +mark_params_as_tainted (program_state *state, tree fndecl,
>>>>> +                       const extrinsic_state &ext_state)
>>>>> +{
>>>>> +  unsigned taint_sm_idx;
>>>>> +  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
>>>>> +    return false;
>>>>> +  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
>>>>> +
>>>>> +  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
>>>>> +  state_machine::state_t tainted = sm.get_state_by_name
>>>>> ("tainted");
>>>>> +
>>>>> +  region_model_manager *mgr = ext_state.get_model_manager ();
>>>>> +
>>>>> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
>>>>> +  gcc_assert (fun);
>>>>> +
>>>>> +  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
>>>>> +       iter_parm = DECL_CHAIN (iter_parm))
>>>>> +    {
>>>>> +      tree param = iter_parm;
>>>>> +      if (tree parm_default_ssa = ssa_default_def (fun,
>>>>> iter_parm))
>>>>> +       param = parm_default_ssa;
>>>>> +      const region *param_reg = state->m_region_model-
>>>>>> get_lvalue
>>>>> (param, NULL);
>>>>> +      const svalue *init_sval = mgr->get_or_create_initial_value
>>>>> (param_reg);
>>>>> +      smap->set_state (state->m_region_model, init_sval,
>>>>> +                      tainted, NULL /*origin_new_sval*/,
>>>>> ext_state);
>>>>> +      if (POINTER_TYPE_P (TREE_TYPE (param)))
>>>>> +       {
>>>>> +         const region *pointee_reg = mgr->get_symbolic_region
>>>>> (init_sval);
>>>>> +         /* Mark "*param" as tainted.  */
>>>>> +         const svalue *init_pointee_sval
>>>>> +           = mgr->get_or_create_initial_value (pointee_reg);
>>>>> +         smap->set_state (state->m_region_model,
>>>>> init_pointee_sval,
>>>>> +                          tainted, NULL /*origin_new_sval*/,
>>>>> ext_state);
>>>>> +       }
>>>>> +    }
>>>>> +
>>>>> +  return true;
>>>>> +}
>>>>> +
>>>>> +/* Custom event for use by tainted_function_info when a function
>>>>> +   has been marked with __attribute__((tainted)).  */
>>>>> +
>>>>> +class tainted_function_custom_event : public custom_event
>>>>> +{
>>>>> +public:
>>>>> +  tainted_function_custom_event (location_t loc, tree fndecl,
>>>>> int
>>>>> depth)
>>>>> +  : custom_event (loc, fndecl, depth),
>>>>> +    m_fndecl (fndecl)
>>>>> +  {
>>>>> +  }
>>>>> +
>>>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    return make_label_text
>>>>> +      (can_colorize,
>>>>> +       "function %qE marked with %<__attribute__((tainted))%>",
>>>>> +       m_fndecl);
>>>>> +  }
>>>>> +
>>>>> +private:
>>>>> +  tree m_fndecl;
>>>>> +};
>>>>> +
>>>>> +/* Custom exploded_edge info for top-level calls to a function
>>>>> +   marked with __attribute__((tainted)).  */
>>>>> +
>>>>> +class tainted_function_info : public custom_edge_info
>>>>> +{
>>>>> +public:
>>>>> +  tainted_function_info (tree fndecl)
>>>>> +  : m_fndecl (fndecl)
>>>>> +  {}
>>>>> +
>>>>> +  void print (pretty_printer *pp) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    pp_string (pp, "call to tainted function");
>>>>> +  };
>>>>> +
>>>>> +  bool update_model (region_model *,
>>>>> +                    const exploded_edge *,
>>>>> +                    region_model_context *) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    /* No-op.  */
>>>>> +    return true;
>>>>> +  }
>>>>> +
>>>>> +  void add_events_to_path (checker_path *emission_path,
>>>>> +                          const exploded_edge &) const FINAL
>>>>> OVERRIDE
>>>>> +  {
>>>>> +    emission_path->add_event
>>>>> +      (new tainted_function_custom_event
>>>>> +       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
>>>>> +  }
>>>>> +
>>>>> +private:
>>>>> +  tree m_fndecl;
>>>>> +};
>>>>> +
>>>>>    /* Ensure that there is an exploded_node representing an
>>>>> external
>>>>> call to
>>>>>       FUN, adding it to the worklist if creating it.
>>>>>    
>>>>> @@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry
>>>>> (function
>>>>> *fun)
>>>>>      program_state state (m_ext_state);
>>>>>      state.push_frame (m_ext_state, fun);
>>>>>    
>>>>> +  custom_edge_info *edge_info = NULL;
>>>>> +
>>>>> +  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
>>>>> +    {
>>>>> +      if (mark_params_as_tainted (&state, fun->decl,
>>>>> m_ext_state))
>>>>> +       edge_info = new tainted_function_info (fun->decl);
>>>>> +    }
>>>>> +
>>>>>      if (!state.m_valid)
>>>>>        return NULL;
>>>>>    
>>>>>      exploded_node *enode = get_or_create_node (point, state,
>>>>> NULL);
>>>>>      if (!enode)
>>>>> -    return NULL;
>>>>> +    {
>>>>> +      delete edge_info;
>>>>> +      return NULL;
>>>>> +    }
>>>>>    
>>>>> -  add_edge (m_origin, enode, NULL);
>>>>> +  add_edge (m_origin, enode, NULL, edge_info);
>>>>>    
>>>>>      m_functions_with_enodes.add (fun);
>>>>>    
>>>>> @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun,
>>>>> logger
>>>>> *logger)
>>>>>      return true;
>>>>>    }
>>>>>    
>>>>> +/* Custom event for use by tainted_call_info when a callback
>>>>> field
>>>>> has been
>>>>> +   marked with __attribute__((tainted)), for labelling the
>>>>> field.
>>>>> */
>>>>> +
>>>>> +class tainted_field_custom_event : public custom_event
>>>>> +{
>>>>> +public:
>>>>> +  tainted_field_custom_event (tree field)
>>>>> +  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
>>>>> +    m_field (field)
>>>>> +  {
>>>>> +  }
>>>>> +
>>>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    return make_label_text (can_colorize,
>>>>> +                           "field %qE of %qT"
>>>>> +                           " is marked with
>>>>> %<__attribute__((tainted))%>",
>>>>> +                           m_field, DECL_CONTEXT (m_field));
>>>>> +  }
>>>>> +
>>>>> +private:
>>>>> +  tree m_field;
>>>>> +};
>>>>> +
>>>>> +/* Custom event for use by tainted_call_info when a callback
>>>>> field
>>>>> has been
>>>>> +   marked with __attribute__((tainted)), for labelling the
>>>>> function
>>>>> used
>>>>> +   in that callback.  */
>>>>> +
>>>>> +class tainted_callback_custom_event : public custom_event
>>>>> +{
>>>>> +public:
>>>>> +  tainted_callback_custom_event (location_t loc, tree fndecl,
>>>>> int
>>>>> depth,
>>>>> +                                tree field)
>>>>> +  : custom_event (loc, fndecl, depth),
>>>>> +    m_field (field)
>>>>> +  {
>>>>> +  }
>>>>> +
>>>>> +  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    return make_label_text (can_colorize,
>>>>> +                           "function %qE used as initializer for
>>>>> field %qE"
>>>>> +                           " marked with
>>>>> %<__attribute__((tainted))%>",
>>>>> +                           m_fndecl, m_field);
>>>>> +  }
>>>>> +
>>>>> +private:
>>>>> +  tree m_field;
>>>>> +};
>>>>> +
>>>>> +/* Custom edge info for use when adding a function used by a
>>>>> callback field
>>>>> +   marked with '__attribute__((tainted))'.   */
>>>>> +
>>>>> +class tainted_call_info : public custom_edge_info
>>>>> +{
>>>>> +public:
>>>>> +  tainted_call_info (tree field, tree fndecl, location_t loc)
>>>>> +  : m_field (field), m_fndecl (fndecl), m_loc (loc)
>>>>> +  {}
>>>>> +
>>>>> +  void print (pretty_printer *pp) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    pp_string (pp, "call to tainted field");
>>>>> +  };
>>>>> +
>>>>> +  bool update_model (region_model *,
>>>>> +                    const exploded_edge *,
>>>>> +                    region_model_context *) const FINAL OVERRIDE
>>>>> +  {
>>>>> +    /* No-op.  */
>>>>> +    return true;
>>>>> +  }
>>>>> +
>>>>> +  void add_events_to_path (checker_path *emission_path,
>>>>> +                          const exploded_edge &) const FINAL
>>>>> OVERRIDE
>>>>> +  {
>>>>> +    /* Show the field in the struct declaration
>>>>> +       e.g. "(1) field 'store' is marked with
>>>>> '__attribute__((tainted))'"  */
>>>>> +    emission_path->add_event
>>>>> +      (new tainted_field_custom_event (m_field));
>>>>> +
>>>>> +    /* Show the callback in the initializer
>>>>> +       e.g.
>>>>> +       "(2) function 'gadget_dev_desc_UDC_store' used as
>>>>> initializer
>>>>> +       for field 'store' marked with
>>>>> '__attribute__((tainted))'".
>>>>> */
>>>>> +    emission_path->add_event
>>>>> +      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
>>>>> m_field));
>>>>> +  }
>>>>> +
>>>>> +private:
>>>>> +  tree m_field;
>>>>> +  tree m_fndecl;
>>>>> +  location_t m_loc;
>>>>> +};
>>>>> +
>>>>> +/* Given an initializer at LOC for FIELD marked with
>>>>> '__attribute__((tainted))'
>>>>> +   initialized with FNDECL, add an entrypoint to FNDECL to EG
>>>>> (and
>>>>> to its
>>>>> +   worklist) where the params to FNDECL are marked as tainted.
>>>>> */
>>>>> +
>>>>> +static void
>>>>> +add_tainted_callback (exploded_graph *eg, tree field, tree
>>>>> fndecl,
>>>>> +                     location_t loc)
>>>>> +{
>>>>> +  logger *logger = eg->get_logger ();
>>>>> +
>>>>> +  LOG_SCOPE (logger);
>>>>> +
>>>>> +  if (!gimple_has_body_p (fndecl))
>>>>> +    return;
>>>>> +
>>>>> +  const extrinsic_state &ext_state = eg->get_ext_state ();
>>>>> +
>>>>> +  function *fun = DECL_STRUCT_FUNCTION (fndecl);
>>>>> +  gcc_assert (fun);
>>>>> +
>>>>> +  program_point point
>>>>> +    = program_point::from_function_entry (eg->get_supergraph (),
>>>>> fun);
>>>>> +  program_state state (ext_state);
>>>>> +  state.push_frame (ext_state, fun);
>>>>> +
>>>>> +  if (!mark_params_as_tainted (&state, fndecl, ext_state))
>>>>> +    return;
>>>>> +
>>>>> +  if (!state.m_valid)
>>>>> +    return;
>>>>> +
>>>>> +  exploded_node *enode = eg->get_or_create_node (point, state,
>>>>> NULL);
>>>>> +  if (logger)
>>>>> +    {
>>>>> +      if (enode)
>>>>> +       logger->log ("created EN %i for tainted %qE entrypoint",
>>>>> +                    enode->m_index, fndecl);
>>>>> +      else
>>>>> +       {
>>>>> +         logger->log ("did not create enode for tainted %qE
>>>>> entrypoint",
>>>>> +                      fndecl);
>>>>> +         return;
>>>>> +       }
>>>>> +    }
>>>>> +
>>>>> +  tainted_call_info *info = new tainted_call_info (field,
>>>>> fndecl,
>>>>> loc);
>>>>> +  eg->add_edge (eg->get_origin (), enode, NULL, info);
>>>>> +}
>>>>> +
>>>>> +/* Callback for walk_tree for finding callbacks within
>>>>> initializers;
>>>>> +   ensure that any callback initializer where the corresponding
>>>>> field is
>>>>> +   marked with '__attribute__((tainted))' is treated as an
>>>>> entrypoint to the
>>>>> +   analysis, special-casing that the inputs to the callback are
>>>>> +   untrustworthy.  */
>>>>> +
>>>>> +static tree
>>>>> +add_any_callbacks (tree *tp, int *, void *data)
>>>>> +{
>>>>> +  exploded_graph *eg = (exploded_graph *)data;
>>>>> +  if (TREE_CODE (*tp) == CONSTRUCTOR)
>>>>> +    {
>>>>> +      /* Find fields with the "tainted" attribute.
>>>>> +        walk_tree only walks the values, not the index values;
>>>>> +        look at the index values.  */
>>>>> +      unsigned HOST_WIDE_INT idx;
>>>>> +      constructor_elt *ce;
>>>>> +
>>>>> +      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp),
>>>>> idx,
>>>>> &ce);
>>>>> +          idx++)
>>>>> +       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
>>>>> +         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
>>>>>> index)))
>>>>> +           {
>>>>> +             tree value = ce->value;
>>>>> +             if (TREE_CODE (value) == ADDR_EXPR
>>>>> +                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
>>>>> FUNCTION_DECL)
>>>>> +               add_tainted_callback (eg, ce->index, TREE_OPERAND
>>>>> (value, 0),
>>>>> +                                     EXPR_LOCATION (value));
>>>>> +           }
>>>>> +    }
>>>>> +
>>>>> +  return NULL_TREE;
>>>>> +}
>>>>> +
>>>>>    /* Add initial nodes to EG, with entrypoints for externally-
>>>>> callable
>>>>>       functions.  */
>>>>>    
>>>>> @@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
>>>>>             logger->log ("did not create enode for %qE
>>>>> entrypoint",
>>>>> fun->decl);
>>>>>          }
>>>>>      }
>>>>> +
>>>>> +  /* Find callbacks that are reachable from global
>>>>> initializers.  */
>>>>> +  varpool_node *vpnode;
>>>>> +  FOR_EACH_VARIABLE (vpnode)
>>>>> +    {
>>>>> +      tree decl = vpnode->decl;
>>>>> +      tree init = DECL_INITIAL (decl);
>>>>> +      if (!init)
>>>>> +       continue;
>>>>> +      walk_tree (&init, add_any_callbacks, this, NULL);
>>>>> +    }
>>>>>    }
>>>>>    
>>>>>    /* The main loop of the analysis.
>>>>> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
>>>>> index 9e03156de5e..835ba6e0e8c 100644
>>>>> --- a/gcc/c-family/c-attribs.c
>>>>> +++ b/gcc/c-family/c-attribs.c
>>>>> @@ -117,6 +117,7 @@ static tree
>>>>> handle_no_profile_instrument_function_attribute (tree *, tree,
>>>>>                                                               
>>>>> tree,
>>>>> int, bool *);
>>>>>    static tree handle_malloc_attribute (tree *, tree, tree, int,
>>>>> bool
>>>>> *);
>>>>>    static tree handle_dealloc_attribute (tree *, tree, tree, int,
>>>>> bool
>>>>> *);
>>>>> +static tree handle_tainted_attribute (tree *, tree, tree, int,
>>>>> bool
>>>>> *);
>>>>>    static tree handle_returns_twice_attribute (tree *, tree, tree,
>>>>> int,
>>>>> bool *);
>>>>>    static tree handle_no_limit_stack_attribute (tree *, tree,
>>>>> tree,
>>>>> int,
>>>>>                                                bool *);
>>>>> @@ -569,6 +570,8 @@ const struct attribute_spec
>>>>> c_common_attribute_table[] =
>>>>>                                 handle_objc_nullability_attribute,
>>>>> NULL
>>>>> },
>>>>>      { "*dealloc",                1, 2, true, false, false, false,
>>>>>                                 handle_dealloc_attribute, NULL },
>>>>> +  { "tainted",               0, 0, true,  false, false, false,
>>>>> +                             handle_tainted_attribute, NULL },
>>>>>      { NULL,                     0, 0, false, false, false, false,
>>>>> NULL, NULL }
>>>>>    };
>>>>>    
>>>>> @@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree
>>>>> *node,
>>>>> tree name, tree args,
>>>>>      return NULL_TREE;
>>>>>    }
>>>>>    
>>>>> +/* Handle a "tainted" attribute; arguments as in
>>>>> +   struct attribute_spec.handler.  */
>>>>> +
>>>>> +static tree
>>>>> +handle_tainted_attribute (tree *node, tree name, tree, int,
>>>>> +                         bool *no_add_attrs)
>>>>> +{
>>>>> +  if (TREE_CODE (*node) != FUNCTION_DECL
>>>>> +      && TREE_CODE (*node) != FIELD_DECL)
>>>>> +    {
>>>>> +      warning (OPT_Wattributes, "%qE attribute ignored; valid
>>>>> only "
>>>>> +              "for functions and function pointer fields",
>>>>> +              name);
>>>>> +      *no_add_attrs = true;
>>>>> +      return NULL_TREE;
>>>>> +    }
>>>>> +
>>>>> +  if (TREE_CODE (*node) == FIELD_DECL
>>>>> +      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
>>>>> +          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
>>>>> FUNCTION_TYPE))
>>>>> +    {
>>>>> +      warning (OPT_Wattributes, "%qE attribute ignored;"
>>>>> +              " field must be a function pointer",
>>>>> +              name);
>>>>> +      *no_add_attrs = true;
>>>>> +      return NULL_TREE;
>>>>> +    }
>>>>> +
>>>>> +  *no_add_attrs = false; /* OK */
>>>>> +
>>>>> +  return NULL_TREE;
>>>>> +}
>>>>> +
>>>>>    /* Attempt to partially validate a single attribute ATTR as if
>>>>>       it were to be applied to an entity OPER.  */
>>>>>    
>>>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>>>> index 5a6ef464779..826bbd48e7e 100644
>>>>> --- a/gcc/doc/extend.texi
>>>>> +++ b/gcc/doc/extend.texi
>>>>> @@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
>>>>> Attributes}),
>>>>>    labels (@pxref{Label Attributes}),
>>>>>    enumerators (@pxref{Enumerator Attributes}),
>>>>>    statements (@pxref{Statement Attributes}),
>>>>> -and types (@pxref{Type Attributes}).
>>>>> +types (@pxref{Type Attributes}),
>>>>> +and on field declarations (for @code{tainted}).
>>>>>    
>>>>>    There is some overlap between the purposes of attributes and
>>>>> pragmas
>>>>>    (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
>>>>> @@ -3977,6 +3978,25 @@ addition to creating a symbol version (as
>>>>> if
>>>>>    @code{"@var{name2}@@@var{nodename}"} was used) the version will
>>>>> be
>>>>> also used
>>>>>    to resolve @var{name2} by the linker.
>>>>>    
>>>>> +@item tainted
>>>>> +@cindex @code{tainted} function attribute
>>>>> +The @code{tainted} attribute is used to specify that a function
>>>>> is
>>>>> called
>>>>> +in a way that requires sanitization of its arguments, such as a
>>>>> system
>>>>> +call in an operating system kernel.  Such a function can be
>>>>> considered part
>>>>> +of the ``attack surface'' of the program.  The attribute can be
>>>>> used
>>>>> both
>>>>> +on function declarations, and on field declarations containing
>>>>> function
>>>>> +pointers.  In the latter case, any function used as an
>>>>> initializer
>>>>> of
>>>>> +such a callback field will be treated as tainted.
>>>>> +
>>>>> +The analyzer will pay particular attention to such functions
>>>>> when
>>>>> both
>>>>> +@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
>>>>> supplied,
>>>>> +potentially issuing warnings guarded by
>>>>> +@option{-Wanalyzer-exposure-through-uninit-copy},
>>>>> +@option{-Wanalyzer-tainted-allocation-size},
>>>>> +@option{-Wanalyzer-tainted-array-index},
>>>>> +@option{Wanalyzer-tainted-offset},
>>>>> +and @option{Wanalyzer-tainted-size}.
>>>>> +
>>>>>    @item target_clones (@var{options})
>>>>>    @cindex @code{target_clones} function attribute
>>>>>    The @code{target_clones} attribute is used to specify that a
>>>>> function
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>>>> new file mode 100644
>>>>> index 00000000000..cc4d5900372
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
>>>>> @@ -0,0 +1,88 @@
>>>>> +// TODO: remove need for this option
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +#include "analyzer-decls.h"
>>>>> +
>>>>> +struct arg_buf
>>>>> +{
>>>>> +  int i;
>>>>> +  int j;
>>>>> +};
>>>>> +
>>>>> +/* Example of marking a function as tainted.  */
>>>>> +
>>>>> +void __attribute__((tainted))
>>>>> +test_1 (int i, void *p, char *q)
>>>>> +{
>>>>> +  /* There should be a single enode,
>>>>> +     for the "tainted" entry to the function.  */
>>>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
>>>>> processed
>>>>> enode" } */
>>>>> +
>>>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +
>>>>> +  struct arg_buf *args = p;
>>>>> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning
>>>>> "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning
>>>>> "state:
>>>>> 'tainted'" } */
>>>>> +}
>>>>> +
>>>>> +/* Example of marking a callback field as tainted.  */
>>>>> +
>>>>> +struct s2
>>>>> +{
>>>>> +  void (*cb) (int, void *, char *)
>>>>> +    __attribute__((tainted));
>>>>> +};
>>>>> +
>>>>> +/* Function not marked as tainted.  */
>>>>> +
>>>>> +void
>>>>> +test_2a (int i, void *p, char *q)
>>>>> +{
>>>>> +  /* There should be a single enode,
>>>>> +     for the normal entry to the function.  */
>>>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
>>>>> processed
>>>>> enode" } */
>>>>> +
>>>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>>>> 'start'" } */
>>>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>>>> 'start'" } */
>>>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>>>> 'start'" } */
>>>>> +
>>>>> +  struct arg_buf *args = p;
>>>>> +  __analyzer_dump_state ("taint", args->i); /* { dg-warning
>>>>> "state:
>>>>> 'start'" } */
>>>>> +  __analyzer_dump_state ("taint", args->j); /* { dg-warning
>>>>> "state:
>>>>> 'start'" } */
>>>>> +}
>>>>> +
>>>>> +/* Function referenced via t2b.cb, marked as "tainted".  */
>>>>> +
>>>>> +void
>>>>> +test_2b (int i, void *p, char *q)
>>>>> +{
>>>>> +  /* There should be two enodes
>>>>> +     for the direct call, and the "tainted" entry to the
>>>>> function.
>>>>> */
>>>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2
>>>>> processed
>>>>> enodes" } */
>>>>> +}
>>>>> +
>>>>> +/* Callback used via t2c.cb, marked as "tainted".  */
>>>>> +void
>>>>> +__analyzer_test_2c (int i, void *p, char *q)
>>>>> +{
>>>>> +  /* There should be a single enode,
>>>>> +     for the "tainted" entry to the function.  */
>>>>> +  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1
>>>>> processed
>>>>> enode" } */
>>>>> +
>>>>> +  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +}
>>>>> +
>>>>> +struct s2 t2b =
>>>>> +{
>>>>> +  .cb = test_2b
>>>>> +};
>>>>> +
>>>>> +struct s2 t2c =
>>>>> +{
>>>>> +  .cb = __analyzer_test_2c
>>>>> +};
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>>>> new file mode 100644
>>>>> index 00000000000..6f4cbc82efb
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
>>>>> @@ -0,0 +1,6 @@
>>>>> +int not_a_fn __attribute__ ((tainted)); /* { dg-warning
>>>>> "'tainted'
>>>>> attribute ignored; valid only for functions and function pointer
>>>>> fields" } */
>>>>> +
>>>>> +struct s
>>>>> +{
>>>>> +  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
>>>>> attribute ignored; field must be a function pointer" } */
>>>>> +};
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-
>>>>> 1.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
>>>>> new file mode 100644
>>>>> index 00000000000..fe6c7ebbb1f
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
>>>>> @@ -0,0 +1,93 @@
>>>>> +/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c
>>>>> in
>>>>> the
>>>>> +   Linux kernel before 2.6.39.4 on the Alpha platform does not
>>>>> properly
>>>>> +   restrict the data size for GSI_GET_HWRPB operations, which
>>>>> allows
>>>>> +   local users to obtain sensitive information from kernel
>>>>> memory
>>>>> via
>>>>> +   a crafted call."
>>>>> +
>>>>> +   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
>>>>> 2.6.39.y
>>>>> +   in linux-stable.  */
>>>>> +
>>>>> +// TODO: remove need for this option:
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +#include "analyzer-decls.h"
>>>>> +#include "test-uaccess.h"
>>>>> +
>>>>> +/* Adapted from include/linux/linkage.h.  */
>>>>> +
>>>>> +#define asmlinkage
>>>>> +
>>>>> +/* Adapted from include/linux/syscalls.h.  */
>>>>> +
>>>>> +#define __SC_DECL1(t1, a1)     t1 a1
>>>>> +#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
>>>>> +#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
>>>>> +#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
>>>>> +#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
>>>>> +#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
>>>>> +
>>>>> +#define SYSCALL_DEFINEx(x, sname, ...)                         \
>>>>> +       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>>>> +
>>>>> +#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
>>>>> +#define __SYSCALL_DEFINEx(x, name,
>>>>> ...)                                        \
>>>>> +       asmlinkage __attribute__((tainted)) \
>>>>> +       long sys##name(__SC_DECL##x(__VA_ARGS__))
>>>>> +
>>>>> +#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
>>>>> __VA_ARGS__)
>>>>> +
>>>>> +/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
>>>>> +
>>>>> +struct hwrpb_struct {
>>>>> +       unsigned long phys_addr;        /* check: physical
>>>>> address of
>>>>> the hwrpb */
>>>>> +       unsigned long id;               /* check: "HWRPB\0\0\0"
>>>>> */
>>>>> +       unsigned long revision;
>>>>> +       unsigned long size;             /* size of hwrpb */
>>>>> +       /* [...snip...] */
>>>>> +};
>>>>> +
>>>>> +extern struct hwrpb_struct *hwrpb;
>>>>> +
>>>>> +/* Adapted from arch/alpha/kernel/osf_sys.c.  */
>>>>> +
>>>>> +SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user
>>>>> *,
>>>>> buffer,
>>>>> +               unsigned long, nbytes, int __user *, start, void
>>>>> __user *, arg)
>>>>> +{
>>>>> +       /* [...snip...] */
>>>>> +
>>>>> +       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
>>>>> "tainted" } */
>>>>> +
>>>>> +       /* TODO: should have an event explaining why "nbytes" is
>>>>> treated as
>>>>> +          attacker-controlled.  */
>>>>> +
>>>>> +       /* case GSI_GET_HWRPB: */
>>>>> +               if (nbytes < sizeof(*hwrpb))
>>>>> +                       return -1;
>>>>> +
>>>>> +               __analyzer_dump_state ("taint", nbytes);  /* {
>>>>> dg-
>>>>> warning "has_lb" } */
>>>>> +
>>>>> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /*
>>>>> {
>>>>> dg-warning "use of attacker-controlled value 'nbytes' as size
>>>>> without
>>>>> upper-bounds checking" } */
>>>>> +                       return -2;
>>>>> +
>>>>> +               return 1;
>>>>> +
>>>>> +       /* [...snip...] */
>>>>> +}
>>>>> +
>>>>> +/* With the fix for the sense of the size comparison.  */
>>>>> +
>>>>> +SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void
>>>>> __user
>>>>> *, buffer,
>>>>> +               unsigned long, nbytes, int __user *, start, void
>>>>> __user *, arg)
>>>>> +{
>>>>> +       /* [...snip...] */
>>>>> +
>>>>> +       /* case GSI_GET_HWRPB: */
>>>>> +               if (nbytes > sizeof(*hwrpb))
>>>>> +                       return -1;
>>>>> +               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /*
>>>>> {
>>>>> dg-bogus "attacker-controlled" } */
>>>>> +                       return -2;
>>>>> +
>>>>> +               return 1;
>>>>> +
>>>>> +       /* [...snip...] */
>>>>> +}
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-
>>>>> 1.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
>>>>> new file mode 100644
>>>>> index 00000000000..0b9a94a8d6c
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
>>>>> @@ -0,0 +1,38 @@
>>>>> +/* See notes in this header.  */
>>>>> +#include "taint-CVE-2020-13143.h"
>>>>> +
>>>>> +// TODO: remove need for this option
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +struct configfs_attribute {
>>>>> +       /* [...snip...] */
>>>>> +       ssize_t (*store)(struct config_item *, const char *,
>>>>> size_t)
>>>>> /* { dg-message "\\(1\\) field 'store' of 'struct
>>>>> configfs_attribute'
>>>>> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> +               __attribute__((tainted)); /* (this is added).  */
>>>>> +};
>>>>> +static inline struct gadget_info *to_gadget_info(struct
>>>>> config_item
>>>>> *item)
>>>>> +{
>>>>> +        return container_of(to_config_group(item), struct
>>>>> gadget_info, group);
>>>>> +}
>>>>> +
>>>>> +static ssize_t gadget_dev_desc_UDC_store(struct config_item
>>>>> *item,
>>>>> +               const char *page, size_t len)
>>>>> +{
>>>>> +       struct gadget_info *gi = to_gadget_info(item);
>>>>> +       char *name;
>>>>> +       int ret;
>>>>> +
>>>>> +#if 0
>>>>> +       /* FIXME: this is the fix.  */
>>>>> +       if (strlen(page) < len)
>>>>> +               return -EOVERFLOW;
>>>>> +#endif
>>>>> +
>>>>> +       name = kstrdup(page, GFP_KERNEL);
>>>>> +       if (!name)
>>>>> +               return -ENOMEM;
>>>>> +       if (name[len - 1] == '\n') /* { dg-warning "use of
>>>>> attacker-
>>>>> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
>>>>> checking" } */
>>>>> +               name[len - 1] = '\0'; /* { dg-warning "use of
>>>>> attacker-controlled value 'len \[^\n\r\]+' as offset without
>>>>> upper-
>>>>> bounds checking" } */
>>>>> +       /* [...snip...] */                              \
>>>>> +}
>>>>> +
>>>>> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
>>>>> function 'gadget_dev_desc_UDC_store' used as initializer for
>>>>> field
>>>>> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-
>>>>> 2.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
>>>>> new file mode 100644
>>>>> index 00000000000..e05da9276c1
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
>>>>> @@ -0,0 +1,32 @@
>>>>> +/* See notes in this header.  */
>>>>> +#include "taint-CVE-2020-13143.h"
>>>>> +
>>>>> +// TODO: remove need for this option
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +struct configfs_attribute {
>>>>> +       /* [...snip...] */
>>>>> +       ssize_t (*store)(struct config_item *, const char *,
>>>>> size_t)
>>>>> /* { dg-message "\\(1\\) field 'store' of 'struct
>>>>> configfs_attribute'
>>>>> is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> +               __attribute__((tainted)); /* (this is added).  */
>>>>> +};
>>>>> +
>>>>> +/* Highly simplified version.  */
>>>>> +
>>>>> +static ssize_t gadget_dev_desc_UDC_store(struct config_item
>>>>> *item,
>>>>> +               const char *page, size_t len)
>>>>> +{
>>>>> +       /* TODO: ought to have state_change_event talking about
>>>>> where
>>>>> the tainted value comes from.  */
>>>>> +
>>>>> +       char *name;
>>>>> +       /* [...snip...] */
>>>>> +
>>>>> +       name = kstrdup(page, GFP_KERNEL);
>>>>> +       if (!name)
>>>>> +               return -ENOMEM;
>>>>> +       if (name[len - 1] == '\n') /* { dg-warning "use of
>>>>> attacker-
>>>>> controlled value 'len \[^\n\r\]+' as offset without upper-bounds
>>>>> checking" } */
>>>>> +               name[len - 1] = '\0';  /* { dg-warning "use of
>>>>> attacker-controlled value 'len \[^\n\r\]+' as offset without
>>>>> upper-
>>>>> bounds checking" } */
>>>>> +       /* [...snip...] */
>>>>> +       return 0;
>>>>> +}
>>>>> +
>>>>> +CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
>>>>> function 'gadget_dev_desc_UDC_store' used as initializer for
>>>>> field
>>>>> 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>>>> new file mode 100644
>>>>> index 00000000000..0ba023539af
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
>>>>> @@ -0,0 +1,91 @@
>>>>> +/* Shared header for the various taint-CVE-2020-13143.h tests.
>>>>> +
>>>>> +   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c
>>>>> in
>>>>> the
>>>>> +   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
>>>>> considering
>>>>> +   the possibility of an internal '\0' value, which allows
>>>>> attackers
>>>>> to
>>>>> +   trigger an out-of-bounds read, aka CID-15753588bcd4."
>>>>> +
>>>>> +   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-
>>>>> 5.7.y
>>>>> +   in linux-stable.  */
>>>>> +
>>>>> +// TODO: remove need for this option
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +#include <stddef.h>
>>>>> +
>>>>> +/* Adapted from include/uapi/asm-generic/posix_types.h  */
>>>>> +
>>>>> +typedef unsigned int     __kernel_size_t;
>>>>> +typedef int              __kernel_ssize_t;
>>>>> +
>>>>> +/* Adapted from include/linux/types.h  */
>>>>> +
>>>>> +//typedef __kernel_size_t              size_t;
>>>>> +typedef __kernel_ssize_t       ssize_t;
>>>>> +
>>>>> +/* Adapted from include/linux/kernel.h  */
>>>>> +
>>>>> +#define container_of(ptr, type, member)
>>>>> ({                             \
>>>>> +       void *__mptr = (void
>>>>> *)(ptr);                                   \
>>>>> +       /* [...snip...]
>>>>> */                                              \
>>>>> +       ((type *)(__mptr - offsetof(type, member))); })
>>>>> +
>>>>> +/* Adapted from include/linux/configfs.h  */
>>>>> +
>>>>> +struct config_item {
>>>>> +       /* [...snip...] */
>>>>> +};
>>>>> +
>>>>> +struct config_group {
>>>>> +       struct config_item              cg_item;
>>>>> +       /* [...snip...] */
>>>>> +};
>>>>> +
>>>>> +static inline struct config_group *to_config_group(struct
>>>>> config_item *item)
>>>>> +{
>>>>> +       return item ? container_of(item,struct
>>>>> config_group,cg_item)
>>>>> : NULL;
>>>>> +}
>>>>> +
>>>>> +#define CONFIGFS_ATTR(_pfx, _name)                             \
>>>>> +static struct configfs_attribute _pfx##attr_##_name = {        \
>>>>> +       /* [...snip...] */                              \
>>>>> +       .store          = _pfx##_name##_store,          \
>>>>> +}
>>>>> +
>>>>> +/* Adapted from include/linux/compiler.h  */
>>>>> +
>>>>> +#define __force
>>>>> +
>>>>> +/* Adapted from include/asm-generic/errno-base.h  */
>>>>> +
>>>>> +#define        ENOMEM          12      /* Out of memory */
>>>>> +
>>>>> +/* Adapted from include/linux/types.h  */
>>>>> +
>>>>> +#define __bitwise__
>>>>> +typedef unsigned __bitwise__ gfp_t;
>>>>> +
>>>>> +/* Adapted from include/linux/gfp.h  */
>>>>> +
>>>>> +#define ___GFP_WAIT            0x10u
>>>>> +#define ___GFP_IO              0x40u
>>>>> +#define ___GFP_FS              0x80u
>>>>> +#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
>>>>> +#define __GFP_IO       ((__force gfp_t)___GFP_IO)
>>>>> +#define __GFP_FS       ((__force gfp_t)___GFP_FS)
>>>>> +#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
>>>>> +
>>>>> +/* Adapted from include/linux/compiler_attributes.h  */
>>>>> +
>>>>> +#define __malloc
>>>>> __attribute__((__malloc__))
>>>>> +
>>>>> +/* Adapted from include/linux/string.h  */
>>>>> +
>>>>> +extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
>>>>> +
>>>>> +/* Adapted from drivers/usb/gadget/configfs.c  */
>>>>> +
>>>>> +struct gadget_info {
>>>>> +       struct config_group group;
>>>>> +       /* [...snip...] */                              \
>>>>> +};
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>>>> new file mode 100644
>>>>> index 00000000000..4c567b2ffdf
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
>>>>> @@ -0,0 +1,21 @@
>>>>> +// TODO: remove need for this option:
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +#include "analyzer-decls.h"
>>>>> +#include <stdio.h>
>>>>> +#include <stdlib.h>
>>>>> +#include <string.h>
>>>>> +
>>>>> +/* malloc with tainted size from a syscall.  */
>>>>> +
>>>>> +void *p;
>>>>> +
>>>>> +void __attribute__((tainted))
>>>>> +test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1'
>>>>> marked
>>>>> with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> +{
>>>>> +  /* TODO: should have a message saying why "sz" is tainted,
>>>>> e.g.
>>>>> +     "treating 'sz' as attacker-controlled because 'test_1' is
>>>>> marked with '__attribute__((tainted))'"  */
>>>>> +
>>>>> +  p = malloc (sz); /* { dg-warning "use of attacker-controlled
>>>>> value
>>>>> 'sz' as allocation size without upper-bounds checking" "warning"
>>>>> } */
>>>>> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled
>>>>> value
>>>>> 'sz' as allocation size without upper-bounds checking" "final
>>>>> event"
>>>>> { target *-*-* } .-1 } */
>>>>> +}
>>>>> diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>>>> b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>>>> new file mode 100644
>>>>> index 00000000000..f52cafcd71d
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
>>>>> @@ -0,0 +1,31 @@
>>>>> +// TODO: remove need for this option:
>>>>> +/* { dg-additional-options "-fanalyzer-checker=taint" } */
>>>>> +
>>>>> +#include "analyzer-decls.h"
>>>>> +#include <stdio.h>
>>>>> +#include <stdlib.h>
>>>>> +#include <string.h>
>>>>> +
>>>>> +/* malloc with tainted size from a syscall.  */
>>>>> +
>>>>> +struct arg_buf
>>>>> +{
>>>>> +  size_t sz;
>>>>> +};
>>>>> +
>>>>> +void *p;
>>>>> +
>>>>> +void __attribute__((tainted))
>>>>> +test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
>>>>> marked with '__attribute__\\(\\(tainted\\)\\)'" } */
>>>>> +{
>>>>> +  /* we should treat pointed-to-structs as tainted.  */
>>>>> +  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +
>>>>> +  struct arg_buf *args = data;
>>>>> +
>>>>> +  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
>>>>> 'tainted'" } */
>>>>> +  __analyzer_dump_state ("taint", args->sz); /* { dg-warning
>>>>> "state:
>>>>> 'tainted'" } */
>>>>> +
>>>>> +  p = malloc (args->sz); /* { dg-warning "use of attacker-
>>>>> controlled
>>>>> value '\\*args.sz' as allocation size without upper-bounds
>>>>> checking"
>>>>> "warning" } */
>>>>> +  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled
>>>>> value
>>>>> '\\*args.sz' as allocation size without upper-bounds checking"
>>>>> "final
>>>>> event" { target *-*-* } .-1 } */
>>>>> +}
>>>>
>>>
>>>
>>
> 
>
  

Patch

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 096e219392d..5fab41daf93 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -68,6 +68,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "plugin.h"
 #include "target.h"
 #include <memory>
+#include "stringpool.h"
+#include "attribs.h"
+#include "tree-dfa.h"
 
 /* For an overview, see gcc/doc/analyzer.texi.  */
 
@@ -2276,6 +2279,116 @@  exploded_graph::~exploded_graph ()
     delete (*iter).second;
 }
 
+/* Subroutine for use when implementing __attribute__((tainted))
+   on functions and on function pointer fields in structs.
+
+   Called on STATE representing a call to FNDECL.
+   Mark all params of FNDECL in STATE as "tainted".  Mark the value of all
+   regions pointed to by params of FNDECL as "tainted".
+
+   Return true if successful; return false if the "taint" state machine
+   was not found.  */
+
+static bool
+mark_params_as_tainted (program_state *state, tree fndecl,
+			const extrinsic_state &ext_state)
+{
+  unsigned taint_sm_idx;
+  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
+    return false;
+  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
+
+  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
+  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
+
+  region_model_manager *mgr = ext_state.get_model_manager ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
+       iter_parm = DECL_CHAIN (iter_parm))
+    {
+      tree param = iter_parm;
+      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
+	param = parm_default_ssa;
+      const region *param_reg = state->m_region_model->get_lvalue (param, NULL);
+      const svalue *init_sval = mgr->get_or_create_initial_value (param_reg);
+      smap->set_state (state->m_region_model, init_sval,
+		       tainted, NULL /*origin_new_sval*/, ext_state);
+      if (POINTER_TYPE_P (TREE_TYPE (param)))
+	{
+	  const region *pointee_reg = mgr->get_symbolic_region (init_sval);
+	  /* Mark "*param" as tainted.  */
+	  const svalue *init_pointee_sval
+	    = mgr->get_or_create_initial_value (pointee_reg);
+	  smap->set_state (state->m_region_model, init_pointee_sval,
+			   tainted, NULL /*origin_new_sval*/, ext_state);
+	}
+    }
+
+  return true;
+}
+
+/* Custom event for use by tainted_function_info when a function
+   has been marked with __attribute__((tainted)).  */
+
+class tainted_function_custom_event : public custom_event
+{
+public:
+  tainted_function_custom_event (location_t loc, tree fndecl, int depth)
+  : custom_event (loc, fndecl, depth),
+    m_fndecl (fndecl)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text
+      (can_colorize,
+       "function %qE marked with %<__attribute__((tainted))%>",
+       m_fndecl);
+  }
+
+private:
+  tree m_fndecl;
+};
+
+/* Custom exploded_edge info for top-level calls to a function
+   marked with __attribute__((tainted)).  */
+
+class tainted_function_info : public custom_edge_info
+{
+public:
+  tainted_function_info (tree fndecl)
+  : m_fndecl (fndecl)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted function");
+  };
+
+  bool update_model (region_model *,
+		     const exploded_edge *,
+		     region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+			   const exploded_edge &) const FINAL OVERRIDE
+  {
+    emission_path->add_event
+      (new tainted_function_custom_event
+       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
+  }
+
+private:
+  tree m_fndecl;
+};
+
 /* Ensure that there is an exploded_node representing an external call to
    FUN, adding it to the worklist if creating it.
 
@@ -2302,14 +2415,25 @@  exploded_graph::add_function_entry (function *fun)
   program_state state (m_ext_state);
   state.push_frame (m_ext_state, fun);
 
+  custom_edge_info *edge_info = NULL;
+
+  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
+    {
+      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
+	edge_info = new tainted_function_info (fun->decl);
+    }
+
   if (!state.m_valid)
     return NULL;
 
   exploded_node *enode = get_or_create_node (point, state, NULL);
   if (!enode)
-    return NULL;
+    {
+      delete edge_info;
+      return NULL;
+    }
 
-  add_edge (m_origin, enode, NULL);
+  add_edge (m_origin, enode, NULL, edge_info);
 
   m_functions_with_enodes.add (fun);
 
@@ -2623,6 +2747,184 @@  toplevel_function_p (function *fun, logger *logger)
   return true;
 }
 
+/* Custom event for use by tainted_call_info when a callback field has been
+   marked with __attribute__((tainted)), for labelling the field.  */
+
+class tainted_field_custom_event : public custom_event
+{
+public:
+  tainted_field_custom_event (tree field)
+  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+			    "field %qE of %qT"
+			    " is marked with %<__attribute__((tainted))%>",
+			    m_field, DECL_CONTEXT (m_field));
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom event for use by tainted_call_info when a callback field has been
+   marked with __attribute__((tainted)), for labelling the function used
+   in that callback.  */
+
+class tainted_callback_custom_event : public custom_event
+{
+public:
+  tainted_callback_custom_event (location_t loc, tree fndecl, int depth,
+				 tree field)
+  : custom_event (loc, fndecl, depth),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+			    "function %qE used as initializer for field %qE"
+			    " marked with %<__attribute__((tainted))%>",
+			    m_fndecl, m_field);
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom edge info for use when adding a function used by a callback field
+   marked with '__attribute__((tainted))'.   */
+
+class tainted_call_info : public custom_edge_info
+{
+public:
+  tainted_call_info (tree field, tree fndecl, location_t loc)
+  : m_field (field), m_fndecl (fndecl), m_loc (loc)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted field");
+  };
+
+  bool update_model (region_model *,
+		     const exploded_edge *,
+		     region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+			   const exploded_edge &) const FINAL OVERRIDE
+  {
+    /* Show the field in the struct declaration
+       e.g. "(1) field 'store' is marked with '__attribute__((tainted))'"  */
+    emission_path->add_event
+      (new tainted_field_custom_event (m_field));
+
+    /* Show the callback in the initializer
+       e.g.
+       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
+       for field 'store' marked with '__attribute__((tainted))'".  */
+    emission_path->add_event
+      (new tainted_callback_custom_event (m_loc, m_fndecl, 0, m_field));
+  }
+
+private:
+  tree m_field;
+  tree m_fndecl;
+  location_t m_loc;
+};
+
+/* Given an initializer at LOC for FIELD marked with '__attribute__((tainted))'
+   initialized with FNDECL, add an entrypoint to FNDECL to EG (and to its
+   worklist) where the params to FNDECL are marked as tainted.  */
+
+static void
+add_tainted_callback (exploded_graph *eg, tree field, tree fndecl,
+		      location_t loc)
+{
+  logger *logger = eg->get_logger ();
+
+  LOG_SCOPE (logger);
+
+  if (!gimple_has_body_p (fndecl))
+    return;
+
+  const extrinsic_state &ext_state = eg->get_ext_state ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  program_point point
+    = program_point::from_function_entry (eg->get_supergraph (), fun);
+  program_state state (ext_state);
+  state.push_frame (ext_state, fun);
+
+  if (!mark_params_as_tainted (&state, fndecl, ext_state))
+    return;
+
+  if (!state.m_valid)
+    return;
+
+  exploded_node *enode = eg->get_or_create_node (point, state, NULL);
+  if (logger)
+    {
+      if (enode)
+	logger->log ("created EN %i for tainted %qE entrypoint",
+		     enode->m_index, fndecl);
+      else
+	{
+	  logger->log ("did not create enode for tainted %qE entrypoint",
+		       fndecl);
+	  return;
+	}
+    }
+
+  tainted_call_info *info = new tainted_call_info (field, fndecl, loc);
+  eg->add_edge (eg->get_origin (), enode, NULL, info);
+}
+
+/* Callback for walk_tree for finding callbacks within initializers;
+   ensure that any callback initializer where the corresponding field is
+   marked with '__attribute__((tainted))' is treated as an entrypoint to the
+   analysis, special-casing that the inputs to the callback are
+   untrustworthy.  */
+
+static tree
+add_any_callbacks (tree *tp, int *, void *data)
+{
+  exploded_graph *eg = (exploded_graph *)data;
+  if (TREE_CODE (*tp) == CONSTRUCTOR)
+    {
+      /* Find fields with the "tainted" attribute.
+	 walk_tree only walks the values, not the index values;
+	 look at the index values.  */
+      unsigned HOST_WIDE_INT idx;
+      constructor_elt *ce;
+
+      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx, &ce);
+	   idx++)
+	if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
+	  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce->index)))
+	    {
+	      tree value = ce->value;
+	      if (TREE_CODE (value) == ADDR_EXPR
+		  && TREE_CODE (TREE_OPERAND (value, 0)) == FUNCTION_DECL)
+		add_tainted_callback (eg, ce->index, TREE_OPERAND (value, 0),
+				      EXPR_LOCATION (value));
+	    }
+    }
+
+  return NULL_TREE;
+}
+
 /* Add initial nodes to EG, with entrypoints for externally-callable
    functions.  */
 
@@ -2648,6 +2950,17 @@  exploded_graph::build_initial_worklist ()
 	  logger->log ("did not create enode for %qE entrypoint", fun->decl);
       }
   }
+
+  /* Find callbacks that are reachable from global initializers.  */
+  varpool_node *vpnode;
+  FOR_EACH_VARIABLE (vpnode)
+    {
+      tree decl = vpnode->decl;
+      tree init = DECL_INITIAL (decl);
+      if (!init)
+	continue;
+      walk_tree (&init, add_any_callbacks, this, NULL);
+    }
 }
 
 /* The main loop of the analysis.
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 9e03156de5e..835ba6e0e8c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -117,6 +117,7 @@  static tree handle_no_profile_instrument_function_attribute (tree *, tree,
 							     tree, int, bool *);
 static tree handle_malloc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_dealloc_attribute (tree *, tree, tree, int, bool *);
+static tree handle_tainted_attribute (tree *, tree, tree, int, bool *);
 static tree handle_returns_twice_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_limit_stack_attribute (tree *, tree, tree, int,
 					     bool *);
@@ -569,6 +570,8 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_objc_nullability_attribute, NULL },
   { "*dealloc",                1, 2, true, false, false, false,
 			      handle_dealloc_attribute, NULL },
+  { "tainted",		      0, 0, true,  false, false, false,
+			      handle_tainted_attribute, NULL },
   { NULL,                     0, 0, false, false, false, false, NULL, NULL }
 };
 
@@ -5857,6 +5860,39 @@  handle_objc_nullability_attribute (tree *node, tree name, tree args,
   return NULL_TREE;
 }
 
+/* Handle a "tainted" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_tainted_attribute (tree *node, tree name, tree, int,
+			  bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL
+      && TREE_CODE (*node) != FIELD_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
+	       "for functions and function pointer fields",
+	       name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  if (TREE_CODE (*node) == FIELD_DECL
+      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
+	   && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) == FUNCTION_TYPE))
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored;"
+	       " field must be a function pointer",
+	       name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  *no_add_attrs = false; /* OK */
+
+  return NULL_TREE;
+}
+
 /* Attempt to partially validate a single attribute ATTR as if
    it were to be applied to an entity OPER.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5a6ef464779..826bbd48e7e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2465,7 +2465,8 @@  variable declarations (@pxref{Variable Attributes}),
 labels (@pxref{Label Attributes}),
 enumerators (@pxref{Enumerator Attributes}),
 statements (@pxref{Statement Attributes}),
-and types (@pxref{Type Attributes}).
+types (@pxref{Type Attributes}),
+and on field declarations (for @code{tainted}).
 
 There is some overlap between the purposes of attributes and pragmas
 (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
@@ -3977,6 +3978,25 @@  addition to creating a symbol version (as if
 @code{"@var{name2}@@@var{nodename}"} was used) the version will be also used
 to resolve @var{name2} by the linker.
 
+@item tainted
+@cindex @code{tainted} function attribute
+The @code{tainted} attribute is used to specify that a function is called
+in a way that requires sanitization of its arguments, such as a system
+call in an operating system kernel.  Such a function can be considered part
+of the ``attack surface'' of the program.  The attribute can be used both
+on function declarations, and on field declarations containing function
+pointers.  In the latter case, any function used as an initializer of
+such a callback field will be treated as tainted.
+
+The analyzer will pay particular attention to such functions when both
+@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are supplied,
+potentially issuing warnings guarded by
+@option{-Wanalyzer-exposure-through-uninit-copy},
+@option{-Wanalyzer-tainted-allocation-size},
+@option{-Wanalyzer-tainted-array-index},
+@option{Wanalyzer-tainted-offset},
+and @option{Wanalyzer-tainted-size}.
+
 @item target_clones (@var{options})
 @cindex @code{target_clones} function attribute
 The @code{target_clones} attribute is used to specify that a function
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
new file mode 100644
index 00000000000..cc4d5900372
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
@@ -0,0 +1,88 @@ 
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+
+struct arg_buf
+{
+  int i;
+  int j;
+};
+
+/* Example of marking a function as tainted.  */
+
+void __attribute__((tainted))
+test_1 (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", *q); /* { dg-warning "state: 'tainted'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state: 'tainted'" } */  
+}
+
+/* Example of marking a callback field as tainted.  */
+
+struct s2
+{
+  void (*cb) (int, void *, char *)
+    __attribute__((tainted));
+};
+
+/* Function not marked as tainted.  */
+
+void
+test_2a (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the normal entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'start'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state: 'start'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state: 'start'" } */  
+}
+
+/* Function referenced via t2b.cb, marked as "tainted".  */
+
+void
+test_2b (int i, void *p, char *q)
+{
+  /* There should be two enodes
+     for the direct call, and the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed enodes" } */
+}
+
+/* Callback used via t2c.cb, marked as "tainted".  */
+void
+__analyzer_test_2c (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state: 'tainted'" } */
+}
+
+struct s2 t2b =
+{
+  .cb = test_2b
+};
+
+struct s2 t2c =
+{
+  .cb = __analyzer_test_2c
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
new file mode 100644
index 00000000000..6f4cbc82efb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
@@ -0,0 +1,6 @@ 
+int not_a_fn __attribute__ ((tainted)); /* { dg-warning "'tainted' attribute ignored; valid only for functions and function pointer fields" } */
+
+struct s
+{
+  int f __attribute__ ((tainted)); /* { dg-warning "'tainted' attribute ignored; field must be a function pointer" } */
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
new file mode 100644
index 00000000000..fe6c7ebbb1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
@@ -0,0 +1,93 @@ 
+/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in the
+   Linux kernel before 2.6.39.4 on the Alpha platform does not properly
+   restrict the data size for GSI_GET_HWRPB operations, which allows
+   local users to obtain sensitive information from kernel memory via
+   a crafted call."
+
+   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-2.6.39.y
+   in linux-stable.  */
+
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include "test-uaccess.h"
+
+/* Adapted from include/linux/linkage.h.  */
+
+#define asmlinkage
+
+/* Adapted from include/linux/syscalls.h.  */
+
+#define __SC_DECL1(t1, a1)	t1 a1
+#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
+#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
+#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
+#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
+#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
+
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+
+#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
+#define __SYSCALL_DEFINEx(x, name, ...)					\
+	asmlinkage __attribute__((tainted)) \
+	long sys##name(__SC_DECL##x(__VA_ARGS__))
+
+#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
+
+/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
+
+struct hwrpb_struct {
+	unsigned long phys_addr;	/* check: physical address of the hwrpb */
+	unsigned long id;		/* check: "HWRPB\0\0\0" */
+	unsigned long revision;
+	unsigned long size;		/* size of hwrpb */
+	/* [...snip...] */
+};
+
+extern struct hwrpb_struct *hwrpb;
+
+/* Adapted from arch/alpha/kernel/osf_sys.c.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	/* [...snip...] */
+
+	__analyzer_dump_state ("taint", nbytes);  /* { dg-warning "tainted" } */
+
+	/* TODO: should have an event explaining why "nbytes" is treated as
+	   attacker-controlled.  */
+
+	/* case GSI_GET_HWRPB: */
+		if (nbytes < sizeof(*hwrpb))
+			return -1;
+
+		__analyzer_dump_state ("taint", nbytes);  /* { dg-warning "has_lb" } */
+
+		if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* { dg-warning "use of attacker-controlled value 'nbytes' as size without upper-bounds checking" } */
+			return -2;
+
+		return 1;
+
+	/* [...snip...] */
+}
+
+/* With the fix for the sense of the size comparison.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user *, buffer,
+		unsigned long, nbytes, int __user *, start, void __user *, arg)
+{
+	/* [...snip...] */
+
+	/* case GSI_GET_HWRPB: */
+		if (nbytes > sizeof(*hwrpb))
+			return -1;
+		if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* { dg-bogus "attacker-controlled" } */
+			return -2;
+
+		return 1;
+
+	/* [...snip...] */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
new file mode 100644
index 00000000000..0b9a94a8d6c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
@@ -0,0 +1,38 @@ 
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+	/* [...snip...] */
+	ssize_t (*store)(struct config_item *, const char *, size_t) /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute' is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+		__attribute__((tainted)); /* (this is added).  */
+};
+static inline struct gadget_info *to_gadget_info(struct config_item *item)
+{
+	 return container_of(to_config_group(item), struct gadget_info, group);
+}
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+		const char *page, size_t len)
+{
+	struct gadget_info *gi = to_gadget_info(item);
+	char *name;
+	int ret;
+
+#if 0
+	/* FIXME: this is the fix.  */
+	if (strlen(page) < len)
+		return -EOVERFLOW;
+#endif
+
+	name = kstrdup(page, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+	if (name[len - 1] == '\n') /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+		name[len - 1] = '\0'; /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+	/* [...snip...] */				\
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\) function 'gadget_dev_desc_UDC_store' used as initializer for field 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
new file mode 100644
index 00000000000..e05da9276c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
@@ -0,0 +1,32 @@ 
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+	/* [...snip...] */
+	ssize_t (*store)(struct config_item *, const char *, size_t) /* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute' is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+		__attribute__((tainted)); /* (this is added).  */
+};
+
+/* Highly simplified version.  */
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+		const char *page, size_t len)
+{
+	/* TODO: ought to have state_change_event talking about where the tainted value comes from.  */
+
+	char *name;
+	/* [...snip...] */
+
+	name = kstrdup(page, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+	if (name[len - 1] == '\n') /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+		name[len - 1] = '\0';  /* { dg-warning "use of attacker-controlled value 'len \[^\n\r\]+' as offset without upper-bounds checking" } */
+	/* [...snip...] */
+	return 0;
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\) function 'gadget_dev_desc_UDC_store' used as initializer for field 'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
new file mode 100644
index 00000000000..0ba023539af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
@@ -0,0 +1,91 @@ 
+/* Shared header for the various taint-CVE-2020-13143.h tests.
+   
+   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in the
+   Linux kernel 3.16 through 5.6.13 relies on kstrdup without considering
+   the possibility of an internal '\0' value, which allows attackers to
+   trigger an out-of-bounds read, aka CID-15753588bcd4."
+
+   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
+   in linux-stable.  */
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include <stddef.h>
+
+/* Adapted from include/uapi/asm-generic/posix_types.h  */
+
+typedef unsigned int     __kernel_size_t;
+typedef int              __kernel_ssize_t;
+
+/* Adapted from include/linux/types.h  */
+
+//typedef __kernel_size_t		size_t;
+typedef __kernel_ssize_t	ssize_t;
+
+/* Adapted from include/linux/kernel.h  */
+
+#define container_of(ptr, type, member) ({				\
+	void *__mptr = (void *)(ptr);					\
+	/* [...snip...] */						\
+	((type *)(__mptr - offsetof(type, member))); })
+
+/* Adapted from include/linux/configfs.h  */
+
+struct config_item {
+	/* [...snip...] */
+};
+
+struct config_group {
+	struct config_item		cg_item;
+	/* [...snip...] */
+};
+
+static inline struct config_group *to_config_group(struct config_item *item)
+{
+	return item ? container_of(item,struct config_group,cg_item) : NULL;
+}
+
+#define CONFIGFS_ATTR(_pfx, _name)				\
+static struct configfs_attribute _pfx##attr_##_name = {	\
+	/* [...snip...] */				\
+	.store		= _pfx##_name##_store,		\
+}
+
+/* Adapted from include/linux/compiler.h  */
+
+#define __force
+
+/* Adapted from include/asm-generic/errno-base.h  */
+
+#define	ENOMEM		12	/* Out of memory */
+
+/* Adapted from include/linux/types.h  */
+
+#define __bitwise__
+typedef unsigned __bitwise__ gfp_t;
+
+/* Adapted from include/linux/gfp.h  */
+
+#define ___GFP_WAIT		0x10u
+#define ___GFP_IO		0x40u
+#define ___GFP_FS		0x80u
+#define __GFP_WAIT	((__force gfp_t)___GFP_WAIT)
+#define __GFP_IO	((__force gfp_t)___GFP_IO)
+#define __GFP_FS	((__force gfp_t)___GFP_FS)
+#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
+
+/* Adapted from include/linux/compiler_attributes.h  */
+
+#define __malloc                        __attribute__((__malloc__))
+
+/* Adapted from include/linux/string.h  */
+
+extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
+
+/* Adapted from drivers/usb/gadget/configfs.c  */
+
+struct gadget_info {
+	struct config_group group;
+	/* [...snip...] */				\
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
new file mode 100644
index 00000000000..4c567b2ffdf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
@@ -0,0 +1,21 @@ 
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+void *p;
+
+void __attribute__((tainted))
+test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+{
+  /* TODO: should have a message saying why "sz" is tainted, e.g.
+     "treating 'sz' as attacker-controlled because 'test_1' is marked with '__attribute__((tainted))'"  */
+
+  p = malloc (sz); /* { dg-warning "use of attacker-controlled value 'sz' as allocation size without upper-bounds checking" "warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-1 } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
new file mode 100644
index 00000000000..f52cafcd71d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
@@ -0,0 +1,31 @@ 
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+struct arg_buf
+{
+  size_t sz;
+};
+
+void *p;
+
+void __attribute__((tainted))
+test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+{
+  /* we should treat pointed-to-structs as tainted.  */
+  __analyzer_dump_state ("taint", data); /* { dg-warning "state: 'tainted'" } */
+  
+  struct arg_buf *args = data;
+
+  __analyzer_dump_state ("taint", args); /* { dg-warning "state: 'tainted'" } */
+  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state: 'tainted'" } */
+  
+  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled value '\\*args.sz' as allocation size without upper-bounds checking" "warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value '\\*args.sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-1 } */
+}