c++: ahead-of-time overload set pruning for non-dep calls

Message ID 20211215174905.3848806-1-ppalka@redhat.com
State New
Series c++: ahead-of-time overload set pruning for non-dep calls

Commit Message

Patrick Palka Dec. 15, 2021, 5:49 p.m. UTC
  This patch makes us remember the function selected by overload
resolution during ahead of time processing of a non-dependent call
expression, so that we avoid repeating most of the work of overload
resolution at instantiation time.  This mirrors what we already do for
non-dependent operator expressions via build_min_non_dep_op_overload.
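
For illustration only (this is not part of the patch), here is a minimal sketch of what a non-dependent call inside a template looks like.  The names g and h are hypothetical.  Since the call g(0) involves no template parameter, it is resolved once, when the template is defined, which is what makes it possible to remember the selected function for instantiation time:

```cpp
#include <cassert>

int g(double) { return 1; }

// g(0) does not depend on T, so name lookup and overload resolution
// for it happen when the template is defined.
template<class T> int h() { return g(0); }

// Declared after the template definition: not visible to the
// non-dependent call above, even though it would be a better match.
int g(int) { return 2; }
```

With these declarations h<int>() returns 1, because only g(double) was visible when the call was resolved.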

Some caveats:

 * When processing ahead of time a non-dependent call to a member
   function template inside a class template (as in
   g++.dg/template/deduce4.C), we end up generating an "inverted" partial
   instantiation such as S<T>::foo<int, int>(), the kinds of which we're
   apparently not prepared to fully instantiate (e.g. tsubst_baselink
   mishandles it).  So this patch disables this optimization for such
   functions and adds a FIXME.

 * When trying to make the instantiation machinery handle these partial
   instantiations, I made a couple of changes in register_specialization
   and tsubst_function_decl that get us closer to handling such partial
   instantiations and that seem like improvements on their own, so this
   patch includes these changes.

  * This change triggered a latent FUNCTION_DECL pretty printing issue
    in cpp0x/error2.C -- since we now resolve the call to foo<0> ahead
    of time, the error now looks like:

      error: expansion pattern ‘foo()()=0’ contains no parameter pack

    where the FUNCTION_DECL foo is clearly misprinted.  But this
    pretty-printing issue could be reproduced without this patch if
    we replace foo with an ordinary function.  Since this testcase was
    added to verify pretty printing of TEMPLATE_ID_EXPR, I work around
    this test failure by making the call to foo type-dependent and thus
    immune to this ahead of time pruning.

  * We now reject parts of cpp0x/fntmp-equiv1.C because we notice that
    the call d(f, b) in

      template <unsigned long f, unsigned b, typename> e<d(f, b)> d();

    isn't constexpr because the (resolved) d isn't.  I tried fixing this
    by making d constexpr, but then the call to d from main becomes
    ambiguous.  So I settled on removing this part of the testcase.
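
To illustrate the first caveat above, the following is a guess at the shape of code that gives rise to an "inverted" partial instantiation.  It is a hypothetical reduction, not the contents of g++.dg/template/deduce4.C; S and foo follow the naming used in the caveat and bar is made up:

```cpp
#include <cassert>

template<class T> struct S {
  template<class U, class V> static int foo() { return 1; }
  template<class U> static int foo() { return 2; }

  // The template arguments of foo are fixed here, but the enclosing
  // S<T> is still dependent.  A function selected ahead of time would
  // therefore be the partial instantiation S<T>::foo<int, int>.
  int bar() { return foo<int, int>(); }
};
```

Once S is instantiated, for example as S<char>, the call in bar selects the two-parameter overload, since the one-parameter overload cannot accept two explicit template arguments.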

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Also tested on cmcstl2 and range-v3.

gcc/cp/ChangeLog:

	* call.c (build_new_method_call): For a non-dependent call
	expression inside a template, return a templated tree
	whose overload set contains just the selected function.
	* pt.c (register_specialization): Check only the innermost
	template args for dependence in the early exit test.
	(tsubst_function_decl): Simplify obtaining the template arguments
	for a partial instantiation.
	* semantics.c (finish_call_expr): As with build_new_method_call.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/error2.C: Make the call to foo type-dependent in
	order to avoid latent pretty-printing issue for FUNCTION_DECL
	inside MODOP_EXPR.
	* g++.dg/cpp0x/fntmp-equiv1.C: Remove ill-formed parts of
	testcase that we now diagnose.
	* g++.dg/template/non-dependent16.C: New test.
	* g++.dg/template/non-dependent16a.C: New test.
---
 gcc/cp/call.c                                 | 17 +++++++++
 gcc/cp/pt.c                                   | 18 ++-------
 gcc/cp/semantics.c                            | 15 ++++++++
 gcc/testsuite/g++.dg/cpp0x/error2.C           |  4 +-
 gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C     |  4 --
 .../g++.dg/template/non-dependent16.C         | 37 +++++++++++++++++++
 .../g++.dg/template/non-dependent16a.C        | 36 ++++++++++++++++++
 7 files changed, 111 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16.C
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16a.C
  

Comments

Jason Merrill Dec. 15, 2021, 9:29 p.m. UTC | #1
On 12/15/21 12:49, Patrick Palka wrote:
> This patch makes us remember the function selected by overload
> resolution during ahead of time processing of a non-dependent call
> expression, so that we avoid repeating most of the work of overload
> resolution at instantiation time.  This mirrors what we already do for
> non-dependent operator expressions via build_min_non_dep_op_overload.
> 
> Some caveats:
> 
>   * When processing ahead of time a non-dependent call to a member
>     function template inside a class template (as in
>     g++.dg/template/deduce4.C), we end up generating an "inverted" partial
>     instantiation such as S<T>::foo<int, int>(), the kinds of which we're
>     apparently not prepared to fully instantiate (e.g. tsubst_baselink
>     mishandles it).  So this patch disables this optimization for such
>     functions and adds a FIXME.

I wonder if it would be worthwhile to build a TEMPLATE_ID_EXPR to 
remember the deduced template args, even if we are failing to remember 
the actual function?

>   * When trying to make the instantiation machinery handle these partial
>     instantiations, I made a couple of changes in register_specialization
>     and tsubst_function_decl that get us closer to handling such partial
>     instantiations and that seem like improvements on their own, so this
>     patch includes these changes.

The tsubst_function_decl change makes me nervous; surely there was some 
reason that function wasn't that way in the first place.  Let's hold 
these changes for stage 1 if they aren't actually fixing anything.

>    * This change triggered a latent FUNCTION_DECL pretty printing issue
>      in cpp0x/error2.C -- since we now resolve the call to foo<0> ahead
>      of time, the error now looks like:
> 
>        error: expansion pattern ‘foo()()=0’ contains no parameter pack
> 
>      where the FUNCTION_DECL foo is clearly misprinted.  But this
>      pretty-printing issue could be reproduced without this patch if
>      we replace foo with an ordinary function.  Since this testcase was
>      added to verify pretty printing of TEMPLATE_ID_EXPR, I work around
>      this test failure by making the call to foo type-dependent and thus
>      immune to this ahead of time pruning.
> 
>    * We now reject parts of cpp0x/fntmp-equiv1.C because we notice that
>      the call d(f, b) in
> 
>        template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
> 
>      isn't constexpr because the (resolved) d isn't.  I tried fixing this
>      by making d constexpr, but then the call to d from main becomes
>      ambiguous.  So I settled with removing this part of the testcase.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?  Also tested on cmcstl2 and range-v3.
> 
> gcc/cp/ChangeLog:
> 
> 	* call.c (build_new_method_call): For a non-dependent call
> 	expression inside a template, returning a templated tree
> 	whose overload set contains just the selected function.
> 	* pt.c (register_specialization): Check only the innermost
> 	template args for dependence in the early exit test.
> 	(tsubst_function_decl): Simplify obtaining the template arguments
> 	for a partial instantiation.
> 	* semantics.c (finish_call_expr): As with build_new_method_call.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* g++.dg/cpp0x/error2.C: Make the call to foo type-dependent in
> 	order to avoid latent pretty-printing issue for FUNCTION_DECL
> 	inside MODOP_EXPR.
> 	* g++.dg/cpp0x/fntmp-equiv1.C: Remove ill-formed parts of
> 	testcase that we now diagnose.
> 	* g++.dg/template/non-dependent16.C: New test.
> 	* g++.dg/template/non-dependent16a.C: New test.
> ---
>   gcc/cp/call.c                                 | 17 +++++++++
>   gcc/cp/pt.c                                   | 18 ++-------
>   gcc/cp/semantics.c                            | 15 ++++++++
>   gcc/testsuite/g++.dg/cpp0x/error2.C           |  4 +-
>   gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C     |  4 --
>   .../g++.dg/template/non-dependent16.C         | 37 +++++++++++++++++++
>   .../g++.dg/template/non-dependent16a.C        | 36 ++++++++++++++++++
>   7 files changed, 111 insertions(+), 20 deletions(-)
>   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16.C
>   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16a.C
> 
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index 53a391cbc6b..92d96c19f5c 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -11165,6 +11165,23 @@ build_new_method_call (tree instance, tree fns, vec<tree, va_gc> **args,
>   	}
>         if (INDIRECT_REF_P (call))
>   	call = TREE_OPERAND (call, 0);
> +
> +      /* Prune all but the selected function from the original overload
> +	 set so that we can avoid some duplicate work at instantiation time.  */
> +      if (really_overloaded_fn (fns))
> +	{
> +	  if (DECL_TEMPLATE_INFO (fn)
> +	      && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
> +	      && dependent_type_p (DECL_CONTEXT (fn)))
> +	    /* FIXME: We're not prepared to fully instantiate "inverted"
> +	       partial instantiations such as A<T>::f<int>().  */;
> +	  else
> +	    {
> +	      orig_fns = copy_node (orig_fns);
> +	      BASELINK_FUNCTIONS (orig_fns) = fn;
> +	    }
> +	}
> +
>         call = (build_min_non_dep_call_vec
>   	      (call,
>   	       build_min (COMPONENT_REF, TREE_TYPE (CALL_EXPR_FN (call)),
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 2340139b238..b114114e617 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -1566,18 +1566,10 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
>   		  && TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK));
>   
>     if (TREE_CODE (spec) == FUNCTION_DECL
> -      && uses_template_parms (DECL_TI_ARGS (spec)))
> +      && uses_template_parms (INNERMOST_TEMPLATE_ARGS (DECL_TI_ARGS (spec))))
>       /* This is the FUNCTION_DECL for a partial instantiation.  Don't
> -       register it; we want the corresponding TEMPLATE_DECL instead.
> -       We use `uses_template_parms (DECL_TI_ARGS (spec))' rather than
> -       the more obvious `uses_template_parms (spec)' to avoid problems
> -       with default function arguments.  In particular, given
> -       something like this:
> -
> -	  template <class T> void f(T t1, T t = T())
> -
> -       the default argument expression is not substituted for in an
> -       instantiation unless and until it is actually needed.  */
> +       register it; we want to register the corresponding TEMPLATE_DECL
> +       instead.  */
>       return spec;
>   
>     if (optimize_specialization_lookup_p (tmpl))
> @@ -13960,9 +13952,7 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
>   
>         /* Calculate the complete set of arguments used to
>   	 specialize R.  */
> -      argvec = tsubst_template_args (DECL_TI_ARGS
> -				     (DECL_TEMPLATE_RESULT
> -				      (DECL_TI_TEMPLATE (t))),
> +      argvec = tsubst_template_args (DECL_TI_ARGS (t),
>   				     args, complain, in_decl);
>         if (argvec == error_mark_node)
>   	return error_mark_node;
> diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> index 7078af03d3c..57f689042b9 100644
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -2893,6 +2893,21 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
>       {
>         if (INDIRECT_REF_P (result))
>   	result = TREE_OPERAND (result, 0);
> +
> +      /* Prune all but the selected function from the original overload
> +	 set so that we can avoid some duplicate work at instantiation time.  */
> +      if (TREE_CODE (result) == CALL_EXPR
> +	  && really_overloaded_fn (orig_fn))
> +	{
> +	  orig_fn = CALL_EXPR_FN (result);
> +	  if (TREE_CODE (orig_fn) == COMPONENT_REF)
> +	    {
> +	      /* The result of build_new_method_call.  */
> +	      orig_fn = TREE_OPERAND (orig_fn, 1);
> +	      gcc_assert (BASELINK_P (orig_fn));
> +	    }
> +	}
> +
>         result = build_call_vec (TREE_TYPE (result), orig_fn, orig_args);
>         SET_EXPR_LOCATION (result, input_location);
>         KOENIG_LOOKUP_P (result) = koenig_p;
> diff --git a/gcc/testsuite/g++.dg/cpp0x/error2.C b/gcc/testsuite/g++.dg/cpp0x/error2.C
> index e6af294c180..eb966362ccb 100644
> --- a/gcc/testsuite/g++.dg/cpp0x/error2.C
> +++ b/gcc/testsuite/g++.dg/cpp0x/error2.C
> @@ -3,7 +3,7 @@
>   
>   template<int> int foo();
>   
> -template<typename F> void bar(F f)
> +template<typename F, int N> void bar(F f)
>   {
> -  f((foo<0>()=0)...); // { dg-error "pattern '\\(foo\\<0\\>\\)\\(\\)=0'" }
> +  f((foo<N>()=0)...); // { dg-error "pattern '\\(foo\\<N\\>\\)\\(\\)=0'" }
>   }
> diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> index 833ae6fc85c..60ebad8d1d3 100644
> --- a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> +++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> @@ -1,10 +1,7 @@
>   // PR c++/86946, DR 1321
>   // { dg-do compile { target c++11 } }
>   
> -int d(int, int);
>   template <long> class e {};
> -template <unsigned long f, unsigned b, typename> e<sizeof(d(f, b))> d();
> -template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
>   
>   template <class T, class U> constexpr T d2(T, U) { return 42; }
>   template <unsigned long f, unsigned b, typename> e<d2(f, b)> d2();
> @@ -17,7 +14,6 @@ template <unsigned long f, unsigned b, typename> e<sizeof(d3(f, b))> d3();
>   
>   int main()
>   {
> -  d<1,2,int>();
>     d2<1,2,int>();
>     d3<1,2,int>();
>   }
> diff --git a/gcc/testsuite/g++.dg/template/non-dependent16.C b/gcc/testsuite/g++.dg/template/non-dependent16.C
> new file mode 100644
> index 00000000000..ee8ef902529
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/template/non-dependent16.C
> @@ -0,0 +1,37 @@
> +// This test verifies that after resolving a non-dependent call expression
> +// ahead of time, we prune all but the selected candidate from the overload
> +// set.  Without this optimization, overload resolution for the final call to
> +// f<void>() would be exponential in the size of the overload set.
> +
> +// { dg-do compile { target c++11 } }
> +
> +template<class T> void f();
> +template<class T> auto f() -> decltype(f<void>(), 1, *T());
> +template<class T> auto f() -> decltype(f<void>(), 2, *T());
> +template<class T> auto f() -> decltype(f<void>(), 3, *T());
> +template<class T> auto f() -> decltype(f<void>(), 4, *T());
> +template<class T> auto f() -> decltype(f<void>(), 5, *T());
> +template<class T> auto f() -> decltype(f<void>(), 6, *T());
> +template<class T> auto f() -> decltype(f<void>(), 7, *T());
> +template<class T> auto f() -> decltype(f<void>(), 8, *T());
> +template<class T> auto f() -> decltype(f<void>(), 9, *T());
> +template<class T> auto f() -> decltype(f<void>(), 10, *T());
> +template<class T> auto f() -> decltype(f<void>(), 11, *T());
> +template<class T> auto f() -> decltype(f<void>(), 12, *T());
> +template<class T> auto f() -> decltype(f<void>(), 13, *T());
> +template<class T> auto f() -> decltype(f<void>(), 14, *T());
> +template<class T> auto f() -> decltype(f<void>(), 15, *T());
> +template<class T> auto f() -> decltype(f<void>(), 16, *T());
> +template<class T> auto f() -> decltype(f<void>(), 17, *T());
> +template<class T> auto f() -> decltype(f<void>(), 18, *T());
> +template<class T> auto f() -> decltype(f<void>(), 19, *T());
> +template<class T> auto f() -> decltype(f<void>(), 20, *T());
> +template<class T> auto f() -> decltype(f<void>(), 21, *T());
> +template<class T> auto f() -> decltype(f<void>(), 22, *T());
> +template<class T> auto f() -> decltype(f<void>(), 23, *T());
> +template<class T> auto f() -> decltype(f<void>(), 24, *T());
> +template<class T> auto f() -> decltype(f<void>(), 25, *T());
> +
> +int main() {
> +  f<void>();
> +}
> diff --git a/gcc/testsuite/g++.dg/template/non-dependent16a.C b/gcc/testsuite/g++.dg/template/non-dependent16a.C
> new file mode 100644
> index 00000000000..0e04d646c0b
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/template/non-dependent16a.C
> @@ -0,0 +1,36 @@
> +// Like non-dependent16.C, but using member functions.
> +
> +// { dg-do compile { target c++11 } }
> +
> +struct A {
> +  template<class T> static void f();
> +  template<class T> static auto f() -> decltype(f<void>(), 1, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 2, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 3, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 4, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 5, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 6, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 7, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 8, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 9, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 10, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 11, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 12, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 13, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 14, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 15, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 16, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 17, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 18, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 19, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 20, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 21, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 22, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 23, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 24, *T());
> +  template<class T> static auto f() -> decltype(f<void>(), 25, *T());
> +};
> +
> +int main() {
> +  A::f<void>();
> +}
  
Patrick Palka Dec. 16, 2021, 4:59 p.m. UTC | #2
On Wed, 15 Dec 2021, Jason Merrill wrote:

> On 12/15/21 12:49, Patrick Palka wrote:
> > This patch makes us remember the function selected by overload
> > resolution during ahead of time processing of a non-dependent call
> > expression, so that we avoid repeating most of the work of overload
> > resolution at instantiation time.  This mirrors what we already do for
> > non-dependent operator expressions via build_min_non_dep_op_overload.
> > 
> > Some caveats:
> > 
> >   * When processing ahead of time a non-dependent call to a member
> >     function template inside a class template (as in
> >     g++.dg/template/deduce4.C), we end up generating an "inverted" partial
> >     instantiation such as S<T>::foo<int, int>(), the kinds of which we're
> >     apparently not prepared to fully instantiate (e.g. tsubst_baselink
> >     mishandles it).  So this patch disables this optimization for such
> >     functions and adds a FIXME.
> 
> I wonder if it would be worthwhile to build a TEMPLATE_ID_EXPR to remember the
> deduced template args, even if we are failing to remember the actual function?

Hmm, that transformation could have observable effects, since overload
resolution for f<int>(0) might end up instantiating more things than for
f(0) due to the explicit-args substitution step:

  template<class T> struct A { using type = typename T::type; };

  template<class T> void f(T);
  template<class T, class U = typename T::type> typename A<T>::type f(T);

Here overload resolution for f(0) succeeds and selects the first
overload but for f<int>(0) induces a hard error.  Also I worry that such
a transformation might affect declaration matching in weird ways due
to conflating f(0) with f<int>(0).
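
The f(0) half of this example can be checked directly.  The following is a sketch in which the first overload is given an int return type and a body (an addition for illustration, as is the helper call_deduced) so that the call has something to return; the f<int>(0) form is left commented out since it is expected to be a hard error:

```cpp
#include <cassert>

template<class T> struct A { using type = typename T::type; };

template<class T> int f(T) { return 1; }
template<class T, class U = typename T::type> typename A<T>::type f(T);

int call_deduced() {
  // Deduction for the second overload fails quietly on the default
  // argument of U (int has no ::type), so the first overload is chosen.
  return f(0);
  // return f<int>(0);  // per the discussion above: hard error via A<int>
}
```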

We could at least though prune the overload set to the corresponding
selected function template rather than the "inside-out" specialization;
I'll try to implement that.

> 
> > >   * When trying to make the instantiation machinery handle these partial
> >     instantiations, I made a couple of changes in register_specialization
> >     and tsubst_function_decl that get us closer to handling such partial
> >     instantiations and that seem like improvements on their own, so this
> >     patch includes these changes.
> 
> The tsubst_function_decl change makes me nervous; surely there was some reason
> that function wasn't that way in the first place.  Let's hold these changes
> for stage 1 if they aren't actually fixing anything.

Will do.

> 
> >    * This change triggered a latent FUNCTION_DECL pretty printing issue
> >      in cpp0x/error2.C -- since we now resolve the call to foo<0> ahead
> >      of time, the error now looks like:
> > 
> >        error: expansion pattern ‘foo()()=0’ contains no parameter pack
> > 
> >      where the FUNCTION_DECL foo is clearly misprinted.  But this
> >      pretty-printing issue could be reproduced without this patch if
> >      we replace foo with an ordinary function.  Since this testcase was
> >      added to verify pretty printing of TEMPLATE_ID_EXPR, I work around
> >      this test failure by making the call to foo type-dependent and thus
> >      immune to this ahead of time pruning.
> > 
> >    * We now reject parts of cpp0x/fntmp-equiv1.C because we notice that
> >      the call d(f, b) in
> > 
> >        template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
> > 
> >      isn't constexpr because the (resolved) d isn't.  I tried fixing this
> >      by making d constexpr, but then the call to d from main becomes
> >      ambiguous.  So I settled with removing this part of the testcase.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?  Also tested on cmcstl2 and range-v3.
> > 
> > gcc/cp/ChangeLog:
> > 
> > 	* call.c (build_new_method_call): For a non-dependent call
> > 	expression inside a template, returning a templated tree
> > 	whose overload set contains just the selected function.
> > 	* pt.c (register_specialization): Check only the innermost
> > 	template args for dependence in the early exit test.
> > 	(tsubst_function_decl): Simplify obtaining the template arguments
> > 	for a partial instantiation.
> > 	* semantics.c (finish_call_expr): As with build_new_method_call.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 	* g++.dg/cpp0x/error2.C: Make the call to foo type-dependent in
> > 	order to avoid latent pretty-printing issue for FUNCTION_DECL
> > 	inside MODOP_EXPR.
> > 	* g++.dg/cpp0x/fntmp-equiv1.C: Remove ill-formed parts of
> > 	testcase that we now diagnose.
> > 	* g++.dg/template/non-dependent16.C: New test.
> > 	* g++.dg/template/non-dependent16a.C: New test.
> > ---
> >   gcc/cp/call.c                                 | 17 +++++++++
> >   gcc/cp/pt.c                                   | 18 ++-------
> >   gcc/cp/semantics.c                            | 15 ++++++++
> >   gcc/testsuite/g++.dg/cpp0x/error2.C           |  4 +-
> >   gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C     |  4 --
> >   .../g++.dg/template/non-dependent16.C         | 37 +++++++++++++++++++
> >   .../g++.dg/template/non-dependent16a.C        | 36 ++++++++++++++++++
> >   7 files changed, 111 insertions(+), 20 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16.C
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16a.C
> > 
> > diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> > index 53a391cbc6b..92d96c19f5c 100644
> > --- a/gcc/cp/call.c
> > +++ b/gcc/cp/call.c
> > @@ -11165,6 +11165,23 @@ build_new_method_call (tree instance, tree fns,
> > vec<tree, va_gc> **args,
> >   	}
> >         if (INDIRECT_REF_P (call))
> >   	call = TREE_OPERAND (call, 0);
> > +
> > +      /* Prune all but the selected function from the original overload
> > > +	 set so that we can avoid some duplicate work at instantiation time.  */
> > +      if (really_overloaded_fn (fns))
> > +	{
> > +	  if (DECL_TEMPLATE_INFO (fn)
> > +	      && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
> > +	      && dependent_type_p (DECL_CONTEXT (fn)))
> > +	    /* FIXME: We're not prepared to fully instantiate "inverted"
> > +	       partial instantiations such as A<T>::f<int>().  */;
> > +	  else
> > +	    {
> > +	      orig_fns = copy_node (orig_fns);
> > +	      BASELINK_FUNCTIONS (orig_fns) = fn;
> > +	    }
> > +	}
> > +
> >         call = (build_min_non_dep_call_vec
> >   	      (call,
> >   	       build_min (COMPONENT_REF, TREE_TYPE (CALL_EXPR_FN (call)),
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 2340139b238..b114114e617 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -1566,18 +1566,10 @@ register_specialization (tree spec, tree tmpl, tree
> > args, bool is_friend,
> >   		  && TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK));
> >       if (TREE_CODE (spec) == FUNCTION_DECL
> > -      && uses_template_parms (DECL_TI_ARGS (spec)))
> > +      && uses_template_parms (INNERMOST_TEMPLATE_ARGS (DECL_TI_ARGS
> > (spec))))
> >       /* This is the FUNCTION_DECL for a partial instantiation.  Don't
> > -       register it; we want the corresponding TEMPLATE_DECL instead.
> > -       We use `uses_template_parms (DECL_TI_ARGS (spec))' rather than
> > -       the more obvious `uses_template_parms (spec)' to avoid problems
> > -       with default function arguments.  In particular, given
> > -       something like this:
> > -
> > -	  template <class T> void f(T t1, T t = T())
> > -
> > -       the default argument expression is not substituted for in an
> > -       instantiation unless and until it is actually needed.  */
> > +       register it; we want to register the corresponding TEMPLATE_DECL
> > +       instead.  */
> > >       return spec;
> > >   
> > >     if (optimize_specialization_lookup_p (tmpl))
> > > @@ -13960,9 +13952,7 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
> > >   
> > >         /* Calculate the complete set of arguments used to
> >   	 specialize R.  */
> > -      argvec = tsubst_template_args (DECL_TI_ARGS
> > -				     (DECL_TEMPLATE_RESULT
> > -				      (DECL_TI_TEMPLATE (t))),
> > +      argvec = tsubst_template_args (DECL_TI_ARGS (t),
> >   				     args, complain, in_decl);
> >         if (argvec == error_mark_node)
> >   	return error_mark_node;
> > diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> > index 7078af03d3c..57f689042b9 100644
> > --- a/gcc/cp/semantics.c
> > +++ b/gcc/cp/semantics.c
> > @@ -2893,6 +2893,21 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args,
> > bool disallow_virtual,
> >       {
> >         if (INDIRECT_REF_P (result))
> >   	result = TREE_OPERAND (result, 0);
> > +
> > +      /* Prune all but the selected function from the original overload
> > > +	 set so that we can avoid some duplicate work at instantiation time.  */
> > +      if (TREE_CODE (result) == CALL_EXPR
> > +	  && really_overloaded_fn (orig_fn))
> > +	{
> > +	  orig_fn = CALL_EXPR_FN (result);
> > +	  if (TREE_CODE (orig_fn) == COMPONENT_REF)
> > +	    {
> > +	      /* The result of build_new_method_call.  */
> > +	      orig_fn = TREE_OPERAND (orig_fn, 1);
> > +	      gcc_assert (BASELINK_P (orig_fn));
> > +	    }
> > +	}
> > +
> >         result = build_call_vec (TREE_TYPE (result), orig_fn, orig_args);
> >         SET_EXPR_LOCATION (result, input_location);
> >         KOENIG_LOOKUP_P (result) = koenig_p;
> > > diff --git a/gcc/testsuite/g++.dg/cpp0x/error2.C b/gcc/testsuite/g++.dg/cpp0x/error2.C
> > > index e6af294c180..eb966362ccb 100644
> > > --- a/gcc/testsuite/g++.dg/cpp0x/error2.C
> > > +++ b/gcc/testsuite/g++.dg/cpp0x/error2.C
> > > @@ -3,7 +3,7 @@
> > >   
> > >   template<int> int foo();
> > >   
> > > -template<typename F> void bar(F f)
> > +template<typename F, int N> void bar(F f)
> >   {
> > -  f((foo<0>()=0)...); // { dg-error "pattern '\\(foo\\<0\\>\\)\\(\\)=0'" }
> > +  f((foo<N>()=0)...); // { dg-error "pattern '\\(foo\\<N\\>\\)\\(\\)=0'" }
> >   }
> > > diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> > > index 833ae6fc85c..60ebad8d1d3 100644
> > > --- a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> > > +++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
> > > @@ -1,10 +1,7 @@
> > >   // PR c++/86946, DR 1321
> > >   // { dg-do compile { target c++11 } }
> > >   
> > > -int d(int, int);
> > >   template <long> class e {};
> > > -template <unsigned long f, unsigned b, typename> e<sizeof(d(f, b))> d();
> > > -template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
> > >   
> > >   template <class T, class U> constexpr T d2(T, U) { return 42; }
> > >   template <unsigned long f, unsigned b, typename> e<d2(f, b)> d2();
> > > @@ -17,7 +14,6 @@ template <unsigned long f, unsigned b, typename> e<sizeof(d3(f, b))> d3();
> > >   
> > >   int main()
> >   {
> > -  d<1,2,int>();
> >     d2<1,2,int>();
> >     d3<1,2,int>();
> >   }
> > > diff --git a/gcc/testsuite/g++.dg/template/non-dependent16.C b/gcc/testsuite/g++.dg/template/non-dependent16.C
> > new file mode 100644
> > index 00000000000..ee8ef902529
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/template/non-dependent16.C
> > @@ -0,0 +1,37 @@
> > +// This test verifies that after resolving a non-dependent call expression
> > +// ahead of time, we prune all but the selected candidate from the overload
> > > +// set.  Without this optimization, overload resolution for the final call to
> > +// f<void>() would be exponential in the size of the overload set.
> > +
> > +// { dg-do compile { target c++11 } }
> > +
> > +template<class T> void f();
> > +template<class T> auto f() -> decltype(f<void>(), 1, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 2, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 3, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 4, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 5, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 6, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 7, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 8, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 9, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 10, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 11, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 12, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 13, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 14, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 15, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 16, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 17, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 18, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 19, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 20, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 21, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 22, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 23, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 24, *T());
> > +template<class T> auto f() -> decltype(f<void>(), 25, *T());
> > +
> > +int main() {
> > +  f<void>();
> > +}
> > > diff --git a/gcc/testsuite/g++.dg/template/non-dependent16a.C b/gcc/testsuite/g++.dg/template/non-dependent16a.C
> > new file mode 100644
> > index 00000000000..0e04d646c0b
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/template/non-dependent16a.C
> > @@ -0,0 +1,36 @@
> > +// Like non-dependent16.C, but using member functions.
> > +
> > +// { dg-do compile { target c++11 } }
> > +
> > +struct A {
> > +  template<class T> static void f();
> > +  template<class T> static auto f() -> decltype(f<void>(), 1, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 2, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 3, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 4, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 5, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 6, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 7, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 8, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 9, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 10, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 11, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 12, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 13, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 14, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 15, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 16, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 17, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 18, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 19, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 20, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 21, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 22, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 23, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 24, *T());
> > +  template<class T> static auto f() -> decltype(f<void>(), 25, *T());
> > +};
> > +
> > +int main() {
> > +  A::f<void>();
> > +}
> 
>
  
Patrick Palka Dec. 16, 2021, 7:53 p.m. UTC | #3
On Thu, 16 Dec 2021, Patrick Palka wrote:

> On Wed, 15 Dec 2021, Jason Merrill wrote:
> 
> > On 12/15/21 12:49, Patrick Palka wrote:
> > > This patch makes us remember the function selected by overload
> > > resolution during ahead of time processing of a non-dependent call
> > > expression, so that we avoid repeating most of the work of overload
> > > resolution at instantiation time.  This mirrors what we already do for
> > > non-dependent operator expressions via build_min_non_dep_op_overload.
> > > 
> > > Some caveats:
> > > 
> > >   * When processing ahead of time a non-dependent call to a member
> > >     function template inside a class template (as in
> > >     g++.dg/template/deduce4.C), we end up generating an "inverted" partial
> > >     instantiation such as S<T>::foo<int, int>(), the kinds of which we're
> > >     apparently not prepared to fully instantiate (e.g. tsubst_baselink
> > >     mishandles it).  So this patch disables this optimization for such
> > >     functions and adds a FIXME.
> > 
> > I wonder if it would be worthwhile to build a TEMPLATE_ID_EXPR to remember the
> > deduced template args, even if we are failing to remember the actual function?
> 
> Hmm, that transformation could have observable effects, since overload
> resolution for f<int>(0) might end up instantiating more things than for
> f(0) due to the explicit-args substitution step:
> 
>   template<class T> struct A { using type = typename T::type; };
> 
>   template<class T> void f(T);
>   template<class T, class U = typename T::type> typename A<T>::type f(T);
> 
> Here overload resolution for f(0) succeeds and selects the first
> overload but for f<int>(0) induces a hard error.  Also I worry that such
> a transformation might affect declaration matching in weird ways due
> to conflating f(0) with f<int>(0).
> 
> We could, though, at least prune the overload set to the corresponding
> selected function template rather than the "inside-out" specialization;
> I'll try to implement that.
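
For reference, the f(0) versus f<int>(0) difference quoted above can be
reproduced with a small self-contained sketch.  The int return value on
the first overload and the helper name pick_overload are additions made
here for illustration only; they are not part of the original snippet.
The f<int>(0) call is left commented out, since it is the one expected
to trigger the hard error.

```cpp
// Sketch only: `pick_overload` and the int return type of overload #1 are
// illustrative additions, not taken from the patch.
template<class T> struct A { using type = typename T::type; };

// Overload #1: viable for any T.
template<class T> int f(T) { return 1; }

// Overload #2: for f(0), T is deduced as int and the default argument for U
// (typename int::type) fails to substitute, so this candidate is dropped
// before the return type A<T>::type is ever substituted.
template<class T, class U = typename T::type> typename A<T>::type f(T);

int pick_overload() {
  // f<int>(0);   // explicit-args substitution would instantiate A<int>,
  //              // whose body is ill-formed: a hard error, not SFINAE.
  return f(0);    // selects overload #1
}
```

Calling pick_overload() should return 1, i.e. the first overload is the
one chosen for f(0).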

Here's a bootstrapped and regtested patch which implements that:

-- >8 --

This patch makes us remember the function selected by overload resolution
during ahead of time processing of a non-dependent call expression, so
that we avoid repeating most of the work of overload resolution for the
call at instantiation time.  Note that we already do this for
non-dependent operator expressions via build_min_non_dep_op_overload.
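
As a rough illustration of why remembering the selected function is
sound (a sketch with hypothetical names g and call_g, not code from
this patch): a non-dependent call is already bound to its candidates
when the template is defined, so later declarations cannot change the
outcome at instantiation time.

```cpp
// Sketch only: g and call_g are hypothetical names.
int g(double) { return 1; }

template<class T> int call_g() {
  // Non-dependent call: the argument 0 does not depend on T, so g is looked
  // up and overload resolution runs here, at template definition time.
  return g(0);
}

// Declared after the template: not found by the lookup above, and there is
// no argument-dependent lookup for an int argument.
int g(int) { return 2; }
```

Here call_g<void>() returns 1 even though g(int) would be the better
match if lookup were redone at the point of instantiation.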

Some caveats:

 * When processing ahead of time a non-dependent call to a member
   function template inside a common class template (as in
   g++.dg/template/deduce4.C), we end up generating an "inverted" partial
   instantiation such as S<T>::foo<int, int>(), the likes of which we're
   apparently not prepared to fully instantiate (e.g. tsubst_baselink
   mishandles it).  So this patch makes us prune to the selected
   template instead of the specialization in this case.

 * This change triggered a latent FUNCTION_DECL pretty printing issue
   in cpp0x/error2.C -- since we now resolve the call to foo<0> ahead
   of time, the error now looks like:

     error: expansion pattern ‘foo()()=0’ contains no parameter pack

   where the FUNCTION_DECL for foo<0> is clearly misprinted.  But this
   pretty-printing issue could be reproduced without this patch if
   we define foo as a non-template function.  Since this testcase was
   added to verify pretty printing of TEMPLATE_ID_EXPR, I work around
   this test failure by making the call to foo type-dependent and thus
   immune to this ahead of time pruning.

 * We now reject parts of cpp0x/fntmp-equiv1.C because we notice that
   the call d(f, b) in

     template <unsigned long f, unsigned b, typename> e<d(f, b)> d();

   is always non-constexpr because the selected d isn't.  I tried fixing
   this by making it constexpr, but then the call to d from main becomes
   ambiguous.  So I settled on removing this part of the testcase.
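
For the last caveat, a minimal sketch of the well-formed shape may help.
It assumes a hypothetical constexpr callee d and a differently named
template make_e, neither of which comes from fntmp-equiv1.C.  With a
constexpr callee the call is usable as a template argument; with a
non-constexpr callee, as in the removed lines, no specialization could
ever be valid.

```cpp
#include <type_traits>

template <long> struct e {};

// Hypothetical constexpr callee; with a non-constexpr one, e<d(f, b)> could
// never be a valid template argument.
constexpr long d(long x, long y) { return x + y; }

// Named make_e rather than d to sidestep the ambiguity mentioned above.
template <unsigned long f, unsigned b> e<d(f, b)> make_e() { return {}; }

static_assert(std::is_same<decltype(make_e<1, 2>()), e<3>>::value,
              "d(1, 2) is evaluated at compile time");
```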

gcc/cp/ChangeLog:

	* call.c (build_new_method_call): For a non-dependent call
	expression inside a template, return a templated tree
	whose overload set contains just the selected function.
	* semantics.c (finish_call_expr): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/error2.C: Make the call to foo type-dependent in
	order to avoid latent pretty-printing issue for FUNCTION_DECL
	inside MODOP_EXPR.
	* g++.dg/cpp0x/fntmp-equiv1.C: Remove ill-formed parts of
	testcase that we now diagnose.
	* g++.dg/template/non-dependent16.C: New test.
	* g++.dg/template/non-dependent16a.C: New test.
	* g++.dg/template/non-dependent17.C: New test.
---
 gcc/cp/call.c                                 | 27 ++++++++++++++
 gcc/cp/semantics.c                            | 15 ++++++++
 gcc/testsuite/g++.dg/cpp0x/error2.C           |  4 +-
 gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C     |  4 --
 .../g++.dg/template/non-dependent16.C         | 37 +++++++++++++++++++
 .../g++.dg/template/non-dependent16a.C        | 36 ++++++++++++++++++
 .../g++.dg/template/non-dependent17.C         | 21 +++++++++++
 7 files changed, 138 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16.C
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent16a.C
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent17.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 495dcdd77b3..1fbfc580a1e 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -11163,6 +11163,33 @@ build_new_method_call (tree instance, tree fns, vec<tree, va_gc> **args,
 	}
       if (INDIRECT_REF_P (call))
 	call = TREE_OPERAND (call, 0);
+
+      /* Prune all but the selected function from the original overload
+	 set so that we can avoid some duplicate work at instantiation time.  */
+      if (really_overloaded_fn (fns))
+	{
+	  if (DECL_TEMPLATE_INFO (fn)
+	      && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
+	      && dependent_type_p (DECL_CONTEXT (fn)))
+	    {
+	      /* FIXME: We're not prepared to fully instantiate "inside-out"
+		 partial instantiations such as A<T>::f<int>().  So instead
+		 use the selected template, not the specialization.  */
+
+	      if (OVL_SINGLE_P (fns))
+		/* If the original overload set consists of a single function
+		   template, this isn't beneficial.  */
+		goto skip_prune;
+
+	      fn = ovl_make (DECL_TI_TEMPLATE (fn));
+	      if (template_only)
+		fn = lookup_template_function (fn, explicit_targs);
+	    }
+	  orig_fns = copy_node (orig_fns);
+	  BASELINK_FUNCTIONS (orig_fns) = fn;
+	}
+
+skip_prune:
       call = (build_min_non_dep_call_vec
 	      (call,
 	       build_min (COMPONENT_REF, TREE_TYPE (CALL_EXPR_FN (call)),
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 6603066c620..6ffd82cb0ec 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2893,6 +2893,21 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
     {
       if (INDIRECT_REF_P (result))
 	result = TREE_OPERAND (result, 0);
+
+      /* Prune all but the selected function from the original overload
+	 set so that we can avoid some duplicate work at instantiation time.  */
+      if (TREE_CODE (result) == CALL_EXPR
+	  && really_overloaded_fn (orig_fn))
+	{
+	  orig_fn = CALL_EXPR_FN (result);
+	  if (TREE_CODE (orig_fn) == COMPONENT_REF)
+	    {
+	      /* The non-dependent result of build_new_method_call.  */
+	      orig_fn = TREE_OPERAND (orig_fn, 1);
+	      gcc_assert (BASELINK_P (orig_fn));
+	    }
+	}
+
       result = build_call_vec (TREE_TYPE (result), orig_fn, orig_args);
       SET_EXPR_LOCATION (result, input_location);
       KOENIG_LOOKUP_P (result) = koenig_p;
diff --git a/gcc/testsuite/g++.dg/cpp0x/error2.C b/gcc/testsuite/g++.dg/cpp0x/error2.C
index e6af294c180..eb966362ccb 100644
--- a/gcc/testsuite/g++.dg/cpp0x/error2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/error2.C
@@ -3,7 +3,7 @@
 
 template<int> int foo();
 
-template<typename F> void bar(F f)
+template<typename F, int N> void bar(F f)
 {
-  f((foo<0>()=0)...); // { dg-error "pattern '\\(foo\\<0\\>\\)\\(\\)=0'" }
+  f((foo<N>()=0)...); // { dg-error "pattern '\\(foo\\<N\\>\\)\\(\\)=0'" }
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
index 833ae6fc85c..60ebad8d1d3 100644
--- a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
@@ -1,10 +1,7 @@
 // PR c++/86946, DR 1321
 // { dg-do compile { target c++11 } }
 
-int d(int, int);
 template <long> class e {};
-template <unsigned long f, unsigned b, typename> e<sizeof(d(f, b))> d();
-template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
 
 template <class T, class U> constexpr T d2(T, U) { return 42; }
 template <unsigned long f, unsigned b, typename> e<d2(f, b)> d2();
@@ -17,7 +14,6 @@ template <unsigned long f, unsigned b, typename> e<sizeof(d3(f, b))> d3();
 
 int main()
 {
-  d<1,2,int>();
   d2<1,2,int>();
   d3<1,2,int>();
 }
diff --git a/gcc/testsuite/g++.dg/template/non-dependent16.C b/gcc/testsuite/g++.dg/template/non-dependent16.C
new file mode 100644
index 00000000000..ee8ef902529
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent16.C
@@ -0,0 +1,37 @@
+// This test verifies that after resolving a non-dependent call expression
+// ahead of time, we prune all but the selected candidate from the overload
+// set.  Without this optimization, overload resolution for the final call to
+// f<void>() would be exponential in the size of the overload set.
+
+// { dg-do compile { target c++11 } }
+
+template<class T> void f();
+template<class T> auto f() -> decltype(f<void>(), 1, *T());
+template<class T> auto f() -> decltype(f<void>(), 2, *T());
+template<class T> auto f() -> decltype(f<void>(), 3, *T());
+template<class T> auto f() -> decltype(f<void>(), 4, *T());
+template<class T> auto f() -> decltype(f<void>(), 5, *T());
+template<class T> auto f() -> decltype(f<void>(), 6, *T());
+template<class T> auto f() -> decltype(f<void>(), 7, *T());
+template<class T> auto f() -> decltype(f<void>(), 8, *T());
+template<class T> auto f() -> decltype(f<void>(), 9, *T());
+template<class T> auto f() -> decltype(f<void>(), 10, *T());
+template<class T> auto f() -> decltype(f<void>(), 11, *T());
+template<class T> auto f() -> decltype(f<void>(), 12, *T());
+template<class T> auto f() -> decltype(f<void>(), 13, *T());
+template<class T> auto f() -> decltype(f<void>(), 14, *T());
+template<class T> auto f() -> decltype(f<void>(), 15, *T());
+template<class T> auto f() -> decltype(f<void>(), 16, *T());
+template<class T> auto f() -> decltype(f<void>(), 17, *T());
+template<class T> auto f() -> decltype(f<void>(), 18, *T());
+template<class T> auto f() -> decltype(f<void>(), 19, *T());
+template<class T> auto f() -> decltype(f<void>(), 20, *T());
+template<class T> auto f() -> decltype(f<void>(), 21, *T());
+template<class T> auto f() -> decltype(f<void>(), 22, *T());
+template<class T> auto f() -> decltype(f<void>(), 23, *T());
+template<class T> auto f() -> decltype(f<void>(), 24, *T());
+template<class T> auto f() -> decltype(f<void>(), 25, *T());
+
+int main() {
+  f<void>();
+}
diff --git a/gcc/testsuite/g++.dg/template/non-dependent16a.C b/gcc/testsuite/g++.dg/template/non-dependent16a.C
new file mode 100644
index 00000000000..0e04d646c0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent16a.C
@@ -0,0 +1,36 @@
+// Like non-dependent16.C, but using member functions.
+
+// { dg-do compile { target c++11 } }
+
+struct A {
+  template<class T> static void f();
+  template<class T> static auto f() -> decltype(f<void>(), 1, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 2, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 3, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 4, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 5, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 6, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 7, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 8, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 9, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 10, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 11, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 12, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 13, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 14, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 15, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 16, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 17, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 18, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 19, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 20, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 21, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 22, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 23, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 24, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 25, *T());
+};
+
+int main() {
+  A::f<void>();
+}
diff --git a/gcc/testsuite/g++.dg/template/non-dependent17.C b/gcc/testsuite/g++.dg/template/non-dependent17.C
new file mode 100644
index 00000000000..bc664999e84
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent17.C
@@ -0,0 +1,21 @@
+// A variant of deduce4.C with multiple overloads of foo.  Verify we don't
+// crash after ahead-of-time pruning of the overload set for the
+// non-dependent call to foo.
+// { dg-do compile }
+
+template <typename T>
+struct S {
+  template <typename U, typename V>
+  static void foo(V) { }
+  template <typename U>
+  static void foo(...) { }
+
+  void bar () { foo<int>(10); }
+};
+
+void
+test ()
+{
+  S<int> s;
+  s.bar ();
+}
  
Jason Merrill Dec. 16, 2021, 9:12 p.m. UTC | #4
On 12/16/21 14:53, Patrick Palka wrote:
> On Thu, 16 Dec 2021, Patrick Palka wrote:
> 
>> On Wed, 15 Dec 2021, Jason Merrill wrote:
>>
>>> On 12/15/21 12:49, Patrick Palka wrote:
>>>> This patch makes us remember the function selected by overload
>>>> resolution during ahead of time processing of a non-dependent call
>>>> expression, so that we avoid repeating most of the work of overload
>>>> resolution at instantiation time.  This mirrors what we already do for
>>>> non-dependent operator expressions via build_min_non_dep_op_overload.
>>>>
>>>> Some caveats:
>>>>
>>>>    * When processing ahead of time a non-dependent call to a member
>>>>      function template inside a class template (as in
>>>>      g++.dg/template/deduce4.C), we end up generating an "inverted" partial
>>>>      instantiation such as S<T>::foo<int, int>(), the kinds of which we're
>>>>      apparently not prepared to fully instantiate (e.g. tsubst_baselink
>>>>      mishandles it).  So this patch disables this optimization for such
>>>>      functions and adds a FIXME.
>>>
>>> I wonder if it would be worthwhile to build a TEMPLATE_ID_EXPR to remember the
>>> deduced template args, even if we are failing to remember the actual function?
>>
>> Hmm, that transformation could have observable effects, since overload
>> resolution for f<int>(0) might end up instantiating more things than for
>> f(0) due to the explicit-args substitution step:
>>
>>    template<class T> struct A { using type = typename T::type; };
>>
>>    template<class T> void f(T);
>>    template<class T, class U = typename T::type> typename A<T>::type f(T);
>>
>> Here overload resolution for f(0) succeeds and selects the first
>> overload but for f<int>(0) induces a hard error.  Also I worry that such
>> a transformation might affect declaration matching in weird ways due
>> to conflating f(0) with f<int>(0).
>>
>> We could, though, at least prune the overload set to the corresponding
>> selected function template rather than the "inside-out" specialization;
>> I'll try to implement that.
> 
> Here's a bootstrapped and regtested patch which implements that:

OK.

  

Patch

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 53a391cbc6b..92d96c19f5c 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -11165,6 +11165,23 @@  build_new_method_call (tree instance, tree fns, vec<tree, va_gc> **args,
 	}
       if (INDIRECT_REF_P (call))
 	call = TREE_OPERAND (call, 0);
+
+      /* Prune all but the selected function from the original overload
+	 set so that we can avoid some duplicate work at instantiation time.  */
+      if (really_overloaded_fn (fns))
+	{
+	  if (DECL_TEMPLATE_INFO (fn)
+	      && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
+	      && dependent_type_p (DECL_CONTEXT (fn)))
+	    /* FIXME: We're not prepared to fully instantiate "inverted"
+	       partial instantiations such as A<T>::f<int>().  */;
+	  else
+	    {
+	      orig_fns = copy_node (orig_fns);
+	      BASELINK_FUNCTIONS (orig_fns) = fn;
+	    }
+	}
+
       call = (build_min_non_dep_call_vec
 	      (call,
 	       build_min (COMPONENT_REF, TREE_TYPE (CALL_EXPR_FN (call)),
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2340139b238..b114114e617 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1566,18 +1566,10 @@  register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
 		  && TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK));
 
   if (TREE_CODE (spec) == FUNCTION_DECL
-      && uses_template_parms (DECL_TI_ARGS (spec)))
+      && uses_template_parms (INNERMOST_TEMPLATE_ARGS (DECL_TI_ARGS (spec))))
     /* This is the FUNCTION_DECL for a partial instantiation.  Don't
-       register it; we want the corresponding TEMPLATE_DECL instead.
-       We use `uses_template_parms (DECL_TI_ARGS (spec))' rather than
-       the more obvious `uses_template_parms (spec)' to avoid problems
-       with default function arguments.  In particular, given
-       something like this:
-
-	  template <class T> void f(T t1, T t = T())
-
-       the default argument expression is not substituted for in an
-       instantiation unless and until it is actually needed.  */
+       register it; we want to register the corresponding TEMPLATE_DECL
+       instead.  */
     return spec;
 
   if (optimize_specialization_lookup_p (tmpl))
@@ -13960,9 +13952,7 @@  tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
 
       /* Calculate the complete set of arguments used to
 	 specialize R.  */
-      argvec = tsubst_template_args (DECL_TI_ARGS
-				     (DECL_TEMPLATE_RESULT
-				      (DECL_TI_TEMPLATE (t))),
+      argvec = tsubst_template_args (DECL_TI_ARGS (t),
 				     args, complain, in_decl);
       if (argvec == error_mark_node)
 	return error_mark_node;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 7078af03d3c..57f689042b9 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2893,6 +2893,21 @@  finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
     {
       if (INDIRECT_REF_P (result))
 	result = TREE_OPERAND (result, 0);
+
+      /* Prune all but the selected function from the original overload
+	 set so that we can avoid some duplicate work at instantiation time.  */
+      if (TREE_CODE (result) == CALL_EXPR
+	  && really_overloaded_fn (orig_fn))
+	{
+	  orig_fn = CALL_EXPR_FN (result);
+	  if (TREE_CODE (orig_fn) == COMPONENT_REF)
+	    {
+	      /* The result of build_new_method_call.  */
+	      orig_fn = TREE_OPERAND (orig_fn, 1);
+	      gcc_assert (BASELINK_P (orig_fn));
+	    }
+	}
+
       result = build_call_vec (TREE_TYPE (result), orig_fn, orig_args);
       SET_EXPR_LOCATION (result, input_location);
       KOENIG_LOOKUP_P (result) = koenig_p;
diff --git a/gcc/testsuite/g++.dg/cpp0x/error2.C b/gcc/testsuite/g++.dg/cpp0x/error2.C
index e6af294c180..eb966362ccb 100644
--- a/gcc/testsuite/g++.dg/cpp0x/error2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/error2.C
@@ -3,7 +3,7 @@ 
 
 template<int> int foo();
 
-template<typename F> void bar(F f)
+template<typename F, int N> void bar(F f)
 {
-  f((foo<0>()=0)...); // { dg-error "pattern '\\(foo\\<0\\>\\)\\(\\)=0'" }
+  f((foo<N>()=0)...); // { dg-error "pattern '\\(foo\\<N\\>\\)\\(\\)=0'" }
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
index 833ae6fc85c..60ebad8d1d3 100644
--- a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C
@@ -1,10 +1,7 @@ 
 // PR c++/86946, DR 1321
 // { dg-do compile { target c++11 } }
 
-int d(int, int);
 template <long> class e {};
-template <unsigned long f, unsigned b, typename> e<sizeof(d(f, b))> d();
-template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
 
 template <class T, class U> constexpr T d2(T, U) { return 42; }
 template <unsigned long f, unsigned b, typename> e<d2(f, b)> d2();
@@ -17,7 +14,6 @@  template <unsigned long f, unsigned b, typename> e<sizeof(d3(f, b))> d3();
 
 int main()
 {
-  d<1,2,int>();
   d2<1,2,int>();
   d3<1,2,int>();
 }
diff --git a/gcc/testsuite/g++.dg/template/non-dependent16.C b/gcc/testsuite/g++.dg/template/non-dependent16.C
new file mode 100644
index 00000000000..ee8ef902529
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent16.C
@@ -0,0 +1,37 @@ 
+// This test verifies that after resolving a non-dependent call expression
+// ahead of time, we prune all but the selected candidate from the overload
+// set.  Without this optimization, overload resolution for the final call to
+// f<void>() would be exponential in the size of the overload set.
+
+// { dg-do compile { target c++11 } }
+
+template<class T> void f();
+template<class T> auto f() -> decltype(f<void>(), 1, *T());
+template<class T> auto f() -> decltype(f<void>(), 2, *T());
+template<class T> auto f() -> decltype(f<void>(), 3, *T());
+template<class T> auto f() -> decltype(f<void>(), 4, *T());
+template<class T> auto f() -> decltype(f<void>(), 5, *T());
+template<class T> auto f() -> decltype(f<void>(), 6, *T());
+template<class T> auto f() -> decltype(f<void>(), 7, *T());
+template<class T> auto f() -> decltype(f<void>(), 8, *T());
+template<class T> auto f() -> decltype(f<void>(), 9, *T());
+template<class T> auto f() -> decltype(f<void>(), 10, *T());
+template<class T> auto f() -> decltype(f<void>(), 11, *T());
+template<class T> auto f() -> decltype(f<void>(), 12, *T());
+template<class T> auto f() -> decltype(f<void>(), 13, *T());
+template<class T> auto f() -> decltype(f<void>(), 14, *T());
+template<class T> auto f() -> decltype(f<void>(), 15, *T());
+template<class T> auto f() -> decltype(f<void>(), 16, *T());
+template<class T> auto f() -> decltype(f<void>(), 17, *T());
+template<class T> auto f() -> decltype(f<void>(), 18, *T());
+template<class T> auto f() -> decltype(f<void>(), 19, *T());
+template<class T> auto f() -> decltype(f<void>(), 20, *T());
+template<class T> auto f() -> decltype(f<void>(), 21, *T());
+template<class T> auto f() -> decltype(f<void>(), 22, *T());
+template<class T> auto f() -> decltype(f<void>(), 23, *T());
+template<class T> auto f() -> decltype(f<void>(), 24, *T());
+template<class T> auto f() -> decltype(f<void>(), 25, *T());
+
+int main() {
+  f<void>();
+}
diff --git a/gcc/testsuite/g++.dg/template/non-dependent16a.C b/gcc/testsuite/g++.dg/template/non-dependent16a.C
new file mode 100644
index 00000000000..0e04d646c0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent16a.C
@@ -0,0 +1,36 @@ 
+// Like non-dependent16.C, but using member functions.
+
+// { dg-do compile { target c++11 } }
+
+struct A {
+  template<class T> static void f();
+  template<class T> static auto f() -> decltype(f<void>(), 1, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 2, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 3, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 4, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 5, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 6, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 7, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 8, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 9, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 10, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 11, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 12, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 13, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 14, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 15, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 16, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 17, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 18, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 19, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 20, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 21, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 22, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 23, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 24, *T());
+  template<class T> static auto f() -> decltype(f<void>(), 25, *T());
+};
+
+int main() {
+  A::f<void>();
+}