[2/1] c++: optimize specialization of templated member functions

Message ID 20220609130013.250243-1-ppalka@redhat.com
State New
Series c++: optimize specialization of nested class templates

Commit Message

Patrick Palka June 9, 2022, 1 p.m. UTC
This applies one of the optimizations that the previous patch added
to lookup_template_class to instantiate_template as well.
(For the libstdc++ ranges tests this optimization appears to be
effective around 30% of the time, i.e. 30% of the time the context of
'tmpl' is non-dependent while the context of 'gen_tmpl' is dependent.)
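
To illustrate the situation described above, here is a minimal sketch of user code that would, as far as I understand it, reach this path. The names A, f and demo are hypothetical and are not taken from the patch or from the libstdc++ tests; the comments describe my reading of how 'tmpl' and 'gen_tmpl' relate, not verified compiler output.

```cpp
#include <cassert>

// Hypothetical example: a member function template of a class template.
template<class T>
struct A {
  template<class U>
  static int f(U u) { return static_cast<int>(sizeof(T) + sizeof(u)); }
};

int demo() {
  // Instantiating A<char> partially instantiates f, so the template being
  // specialized here ('tmpl') lives in the non-dependent context A<char>.
  // The most general template ('gen_tmpl') still lives in the dependent
  // context A<T>.  Specializing A<char>::f<char> is therefore a case where
  // the existing context can presumably be reused instead of substituting
  // the arguments into A<T> again.
  return A<char>::f('x');
}
```

Calling demo() forces the specialization of A<char>::f<char>. The case where the shortcut would not apply is a reference to f from inside another template, where the enclosing class is still dependent.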

gcc/cp/ChangeLog:

	* pt.cc (instantiate_template): Don't substitute the context
	of the most general template if that of the partially
	instantiated template is non-dependent.
---
 gcc/cp/pt.cc | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
  

Comments

Jason Merrill June 9, 2022, 3:54 p.m. UTC | #1
On 6/9/22 09:00, Patrick Palka wrote:
> This performs one of the optimizations added by the previous
> patch to lookup_template_class, to instantiate_template as well.
> (For the libstdc++ ranges tests this optimization appears to be
> effective around 30% of the time, i.e. 30% of the time context of 'tmpl'
> is non-dependent while the context of 'gen_tmpl' is dependent.)

If this is a significant optimization, how about doing it in 
tsubst_aggr_type rather than its callers?

  
Patrick Palka June 9, 2022, 7:37 p.m. UTC | #2
On Thu, 9 Jun 2022, Jason Merrill wrote:

> On 6/9/22 09:00, Patrick Palka wrote:
> > This performs one of the optimizations added by the previous
> > patch to lookup_template_class, to instantiate_template as well.
> > (For the libstdc++ ranges tests this optimization appears to be
> > effective around 30% of the time, i.e. 30% of the time context of 'tmpl'
> > is non-dependent while the context of 'gen_tmpl' is dependent.)
> 
> If this is a significant optimization, how about doing it in tsubst_aggr_type
> rather than its callers?

I'm not sure how we'd do this optimization in tsubst_aggr_type?

I haven't observed any significant time or memory improvement in my
limited benchmarking, but I can imagine it being significant for deeply
nested templates.  Avoiding redundant work should hopefully also make
debugging more straightforward.

  
Jason Merrill June 10, 2022, 4:23 p.m. UTC | #3
On 6/9/22 15:37, Patrick Palka wrote:
> On Thu, 9 Jun 2022, Jason Merrill wrote:
> 
>> On 6/9/22 09:00, Patrick Palka wrote:
>>> This performs one of the optimizations added by the previous
>>> patch to lookup_template_class, to instantiate_template as well.
>>> (For the libstdc++ ranges tests this optimization appears to be
>>> effective around 30% of the time, i.e. 30% of the time context of 'tmpl'
>>> is non-dependent while the context of 'gen_tmpl' is dependent.)
>>
>> If this is a significant optimization, how about doing it in tsubst_aggr_type
>> rather than its callers?
> 
> I'm not sure how we'd do this optimization in tsubst_aggr_type?

Oops, I was overlooking the gen_tmpl vs. tmpl difference.

> I haven't observed any significant time/memory improvements based on my
> limited benchmarking, but I can imagine for deeply nested templates it
> could be significant.  And avoiding redundant work should hopefully help
> streamline debugging I suppose.

OK.

  

Patch

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index e021c254872..208daad298a 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21661,8 +21661,14 @@  instantiate_template (tree tmpl, tree orig_args, tsubst_flags_t complain)
     ++processing_template_decl;
   if (DECL_CLASS_SCOPE_P (gen_tmpl))
     {
-      tree ctx = tsubst_aggr_type (DECL_CONTEXT (gen_tmpl), targ_ptr,
-				   complain, gen_tmpl, true);
+      tree ctx;
+      if (!uses_template_parms (DECL_CONTEXT (tmpl)))
+	/* If the context of the partially instantiated template is already
+	   non-dependent, then we might as well use it.  */
+	ctx = DECL_CONTEXT (tmpl);
+      else
+	ctx = tsubst_aggr_type (DECL_CONTEXT (gen_tmpl), targ_ptr,
+				complain, gen_tmpl, true);
       push_nested_class (ctx);
     }
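
The control flow of the hunk above can be modelled outside the compiler. The following is only a toy sketch under loud assumptions: Context, substitute_context, choose_context and substitution_count are hypothetical stand-ins for GCC's tree nodes, tsubst_aggr_type, the patched branch and a work counter. None of them exist in pt.cc.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a class context; not GCC's real tree representation.
struct Context {
  std::string name;
  bool dependent;  // plays the role of uses_template_parms (ctx)
};

// Counts how often the expensive substitution path runs.
static int substitution_count = 0;

// Plays the role of tsubst_aggr_type: rebuilds the context from the most
// general template's context and the full argument list.
Context substitute_context(const Context& general, const std::string& args) {
  ++substitution_count;
  std::string base = general.name.substr(0, general.name.find('<'));
  return Context{base + "<" + args + ">", false};
}

// Mirrors the shape of the patched branch: reuse the partially instantiated
// template's context when it is already non-dependent, otherwise substitute.
Context choose_context(const Context& partial, const Context& general,
                       const std::string& args) {
  if (!partial.dependent)
    return partial;
  return substitute_context(general, args);
}
```

In this model substitute_context only ever sees the general context and the arguments, so it has no way of knowing that an equivalent non-dependent context already exists on the partially instantiated template. That seems to be the gen_tmpl vs. tmpl difference mentioned in the review, and why the check sits in the caller.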