[2/4] c++/modules: Track module purview for deferred instantiations [PR114630]

Message ID 663212bb.050a0220.9a96f.94c6@mx.google.com
State New
Headers
Series c++/modules: Fix missed GMF discarding |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Testing passed

Commit Message

Nathaniel Shead May 1, 2024, 10 a.m. UTC
  Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

When calling instantiate_pending_templates at end of parsing, any new
functions that are instantiated from this point have their module
purview set based on the current value of module_kind.

This is unideal, however, as the modules code will then treat these
instantiations as reachable and cause large swathes of the GMF to be
emitted into the module CMI, despite no code in the actual module
purview referencing it.

This patch fixes this by also remembering the value of module_kind when
the instantiation was deferred, and restoring it when doing this
deferred instantiation.  That way newly instantiated declarations
appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
instantiation was required, meaning that GMF entities won't be counted
as reachable unless referenced by an actually reachable entity.

Note that purviewness and attachment etc. is generally only determined
by the base template: this is purely for determining whether a
specialisation was declared in the module purview and hence whether it
should be streamed out.  See the comment on 'set_instantiating_module'.

	PR c++/114630
	PR c++/114795

gcc/cp/ChangeLog:

	* cp-tree.h (struct tinst_level): Add field for tracking
	module_kind.
	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
	(reopen_tinst_level): Restore module_kind from level.
	(instantiate_pending_templates): Save and restore module_kind so
	it isn't affected by reopen_tinst_level.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/gmf-3.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
---
 gcc/cp/cp-tree.h                     |  3 +++
 gcc/cp/pt.cc                         |  4 ++++
 gcc/testsuite/g++.dg/modules/gmf-3.C | 13 +++++++++++++
 3 files changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/gmf-3.C
  

Comments

Patrick Palka May 1, 2024, 2:11 p.m. UTC | #1
On Wed, 1 May 2024, Nathaniel Shead wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> When calling instantiate_pending_templates at end of parsing, any new
> functions that are instantiated from this point have their module
> purview set based on the current value of module_kind.
> 
> This is unideal, however, as the modules code will then treat these
> instantiations as reachable and cause large swathes of the GMF to be
> emitted into the module CMI, despite no code in the actual module
> purview referencing it.
> 
> This patch fixes this by also remembering the value of module_kind when
> the instantiation was deferred, and restoring it when doing this
> deferred instantiation.  That way newly instantiated declarations
> appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
> instantiation was required, meaning that GMF entities won't be counted
> as reachable unless referenced by an actually reachable entity.
> 
> Note that purviewness and attachment etc. is generally only determined
> by the base template: this is purely for determining whether a
> specialisation was declared in the module purview and hence whether it
> should be streamed out.  See the comment on 'set_instantiating_module'.
> 
> 	PR c++/114630
> 	PR c++/114795
> 
> gcc/cp/ChangeLog:
> 
> 	* cp-tree.h (struct tinst_level): Add field for tracking
> 	module_kind.
> 	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
> 	(reopen_tinst_level): Restore module_kind from level.
> 	(instantiate_pending_templates): Save and restore module_kind so
> 	it isn't affected by reopen_tinst_level.

LGTM.  Another approach is to instantiate all so-far deferred
instantiations and vtables once we reach the start of the module
purview, but your approach is much cleaner.

Note that deferred instantiation can induce more deferred instantiation,
but your approach will do the right thing here (module_kind will be
inherited from the point of instantiation).

What if an instantiation is needed from both the GMF and the module
purview?  Then IIUC it'll be instantiated as if from the GMF, which
seems right to me.

> 
> gcc/testsuite/ChangeLog:
> 
> 	* g++.dg/modules/gmf-3.C: New test.
> 
> Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
> ---
>  gcc/cp/cp-tree.h                     |  3 +++
>  gcc/cp/pt.cc                         |  4 ++++
>  gcc/testsuite/g++.dg/modules/gmf-3.C | 13 +++++++++++++
>  3 files changed, 20 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/modules/gmf-3.C
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 1938ada0268..0e619120ccc 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -6626,6 +6626,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
>    /* The location where the template is instantiated.  */
>    location_t locus;
>  
> +  /* The module kind where the template is instantiated. */
> +  unsigned module_kind;
> +
>    /* errorcount + sorrycount when we pushed this level.  */
>    unsigned short errors;
>  
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 1c3eef60c06..401aa92bc3e 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -11277,6 +11277,7 @@ push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
>    new_level->tldcl = tldcl;
>    new_level->targs = targs;
>    new_level->locus = loc;
> +  new_level->module_kind = module_kind;
>    new_level->errors = errorcount + sorrycount;
>    new_level->next = NULL;
>    new_level->refcount = 0;
> @@ -11345,6 +11346,7 @@ reopen_tinst_level (struct tinst_level *level)
>    for (t = level; t; t = t->next)
>      ++tinst_depth;
>  
> +  module_kind = level->module_kind;
>    set_refcount_ptr (current_tinst_level, level);
>    pop_tinst_level ();
>    if (current_tinst_level)
> @@ -27442,6 +27444,7 @@ instantiate_pending_templates (int retries)
>  {
>    int reconsider;
>    location_t saved_loc = input_location;
> +  unsigned saved_module_kind = module_kind;
>  
>    /* Instantiating templates may trigger vtable generation.  This in turn
>       may require further template instantiations.  We place a limit here
> @@ -27532,6 +27535,7 @@ instantiate_pending_templates (int retries)
>    while (reconsider);
>  
>    input_location = saved_loc;
> +  module_kind = saved_module_kind;
>  }
>  
>  /* Substitute ARGVEC into T, which is a list of initializers for
> diff --git a/gcc/testsuite/g++.dg/modules/gmf-3.C b/gcc/testsuite/g++.dg/modules/gmf-3.C
> new file mode 100644
> index 00000000000..e52ae904ea9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/gmf-3.C
> @@ -0,0 +1,13 @@
> +// PR c++/114630
> +// { dg-additional-options "-fmodules-ts -Wno-global-module -fdump-lang-module" }
> +// { dg-module-cmi M }
> +
> +module;
> +template <typename> struct allocator {
> +  allocator() {}
> +};
> +template class allocator<wchar_t>;
> +export module M;
> +
> +// The whole GMF should be discarded here
> +// { dg-final { scan-lang-dump "Wrote 0 clusters" module } }
> -- 
> 2.43.2
> 
>
  
Nathaniel Shead May 1, 2024, 2:27 p.m. UTC | #2
On Wed, May 01, 2024 at 10:11:20AM -0400, Patrick Palka wrote:
> On Wed, 1 May 2024, Nathaniel Shead wrote:
> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > When calling instantiate_pending_templates at end of parsing, any new
> > functions that are instantiated from this point have their module
> > purview set based on the current value of module_kind.
> > 
> > This is unideal, however, as the modules code will then treat these
> > instantiations as reachable and cause large swathes of the GMF to be
> > emitted into the module CMI, despite no code in the actual module
> > purview referencing it.
> > 
> > This patch fixes this by also remembering the value of module_kind when
> > the instantiation was deferred, and restoring it when doing this
> > deferred instantiation.  That way newly instantiated declarations
> > appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
> > instantiation was required, meaning that GMF entities won't be counted
> > as reachable unless referenced by an actually reachable entity.
> > 
> > Note that purviewness and attachment etc. is generally only determined
> > by the base template: this is purely for determining whether a
> > specialisation was declared in the module purview and hence whether it
> > should be streamed out.  See the comment on 'set_instantiating_module'.
> > 
> > 	PR c++/114630
> > 	PR c++/114795
> > 
> > gcc/cp/ChangeLog:
> > 
> > 	* cp-tree.h (struct tinst_level): Add field for tracking
> > 	module_kind.
> > 	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
> > 	(reopen_tinst_level): Restore module_kind from level.
> > 	(instantiate_pending_templates): Save and restore module_kind so
> > 	it isn't affected by reopen_tinst_level.
> 
> LGTM.  Another approach is to instantiate all so-far deferred
> instantiations and vtables once we reach the start of the module
> purview, but your approach is much cleaner.
> 
> Note that deferred instantiation can induce more deferred instantiation,
> but your approach will do the right thing here (module_kind will be
> inherited from the point of instantiation).
> 
> What if an instantiation is needed from both the GMF and the module
> purview?  Then IIUC it'll be instantiated as if from the GMF, which
> seems right to me.
> 

Hm..., so I believe it'll be marked according to whichever instantiates
it "first", so if e.g. deferred from GMF and then instantiated
non-deferred from purview it'll be marked purview (and the deferred
instantiation will be skipped as it's already instantiated).  This could
mean it gets unnecessarily emitted if it only got instantiated because
it got used in e.g. a non-inline function body, I suppose; but that's
true in general actually, at the moment.

Maybe then a better approach would be to instead always use the
DECL_MODULE_PURVIEW_P of the instantiating template instead of the
global 'module_purview_p' function in 'set_instantiating_module'?
I think that should still make sure that instantiations are emitted
when they need to be (because they'll be reachable from a declaration
in the purview): I might try to experiment with something like this.

> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 	* g++.dg/modules/gmf-3.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
> > ---
> >  gcc/cp/cp-tree.h                     |  3 +++
> >  gcc/cp/pt.cc                         |  4 ++++
> >  gcc/testsuite/g++.dg/modules/gmf-3.C | 13 +++++++++++++
> >  3 files changed, 20 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/modules/gmf-3.C
> > 
> > diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> > index 1938ada0268..0e619120ccc 100644
> > --- a/gcc/cp/cp-tree.h
> > +++ b/gcc/cp/cp-tree.h
> > @@ -6626,6 +6626,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
> >    /* The location where the template is instantiated.  */
> >    location_t locus;
> >  
> > +  /* The module kind where the template is instantiated. */
> > +  unsigned module_kind;
> > +
> >    /* errorcount + sorrycount when we pushed this level.  */
> >    unsigned short errors;
> >  
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 1c3eef60c06..401aa92bc3e 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -11277,6 +11277,7 @@ push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
> >    new_level->tldcl = tldcl;
> >    new_level->targs = targs;
> >    new_level->locus = loc;
> > +  new_level->module_kind = module_kind;
> >    new_level->errors = errorcount + sorrycount;
> >    new_level->next = NULL;
> >    new_level->refcount = 0;
> > @@ -11345,6 +11346,7 @@ reopen_tinst_level (struct tinst_level *level)
> >    for (t = level; t; t = t->next)
> >      ++tinst_depth;
> >  
> > +  module_kind = level->module_kind;
> >    set_refcount_ptr (current_tinst_level, level);
> >    pop_tinst_level ();
> >    if (current_tinst_level)
> > @@ -27442,6 +27444,7 @@ instantiate_pending_templates (int retries)
> >  {
> >    int reconsider;
> >    location_t saved_loc = input_location;
> > +  unsigned saved_module_kind = module_kind;
> >  
> >    /* Instantiating templates may trigger vtable generation.  This in turn
> >       may require further template instantiations.  We place a limit here
> > @@ -27532,6 +27535,7 @@ instantiate_pending_templates (int retries)
> >    while (reconsider);
> >  
> >    input_location = saved_loc;
> > +  module_kind = saved_module_kind;
> >  }
> >  
> >  /* Substitute ARGVEC into T, which is a list of initializers for
> > diff --git a/gcc/testsuite/g++.dg/modules/gmf-3.C b/gcc/testsuite/g++.dg/modules/gmf-3.C
> > new file mode 100644
> > index 00000000000..e52ae904ea9
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/modules/gmf-3.C
> > @@ -0,0 +1,13 @@
> > +// PR c++/114630
> > +// { dg-additional-options "-fmodules-ts -Wno-global-module -fdump-lang-module" }
> > +// { dg-module-cmi M }
> > +
> > +module;
> > +template <typename> struct allocator {
> > +  allocator() {}
> > +};
> > +template class allocator<wchar_t>;
> > +export module M;
> > +
> > +// The whole GMF should be discarded here
> > +// { dg-final { scan-lang-dump "Wrote 0 clusters" module } }
> > -- 
> > 2.43.2
> > 
> > 
>
  
Jason Merrill May 1, 2024, 3:27 p.m. UTC | #3
On 5/1/24 07:11, Patrick Palka wrote:
> On Wed, 1 May 2024, Nathaniel Shead wrote:
> 
>> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
>>
>> -- >8 --
>>
>> When calling instantiate_pending_templates at end of parsing, any new
>> functions that are instantiated from this point have their module
>> purview set based on the current value of module_kind.
>>
>> This is unideal, however, as the modules code will then treat these
>> instantiations as reachable and cause large swathes of the GMF to be
>> emitted into the module CMI, despite no code in the actual module
>> purview referencing it.
>>
>> This patch fixes this by also remembering the value of module_kind when
>> the instantiation was deferred, and restoring it when doing this
>> deferred instantiation.  That way newly instantiated declarations
>> appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
>> instantiation was required, meaning that GMF entities won't be counted
>> as reachable unless referenced by an actually reachable entity.
>>
>> Note that purviewness and attachment etc. is generally only determined
>> by the base template: this is purely for determining whether a
>> specialisation was declared in the module purview and hence whether it
>> should be streamed out.  See the comment on 'set_instantiating_module'.
>>
>> 	PR c++/114630
>> 	PR c++/114795
>>
>> gcc/cp/ChangeLog:
>>
>> 	* cp-tree.h (struct tinst_level): Add field for tracking
>> 	module_kind.
>> 	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
>> 	(reopen_tinst_level): Restore module_kind from level.
>> 	(instantiate_pending_templates): Save and restore module_kind so
>> 	it isn't affected by reopen_tinst_level.
> 
> LGTM.  Another approach is to instantiate all so-far deferred
> instantiations and vtables once we reach the start of the module
> purview, but your approach is much cleaner.

Hmm, I actually think I like the approach of instantiating at that point 
better, so that instantiations in the GMF can't depend on things in 
module purview.

And probably do the same again when switching to the private module 
fragment, when that's implemented.

Jason
  
Patrick Palka May 1, 2024, 3:54 p.m. UTC | #4
On Thu, 2 May 2024, Nathaniel Shead wrote:

> On Wed, May 01, 2024 at 10:11:20AM -0400, Patrick Palka wrote:
> > On Wed, 1 May 2024, Nathaniel Shead wrote:
> > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > 
> > > -- >8 --
> > > 
> > > When calling instantiate_pending_templates at end of parsing, any new
> > > functions that are instantiated from this point have their module
> > > purview set based on the current value of module_kind.
> > > 
> > > This is unideal, however, as the modules code will then treat these
> > > instantiations as reachable and cause large swathes of the GMF to be
> > > emitted into the module CMI, despite no code in the actual module
> > > purview referencing it.
> > > 
> > > This patch fixes this by also remembering the value of module_kind when
> > > the instantiation was deferred, and restoring it when doing this
> > > deferred instantiation.  That way newly instantiated declarations
> > > appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
> > > instantiation was required, meaning that GMF entities won't be counted
> > > as reachable unless referenced by an actually reachable entity.
> > > 
> > > Note that purviewness and attachment etc. is generally only determined
> > > by the base template: this is purely for determining whether a
> > > specialisation was declared in the module purview and hence whether it
> > > should be streamed out.  See the comment on 'set_instantiating_module'.
> > > 
> > > 	PR c++/114630
> > > 	PR c++/114795
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > > 	* cp-tree.h (struct tinst_level): Add field for tracking
> > > 	module_kind.
> > > 	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
> > > 	(reopen_tinst_level): Restore module_kind from level.
> > > 	(instantiate_pending_templates): Save and restore module_kind so
> > > 	it isn't affected by reopen_tinst_level.
> > 
> > LGTM.  Another approach is to instantiate all so-far deferred
> > instantiations and vtables once we reach the start of the module
> > purview, but your approach is much cleaner.
> > 
> > Note that deferred instantiation can induce more deferred instantiation,
> > but your approach will do the right thing here (module_kind will be
> > inherited from the point of instantiation).
> > 
> > What if an instantiation is needed from both the GMF and the module
> > purview?  Then IIUC it'll be instantiated as if from the GMF, which
> > seems right to me.
> > 
> 
> Hm..., so I believe it'll be marked according to whichever instantiates
> it "first", so if e.g. deferred from GMF and then instantiated
> non-deferred from purview it'll be marked purview (and the deferred
> instantiation will be skipped as it's already instantiated).  This could
> mean it gets unnecessarily emitted if it only got instantiated because
> it got used in e.g. a non-inline function body, I suppose; but that's
> true in general actually, at the moment.

FWIW I think the most relevant situation where we need to instantiate
immediately and can't defer until EOF is during constexpr evaluation of
a variable initializer.  (We also immediately instantiate variable
template specializations, and function template specializations with
deduced return type, but in these cases we'd always instantiate from the
first use, i.e. from the GMF in this scenario.)

So I guess a testcase for this situation could be something like:

    module;
    template<class T> constexpr int f(T t) { return t; }
    template int f(int);

    export module foo;
    constexpr int w = f(0);
    // f<int> ideally shouldn't be emitted in the CMI?

> 
> Maybe then a better approach would be to instead always use the
> DECL_MODULE_PURVIEW_P of the instantiating template instead of the
> global 'module_purview_p' function in 'set_instantiating_module'?
> I think that should still make sure that instantiations are emitted
> when they need to be (because they'll be reachable from a declaration
> in the purview): I might try to experiment with something like this.

The approach of instantiating all so-far deferred instantiations/vtables
at the start of the module purview should handle this properly too.

> 
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > > 	* g++.dg/modules/gmf-3.C: New test.
> > > 
> > > Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
> > > ---
> > >  gcc/cp/cp-tree.h                     |  3 +++
> > >  gcc/cp/pt.cc                         |  4 ++++
> > >  gcc/testsuite/g++.dg/modules/gmf-3.C | 13 +++++++++++++
> > >  3 files changed, 20 insertions(+)
> > >  create mode 100644 gcc/testsuite/g++.dg/modules/gmf-3.C
> > > 
> > > diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> > > index 1938ada0268..0e619120ccc 100644
> > > --- a/gcc/cp/cp-tree.h
> > > +++ b/gcc/cp/cp-tree.h
> > > @@ -6626,6 +6626,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
> > >    /* The location where the template is instantiated.  */
> > >    location_t locus;
> > >  
> > > +  /* The module kind where the template is instantiated. */
> > > +  unsigned module_kind;
> > > +
> > >    /* errorcount + sorrycount when we pushed this level.  */
> > >    unsigned short errors;
> > >  
> > > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > > index 1c3eef60c06..401aa92bc3e 100644
> > > --- a/gcc/cp/pt.cc
> > > +++ b/gcc/cp/pt.cc
> > > @@ -11277,6 +11277,7 @@ push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
> > >    new_level->tldcl = tldcl;
> > >    new_level->targs = targs;
> > >    new_level->locus = loc;
> > > +  new_level->module_kind = module_kind;
> > >    new_level->errors = errorcount + sorrycount;
> > >    new_level->next = NULL;
> > >    new_level->refcount = 0;
> > > @@ -11345,6 +11346,7 @@ reopen_tinst_level (struct tinst_level *level)
> > >    for (t = level; t; t = t->next)
> > >      ++tinst_depth;
> > >  
> > > +  module_kind = level->module_kind;
> > >    set_refcount_ptr (current_tinst_level, level);
> > >    pop_tinst_level ();
> > >    if (current_tinst_level)
> > > @@ -27442,6 +27444,7 @@ instantiate_pending_templates (int retries)
> > >  {
> > >    int reconsider;
> > >    location_t saved_loc = input_location;
> > > +  unsigned saved_module_kind = module_kind;
> > >  
> > >    /* Instantiating templates may trigger vtable generation.  This in turn
> > >       may require further template instantiations.  We place a limit here
> > > @@ -27532,6 +27535,7 @@ instantiate_pending_templates (int retries)
> > >    while (reconsider);
> > >  
> > >    input_location = saved_loc;
> > > +  module_kind = saved_module_kind;
> > >  }
> > >  
> > >  /* Substitute ARGVEC into T, which is a list of initializers for
> > > diff --git a/gcc/testsuite/g++.dg/modules/gmf-3.C b/gcc/testsuite/g++.dg/modules/gmf-3.C
> > > new file mode 100644
> > > index 00000000000..e52ae904ea9
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/modules/gmf-3.C
> > > @@ -0,0 +1,13 @@
> > > +// PR c++/114630
> > > +// { dg-additional-options "-fmodules-ts -Wno-global-module -fdump-lang-module" }
> > > +// { dg-module-cmi M }
> > > +
> > > +module;
> > > +template <typename> struct allocator {
> > > +  allocator() {}
> > > +};
> > > +template class allocator<wchar_t>;
> > > +export module M;
> > > +
> > > +// The whole GMF should be discarded here
> > > +// { dg-final { scan-lang-dump "Wrote 0 clusters" module } }
> > > -- 
> > > 2.43.2
> > > 
> > > 
> > 
> 
>
  
Jason Merrill May 1, 2024, 6:49 p.m. UTC | #5
On 5/1/24 08:54, Patrick Palka wrote:
> On Thu, 2 May 2024, Nathaniel Shead wrote:
> 
>> On Wed, May 01, 2024 at 10:11:20AM -0400, Patrick Palka wrote:
>>> On Wed, 1 May 2024, Nathaniel Shead wrote:
>>>
>>>> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
>>>>
>>>> -- >8 --
>>>>
>>>> When calling instantiate_pending_templates at end of parsing, any new
>>>> functions that are instantiated from this point have their module
>>>> purview set based on the current value of module_kind.
>>>>
>>>> This is unideal, however, as the modules code will then treat these
>>>> instantiations as reachable and cause large swathes of the GMF to be
>>>> emitted into the module CMI, despite no code in the actual module
>>>> purview referencing it.
>>>>
>>>> This patch fixes this by also remembering the value of module_kind when
>>>> the instantiation was deferred, and restoring it when doing this
>>>> deferred instantiation.  That way newly instantiated declarations
>>>> appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
>>>> instantiation was required, meaning that GMF entities won't be counted
>>>> as reachable unless referenced by an actually reachable entity.
>>>>
>>>> Note that purviewness and attachment etc. is generally only determined
>>>> by the base template: this is purely for determining whether a
>>>> specialisation was declared in the module purview and hence whether it
>>>> should be streamed out.  See the comment on 'set_instantiating_module'.
>>>>
>>>> 	PR c++/114630
>>>> 	PR c++/114795
>>>>
>>>> gcc/cp/ChangeLog:
>>>>
>>>> 	* cp-tree.h (struct tinst_level): Add field for tracking
>>>> 	module_kind.
>>>> 	* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
>>>> 	(reopen_tinst_level): Restore module_kind from level.
>>>> 	(instantiate_pending_templates): Save and restore module_kind so
>>>> 	it isn't affected by reopen_tinst_level.
>>>
>>> LGTM.  Another approach is to instantiate all so-far deferred
>>> instantiations and vtables once we reach the start of the module
>>> purview, but your approach is much cleaner.
>>>
>>> Note that deferred instantiation can induce more deferred instantiation,
>>> but your approach will do the right thing here (module_kind will be
>>> inherited from the point of instantiation).
>>>
>>> What if an instantiation is needed from both the GMF and the module
>>> purview?  Then IIUC it'll be instantiated as if from the GMF, which
>>> seems right to me.
>>>
>>
>> Hm..., so I believe it'll be marked according to whichever instantiates
>> it "first", so if e.g. deferred from GMF and then instantiated
>> non-deferred from purview it'll be marked purview (and the deferred
>> instantiation will be skipped as it's already instantiated).  This could
>> mean it gets unnecessarily emitted if it only got instantiated because
>> it got used in e.g. a non-inline function body, I suppose; but that's
>> true in general actually, at the moment.
> 
> FWIW I think the most relevant situation where we need to instantiate
> immediately and can't defer until EOF is during constexpr evaluation of
> a variable initializer.  (We also immediately instantiate variable
> template specializations, and function template specializations with
> deduced return type, but in these cases we'd always instantiate from the
> first use, i.e. from the GMF in this scenario.)
> 
> So I guess a testcase for this situation could be something like:
> 
>      module;
>      template<class T> constexpr int f(T t) { return t; }
>      template int f(int);
> 
>      export module foo;
>      constexpr int w = f(0);
>      // f<int> ideally shouldn't be emitted in the CMI?

Does the explicit instantiation change that at all?  Either way it seams 
like f and f<int> aren't decl-reachable from w if we determine that 
after constant evaluation.

Jason
  
Patrick Palka May 1, 2024, 7:11 p.m. UTC | #6
On Wed, 1 May 2024, Jason Merrill wrote:

> On 5/1/24 08:54, Patrick Palka wrote:
> > On Thu, 2 May 2024, Nathaniel Shead wrote:
> > 
> > > On Wed, May 01, 2024 at 10:11:20AM -0400, Patrick Palka wrote:
> > > > On Wed, 1 May 2024, Nathaniel Shead wrote:
> > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > > 
> > > > > -- >8 --
> > > > > 
> > > > > When calling instantiate_pending_templates at end of parsing, any new
> > > > > functions that are instantiated from this point have their module
> > > > > purview set based on the current value of module_kind.
> > > > > 
> > > > > This is unideal, however, as the modules code will then treat these
> > > > > instantiations as reachable and cause large swathes of the GMF to be
> > > > > emitted into the module CMI, despite no code in the actual module
> > > > > purview referencing it.
> > > > > 
> > > > > This patch fixes this by also remembering the value of module_kind
> > > > > when
> > > > > the instantiation was deferred, and restoring it when doing this
> > > > > deferred instantiation.  That way newly instantiated declarations
> > > > > appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
> > > > > instantiation was required, meaning that GMF entities won't be counted
> > > > > as reachable unless referenced by an actually reachable entity.
> > > > > 
> > > > > Note that purviewness and attachment etc. is generally only determined
> > > > > by the base template: this is purely for determining whether a
> > > > > specialisation was declared in the module purview and hence whether it
> > > > > should be streamed out.  See the comment on
> > > > > 'set_instantiating_module'.
> > > > > 
> > > > > 	PR c++/114630
> > > > > 	PR c++/114795
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > > 	* cp-tree.h (struct tinst_level): Add field for tracking
> > > > > 	module_kind.
> > > > > 	* pt.cc (push_tinst_level_loc): Cache module_kind in
> > > > > new_level.
> > > > > 	(reopen_tinst_level): Restore module_kind from level.
> > > > > 	(instantiate_pending_templates): Save and restore module_kind
> > > > > so
> > > > > 	it isn't affected by reopen_tinst_level.
> > > > 
> > > > LGTM.  Another approach is to instantiate all so-far deferred
> > > > instantiations and vtables once we reach the start of the module
> > > > purview, but your approach is much cleaner.
> > > > 
> > > > Note that deferred instantiation can induce more deferred instantiation,
> > > > but your approach will do the right thing here (module_kind will be
> > > > inherited from the point of instantiation).
> > > > 
> > > > What if an instantiation is needed from both the GMF and the module
> > > > purview?  Then IIUC it'll be instantiated as if from the GMF, which
> > > > seems right to me.
> > > > 
> > > 
> > > Hm..., so I believe it'll be marked according to whichever instantiates
> > > it "first", so if e.g. deferred from GMF and then instantiated
> > > non-deferred from purview it'll be marked purview (and the deferred
> > > instantiation will be skipped as it's already instantiated).  This could
> > > mean it gets unnecessarily emitted if it only got instantiated because
> > > it got used in e.g. a non-inline function body, I suppose; but that's
> > > true in general actually, at the moment.
> > 
> > FWIW I think the most relevant situation where we need to instantiate
> > immediately and can't defer until EOF is during constexpr evaluation of
> > a variable initializer.  (We also immediately instantiate variable
> > template specializations, and function template specializations with
> > deduced return type, but in these cases we'd always instantiate from the
> > first use, i.e. from the GMF in this scenario.)
> > 
> > So I guess a testcase for this situation could be something like:
> > 
> >      module;
> >      template<class T> constexpr int f(T t) { return t; }
> >      template int f(int);
> > 
> >      export module foo;
> >      constexpr int w = f(0);
> >      // f<int> ideally shouldn't be emitted in the CMI?
> 
> Does the explicit instantiation change that at all?  Either way it seams like
> f and f<int> aren't decl-reachable from w if we determine that after constant
> evaluation.

Currently due to the bug in question we handle this as if the explicit
instantiation was performed from the module purview:

  module;
  template<class T> constexpr int f(T t) { return t; }

  export module foo;
  template int f(int);
  constexpr int w = f(0);

I think we do want to emit f<int> in that case?
  
Jason Merrill May 1, 2024, 8:27 p.m. UTC | #7
On 5/1/24 12:11, Patrick Palka wrote:
> On Wed, 1 May 2024, Jason Merrill wrote:
> 
>> On 5/1/24 08:54, Patrick Palka wrote:
>>> On Thu, 2 May 2024, Nathaniel Shead wrote:
>>>
>>>> On Wed, May 01, 2024 at 10:11:20AM -0400, Patrick Palka wrote:
>>>>> On Wed, 1 May 2024, Nathaniel Shead wrote:
>>>>>
>>>>>> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
>>>>>>
>>>>>> -- >8 --
>>>>>>
>>>>>> When calling instantiate_pending_templates at end of parsing, any new
>>>>>> functions that are instantiated from this point have their module
>>>>>> purview set based on the current value of module_kind.
>>>>>>
>>>>>> This is unideal, however, as the modules code will then treat these
>>>>>> instantiations as reachable and cause large swathes of the GMF to be
>>>>>> emitted into the module CMI, despite no code in the actual module
>>>>>> purview referencing it.
>>>>>>
>>>>>> This patch fixes this by also remembering the value of module_kind
>>>>>> when
>>>>>> the instantiation was deferred, and restoring it when doing this
>>>>>> deferred instantiation.  That way newly instantiated declarations
>>>>>> appropriately get a DECL_MODULE_PURVIEW_P appropriate for where the
>>>>>> instantiation was required, meaning that GMF entities won't be counted
>>>>>> as reachable unless referenced by an actually reachable entity.
>>>>>>
>>>>>> Note that purviewness and attachment etc. is generally only determined
>>>>>> by the base template: this is purely for determining whether a
>>>>>> specialisation was declared in the module purview and hence whether it
>>>>>> should be streamed out.  See the comment on
>>>>>> 'set_instantiating_module'.
>>>>>>
>>>>>> 	PR c++/114630
>>>>>> 	PR c++/114795
>>>>>>
>>>>>> gcc/cp/ChangeLog:
>>>>>>
>>>>>> 	* cp-tree.h (struct tinst_level): Add field for tracking
>>>>>> 	module_kind.
>>>>>> 	* pt.cc (push_tinst_level_loc): Cache module_kind in
>>>>>> new_level.
>>>>>> 	(reopen_tinst_level): Restore module_kind from level.
>>>>>> 	(instantiate_pending_templates): Save and restore module_kind
>>>>>> so
>>>>>> 	it isn't affected by reopen_tinst_level.
>>>>>
>>>>> LGTM.  Another approach is to instantiate all so-far deferred
>>>>> instantiations and vtables once we reach the start of the module
>>>>> purview, but your approach is much cleaner.
>>>>>
>>>>> Note that deferred instantiation can induce more deferred instantiation,
>>>>> but your approach will do the right thing here (module_kind will be
>>>>> inherited from the point of instantiation).
>>>>>
>>>>> What if an instantiation is needed from both the GMF and the module
>>>>> purview?  Then IIUC it'll be instantiated as if from the GMF, which
>>>>> seems right to me.
>>>>>
>>>>
>>>> Hm..., so I believe it'll be marked according to whichever instantiates
>>>> it "first", so if e.g. deferred from GMF and then instantiated
>>>> non-deferred from purview it'll be marked purview (and the deferred
>>>> instantiation will be skipped as it's already instantiated).  This could
>>>> mean it gets unnecessarily emitted if it only got instantiated because
>>>> it got used in e.g. a non-inline function body, I suppose; but that's
>>>> true in general actually, at the moment.
>>>
>>> FWIW I think the most relevant situation where we need to instantiate
>>> immediately and can't defer until EOF is during constexpr evaluation of
>>> a variable initializer.  (We also immediately instantiate variable
>>> template specializations, and function template specializations with
>>> deduced return type, but in these cases we'd always instantiate from the
>>> first use, i.e. from the GMF in this scenario.)
>>>
>>> So I guess a testcase for this situation could be something like:
>>>
>>>       module;
>>>       template<class T> constexpr int f(T t) { return t; }
>>>       template int f(int);
>>>
>>>       export module foo;
>>>       constexpr int w = f(0);
>>>       // f<int> ideally shouldn't be emitted in the CMI?
>>
>> Does the explicit instantiation change that at all?  Either way it seams like
>> f and f<int> aren't decl-reachable from w if we determine that after constant
>> evaluation.
> 
> Currently due to the bug in question we handle this as if the explicit
> instantiation was performed from the module purview:
> 
>    module;
>    template<class T> constexpr int f(T t) { return t; }
> 
>    export module foo;
>    template int f(int);
>    constexpr int w = f(0);
> 
> I think we do want to emit f<int> in that case?

Agreed, because f is decl-reachable from the explicit instantiation in 
the module purview.

I was suggesting that there's no effective difference between explicit 
instantiation in the GMF and implicit instantiation.

Jason
  

Patch

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1938ada0268..0e619120ccc 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6626,6 +6626,9 @@  struct GTY((chain_next ("%h.next"))) tinst_level {
   /* The location where the template is instantiated.  */
   location_t locus;
 
+  /* The module kind where the template is instantiated. */
+  unsigned module_kind;
+
   /* errorcount + sorrycount when we pushed this level.  */
   unsigned short errors;
 
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1c3eef60c06..401aa92bc3e 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -11277,6 +11277,7 @@  push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
   new_level->tldcl = tldcl;
   new_level->targs = targs;
   new_level->locus = loc;
+  new_level->module_kind = module_kind;
   new_level->errors = errorcount + sorrycount;
   new_level->next = NULL;
   new_level->refcount = 0;
@@ -11345,6 +11346,7 @@  reopen_tinst_level (struct tinst_level *level)
   for (t = level; t; t = t->next)
     ++tinst_depth;
 
+  module_kind = level->module_kind;
   set_refcount_ptr (current_tinst_level, level);
   pop_tinst_level ();
   if (current_tinst_level)
@@ -27442,6 +27444,7 @@  instantiate_pending_templates (int retries)
 {
   int reconsider;
   location_t saved_loc = input_location;
+  unsigned saved_module_kind = module_kind;
 
   /* Instantiating templates may trigger vtable generation.  This in turn
      may require further template instantiations.  We place a limit here
@@ -27532,6 +27535,7 @@  instantiate_pending_templates (int retries)
   while (reconsider);
 
   input_location = saved_loc;
+  module_kind = saved_module_kind;
 }
 
 /* Substitute ARGVEC into T, which is a list of initializers for
diff --git a/gcc/testsuite/g++.dg/modules/gmf-3.C b/gcc/testsuite/g++.dg/modules/gmf-3.C
new file mode 100644
index 00000000000..e52ae904ea9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/gmf-3.C
@@ -0,0 +1,13 @@ 
+// PR c++/114630
+// { dg-additional-options "-fmodules-ts -Wno-global-module -fdump-lang-module" }
+// { dg-module-cmi M }
+
+module;
+template <typename> struct allocator {
+  allocator() {}
+};
+template class allocator<wchar_t>;
+export module M;
+
+// The whole GMF should be discarded here
+// { dg-final { scan-lang-dump "Wrote 0 clusters" module } }