[PATCH/RFC,1/2] WPD: Enable whole program devirtualization

Message ID SN6PR01MB495813DD340BF811FEEABB2AF7DC9@SN6PR01MB4958.prod.exchangelabs.com
State New

Commit Message

Feng Xue OS Sept. 16, 2021, 9:25 a.m. UTC
  This patch and the ones that follow enable full devirtualization
under the whole-program assumption (hence also called whole-program
devirtualization, WPD for short), which is an enhancement to the current
speculative devirtualization. The optimization rests on identifying
class types that are local in terms of whole-program scope; at the very
least, class types in libstdc++ must be excluded in some way.
Our approach is to use the typeinfo symbol as the identity marker of a
class, since it is unique and is always generated once the class or one
of its derived types is instantiated somewhere, and to rely on symbol
resolution by lto-linker-plugin to detect whether a typeinfo is
referenced by a regular object/library, which indirectly tells us
whether the class type has escaped.
The RFC at https://gcc.gnu.org/pipermail/gcc/2021-August/237132.html
gives more details.

Bootstrapped/regtested on x86_64-linux and aarch64-linux.

Thanks,
Feng

----
2021-09-07  Feng Xue  <fxue@os.amperecomputing.com>

gcc/
	* common.opt (-fdevirtualize-fully): New option.
	* class.c (build_rtti_vtbl_entries): Force generation of typeinfo
	even if -fno-rtti is specified under full devirtualization.
	* cgraph.c (cgraph_update_edges_for_call_stmt): Add an assertion
	to check node to be traversed.
	* cgraphclones.c (cgraph_node::find_replacement): Record
	former_clone_of on replacement node.
	* cgraphunit.c (symtab_node::needed_p): Always output vtable for
	full devirtualization.
	(analyze_functions): Force generation of primary vtables for all
	base classes.
	* ipa-devirt.c (odr_type_d::whole_program_local): New field.
	(odr_type_d::has_virtual_base): Likewise.
	(odr_type_d::all_derivations_known): Removed.
	(odr_type_d::whole_program_local_p): New member function.
	(odr_type_d::all_derivations_known_p): Likewise.
	(odr_type_d::possibly_instantiated_p): Likewise.
	(odr_type_d::set_has_virtual_base): Likewise.
	(get_odr_type): Set "whole_program_local" and "has_virtual_base"
	when adding a type.
	(type_all_derivations_known_p): Replace implementation by a call
	to odr_type_d::all_derivations_known_p.
	(type_possibly_instantiated_p): Replace implementation by a call
	to odr_type_d::possibly_instantiated_p.
	(type_known_to_have_no_derivations_p): Replace call to
	type_possibly_instantiated_p with call to
	odr_type_d::possibly_instantiated_p.
	(type_all_ctors_visible_p): Removed.
	(type_whole_program_local_p): New function.
	(get_type_vtable): Likewise.
	(extract_typeinfo_in_vtable): Likewise.
	(identify_whole_program_local_types): Likewise.
	(dump_odr_type): Dump has_virtual_base and whole_program_local_p()
	of type.
	(maybe_record_node): Resort to type_whole_program_local_p to
	check whether a class has been optimized away.
	(record_target_from_binfo): Remove parameter "anonymous", add
	a new parameter "possibly_instantiated", and adjust code
	accordingly.
	(devirt_variable_node_removal_hook): Replace call to
	"type_in_anonymous_namespace_p" with "type_whole_program_local_p".
	(possible_polymorphic_call_targets): Replace call to
	"type_possibly_instantiated_p" with "possibly_instantiated_p",
	replace flag check on "all_derivations_known" with call to
	 "all_derivations_known_p".
	* ipa-icf.c (filter_removed_items): Disable folding on vtable
	under full devirtualization.
	* ipa-polymorphic-call.c (restrict_to_inner_class): Move odr
	type check to type_known_to_have_no_derivations_p.
	* ipa-utils.h (identify_whole_program_local_types): New
	declaration.
	(type_all_derivations_known_p): Parameter type adjustment.
	* ipa.c (walk_polymorphic_call_targets): Do not mark vcall
	targets as reachable for full devirtualization.
	(can_remove_vtable_if_no_refs_p): New function.
	(symbol_table::remove_unreachable_nodes): Add defined vtables
	to reachable list under full devirtualization.
	* lto-symtab.c (lto_symtab_merge_symbols): Identify whole
	program local types after symbol table merge.
---
  

Comments

Jason Merrill Sept. 16, 2021, 5:53 p.m. UTC | #1
On 9/16/21 05:25, Feng Xue OS via Gcc-patches wrote:
> 	* common.opt (-fdevirtualize-fully): New option.
> 	* class.c (build_rtti_vtbl_entries): Force generation of typeinfo
> 	even if -fno-rtti is specified under full devirtualization.

This makes -fno-rtti useless; rather than this, you should warn about 
the combination of flags and force flag_rtti on.  It also sounds like 
you depend on the library not being built with -fno-rtti.

  
Feng Xue OS Sept. 17, 2021, 2:29 a.m. UTC | #2
>On 9/16/21 05:25, Feng Xue OS via Gcc-patches wrote:
>>       * common.opt (-fdevirtualize-fully): New option.
>>       * class.c (build_rtti_vtbl_entries): Force generation of typeinfo
>>       even if -fno-rtti is specified under full devirtualization.
>
>This makes -fno-rtti useless; rather than this, you should warn about
>the combination of flags and force flag_rtti on.  It also sounds like
>you depend on the library not being built with -fno-rtti.

Although RTTI is generated by the front end, we remove it again after the
LTO symtab merge, so the behavior stays the same as with -fno-rtti.

Yes, any regular library that is linked in should contain RTTI data;
otherwise WPD cannot safely deduce how a class type is used. We can
assume this holds for libstdc++ by default, but it may become a problem
for user libraries. That can be avoided if we document the requirement
properly and advise users to follow it when using WPD.

Thanks
Feng
  
Jason Merrill Sept. 18, 2021, 5:53 a.m. UTC | #3
On 9/16/21 22:29, Feng Xue OS wrote:
>> On 9/16/21 05:25, Feng Xue OS via Gcc-patches wrote:
>>>        * common.opt (-fdevirtualize-fully): New option.
>>>        * class.c (build_rtti_vtbl_entries): Force generation of typeinfo
>>>        even if -fno-rtti is specified under full devirtualization.
>>
>> This makes -fno-rtti useless; rather than this, you should warn about
>> the combination of flags and force flag_rtti on.  It also sounds like
>> you depend on the library not being built with -fno-rtti.
> 
> Although RTTI is generated by the front end, we remove it again after the
> LTO symtab merge, so the behavior stays the same as with -fno-rtti.

Ah, the cp/ change is OK, then, with a comment about that.

> Yes, any regular library that is linked in should contain RTTI data;
> otherwise WPD cannot safely deduce how a class type is used. We can
> assume this holds for libstdc++ by default, but it may become a problem
> for user libraries. That can be avoided if we document the requirement
> properly and advise users to follow it when using WPD.

Yes, I would expect that external libraries would be built with RTTI on 
to allow users to use RTTI features even if they aren't used within the 
library.  But it's good to document it as a requirement.

> +	      /* If a class with virtual base is only instantiated as
> +		 subobjects of derived classes, and has no complete object in
> +		 compilation unit, merely construction vtables will be involved,
> +		 its primary vtable is really not needed, and subject to being
> +		 removed.  So once a vtable node is encountered, for all
> +		 polymorphic base classes of the vtable's context class, always
> +		 force generation of primary vtable nodes when full
> +		 devirtualization is enabled.  */

Why do you need the primary vtable if you're relying on RTTI info? 
Construction vtables will point to the same RTTI node.

> +	  /* Public class w/o key member function (or local class in a public
> +	     inline function) requires COMDAT-like vtable so as to be shared
> +	     among units.  But C++ privatizing via -fno-weak would introduce
> +	     multiple static vtable copies for one class in merged lto symbol
> +	     table.  This breaks one-to-one correspondence between class and
> +	     vtable, and makes class liveness check become not that easy.  To
> +	     be simple, we exclude such kind of class from our choice list.

Same question.  Also, why would you use -fno-weak?  Forcing multiple 
copies of things we're perfectly capable of combining seems like a 
strange choice.  You can privatize things with the symbol visibility 
controls or RTLD_LOCAL.

Jason
  
Feng Xue OS Sept. 18, 2021, 9:38 a.m. UTC | #4
>On 9/16/21 22:29, Feng Xue OS wrote:
>>> On 9/16/21 05:25, Feng Xue OS via Gcc-patches wrote:
>>>>        * common.opt (-fdevirtualize-fully): New option.
>>>>        * class.c (build_rtti_vtbl_entries): Force generation of typeinfo
>>>>        even if -fno-rtti is specified under full devirtualization.
>>>
>>> This makes -fno-rtti useless; rather than this, you should warn about
>>> the combination of flags and force flag_rtti on.  It also sounds like
>>> you depend on the library not being built with -fno-rtti.
>>
>> Although RTTI is generated by the front end, we remove it again after the
>> LTO symtab merge, so the behavior stays the same as with -fno-rtti.
>
> Ah, the cp/ change is OK, then, with a comment about that.
>
>> Yes, any regular library that is linked in should contain RTTI data;
>> otherwise WPD cannot safely deduce how a class type is used. We can
>> assume this holds for libstdc++ by default, but it may become a problem
>> for user libraries. That can be avoided if we document the requirement
>> properly and advise users to follow it when using WPD.
>
> Yes, I would expect that external libraries would be built with RTTI on
> to allow users to use RTTI features even if they aren't used within the
> library.  But it's good to document it as a requirement.
>
>> +           /* If a class with virtual base is only instantiated as
>> +              subobjects of derived classes, and has no complete object in
>> +              compilation unit, merely construction vtables will be involved,
>> +              its primary vtable is really not needed, and subject to being
>> +              removed.  So once a vtable node is encountered, for all
>> +              polymorphic base classes of the vtable's context class, always
>> +              force generation of primary vtable nodes when full
>> +              devirtualization is enabled.  */
>
> Why do you need the primary vtable if you're relying on RTTI info?
> Construction vtables will point to the same RTTI node.

In the middle end, the easiest way to get the vtable of a type is via
TYPE_BINFO (type), which gives the primary one. WPD relies on the
existence of the varpool_node of the vtable decl to determine whether the
type has been removed (when it is never instantiated), so we force
generation of the vtable node at a very early stage. Additionally, a
construction vtable (C-in-D) belongs to the class (D) of the complete
object, not to the class (C) of the subobject actually being constructed,
so it is not easy to correlate a construction vtable with the subobject
class (C) after the front end.

>
>> +       /* Public class w/o key member function (or local class in a public
>> +          inline function) requires COMDAT-like vtable so as to be shared
>> +          among units.  But C++ privatizing via -fno-weak would introduce
>> +          multiple static vtable copies for one class in merged lto symbol
>> +          table.  This breaks one-to-one correspondence between class and
>> +          vtable, and makes class liveness check become not that easy.  To
>> +          be simple, we exclude such kind of class from our choice list.
>
> Same question.  Also, why would you use -fno-weak?  Forcing multiple
> copies of things we're perfectly capable of combining seems like a
> strange choice.  You can privatize things with the symbol visibility
> controls or RTLD_LOCAL.

We expect that users do not specify -fno-weak together with WPD. But if
it is specified, we should handle it correctly and bypass the type, and
indeed there is no need to force generation of the vtable in that
situation.  However, if a vtable is not keyed to any compilation unit, an
ordinary build might never produce a copy of it, even though its class
type, such as an abstract root class, is meaningful to whole-program
analysis.

Thanks,
Feng
  

Patch

From 2632d8e7ea8f96cb545e57dedd9e4148b5a2cae4 Mon Sep 17 00:00:00 2001
From: Feng Xue <fxue@os.amperecomputing.com>
Date: Mon, 6 Sep 2021 15:03:31 +0800
Subject: [PATCH 1/2] WPD: Enable whole program devirtualization

Enable full devirtualization under the whole-program assumption (hence
also called whole-program devirtualization, WPD for short). The
optimization rests on identifying class types that are local in terms of
whole-program scope. But "whole program" does not ensure that the class
hierarchy of a type never spans into dependent C++ libraries (libstdc++
is one), and overlooking that would result in incorrect
devirtualization. The example below demonstrates the problem.

    // Has been pre-compiled to a library
    class Base {
    public:
        virtual void method() = 0;
    };

    class Derive_in_Library : public Base {
    public:
        virtual void method()  { ... }
    };

    Base *get_object()
    {
        return new Derive_in_Library();
    }

    -------------------------------------------------------

    // User source code to compile
    class Base {
    public:
        virtual void method() = 0;
    };

    class Derive : public Base {
    public:
        virtual void method()  { ... }
    };

    Base *get_object();  // Defined in the library

    void foo()
    {
      Base *obj = get_object();

      obj->method();
    }

Unless the entire class hierarchy formed by the relevant types in both
the library and the user source code can be inspected, whole-program
analysis would wrongly conclude that 'Derive' is the sole descendant of
'Base', and would devirtualize 'obj->method' to 'Derive::method', which
is clearly an incorrect transformation.
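
To make the hazard concrete, here is a minimal, self-contained sketch
(not part of the patch) that collapses the library and the user code
into one translation unit. The string return values and the helper
wrongly_devirtualized_call are hypothetical; the helper only models
what the call site would compute if the analysis assumed 'Derive' were
the sole implementation of 'Base'.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Shared interface, visible to both the "library" and the "user" part.
class Base {
public:
    virtual ~Base() = default;
    virtual std::string method() const = 0;
};

// Stands in for the type that lives only in the pre-compiled library.
class Derive_in_Library : public Base {
public:
    std::string method() const override { return "Derive_in_Library::method"; }
};

// Stands in for the type defined in the user source code.
class Derive : public Base {
public:
    std::string method() const override { return "Derive::method"; }
};

// Library entry point: hands out an object whose dynamic type the
// whole-program analysis never saw.
std::unique_ptr<Base> get_object() {
    return std::make_unique<Derive_in_Library>();
}

// What the user wrote: a real virtual call.
std::string virtual_call(const Base &obj) { return obj.method(); }

// Hypothetical model of the bad transformation: the dynamic type is
// ignored and the call is bound directly to Derive::method.
std::string wrongly_devirtualized_call(const Base &) {
    return Derive().method();
}
```

Calling both helpers on the object returned by get_object() yields
different strings, which is exactly the miscompilation that the
whole-program-local check is meant to rule out.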

At first glance, we could find out whether a class type is referenced in
an object file or library by tracking the availability of its vtable
symbol. But a vtable might be purged while we are still interested in
the class type it belongs to. In the library code of the above example,
the vtable of 'Base' is unused, since it participates in the construction
of neither 'Base' nor 'Derive_in_Library', but we still must know whether
'Base' is live in the library to ensure a correct result.

Moreover, once a class is virtually inherited, it has multiple vtables
(including construction vtables). Looking up a class via its vtable
symbol requires a one-to-one correspondence between the two, so that
approach does not work here.
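
As an illustration only (the type names V, C and D are hypothetical and
do not come from the patch), the sketch below shows the observable
effect of a construction vtable. On Itanium C++ ABI targets such as
GCC, a class with a virtual base that is built as a subobject of a
derived class is expected to use a separate "C-in-D" vtable during that
phase, so 'C' ends up associated with more than one vtable.

```cpp
#include <cassert>
#include <string>

struct V {
    virtual ~V() = default;
    virtual std::string name() const { return "V"; }
};

// C has a virtual base.  While the C subobject of a D is being
// constructed, virtual calls must still resolve to C's overriders.
struct C : virtual V {
    std::string seen_during_construction;
    C() { seen_during_construction = name(); }  // dispatches to C::name
    std::string name() const override { return "C"; }
};

// The complete-object class; its own overrider only becomes active
// once D's constructor body is reached.
struct D : C {
    std::string name() const override { return "D"; }
};
```

Constructing a D and comparing seen_during_construction with the result
of name() afterwards shows two different dispatch results for the same
object, one per vtable that was in effect.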

Besides the vtable symbol, class instantiation also creates references
to the typeinfo symbols of the class itself and of all its parent
classes. At the same time, each class has a unique typeinfo, which can
act as a suitable type marker and be used to perform type lookup in
object files and libraries. The approach is still not 100% safe,
specifically when typeinfo symbols are invisible or missing, for
example when the libraries to link against were built with -fno-weak or
-fno-rtti. But at least for libstdc++, these symbols should be present
in order to allow dynamic_cast in user source code.
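
The "one typeinfo per class" property can be observed from plain C++
through typeid. The sketch below is only an illustration and assumes
RTTI is enabled; the helper is_exactly_derive is hypothetical, and
'Base' is made non-abstract here so that it can be instantiated.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
    virtual ~Base() = default;
};

struct Derive : Base {};

// Returns true only when the dynamic type of obj is exactly Derive.
// For a polymorphic glvalue, typeid reads the typeinfo reachable from
// the object's vtable, so the comparison identifies the class.
bool is_exactly_derive(const Base &obj) {
    return typeid(obj) == typeid(Derive);
}
```

With GCC's name mangling, the two typeinfo objects involved correspond
to the _ZTI4Base and _ZTI6Derive symbols that appear in the resolution
listing of this commit message.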

Lto-linker-plugin works with the linker to do whole-program symbol
resolution before LTO/WPA happens, and attaches binding information to
symbols. In the above example, the resolution of the typeinfo symbols
generated for user code will be:

    class Base {                  //  global type
        // _ZTI4Base (typeinfo for Base)      LDPR_PREVAILING_DEF
        virtual void method() = 0;
    };

    class Derive : public Base {  // local type
        // _ZTI6Derive (typeinfo for Derive)  LDPR_PREVAILING_DEF_IRONLY
        virtual void method()  { ... }
    };

Given a typeinfo symbol, if it is resolved as LDPR_PREVAILING_DEF_IRONLY
(only referenced from IR code, with no references from regular objects),
its corresponding class type is deemed to be whole-program local.
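
The rule above can be sketched as a small predicate. This is not the
patch's implementation: the enum is a local stand-in that borrows a few
enumerator names from the linker plugin API's
ld_plugin_symbol_resolution, and TypeinfoSymbol and
whole_program_local_p are hypothetical names used only for
illustration.

```cpp
#include <cassert>

// Local stand-in; only the enumerators needed for the example.
enum Resolution {
    LDPR_UNKNOWN,
    LDPR_UNDEF,
    LDPR_PREVAILING_DEF,         // also referenced from a regular object
    LDPR_PREVAILING_DEF_IRONLY   // referenced from IR code only
};

// Hypothetical record pairing a typeinfo symbol with its resolution.
struct TypeinfoSymbol {
    const char *name;
    Resolution resolution;
};

// A class type is treated as whole-program local only when its typeinfo
// is defined in IR and never referenced from regular objects/libraries.
bool whole_program_local_p(const TypeinfoSymbol &sym) {
    return sym.resolution == LDPR_PREVAILING_DEF_IRONLY;
}
```

Applied to the listing above, _ZTI4Base (LDPR_PREVAILING_DEF) is
rejected and _ZTI6Derive (LDPR_PREVAILING_DEF_IRONLY) is accepted.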

2021-09-07  Feng Xue  <fxue@os.amperecomputing.com>

gcc/
	* common.opt (-fdevirtualize-fully): New option.
	* class.c (build_rtti_vtbl_entries): Force generation of typeinfo
	even if -fno-rtti is specified under full devirtualization.
	* cgraph.c (cgraph_update_edges_for_call_stmt): Add an assertion
	to check node to be traversed.
	* cgraphclones.c (cgraph_node::find_replacement): Record
	former_clone_of on replacement node.
	* cgraphunit.c (symtab_node::needed_p): Always output vtable for
	full devirtualization.
	(analyze_functions): Force generation of primary vtables for all
	base classes.
	* ipa-devirt.c (odr_type_d::whole_program_local): New field.
	(odr_type_d::has_virtual_base): Likewise.
	(odr_type_d::all_derivations_known): Removed.
	(odr_type_d::whole_program_local_p): New member function.
	(odr_type_d::all_derivations_known_p): Likewise.
	(odr_type_d::possibly_instantiated_p): Likewise.
	(odr_type_d::set_has_virtual_base): Likewise.
	(get_odr_type): Set "whole_program_local" and "has_virtual_base"
	when adding a type.
	(type_all_derivations_known_p): Replace implementation by a call
	to odr_type_d::all_derivations_known_p.
	(type_possibly_instantiated_p): Replace implementation by a call
	to odr_type_d::possibly_instantiated_p.
	(type_known_to_have_no_derivations_p): Replace call to
	type_possibly_instantiated_p with call to
	odr_type_d::possibly_instantiated_p.
	(type_all_ctors_visible_p): Removed.
	(type_whole_program_local_p): New function.
	(get_type_vtable): Likewise.
	(extract_typeinfo_in_vtable): Likewise.
	(identify_whole_program_local_types): Likewise.
	(dump_odr_type): Dump has_virtual_base and whole_program_local_p()
	of type.
	(maybe_record_node): Resort to type_whole_program_local_p to
	check whether a class has been optimized away.
	(record_target_from_binfo): Remove parameter "anonymous", add
	a new parameter "possibly_instantiated", and adjust code
	accordingly.
	(devirt_variable_node_removal_hook): Replace call to
	"type_in_anonymous_namespace_p" with "type_whole_program_local_p".
	(possible_polymorphic_call_targets): Replace call to
	"type_possibly_instantiated_p" with "possibly_instantiated_p",
	replace flag check on "all_derivations_known" with call to
	 "all_derivations_known_p".
	* ipa-icf.c (filter_removed_items): Disable folding on vtable
	under full devirtualization.
	* ipa-polymorphic-call.c (restrict_to_inner_class): Move odr
	type check to type_known_to_have_no_derivations_p.
	* ipa-utils.h (identify_whole_program_local_types): New
	declaration.
	(type_all_derivations_known_p): Parameter type adjustment.
	* ipa.c (walk_polymorphic_call_targets): Do not mark vcall
	targets as reachable for full devirtualization.
	(can_remove_vtable_if_no_refs_p): New function.
	(symbol_table::remove_unreachable_nodes): Add defined vtables
	to reachable list under full devirtualization.
	* lto-symtab.c (lto_symtab_merge_symbols): Identify whole
	program local types after symbol table merge.
---
 gcc/cgraph.c               |   5 +-
 gcc/cgraphclones.c         |   5 +
 gcc/cgraphunit.c           |  33 +++
 gcc/common.opt             |   4 +
 gcc/cp/class.c             |   2 +-
 gcc/ipa-devirt.c           | 453 ++++++++++++++++++++++++++++++-------
 gcc/ipa-icf.c              |   4 +-
 gcc/ipa-polymorphic-call.c |   1 -
 gcc/ipa-utils.h            |   3 +-
 gcc/ipa.c                  |  33 ++-
 gcc/lto/lto-symtab.c       |   3 +
 11 files changed, 459 insertions(+), 87 deletions(-)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 8f3af003f2a..6a0402577a4 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1713,7 +1713,10 @@  cgraph_update_edges_for_call_stmt (gimple *old_stmt, tree old_decl,
 	else
 	  {
 	    while (node != orig && !node->next_sibling_clone)
-	      node = node->clone_of;
+	      {
+		gcc_assert (node);
+		node = node->clone_of;
+	      }
 	    if (node != orig)
 	      node = node->next_sibling_clone;
 	  }
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index ae91dccd31d..e2e64570d1e 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -662,6 +662,7 @@  cgraph_node::find_replacement (clone_info *info)
 
   for (next_inline_clone = clones;
        next_inline_clone
+       && !next_inline_clone->former_clone_of
        && next_inline_clone->decl != decl;
        next_inline_clone = next_inline_clone->next_sibling_clone)
     ;
@@ -738,6 +739,10 @@  cgraph_node::find_replacement (clone_info *info)
 	 with function body.  */
       replacement->order = order;
 
+      /* Record original node from which these clones are formerly
+	 duplicated.  */
+      replacement->former_clone_of = former_clone_of;
+
       return replacement;
     }
   else
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 81a99436673..bf8d0560e18 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -247,6 +247,10 @@  symtab_node::needed_p (void)
   if (!definition)
     return false;
 
+  /* Always output vtable when full devirtualization is enabled.  */
+  if (flag_devirtualize_fully && DECL_VIRTUAL_P (decl))
+    return true;
+
   if (DECL_EXTERNAL (decl))
     return false;
 
@@ -1286,6 +1290,35 @@  analyze_functions (bool first_time)
 	      varpool_node *vnode = dyn_cast <varpool_node *> (node);
 	      if (vnode && vnode->definition && !vnode->analyzed)
 		vnode->analyze ();
+
+	      /* If a class with virtual base is only instantiated as
+		 subobjects of derived classes, and has no complete object in
+		 compilation unit, merely construction vtables will be involved,
+		 its primary vtable is really not needed, and subject to being
+		 removed.  So once a vtable node is encountered, for all
+		 polymorphic base classes of the vtable's context class, always
+		 force generation of primary vtable nodes when full
+		 devirtualization is enabled.  */
+	      if (flag_devirtualize_fully
+		  && vnode && DECL_VIRTUAL_P (vnode->decl)
+		  && DECL_CONTEXT (vnode->decl))
+		{
+		  tree binfo = TYPE_BINFO (DECL_CONTEXT (vnode->decl));
+
+		  gcc_assert (binfo);
+		  for (unsigned i = 0; i < BINFO_N_BASE_BINFOS (binfo); i++)
+		    {
+		      tree base_type = TREE_TYPE (BINFO_BASE_BINFO (binfo, i));
+		      tree base_vtable = BINFO_VTABLE (TYPE_BINFO (base_type));
+		      unsigned HOST_WIDE_INT offset;
+
+		      if (base_vtable
+			  && vtable_pointer_value_to_vtable (base_vtable,
+							     &base_vtable,
+							     &offset))
+			varpool_node::get_create (base_vtable);
+		    }
+		}
 	    }
 
 	  if (node->same_comdat_group)
diff --git a/gcc/common.opt b/gcc/common.opt
index 66dc583455e..a0235611a23 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1273,6 +1273,10 @@  fdevirtualize-speculatively
 Common Var(flag_devirtualize_speculatively) Optimization
 Perform speculative devirtualization.
 
+fdevirtualize-fully
+Common Var(flag_devirtualize_fully) Optimization
+Perform full devirtualization of virtual calls when all targets are known under the whole-program assumption.
+
 fdevirtualize
 Common Var(flag_devirtualize) Optimization
 Try to convert virtual calls to direct ones.
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index fe225c61a62..4b24989f329 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -10264,7 +10264,7 @@  build_rtti_vtbl_entries (tree binfo, vtbl_init_data* vid)
 			BINFO_OFFSET (vid->rtti_binfo), BINFO_OFFSET (b));
 
   /* The second entry is the address of the typeinfo object.  */
-  if (flag_rtti)
+  if (flag_rtti || (flag_devirtualize_fully && flag_lto))
     decl = build_address (get_tinfo_decl (t));
   else
     decl = integer_zero_node;
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 8deec75b2df..fcb097d7156 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -216,72 +216,122 @@  struct GTY(()) odr_type_d
   int id;
   /* Is it in anonymous namespace? */
   bool anonymous_namespace;
-  /* Do we know about all derivations of given type?  */
-  bool all_derivations_known;
+  /* Set when type is not used outside of program.  */
+  bool whole_program_local;
   /* Did we report ODR violation here?  */
   bool odr_violated;
   /* Set when virtual table without RTTI prevailed table with.  */
   bool rtti_broken;
   /* Set when the canonical type is determined using the type name.  */
   bool tbaa_enabled;
+  /* Set when type contains virtual base.  */
+  bool has_virtual_base;
+
+  bool whole_program_local_p ();
+
+  /* Do we know about all derivations of given type?  */
+  bool all_derivations_known_p ()
+  {
+    if (!RECORD_OR_UNION_TYPE_P (type))
+      return true;
+
+    if (TYPE_FINAL_P (type))
+      return true;
+
+    return whole_program_local_p ();
+  }
+
+  bool possibly_instantiated_p ();
+
+  void set_has_virtual_base ()
+  {
+    if (!has_virtual_base)
+      {
+	unsigned i;
+	odr_type derived;
+
+	has_virtual_base = true;
+
+	FOR_EACH_VEC_ELT (derived_types, i, derived)
+	  derived->set_has_virtual_base ();
+      }
+  }
 };
 
-/* Return TRUE if all derived types of T are known and thus
-   we may consider the walk of derived type complete.
+/* Given TYPE, return its primary vtable if it is a polymorphic class,
+   otherwise return NULL.  */
+
+static inline tree
+get_type_vtable (tree type)
+{
+  tree binfo = TYPE_BINFO (type);
 
-   This is typically true only for final anonymous namespace types and types
+  if (!binfo)
+    return NULL;
+
+  tree vtable = BINFO_VTABLE (binfo);
+  unsigned HOST_WIDE_INT offset;
+
+  if (!vtable || !vtable_pointer_value_to_vtable (vtable, &vtable, &offset))
+    return NULL;
+
+  return vtable;
+}
+
+/* Return TRUE if the ODR type is local in whole-program scope.
+
+   This is typically true for final anonymous namespace types and types
    defined within functions (that may be COMDAT and thus shared across units,
-   but with the same set of derived types).  */
+   but with the same set of derived types).
+
+   A FINAL type is not necessarily whole-program local, since it might
+   be used in an external library.  */
 
 bool
-type_all_derivations_known_p (const_tree t)
+odr_type_d::whole_program_local_p ()
 {
-  if (TYPE_FINAL_P (t))
-    return true;
   if (flag_ltrans)
     return false;
-  /* Non-C++ types may have IDENTIFIER_NODE here, do not crash.  */
-  if (!TYPE_NAME (t) || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL)
-    return true;
-  if (type_in_anonymous_namespace_p (t))
-    return true;
-  return (decl_function_context (TYPE_NAME (t)) != NULL);
+
+  return whole_program_local;
 }
 
-/* Return TRUE if type's constructors are all visible.  */
+/* Return TRUE if ODR type may have any instance.  */
 
-static bool
-type_all_ctors_visible_p (tree t)
+bool
+odr_type_d::possibly_instantiated_p ()
 {
-  return !flag_ltrans
-	 && symtab->state >= CONSTRUCTION
-	 /* We cannot always use type_all_derivations_known_p.
-	    For function local types we must assume case where
-	    the function is COMDAT and shared in between units.
-
-	    TODO: These cases are quite easy to get, but we need
-	    to keep track of C++ privatizing via -Wno-weak
-	    as well as the  IPA privatizing.  */
-	 && type_in_anonymous_namespace_p (t);
-}
+  gcc_assert (symtab->state >= CONSTRUCTION);
 
-/* Return TRUE if type may have instance.  */
+  if (!RECORD_OR_UNION_TYPE_P (type) || !whole_program_local_p ())
+    return true;
 
-static bool
-type_possibly_instantiated_p (tree t)
-{
-  tree vtable;
-  varpool_node *vnode;
+  tree vtable = get_type_vtable (type);
 
-  /* TODO: Add abstract types here.  */
-  if (!type_all_ctors_visible_p (t))
+  if (!vtable)
     return true;
 
-  vtable = BINFO_VTABLE (TYPE_BINFO (t));
-  if (TREE_CODE (vtable) == POINTER_PLUS_EXPR)
-    vtable = TREE_OPERAND (TREE_OPERAND (vtable, 0), 0);
-  vnode = varpool_node::get (vtable);
-  return vnode && vnode->definition;
+  varpool_node *vtable_node = varpool_node::get (vtable);
+
+  /* A leaf class or a class w/o virtual base has only one vtable, so
+     checking availability of the primary vtable is enough.
+
+     TODO: handle possible construction vtables when virtual inheritance
+     exists.  */
+  if (!has_virtual_base || derived_types.is_empty ())
+    return vtable_node && vtable_node->definition;
+
+  return true;
+}
+
+/* Return TRUE if T is local in whole-program scope.  */
+
+static inline bool
+type_whole_program_local_p (tree t)
+{
+  odr_type type = get_odr_type (t);
+
+  return type && type->whole_program_local_p ();
 }
 
 /* Hash used to unify ODR types based on their mangled name and for anonymous
@@ -1953,9 +2003,15 @@  get_odr_type (tree type, bool insert)
       val->bases = vNULL;
       val->derived_types = vNULL;
       if (type_with_linkage_p (type))
-        val->anonymous_namespace = type_in_anonymous_namespace_p (type);
+	val->anonymous_namespace = type_in_anonymous_namespace_p (type);
       else
 	val->anonymous_namespace = 0;
+
+      if (!in_lto_p
+	  && (val->anonymous_namespace
+	      || decl_function_context (TYPE_NAME (type))))
+	val->whole_program_local = true;
+
       build_bases = COMPLETE_TYPE_P (val->type);
       insert_to_odr_array = true;
       *slot = val;
@@ -1970,7 +2026,6 @@  get_odr_type (tree type, bool insert)
 
       gcc_assert (BINFO_TYPE (TYPE_BINFO (val->type)) == type);
 
-      val->all_derivations_known = type_all_derivations_known_p (type);
       for (i = 0; i < BINFO_N_BASE_BINFOS (binfo); i++)
 	/* For now record only polymorphic types. other are
 	   pointless for devirtualization and we cannot precisely
@@ -1980,6 +2035,12 @@  get_odr_type (tree type, bool insert)
 	    tree base_type= BINFO_TYPE (BINFO_BASE_BINFO (binfo, i));
 	    odr_type base = get_odr_type (base_type, true);
 	    gcc_assert (TYPE_MAIN_VARIANT (base_type) == base_type);
+
+	    /* Propagate has_virtual_base flag to all derived classes.  */
+	    if (base->has_virtual_base
+		|| BINFO_VIRTUAL_P (BINFO_BASE_BINFO (binfo, i)))
+	      val->set_has_virtual_base ();
+
 	    base->derived_types.safe_push (val);
 	    val->bases.safe_push (base);
 	    if (base->id > base_id)
@@ -2107,15 +2168,35 @@  register_odr_type (tree type)
     }
 }
 
+/* Return TRUE if all derived types of T are known and thus
+   we may consider the walk of derived type complete.  */
+
+bool
+type_all_derivations_known_p (tree t)
+{
+  /* Non-C++ types may have IDENTIFIER_NODE here, do not crash.  */
+  if (!TYPE_NAME (t) || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL)
+    return true;
+
+  if (!odr_hash || !can_be_name_hashed_p (t))
+    return false;
+
+  odr_type type = get_odr_type (t, true);
+
+  return type->all_derivations_known_p ();
+}
+
 /* Return true if type is known to have no derivations.  */
 
 bool
 type_known_to_have_no_derivations_p (tree t)
 {
-  return (type_all_derivations_known_p (t)
-	  && (TYPE_FINAL_P (t)
-	      || (odr_hash
-		  && !get_odr_type (t, true)->derived_types.length())));
+  if (!odr_hash || !can_be_name_hashed_p (t))
+    return false;
+
+  odr_type type = get_odr_type (t, true);
+
+  return type->all_derivations_known_p () && !type->derived_types.length();
 }
 
 /* Dump ODR type T and all its derived types.  INDENT specifies indentation for
@@ -2127,8 +2208,15 @@  dump_odr_type (FILE *f, odr_type t, int indent=0)
   unsigned int i;
   fprintf (f, "%*s type %i: ", indent * 2, "", t->id);
   print_generic_expr (f, t->type, TDF_SLIM);
-  fprintf (f, "%s", t->anonymous_namespace ? " (anonymous namespace)":"");
-  fprintf (f, "%s\n", t->all_derivations_known ? " (derivations known)":"");
+  if (t->anonymous_namespace)
+    fprintf (f, " (anonymous namespace)");
+  if (t->has_virtual_base)
+    fprintf (f, " (virtual base)");
+  if (t->whole_program_local_p ())
+    fprintf (f, " (whole program local)");
+  else if (t->all_derivations_known_p ())
+    fprintf (f, " (derivations known)");
+  fprintf (f, "\n");
   if (TYPE_NAME (t->type))
     {
       if (DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t->type)))
@@ -2309,6 +2397,219 @@  build_type_inheritance_graph (void)
   timevar_pop (TV_IPA_INHERITANCE);
 }
 
+/* Return typeinfo referenced by vtable represented by VNODE.  If REMOVE
+   is TRUE, replace all typeinfo addresses in vtable with zero values.  */
+
+static symtab_node *
+extract_typeinfo_in_vtable (varpool_node *vnode, bool remove)
+{
+  symtab_node *typeinfo = NULL;
+  ipa_ref *ref;
+
+  for (unsigned i = 0; vnode->iterate_reference (i, ref); i++)
+    {
+      symtab_node *referred = ref->referred;
+
+      if (ref->use == IPA_REF_ADDR && is_a <varpool_node *> (referred))
+	{
+	  gcc_assert (!DECL_VIRTUAL_P (referred->decl));
+
+	  if (!remove)
+	    return referred;
+
+	  if (typeinfo)
+	    gcc_assert (typeinfo == referred);
+	  else
+	    typeinfo = referred;
+
+	  ref->remove_reference ();
+	  i--;
+	}
+    }
+
+  if (!typeinfo)
+    return NULL;
+
+  tree init = vnode->get_constructor ();
+
+  gcc_assert (init != error_mark_node);
+
+  for (unsigned i = 0; i < CONSTRUCTOR_NELTS (init); i++)
+    {
+      constructor_elt *elt = CONSTRUCTOR_ELT (init, i);
+      tree value = elt->value;
+
+      if (CONVERT_EXPR_P (value))
+	value = TREE_OPERAND (value, 0);
+
+      if (TREE_CODE (value) == ADDR_EXPR)
+	{
+	  value = TREE_OPERAND (value, 0);
+
+	  if (VAR_P (value))
+	    {
+	      gcc_assert (typeinfo->decl == value);
+	      elt->value = build_zero_cst (TREE_TYPE (elt->value));
+	    }
+	  else
+	    gcc_assert (TREE_CODE (value) == FUNCTION_DECL);
+	}
+    }
+
+  return typeinfo;
+}
+
+/* Identify C++ polymorphic classes that are local to the LTO whole-program
+   scope.  If full devirtualization is off, we only consider classes local
+   to a function or an anonymous namespace.  Otherwise, we rely on
+   lto-linker-plugin symbol resolution to check whether the vtable and typeinfo
+   symbols of a given class are referenced by any external regular object or
+   library.  Since typeinfo carries the symbol resolution information, it is
+   always generated even if the user specifies -fno-rtti at LGEN compilation,
+   and removal of typeinfo is postponed to this procedure.  */
+
+void
+identify_whole_program_local_types (void)
+{
+  hash_set<lto_file_decl_data *> *no_rtti_files = NULL;
+  varpool_node *vnode;
+
+  gcc_assert (in_lto_p && !flag_ltrans);
+
+  if (!odr_hash)
+    return;
+
+  if (flag_devirtualize_fully)
+    {
+      lto_file_decl_data *prev_file_data = NULL;
+      cgraph_node *cnode;
+
+      /* Effect of -fno-rtti is file-wide, collect those files that -fno-rtti
+	 has been specified for.  */
+      FOR_EACH_FUNCTION (cnode)
+	{
+	  lto_file_decl_data *file_data = cnode->lto_file_data;
+
+	  if (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (cnode->decl)
+	      && !opt_for_fn (cnode->decl, flag_rtti)
+	      && prev_file_data != file_data && file_data)
+	    {
+	      if (!no_rtti_files)
+		no_rtti_files = new hash_set<lto_file_decl_data *> ();
+	      no_rtti_files->add (file_data);
+
+	      /* Symbols coming from same lto file are likely to be grouped
+		 together. Based on this fact, here is a minor optimization
+		 that we do not add lto file if it is the one just added.  */
+	      prev_file_data = file_data;
+	    }
+	}
+    }
+
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    {
+      tree decl = vnode->decl;
+      symtab_node *typeinfo = NULL;
+
+      if (!DECL_VIRTUAL_P (decl) || DECL_EXTERNAL (decl))
+	continue;
+
+      if (flag_devirtualize_fully)
+	{
+	  bool remove_rtti = no_rtti_files
+			     && no_rtti_files->contains (vnode->lto_file_data);
+
+	  /* Remove typeinfo if the vtable referring to it originates from a
+	     file that is affected by -fno-rtti.  */
+	  typeinfo = extract_typeinfo_in_vtable (vnode, remove_rtti);
+	}
+
+      tree class_type = DECL_CONTEXT (decl);
+      odr_type type = get_odr_type (class_type);
+
+      if (!type->anonymous_namespace
+	  && !decl_function_context (TYPE_NAME (class_type)))
+	{
+	  /* This is a VTT (vtable table), or user specified -fno-rtti, but
+	     forgot -fdevirtualize-fully at LGEN compilation.  */
+	  if (!typeinfo)
+	    continue;
+
+	  /* If the vtable of a public class has no linkage (occurs with
+	     -fno-weak), lto-linker-plugin cannot resolve the symbol and tells
+	     us nothing, so conservatively skip the class.  */
+	  if (!TREE_PUBLIC (decl))
+	    continue;
+
+	  /* A symbol resolved as LDPR_PREVAILING_DEF_IRONLY_EXP allows
+	     external references from a dynamic library, which cannot be
+	     known at link time.  */
+	  if (vnode->resolution != LDPR_PREVAILING_DEF_IRONLY
+	      || typeinfo->resolution != LDPR_PREVAILING_DEF_IRONLY)
+	    continue;
+	}
+
+      /* Skip class type that violates odr constraint, since types in conflict
+	 may have incompatible vtable definition.  */
+      if (type->odr_violated)
+	{
+	  gcc_assert (!type->anonymous_namespace);
+	  continue;
+	}
+
+      tree vtable = get_type_vtable (class_type);
+
+      gcc_assert (vtable);
+
+      if (vtable != vnode->decl)
+	{
+	  /* This is not the primary vtable of a class type, just a
+	     construction vtable for one of its virtual bases.  */
+	  gcc_assert (type->has_virtual_base);
+	  continue;
+	}
+
+      if (type->types)
+	{
+	  tree equiv_type;
+	  bool multi_vtable_p = false;
+	  unsigned i;
+
+	  /* A public class w/o key member function (or a local class in a
+	     public inline function) requires a COMDAT-like vtable so that it
+	     can be shared among units.  But C++ privatizing via -fno-weak
+	     introduces multiple static vtable copies for one class in the
+	     merged lto symbol table.  This breaks the one-to-one
+	     correspondence between class and vtable and complicates the class
+	     liveness check.  For simplicity, we exclude such classes.
+
+	     TODO: lto_symtab_merge_symbols() currently only merges public and
+	     external symbols, it could be extended to combine identical
+	     static symbols similar to COMDAT.  */
+	  if (class_type != type->type
+	      && (vtable != get_type_vtable (type->type)))
+	    continue;
+
+	  FOR_EACH_VEC_ELT (*(type->types), i, equiv_type)
+	    {
+	      if (COMPLETE_TYPE_P (equiv_type)
+		  && (vtable != get_type_vtable (equiv_type)))
+		{
+		  multi_vtable_p = true;
+		  break;
+		}
+	    }
+
+	  if (multi_vtable_p)
+	    continue;
+	}
+
+      type->whole_program_local = true;
+    }
+
+  delete no_rtti_files;
+}
+
 /* Return true if N has reference from live virtual table
    (and thus can be a destination of polymorphic call). 
    Be conservatively correct when callgraph is not built or
@@ -2385,11 +2686,9 @@  maybe_record_node (vec <cgraph_node *> &nodes,
 
   if (!can_refer)
     {
-      /* The only case when method of anonymous namespace becomes unreferable
-	 is when we completely optimized it out.  */
-      if (flag_ltrans
-	  || !target 
-	  || !type_in_anonymous_namespace_p (DECL_CONTEXT (target)))
+      /* The only case when method of whole-program local class type becomes
+	 unreferable is when we completely optimized it out.  */
+      if (!target || !type_whole_program_local_p (DECL_CONTEXT (target)))
 	*completep = false;
       return;
     }
@@ -2472,8 +2771,8 @@  maybe_record_node (vec <cgraph_node *> &nodes,
       if (flag_sanitize & SANITIZE_UNREACHABLE)
 	*completep = false;
     }
-  else if (flag_ltrans
-	   || !type_in_anonymous_namespace_p (DECL_CONTEXT (target)))
+  else if (!type_whole_program_local_p (DECL_CONTEXT (target))
+	   || DECL_EXTERNAL (target))
     *completep = false;
 }
 
@@ -2492,7 +2791,8 @@  maybe_record_node (vec <cgraph_node *> &nodes,
    for virtual function in. INSERTED tracks nodes we already
    inserted.
 
-   ANONYMOUS is true if BINFO is part of anonymous namespace.
+   POSSIBLY_INSTANTIATED is true if there might exist instance
+   of type.
 
    Clear COMPLETEP when we hit unreferable target.
   */
@@ -2508,7 +2808,7 @@  record_target_from_binfo (vec <cgraph_node *> &nodes,
 			  HOST_WIDE_INT offset,
 			  hash_set<tree> *inserted,
 			  hash_set<tree> *matched_vtables,
-			  bool anonymous,
+			  bool possibly_instantiated,
 			  bool *completep)
 {
   tree type = BINFO_TYPE (binfo);
@@ -2545,19 +2845,11 @@  record_target_from_binfo (vec <cgraph_node *> &nodes,
 	  gcc_assert (odr_violation_reported);
 	  return;
 	}
-      /* For types in anonymous namespace first check if the respective vtable
-	 is alive. If not, we know the type can't be called.  */
-      if (!flag_ltrans && anonymous)
-	{
-	  tree vtable = BINFO_VTABLE (inner_binfo);
-	  varpool_node *vnode;
+      /* For whole-program local types first check if the respective vtable is
+	 alive. If not, we know the type can't be called.  */
+      if (!possibly_instantiated)
+	return;
 
-	  if (TREE_CODE (vtable) == POINTER_PLUS_EXPR)
-	    vtable = TREE_OPERAND (TREE_OPERAND (vtable, 0), 0);
-	  vnode = varpool_node::get (vtable);
-	  if (!vnode || !vnode->definition)
-	    return;
-	}
       gcc_assert (inner_binfo);
       if (bases_to_consider
 	  ? !matched_vtables->contains (BINFO_VTABLE (inner_binfo))
@@ -2583,7 +2875,8 @@  record_target_from_binfo (vec <cgraph_node *> &nodes,
       record_target_from_binfo (nodes, bases_to_consider, base_binfo, otr_type,
 				type_binfos, 
 				otr_token, outer_type, offset, inserted,
-				matched_vtables, anonymous, completep);
+				matched_vtables, possibly_instantiated,
+				completep);
   if (BINFO_VTABLE (binfo))
     type_binfos.pop ();
 }
@@ -2614,7 +2907,7 @@  possible_polymorphic_call_targets_1 (vec <cgraph_node *> &nodes,
   tree binfo = TYPE_BINFO (type->type);
   unsigned int i;
   auto_vec <tree, 8> type_binfos;
-  bool possibly_instantiated = type_possibly_instantiated_p (type->type);
+  bool possibly_instantiated = type->possibly_instantiated_p ();
 
   /* We may need to consider types w/o instances because of possible derived
      types using their methods either directly or via construction vtables.
@@ -2625,12 +2918,12 @@  possible_polymorphic_call_targets_1 (vec <cgraph_node *> &nodes,
     {
       record_target_from_binfo (nodes,
 				(!possibly_instantiated
-				 && type_all_derivations_known_p (type->type))
+				 && type->all_derivations_known_p ())
 				? &bases_to_consider : NULL,
 				binfo, otr_type, type_binfos, otr_token,
 				outer_type, offset,
 				inserted, matched_vtables,
-				type->anonymous_namespace, completep);
+				possibly_instantiated, completep);
     }
   for (i = 0; i < type->derived_types.length (); i++)
     possible_polymorphic_call_targets_1 (nodes, inserted, 
@@ -2948,7 +3241,7 @@  devirt_variable_node_removal_hook (varpool_node *n,
 {
   if (cached_polymorphic_call_targets
       && DECL_VIRTUAL_P (n->decl)
-      && type_in_anonymous_namespace_p (DECL_CONTEXT (n->decl)))
+      && type_whole_program_local_p (DECL_CONTEXT (n->decl)))
     free_polymorphic_call_targets_hash ();
 }
 
@@ -3189,7 +3482,7 @@  possible_polymorphic_call_targets (tree otr_type,
 	 to walk derivations.  */
       if (target && DECL_FINAL_P (target))
 	context.speculative_maybe_derived_type = false;
-      if (type_possibly_instantiated_p (speculative_outer_type->type))
+      if (speculative_outer_type->possibly_instantiated_p ())
 	maybe_record_node (nodes, target, &inserted, can_refer, &speculation_complete);
       if (binfo)
 	matched_vtables.add (BINFO_VTABLE (binfo));
@@ -3237,7 +3530,7 @@  possible_polymorphic_call_targets (tree otr_type,
 	}
 
       /* If OUTER_TYPE is abstract, we know we are not seeing its instance.  */
-      if (type_possibly_instantiated_p (outer_type->type))
+      if (outer_type->possibly_instantiated_p ())
 	maybe_record_node (nodes, target, &inserted, can_refer, &complete);
       else
 	skipped = true;
@@ -3258,7 +3551,7 @@  possible_polymorphic_call_targets (tree otr_type,
 						 bases_to_consider,
 						 context.maybe_in_construction);
 
-	  if (!outer_type->all_derivations_known)
+	  if (!outer_type->all_derivations_known_p ())
 	    {
 	      if (!speculative && final_warning_records
 		  && nodes.length () == 1
@@ -3323,7 +3616,7 @@  possible_polymorphic_call_targets (tree otr_type,
 	      if (type != outer_type
 		  && (!skipped
 		      || (context.maybe_derived_type
-			  && !type_all_derivations_known_p (outer_type->type))))
+			  && !outer_type->all_derivations_known_p ())))
 		record_targets_from_bases (otr_type, otr_token, outer_type->type,
 					   context.offset, nodes, &inserted,
 					   &matched_vtables, &complete);
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 4c1f25d0834..701e38d25de 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -2411,7 +2411,9 @@  sem_item_optimizer::filter_removed_items (void)
 	    {
 	      /* Filter out non-readonly variables.  */
 	      tree decl = item->decl;
-	      if (TREE_READONLY (decl))
+
+	      if (TREE_READONLY (decl)
+		  && (!flag_devirtualize_fully || !DECL_VIRTUAL_P (decl)))
 		filtered.safe_push (item);
 	      else
 		remove_item (item);
diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index 06c200c9cc9..5d4d5d29ccf 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -230,7 +230,6 @@  ipa_polymorphic_call_context::restrict_to_inner_class (tree otr_type,
 	      /* If type is known to be final, do not worry about derived
 		 types.  Testing it here may help us to avoid speculation.  */
 	      if (otr_type && TREE_CODE (outer_type) == RECORD_TYPE
-		  && (!in_lto_p || odr_type_p (outer_type))
 		  && type_with_linkage_p (outer_type)
 		  && type_known_to_have_no_derivations_p (outer_type))
 		maybe_derived_type = false;
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 3cfaf2d2737..fe393284862 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -58,6 +58,7 @@  extern bool thunk_expansion;
 void build_type_inheritance_graph (void);
 void rebuild_type_inheritance_graph (void);
 void update_type_inheritance_graph (void);
+void identify_whole_program_local_types (void);
 vec <cgraph_node *>
 possible_polymorphic_call_targets (tree, HOST_WIDE_INT,
 				   ipa_polymorphic_call_context,
@@ -80,7 +81,7 @@  tree vtable_pointer_value_to_binfo (const_tree);
 bool vtable_pointer_value_to_vtable (const_tree, tree *, unsigned HOST_WIDE_INT *);
 tree subbinfo_with_vtable_at_offset (tree, unsigned HOST_WIDE_INT, tree);
 void compare_virtual_tables (varpool_node *, varpool_node *);
-bool type_all_derivations_known_p (const_tree);
+bool type_all_derivations_known_p (tree);
 bool type_known_to_have_no_derivations_p (tree);
 bool contains_polymorphic_type_p (const_tree);
 void register_odr_type (tree);
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 4f62ac183ee..6e200a906b5 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -197,6 +197,13 @@  walk_polymorphic_call_targets (hash_set<void *> *reachable_call_targets,
 	    continue;
 
 	  n->indirect_call_target = true;
+
+	  /* Here, do not mark targets as reachable for full devirtualization,
+	     which could enable elimination of member functions of class type
+	     that is never instantiated.  */
+	  if (flag_devirtualize_fully)
+	    continue;
+
 	  symtab_node *body = n->function_symbol ();
 
 	  /* Prior inlining, keep alive bodies of possible targets for
@@ -252,6 +259,26 @@  walk_polymorphic_call_targets (hash_set<void *> *reachable_call_targets,
     }
 }
 
+/* Return false when vtable should be retained for purpose of full
+   devirtualization if there is no direct reference to it.  */
+
+static bool
+can_remove_vtable_if_no_refs_p (varpool_node *vnode)
+{
+  if (!flag_devirtualize_fully)
+    return true;
+
+  if (DECL_EXTERNAL (vnode->decl))
+    return true;
+
+  /* Force generation of vtables in the LGEN stage even if they are "unused",
+     since they carry information needed by devirtualization.  */
+  if (!in_lto_p && flag_generate_lto)
+    return false;
+
+  return true;
+}
+
 /* Perform reachability analysis and reclaim all unreachable nodes.
 
    The algorithm is basically mark&sweep but with some extra refinements:
@@ -350,8 +377,10 @@  symbol_table::remove_unreachable_nodes (FILE *file)
 
   /* Mark variables that are obviously needed.  */
   FOR_EACH_DEFINED_VARIABLE (vnode)
-    if (!vnode->can_remove_if_no_refs_p()
-	&& !vnode->in_other_partition)
+    if ((!vnode->can_remove_if_no_refs_p ()
+	 && !vnode->in_other_partition)
+	|| (DECL_VIRTUAL_P (vnode->decl)
+	    && !can_remove_vtable_if_no_refs_p (vnode)))
       {
 	reachable.add (vnode);
 	enqueue_node (vnode, &first, &reachable);
diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index a4d70150d7f..a297c3e52e8 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -1077,6 +1077,9 @@  lto_symtab_merge_symbols (void)
 	      node->decl->decl_with_vis.symtab_node = node;
 	    }
 	}
+
+      if (flag_devirtualize)
+	identify_whole_program_local_types ();
     }
 }
 
-- 
2.17.1
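
As a reading aid for identify_whole_program_local_types (), the per-vtable
filter in the patch can be restated outside GCC as a small standalone sketch.
Everything named below (Resolution, VtableFacts, is_whole_program_local) is
hypothetical and only mirrors the order of the checks in the patch; it is not
GCC code, and it leaves out the -fno-weak multiple-vtable check and the
typeinfo removal for -fno-rtti files.

```cpp
// Standalone sketch, NOT GCC code: every type and name here is hypothetical.
enum class Resolution {
  Unknown,
  PrevailingDefIronly,     // stands in for LDPR_PREVAILING_DEF_IRONLY
  PrevailingDefIronlyExp,  // stands in for LDPR_PREVAILING_DEF_IRONLY_EXP
};

// Facts the patch reads off one defined vtable node and its class type.
struct VtableFacts {
  bool in_anonymous_namespace_or_function;  // class is local by construction
  bool has_typeinfo;                        // vtable references a typeinfo
  bool is_public;                           // vtable symbol has linkage
  Resolution vtable_resolution;
  Resolution typeinfo_resolution;
  bool odr_violated;
  bool is_primary_vtable;                   // not a construction vtable
};

// Returns true when this vtable would let the class be marked
// whole-program local, following the same order of checks as the patch.
bool is_whole_program_local(const VtableFacts &v) {
  if (!v.in_anonymous_namespace_or_function) {
    // No typeinfo means no resolution information to rely on.
    if (!v.has_typeinfo)
      return false;
    // A vtable without linkage is not resolved by the linker plugin.
    if (!v.is_public)
      return false;
    // Both symbols must be visible only to IR; IRONLY_EXP still allows
    // references from dynamic libraries.
    if (v.vtable_resolution != Resolution::PrevailingDefIronly
        || v.typeinfo_resolution != Resolution::PrevailingDefIronly)
      return false;
  }
  // Conflicting ODR definitions may carry incompatible vtables.
  if (v.odr_violated)
    return false;
  // Only the primary vtable decides the flag.
  return v.is_primary_vtable;
}
```

In this sketch a class in an anonymous namespace passes without any
resolution data, while a public class passes only if both its vtable and
typeinfo resolve as IR-only, which matches the intent described in the
commit message.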