From patchwork Wed Apr 17 07:42:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 88573 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0E3FB3858C62 for ; Wed, 17 Apr 2024 07:43:27 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id EB5C33858D20 for ; Wed, 17 Apr 2024 07:42:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EB5C33858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EB5C33858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713339781; cv=none; b=xymeND3+vg0xiFZCsjBgB1LQ2IyzL9En7nywsxzUBPgheAFS2rn/Gmiu7o5keHKnbcVSXi+D3hXsLEvRExK23yjxzMHZQbSWNex2Jtq3xIrNbEDCSbheZBPkY3mbCrdmQz++5g8ayJlTC6vYGLLSsziwg/kNsIZPTawTkT4PEnQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713339781; c=relaxed/simple; bh=fHicrXcpdlLwEQkH9oaT65nD95iXlEJwXKRcK3Nu3qM=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=iAgSH/5mGDEVPLD0YuuhPs/GoRi4nf6jf+lbHWc2urNJCudS8bSVU4mVH+h5HyKjhIElwdDQxFsEBUCa/zmOzoxQA8oU3AY6siRydhPisHoGRG/F79CHiQSbqQ1CmjRIGi2Mcer5mq9PYSI+fF9OKEJSr7D46Jxn/adQRmhAMeo= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1713339778; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type; bh=gD/5UOFtUqe30qUvzhIhem6+W60deJwKLsepIlJzB4g=; b=FYIEhE5T1JUJaX3z4sdWpVJWlFZ6J1X9dEfQfgxJuvAAzgDkD1FL72mMAe9PKGtsyg0e6S VsJhlgpRnICYQ6iboVtl83bZ/Sf8KiHL+LNcJmTVH5IRd9r5Zdn6KfNftHK3anmlizMTc6 MmTGgaeZgCL19SXc4ascqOEjEQq9vVE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-160-XfwE8Qc9Mnu_gTxKAxU-vw-1; Wed, 17 Apr 2024 03:42:54 -0400 X-MC-Unique: XfwE8Qc9Mnu_gTxKAxU-vw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 79CF2806528; Wed, 17 Apr 2024 07:42:54 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.45.224.5]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 3B879202451A; Wed, 17 Apr 2024 07:42:54 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 43H7gmgt2655544 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Wed, 17 Apr 2024 09:42:48 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 43H7gld82655543; Wed, 17 Apr 2024 09:42:47 +0200 Date: Wed, 17 Apr 2024 09:42:47 +0200 From: Jakub Jelinek To: Jason Merrill , Jan Hubicka , Richard Biener , Patrick Palka Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208] Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Disposition: inline X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Jakub Jelinek Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Hi! When expand_or_defer_fn is called at_eof time, it calls import_export_decl and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a couple of places to try to optimize cdtors which are known to have the same body by making the complete cdtor an alias to base cdtor (and in that case also uses *[CD]5* as comdat group name instead of the normal comdat group names specific to each mangled name). Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN, maybe_clone_body and can_alias_cdtor use: if (DECL_ONE_ONLY (fn)) cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone)); ... bool can_alias = can_alias_cdtor (fn); ... /* Tell cgraph if both ctors or both dtors are known to have the same body. */ if (can_alias && fns[0] && idx == 1 && cgraph_node::get_create (fns[0])->create_same_body_alias (clone, fns[0])) { alias = true; if (DECL_ONE_ONLY (fns[0])) { /* For comdat base and complete cdtors put them into the same, *[CD]5* comdat group instead of *[CD][12]*. */ comdat_group = cdtor_comdat_group (fns[1], fns[0]); cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group); if (symtab_node::get (clone)->same_comdat_group) symtab_node::get (clone)->remove_from_same_comdat_group (); symtab_node::get (clone)->add_to_same_comdat_group (symtab_node::get (fns[0])); } } and /* Don't use aliases for weak/linkonce definitions unless we can put both symbols in the same COMDAT group. */ return (DECL_INTERFACE_KNOWN (fn) && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) && (!DECL_ONE_ONLY (fn) || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); The following testcase regressed with Marek's r14-5979 change, when pr113208_0.C is compiled where the ctor is marked constexpr, we no longer perform this optimization, where _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. Now, the linker seems to somehow cope with that, eventhough it probably keeps both copies of the ctor, but seems LTO can't cope with that and Honza doesn't know what it should do in that case (linker decides that the prevailing symbol is _ZN6vectorI12QualityValueEC2ERKS1_ (from the _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU, from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)). Note, the case where some constructor is marked constexpr in one TU and not in another one happens pretty often in libstdc++ when one mixes -std= flags used to compile different compilation units. The reason the optimization doesn't trigger when the constructor is constexpr is that expand_or_defer_fn is called in that case much earlier than when it is not constexpr; in the former case it is called when we try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN is false in that case and comdat_linkage hasn't been called either (I think it is desirable, because comdat group is stored in the cgraph node and am not sure it is a good idea to create cgraph nodes for something that might not be needed later on at all), so maybe_clone_body clones the bodies, but doesn't make them as aliases. The following patch is an attempt to redo this optimization when import_export_decl is called at_eof time on the base/complete cdtor (or deleting dtor). It will not do anything if maybe_clone_body hasn't been called uyet (the TREE_ASM_WRITTEN check on the DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete cdtors have been lowered already, or when maybe_clone_body called maybe_thunk_body and it was successful. Otherwise retries the can_alias_cdtor check and makes the complete cdtor alias to the base cdtor with adjustments to the comdat group. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-04-17 Jakub Jelinek PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Declare. * decl2.cc (import_export_decl): Call it for cloned cdtors. * optimize.cc (maybe_optimize_cdtor): New function. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. Jakub --- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200 +++ gcc/cp/cp-tree.h 2024-04-16 17:45:01.685635709 +0200 @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign /* In optimize.cc */ extern tree clone_attrs (tree); extern bool maybe_clone_body (tree); +extern void maybe_optimize_cdtor (tree); /* In parser.cc */ extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, --- gcc/cp/decl2.cc.jj 2024-04-16 17:18:37.287279519 +0200 +++ gcc/cp/decl2.cc 2024-04-16 17:45:01.686635695 +0200 @@ -3568,6 +3568,9 @@ import_export_decl (tree decl) } DECL_INTERFACE_KNOWN (decl) = 1; + + if (DECL_CLONED_FUNCTION_P (decl)) + maybe_optimize_cdtor (decl); } /* Return an expression that performs the destruction of DECL, which --- gcc/cp/optimize.cc.jj 2024-04-16 17:18:37.374278327 +0200 +++ gcc/cp/optimize.cc 2024-04-16 21:35:53.597188774 +0200 @@ -723,3 +723,58 @@ maybe_clone_body (tree fn) /* We don't need to process the original function any further. */ return 1; } + +/* If maybe_clone_body is called while the cdtor is still tentative, + DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). + In that case we wouldn't try to optimize using an alias and instead + would emit separate base and complete cdtor. The following function + attempts to still optimize that case when we import_export_decl + is called first time on one of the clones. */ + +void +maybe_optimize_cdtor (tree orig_decl) +{ + tree fns[3]; + tree fn = DECL_CLONED_FUNCTION (orig_decl); + gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); + if (DECL_INTERFACE_KNOWN (fn) + || !TREE_ASM_WRITTEN (fn) + || !DECL_ONE_ONLY (orig_decl) + || symtab->global_info_ready) + return; + + populate_clone_array (fn, fns); + + if (!fns[0] || !fns[1]) + return; + for (int i = 2 - !fns[2]; i >= 0; --i) + if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) + return; + DECL_INTERFACE_KNOWN (fn) = 1; + comdat_linkage (fn); + if (!can_alias_cdtor (fn)) + return; + /* For comdat base and complete cdtors put them into the same, + *[CD]5* comdat group instead of *[CD][12]*. */ + auto n0 = cgraph_node::get_create (fns[0]); + auto n1 = cgraph_node::get_create (fns[1]); + auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; + if (n0->lowered || n1->lowered || (n2 && n2->lowered)) + return; + import_export_decl (fns[0]); + n1->definition = false; + if (!n0->create_same_body_alias (fns[1], fns[0])) + return; + + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); + n1 = cgraph_node::get (fns[1]); + n0->set_comdat_group (comdat_group); + if (n1->same_comdat_group) + n1->remove_from_same_comdat_group (); + n1->add_to_same_comdat_group (n0); + if (fns[2]) + n2->add_to_same_comdat_group (n0); + import_export_decl (fns[1]); + /* Remove the body now that it is an alias. */ + DECL_SAVED_TREE (fns[1]) = NULL_TREE; +} --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector values1; +vector values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,10 @@ +template struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +};