From patchwork Mon Nov 22 09:36:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 47995
Date: Mon, 22 Nov 2021 10:36:07 +0100
To: Uros Bizjak
Subject: [PATCH] x86: Speed up target attribute handling by using a cache
Message-ID: <20211122093607.GJ2646553@tucnak>
List-Id: Gcc-patches mailing list
From: Jakub Jelinek
Reply-To: Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org

Hi!

The target attribute handling is very expensive and for the common case
from x86intrin.h where many functions get implicitly the same target
attribute, we can speed up compilation a lot by caching it.

The following patches both create a single-entry cache. For a particular
target attribute argument list, they cache the resulting
DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
values from ix86_valid_target_attribute_p, and use the cache if the args
are the same as last time and we start either from NULL values of those,
or from the values recorded for them last time.

Compiling a simple:
#include <x86intrin.h>
int i;
testcase with
./cc1 -quiet -O2 -isystem include/ test.c
takes on my WS without the patches ~0.392s and with either of the
patches ~0.182s, i.e. roughly half the time as before.
For
./cc1plus -quiet -O2 -isystem include/ test.c
the relative gain is smaller: the speed up is from ~0.613s to ~0.403s.

The difference between the 2 patches is that the first one uses copy_list
while the second one uses a vec, so I think the second one has the advantage
of creating less GC garbage.

I've verified that both patches produce the same content of those
DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
nodes as before on x86intrin.h, by calling debug_tree on them and
comparing the stderr output without these patches against the output
with them.

Both patches were bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk (and which one)?

2021-11-22  Jakub Jelinek

	* attribs.h (simple_cst_list_equal): Declare.
	* attribs.c (simple_cst_list_equal): No longer static.
	* config/i386/i386-options.c (target_attribute_cache): New variable.
	(ix86_valid_target_attribute_p): Cache DECL_FUNCTION_SPECIFIC_TARGET
	and DECL_FUNCTION_SPECIFIC_OPTIMIZATION based on args.

	Jakub

2021-11-22  Jakub Jelinek

	* config/i386/i386-options.c (target_attribute_cache): New variable.
	(ix86_valid_target_attribute_p): Cache DECL_FUNCTION_SPECIFIC_TARGET
	and DECL_FUNCTION_SPECIFIC_OPTIMIZATION based on args.

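To show the caching idea in isolation, here is a minimal standalone sketch.
It is not GCC code: Node, Decl, AttrCache and Validator are hypothetical
stand-ins for tree nodes, function decls and the cache variable, and the
"slow path" just allocates two fresh nodes instead of really processing the
attribute. It also always records an optimization node, whereas the patches
record NULL_TREE when the optimization node did not change.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a GC-allocated tree node.
struct Node { int id; };

// Hypothetical stand-in for a function decl with its two
// function-specific slots (target and optimization).
struct Decl {
  const Node *target = nullptr;
  const Node *optimize = nullptr;
};

// Single-entry cache: the last argument list and the nodes it produced.
struct AttrCache {
  bool valid = false;
  std::vector<std::string> args;
  const Node *target = nullptr;
  const Node *optimize = nullptr;
};

struct Validator {
  AttrCache cache;
  std::deque<Node> pool;  // deque keeps element addresses stable on push_back
  int slow_calls = 0;     // how often the expensive path ran

  const Node *make_node() {
    pool.push_back(Node{static_cast<int>(pool.size())});
    return &pool.back();
  }

  bool validate(Decl &d, const std::vector<std::string> &args) {
    // Fast path: same args as last time, and the decl starts either from
    // empty slots or from exactly the cached values.
    if (cache.valid
        && (d.target == cache.target || d.target == nullptr)
        && (d.optimize == cache.optimize || d.optimize == nullptr)
        && args == cache.args) {
      d.target = cache.target;
      d.optimize = cache.optimize;
      return true;
    }

    // Slow path: stands in for the expensive attribute processing.
    ++slow_calls;
    const Node *new_target = make_node();
    const Node *new_optimize = make_node();

    // Only record results computed from a clean starting state, so a
    // later hit reproduces what the slow path would have built.
    if (d.target == nullptr && d.optimize == nullptr) {
      cache.valid = true;
      cache.args = args;
      cache.target = new_target;
      cache.optimize = new_optimize;
    }

    d.target = new_target;
    d.optimize = new_optimize;
    return true;
  }
};
```

With this sketch, validating many decls in a row with the same argument
list runs the slow path once; a different argument list misses and replaces
the single entry.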
--- gcc/config/i386/i386-options.c.jj	2021-11-20 23:53:35.730637746 +0100
+++ gcc/config/i386/i386-options.c	2021-11-21 18:13:16.948659255 +0100
@@ -1403,6 +1403,8 @@ ix86_valid_target_attribute_tree (tree f
   return t;
 }
 
+static GTY(()) vec<tree, va_gc> *target_attribute_cache;
+
 /* Hook to validate attribute((target("string"))).  */
 
 bool
@@ -1423,6 +1425,28 @@ ix86_valid_target_attribute_p (tree fnde
       && strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0)
     return true;
 
+  if (target_attribute_cache
+      && (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == (*target_attribute_cache)[0]
+          || DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE)
+      && (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
+          == (*target_attribute_cache)[1]
+          || DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE))
+    {
+      tree a;
+      unsigned int n = vec_safe_length (target_attribute_cache);
+      unsigned int i = 2;
+      for (a = args; a && i < n; a = TREE_CHAIN (a), ++i)
+        if (simple_cst_equal ((*target_attribute_cache)[i], TREE_VALUE (a)) != 1)
+          break;
+      if (a == NULL_TREE && i == n)
+        {
+          DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = (*target_attribute_cache)[0];
+          DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
+            = (*target_attribute_cache)[1];
+          return true;
+        }
+    }
+
   tree old_optimize = build_optimization_node (&global_options,
                                                &global_options_set);
 
@@ -1459,8 +1483,19 @@ ix86_valid_target_attribute_p (tree fnde
   if (new_target == error_mark_node)
     ret = false;
 
-  else if (fndecl && new_target)
+  else if (new_target)
     {
+      if (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE
+          && DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
+        {
+          vec_safe_truncate (target_attribute_cache, 0);
+          vec_safe_push (target_attribute_cache, new_target);
+          vec_safe_push (target_attribute_cache, old_optimize != new_optimize
+                                                 ? new_optimize : NULL_TREE);
+          for (tree a = args; a; a = TREE_CHAIN (a))
+            vec_safe_push (target_attribute_cache, TREE_VALUE (a));
+        }
+
       DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target;
 
       if (old_optimize != new_optimize)

--- gcc/attribs.h.jj	2021-11-11 14:35:37.442350841 +0100
+++ gcc/attribs.h	2021-11-19 11:52:08.843252645 +0100
@@ -60,6 +60,7 @@ extern tree build_type_attribute_variant
 extern tree build_decl_attribute_variant (tree, tree);
 extern tree build_type_attribute_qual_variant (tree, tree, int);
 
+extern bool simple_cst_list_equal (const_tree, const_tree);
 extern bool attribute_value_equal (const_tree, const_tree);
 
 /* Return 0 if the attributes for two types are incompatible, 1 if they
--- gcc/attribs.c.jj	2021-11-11 14:35:37.442350841 +0100
+++ gcc/attribs.c	2021-11-19 11:51:43.473615692 +0100
@@ -1290,7 +1290,7 @@ cmp_attrib_identifiers (const_tree attr1
 /* Compare two constructor-element-type constants.  Return 1 if the lists
    are known to be equal; otherwise return 0.  */
 
-static bool
+bool
 simple_cst_list_equal (const_tree l1, const_tree l2)
 {
   while (l1 != NULL_TREE && l2 != NULL_TREE)
--- gcc/config/i386/i386-options.c.jj	2021-11-15 13:19:07.347900863 +0100
+++ gcc/config/i386/i386-options.c	2021-11-20 00:27:32.919505947 +0100
@@ -1403,6 +1403,8 @@ ix86_valid_target_attribute_tree (tree f
   return t;
 }
 
+static GTY(()) tree target_attribute_cache[3];
+
 /* Hook to validate attribute((target("string"))).  */
 
 bool
@@ -1423,6 +1425,19 @@ ix86_valid_target_attribute_p (tree fnde
       && strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0)
     return true;
 
+  if ((DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == target_attribute_cache[1]
+       || DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE)
+      && (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
+          == target_attribute_cache[2]
+          || DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
+      && simple_cst_list_equal (args, target_attribute_cache[0]))
+    {
+      DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_attribute_cache[1];
+      DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
+        = target_attribute_cache[2];
+      return true;
+    }
+
   tree old_optimize = build_optimization_node (&global_options,
                                                &global_options_set);
 
@@ -1456,8 +1471,17 @@ ix86_valid_target_attribute_p (tree fnde
   if (new_target == error_mark_node)
     ret = false;
 
-  else if (fndecl && new_target)
+  else if (new_target)
     {
+      if (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE
+          && DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
+        {
+          target_attribute_cache[0] = copy_list (args);
+          target_attribute_cache[1] = new_target;
+          target_attribute_cache[2]
+            = old_optimize != new_optimize ? new_optimize : NULL_TREE;
+        }
+
       DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target;
 
       if (old_optimize != new_optimize)