From patchwork Fri Feb 3 14:06:50 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 64255
Date: Fri, 3 Feb 2023 15:06:50 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Improve RTL CSE hash table hash usage
Message-Id: <20230203140650.E72031346D@imap2.suse-dmz.suse.de>

The RTL CSE hash table has a fixed number of buckets (32), each with a
linked list of entries that share the same hash value.  The actual hash
values are computed with hash_rtx, which uses adds for mixing and adds
the rtx CODE as CODE << 7 (apart from some exceptions such as MEM).
The unsigned int typed hash value is then simply truncated for the
actual lookup into the fixed-size table, which means that CODE is
usually lost.

The following improves this truncation by first mixing in more bits
using xor.
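To illustrate the idea outside of GCC (this sketch is not part of the
patch; it only assumes HASH_SHIFT is 5, i.e. the 32-bucket table), two
hash values that differ only in their CODE << 7 contribution land in
the same bucket under plain truncation but in different buckets after
the xor-fold:

  /* Standalone sketch, not part of the patch.  HASH_SHIFT/HASH_MASK
     mirror the cse.cc definitions for a 32-bucket table.  */
  #include <stdio.h>

  static const unsigned HASH_SHIFT = 5;
  static const unsigned HASH_MASK = (1u << HASH_SHIFT) - 1;

  /* Plain truncation: bits above the mask, including CODE << 7, are lost.  */
  static unsigned bucket_truncate (unsigned h) { return h & HASH_MASK; }

  /* Xor-fold as in the patch: the next HASH_SHIFT bits of h still
     influence the bucket.  */
  static unsigned bucket_fold (unsigned h) { return (h ^ (h >> HASH_SHIFT)) & HASH_MASK; }

  int main ()
  {
    unsigned h1 = (10u << 7) + 3;   /* hypothetical CODE 10, same low bits */
    unsigned h2 = (11u << 7) + 3;   /* hypothetical CODE 11, same low bits */
    printf ("truncate: %u %u\n", bucket_truncate (h1), bucket_truncate (h2));
    printf ("fold:     %u %u\n", bucket_fold (h1), bucket_fold (h2));
    return 0;
  }

This prints the same bucket twice for plain truncation but two
different buckets for the fold.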
It does not change the actual hash function since that is used outside
of CSE as well.

An alternative would be to bump the fixed number of buckets, say to 256,
which would retain the LSB of CODE, or to 8192, which can capture all
6 bits required for the largest CODE.  As the comment in CSE says,
invalidate_memory and flush_hash_table are possibly done frequently and
those at least need to walk all slots, so enlarging the table will be a
loss when it is mostly empty.  Still, regular lookups by hash should be
more frequent, so fewer collisions should pay off as well.

Without enlarging the table, a better hash function is unlikely to make
a big difference; simple statistics on the number of collisions at
insertion time show a reduction of around 10%.  Bumping HASH_SHIFT by 1
improves that to 30% at the expense of reducing the average table fill
by 10% (all of these statistics are from looking just at fold-const.i
at -O2).  Increasing HASH_SHIFT further leaves the table even more
sparse, which likely shows that the add-based mixing in hash_rtx is
quite bad.  Bumping HASH_SHIFT by 2 removes 90% of all collisions.

Experimenting with inchash instead of adds for the mixing does not
improve things when looking at the numbers with HASH_SHIFT bumped by 2.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Any opinions?

	* cse.cc (HASH): Turn into inline function and mix in another
	HASH_SHIFT bits.
	(SAFE_HASH): Likewise.
---
 gcc/cse.cc | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index 37afc88b439..4777e559b86 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -420,20 +420,6 @@ struct table_elt
 #define HASH_SIZE (1 << HASH_SHIFT)
 #define HASH_MASK (HASH_SIZE - 1)
 
-/* Compute hash code of X in mode M.  Special-case case where X is a pseudo
-   register (hard registers may require `do_not_record' to be set).  */
-
-#define HASH(X, M) \
- ((REG_P (X) && REGNO (X) >= FIRST_PSEUDO_REGISTER \
-  ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (X))) \
-  : canon_hash (X, M)) & HASH_MASK)
-
-/* Like HASH, but without side-effects.  */
-#define SAFE_HASH(X, M) \
- ((REG_P (X) && REGNO (X) >= FIRST_PSEUDO_REGISTER \
-  ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (X))) \
-  : safe_hash (X, M)) & HASH_MASK)
-
 /* Determine whether register number N is considered a fixed register
    for the purpose of approximating register costs.
    It is desirable to replace other regs with fixed regs, to reduce need for
@@ -586,6 +572,29 @@ static machine_mode cse_cc_succs (basic_block, basic_block, rtx, rtx,
 
 static const struct rtl_hooks cse_rtl_hooks = RTL_HOOKS_INITIALIZER;
 
+/* Compute hash code of X in mode M.  Special-case case where X is a pseudo
+   register (hard registers may require `do_not_record' to be set).  */
+
+static inline unsigned
+HASH (rtx x, machine_mode mode)
+{
+  unsigned h = (REG_P (x) && REGNO (x) >= FIRST_PSEUDO_REGISTER
+		? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (x)))
+		: canon_hash (x, mode));
+  return (h ^ (h >> HASH_SHIFT)) & HASH_MASK;
+}
+
+/* Like HASH, but without side-effects.  */
+
+static inline unsigned
+SAFE_HASH (rtx x, machine_mode mode)
+{
+  unsigned h = (REG_P (x) && REGNO (x) >= FIRST_PSEUDO_REGISTER
+		? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (x)))
+		: safe_hash (x, mode));
+  return (h ^ (h >> HASH_SHIFT)) & HASH_MASK;
+}
+
 /* Nonzero if X has the form (PLUS frame-pointer integer).  */
 
 static bool