From: John David Anglin
Date: Wed, 13 Oct 2021 12:04:34 -0400
To: GCC Patches
Subject: [committed] hppa: Add support for 32-bit hppa targets in muldi3 expander
Message-ID: <4641577f-91ad-82a7-e381-62a0497bc092@bell.net>
X-Patchwork-Id: 46177

This patch allows inlining 64-bit hardware multiplication on 32-bit
hppa targets instead of calling __muldi3 from libgcc.  This should
improve performance at the expense of a slight increase in code size.

We need this because I am testing a change to build libgcc with
software float and integer multiplication.

Tested on hppa2.0w-hp-hpux11.11, hppa64-hp-hpux11.11 and
hppa-unknown-linux-gnu.

Committed to all active branches.

Dave
---

Add support for 32-bit hppa targets in muldi3 expander

2021-10-13  John David Anglin

gcc/ChangeLog:

	* config/pa/pa.md (muldi3): Add support for inlining 64-bit
	multiplication on 32-bit PA 1.1 and 2.0 targets.

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index b314f96de35..10623dd6fdb 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -5374,32 +5374,38 @@
   [(set (match_operand:DI 0 "register_operand" "")
         (mult:DI (match_operand:DI 1 "register_operand" "")
                  (match_operand:DI 2 "register_operand" "")))]
-  "TARGET_64BIT && ! TARGET_DISABLE_FPREGS && ! TARGET_SOFT_FLOAT"
+  "! optimize_size
+   && TARGET_PA_11
+   && ! TARGET_DISABLE_FPREGS
+   && ! TARGET_SOFT_FLOAT"
   "
 {
   rtx low_product = gen_reg_rtx (DImode);
   rtx cross_product1 = gen_reg_rtx (DImode);
   rtx cross_product2 = gen_reg_rtx (DImode);
-  rtx cross_scratch = gen_reg_rtx (DImode);
-  rtx cross_product = gen_reg_rtx (DImode);
   rtx op1l, op1r, op2l, op2r;
-  rtx op1shifted, op2shifted;
-
-  op1shifted = gen_reg_rtx (DImode);
-  op2shifted = gen_reg_rtx (DImode);
-  op1l = gen_reg_rtx (SImode);
-  op1r = gen_reg_rtx (SImode);
-  op2l = gen_reg_rtx (SImode);
-  op2r = gen_reg_rtx (SImode);
-
-  emit_move_insn (op1shifted, gen_rtx_LSHIFTRT (DImode, operands[1],
-                                                GEN_INT (32)));
-  emit_move_insn (op2shifted, gen_rtx_LSHIFTRT (DImode, operands[2],
-                                                GEN_INT (32)));
-  op1r = force_reg (SImode, gen_rtx_SUBREG (SImode, operands[1], 4));
-  op2r = force_reg (SImode, gen_rtx_SUBREG (SImode, operands[2], 4));
-  op1l = force_reg (SImode, gen_rtx_SUBREG (SImode, op1shifted, 4));
-  op2l = force_reg (SImode, gen_rtx_SUBREG (SImode, op2shifted, 4));
+
+  if (TARGET_64BIT)
+    {
+      rtx op1shifted = gen_reg_rtx (DImode);
+      rtx op2shifted = gen_reg_rtx (DImode);
+
+      emit_move_insn (op1shifted, gen_rtx_LSHIFTRT (DImode, operands[1],
+                                                    GEN_INT (32)));
+      emit_move_insn (op2shifted, gen_rtx_LSHIFTRT (DImode, operands[2],
+                                                    GEN_INT (32)));
+      op1r = force_reg (SImode, gen_rtx_SUBREG (SImode, operands[1], 4));
+      op2r = force_reg (SImode, gen_rtx_SUBREG (SImode, operands[2], 4));
+      op1l = force_reg (SImode, gen_rtx_SUBREG (SImode, op1shifted, 4));
+      op2l = force_reg (SImode, gen_rtx_SUBREG (SImode, op2shifted, 4));
+    }
+  else
+    {
+      op1r = force_reg (SImode, gen_lowpart (SImode, operands[1]));
+      op2r = force_reg (SImode, gen_lowpart (SImode, operands[2]));
+      op1l = force_reg (SImode, gen_highpart (SImode, operands[1]));
+      op2l = force_reg (SImode, gen_highpart (SImode, operands[2]));
+    }
 
   /* Emit multiplies for the cross products.  */
   emit_insn (gen_umulsidi3 (cross_product1, op2r, op1l));
@@ -5408,13 +5414,35 @@
   /* Emit a multiply for the low sub-word.  */
   emit_insn (gen_umulsidi3 (low_product, copy_rtx (op2r), copy_rtx (op1r)));
 
-  /* Sum the cross products and shift them into proper position.  */
-  emit_insn (gen_adddi3 (cross_scratch, cross_product1, cross_product2));
-  emit_insn (gen_ashldi3 (cross_product, cross_scratch, GEN_INT (32)));
+  if (TARGET_64BIT)
+    {
+      rtx cross_scratch = gen_reg_rtx (DImode);
+      rtx cross_product = gen_reg_rtx (DImode);
 
-  /* Add the cross product to the low product and store the result
-     into the output operand .  */
-  emit_insn (gen_adddi3 (operands[0], cross_product, low_product));
+      /* Sum the cross products and shift them into proper position.  */
+      emit_insn (gen_adddi3 (cross_scratch, cross_product1, cross_product2));
+      emit_insn (gen_ashldi3 (cross_product, cross_scratch, GEN_INT (32)));
+
+      /* Add the cross product to the low product and store the result
+         into the output operand .  */
+      emit_insn (gen_adddi3 (operands[0], cross_product, low_product));
+    }
+  else
+    {
+      rtx cross_scratch = gen_reg_rtx (SImode);
+
+      /* Sum cross products.  */
+      emit_move_insn (cross_scratch,
+                      gen_rtx_PLUS (SImode,
+                                    gen_lowpart (SImode, cross_product1),
+                                    gen_lowpart (SImode, cross_product2)));
+      emit_move_insn (gen_lowpart (SImode, operands[0]),
+                      gen_lowpart (SImode, low_product));
+      emit_move_insn (gen_highpart (SImode, operands[0]),
+                      gen_rtx_PLUS (SImode,
+                                    gen_highpart (SImode, low_product),
+                                    cross_scratch));
+    }
 
   DONE;
 }")
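
For readers who want to check the arithmetic of the 32-bit path, here is
a rough C model of what the expander appears to compute.  It is only a
sketch and not part of the patch: umul32x32, muldi3_model and
check_model are hypothetical names, umul32x32 stands in for the
umulsidi3 pattern, and the operands of cross_product2 (op2l, op1r) are
an assumption because that line lies outside the hunks shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for umulsidi3: unsigned 32x32 -> 64 multiply.  */
static uint64_t
umul32x32 (uint32_t a, uint32_t b)
{
  return (uint64_t) a * b;
}

/* Hypothetical model of the !TARGET_64BIT branch.  The names mirror
   the rtx variables in the expander; "l" is the high word and "r" the
   low word of each operand.  */
static uint64_t
muldi3_model (uint64_t x, uint64_t y)
{
  uint32_t op1l = (uint32_t) (x >> 32), op1r = (uint32_t) x;
  uint32_t op2l = (uint32_t) (y >> 32), op2r = (uint32_t) y;

  uint64_t cross_product1 = umul32x32 (op2r, op1l);
  /* Assumed operand order; this multiply is not visible in the diff.  */
  uint64_t cross_product2 = umul32x32 (op2l, op1r);
  uint64_t low_product = umul32x32 (op2r, op1r);

  /* Only the low words of the cross products can affect the low 64
     bits of the result, so a 32-bit wrapping sum is enough.  */
  uint32_t cross_scratch
    = (uint32_t) ((uint32_t) cross_product1 + (uint32_t) cross_product2);

  uint32_t lo = (uint32_t) low_product;
  uint32_t hi = (uint32_t) ((uint32_t) (low_product >> 32) + cross_scratch);

  return ((uint64_t) hi << 32) | lo;
}

/* Compare the model against the native 64-bit multiply.  */
static void
check_model (void)
{
  uint64_t a = 0x123456789abcdef0ULL, b = 0x0fedcba987654321ULL;
  assert (muldi3_model (a, b) == a * b);
  assert (muldi3_model (b, a) == a * b);
}
```

The high-by-high product (op1l * op2l) is never formed because it only
contributes to bits 64 and above, which a DImode result discards.  The
same low 64 bits come out for signed operands, so the model applies to
signed multiplication as well.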