From patchwork Thu Jan 13 18:13:58 2022
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 49998
Date: Thu, 13 Jan 2022 19:13:58 +0100
Subject: [PATCH] i386: Cleanup V2QI arithmetic instructions
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak

2022-01-13  Uroš Bizjak

gcc/ChangeLog:

	* config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit.
	Disable for TARGET_PARTIAL_REG_STALL unless optimizing for size.
	(negv2qi splitters): Use lowpart_subreg instead of
	gen_lowpart to create subreg.
	(v2qi3): Disparage GPR alternative a bit.
	Disable for TARGET_PARTIAL_REG_STALL unless optimizing for size.
	(v2qi3 splitters): Use lowpart_subreg instead of
	gen_lowpart to create subreg.
	* config/i386/i386.md (*subqi_ext_2): Move.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
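For illustration only, and not part of the patch: a minimal sketch of source code that would be expected to reach the V2QI negate and add/subtract patterns touched here. It assumes that a 2-byte vector type declared with GCC's vector_size attribute is given V2QImode on x86; the type and function names (v2qi, neg_v2qi, add_v2qi, sub_v2qi, check_v2qi) are hypothetical. Which alternative (GPR or SSE) is chosen depends on the target flags and register allocation, so the sketch only checks the element-wise results.

```c
/* Hypothetical testcase sketch; names are illustrative, not from the patch.  */
#include <assert.h>

/* Two-byte vector of signed chars (GCC vector extension).
   Assumption: this type is given V2QImode on x86.  */
typedef signed char v2qi __attribute__ ((vector_size (2)));

/* Element-wise negate; expected to match the negv2qi2 pattern.  */
v2qi
neg_v2qi (v2qi a)
{
  return -a;
}

/* Element-wise add and subtract; expected to match the V2QI
   plus/minus pattern.  */
v2qi
add_v2qi (v2qi a, v2qi b)
{
  return a + b;
}

v2qi
sub_v2qi (v2qi a, v2qi b)
{
  return a - b;
}

/* Self-check of the element-wise semantics.  */
void
check_v2qi (void)
{
  v2qi a = { 5, -3 };
  v2qi b = { 2, 7 };

  v2qi n = neg_v2qi (a);
  assert (n[0] == -5 && n[1] == 3);

  v2qi s = add_v2qi (a, b);
  assert (s[0] == 7 && s[1] == 4);

  v2qi d = sub_v2qi (a, b);
  assert (d[0] == 3 && d[1] == -10);
}
```

Compiling this with and without -Os, or with a tuning that sets TARGET_PARTIAL_REG_STALL, and comparing the generated assembly is one way to observe the effect of the new insn conditions.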
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9937643a273..bcaaa4993b1 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6905,6 +6905,30 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "SI")])
 
+(define_insn "*subqi_ext_2"
+  [(set (zero_extract:SWI248
+          (match_operand:SWI248 0 "register_operand" "+Q")
+          (const_int 8)
+          (const_int 8))
+        (subreg:SWI248
+          (minus:QI
+            (subreg:QI
+              (zero_extract:SWI248
+                (match_operand:SWI248 1 "register_operand" "0")
+                (const_int 8)
+                (const_int 8)) 0)
+            (subreg:QI
+              (zero_extract:SWI248
+                (match_operand:SWI248 2 "register_operand" "Q")
+                (const_int 8)
+                (const_int 8)) 0)) 0))
+   (clobber (reg:CC FLAGS_REG))]
+  "/* FIXME: without this LRA can't reload this pattern, see PR82524.  */
+   rtx_equal_p (operands[0], operands[1])"
+  "sub{b}\t{%h2, %h0|%h0, %h2}"
+  [(set_attr "type" "alu")
+   (set_attr "mode" "QI")])
+
 ;; Subtract with jump on overflow.
 (define_expand "subv4"
   [(parallel [(set (reg:CCO FLAGS_REG)
@@ -6932,30 +6956,6 @@
   operands[4] = gen_rtx_SIGN_EXTEND (mode, operands[2]);
 })
 
-(define_insn "*subqi_ext_2"
-  [(set (zero_extract:SWI248
-          (match_operand:SWI248 0 "register_operand" "+Q")
-          (const_int 8)
-          (const_int 8))
-        (subreg:SWI248
-          (minus:QI
-            (subreg:QI
-              (zero_extract:SWI248
-                (match_operand:SWI248 1 "register_operand" "0")
-                (const_int 8)
-                (const_int 8)) 0)
-            (subreg:QI
-              (zero_extract:SWI248
-                (match_operand:SWI248 2 "register_operand" "Q")
-                (const_int 8)
-                (const_int 8)) 0)) 0))
-   (clobber (reg:CC FLAGS_REG))]
-  "/* FIXME: without this LRA can't reload this pattern, see PR82524.  */
-   rtx_equal_p (operands[0], operands[1])"
-  "sub{b}\t{%h2, %h0|%h0, %h2}"
-  [(set_attr "type" "alu")
-   (set_attr "mode" "QI")])
-
 (define_insn "*subv4"
   [(set (reg:CCO FLAGS_REG)
         (eq:CCO (minus:
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 295a132bc46..3d99a5e851b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1633,12 +1633,20 @@
   "TARGET_MMX_WITH_SSE"
   "operands[2] = force_reg (mode, CONST0_RTX (mode));")
 
+(define_expand "neg2"
+  [(set (match_operand:VI_32 0 "register_operand")
+        (minus:VI_32
+          (match_dup 2)
+          (match_operand:VI_32 1 "register_operand")))]
+  "TARGET_SSE2"
+  "operands[2] = force_reg (mode, CONST0_RTX (mode));")
+
 (define_insn "negv2qi2"
   [(set (match_operand:V2QI 0 "register_operand" "=?Q,&Yw")
         (neg:V2QI
           (match_operand:V2QI 1 "register_operand" "0,Yw")))
    (clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
   "#"
   [(set_attr "isa" "*,sse2")
    (set_attr "type" "multi")
@@ -1664,10 +1672,10 @@
                                   (const_int 8)) 0)) 0))
       (clobber (reg:CC FLAGS_REG))])]
 {
-  operands[3] = gen_lowpart (HImode, operands[1]);
-  operands[2] = gen_lowpart (HImode, operands[0]);
-  operands[1] = gen_lowpart (QImode, operands[1]);
-  operands[0] = gen_lowpart (QImode, operands[0]);
+  operands[3] = lowpart_subreg (HImode, operands[1], V2QImode);
+  operands[2] = lowpart_subreg (HImode, operands[0], V2QImode);
+  operands[1] = lowpart_subreg (QImode, operands[1], V2QImode);
+  operands[0] = lowpart_subreg (QImode, operands[0], V2QImode);
 })
 
 (define_split
@@ -1678,11 +1686,11 @@
   "reload_completed"
   [(set (match_dup 0) (match_dup 2))
    (set (match_dup 0)
-        (minus:V4QI (match_dup 0) (match_dup 1)))]
+        (minus:V16QI (match_dup 0) (match_dup 1)))]
 {
-  operands[2] = CONST0_RTX (V4QImode);
-  operands[1] = gen_lowpart (V4QImode, operands[1]);
-  operands[0] = gen_lowpart (V4QImode, operands[0]);
+  operands[2] = CONST0_RTX (V16QImode);
+  operands[1] = lowpart_subreg (V16QImode, operands[1], V2QImode);
+  operands[0] = lowpart_subreg (V16QImode, operands[0], V2QImode);
 })
 
 (define_expand "mmx_3"
@@ -1718,14 +1726,6 @@
    (set_attr "type" "mmxadd,sseadd,sseadd")
    (set_attr "mode" "DI,TI,TI")])
 
-(define_expand "neg2"
-  [(set (match_operand:VI_32 0 "register_operand")
-        (minus:VI_32
-          (match_dup 2)
-          (match_operand:VI_32 1 "register_operand")))]
-  "TARGET_SSE2"
-  "operands[2] = force_reg (mode, CONST0_RTX (mode));")
-
 (define_insn "3"
   [(set (match_operand:VI_32 0 "register_operand" "=x,Yw")
         (plusminus:VI_32
@@ -1745,7 +1745,7 @@
           (match_operand:V2QI 1 "register_operand" "0,0,Yw")
           (match_operand:V2QI 2 "register_operand" "Q,x,Yw")))
    (clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
   "#"
   [(set_attr "isa" "*,sse2_noavx,avx")
    (set_attr "type" "multi,sseadd,sseadd")
@@ -1776,12 +1776,12 @@
                                   (const_int 8)) 0)) 0))
       (clobber (reg:CC FLAGS_REG))])]
 {
-  operands[5] = gen_lowpart (HImode, operands[2]);
-  operands[4] = gen_lowpart (HImode, operands[1]);
-  operands[3] = gen_lowpart (HImode, operands[0]);
-  operands[2] = gen_lowpart (QImode, operands[2]);
-  operands[1] = gen_lowpart (QImode, operands[1]);
-  operands[0] = gen_lowpart (QImode, operands[0]);
+  operands[5] = lowpart_subreg (HImode, operands[2], V2QImode);
+  operands[4] = lowpart_subreg (HImode, operands[1], V2QImode);
+  operands[3] = lowpart_subreg (HImode, operands[0], V2QImode);
+  operands[2] = lowpart_subreg (QImode, operands[2], V2QImode);
+  operands[1] = lowpart_subreg (QImode, operands[1], V2QImode);
+  operands[0] = lowpart_subreg (QImode, operands[0], V2QImode);
 })
 
 (define_split
@@ -1792,11 +1792,11 @@
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_SSE2 && reload_completed"
   [(set (match_dup 0)
-        (plusminus:V4QI (match_dup 1) (match_dup 2)))]
+        (plusminus:V16QI (match_dup 1) (match_dup 2)))]
 {
-  operands[2] = gen_lowpart (V4QImode, operands[2]);
-  operands[1] = gen_lowpart (V4QImode, operands[1]);
-  operands[0] = gen_lowpart (V4QImode, operands[0]);
+  operands[2] = lowpart_subreg (V16QImode, operands[2], V2QImode);
+  operands[1] = lowpart_subreg (V16QImode, operands[1], V2QImode);
+  operands[0] = lowpart_subreg (V16QImode, operands[0], V2QImode);
 })
 
 (define_expand "mmx_3"