From patchwork Sat Jan 15 20:01:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 50070
Date: Sat, 15 Jan 2022 21:01:32 +0100
Subject: [PATCH] i386: Improve and optimize ix86_expand_sse_movcc
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
From: Uros Bizjak
Reply-To: Uros Bizjak

Modernize ix86_expand_sse_movcc to use expand_simple_{unop,binop}
infrastructure to avoid manual twiddling with output registers.
Also fix a couple of inconsistent vector_all_ones_operand usages,
break a couple of unnecessary else-if chains, eliminate common
subexpressions and do some general code simplifications.

2022-01-15  Uroš Bizjak

gcc/ChangeLog:

	* config/i386/i386-expand.c (ix86_expand_sse_movcc): Use
	expand_simple_unop and expand_simple_binop instead of manually
	constructing NOT, AND and IOR RTXes.  Use vector_all_ones_operand
	consistently.  Eliminate common subexpressions and simplify code.
	* config/i386/sse.md (3): New expander.
	(3): Make public.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
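Note for readers of the diff: the non-blend paths of ix86_expand_sse_movcc
all compute the same bitwise select, and the early returns are shortcuts
for constant operands.  The following is a minimal scalar sketch of that
relationship, assuming every mask element is either all-ones or all-zeros.
The names select_general and check_shortcuts are hypothetical and do not
exist in GCC; plain 32-bit integers stand in for the vector operands.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of the fallback sequence: each set bit of `cmp` picks the
// matching bit of `op_true`, each clear bit picks the bit of `op_false`.
// Illustrative helper only, not a GCC function.
constexpr std::uint32_t
select_general (std::uint32_t cmp, std::uint32_t op_true,
                std::uint32_t op_false)
{
  return (op_true & cmp) | (~cmp & op_false);
}

// Checks that each constant-operand shortcut agrees with the general
// formula for one choice of mask and operands.
inline void
check_shortcuts (std::uint32_t cmp, std::uint32_t a, std::uint32_t b)
{
  const std::uint32_t all_ones = 0xFFFFFFFFu;

  // op_true all-ones, op_false zero: the result is the mask itself.
  assert (select_general (cmp, all_ones, 0) == cmp);
  // op_false zero: a single AND.
  assert (select_general (cmp, a, 0) == (cmp & a));
  // op_true zero: NOT of the mask, then AND.
  assert (select_general (cmp, 0, b) == (~cmp & b));
  // op_true all-ones: a single IOR.
  assert (select_general (cmp, all_ones, b) == (cmp | b));
}
```

Calling check_shortcuts (0xFFFF0000u, 0x12345678u, 0x9ABCDEF0u) exercises
all four shortcuts for one mask.  The sketch says nothing about the blend
instructions or the AVX512 mask-register path, which the patch handles
separately.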
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index c740d6e5c04..138580da96e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3781,6 +3781,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
 {
   machine_mode mode = GET_MODE (dest);
   machine_mode cmpmode = GET_MODE (cmp);
+  rtx x;
 
   /* Simplify trivial VEC_COND_EXPR to avoid ICE in pr97506.  */
   if (rtx_equal_p (op_true, op_false))
@@ -3789,8 +3790,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       return;
     }
 
-  rtx t2, t3, x;
-
   /* If we have an integer mask and FP value then we need
      to cast mask to FP mode.  */
   if (mode != cmpmode && VECTOR_MODE_P (cmpmode))
@@ -3813,12 +3812,14 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
                   ? force_reg (mode, op_false) : op_false);
       if (op_true == CONST0_RTX (mode))
         {
-          rtx n = gen_reg_rtx (cmpmode);
           if (cmpmode == E_DImode && !TARGET_64BIT)
-            emit_insn (gen_knotdi (n, cmp));
+            {
+              x = gen_reg_rtx (cmpmode);
+              emit_insn (gen_knotdi (x, cmp));
+            }
           else
-            emit_insn (gen_rtx_SET (n, gen_rtx_fmt_e (NOT, cmpmode, cmp)));
-          cmp = n;
+            x = expand_simple_unop (cmpmode, NOT, cmp, NULL, 1);
+          cmp = x;
           /* Reverse op_true op_false.  */
           std::swap (op_true, op_false);
         }
@@ -3826,22 +3827,24 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       if (mode == HFmode)
         emit_insn (gen_movhf_mask (dest, op_true, op_false, cmp));
       else
-        {
-          rtx vec_merge = gen_rtx_VEC_MERGE (mode, op_true, op_false, cmp);
-          emit_insn (gen_rtx_SET (dest, vec_merge));
-        }
+        emit_insn (gen_rtx_SET (dest,
+                                gen_rtx_VEC_MERGE (mode,
+                                                   op_true, op_false, cmp)));
       return;
     }
-  else if (vector_all_ones_operand (op_true, mode)
-           && op_false == CONST0_RTX (mode))
+
+  if (vector_all_ones_operand (op_true, mode)
+      && op_false == CONST0_RTX (mode))
     {
-      emit_insn (gen_rtx_SET (dest, cmp));
+      emit_move_insn (dest, cmp);
       return;
     }
   else if (op_false == CONST0_RTX (mode))
     {
-      op_true = force_reg (mode, op_true);
-      ix86_emit_vec_binop (AND, mode, dest, cmp, op_true);
+      x = expand_simple_binop (mode, AND, cmp, op_true,
+                               dest, 1, OPTAB_DIRECT);
+      if (x != dest)
+        emit_move_insn (dest, x);
       return;
     }
   else if (op_true == CONST0_RTX (mode))
@@ -3851,13 +3854,16 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       ix86_emit_vec_binop (AND, mode, dest, x, op_false);
       return;
     }
-  else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode))
+  else if (vector_all_ones_operand (op_true, mode))
     {
-      op_false = force_reg (mode, op_false);
-      ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false);
+      x = expand_simple_binop (mode, IOR, cmp, op_false,
+                               dest, 1, OPTAB_DIRECT);
+      if (x != dest)
+        emit_move_insn (dest, x);
       return;
     }
-  else if (TARGET_XOP)
+
+  if (TARGET_XOP)
     {
       op_true = force_reg (mode, op_true);
 
@@ -3865,16 +3871,17 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
           || !nonimmediate_operand (op_false, mode))
         op_false = force_reg (mode, op_false);
 
-      emit_insn (gen_rtx_SET (dest, gen_rtx_IF_THEN_ELSE (mode, cmp,
-                                                          op_true,
-                                                          op_false)));
+      emit_insn (gen_rtx_SET (dest,
+                              gen_rtx_IF_THEN_ELSE (mode, cmp,
+                                                    op_true, op_false)));
       return;
     }
 
   rtx (*gen) (rtx, rtx, rtx, rtx) = NULL;
-  rtx d = dest;
+  machine_mode blend_mode = mode;
 
-  if (!vector_operand (op_true, mode))
+  if (GET_MODE_SIZE (mode) < 16
+      || !vector_operand (op_true, mode))
     op_true = force_reg (mode, op_true);
 
   op_false = force_reg (mode, op_false);
@@ -3883,10 +3890,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
     {
     case E_V2SFmode:
       if (TARGET_SSE4_1)
-        {
-          gen = gen_mmx_blendvps;
-          op_true = force_reg (mode, op_true);
-        }
+        gen = gen_mmx_blendvps;
       break;
     case E_V4SFmode:
       if (TARGET_SSE4_1)
@@ -3898,54 +3902,32 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       break;
     case E_SFmode:
       if (TARGET_SSE4_1)
-        {
-          gen = gen_sse4_1_blendvss;
-          op_true = force_reg (mode, op_true);
-        }
+        gen = gen_sse4_1_blendvss;
       break;
     case E_DFmode:
       if (TARGET_SSE4_1)
-        {
-          gen = gen_sse4_1_blendvsd;
-          op_true = force_reg (mode, op_true);
-        }
+        gen = gen_sse4_1_blendvsd;
       break;
     case E_V8QImode:
     case E_V4HImode:
     case E_V2SImode:
       if (TARGET_SSE4_1)
         {
-          op_true = force_reg (mode, op_true);
-
           gen = gen_mmx_pblendvb_v8qi;
-          if (mode != V8QImode)
-            d = gen_reg_rtx (V8QImode);
-          op_false = gen_lowpart (V8QImode, op_false);
-          op_true = gen_lowpart (V8QImode, op_true);
-          cmp = gen_lowpart (V8QImode, cmp);
+          blend_mode = V8QImode;
         }
       break;
     case E_V4QImode:
     case E_V2HImode:
       if (TARGET_SSE4_1)
         {
-          op_true = force_reg (mode, op_true);
-
           gen = gen_mmx_pblendvb_v4qi;
-          if (mode != V4QImode)
-            d = gen_reg_rtx (V4QImode);
-          op_false = gen_lowpart (V4QImode, op_false);
-          op_true = gen_lowpart (V4QImode, op_true);
-          cmp = gen_lowpart (V4QImode, cmp);
+          blend_mode = V4QImode;
         }
       break;
     case E_V2QImode:
       if (TARGET_SSE4_1)
-        {
-          op_true = force_reg (mode, op_true);
-
-          gen = gen_mmx_pblendvb_v2qi;
-        }
+        gen = gen_mmx_pblendvb_v2qi;
       break;
     case E_V16QImode:
     case E_V8HImode:
@@ -3955,11 +3937,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       if (TARGET_SSE4_1)
         {
           gen = gen_sse4_1_pblendvb;
-          if (mode != V16QImode)
-            d = gen_reg_rtx (V16QImode);
-          op_false = gen_lowpart (V16QImode, op_false);
-          op_true = gen_lowpart (V16QImode, op_true);
-          cmp = gen_lowpart (V16QImode, cmp);
+          blend_mode = V16QImode;
         }
       break;
     case E_V8SFmode:
@@ -3978,11 +3956,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       if (TARGET_AVX2)
         {
           gen = gen_avx2_pblendvb;
-          if (mode != V32QImode)
-            d = gen_reg_rtx (V32QImode);
-          op_false = gen_lowpart (V32QImode, op_false);
-          op_true = gen_lowpart (V32QImode, op_true);
-          cmp = gen_lowpart (V32QImode, cmp);
+          blend_mode = V32QImode;
         }
       break;
 
@@ -4014,26 +3988,36 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
 
   if (gen != NULL)
     {
-      emit_insn (gen (d, op_false, op_true, cmp));
-      if (d != dest)
-        emit_move_insn (dest, gen_lowpart (GET_MODE (dest), d));
+      if (blend_mode == mode)
+        x = dest;
+      else
+        {
+          x = gen_reg_rtx (blend_mode);
+          op_false = gen_lowpart (blend_mode, op_false);
+          op_true = gen_lowpart (blend_mode, op_true);
+          cmp = gen_lowpart (blend_mode, cmp);
+        }
+
+      emit_insn (gen (x, op_false, op_true, cmp));
+
+      if (x != dest)
+        emit_move_insn (dest, gen_lowpart (mode, x));
     }
   else
     {
-      op_true = force_reg (mode, op_true);
-
-      t2 = gen_reg_rtx (mode);
-      if (optimize)
-        t3 = gen_reg_rtx (mode);
-      else
-        t3 = dest;
+      rtx t2, t3;
 
-      ix86_emit_vec_binop (AND, mode, t2, op_true, cmp);
+      t2 = expand_simple_binop (mode, AND, op_true, cmp,
+                                NULL, 1, OPTAB_DIRECT);
 
+      t3 = gen_reg_rtx (mode);
       x = gen_rtx_NOT (mode, cmp);
       ix86_emit_vec_binop (AND, mode, t3, x, op_false);
 
-      ix86_emit_vec_binop (IOR, mode, dest, t3, t2);
+      x = expand_simple_binop (mode, IOR, t3, t2,
+                               dest, 1, OPTAB_DIRECT);
+      if (x != dest)
+        emit_move_insn (dest, x);
     }
 }
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0864748875e..50dc5da9a38 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4956,7 +4956,7 @@
            ]
            (const_string "TI")))])
 
-(define_insn "*3"
+(define_insn "3"
   [(set (match_operand:MODEF 0 "register_operand" "=x,x,v,v")
         (any_logic:MODEF
           (match_operand:MODEF 1 "register_operand" "%0,x,v,v")