From patchwork Thu Dec 2 01:33:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Oliva X-Patchwork-Id: 48381 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 13F933857C4A for ; Thu, 2 Dec 2021 01:33:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 13F933857C4A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1638408829; bh=h6x+xiEUKV2U1Vd7EMfn5PtN9UXonZ1i4xjRNr6qsK0=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Z2aJj11FBxqwao5EmFJ/ZJJFBtMj9vFVdyuGC1avaCtf0pFXv6QLDKIp5DER9z3fw GnUfwkXBBpgUJMWPNUMagsDOp+J20q+HZQD+J6ajPRvp9dwvDPs4CcmORdkv45f3PO uXUVFZKItgRtVGfZr22IOvnkkyUkt9ew4FqVtKlU= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from rock.gnat.com (rock.gnat.com [205.232.38.15]) by sourceware.org (Postfix) with ESMTPS id CE1AC3858414; Thu, 2 Dec 2021 01:33:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CE1AC3858414 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id 8D7BA116E9A; Wed, 1 Dec 2021 20:33:18 -0500 (EST) X-Virus-Scanned: Debian amavisd-new at gnat.com X-Amavis-Alert: BAD HEADER SECTION, Header field occurs more than once: "Cc" occurs 3 times Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 76ht0793AUXf; Wed, 1 Dec 2021 20:33:18 -0500 (EST) Received: from free.home (tron.gnat.com [IPv6:2620:20:4000:0:46a8:42ff:fe0e:e294]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by rock.gnat.com (Postfix) with ESMTPS id 6DB67116BAA; Wed, 1 Dec 2021 20:33:17 -0500 (EST) Received: from livre (livre.home [172.31.160.2]) by free.home (8.15.2/8.15.2) with ESMTPS id 1B21X4Lm741869 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 1 Dec 2021 22:33:07 -0300 To: gcc-patches@gcc.gnu.org Subject: [PR103149] introduce asmnesia internal function Organization: Free thinker, does not speak for AdaCore Date: Wed, 01 Dec 2021 22:33:04 -0300 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Alexandre Oliva via Gcc-patches From: Alexandre Oliva Reply-To: Alexandre Oliva Cc: rguenther@suse.de, wilson@gcc.gnu.org, uweigand@de.ibm.com, asolokha@gmx.com, zsojka@seznam.cz, ian@airs.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This is now used by -fharden-conditional-branches and -fharden-compares to detach the values used in redundant hardening compares from their original locations. It eventually still expands to an asm statement, as before, but the temporary is only introduced at expand time, preventing gimple optimizations from diverging input and output, which may prevent their unification in asm match operands. Additional logic to check asm operands matches, and to keep them matchable, which may improve some cases of "+m" and "+X" in asm that are currently rejected because the single operand, split into input and output, ends up diverging in ways that can't be merged back by reload. While investigating regressions hit by an earlier iteration of this patch, I came across pr93027.c, and noticed that, with optimization (unlike the test), reload overwrites one of the inputs with another. This preexisted this patch, and I have not attempted to fix it. The changes above paved the way to fix PR103149: a vectorized loop ends up using vector booleans on aarch64, and the hardening pass attempts to "copy-and-forget" them with an asm statement (thus ASMNESIA) for the redundant hardening compare. The problem there is that vector booleans, on the affected platform, can't go in GENERAL_REGS, but the asm statement we used to emit could only use those registers or memory. The work-around introduced in the newly-introduced ASMNESIA expander is to check whether the mode is suitable for GENERAL_REGS, and force the asm operands to use MEM otherwise. This is not ideal, but functionality to map a mode to a constraint string that selects a suitable register class doesn't seem readily available. Maybe in a future patch... ASMNESIA currently works only with gimple registers of types with scalar modes. I have not attempted to make it more general than that, though I envision future uses that might end up requiring it. I'll cross that bridge if/when I get to it. Regstrapped on x86_64-linux-gnu. Also bootstrapped along with a patch that enables both compare hardening passes. Built crosses to aarch64- and riscv64-, and verified that the mentioned aarch64 PR is fixed. Ok to install? (Manual inspection of the asm output for riscv PR103302 suggests that the reported problem may be gone with the given testcase, but I have not built enough of the target environment to enable me to run that.) for gcc/ChangeLog PR middle-end/103149 * cfgexpand.c (expand_asm_stmt): Pass original input constraint, and constraint array, to asm_operand_ok. * gimple-hander-conditionals.cc (detach_value): Call ASMNESIA internal function. * internal-fn.c (expand_ASMNESIA): New. * internal-fn.def (ASMNESIA): New. * recog.c (asm_operand_ok): Turn into wrapper of... (asm_operand_ok_match): ... renamed. Skip non-constraint characters faster. Don't require register_operand for matched operands. Combine results with... (check_match): New. (check_asm_operands): Enable match checking. for gcc/testsuite/ChangeLog PR middle-end/103149 * gcc.target/aarch64/pr103149.c: New. * gcc.target/i386/pr85030.c: Also accept impossible constraint error. --- gcc/cfgexpand.c | 6 + gcc/gimple-harden-conditionals.cc | 17 ---- gcc/internal-fn.c | 81 +++++++++++++++++++ gcc/internal-fn.def | 4 + gcc/recog.c | 114 +++++++++++++++++++++++---- gcc/testsuite/gcc.target/aarch64/pr103149.c | 13 +++ gcc/testsuite/gcc.target/i386/pr85030.c | 2 7 files changed, 200 insertions(+), 37 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index fb84d469f1e77..f0272ba1ac9ba 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3374,10 +3374,10 @@ expand_asm_stmt (gasm *stmt) tree val = input_tvec[i]; tree type = TREE_TYPE (val); bool allows_reg, allows_mem, ok; - const char *constraint; + const char *constraint, *orig_constraint; rtx op; - constraint = constraints[i + noutputs]; + orig_constraint = constraint = constraints[i + noutputs]; ok = parse_input_constraint (&constraint, i, ninputs, noutputs, 0, constraints.address (), &allows_mem, &allows_reg); @@ -3397,7 +3397,7 @@ expand_asm_stmt (gasm *stmt) else if (MEM_P (op)) op = validize_mem (op); - if (asm_operand_ok (op, constraint, NULL) <= 0) + if (asm_operand_ok (op, orig_constraint, constraints.address ()) <= 0) { if (allows_reg && TYPE_MODE (type) != BLKmode) op = force_reg (TYPE_MODE (type), op); diff --git a/gcc/gimple-harden-conditionals.cc b/gcc/gimple-harden-conditionals.cc index cfa2361d65be0..3768b2e23bdaf 100644 --- a/gcc/gimple-harden-conditionals.cc +++ b/gcc/gimple-harden-conditionals.cc @@ -132,21 +132,8 @@ detach_value (location_t loc, gimple_stmt_iterator *gsip, tree val) tree ret = make_ssa_name (TREE_TYPE (val)); SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val)); - /* Output asm ("" : "=g" (ret) : "0" (val)); */ - vec *inputs = NULL; - vec *outputs = NULL; - vec_safe_push (outputs, - build_tree_list - (build_tree_list - (NULL_TREE, build_string (2, "=g")), - ret)); - vec_safe_push (inputs, - build_tree_list - (build_tree_list - (NULL_TREE, build_string (1, "0")), - val)); - gasm *detach = gimple_build_asm_vec ("", inputs, outputs, - NULL, NULL); + gcall *detach = gimple_build_call_internal (IFN_ASMNESIA, 1, val); + gimple_call_set_lhs (detach, ret); gimple_set_location (detach, loc); gsi_insert_before (gsip, detach, GSI_SAME_STMT); diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 6ac3460d538be..a4a2ca91d9f93 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -3234,6 +3234,87 @@ expand_LAUNDER (internal_fn, gcall *call) expand_assignment (lhs, gimple_call_arg (call, 0), false); } +/* Expand ASMNESIA to assignment and asm that makes the value unknown. */ + +static void +expand_ASMNESIA (internal_fn, gcall *call) +{ + tree to = gimple_call_lhs (call); + + if (!to) + return; + + location_t locus = gimple_location (call); + + tree from = gimple_call_arg (call, 0); + + rtx to_rtx = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE); + + rtx from_rtx = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); + + enum machine_mode mode = GET_MODE (to_rtx); + + gcc_checking_assert (mode != BLKmode && mode != VOIDmode + && (GET_MODE (from_rtx) == mode + || GET_MODE (from_rtx) == VOIDmode)); + + rtvec argvec = rtvec_alloc (1); + rtvec constraintvec = rtvec_alloc (1); + rtvec labelvec = rtvec_alloc (0); + + bool need_memory = true; + for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++) + if (TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], i) + && targetm.hard_regno_mode_ok (i, mode)) + { + need_memory = false; + break; + } + + rtx regtemp = copy_to_mode_reg (mode, from_rtx); + rtx temp = regtemp; + const char *constraint = "=g"; + + if (need_memory) + { + constraint = "=m"; + temp = assign_stack_temp (mode, GET_MODE_SIZE (mode)); + emit_move_insn (temp, regtemp); + } + + /* This is roughly equivalent to asm ("" : "+g" (temp)); + + We use +m instead of +g if we NEED_MEMORY, i.e., the mode is not suitable + for GENERAL_REGS. ??? Ideally, we'd choose a suitable register class, if + there is one, but how do we get a constraint type that maps to a it? +X is + at the same time too lax, because arbitrary RTL and x87 FP regs are not + desirable, and too strict, because it doesn't guide any register class. + ??? Maybe the errors reg-stack would flag on e.g. libgcc/_multc3 could be + conditioned on the operand's being referenced? + + Unlike expansion of gimple asm stmts, it doesn't go through + targetm.md_asm_adjust (we don't wish any clobbers). + + Furthermore, we ensure input and output start out with the same pseudo or + MEM, which gimple doesn't. This is particularly important for MEMs, since + reloads can't merge disparate ones, and it would be important for X because + it guides NO_REGS. */ + rtx body = gen_rtx_ASM_OPERANDS (mode, + ggc_strdup (""), + constraint, 0, + argvec, constraintvec, labelvec, + locus); + ASM_OPERANDS_INPUT (body, 0) = temp; + ASM_OPERANDS_INPUT_CONSTRAINT_EXP (body, 0) + = gen_rtx_ASM_INPUT_loc (mode, "0", locus); + emit_insn (gen_rtx_SET (temp, body)); + + if (need_memory) + emit_move_insn (regtemp, temp); + + emit_move_insn (to_rtx, regtemp); +} + /* Expand {MASK_,}SCATTER_STORE{S,U} call CALL using optab OPTAB. */ static void diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index acb0dbda5567b..f10b220192196 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -410,6 +410,10 @@ DEF_INTERNAL_FN (FALLTHROUGH, ECF_LEAF | ECF_NOTHROW, NULL) /* To implement __builtin_launder. */ DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL) +/* Copy a value while preventing optimizations based on knowledge + about the input operand. */ +DEF_INTERNAL_FN (ASMNESIA, ECF_LEAF | ECF_NOTHROW, NULL) + /* Divmod function. */ DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL) diff --git a/gcc/recog.c b/gcc/recog.c index 5a42c45361d5c..2f402daad55a0 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -53,6 +53,9 @@ along with GCC; see the file COPYING3. If not see static void validate_replace_rtx_1 (rtx *, rtx, rtx, rtx_insn *, bool); static void validate_replace_src_1 (rtx *, void *); static rtx_insn *split_insn (rtx_insn *); +static int asm_operand_ok_match (rtx op, const char *constraint, + const char **constraints, + rtx *operands, int match_index); struct target_recog default_target_recog; #if SWITCHABLE_TARGET @@ -170,7 +173,7 @@ check_asm_operands (rtx x) const char *c = constraints[i]; if (c[0] == '%') c++; - if (! asm_operand_ok (operands[i], c, constraints)) + if (! asm_operand_ok_match (operands[i], c, constraints, operands, -1)) return 0; } @@ -2167,6 +2170,67 @@ get_referenced_operands (const char *string, bool *used, int asm_operand_ok (rtx op, const char *constraint, const char **constraints) +{ + return asm_operand_ok_match (op, constraint, constraints, NULL, -1); +} + +/* Combine a previous cummulative result PREVRES with RES. If MATCH_INDEX is + nonnegative, check that OP is compatible with the matched operand, using + OPERANDS if not NULL, under the CN-implied requirements. + */ + +static int +check_match (int prevres, rtx op, enum constraint_num cn, int res, + rtx *operands, int match_index) +{ + if (prevres > 0 || !res) + return prevres; + + if (match_index < 0) + return res; + + bool allows_reg = false, allows_mem = false; + enum reg_class rclass = NO_REGS; + + if (cn == CONSTRAINT__LIMIT) + { + allows_reg = allows_mem = true; + rclass = GENERAL_REGS; + } + else + { + insn_extra_constraint_allows_reg_mem (cn, &allows_reg, &allows_mem); + if (allows_reg) + rclass = reg_class_for_constraint (cn); + } + + rtx mop = operands ? operands[match_index] : NULL_RTX; + + /* Avoid divergence between input and output when reload wouldn't be able to + fix it up. Keep matching MEMs identical if we can't have REGs, but allow + reloadable MEMs otherwise. Avoid immediates and rvalue expressions, since + they can't match outputs. */ + if (op != mop + && ((mop && allows_mem && MEM_P (op) && MEM_P (mop) + && (!allows_reg || rclass == NO_REGS || GET_MODE (mop) == BLKmode) + && !rtx_equal_p (op, mop)) + || (mop && allows_reg + && (rclass == NO_REGS || GET_MODE (mop) == BLKmode) + && (!allows_mem || !MEM_P (op) || !MEM_P (mop))) + || !nonimmediate_operand (op, VOIDmode))) + return prevres; + + return res; +} + +/* Check if an asm_operand matches its constraints. + If MATCH_INDEX is nonnegative, also check for potential compatibility + with the matched operand, using OPERANDS if non-NULL. + Return > 0 if ok, = 0 if bad, < 0 if inconclusive. */ + +static int +asm_operand_ok_match (rtx op, const char *constraint, const char **constraints, + rtx *operands, int match_index) { int result = 0; bool incdec_ok = false; @@ -2177,7 +2241,8 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) /* Empty constraint string is the same as "X,...,X", i.e. X for as many alternatives as required to match the other operands. */ if (*constraint == '\0') - result = 1; + result = check_match (result, op, CONSTRAINT_X, + 1, operands, match_index); while (*constraint) { @@ -2186,7 +2251,8 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) int len; switch (c) { - case ',': + case '&': case '?': case '!': case '#': case '*': + case '=': case '+': case ',': constraint++; continue; @@ -2204,7 +2270,8 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) match = strtoul (constraint, &end, 10); if (!result) - result = asm_operand_ok (op, constraints[match], NULL); + result = asm_operand_ok_match (op, constraints[match], NULL, + operands, match); constraint = (const char *) end; } else @@ -2229,12 +2296,14 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) offsettable address exists. */ case 'o': /* offsettable */ if (offsettable_nonstrict_memref_p (op)) - result = 1; + result = check_match (result, op, CONSTRAINT_o, 1, + operands, match_index); break; case 'g': if (general_operand (op, VOIDmode)) - result = 1; + result = check_match (result, op, CONSTRAINT__LIMIT, 1, + operands, match_index); break; case '<': @@ -2253,18 +2322,21 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) switch (get_constraint_type (cn)) { case CT_REGISTER: - if (!result - && reg_class_for_constraint (cn) != NO_REGS - && GET_MODE (op) != BLKmode - && register_operand (op, VOIDmode)) - result = 1; + /* Don't demand a register for a matched operand, see pr93027. */ + result = check_match (result, op, cn, + reg_class_for_constraint (cn) != NO_REGS + && GET_MODE (op) != BLKmode + && (match_index >= 0 + || register_operand (op, VOIDmode)), + operands, match_index); break; case CT_CONST_INT: - if (!result - && CONST_INT_P (op) - && insn_const_int_ok_for_constraint (INTVAL (op), cn)) - result = 1; + result = check_match (result, op, cn, + CONST_INT_P (op) + && (insn_const_int_ok_for_constraint + (INTVAL (op), cn)), + operands, match_index); break; case CT_MEMORY: @@ -2275,16 +2347,22 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints) /* Every memory operand can be reloaded to fit. */ if (!mem) mem = extract_mem_from_operand (op); - result = result || memory_operand (mem, VOIDmode); + result = check_match (result, op, cn, + memory_operand (mem, VOIDmode), + operands, match_index); break; case CT_ADDRESS: /* Every address operand can be reloaded to fit. */ - result = result || address_operand (op, VOIDmode); + result = check_match (result, op, cn, + address_operand (op, VOIDmode), + operands, match_index); break; case CT_FIXED_FORM: - result = result || constraint_satisfied_p (op, cn); + result = check_match (result, op, cn, + constraint_satisfied_p (op, cn), + operands, match_index); break; } break; diff --git a/gcc/testsuite/gcc.target/aarch64/pr103149.c b/gcc/testsuite/gcc.target/aarch64/pr103149.c new file mode 100644 index 0000000000000..e7283b4f569c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr103149.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=armv8-a+sve -O2 -fharden-conditional-branches -fno-tree-scev-cprop" } */ + +/* -fharden-conditional-branches relies on ASMNESIA, that used to require + GENERAL_REGS even for vectorized booleans, which can't go on + GENERAL_REGS. */ + +void +foo (int *p) +{ + while (*p < 1) + ++*p; +} diff --git a/gcc/testsuite/gcc.target/i386/pr85030.c b/gcc/testsuite/gcc.target/i386/pr85030.c index ff41df6bb643c..f24e690b8c11d 100644 --- a/gcc/testsuite/gcc.target/i386/pr85030.c +++ b/gcc/testsuite/gcc.target/i386/pr85030.c @@ -6,5 +6,5 @@ void foo () { struct S a; - asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "inconsistent operand constraints in an 'asm'" } */ + asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "impossible constraint in 'asm'|inconsistent operand constraints in an 'asm'" } */ }