From patchwork Thu Jan 13 09:03:51 2022
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 49951
From: Jakub Jelinek
Reply-To: Jakub Jelinek
Date: Thu, 13 Jan 2022 10:03:51 +0100
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] forwprop: Canonicalize atomic fetch_op op x to op_fetch or
 vice versa [PR98737]
Message-ID: <20220113090351.GR2646553@tucnak>

Hi!

When writing the PR98737 fix, I handled just the case where people use
__atomic_op_fetch (p, x, y) etc.  But some people actually use the other
builtins, like __atomic_fetch_op (p, x, y) op x.  The following patch
canonicalizes the latter to the former and vice versa when possible,
if the result of the builtin has a single use and, when that use is a
cast with the same precision, if that cast's lhs also has a single use.
(A minimal standalone example of both directions is included after the
patch.)

For all ops of +, -, &, | and ^ we can do the
__atomic_fetch_op (p, x, y) op x -> __atomic_op_fetch (p, x, y)
optimizations (and the __sync ones too), but the INTEGER_CST and
SSA_NAME cases of x behave differently.  For INTEGER_CST, - x is
typically canonicalized to + (-x), while for SSA_NAME we need to handle
various casts, which sometimes happen on the second argument of the
builtin (there can even be two subsequent casts for char/short due to
the promotions we do) and there can be a cast on the argument of op
too.  And all ops but - are commutative.

For the other direction, i.e.
__atomic_op_fetch (p, x, y) rop x -> __atomic_fetch_op (p, x, y),
we can't handle the & and | ops, as those aren't reversible; for op +
rop is -, for - rop is + and for ^ rop is ^; otherwise the same
considerations as above apply.

There is another case as well: we canonicalize x - y == 0 (or != 0)
and x ^ y == 0 (or != 0) to x == y (or x != y), and for constant y
x + y == 0 (or != 0) to x == -y (or != -y), so the patch also virtually
undoes those canonicalizations, because (for the earlier PR98737 patch,
but even generally) it is better if the result of an atomic op_fetch is
compared against 0 than to do an atomic fetch_op and compare the result
to some variable or non-zero constant.

As for debug info, for the non-reversible operations (& and |) the
patch resets debug stmts if there are any, and likewise for
-fnon-call-exceptions (I didn't want to insert debug temps right before
all uses); otherwise it emits the reverse operation on the result as a
debug temp and uses that in the debug stmts.

On the emitted assembly for the testcases, which are fairly large,
I see substantial decreases in *.s size:

-rw-rw-r--. 1 jakub jakub 116897 Jan 13 09:58 pr98737-1.svanilla
-rw-rw-r--. 1 jakub jakub  93861 Jan 13 09:57 pr98737-1.spatched
-rw-rw-r--. 1 jakub jakub  70257 Jan 13 09:57 pr98737-2.svanilla
-rw-rw-r--. 1 jakub jakub  67537 Jan 13 09:57 pr98737-2.spatched

There are some functions where due to RA we get one more instruction
than before, but most of them are smaller even when they don't hit the
previous PR98737 patch's optimizations.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-13  Jakub Jelinek

        PR target/98737
        * tree-ssa-forwprop.c (simplify_builtin_call): Canonicalize
        __atomic_fetch_op (p, x, y) op x into __atomic_op_fetch (p, x, y)
        and __atomic_op_fetch (p, x, y) iop x into
        __atomic_fetch_op (p, x, y).
        * gcc.dg/tree-ssa/pr98737-1.c: New test.
        * gcc.dg/tree-ssa/pr98737-2.c: New test.

        Jakub

--- gcc/tree-ssa-forwprop.c.jj 2022-01-11 23:11:23.467275019 +0100
+++ gcc/tree-ssa-forwprop.c 2022-01-12 22:12:24.666522743 +0100
@@ -1241,12 +1241,19 @@ constant_pointer_difference (tree p1, tr
      memset (p + 4, ' ', 3);
    into
      memcpy (p, "abcd   ", 7);
-   call if the latter can be stored by pieces during expansion.  */
+   call if the latter can be stored by pieces during expansion.
+
+   Also canonicalize __atomic_fetch_op (p, x, y) op x
+   to __atomic_op_fetch (p, x, y) or
+   __atomic_op_fetch (p, x, y) iop x
+   to __atomic_fetch_op (p, x, y) when possible (also __sync).  */
 
 static bool
 simplify_builtin_call (gimple_stmt_iterator *gsi_p, tree callee2)
 {
   gimple *stmt1, *stmt2 = gsi_stmt (*gsi_p);
+  enum built_in_function other_atomic = END_BUILTINS;
+  enum tree_code atomic_op = ERROR_MARK;
   tree vuse = gimple_vuse (stmt2);
   if (vuse == NULL)
     return false;
@@ -1448,6 +1455,300 @@ simplify_builtin_call (gimple_stmt_itera
             }
         }
       break;
+
+#define CASE_ATOMIC(NAME, OTHER, OP) \
+    case BUILT_IN_##NAME##_1: \
+    case BUILT_IN_##NAME##_2: \
+    case BUILT_IN_##NAME##_4: \
+    case BUILT_IN_##NAME##_8: \
+    case BUILT_IN_##NAME##_16: \
+      atomic_op = OP; \
+      other_atomic \
+        = (enum built_in_function) (BUILT_IN_##OTHER##_1 \
+                                    + (DECL_FUNCTION_CODE (callee2) \
+                                       - BUILT_IN_##NAME##_1)); \
+      goto handle_atomic_fetch_op;
+
+    CASE_ATOMIC (ATOMIC_FETCH_ADD, ATOMIC_ADD_FETCH, PLUS_EXPR)
+    CASE_ATOMIC (ATOMIC_FETCH_SUB, ATOMIC_SUB_FETCH, MINUS_EXPR)
+    CASE_ATOMIC (ATOMIC_FETCH_AND, ATOMIC_AND_FETCH, BIT_AND_EXPR)
+    CASE_ATOMIC (ATOMIC_FETCH_XOR, ATOMIC_XOR_FETCH, BIT_XOR_EXPR)
+    CASE_ATOMIC (ATOMIC_FETCH_OR, ATOMIC_OR_FETCH, BIT_IOR_EXPR)
+
+    CASE_ATOMIC (SYNC_FETCH_AND_ADD, SYNC_ADD_AND_FETCH, PLUS_EXPR)
+    CASE_ATOMIC (SYNC_FETCH_AND_SUB, SYNC_SUB_AND_FETCH, MINUS_EXPR)
+    CASE_ATOMIC (SYNC_FETCH_AND_AND, SYNC_AND_AND_FETCH, BIT_AND_EXPR)
+    CASE_ATOMIC (SYNC_FETCH_AND_XOR, SYNC_XOR_AND_FETCH, BIT_XOR_EXPR)
+    CASE_ATOMIC (SYNC_FETCH_AND_OR, SYNC_OR_AND_FETCH, BIT_IOR_EXPR)
+
+    CASE_ATOMIC (ATOMIC_ADD_FETCH, ATOMIC_FETCH_ADD, MINUS_EXPR)
+    CASE_ATOMIC (ATOMIC_SUB_FETCH, ATOMIC_FETCH_SUB, PLUS_EXPR)
+    CASE_ATOMIC (ATOMIC_XOR_FETCH, ATOMIC_FETCH_XOR, BIT_XOR_EXPR)
+
+    CASE_ATOMIC (SYNC_ADD_AND_FETCH, SYNC_FETCH_AND_ADD, MINUS_EXPR)
+    CASE_ATOMIC (SYNC_SUB_AND_FETCH, SYNC_FETCH_AND_SUB, PLUS_EXPR)
+    CASE_ATOMIC (SYNC_XOR_AND_FETCH, SYNC_FETCH_AND_XOR, BIT_XOR_EXPR)
+
+#undef CASE_ATOMIC
+
+    handle_atomic_fetch_op:
+      if (gimple_call_num_args (stmt2) >= 2 && gimple_call_lhs (stmt2))
+        {
+          tree lhs2 = gimple_call_lhs (stmt2), lhsc = lhs2;
+          tree arg = gimple_call_arg (stmt2, 1);
+          gimple *use_stmt, *cast_stmt = NULL;
+          use_operand_p use_p;
+          tree ndecl = builtin_decl_explicit (other_atomic);
+
+          if (ndecl == NULL_TREE || !single_imm_use (lhs2, &use_p, &use_stmt))
+            break;
+
+          if (gimple_assign_cast_p (use_stmt))
+            {
+              cast_stmt = use_stmt;
+              lhsc = gimple_assign_lhs (cast_stmt);
+              if (lhsc == NULL_TREE
+                  || !INTEGRAL_TYPE_P (TREE_TYPE (lhsc))
+                  || (TYPE_PRECISION (TREE_TYPE (lhsc))
+                      != TYPE_PRECISION (TREE_TYPE (lhs2)))
+                  || !single_imm_use (lhsc, &use_p, &use_stmt))
+                {
+                  use_stmt = cast_stmt;
+                  cast_stmt = NULL;
+                  lhsc = lhs2;
+                }
+            }
+
+          bool ok = false;
+          tree oarg = NULL_TREE;
+          enum tree_code ccode = ERROR_MARK;
+          tree crhs1 = NULL_TREE, crhs2 = NULL_TREE;
+          if (is_gimple_assign (use_stmt)
+              && gimple_assign_rhs_code (use_stmt) == atomic_op)
+            {
+              if (gimple_assign_rhs1 (use_stmt) == lhsc)
+                oarg = gimple_assign_rhs2 (use_stmt);
+              else if (atomic_op != MINUS_EXPR)
+                oarg = gimple_assign_rhs1 (use_stmt);
+            }
+          else if (atomic_op == MINUS_EXPR
+                   && is_gimple_assign (use_stmt)
+                   && gimple_assign_rhs_code (use_stmt) == PLUS_EXPR
+                   && TREE_CODE (arg) == INTEGER_CST
+                   && (TREE_CODE (gimple_assign_rhs2 (use_stmt))
+                       == INTEGER_CST))
+            {
+              tree a = fold_convert (TREE_TYPE (lhs2), arg);
+              tree o = fold_convert (TREE_TYPE (lhs2),
+                                     gimple_assign_rhs2 (use_stmt));
+              if (wi::to_wide (a) == wi::neg (wi::to_wide (o)))
+                ok = true;
+            }
+          else if (atomic_op == BIT_AND_EXPR || atomic_op == BIT_IOR_EXPR)
+            ;
+          else if (gimple_code (use_stmt) == GIMPLE_COND)
+            {
+              ccode = gimple_cond_code (use_stmt);
+              crhs1 = gimple_cond_lhs (use_stmt);
+              crhs2 = gimple_cond_rhs (use_stmt);
+            }
+          else if (is_gimple_assign (use_stmt))
+            {
+              if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
+                {
+                  ccode = gimple_assign_rhs_code (use_stmt);
+                  crhs1 = gimple_assign_rhs1 (use_stmt);
+                  crhs2 = gimple_assign_rhs2 (use_stmt);
+                }
+              else if (gimple_assign_rhs_code (use_stmt) == COND_EXPR)
+                {
+                  tree cond = gimple_assign_rhs1 (use_stmt);
+                  if (COMPARISON_CLASS_P (cond))
+                    {
+                      ccode = TREE_CODE (cond);
+                      crhs1 = TREE_OPERAND (cond, 0);
+                      crhs2 = TREE_OPERAND (cond, 1);
+                    }
+                }
+            }
+          if (ccode == EQ_EXPR || ccode == NE_EXPR)
+            {
+              /* Deal with x - y == 0 or x ^ y == 0
+                 being optimized into x == y and x + cst == 0
+                 into x == -cst.  */
+              tree o = NULL_TREE;
+              if (crhs1 == lhsc)
+                o = crhs2;
+              else if (crhs2 == lhsc)
+                o = crhs1;
+              if (o && atomic_op != PLUS_EXPR)
+                oarg = o;
+              else if (o
+                       && TREE_CODE (o) == INTEGER_CST
+                       && TREE_CODE (arg) == INTEGER_CST)
+                {
+                  tree a = fold_convert (TREE_TYPE (lhs2), arg);
+                  o = fold_convert (TREE_TYPE (lhs2), o);
+                  if (wi::to_wide (a) == wi::neg (wi::to_wide (o)))
+                    ok = true;
+                }
+            }
+          if (oarg && !ok)
+            {
+              if (operand_equal_p (arg, oarg, 0))
+                ok = true;
+              else if (TREE_CODE (arg) == SSA_NAME
+                       && TREE_CODE (oarg) == SSA_NAME)
+                {
+                  tree oarg2 = oarg;
+                  if (gimple_assign_cast_p (SSA_NAME_DEF_STMT (oarg)))
+                    {
+                      gimple *g = SSA_NAME_DEF_STMT (oarg);
+                      oarg2 = gimple_assign_rhs1 (g);
+                      if (TREE_CODE (oarg2) != SSA_NAME
+                          || !INTEGRAL_TYPE_P (TREE_TYPE (oarg2))
+                          || (TYPE_PRECISION (TREE_TYPE (oarg2))
+                              != TYPE_PRECISION (TREE_TYPE (oarg))))
+                        oarg2 = oarg;
+                    }
+                  if (gimple_assign_cast_p (SSA_NAME_DEF_STMT (arg)))
+                    {
+                      gimple *g = SSA_NAME_DEF_STMT (arg);
+                      tree rhs1 = gimple_assign_rhs1 (g);
+                      /* Handle e.g.
+                         x.0_1 = (long unsigned int) x_4(D);
+                         _2 = __atomic_fetch_add_8 (&vlong, x.0_1, 0);
+                         _3 = (long int) _2;
+                         _7 = x_4(D) + _3;  */
+                      if (rhs1 == oarg || rhs1 == oarg2)
+                        ok = true;
+                      /* Handle e.g.
+                         x.18_1 = (short unsigned int) x_5(D);
+                         _2 = (int) x.18_1;
+                         _3 = __atomic_fetch_xor_2 (&vshort, _2, 0);
+                         _4 = (short int) _3;
+                         _8 = x_5(D) ^ _4;
+                         This happens only for char/short.  */
+                      else if (TREE_CODE (rhs1) == SSA_NAME
+                               && INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+                               && (TYPE_PRECISION (TREE_TYPE (rhs1))
+                                   == TYPE_PRECISION (TREE_TYPE (lhs2))))
+                        {
+                          g = SSA_NAME_DEF_STMT (rhs1);
+                          if (gimple_assign_cast_p (g)
+                              && (gimple_assign_rhs1 (g) == oarg
+                                  || gimple_assign_rhs1 (g) == oarg2))
+                            ok = true;
+                        }
+                    }
+                  if (!ok && arg == oarg2)
+                    /* Handle e.g.
+                       _1 = __sync_fetch_and_add_4 (&v, x_5(D));
+                       _2 = (int) _1;
+                       x.0_3 = (int) x_5(D);
+                       _7 = _2 + x.0_3;  */
+                    ok = true;
+                }
+            }
+
+          if (ok)
+            {
+              tree new_lhs = make_ssa_name (TREE_TYPE (lhs2));
+              gimple_call_set_lhs (stmt2, new_lhs);
+              gimple_call_set_fndecl (stmt2, ndecl);
+              gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+              if (ccode == ERROR_MARK)
+                gimple_assign_set_rhs_with_ops (&gsi, cast_stmt
+                                                ? NOP_EXPR : SSA_NAME,
+                                                new_lhs);
+              else
+                {
+                  crhs1 = new_lhs;
+                  crhs2 = build_zero_cst (TREE_TYPE (lhs2));
+                  if (gimple_code (use_stmt) == GIMPLE_COND)
+                    {
+                      gcond *cond_stmt = as_a <gcond *> (use_stmt);
+                      gimple_cond_set_lhs (cond_stmt, crhs1);
+                      gimple_cond_set_rhs (cond_stmt, crhs2);
+                    }
+                  else if (gimple_assign_rhs_class (use_stmt)
+                           == GIMPLE_BINARY_RHS)
+                    {
+                      gimple_assign_set_rhs1 (use_stmt, crhs1);
+                      gimple_assign_set_rhs2 (use_stmt, crhs2);
+                    }
+                  else
+                    {
+                      gcc_checking_assert (gimple_assign_rhs_code (use_stmt)
+                                           == COND_EXPR);
+                      tree cond = build2 (ccode, boolean_type_node,
+                                          crhs1, crhs2);
+                      gimple_assign_set_rhs1 (use_stmt, cond);
+                    }
+                }
+              update_stmt (use_stmt);
+              imm_use_iterator iter;
+              bool add_debug_temp = false;
+              FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs2)
+                if (use_stmt != cast_stmt)
+                  {
+                    gcc_assert (is_gimple_debug (use_stmt));
+                    /* & and | aren't reversible.  */
+                    if (atomic_op == BIT_AND_EXPR
+                        || atomic_op == BIT_IOR_EXPR
+                        /* Or with -fnon-call-exceptions if we can't
+                           add debug stmts after the call.  */
+                        || stmt_ends_bb_p (stmt2))
+                      {
+                        gimple_debug_bind_reset_value (use_stmt);
+                        update_stmt (use_stmt);
+                      }
+                    else
+                      add_debug_temp = true;
+                  }
+              if (cast_stmt)
+                {
+                  gsi = gsi_for_stmt (cast_stmt);
+                  gsi_remove (&gsi, true);
+                }
+              if (add_debug_temp)
+                {
+                  gsi = gsi_for_stmt (stmt2);
+                  tree type = TREE_TYPE (lhs2);
+                  if (TREE_CODE (arg) == INTEGER_CST)
+                    arg = fold_convert (type, arg);
+                  else if (!useless_type_conversion_p (type, TREE_TYPE (arg)))
+                    {
+                      tree narg = build_debug_expr_decl (type);
+                      gdebug *g
+                        = gimple_build_debug_bind (narg,
+                                                   fold_convert (type, arg),
+                                                   stmt2);
+                      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+                      arg = narg;
+                    }
+                  enum tree_code rcode;
+                  switch (atomic_op)
+                    {
+                    case PLUS_EXPR: rcode = MINUS_EXPR; break;
+                    case MINUS_EXPR: rcode = PLUS_EXPR; break;
+                    case BIT_XOR_EXPR: rcode = atomic_op; break;
+                    default: gcc_unreachable ();
+                    }
+                  tree d = build_debug_expr_decl (type);
+                  gdebug *g
+                    = gimple_build_debug_bind (d, build2 (rcode, type,
+                                                          new_lhs, arg),
+                                               stmt2);
+                  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+                  replace_uses_by (lhs2, d);
+                }
+              update_stmt (stmt2);
+              release_ssa_name (lhs2);
+            }
+        }
+      break;
+
     default:
       break;
     }
--- gcc/testsuite/gcc.dg/tree-ssa/pr98737-1.c.jj 2022-01-12 14:48:45.743941426 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr98737-1.c 2022-01-12 16:36:54.228346979 +0100
@@ -0,0 +1,148 @@
+/* PR target/98737 */
+/* { dg-do compile { target i?86-*-* x86_64-*-* powerpc*-*-* aarch64*-*-* } } */
+/* { dg-options "-O2 -fdump-tree-optimized -fcompare-debug" } */
+/* { dg-additional-options "-march=i686" { target ia32 } } */
+/* { dg-final { scan-tree-dump-not "__atomic_fetch_" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__sync_fetch_and_" "optimized" } } */
+
+typedef signed char schar;
+typedef unsigned long ulong;
+typedef unsigned int uint;
+typedef unsigned short ushort;
+typedef unsigned char uchar;
+long vlong;
+int vint;
+short vshort;
+schar vschar;
+ulong vulong;
+uint vuint;
+ushort vushort;
+uchar vuchar;
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  ut z = f (&v##t, x, ##__VA_ARGS__); \
+  t w = (t) z; \
+  return w o x; \
+}
+#define B(n, f, o, ...) \
+  A(n##0, long, ulong, f, o, ##__VA_ARGS__) \
+  A(n##1, int, uint, f, o, ##__VA_ARGS__) \
+  A(n##2, short, ushort, f, o, ##__VA_ARGS__) \
+  A(n##3, schar, uchar, f, o, ##__VA_ARGS__) \
+  A(n##4, ulong, ulong, f, o, ##__VA_ARGS__) \
+  A(n##5, uint, uint, f, o, ##__VA_ARGS__) \
+  A(n##6, ushort, ushort, f, o, ##__VA_ARGS__) \
+  A(n##7, uchar, uchar, f, o, ##__VA_ARGS__)
+
+B(00, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(01, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(02, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(03, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+B(04, __atomic_fetch_or, |, __ATOMIC_RELAXED)
+B(05, __sync_fetch_and_add, +)
+B(06, __sync_fetch_and_sub, -)
+B(07, __sync_fetch_and_and, &)
+B(08, __sync_fetch_and_xor, ^)
+B(09, __sync_fetch_and_or, |)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  ut z = f (&v##t, 42, ##__VA_ARGS__); \
+  t w = (t) z; \
+  return w o 42; \
+}
+
+B(10, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(11, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(12, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(13, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+B(14, __atomic_fetch_or, |, __ATOMIC_RELAXED)
+B(15, __sync_fetch_and_add, +)
+B(16, __sync_fetch_and_sub, -)
+B(17, __sync_fetch_and_and, &)
+B(18, __sync_fetch_and_xor, ^)
+B(19, __sync_fetch_and_or, |)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  ut z = f (&v##t, x, ##__VA_ARGS__); \
+  t w = (t) z; \
+  t v = w o x; \
+  return v == 0; \
+}
+
+B(20, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(21, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(22, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(23, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+B(24, __atomic_fetch_or, |, __ATOMIC_RELAXED)
+B(25, __sync_fetch_and_add, +)
+B(26, __sync_fetch_and_sub, -)
+B(27, __sync_fetch_and_and, &)
+B(28, __sync_fetch_and_xor, ^)
+B(29, __sync_fetch_and_or, |)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  ut z = f (&v##t, 42, ##__VA_ARGS__); \
+  t w = (t) z; \
+  t v = w o 42; \
+  return v != 0; \
+}
+
+B(30, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(31, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(32, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(33, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+B(34, __atomic_fetch_or, |, __ATOMIC_RELAXED)
+B(35, __sync_fetch_and_add, +)
+B(36, __sync_fetch_and_sub, -)
+B(37, __sync_fetch_and_and, &)
+B(38, __sync_fetch_and_xor, ^)
+B(39, __sync_fetch_and_or, |)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  return (t) (((t) f (&v##t, x, ##__VA_ARGS__)) \
+              o x) != 0; \
+}
+
+B(40, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(41, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(42, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(43, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+B(44, __atomic_fetch_or, |, __ATOMIC_RELAXED)
+B(45, __sync_fetch_and_add, +)
+B(46, __sync_fetch_and_sub, -)
+B(47, __sync_fetch_and_and, &)
+B(48, __sync_fetch_and_xor, ^)
+B(49, __sync_fetch_and_or, |)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  return (t) (((t) f (&v##t, 42, ##__VA_ARGS__))\
+              o 42) == 0; \
+}
+
+B(50, __atomic_fetch_add, +, __ATOMIC_RELAXED)
+B(51, __atomic_fetch_sub, -, __ATOMIC_RELAXED)
+B(52, __atomic_fetch_and, &, __ATOMIC_RELAXED)
+B(53, __atomic_fetch_xor, ^, __ATOMIC_RELAXED)
+/* (whatever | 42) == 0 is 0, so we can't test this.  */
+/* B(54, __atomic_fetch_or, |, __ATOMIC_RELAXED) */
+B(55, __sync_fetch_and_add, +)
+B(56, __sync_fetch_and_sub, -)
+B(57, __sync_fetch_and_and, &)
+B(58, __sync_fetch_and_xor, ^)
+/* B(59, __sync_fetch_and_or, |) */
--- gcc/testsuite/gcc.dg/tree-ssa/pr98737-2.c.jj 2022-01-12 16:43:29.411766485 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr98737-2.c 2022-01-12 16:41:24.301534958 +0100
@@ -0,0 +1,123 @@
+/* PR target/98737 */
+/* { dg-do compile { target i?86-*-* x86_64-*-* powerpc*-*-* aarch64*-*-* } } */
+/* { dg-options "-O2 -fdump-tree-optimized -fcompare-debug" } */
+/* { dg-additional-options "-march=i686" { target ia32 } } */
+/* { dg-final { scan-tree-dump-not "__atomic_\[^f]" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__sync_\[^f]" "optimized" } } */
+
+typedef signed char schar;
+typedef unsigned long ulong;
+typedef unsigned int uint;
+typedef unsigned short ushort;
+typedef unsigned char uchar;
+long vlong;
+int vint;
+short vshort;
+schar vschar;
+ulong vulong;
+uint vuint;
+ushort vushort;
+uchar vuchar;
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  ut z = f (&v##t, x, ##__VA_ARGS__); \
+  t w = (t) z; \
+  return w o x; \
+}
+#define B(n, f, o, ...) \
+  A(n##0, long, ulong, f, o, ##__VA_ARGS__) \
+  A(n##1, int, uint, f, o, ##__VA_ARGS__) \
+  A(n##2, short, ushort, f, o, ##__VA_ARGS__) \
+  A(n##3, schar, uchar, f, o, ##__VA_ARGS__) \
+  A(n##4, ulong, ulong, f, o, ##__VA_ARGS__) \
+  A(n##5, uint, uint, f, o, ##__VA_ARGS__) \
+  A(n##6, ushort, ushort, f, o, ##__VA_ARGS__) \
+  A(n##7, uchar, uchar, f, o, ##__VA_ARGS__)
+
+B(00, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(01, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(03, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(05, __sync_add_and_fetch, -)
+B(06, __sync_sub_and_fetch, +)
+B(08, __sync_xor_and_fetch, ^)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  ut z = f (&v##t, 42, ##__VA_ARGS__); \
+  t w = (t) z; \
+  return w o 42; \
+}
+
+B(10, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(11, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(13, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(15, __sync_add_and_fetch, -)
+B(16, __sync_sub_and_fetch, +)
+B(18, __sync_xor_and_fetch, ^)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  ut z = f (&v##t, x, ##__VA_ARGS__); \
+  t w = (t) z; \
+  t v = w o x; \
+  return v == 0; \
+}
+
+B(20, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(21, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(23, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(25, __sync_add_and_fetch, -)
+B(26, __sync_sub_and_fetch, +)
+B(28, __sync_xor_and_fetch, ^)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  ut z = f (&v##t, 42, ##__VA_ARGS__); \
+  t w = (t) z; \
+  t v = w o 42; \
+  return v != 0; \
+}
+
+B(30, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(31, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(33, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(35, __sync_add_and_fetch, -)
+B(36, __sync_sub_and_fetch, +)
+B(38, __sync_xor_and_fetch, ^)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (t x) \
+{ \
+  return (t) (((t) f (&v##t, x, ##__VA_ARGS__)) \
+              o x) != 0; \
+}
+
+B(40, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(41, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(43, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(45, __sync_add_and_fetch, -)
+B(46, __sync_sub_and_fetch, +)
+B(48, __sync_xor_and_fetch, ^)
+
+#undef A
+#define A(n, t, ut, f, o, ...) \
+t fn##n (void) \
+{ \
+  return (t) (((t) f (&v##t, 42, ##__VA_ARGS__))\
+              o 42) == 0; \
+}
+
+B(50, __atomic_add_fetch, -, __ATOMIC_RELAXED)
+B(51, __atomic_sub_fetch, +, __ATOMIC_RELAXED)
+B(53, __atomic_xor_fetch, ^, __ATOMIC_RELAXED)
+B(55, __sync_add_and_fetch, -)
+B(56, __sync_sub_and_fetch, +)
+B(58, __sync_xor_and_fetch, ^)
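
For reference, here is a minimal standalone illustration of the two
directions of the canonicalization (this is not part of the patch; the
testcases above cover these and many more variants, and the function
names below are made up just for the example).  Compiling it with
-O2 -fdump-tree-optimized shows __atomic_add_fetch_4 resp.
__atomic_fetch_add_4 in the optimized dump, with no separate addition
or subtraction:

int v;

int
f1 (int x)
{
  /* __atomic_fetch_add returns the old value; adding x back yields the
     new value, so with the patch forwprop uses __atomic_add_fetch
     directly.  */
  return __atomic_fetch_add (&v, x, __ATOMIC_RELAXED) + x;
}

int
f2 (int x)
{
  /* The reverse direction: the new value minus x is the old value, so
     with the patch this becomes __atomic_fetch_add.  */
  return __atomic_add_fetch (&v, x, __ATOMIC_RELAXED) - x;
}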