From patchwork Thu Apr 7 20:15:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2 via Gcc-patches"
X-Patchwork-Id: 52728
X-Amazon-filename: 0001-AArch64-add-barrier-to-no-LSE-path-in-outline-atomic.patch,
 0002-AArch64-emit-a-barrier-for-__atomic-builtins.patch
To: "gcc-patches@gcc.gnu.org"
Subject: [AArch64] PR105162: emit barrier for __sync and __atomic builtins on CPUs without LSE
Date: Thu, 7 Apr 2022 20:15:59 +0000
Message-ID: <1649362558922.26300@amazon.com>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: "Pop, Sebastian via Gcc-patches"
From: "Li, Pan2 via Gcc-patches"
Reply-To: "Pop, Sebastian"

Hi,

With -moutline-atomics gcc stops generating a barrier for __sync builtins:
https://gcc.gnu.org/PR105162

This is a problem on CPUs without LSE instructions, where the ld/st
exclusives do not guarantee a full barrier.

The attached patch adds the barrier to the outline-atomics functions on
the path without LSE instructions. As a consequence, under
-moutline-atomics, __atomic and __sync builtins now behave the same with
and without LSE instructions.

To complete the change, the second patch makes gcc emit the barrier for
__atomic builtins as well, i.e., independently of is_mm_sync().

Sebastian

From 68c07f95157057f0167723b182f0ccffdac8a17e Mon Sep 17 00:00:00 2001
From: Sebastian Pop
Date: Thu, 7 Apr 2022 19:18:57 +0000
Subject: [PATCH 2/2] [AArch64] emit a barrier for __atomic builtins

---
 gcc/config/aarch64/aarch64.cc | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 18f80499079..be1b8d22c6a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -22931,9 +22931,7 @@ aarch64_split_compare_and_swap (rtx operands[])
   if (strong_zero_p)
     aarch64_gen_compare_reg (NE, rval, const0_rtx);
 
-  /* Emit any final barrier needed for a __sync operation.  */
-  if (is_mm_sync (model))
-    aarch64_emit_post_barrier (model);
+  aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
@@ -22948,7 +22946,6 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
   const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
-  const bool is_sync = is_mm_sync (model);
   rtx_code_label *label;
   rtx x;
 
@@ -22966,11 +22963,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 
   /* The initial load can be relaxed for a __sync operation since a final
      barrier will be emitted to stop code hoisting.  */
-  if (is_sync)
-    aarch64_emit_load_exclusive (mode, old_out, mem,
-                                 GEN_INT (MEMMODEL_RELAXED));
-  else
-    aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
+  aarch64_emit_load_exclusive (mode, old_out, mem, GEN_INT (MEMMODEL_RELAXED));
 
   switch (code)
     {
@@ -23016,9 +23009,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
                             gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  /* Emit any final barrier needed for a __sync operation.  */
-  if (is_sync)
-    aarch64_emit_post_barrier (model);
+  aarch64_emit_post_barrier (model);
 }
 
 static void
-- 
2.25.1
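
For reference, here is a minimal C sketch of the kind of source code the
two patches affect. The helper names (sync_fetch_add,
atomic_fetch_add_seq_cst, sync_cas, check_builtins) are hypothetical and
not part of either patch; only the __sync_* and __atomic_* builtins are
GCC's. The check is single-threaded and looks at return values only, so
it says nothing about memory ordering.

```c
#include <assert.h>

/* Legacy __sync builtin, documented by GCC as a full barrier.  */
int sync_fetch_add (int *p, int v)
{
  return __sync_fetch_and_add (p, v);
}

/* __atomic builtin with an explicit memory order.  */
int atomic_fetch_add_seq_cst (int *p, int v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}

/* Compare-and-swap; returns the value *p held before the operation.  */
int sync_cas (int *p, int expected, int desired)
{
  return __sync_val_compare_and_swap (p, expected, desired);
}

/* Hypothetical single-threaded sanity check; returns the final value.  */
int check_builtins (void)
{
  int x = 0;

  int old1 = sync_fetch_add (&x, 1);            /* x: 0 -> 1 */
  int old2 = atomic_fetch_add_seq_cst (&x, 2);  /* x: 1 -> 3 */
  int old3 = sync_cas (&x, 3, 7);               /* x: 3 -> 7 */

  assert (old1 == 0);
  assert (old2 == 1);
  assert (old3 == 3);
  return x;
}
```

Compiling this for AArch64 with something like
"gcc -O2 -march=armv8-a -moutline-atomics -S" and reading the assembly
should show either calls to the out-of-line helpers or, with
-mno-outline-atomics, inline load/store-exclusive loops. The patches
concern whether a trailing barrier follows on the non-LSE path.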