From patchwork Thu Sep 19 13:11:57 2024
X-Patchwork-Id: 97700
From: Matthew Malcomson <mmalcomson@nvidia.com>
To: gcc-patches@gcc.gnu.org
Cc: Jonathan Wakely, Joseph Myers, Richard Biener
Subject: [PATCH 1/8] [RFC] Define new floating point builtin fetch_add functions
Date: Thu, 19 Sep 2024 14:11:57 +0100
Message-ID: <20240919131204.3865854-2-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
This commit just defines the new names -- it does not yet implement
them.  Keeping this as its own commit because it represents one
decision, and records what the decision was and why:

Add new floating point builtins for each floating point type that is
defined in the general code *except* f128x (which would have a size
greater than 16 bytes -- the largest integral atomic operation we
currently support).

We have to base our naming on floating point *types* rather than sizes,
since different types can have the same size and the operations need to
be distinguished based on type.
N.b. one could make size-suffixed builtins that are still overloaded
based on types, but I thought this was the cleaner approach.  (The
actual requirement is distinction based on mode; that is how I choose
which internal function to use in a later patch.  I believe that
defining the function in terms of types and internally mapping to modes
is a sensible split between user interface and internal implementation.)

N.b. in order to choose whether these operations are available or not
in something like libstdc++, I use something like
`__has_builtin(__atomic_fetch_add_fp)`.  This happens to be the builtin
for implementing the relevant operation on doubles, but it also seems
like a nice name to check.
- This would require that all compiler implementations make floating
  point atomics for all the floating point types they support available
  at the same time.  I don't expect this is much of a problem, but
  invite dissent.

N.b. I used the below type suffixes (following what seems like the
existing convention for builtins):
- float -> f
- double -> (no suffix)
- long double -> l
- _FloatN -> fN (for N in {16, 32, 64, 128})
- _FloatNx -> fNx (for N in {32, 64})

On IRC, Richi suggested doing this expansion generally for all these
builtins, following Cxy _Atomic semantics.  Since C hasn't specified
any fetch_* semantics for floating point types, C++ has only specified
`atomic<>::fetch_{add,sub}`, and the operations other than these are
all bitwise operations (which don't map well to floating point), I
believe I have followed that suggestion by implementing all
fetch_{sub,add}/{add,sub}_fetch operations.

I have not implemented anything for the __sync_* builtins, on the
belief that these are legacy and new code should use the __atomic_*
builtins.  Happy to adjust if that is a bad choice.

Only the new function types were needed for most cases.  The Fortran
frontend does not use `builtin-types.def`, so the Fortran `types.def`
needed updating to include the floating point builtin types in the
`enum builtin_type` local to `gfc_init_builtin_functions`.
- N.b. these types are already available in the Fortran frontend
  (being defined by `build_common_tree_nodes`); it's just that they
  were not available for sync-builtins.def functions until this commit.

------------------------------
N.b. for this RFC I've not checked that any other frontends can access
these builtins.  Even for the Fortran frontend I've only adjusted
things enough to ensure everything builds.
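To make the intended interface concrete, here is a small usage sketch
(mine, not part of the patch -- the function name `atomic_accumulate`
is hypothetical, and it assumes this series is applied):

    /* Guard on the feature-detection name suggested above.  */
    #if __has_builtin(__atomic_fetch_add_fp)
    double
    atomic_accumulate (double *total, double x)
    {
      /* The overloaded builtin; under this series the frontend resolves
         it (by the argument's type/mode) to the double variant,
         __atomic_fetch_add_fp.  */
      return __atomic_fetch_add (total, x, __ATOMIC_SEQ_CST);
    }
    #endif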
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
---
 gcc/builtin-types.def | 20 ++++++++++++++++++
 gcc/fortran/types.def | 48 +++++++++++++++++++++++++++++++++++++++++++
 gcc/sync-builtins.def | 40 ++++++++++++++++++++++++++++++++++++
 3 files changed, 108 insertions(+)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index c97d6bad1de..97ccd945b55 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -802,6 +802,26 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_VPTR_FLOAT_INT, BT_FLOAT, BT_VOLATILE_PTR,
+		     BT_FLOAT, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_VPTR_DOUBLE_INT, BT_DOUBLE, BT_VOLATILE_PTR,
+		     BT_DOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT, BT_LONGDOUBLE,
+		     BT_VOLATILE_PTR, BT_LONGDOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_BFLOAT16_VPTR_BFLOAT16_INT, BT_BFLOAT16, BT_VOLATILE_PTR,
+		     BT_BFLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_VPTR_FLOAT16_INT, BT_FLOAT16, BT_VOLATILE_PTR,
+		     BT_FLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_VPTR_FLOAT32_INT, BT_FLOAT32, BT_VOLATILE_PTR,
+		     BT_FLOAT32, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_VPTR_FLOAT64_INT, BT_FLOAT64, BT_VOLATILE_PTR,
+		     BT_FLOAT64, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_VPTR_FLOAT128_INT, BT_FLOAT128, BT_VOLATILE_PTR,
+		     BT_FLOAT128, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_VPTR_FLOAT32X_INT, BT_FLOAT32X, BT_VOLATILE_PTR,
+		     BT_FLOAT32X, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_VPTR_FLOAT64X_INT, BT_FLOAT64X, BT_VOLATILE_PTR,
+		     BT_FLOAT64X, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_INT_PTRPTR_SIZE_SIZE, BT_INT, BT_PTR_PTR, BT_SIZE, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_CONST_PTR, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_INT_INT_INTPTR, BT_BOOL, BT_INT, BT_INT,
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index 390cc9542f7..52695a39047 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -60,6 +60,34 @@ DEF_PRIMITIVE_TYPE (BT_I4, builtin_type_for_size (BITS_PER_UNIT*4, 1))
 DEF_PRIMITIVE_TYPE (BT_I8, builtin_type_for_size (BITS_PER_UNIT*8, 1))
 DEF_PRIMITIVE_TYPE (BT_I16, builtin_type_for_size (BITS_PER_UNIT*16, 1))
+DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
+DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
+DEF_PRIMITIVE_TYPE (BT_LONGDOUBLE, long_double_type_node)
+DEF_PRIMITIVE_TYPE (BT_BFLOAT16, (bfloat16_type_node
+				  ? bfloat16_type_node
+				  : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT16, (float16_type_node
+				 ? float16_type_node
+				 : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT32, (float32_type_node
+				 ? float32_type_node
+				 : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT64, (float64_type_node
+				 ? float64_type_node
+				 : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT128, (float128_type_node
+				  ? float128_type_node
+				  : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT32X, (float32x_type_node
+				  ? float32x_type_node
+				  : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT64X, (float64x_type_node
+				  ? float64x_type_node
+				  : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_FLOAT128X, (float128x_type_node
+				   ? float128x_type_node
+				   : error_mark_node))
+
 DEF_PRIMITIVE_TYPE (BT_PTR, ptr_type_node)
 DEF_PRIMITIVE_TYPE (BT_CONST_PTR, const_ptr_type_node)
 DEF_PRIMITIVE_TYPE (BT_VOLATILE_PTR,
@@ -144,6 +172,26 @@ DEF_FUNCTION_TYPE_3 (BT_FN_I2_VPTR_I2_INT, BT_I2, BT_VOLATILE_PTR, BT_I2, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_I4_VPTR_I4_INT, BT_I4, BT_VOLATILE_PTR, BT_I4, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_I8_VPTR_I8_INT, BT_I8, BT_VOLATILE_PTR, BT_I8, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_I16_VPTR_I16_INT, BT_I16, BT_VOLATILE_PTR, BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_VPTR_FLOAT_INT, BT_FLOAT, BT_VOLATILE_PTR,
+		     BT_FLOAT, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_VPTR_DOUBLE_INT, BT_DOUBLE, BT_VOLATILE_PTR,
+		     BT_DOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT, BT_LONGDOUBLE,
+		     BT_VOLATILE_PTR, BT_LONGDOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_BFLOAT16_VPTR_BFLOAT16_INT, BT_BFLOAT16, BT_VOLATILE_PTR,
+		     BT_BFLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_VPTR_FLOAT16_INT, BT_FLOAT16, BT_VOLATILE_PTR,
+		     BT_FLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_VPTR_FLOAT32_INT, BT_FLOAT32, BT_VOLATILE_PTR,
+		     BT_FLOAT32, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_VPTR_FLOAT64_INT, BT_FLOAT64, BT_VOLATILE_PTR,
+		     BT_FLOAT64, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_VPTR_FLOAT128_INT, BT_FLOAT128, BT_VOLATILE_PTR,
+		     BT_FLOAT128, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_VPTR_FLOAT32X_INT, BT_FLOAT32X, BT_VOLATILE_PTR,
+		     BT_FLOAT32X, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_VPTR_FLOAT64X_INT, BT_FLOAT64X, BT_VOLATILE_PTR,
+		     BT_FLOAT64X, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I1_INT, BT_VOID, BT_VOLATILE_PTR, BT_I1, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT)
diff --git a/gcc/sync-builtins.def b/gcc/sync-builtins.def
index b4ec3782799..89cc564a8f6 100644
--- a/gcc/sync-builtins.def
+++ b/gcc/sync-builtins.def
@@ -28,6 +28,30 @@ along with GCC; see the file COPYING3.  If not see
    is supposed to be using.  It's overloaded, and is resolved to one of the
    "_1" through "_16" versions, plus some extra casts.  */
 
+/* Same as DEF_GCC_FLOATN_NX_BUILTINS, except for sync builtins.
+   N.b. we do not define the f128x type because this would be larger than the
+   16 byte integral types that we have atomic support for.  That would mean
+   we couldn't implement them without adding special extra handling --
+   especially because to act atomically on such large sizes all architectures
+   would require locking implementations added in libatomic.  */
+#undef DEF_SYNC_FLOATN_NX_BUILTINS
+#define DEF_SYNC_FLOATN_NX_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), ATTRS)
+
+#undef DEF_SYNC_FLOAT_BUILTINS
+#define DEF_SYNC_FLOAT_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPF, NAME "_fpf", TYPE_MACRO (FLOAT), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FP, NAME "_fp", TYPE_MACRO (DOUBLE), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPL, NAME "_fpl", TYPE_MACRO (LONGDOUBLE), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPF16B, NAME "_fpf16b", TYPE_MACRO (BFLOAT16), ATTRS) \
+  DEF_SYNC_FLOATN_NX_BUILTINS (ENUM ## _FP, NAME "_fp", TYPE_MACRO, ATTRS)
+
 DEF_SYNC_BUILTIN (BUILT_IN_SYNC_FETCH_AND_ADD_N, "__sync_fetch_and_add",
		  BT_FN_VOID_VAR, ATTR_NOTHROWCALL_LEAF_LIST)
 DEF_SYNC_BUILTIN (BUILT_IN_SYNC_FETCH_AND_ADD_1, "__sync_fetch_and_add_1",
@@ -378,6 +402,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_ADD_FETCH_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_ADD_FETCH_16, "__atomic_add_fetch_16",
		  BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define ADD_FETCH_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_ADD_FETCH, "__atomic_add_fetch",
+			 ADD_FETCH_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef ADD_FETCH_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_N, "__atomic_sub_fetch",
@@ -397,6 +425,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_16, "__atomic_sub_fetch_16",
		  BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define SUB_FETCH_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_SUB_FETCH, "__atomic_sub_fetch",
+			 SUB_FETCH_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef SUB_FETCH_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_AND_FETCH_N, "__atomic_and_fetch",
@@ -492,6 +524,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_ADD_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_ADD_16, "__atomic_fetch_add_16",
		  BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define FETCH_ADD_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_FETCH_ADD, "__atomic_fetch_add",
+			 FETCH_ADD_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef FETCH_ADD_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_N, "__atomic_fetch_sub",
@@ -511,6 +547,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_16, "__atomic_fetch_sub_16",
		  BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define FETCH_SUB_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_FETCH_SUB, "__atomic_fetch_sub",
+			 FETCH_SUB_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef FETCH_SUB_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_AND_N, "__atomic_fetch_and",
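For reference, expanding these macros by hand for one operation shows
the set of names being introduced (my derivation from the macros above,
not text from the patch):

    /* DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_FETCH_ADD,
       "__atomic_fetch_add", ...) yields:  */
    __atomic_fetch_add_fpf      /* float              */
    __atomic_fetch_add_fp       /* double             */
    __atomic_fetch_add_fpl      /* long double        */
    __atomic_fetch_add_fpf16b   /* bfloat16 (__bf16)  */
    __atomic_fetch_add_fpf16    /* _Float16           */
    __atomic_fetch_add_fpf32    /* _Float32           */
    __atomic_fetch_add_fpf64    /* _Float64           */
    __atomic_fetch_add_fpf128   /* _Float128          */
    __atomic_fetch_add_fpf32x   /* _Float32x          */
    __atomic_fetch_add_fpf64x   /* _Float64x          */

The same pattern repeats for __atomic_fetch_sub, __atomic_add_fetch and
__atomic_sub_fetch.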
From patchwork Thu Sep 19 13:11:58 2024
X-Patchwork-Id: 97701
From: Matthew Malcomson <mmalcomson@nvidia.com>
To: gcc-patches@gcc.gnu.org
Cc: Jonathan Wakely, Joseph Myers, Richard Biener
Subject: [PATCH 2/8] [RFC] Add FP types for atomic builtin overload resolution
Date: Thu, 19 Sep 2024 14:11:58 +0100
Message-ID: <20240919131204.3865854-3-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
Have a bit of an ugly mapping from floating point type to the builtin
using that type.  Would like to find some code-sharing between this,
the function (in a later patch in this series) that finds the relevant
mode from a given builtin, and the general sync-builtins.def file.  As
yet I don't have a nice way to do that, but haven't looked that hard.

Other than that, it seems we can cleanly emit the functions that we
need.

N.b. we match which function to use based on the MODE of the type, for
two reasons:
1) We can't match directly on type, as otherwise `typedef float x`
   would mean that `x` could no longer be used with that intrinsic.
2) The MODE (i.e. the type's ABI) is the thing that we need to
   distinguish between when deciding which fundamental operation needs
   to be applied.
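A sketch of why reason 1 matters (my example, not from the patch; the
names `real32` and `scaled_add` are hypothetical):

    typedef float real32;

    real32
    scaled_add (real32 *p, real32 v)
    {
      /* real32 has the same mode as float (SFmode), so matching on the
         mode means this still resolves to the float variant of the
         builtin, even though the spelled type differs.  */
      return __atomic_fetch_add (p, v, __ATOMIC_RELAXED);
    }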
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
---
 gcc/c-family/c-common.cc | 88 ++++++++++++++++++++++++++++++++--------
 1 file changed, 70 insertions(+), 18 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index e7e371fd26f..c0a2b136d67 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -7360,13 +7360,15 @@ speculation_safe_value_resolve_return (tree first_param, tree result)
 
 static int
 sync_resolve_size (tree function, vec<tree, va_gc> *params, bool fetch,
-		   bool orig_format)
+		   bool orig_format,
+		   int *fp_specialisation_offset)
 {
   /* Type of the argument.  */
   tree argtype;
   /* Type the argument points to.  */
   tree type;
   int size;
+  bool valid_float = false;
 
   if (vec_safe_is_empty (params))
     {
@@ -7385,7 +7387,8 @@ sync_resolve_size (tree function, vec<tree, va_gc> *params, bool fetch,
     goto incompatible;
 
   type = TREE_TYPE (type);
-  if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type))
+  valid_float = fp_specialisation_offset && fetch && SCALAR_FLOAT_TYPE_P (type);
+  if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type) && !valid_float)
     goto incompatible;
 
   if (!COMPLETE_TYPE_P (type))
@@ -7402,6 +7405,40 @@ sync_resolve_size (tree function, vec<tree, va_gc> *params, bool fetch,
       && !targetm.scalar_mode_supported_p (TImode))
     return -1;
 
+  if (valid_float)
+    {
+      tree fp_type = type;
+      /* TODO Want a better reverse-mapping between an argument type and
+	 the builtin enum.  */
+      struct type_to_offset { tree type; size_t offset; };
+      static const struct type_to_offset fp_type_mappings[] = {
+	{ float_type_node, 6 },
+	{ double_type_node, 7 },
+	{ long_double_type_node, 8 },
+	{ bfloat16_type_node ? bfloat16_type_node : error_mark_node, 9 },
+	{ float16_type_node ? float16_type_node : error_mark_node, 10 },
+	{ float32_type_node ? float32_type_node : error_mark_node, 11 },
+	{ float64_type_node ? float64_type_node : error_mark_node, 12 },
+	{ float128_type_node ? float128_type_node : error_mark_node, 13 },
+	{ float32x_type_node ? float32x_type_node : error_mark_node, 14 },
+	{ float64x_type_node ? float64x_type_node : error_mark_node, 15 }
+      };
+      size_t offset = 0;
+      for (size_t i = 0;
+	   i < sizeof (fp_type_mappings) / sizeof (fp_type_mappings[0]);
+	   ++i)
+	{
+	  if (TYPE_MODE (fp_type) == TYPE_MODE (fp_type_mappings[i].type))
+	    {
+	      offset = fp_type_mappings[i].offset;
+	      break;
+	    }
+	}
+      if (offset == 0)
+	goto incompatible;
+      *fp_specialisation_offset = offset;
+      return -1;
+    }
+
   if (size == 1 || size == 2 || size == 4 || size == 8 || size == 16)
     return size;
 
@@ -7462,9 +7499,10 @@ sync_resolve_params (location_t loc, tree orig_function, tree function,
	 arguments (e.g. EXPECTED argument of __atomic_compare_exchange_n),
	 bool arguments (e.g. WEAK argument) or signed int arguments (memmodel
	 kinds).  */
-      if (TREE_CODE (arg_type) == INTEGER_TYPE && TYPE_UNSIGNED (arg_type))
+      if ((TREE_CODE (arg_type) == INTEGER_TYPE && TYPE_UNSIGNED (arg_type))
+	  || SCALAR_FLOAT_TYPE_P (arg_type))
	{
-	  /* Ideally for the first conversion we'd use convert_for_assignment
+	  /* Ideally, for the first conversion we'd use convert_for_assignment
	     so that we get warnings for anything that doesn't match the
	     pointer type.  This isn't portable across the C and C++ front
	     ends atm.  */
	  val = (*params)[parmnum];
@@ -8256,7 +8294,6 @@ atomic_bitint_fetch_using_cas_loop (location_t loc,
			    NULL_TREE);
 }
 
-
 /* Some builtin functions are placeholders for other expressions.  This
    function should be called immediately after parsing the call expression
    before surrounding code has committed to the type of the expression.
@@ -8277,6 +8314,9 @@ resolve_overloaded_builtin (location_t loc, tree function,
      and so must be rejected.  */
   bool fetch_op = true;
   bool orig_format = true;
+  /* Is this function one of the builtins that has floating point
+     specializations?  */
+  bool fetch_maybe_float = false;
   tree new_return = NULL_TREE;
 
   switch (DECL_BUILT_IN_CLASS (function))
@@ -8406,12 +8446,14 @@ resolve_overloaded_builtin (location_t loc, tree function,
       /* FALLTHRU */
     case BUILT_IN_ATOMIC_ADD_FETCH_N:
    case BUILT_IN_ATOMIC_SUB_FETCH_N:
+    case BUILT_IN_ATOMIC_FETCH_SUB_N:
+    case BUILT_IN_ATOMIC_FETCH_ADD_N:
+      fetch_maybe_float = true;
+      /* FALLTHRU */
     case BUILT_IN_ATOMIC_AND_FETCH_N:
     case BUILT_IN_ATOMIC_NAND_FETCH_N:
     case BUILT_IN_ATOMIC_XOR_FETCH_N:
     case BUILT_IN_ATOMIC_OR_FETCH_N:
-    case BUILT_IN_ATOMIC_FETCH_ADD_N:
-    case BUILT_IN_ATOMIC_FETCH_SUB_N:
     case BUILT_IN_ATOMIC_FETCH_AND_N:
     case BUILT_IN_ATOMIC_FETCH_NAND_N:
     case BUILT_IN_ATOMIC_FETCH_XOR_N:
@@ -8443,23 +8485,33 @@ resolve_overloaded_builtin (location_t loc, tree function,
			  && orig_code != BUILT_IN_SYNC_LOCK_TEST_AND_SET_N
			  && orig_code != BUILT_IN_SYNC_LOCK_RELEASE_N);
 
-	int n = sync_resolve_size (function, params, fetch_op, orig_format);
+	int fp_specialisation_offset = 0;
+	int n = sync_resolve_size (function, params, fetch_op, orig_format,
+				   fetch_maybe_float
+				   ? &fp_specialisation_offset
+				   : NULL);
	tree new_function, first_param, result;
	enum built_in_function fncode;
 
	if (n == 0)
	  return error_mark_node;
 
-	if (n == -1)
+	/* If this is a floating point atomic operation,
+	   the operation does not have a backend implementation,
+	   or we are asking for things to not be inlined,
+	   then inline it as a CAS loop.  */
+	if (fp_specialisation_offset != 0)
+	  fncode = (enum built_in_function) ((int) orig_code
+					     + fp_specialisation_offset);
+	else if (n == -1)
	  return atomic_bitint_fetch_using_cas_loop (loc, orig_code,
						     function, params);
+	else
+	  fncode = (enum built_in_function) ((int) orig_code
+					     + exact_log2 (n) + 1);
 
-	fncode = (enum built_in_function)((int)orig_code + exact_log2 (n) + 1);
-	new_function = builtin_decl_explicit (fncode);
-	if (!sync_resolve_params (loc, function, new_function, params,
-				  orig_format))
-	  return error_mark_node;
-
+	new_function = builtin_decl_explicit (fncode);
+	if (!sync_resolve_params (loc, function, new_function, params,
+				  orig_format))
+	  return error_mark_node;
	first_param = (*params)[0];
	result = build_function_call_vec (loc, vNULL, new_function, params,
					  NULL);
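As a worked example of the offset arithmetic above (mine, not from the
patch, assuming the enum order laid down in sync-builtins.def earlier
in this series): for `__atomic_fetch_add` on a `double`,
`sync_resolve_size` sets `fp_specialisation_offset` to 7 (from the
`double_type_node` entry of `fp_type_mappings`), so

    fncode = BUILT_IN_ATOMIC_FETCH_ADD_N + 7
	   = BUILT_IN_ATOMIC_FETCH_ADD_FP     /* the double variant */

which mirrors the integral path, where a 4-byte operand gives

    fncode = BUILT_IN_ATOMIC_FETCH_ADD_N + exact_log2 (4) + 1
	   = BUILT_IN_ATOMIC_FETCH_ADD_4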
From patchwork Thu Sep 19 13:11:59 2024
X-Patchwork-Id: 97706
From: Matthew Malcomson <mmalcomson@nvidia.com>
To: gcc-patches@gcc.gnu.org
Cc: Jonathan Wakely, Joseph Myers, Richard Biener
Subject: [PATCH 3/8] [RFC] Tie the new atomic builtins to the backend
Date: Thu, 19 Sep 2024 14:11:59 +0100
Message-ID: <20240919131204.3865854-4-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
Things implemented in this patch:
1) Update the optabs definitions to include floating point versions of
   the atomic fetch_add variants.
2) When expanding into a CAS loop in RTL because the floating point
   optab is not implemented, there are now two different modes.  One is
   the integral mode in which the atomic CAS (and load) should be
   performed, and one is the floating point mode in which the operation
   should be performed.
   - Extra handling of modes etc. in `expand_atomic_fetch_op`.

Things to highlight to any reviewer:
1) Needed another mapping from builtin to mode.  This is *almost*
   shared code between this and the frontend.  Looks like it could be
   shared if I put some effort into it.
2) I do not always expand into the modify-before version, but also use
   the modify-after version when unable to inline.
   - From looking at the dates on different parts of the code, it seems
     that this used to be needed before libatomic was added as a target
     library.  Since libatomic currently implements both the fetch_*
     and *_fetch versions, I don't believe it's needed any more.
3) I use `extract_bit_field` to convert between representations when
   expanding as a fallback (because the fallback CAS loop loads into an
   integral register and we want to reinterpret that as a floating
   point type before the intermediate operation).
   - Not sure if there's a better way I don't know about.

Other than that, everything seems to follow straightforwardly from what
is already done.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
---
 gcc/builtins.cc | 153 +++++++++++++++++++++++++++++++++++++++++++++---
 gcc/optabs.cc   | 101 ++++++++++++++++++++++++++++----
 gcc/optabs.def  |   6 +-
 gcc/optabs.h    |   2 +-
 4 files changed, 241 insertions(+), 21 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 0b902896ddd..0ffd7d0b211 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -6394,6 +6394,46 @@ get_builtin_sync_mode (int fcode_diff)
   return int_mode_for_size (BITS_PER_UNIT << fcode_diff, 0).require ();
 }
 
+/* Reconstitute the machine modes relevant for this builtin operation from
+   the builtin difference from the _N version of a fetch_add atomic.
+
+   Only works for floating point atomic builtins.
+   FCODE_DIFF should be fcode - base, where base is the FOO_N code for the
+   group of builtins.  N.b. this is a different base to that used by
+   `get_builtin_sync_mode` because that matches the builtin enum offset used
+   in c-common.cc to find the builtin enum from a given MODE.
+
+   TODO Really do need to figure out a bit neater code here.  Should not be
+   inlining the mapping from type to offset in two different places.  */
+static inline machine_mode
+get_builtin_fp_sync_mode (int fcode_diff, machine_mode *mode)
+{
+  struct type_to_offset { tree type; size_t offset; };
+  static const struct type_to_offset fp_type_mappings[] = {
+    { float_type_node, 6 },
+    { double_type_node, 7 },
+    { long_double_type_node, 8 },
+    { bfloat16_type_node ? bfloat16_type_node : error_mark_node, 9 },
+    { float16_type_node ? float16_type_node : error_mark_node, 10 },
+    { float32_type_node ? float32_type_node : error_mark_node, 11 },
+    { float64_type_node ? float64_type_node : error_mark_node, 12 },
+    { float128_type_node ? float128_type_node : error_mark_node, 13 },
+    { float32x_type_node ? float32x_type_node : error_mark_node, 14 },
+    { float64x_type_node ? float64x_type_node : error_mark_node, 15 }
+  };
+  gcc_assert (fcode_diff <= 15 && fcode_diff >= 6);
+  for (size_t i = 0;
+       i < sizeof (fp_type_mappings) / sizeof (fp_type_mappings[0]); i++)
+    {
+      if ((size_t) fcode_diff == fp_type_mappings[i].offset)
+	{
+	  *mode = TYPE_MODE (fp_type_mappings[i].type);
+	  return int_mode_for_size (GET_MODE_SIZE (*mode) * BITS_PER_UNIT, 0)
+	    .require ();
+	}
+    }
+  gcc_unreachable ();
+}
+
 /* Expand the memory expression LOC and return the appropriate memory operand
    for the builtin_sync operations.  */
@@ -6886,9 +6926,10 @@ expand_builtin_atomic_store (machine_mode mode, tree exp)
    resolved to an instruction sequence.  */
 
 static rtx
-expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
+expand_builtin_atomic_fetch_op (machine_mode expand_mode, tree exp, rtx target,
				enum rtx_code code, bool fetch_after,
-				bool ignore, enum built_in_function ext_call)
+				bool ignore, enum built_in_function ext_call,
+				machine_mode load_mode = VOIDmode)
 {
   rtx val, mem, ret;
   enum memmodel model;
@@ -6898,13 +6939,13 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
   model = get_memmodel (CALL_EXPR_ARG (exp, 2));
 
   /* Expand the operands.  */
-  mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
-  val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), mode);
+  mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), expand_mode);
+  val = expand_expr_force_mode (CALL_EXPR_ARG (exp, 1), expand_mode);
 
   /* Only try generating instructions if inlining is turned on.  */
   if (flag_inline_atomics)
     {
-      ret = expand_atomic_fetch_op (target, mem, val, code, model, fetch_after);
+      ret = expand_atomic_fetch_op (target, mem, val, code, model, fetch_after, load_mode);
       if (ret)
	return ret;
     }
@@ -6938,12 +6979,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
     {
       if (code == NOT)
	{
-	  ret = expand_simple_binop (mode, AND, ret, val, NULL_RTX, true,
+	  ret = expand_simple_binop (expand_mode, AND, ret, val, NULL_RTX, true,
				     OPTAB_LIB_WIDEN);
-	  ret = expand_simple_unop (mode, NOT, ret, target, true);
+	  ret = expand_simple_unop (expand_mode, NOT, ret, target, true);
	}
       else
-	ret = expand_simple_binop (mode, code, ret, val, target, true,
+	ret = expand_simple_binop (expand_mode, code, ret, val, target, true,
				   OPTAB_LIB_WIDEN);
     }
   return ret;
@@ -8779,7 +8820,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
       if (target)
	return target;
       break;
-
+
     case BUILT_IN_ATOMIC_FETCH_SUB_1:
     case BUILT_IN_ATOMIC_FETCH_SUB_2:
     case BUILT_IN_ATOMIC_FETCH_SUB_4:
@@ -8840,6 +8881,100 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
	return target;
       break;
 
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FP:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPL:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF16B:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF16:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF32:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF64:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF128:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF32X:
+    case BUILT_IN_ATOMIC_FETCH_ADD_FPF64X:
+      {
+	machine_mode int_mode
+	  = get_builtin_fp_sync_mode (fcode - BUILT_IN_ATOMIC_FETCH_ADD_N, &mode);
+	target = expand_builtin_atomic_fetch_op (mode, exp, target, PLUS, false,
+						 ignore, BUILT_IN_NONE, int_mode);
+	if (target)
+	  return target;
+	break;
+      }
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FP:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPL:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF16B:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF16:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF32:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF64:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF128:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF32X:
+    case BUILT_IN_ATOMIC_ADD_FETCH_FPF64X:
+      {
+	/* TODO I don't translate to the FETCH_ADD library call if this fails
+	   to inline.  The integral ADD_FETCH versions of atomic functions do.
+	   I don't understand why they make that transformation; I could
+	   *guess* that it's the more likely function to be implemented,
+	   except that libatomic seems to implement everything if it
+	   implements anything.
+	   -- Any explanation why the integral versions make this translation
+	   (and hence whether these floating point versions should make that
+	   translation) would be welcomed.
+
+	   A comment in gcc.dg/atomic-noinline.c seems to imply that such a
+	   translation was necessary at one point.  That comment was added to
+	   the testsuite file before the introduction of libatomic to the GCC
+	   target library.  I guess this was something needed in an earlier
+	   state of the ecosystem.  */
+	machine_mode int_mode
+	  = get_builtin_fp_sync_mode (fcode - BUILT_IN_ATOMIC_ADD_FETCH_N, &mode);
+	target = expand_builtin_atomic_fetch_op (mode, exp, target, PLUS, true,
+						 ignore, BUILT_IN_NONE, int_mode);
+	if (target)
+	  return target;
+	break;
+      }
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FP:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPL:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF16B:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF16:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF32:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF64:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF128:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF32X:
+    case BUILT_IN_ATOMIC_FETCH_SUB_FPF64X:
+      {
+	machine_mode int_mode
+	  = get_builtin_fp_sync_mode (fcode - BUILT_IN_ATOMIC_FETCH_SUB_N, &mode);
+	target = expand_builtin_atomic_fetch_op (mode, exp, target, MINUS, false,
+						 ignore, BUILT_IN_NONE, int_mode);
+	if (target)
+	  return target;
+	break;
+      }
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FP:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPL:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF16B:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF16:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF32:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF64:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF128:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF32X:
+    case BUILT_IN_ATOMIC_SUB_FETCH_FPF64X:
+      {
+	machine_mode int_mode
+	  = get_builtin_fp_sync_mode (fcode - BUILT_IN_ATOMIC_SUB_FETCH_N, &mode);
+	target = expand_builtin_atomic_fetch_op (mode, exp, target, MINUS, true,
+						 ignore, BUILT_IN_NONE, int_mode);
+	if (target)
+	  return target;
+	break;
+      }
+
     case BUILT_IN_ATOMIC_TEST_AND_SET:
       target = expand_builtin_atomic_test_and_set (exp, target);
       if (target)
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 185c5b1a705..ca395dde89b 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -7745,6 +7745,10 @@ expand_atomic_fetch_op_no_fallback (rtx target, rtx mem, rtx val,
       if (result)
	return result;
 
+  /* TODO For floating point, is there anything extra to worry about
+     w.r.t. rounding (i.e. is X+VAL guaranteed to be equal
+     to X-(-1 * VAL)).
+     Doubt it is, but wouldn't want to avoid the operation on a hunch.  */
   /* If the fetch value can be calculated from the other variation of fetch,
      try that operation.  */
   if (after || unused_result || optab.reverse_code != UNKNOWN)
@@ -7793,7 +7797,8 @@ expand_atomic_fetch_op_no_fallback (rtx target, rtx mem, rtx val,
    AFTER is false to return the value before the operation (fetch_OP).  */
*/
 rtx
 expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
-			enum memmodel model, bool after)
+			enum memmodel model, bool after,
+			machine_mode load_mode)
 {
   machine_mode mode = GET_MODE (mem);
   rtx result;
@@ -7802,7 +7807,7 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
   /* If loads are not atomic for the required size and we are not called to
      provide a __sync builtin, do not do anything so that we stay consistent
      with atomic loads of the same size.  */
-  if (!can_atomic_load_p (mode) && !is_mm_sync (model))
+  if (!can_atomic_load_p (load_mode) && !is_mm_sync (model))
     return NULL_RTX;

   result = expand_atomic_fetch_op_no_fallback (target, mem, val, code, model,
@@ -7817,6 +7822,13 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
       rtx tmp;
       enum rtx_code reverse = (code == PLUS ? MINUS : PLUS);

+      /* TODO Need to double-check whether there are any floating point
+	 problems with doing the reverse operation on a negated value.
+	 (Don't know of any particular problem -- just have this feeling that
+	 floating point transformations are tricky.)
+
+	 FWIW I have the impression this is fine because GCC optimizes x + (-y)
+	 to x - y for floating point values.  */
       start_sequence ();
       tmp = expand_simple_unop (mode, NEG, val, NULL_RTX, true);
       result = expand_atomic_fetch_op_no_fallback (target, mem, tmp, reverse,
@@ -7835,7 +7847,7 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
     }

   /* Try the __sync libcalls only if we can't do compare-and-swap inline.  */
-  if (!can_compare_and_swap_p (mode, false))
+  if (!can_compare_and_swap_p (load_mode, false))
     {
       rtx libfunc;
       bool fixup = false;
@@ -7870,11 +7882,41 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
       code = orig_code;
     }

-  /* If nothing else has succeeded, default to a compare and swap loop.  */
-  if (can_compare_and_swap_p (mode, true))
+  /* If nothing else has succeeded, default to a compare and swap loop.
+
+     N.b. for modes where the compare and swap has to happen in a different
+     mode to the operation, we have to convert between the integral mode that
+     the CAS loop is going to be using and the mode that our operations are
+     performed in.  This happens for modes where load_mode != mode, e.g. where
+     `mode` is a floating point mode and `load_mode` is an integral one.  */
+  if (can_compare_and_swap_p (load_mode, true))
     {
+      /* Should have been ensured by the caller, but nice to make sure.  */
+      gcc_assert (known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (load_mode)));
+      poly_uint64 loadmode_bitsize = GET_MODE_SIZE (load_mode) * BITS_PER_UNIT;
       rtx_insn *insn;
-      rtx t0 = gen_reg_rtx (mode), t1;
+      rtx t0 = gen_reg_rtx (load_mode), t1;
+      rtx tmp = gen_reg_rtx (mode);
+      /* TODO Is there a better way than this to convert between
+	 interpretations?  We need bitwise interpretation because the atomic
+	 memory operations are being performed on an integral register.
*/
+      auto interpret_as_float =
+	[loadmode_bitsize, mode] (rtx target, rtx irtx) -> rtx {
+	  rtx tmp = extract_bit_field (irtx, loadmode_bitsize, 0, true, target,
+				       mode, mode, false, NULL);
+	  if (tmp != target)
+	    emit_move_insn (target, tmp);
+	  return target;
+	};
+      auto interpret_as_int
+	= [loadmode_bitsize, load_mode] (rtx target, rtx frtx) -> rtx {
+	  rtx tmp = extract_bit_field (frtx, loadmode_bitsize, 0, true, target,
+				       load_mode, load_mode, false, NULL);
+	  if (tmp != target)
+	    emit_move_insn (target, tmp);
+	  return target;
+	};

       start_sequence ();
@@ -7885,7 +7927,12 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
 	  target = gen_reg_rtx (mode);
 	  /* If fetch_before, copy the value now.  */
 	  if (!after)
-	    emit_move_insn (target, t0);
+	    {
+	      if (load_mode == mode)
+		emit_move_insn (target, t0);
+	      else
+		interpret_as_float (target, t0);
+	    }
 	}
       else
 	target = const0_rtx;
@@ -7897,18 +7944,52 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
 			     true, OPTAB_LIB_WIDEN);
 	  t1 = expand_simple_unop (mode, code, t1, NULL_RTX, true);
 	}
-      else
+      else if (load_mode == mode)
 	t1 = expand_simple_binop (mode, code, t1, val, NULL_RTX, true,
 				  OPTAB_LIB_WIDEN);
+      else
+	{
+	  interpret_as_float (tmp, t1);
+	  tmp = expand_simple_binop (mode, code, tmp, val, NULL_RTX, true,
+				     OPTAB_LIB_WIDEN);
+	  t1 = gen_reg_rtx (load_mode);
+	  interpret_as_int (t1, tmp);
+	}

       /* For after, copy the value now.  */
       if (!unused_result && after)
-	emit_move_insn (target, t1);
+	emit_move_insn (target, load_mode == mode ? t1 : tmp);
       insn = get_insns ();
       end_sequence ();

+      /*
+	 Outside `expand_compare_and_swap_loop` (i.e. inside the `seq`) I've
+	 done the following:
+	     tmp (floating) = old_reg (integral)
+	     tmp += 1
+	     new_reg (integral) = tmp (floating)
+	 `expand_compare_and_swap_loop` wraps the sequence it's given as
+	 described at the top of its implementation:
+	     cmp_reg = mem
+	   label:
+	     old_reg = cmp_reg;
+	     tmp (floating) = old_reg (integral)
+	     tmp += 1;
+	     new_reg (integral) = tmp (floating)
+	     (success, cmp_reg) = CAS(mem, old_reg, new_reg)
+	     if (!success)
+	       goto label;
+
+	 In order to implement this, what we want is to expand the MEM as an
+	 integral value before passing it into this function.  Then this
+	 function would not have to understand anything about the fact that
+	 the inner operation is a floating point one.
+	 - N.b. there is the question of whether we'd like the conversion
+	   inside or outside the loop.  I don't think it matters TBH, though I
+	   could easily be missing something here.
*/
+      mem = adjust_address (mem, load_mode, 0);
       if (t1 != NULL && expand_compare_and_swap_loop (mem, t0, t1, insn))
-	return target;
+	  return target;
     }

   return NULL_RTX;
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 45e117a7f50..a450c4bba81 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -503,6 +503,7 @@ OPTAB_D (sync_sub_optab, "sync_sub$I$a")
 OPTAB_D (sync_xor_optab, "sync_xor$I$a")

 OPTAB_D (atomic_add_fetch_optab, "atomic_add_fetch$I$a")
+OPTAB_NX (atomic_add_fetch_optab, "atomic_add_fetch$F$a")
 OPTAB_D (atomic_add_optab, "atomic_add$I$a")
 OPTAB_D (atomic_and_fetch_optab, "atomic_and_fetch$I$a")
 OPTAB_D (atomic_and_optab, "atomic_and$I$a")
@@ -511,11 +512,13 @@ OPTAB_D (atomic_bit_test_and_complement_optab, "atomic_bit_test_and_complement$I
 OPTAB_D (atomic_bit_test_and_reset_optab, "atomic_bit_test_and_reset$I$a")
 OPTAB_D (atomic_compare_and_swap_optab, "atomic_compare_and_swap$I$a")
 OPTAB_D (atomic_exchange_optab, "atomic_exchange$I$a")
-OPTAB_D (atomic_fetch_add_optab, "atomic_fetch_add$I$a")
+OPTAB_D (atomic_fetch_add_optab, "atomic_fetch_add$F$a")
+OPTAB_NX (atomic_fetch_add_optab, "atomic_fetch_add$I$a")
 OPTAB_D (atomic_fetch_and_optab, "atomic_fetch_and$I$a")
 OPTAB_D (atomic_fetch_nand_optab, "atomic_fetch_nand$I$a")
 OPTAB_D (atomic_fetch_or_optab, "atomic_fetch_or$I$a")
 OPTAB_D (atomic_fetch_sub_optab, "atomic_fetch_sub$I$a")
+OPTAB_NX (atomic_fetch_sub_optab, "atomic_fetch_sub$F$a")
 OPTAB_D (atomic_fetch_xor_optab, "atomic_fetch_xor$I$a")
 OPTAB_D (atomic_load_optab, "atomic_load$I$a")
 OPTAB_D (atomic_nand_fetch_optab, "atomic_nand_fetch$I$a")
@@ -524,6 +527,7 @@ OPTAB_D (atomic_or_fetch_optab, "atomic_or_fetch$I$a")
 OPTAB_D (atomic_or_optab, "atomic_or$I$a")
 OPTAB_D (atomic_store_optab, "atomic_store$I$a")
 OPTAB_D (atomic_sub_fetch_optab, "atomic_sub_fetch$I$a")
+OPTAB_NX (atomic_sub_fetch_optab, "atomic_sub_fetch$F$a")
 OPTAB_D (atomic_sub_optab, "atomic_sub$I$a")
 OPTAB_D (atomic_xor_fetch_optab, "atomic_xor_fetch$I$a")
 OPTAB_D (atomic_xor_optab, "atomic_xor$I$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 301847e2186..8da637e87b6 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -366,7 +366,7 @@ extern void expand_mem_signal_fence (enum memmodel);
 rtx expand_atomic_load (rtx, rtx, enum memmodel);
 rtx expand_atomic_store (rtx, rtx, enum memmodel, bool);
 rtx expand_atomic_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel,
-			    bool);
+			    bool, machine_mode load_mode = VOIDmode);
 extern void expand_asm_reg_clobber_mem_blockage (HARD_REG_SET);

From patchwork Thu Sep 19 13:12:00 2024
X-Patchwork-Submitter: Matthew Malcomson
X-Patchwork-Id: 97704
CC: Jonathan Wakely, Joseph Myers, Richard Biener, Matthew Malcomson
Subject: [PATCH 4/8] [RFC] Have libatomic working as first draft
Date: Thu, 19 Sep 2024 14:12:00 +0100
Message-ID: <20240919131204.3865854-5-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
MIME-Version: 1.0
From: Matthew Malcomson

As it stands there are still a few things to look at to see whether they
could be improved:
1) Need to find the exact version of automake to use.  I'm using automake
   1.15.1 from https://ftp.gnu.org/gnu/automake/ but the header is claiming
   I'm using automake 1.15.
2) The internal naming is all a little "not right" for floating point.
   E.g. the SIZE() macro is no longer adding an integer size suffix to
   something, but instead adding a suffix representing a type.  Not sure
   whether the churn to fix this is worth it -- will ask upstream.
3) Have not implemented the word-size compare and swap loop fallback.  This
   is because the implementation uses a mask, and the mask is not always the
   same for any given architecture.  Hence the existing approach in the code
   would not work for all floating point types.
   - I would appreciate some feedback about whether this is OK not to
     implement.  Seems reasonable to me.
4) In the existing test for the availability of an atomic fetch operation
   there are two things whose purpose I do not know, and hence I didn't add
   them to the check for atomic floating point fetch_{add,sub}.  I just
   wanted to highlight this in case I missed something.
   1) I only put the `x` variable into a register with an `asm` call.  To be
      honest I don't know why anything needs to be put into a register, but I
      didn't put the floating point value into a register because I didn't
      know of a standard GCC floating point register constraint that worked
      across all architectures.
      - Is there any need for this `asm` line (I copied it from the existing
        libatomic configure code without understanding it)?
      - Is there any need for the constant addition to be applied?
   2) I used a cast of a 1.0 floating point literal as the "addition" for all
      floating point types in the configury check.
      - Is there something subtle I'm missing about this?  (I *think* it's
        fine, but this felt like a place where I could trip up without
        knowing.)

Description of things done in this commit:

We implement the new floating point builtins around fetch_add.  This is
mostly a configure/makefile change.
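To make the intent concrete, here is a minimal usage sketch (illustrative
only, not code from this patch; it assumes the new builtins take the same
memory-order arguments as their integral counterparts):

  #include <stdio.h>

  static double total = 0.0;

  int
  main (void)
  {
    /* fetch_add: returns the value *before* the addition.  */
    double before = __atomic_fetch_add (&total, 1.5, __ATOMIC_SEQ_CST);
    /* add_fetch: returns the value *after* the addition.  */
    double after = __atomic_add_fetch (&total, 2.5, __ATOMIC_SEQ_CST);
    printf ("%f %f %f\n", before, after, total);
    return 0;
  }

When no native floating point atomic add is available, both the compiler
expansion and the fop_n.c machinery described below boil down to a
compare-and-swap loop over the value's bit pattern -- roughly the following
(again only a sketch, written with the type-generic __atomic_compare_exchange
builtin; the real code performs the CAS on a same-sized integer register):

  static double
  fetch_add_double_fallback (double *ptr, double val)
  {
    double expected = *ptr;  /* the real code uses an atomic load here */
    double desired;
    do
      desired = expected + val;
    /* On failure, `expected` is refreshed with the current contents.  */
    while (!__atomic_compare_exchange (ptr, &expected, &desired,
                                       /* weak */ 1,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
    return expected;  /* the value before the addition */
  }

Note that the CAS compares bit patterns rather than floating point values,
which is also why the expansion in optabs.cc reinterprets between floating
and integral registers.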
The main overview of the changes is that we create a new list of suffixes
(_fpf, _fp, _fpl, _fp16b, _fp16, _fp32, _fp64, _fp128, _fp32x, _fp64x) and
re-compile fadd_n.c and fsub_n.c for these suffixes.

The existing machinery for checking whether a given atomic builtin is
implemented is extended to check for these same suffixes on the atomic
builtins.

The existing machinery for generating atomic fetch_<op> implementations from
a given suffix and general patterns is also re-used (n.b. with the exception
that the implementation based on a compare and exchange of a word is not
implemented, because the pre-processor does not know the size of the floating
point types).

The AArch64 backend is updated slightly.  It didn't build because it assumed
there was some IFUNC implementation for all operations (and there isn't any
IFUNC for the new floating point operations).

The new functions are advertised as LIBATOMIC_1.3 in the linker map for the
dynamic library.

Signed-off-by: Matthew Malcomson
---
 libatomic/Makefile.am                        |    6 +-
 libatomic/Makefile.in                        |   12 +-
 libatomic/acinclude.m4                       |   49 +
 libatomic/auto-config.h.in                   |   84 +-
 libatomic/config/linux/aarch64/host-config.h |    2 +
 libatomic/configure                          | 1153 +++++++++++++++++-
 libatomic/configure.ac                       |    4 +
 libatomic/fadd_n.c                           |   23 +
 libatomic/fop_n.c                            |    5 +-
 libatomic/fsub_n.c                           |   23 +
 libatomic/libatomic.map                      |   44 +
 libatomic/libatomic_i.h                      |   58 +
 libatomic/testsuite/Makefile.in              |    1 +
 13 files changed, 1392 insertions(+), 72 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index efadd9dcd48..ec24f1da86b 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -110,6 +110,7 @@ IFUNC_OPT = $(word $(PAT_S),$(IFUNC_OPTIONS))
 M_SIZE = -DN=$(PAT_N)
 M_IFUNC = $(if $(PAT_S),$(IFUNC_DEF) $(IFUNC_OPT))
 M_FILE = $(PAT_BASE)_n.c
+M_FLOATING = $(if $(findstring $(PAT_N),$(FPSUFFIXES)),-DFLOATING)

 # The lack of explicit dependency on the source file means that VPATH cannot
 # work properly.  Instead, perform this operation by hand.  First, collect a
@@ -120,10 +121,13 @@ all_c_files := $(foreach dir,$(search_path),$(wildcard $(dir)/*.c))
 M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))

 %_.lo: Makefile
-	$(LTCOMPILE) $(M_DEPS) $(M_SIZE) $(M_IFUNC) -c -o $@ $(M_SRC)
+	$(LTCOMPILE) $(M_DEPS) $(M_SIZE) $(M_FLOATING) $(M_IFUNC) -c -o $@ $(M_SRC)

 ## Include all of the sizes in the "normal" set of compilation flags.
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix _$(s)_.lo,$(SIZEOBJS)))
+# Include the special floating point "sizes" as a set of compilation flags.
+FPOBJS = fadd fsub
+libatomic_la_LIBADD += $(foreach suf,$(FPSUFFIXES),$(addsuffix _$(suf)_.lo,$(FPOBJS)))

 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index 9798e7c09e9..c70ebf9cc8b 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -289,6 +289,7 @@ ECHO_T = @ECHO_T@
 EGREP = @EGREP@
 EXEEXT = @EXEEXT@
 FGREP = @FGREP@
+FPSUFFIXES = @FPSUFFIXES@
 GREP = @GREP@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
@@ -441,6 +442,7 @@ IFUNC_OPT = $(word $(PAT_S),$(IFUNC_OPTIONS))
 M_SIZE = -DN=$(PAT_N)
 M_IFUNC = $(if $(PAT_S),$(IFUNC_DEF) $(IFUNC_OPT))
 M_FILE = $(PAT_BASE)_n.c
+M_FLOATING = $(if $(findstring $(PAT_N),$(FPSUFFIXES)),-DFLOATING)

 # The lack of explicit dependency on the source file means that VPATH cannot
 # work properly.  Instead, perform this operation by hand.  First, collect a
@@ -450,8 +452,12 @@ all_c_files := $(foreach dir,$(search_path),$(wildcard $(dir)/*.c))
 # Then sort through them to find the one we want, and select the first.
 M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
-	_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_2) \
-	$(am__append_3) $(am__append_4)
+	_$(s)_.lo,$(SIZEOBJS))) $(foreach \
+	suf,$(FPSUFFIXES),$(addsuffix _$(suf)_.lo,$(FPOBJS))) \
+	$(am__append_1) $(am__append_2) $(am__append_3) \
+	$(am__append_4)
+# Include the special floating point "sizes" as a set of compilation flags.
+FPOBJS = fadd fsub
 @ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv8-a+lse
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp -DHAVE_KERNEL64
 @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
@@ -894,7 +900,7 @@ vpath % $(strip $(search_path))
 -include $(wildcard $(DEPDIR)/*.Ppo)

 %_.lo: Makefile
-	$(LTCOMPILE) $(M_DEPS) $(M_SIZE) $(M_IFUNC) -c -o $@ $(M_SRC)
+	$(LTCOMPILE) $(M_DEPS) $(M_SIZE) $(M_FLOATING) $(M_IFUNC) -c -o $@ $(M_SRC)

 # Amend the automake generated all-multi rule to guarantee that all-multi
 # is not run in parallel with the %_.lo rules which generate $(DEPDIR)/*.Ppo
diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
index f35ab5b60a5..59445f68b85 100644
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -17,6 +17,23 @@ AC_DEFUN([LIBAT_FORALL_MODES],
    $1(TI,16)]
 )
 dnl
+dnl Iterate over all the floating types that we want to check.
+dnl Arguments to the function given are the floating point type and the
+dnl suffix used to identify this floating point type.
+dnl
+AC_DEFUN([LIBAT_FOR_FLOATING_TYPES],
+  [$1(float,fpf)
+   $1(double,fp)
+   $1(long double,fpl)
+   $1(bfloat16,fpf16b)
+   $1(_Float16,fpf16)
+   $1(_Float32,fpf32)
+   $1(_Float64,fpf64)
+   $1(_Float128,fpf128)
+   $1(_Float32x,fpf32x)
+   $1(_Float64x,fpf64x)]
+)
+dnl
 dnl Check for builtin types by mode.
 dnl
 dnl A less interesting of size checking than autoconf normally provides.
@@ -33,6 +50,20 @@ AC_DEFUN([LIBAT_HAVE_INT_MODE],[
     SIZES="$SIZES $2"
   fi
 ])
+dnl
+dnl Check if built-in floating point types are defined for this target.
+dnl
+AC_DEFUN([LIBAT_HAVE_FLOATING_TYPE],[
+  AC_CACHE_CHECK([for type $1],[libat_cv_have_type_$2],
+    [AC_COMPILE_IFELSE([AC_LANG_SOURCE([$1 x;])],
+    [libat_cv_have_type_$2=yes],[libat_cv_have_type_$2=no])])
+  LIBAT_DEFINE_YESNO([HAVE_$2], [$libat_cv_have_type_$2],
+    [Have support for floating type $1.])
+  if test x$libat_cv_have_type_$2 = xyes; then
+    FPSUFFIXES="$FPSUFFIXES $2"
+  fi
+])
+
 dnl
 dnl Check for atomic builtins.
dnl See: @@ -154,6 +185,24 @@ AC_DEFUN([LIBAT_HAVE_ATOMIC_FETCH_ADD],[ AH_BOTTOM([#define MAYBE_HAVE_ATOMIC_FETCH_ADD_$2 HAVE_ATOMIC_FETCH_ADD_$2]) ]) +dnl +dnl Test if we have __atomic_fetch_add for floating point type $1, with suffix $2 +dnl +AC_DEFUN([LIBAT_HAVE_ATOMIC_FETCH_ADDSUB_FP],[ + LIBAT_TEST_ATOMIC_BUILTIN([for __atomic_fetch_{add,sub} for floating type $1], + [libat_cv_have_at_faddsub_$2], [ + $1 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, ($1)(1.0), 0); + __atomic_add_fetch (x, ($1)(1.0), 0); + __atomic_fetch_sub (x, ($1)(1.0), 0); + __atomic_sub_fetch (x, ($1)(1.0), 0); + ]) +LIBAT_DEFINE_YESNO([HAVE_ATOMIC_FETCH_ADDSUB_$2], [$libat_cv_have_at_faddsub_$2], + [Have __atomic_fetch_{add,sub} for floating point type $1.]) + AH_BOTTOM([#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_$2 HAVE_ATOMIC_FETCH_ADDSUB_$2]) +]) + dnl dnl Test if we have __atomic_fetch_op for all op for mode $1, size $2 dnl diff --git a/libatomic/auto-config.h.in b/libatomic/auto-config.h.in index ab3424a759e..fe56b31f4b6 100644 --- a/libatomic/auto-config.h.in +++ b/libatomic/auto-config.h.in @@ -33,6 +33,36 @@ /* Have __atomic_exchange for 8 byte integers. */ #undef HAVE_ATOMIC_EXCHANGE_8 +/* Have __atomic_fetch_{add,sub} for floating point type double. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fp + +/* Have __atomic_fetch_{add,sub} for floating point type float. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf + +/* Have __atomic_fetch_{add,sub} for floating point type _Float128. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf128 + +/* Have __atomic_fetch_{add,sub} for floating point type _Float16. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf16 + +/* Have __atomic_fetch_{add,sub} for floating point type bfloat16. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf16b + +/* Have __atomic_fetch_{add,sub} for floating point type _Float32. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf32 + +/* Have __atomic_fetch_{add,sub} for floating point type _Float32x. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf32x + +/* Have __atomic_fetch_{add,sub} for floating point type _Float64. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf64 + +/* Have __atomic_fetch_{add,sub} for floating point type _Float64x. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpf64x + +/* Have __atomic_fetch_{add,sub} for floating point type long double. */ +#undef HAVE_ATOMIC_FETCH_ADDSUB_fpl + /* Have __atomic_fetch_add for 1 byte integers. */ #undef HAVE_ATOMIC_FETCH_ADD_1 @@ -153,6 +183,36 @@ /* Define to 1 if you have the header file. */ #undef HAVE_UNISTD_H +/* Have support for floating type double. */ +#undef HAVE_fp + +/* Have support for floating type float. */ +#undef HAVE_fpf + +/* Have support for floating type _Float128. */ +#undef HAVE_fpf128 + +/* Have support for floating type _Float16. */ +#undef HAVE_fpf16 + +/* Have support for floating type bfloat16. */ +#undef HAVE_fpf16b + +/* Have support for floating type _Float32. */ +#undef HAVE_fpf32 + +/* Have support for floating type _Float32x. */ +#undef HAVE_fpf32x + +/* Have support for floating type _Float64. */ +#undef HAVE_fpf64 + +/* Have support for floating type _Float64x. */ +#undef HAVE_fpf64x + +/* Have support for floating type long double. */ +#undef HAVE_fpl + /* Define ifunc resolver function argument. 
*/ #undef IFUNC_RESOLVER_ARGS @@ -281,12 +341,32 @@ #define MAYBE_HAVE_ATOMIC_FETCH_OP_16 HAVE_ATOMIC_FETCH_OP_16 +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf HAVE_ATOMIC_FETCH_ADDSUB_fpf + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fp HAVE_ATOMIC_FETCH_ADDSUB_fp + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpl HAVE_ATOMIC_FETCH_ADDSUB_fpl + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf16b HAVE_ATOMIC_FETCH_ADDSUB_fpf16b + +#define FAST_ATOMIC_LDST_2 HAVE_ATOMIC_LDST_2 + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf16 HAVE_ATOMIC_FETCH_ADDSUB_fpf16 + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf32 HAVE_ATOMIC_FETCH_ADDSUB_fpf32 + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf64 HAVE_ATOMIC_FETCH_ADDSUB_fpf64 + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf128 HAVE_ATOMIC_FETCH_ADDSUB_fpf128 + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf32x HAVE_ATOMIC_FETCH_ADDSUB_fpf32x + +#define MAYBE_HAVE_ATOMIC_FETCH_ADDSUB_fpf64x HAVE_ATOMIC_FETCH_ADDSUB_fpf64x + #ifndef WORDS_BIGENDIAN #define WORDS_BIGENDIAN 0 #endif -#define FAST_ATOMIC_LDST_2 HAVE_ATOMIC_LDST_2 - #define MAYBE_HAVE_ATOMIC_LDST_4 HAVE_ATOMIC_LDST_4 #define FAST_ATOMIC_LDST_4 HAVE_ATOMIC_LDST_4 diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h index 93f367d5878..639ed3efeaa 100644 --- a/libatomic/config/linux/aarch64/host-config.h +++ b/libatomic/config/linux/aarch64/host-config.h @@ -77,6 +77,8 @@ typedef struct __ifunc_arg_t { # define IFUNC_NCOND(N) 0 # define IFUNC_ALT 1 # endif +# elif defined(FLOATING) +# define IFUNC_NCOND(N) 0 # else # define IFUNC_COND_1 (hwcap & HWCAP_ATOMICS) # define IFUNC_NCOND(N) 1 diff --git a/libatomic/configure b/libatomic/configure index d579bab96f8..f68c18295f7 100755 --- a/libatomic/configure +++ b/libatomic/configure @@ -644,6 +644,7 @@ ARCH_AARCH64_LINUX_TRUE HAVE_IFUNC_FALSE HAVE_IFUNC_TRUE tmake_file +FPSUFFIXES SIZES XLDFLAGS XCFLAGS @@ -11456,7 +11457,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11459 "configure" +#line 11460 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -11562,7 +11563,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11565 "configure" +#line 11566 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -12665,6 +12666,317 @@ _ACEOF + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type float" >&5 +$as_echo_n "checking for type float... " >&6; } +if ${libat_cv_have_type_fpf+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +float x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf=yes +else + libat_cv_have_type_fpf=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf" >&5 +$as_echo "$libat_cv_have_type_fpf" >&6; } + + yesno=`echo $libat_cv_have_type_fpf | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type double" >&5 +$as_echo_n "checking for type double... " >&6; } +if ${libat_cv_have_type_fp+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. 
*/ +double x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fp=yes +else + libat_cv_have_type_fp=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fp" >&5 +$as_echo "$libat_cv_have_type_fp" >&6; } + + yesno=`echo $libat_cv_have_type_fp | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fp $yesno +_ACEOF + + + if test x$libat_cv_have_type_fp = xyes; then + FPSUFFIXES="$FPSUFFIXES fp" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type long double" >&5 +$as_echo_n "checking for type long double... " >&6; } +if ${libat_cv_have_type_fpl+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +long double x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpl=yes +else + libat_cv_have_type_fpl=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpl" >&5 +$as_echo "$libat_cv_have_type_fpl" >&6; } + + yesno=`echo $libat_cv_have_type_fpl | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpl $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpl = xyes; then + FPSUFFIXES="$FPSUFFIXES fpl" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type bfloat16" >&5 +$as_echo_n "checking for type bfloat16... " >&6; } +if ${libat_cv_have_type_fpf16b+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +bfloat16 x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf16b=yes +else + libat_cv_have_type_fpf16b=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf16b" >&5 +$as_echo "$libat_cv_have_type_fpf16b" >&6; } + + yesno=`echo $libat_cv_have_type_fpf16b | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf16b $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf16b = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf16b" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float16" >&5 +$as_echo_n "checking for type _Float16... " >&6; } +if ${libat_cv_have_type_fpf16+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +_Float16 x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf16=yes +else + libat_cv_have_type_fpf16=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf16" >&5 +$as_echo "$libat_cv_have_type_fpf16" >&6; } + + yesno=`echo $libat_cv_have_type_fpf16 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf16 $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf16 = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf16" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float32" >&5 +$as_echo_n "checking for type _Float32... " >&6; } +if ${libat_cv_have_type_fpf32+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. 
*/ +_Float32 x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf32=yes +else + libat_cv_have_type_fpf32=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf32" >&5 +$as_echo "$libat_cv_have_type_fpf32" >&6; } + + yesno=`echo $libat_cv_have_type_fpf32 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf32 $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf32 = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf32" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float64" >&5 +$as_echo_n "checking for type _Float64... " >&6; } +if ${libat_cv_have_type_fpf64+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +_Float64 x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf64=yes +else + libat_cv_have_type_fpf64=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf64" >&5 +$as_echo "$libat_cv_have_type_fpf64" >&6; } + + yesno=`echo $libat_cv_have_type_fpf64 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf64 $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf64 = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf64" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float128" >&5 +$as_echo_n "checking for type _Float128... " >&6; } +if ${libat_cv_have_type_fpf128+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +_Float128 x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf128=yes +else + libat_cv_have_type_fpf128=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf128" >&5 +$as_echo "$libat_cv_have_type_fpf128" >&6; } + + yesno=`echo $libat_cv_have_type_fpf128 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf128 $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf128 = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf128" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float32x" >&5 +$as_echo_n "checking for type _Float32x... " >&6; } +if ${libat_cv_have_type_fpf32x+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +_Float32x x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf32x=yes +else + libat_cv_have_type_fpf32x=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf32x" >&5 +$as_echo "$libat_cv_have_type_fpf32x" >&6; } + + yesno=`echo $libat_cv_have_type_fpf32x | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf32x $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf32x = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf32x" + fi + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for type _Float64x" >&5 +$as_echo_n "checking for type _Float64x... " >&6; } +if ${libat_cv_have_type_fpf64x+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. 
*/ +_Float64x x; +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + libat_cv_have_type_fpf64x=yes +else + libat_cv_have_type_fpf64x=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_type_fpf64x" >&5 +$as_echo "$libat_cv_have_type_fpf64x" >&6; } + + yesno=`echo $libat_cv_have_type_fpf64x | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_fpf64x $yesno +_ACEOF + + + if test x$libat_cv_have_type_fpf64x = xyes; then + FPSUFFIXES="$FPSUFFIXES fpf64x" + fi + + + # Check for compiler builtins of atomic operations. # Do link tests if possible, instead asm tests, limited to some platforms @@ -14697,81 +15009,793 @@ _ACEOF - { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is bigendian" >&5 -$as_echo_n "checking whether byte ordering is bigendian... " >&6; } -if ${ac_cv_c_bigendian+:} false; then : + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type float" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type float... " >&6; } +if ${libat_cv_have_at_faddsub_fpf+:} false; then : $as_echo_n "(cached) " >&6 else - ac_cv_c_bigendian=unknown - # See if we're dealing with a universal compiler. - cat confdefs.h - <<_ACEOF >conftest.$ac_ext -/* end confdefs.h. */ -#ifndef __APPLE_CC__ - not a universal capable compiler - #endif - typedef int dummy; - -_ACEOF -if ac_fn_c_try_compile "$LINENO"; then : - # Check for potential -arch flags. It is not universal unless - # there are at least two -arch flags with different values. - ac_arch= - ac_prev= - for ac_word in $CC $CFLAGS $CPPFLAGS $LDFLAGS; do - if test -n "$ac_prev"; then - case $ac_word in - i?86 | x86_64 | ppc | ppc64) - if test -z "$ac_arch" || test "$ac_arch" = "$ac_word"; then - ac_arch=$ac_word - else - ac_cv_c_bigendian=universal - break - fi - ;; - esac - ac_prev= - elif test "x$ac_word" = "x-arch"; then - ac_prev=arch - fi - done -fi -rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext - if test $ac_cv_c_bigendian = unknown; then - # See if sys/param.h defines the BYTE_ORDER macro. - cat confdefs.h - <<_ACEOF >conftest.$ac_ext + cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ -#include - #include int main () { -#if ! (defined BYTE_ORDER && defined BIG_ENDIAN \ - && defined LITTLE_ENDIAN && BYTE_ORDER && BIG_ENDIAN \ - && LITTLE_ENDIAN) - bogus endian macros - #endif + + float *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (float)(1.0), 0); + __atomic_add_fetch (x, (float)(1.0), 0); + __atomic_fetch_sub (x, (float)(1.0), 0); + __atomic_sub_fetch (x, (float)(1.0), 0); ; return 0; } _ACEOF -if ac_fn_c_try_compile "$LINENO"; then : - # It does; now see whether it defined to BIG_ENDIAN or not. - cat confdefs.h - <<_ACEOF >conftest.$ac_ext -/* end confdefs.h. */ -#include - #include - -int -main () -{ -#if BYTE_ORDER != BIG_ENDIAN - not big endian - #endif - - ; + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf=yes + else + eval libat_cv_have_at_faddsub_fpf=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? 
+ $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf=no + else + eval libat_cv_have_at_faddsub_fpf=yes + fi + else + eval libat_cv_have_at_faddsub_fpf=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type double" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type double... " >&6; } +if ${libat_cv_have_at_faddsub_fp+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + double *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (double)(1.0), 0); + __atomic_add_fetch (x, (double)(1.0), 0); + __atomic_fetch_sub (x, (double)(1.0), 0); + __atomic_sub_fetch (x, (double)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fp=yes + else + eval libat_cv_have_at_faddsub_fp=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fp=no + else + eval libat_cv_have_at_faddsub_fp=yes + fi + else + eval libat_cv_have_at_faddsub_fp=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fp" >&5 +$as_echo "$libat_cv_have_at_faddsub_fp" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fp | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fp $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type long double" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type long double... " >&6; } +if ${libat_cv_have_at_faddsub_fpl+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + long double *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (long double)(1.0), 0); + __atomic_add_fetch (x, (long double)(1.0), 0); + __atomic_fetch_sub (x, (long double)(1.0), 0); + __atomic_sub_fetch (x, (long double)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpl=yes + else + eval libat_cv_have_at_faddsub_fpl=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. 
+ CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpl=no + else + eval libat_cv_have_at_faddsub_fpl=yes + fi + else + eval libat_cv_have_at_faddsub_fpl=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpl" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpl" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpl | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpl $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type bfloat16" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type bfloat16... " >&6; } +if ${libat_cv_have_at_faddsub_fpf16b+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + bfloat16 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (bfloat16)(1.0), 0); + __atomic_add_fetch (x, (bfloat16)(1.0), 0); + __atomic_fetch_sub (x, (bfloat16)(1.0), 0); + __atomic_sub_fetch (x, (bfloat16)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf16b=yes + else + eval libat_cv_have_at_faddsub_fpf16b=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf16b=no + else + eval libat_cv_have_at_faddsub_fpf16b=yes + fi + else + eval libat_cv_have_at_faddsub_fpf16b=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf16b" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf16b" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf16b | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf16b $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float16" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float16... " >&6; } +if ${libat_cv_have_at_faddsub_fpf16+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + _Float16 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float16)(1.0), 0); + __atomic_add_fetch (x, (_Float16)(1.0), 0); + __atomic_fetch_sub (x, (_Float16)(1.0), 0); + __atomic_sub_fetch (x, (_Float16)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf16=yes + else + eval libat_cv_have_at_faddsub_fpf16=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf16=no + else + eval libat_cv_have_at_faddsub_fpf16=yes + fi + else + eval libat_cv_have_at_faddsub_fpf16=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf16" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf16" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf16 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf16 $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float32" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float32... " >&6; } +if ${libat_cv_have_at_faddsub_fpf32+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + _Float32 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float32)(1.0), 0); + __atomic_add_fetch (x, (_Float32)(1.0), 0); + __atomic_fetch_sub (x, (_Float32)(1.0), 0); + __atomic_sub_fetch (x, (_Float32)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf32=yes + else + eval libat_cv_have_at_faddsub_fpf32=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf32=no + else + eval libat_cv_have_at_faddsub_fpf32=yes + fi + else + eval libat_cv_have_at_faddsub_fpf32=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf32" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf32" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf32 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf32 $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float64" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float64... " >&6; } +if ${libat_cv_have_at_faddsub_fpf64+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. 
*/ + +int +main () +{ + + _Float64 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float64)(1.0), 0); + __atomic_add_fetch (x, (_Float64)(1.0), 0); + __atomic_fetch_sub (x, (_Float64)(1.0), 0); + __atomic_sub_fetch (x, (_Float64)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf64=yes + else + eval libat_cv_have_at_faddsub_fpf64=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf64=no + else + eval libat_cv_have_at_faddsub_fpf64=yes + fi + else + eval libat_cv_have_at_faddsub_fpf64=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf64" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf64" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf64 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf64 $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float128" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float128... " >&6; } +if ${libat_cv_have_at_faddsub_fpf128+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + _Float128 *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float128)(1.0), 0); + __atomic_add_fetch (x, (_Float128)(1.0), 0); + __atomic_fetch_sub (x, (_Float128)(1.0), 0); + __atomic_sub_fetch (x, (_Float128)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf128=yes + else + eval libat_cv_have_at_faddsub_fpf128=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf128=no + else + eval libat_cv_have_at_faddsub_fpf128=yes + fi + else + eval libat_cv_have_at_faddsub_fpf128=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf128" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf128" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf128 | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf128 $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float32x" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float32x... 
" >&6; } +if ${libat_cv_have_at_faddsub_fpf32x+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + _Float32x *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float32x)(1.0), 0); + __atomic_add_fetch (x, (_Float32x)(1.0), 0); + __atomic_fetch_sub (x, (_Float32x)(1.0), 0); + __atomic_sub_fetch (x, (_Float32x)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf32x=yes + else + eval libat_cv_have_at_faddsub_fpf32x=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf32x=no + else + eval libat_cv_have_at_faddsub_fpf32x=yes + fi + else + eval libat_cv_have_at_faddsub_fpf32x=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf32x" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf32x" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf32x | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf32x $yesno +_ACEOF + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __atomic_fetch_{add,sub} for floating type _Float64x" >&5 +$as_echo_n "checking for __atomic_fetch_{add,sub} for floating type _Float64x... " >&6; } +if ${libat_cv_have_at_faddsub_fpf64x+:} false; then : + $as_echo_n "(cached) " >&6 +else + + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ + + _Float64x *x; + asm("" : "=g"(x)); + __atomic_fetch_add (x, (_Float64x)(1.0), 0); + __atomic_add_fetch (x, (_Float64x)(1.0), 0); + __atomic_fetch_sub (x, (_Float64x)(1.0), 0); + __atomic_sub_fetch (x, (_Float64x)(1.0), 0); + + ; + return 0; +} +_ACEOF + if test x$atomic_builtins_link_tests = xyes; then + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5 + (eval $ac_link) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + eval libat_cv_have_at_faddsub_fpf64x=yes + else + eval libat_cv_have_at_faddsub_fpf64x=no + fi + else + old_CFLAGS="$CFLAGS" + # Compile unoptimized. + CFLAGS="$CFLAGS -O0 -S" + if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5 + (eval $ac_compile) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 + test $ac_status = 0; }; then + if grep __atomic_ conftest.s >/dev/null 2>&1 ; then + eval libat_cv_have_at_faddsub_fpf64x=no + else + eval libat_cv_have_at_faddsub_fpf64x=yes + fi + else + eval libat_cv_have_at_faddsub_fpf64x=no + fi + CFLAGS="$old_CFLAGS" + fi + rm -f conftest* + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_at_faddsub_fpf64x" >&5 +$as_echo "$libat_cv_have_at_faddsub_fpf64x" >&6; } + + + yesno=`echo $libat_cv_have_at_faddsub_fpf64x | tr 'yesno' '1 0 '` + +cat >>confdefs.h <<_ACEOF +#define HAVE_ATOMIC_FETCH_ADDSUB_fpf64x $yesno +_ACEOF + + + + + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is bigendian" >&5 +$as_echo_n "checking whether byte ordering is bigendian... " >&6; } +if ${ac_cv_c_bigendian+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_cv_c_bigendian=unknown + # See if we're dealing with a universal compiler. + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#ifndef __APPLE_CC__ + not a universal capable compiler + #endif + typedef int dummy; + +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + + # Check for potential -arch flags. It is not universal unless + # there are at least two -arch flags with different values. + ac_arch= + ac_prev= + for ac_word in $CC $CFLAGS $CPPFLAGS $LDFLAGS; do + if test -n "$ac_prev"; then + case $ac_word in + i?86 | x86_64 | ppc | ppc64) + if test -z "$ac_arch" || test "$ac_arch" = "$ac_word"; then + ac_arch=$ac_word + else + ac_cv_c_bigendian=universal + break + fi + ;; + esac + ac_prev= + elif test "x$ac_word" = "x-arch"; then + ac_prev=arch + fi + done +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext + if test $ac_cv_c_bigendian = unknown; then + # See if sys/param.h defines the BYTE_ORDER macro. + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + #include + +int +main () +{ +#if ! (defined BYTE_ORDER && defined BIG_ENDIAN \ + && defined LITTLE_ENDIAN && BYTE_ORDER && BIG_ENDIAN \ + && LITTLE_ENDIAN) + bogus endian macros + #endif + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + # It does; now see whether it defined to BIG_ENDIAN or not. + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + #include + +int +main () +{ +#if BYTE_ORDER != BIG_ENDIAN + not big endian + #endif + + ; return 0; } _ACEOF @@ -15761,6 +16785,7 @@ XCFLAGS="$XCFLAGS $XPCFLAGS" + # Conditionalize the makefile for this target machine. tmake_file_= for f in ${tmake_file} diff --git a/libatomic/configure.ac b/libatomic/configure.ac index 32a2cdb13ae..597f88b073e 100644 --- a/libatomic/configure.ac +++ b/libatomic/configure.ac @@ -196,6 +196,7 @@ AC_CHECK_HEADERS([fenv.h]) # Check for common type sizes LIBAT_FORALL_MODES([LIBAT_HAVE_INT_MODE]) +LIBAT_FOR_FLOATING_TYPES([LIBAT_HAVE_FLOATING_TYPE]) # Check for compiler builtins of atomic operations. LIBAT_TEST_ATOMIC_INIT @@ -205,6 +206,8 @@ LIBAT_FORALL_MODES([LIBAT_HAVE_ATOMIC_EXCHANGE]) LIBAT_FORALL_MODES([LIBAT_HAVE_ATOMIC_CAS]) LIBAT_FORALL_MODES([LIBAT_HAVE_ATOMIC_FETCH_ADD]) LIBAT_FORALL_MODES([LIBAT_HAVE_ATOMIC_FETCH_OP]) +LIBAT_FOR_FLOATING_TYPES([LIBAT_HAVE_ATOMIC_FETCH_ADDSUB_FP]) + AC_C_BIGENDIAN # I don't like the default behaviour of WORDS_BIGENDIAN undefined for LE. @@ -273,6 +276,7 @@ AC_SUBST(XCFLAGS) AC_SUBST(XLDFLAGS) AC_SUBST(LIBS) AC_SUBST(SIZES) +AC_SUBST(FPSUFFIXES) # Conditionalize the makefile for this target machine. 
tmake_file_= diff --git a/libatomic/fadd_n.c b/libatomic/fadd_n.c index 32b75cec654..bae6b95d728 100644 --- a/libatomic/fadd_n.c +++ b/libatomic/fadd_n.c @@ -28,6 +28,28 @@ #define NAME add #define OP(X,Y) ((X) + (Y)) +/* + When compiling this file for the floating point operations, some of the + names become a bit of a misnomer. + - SIZE now creates a token suffixed by a "floating point suffix" rather + than by a size. + - UTYPE is now something like U_fp (for double), typedef'd to the + corresponding floating point type. + - N is no longer a number, but a floating point suffix. +*/ +#if FLOATING +# define HAVE_ATOMIC_FETCH_OP_fpf HAVE_ATOMIC_FETCH_ADDSUB_fpf +# define HAVE_ATOMIC_FETCH_OP_fp HAVE_ATOMIC_FETCH_ADDSUB_fp +# define HAVE_ATOMIC_FETCH_OP_fpl HAVE_ATOMIC_FETCH_ADDSUB_fpl +# define HAVE_ATOMIC_FETCH_OP_fpf16b HAVE_ATOMIC_FETCH_ADDSUB_fpf16b +# define HAVE_ATOMIC_FETCH_OP_fpf16 HAVE_ATOMIC_FETCH_ADDSUB_fpf16 +# define HAVE_ATOMIC_FETCH_OP_fpf32 HAVE_ATOMIC_FETCH_ADDSUB_fpf32 +# define HAVE_ATOMIC_FETCH_OP_fpf64 HAVE_ATOMIC_FETCH_ADDSUB_fpf64 +# define HAVE_ATOMIC_FETCH_OP_fpf128 HAVE_ATOMIC_FETCH_ADDSUB_fpf128 +# define HAVE_ATOMIC_FETCH_OP_fpf32x HAVE_ATOMIC_FETCH_ADDSUB_fpf32x +# define HAVE_ATOMIC_FETCH_OP_fpf64x HAVE_ATOMIC_FETCH_ADDSUB_fpf64x +#endif + /* Defer to HAVE_ATOMIC_FETCH_ADD, which some targets implement specially, even if HAVE_ATOMIC_FETCH_OP is not defined. */ #if !SIZE(HAVE_ATOMIC_FETCH_OP) @@ -44,4 +66,5 @@ #endif #include "fop_n.c" + #undef LAT_FADD_N
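(For readers unfamiliar with the libatomic macro machinery described in the comment above, here is a minimal, self-contained sketch of the token pasting involved. The real SIZE/C2 definitions in libatomic_i.h differ in detail, so the macro bodies and the libat_fetch_add_ name below are illustrative assumptions rather than the library's exact internals.)

/* Illustrative only: how the "size" macro yields a float-suffixed symbol
   once N is a floating point suffix such as fp (double).  */
#define C2_(A, B) A##B
#define C2(A, B) C2_(A, B)          /* expand arguments, then paste */
#define N fp                        /* the "size" is now a suffix */
#define SIZE(X) C2(X, N)

typedef double U_fp;                /* UTYPE resolves to the float type */

/* SIZE(libat_fetch_add_) expands to libat_fetch_add_fp, matching the
   __atomic_fetch_add_fp entry point exported from libatomic.map.  */
extern U_fp SIZE(libat_fetch_add_) (U_fp *, U_fp, int);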
diff --git a/libatomic/fop_n.c b/libatomic/fop_n.c index fefff3a57a4..5cba1dcb3c9 100644 --- a/libatomic/fop_n.c +++ b/libatomic/fop_n.c @@ -104,8 +104,9 @@ SIZE(C3(libat_,NAME,_fetch)) (UTYPE *mptr, UTYPE opval, int smodel) /* If this type is no larger than word-sized, fall back to a word-sized - compare-and-swap loop. */ -#if !DONE && N < WORDSIZE && defined(atomic_compare_exchange_w) + compare-and-swap loop. Avoid doing this with floating point types since the + preprocessor does not know the size of those types. */ +#if !DONE && !defined(FLOATING) && N < WORDSIZE && defined(atomic_compare_exchange_w) UTYPE SIZE(C2(libat_fetch_,NAME)) (UTYPE *mptr, UTYPE opval, int smodel) { diff --git a/libatomic/fsub_n.c b/libatomic/fsub_n.c index 49b375a543f..bc99f22797e 100644 --- a/libatomic/fsub_n.c +++ b/libatomic/fsub_n.c @@ -1,5 +1,28 @@ #define LAT_FSUB_N #define NAME sub #define OP(X,Y) ((X) - (Y)) + +/* + When compiling this file for the floating point operations, some of the + names become a bit of a misnomer. + - SIZE now creates a token suffixed by a "floating point suffix" rather + than by a size. + - UTYPE is now something like U_fp (for double), typedef'd to the + corresponding floating point type. + - N is no longer a number, but a floating point suffix. +*/ +#if FLOATING +# define HAVE_ATOMIC_FETCH_OP_fpf HAVE_ATOMIC_FETCH_ADDSUB_fpf +# define HAVE_ATOMIC_FETCH_OP_fp HAVE_ATOMIC_FETCH_ADDSUB_fp +# define HAVE_ATOMIC_FETCH_OP_fpl HAVE_ATOMIC_FETCH_ADDSUB_fpl +# define HAVE_ATOMIC_FETCH_OP_fpf16b HAVE_ATOMIC_FETCH_ADDSUB_fpf16b +# define HAVE_ATOMIC_FETCH_OP_fpf16 HAVE_ATOMIC_FETCH_ADDSUB_fpf16 +# define HAVE_ATOMIC_FETCH_OP_fpf32 HAVE_ATOMIC_FETCH_ADDSUB_fpf32 +# define HAVE_ATOMIC_FETCH_OP_fpf64 HAVE_ATOMIC_FETCH_ADDSUB_fpf64 +# define HAVE_ATOMIC_FETCH_OP_fpf128 HAVE_ATOMIC_FETCH_ADDSUB_fpf128 +# define HAVE_ATOMIC_FETCH_OP_fpf32x HAVE_ATOMIC_FETCH_ADDSUB_fpf32x +# define HAVE_ATOMIC_FETCH_OP_fpf64x HAVE_ATOMIC_FETCH_ADDSUB_fpf64x +#endif + #include "fop_n.c" #undef LAT_FSUB_N diff --git a/libatomic/libatomic.map b/libatomic/libatomic.map index 39e7c2c6b9a..b08fb8d6bc8 100644 --- a/libatomic/libatomic.map +++ b/libatomic/libatomic.map @@ -108,3 +108,47 @@ LIBATOMIC_1.2 { atomic_flag_clear; atomic_flag_clear_explicit; } LIBATOMIC_1.1; + +LIBATOMIC_1.3 { + global: + __atomic_add_fetch_fpf; + __atomic_add_fetch_fp; + __atomic_add_fetch_fpl; + __atomic_add_fetch_fpf16b; + __atomic_add_fetch_fpf16; + __atomic_add_fetch_fpf32; + __atomic_add_fetch_fpf64; + __atomic_add_fetch_fpf128; + __atomic_add_fetch_fpf32x; + __atomic_add_fetch_fpf64x; + __atomic_fetch_add_fpf; + __atomic_fetch_add_fp; + __atomic_fetch_add_fpl; + __atomic_fetch_add_fpf16b; + __atomic_fetch_add_fpf16; + __atomic_fetch_add_fpf32; + __atomic_fetch_add_fpf64; + __atomic_fetch_add_fpf128; + __atomic_fetch_add_fpf32x; + __atomic_fetch_add_fpf64x; + __atomic_fetch_sub_fpf; + __atomic_fetch_sub_fp; + __atomic_fetch_sub_fpl; + __atomic_fetch_sub_fpf16b; + __atomic_fetch_sub_fpf16; + __atomic_fetch_sub_fpf32; + __atomic_fetch_sub_fpf64; + __atomic_fetch_sub_fpf128; + __atomic_fetch_sub_fpf32x; + __atomic_fetch_sub_fpf64x; + __atomic_sub_fetch_fpf; + __atomic_sub_fetch_fp; + __atomic_sub_fetch_fpl; + __atomic_sub_fetch_fpf16b; + __atomic_sub_fetch_fpf16; + __atomic_sub_fetch_fpf32; + __atomic_sub_fetch_fpf64; + __atomic_sub_fetch_fpf128; + __atomic_sub_fetch_fpf32x; + __atomic_sub_fetch_fpf64x; +} LIBATOMIC_1.2; diff --git a/libatomic/libatomic_i.h b/libatomic/libatomic_i.h index 861a22da152..8dc13fd459d 100644 --- a/libatomic/libatomic_i.h +++ b/libatomic/libatomic_i.h @@ -63,6 +63,31 @@ typedef unsigned U_8 __attribute__((mode(DI))); typedef unsigned U_16 __attribute__((mode(TI))); #endif +typedef float U_fpf; +typedef double U_fp; +typedef long double U_fpl; +#if HAVE_fpf16b +typedef bfloat16 U_fpf16b; +#endif +#if HAVE_fpf16 +typedef _Float16 U_fpf16; +#endif +#if HAVE_fpf32 +typedef _Float32 U_fpf32; +#endif +#if HAVE_fpf64 +typedef _Float64 U_fpf64; +#endif +#if HAVE_fpf128 +typedef _Float128 U_fpf128; +#endif +#if HAVE_fpf32x +typedef _Float32x U_fpf32x; +#endif +#if HAVE_fpf64x +typedef _Float64x U_fpf64x; +#endif +
/* The widest type that we support. */ #if HAVE_INT16 # define MAX_SIZE 16 @@ -92,6 +117,28 @@ typedef U_MAX U_8; #if !HAVE_INT16 typedef U_MAX U_16; #endif +#if !HAVE_fpf16b +typedef U_MAX U_fpf16b; +#endif +#if !HAVE_fpf16 +typedef U_MAX U_fpf16; +#endif +#if !HAVE_fpf32 +typedef U_MAX U_fpf32; +#endif +#if !HAVE_fpf64 +typedef U_MAX U_fpf64; +#endif +#if !HAVE_fpf128 +typedef U_MAX U_fpf128; +#endif +#if !HAVE_fpf32x +typedef U_MAX U_fpf32x; +#endif +#if !HAVE_fpf64x +typedef U_MAX U_fpf64x; +#endif + union max_size_u { @@ -215,6 +262,17 @@ DECLARE_ALL_SIZED(4); DECLARE_ALL_SIZED(8); DECLARE_ALL_SIZED(16); +DECLARE_ALL_SIZED(fpf); +DECLARE_ALL_SIZED(fp); +DECLARE_ALL_SIZED(fpl); +DECLARE_ALL_SIZED(fpf16b); +DECLARE_ALL_SIZED(fpf16); +DECLARE_ALL_SIZED(fpf32); +DECLARE_ALL_SIZED(fpf64); +DECLARE_ALL_SIZED(fpf128); +DECLARE_ALL_SIZED(fpf32x); +DECLARE_ALL_SIZED(fpf64x); + #undef DECLARE_1 #undef DECLARE_ALL_SIZED #undef DECLARE_ALL_SIZED_ diff --git a/libatomic/testsuite/Makefile.in b/libatomic/testsuite/Makefile.in index 247268f1949..e617e8f67ac 100644 --- a/libatomic/testsuite/Makefile.in +++ b/libatomic/testsuite/Makefile.in @@ -159,6 +159,7 @@ ECHO_T = @ECHO_T@ EGREP = @EGREP@ EXEEXT = @EXEEXT@ FGREP = @FGREP@ +FPSUFFIXES = @FPSUFFIXES@ GREP = @GREP@ INSTALL = @INSTALL@ INSTALL_DATA = @INSTALL_DATA@
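(An aside on the U_MAX fallback typedefs in the libatomic_i.h hunk above: aliasing an unsupported floating point suffix to the widest integer type keeps the suffix-generic declarations compiling on targets that lack the type. A minimal sketch, assuming a hypothetical target without _Float128 — the names here are illustrative, not libatomic's exact internals:)

/* Illustrative: with no _Float128, U_fpf128 falls back to the widest
   type so the blanket declarations still parse; the fpf128 entry points
   are simply never called on such a target.  */
typedef unsigned long long U_MAX;  /* stand-in for the real widest type */

#define HAVE_fpf128 0              /* pretend configure found no _Float128 */

#if HAVE_fpf128
typedef _Float128 U_fpf128;
#else
typedef U_MAX U_fpf128;
#endif

extern U_fpf128 libat_fetch_add_fpf128 (U_fpf128 *, U_fpf128, int);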
From patchwork Thu Sep 19 13:12:01 2024
X-Patchwork-Submitter: Matthew Malcomson
X-Patchwork-Id: 97702
Subject: [PATCH 5/8] [RFC] Use new builtins in libstdc++
Date: Thu, 19 Sep 2024 14:12:01 +0100
Message-ID: <20240919131204.3865854-6-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
CC: Jonathan Wakely, Joseph Myers, Richard Biener, Matthew Malcomson
MIME-Version: 1.0
From: Matthew Malcomson

The points to question here are:

1) Whether checking for this particular internal builtin is OK. (This one happens to be the one implementing the operation for a `double`; we would have to rely on the assumption that anyone who implements this operation for a `double` implements it for all the floating point types that their C++ frontend and libstdc++ handle.)

2) Whether the `#if` bit should be somewhere else rather than in the `__fetch_add_flt` function. I put it there because that's where it seemed natural, but I am not familiar enough with libstdc++ to be confident in that decision.

We still need the CAS loop fallback for any compiler that doesn't implement this builtin, and hence still need some extra choice to be made for floating point types. Once all compilers we care about implement this, we can remove this special handling and merge the floating point and integral operations into the same template.

Signed-off-by: Matthew Malcomson --- libstdc++-v3/include/bits/atomic_base.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index 1c2367b39b6..d3b1a022db2 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -1217,30 +1217,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Tp __fetch_add_flt(_Tp* __ptr, _Val<_Tp> __i, memory_order __m) noexcept { +#if __has_builtin(__atomic_fetch_add_fp) + return __atomic_fetch_add(__ptr, __i, int(__m)); +#else _Val<_Tp> __oldval = load(__ptr, memory_order_relaxed); _Val<_Tp> __newval = __oldval + __i; while (!compare_exchange_weak(__ptr, __oldval, __newval, __m, memory_order_relaxed)) __newval = __oldval + __i; return __oldval; +#endif } template<typename _Tp> _Tp __fetch_sub_flt(_Tp* __ptr, _Val<_Tp> __i, memory_order __m) noexcept { +#if __has_builtin(__atomic_fetch_sub_fp) + return __atomic_fetch_sub(__ptr, __i, int(__m)); +#else _Val<_Tp> __oldval = load(__ptr, memory_order_relaxed); _Val<_Tp> __newval = __oldval - __i; while (!compare_exchange_weak(__ptr, __oldval, __newval, __m, memory_order_relaxed)) __newval = __oldval - __i; return __oldval; +#endif } template<typename _Tp> _Tp __add_fetch_flt(_Tp* __ptr, _Val<_Tp> __i) noexcept { +#if __has_builtin(__atomic_add_fetch_fp) + return __atomic_add_fetch(__ptr, __i, __ATOMIC_SEQ_CST); +#else _Val<_Tp> __oldval = load(__ptr, memory_order_relaxed); _Val<_Tp> __newval = __oldval + __i; while (!compare_exchange_weak(__ptr, __oldval, __newval, @@ -1248,12 +1259,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION memory_order_relaxed)) __newval = __oldval + __i; return __newval; +#endif } template<typename _Tp> _Tp __sub_fetch_flt(_Tp* __ptr, _Val<_Tp> __i) noexcept { +#if __has_builtin(__atomic_sub_fetch_fp) + return __atomic_sub_fetch(__ptr, __i, __ATOMIC_SEQ_CST); +#else _Val<_Tp> __oldval = load(__ptr, memory_order_relaxed); _Val<_Tp> __newval = __oldval - __i; while (!compare_exchange_weak(__ptr, __oldval, __newval, @@ -1261,6 +1276,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION memory_order_relaxed)) __newval = __oldval - __i; return __newval; +#endif } } // namespace __atomic_impl
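(For context, a standalone C sketch of the compare-exchange fallback that the #else branches above implement, written against GCC's generic __atomic builtins rather than libstdc++'s internal wrappers; the function name is illustrative:)

/* Emulate atomic fetch-add on a double with a CAS loop, as the fallback
   must when the compiler lacks the floating point fetch_add builtin.  */
static double
fetch_add_double (double *ptr, double val)
{
  double oldval, newval;
  __atomic_load (ptr, &oldval, __ATOMIC_RELAXED);
  do
    newval = oldval + val;
  /* On failure, the current contents of *ptr are copied back into
     oldval, so the next iteration recomputes the sum from fresh data.  */
  while (!__atomic_compare_exchange (ptr, &oldval, &newval,
                                     /* weak */ 1, __ATOMIC_SEQ_CST,
                                     __ATOMIC_RELAXED));
  return oldval;
}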
From patchwork Thu Sep 19 13:12:02 2024
X-Patchwork-Submitter: Matthew Malcomson
X-Patchwork-Id: 97705
Subject: [PATCH 6/8] [RFC] First attempt at testsuite
Date: Thu, 19 Sep 2024 14:12:02 +0100
Message-ID: <20240919131204.3865854-7-mmalcomson@nvidia.com>
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
CC: Jonathan Wakely, Joseph Myers, Richard Biener, Matthew Malcomson
MIME-Version: 1.0
From: Matthew Malcomson

As it stands this doesn't fully pass. However it passes *enough* to convince me that it's of the "shape" that I would suggest (and hence good enough for an RFC).
Signed-off-by: Matthew Malcomson --- gcc/testsuite/gcc.dg/atomic-op-fp.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf128.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf16.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf16b.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf32.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf32x.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf64.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpf64x.c | 204 ++++++++ gcc/testsuite/gcc.dg/atomic-op-fpl.c | 204 ++++++++ gcc/testsuite/lib/target-supports.exp | 463 +++++++++++++++++- .../testsuite/libatomic.c/atomic-op-fp.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf128.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf16.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf16b.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf32.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf32x.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf64.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpf64x.c | 203 ++++++++ .../testsuite/libatomic.c/atomic-op-fpl.c | 203 ++++++++ 21 files changed, 4532 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fp.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf128.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf16.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf16b.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf32.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf32x.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf64.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpf64x.c create mode 100644 gcc/testsuite/gcc.dg/atomic-op-fpl.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fp.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf128.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf16.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf16b.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf32.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf32x.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf64.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpf64x.c create mode 100644 libatomic/testsuite/libatomic.c/atomic-op-fpl.c diff --git a/gcc/testsuite/gcc.dg/atomic-op-fp.c b/gcc/testsuite/gcc.dg/atomic-op-fp.c new file mode 100644 index 00000000000..8cb2a5d8290 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fp.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on double + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync_double } */ + +/* Test the execution of the __atomic_*OP builtin routines for a double. */ + +extern void abort(void); + +double v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. 
comparing against integral values and adding/subtracting + * integral values should not introduce any floating point imprecision, + * since every value used here is a small whole number that is exactly + * representable; the testcases pass, but this has not been verified + * exhaustively. Under IEEE semantics the behaviour should in any case be + * the same across architectures. */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function.
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0, __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf.c b/gcc/testsuite/gcc.dg/atomic-op-fpf.c new file mode 100644 index 00000000000..dfeb173eeef --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on float + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync_float } */ + +/* Test the execution of the __atomic_*OP builtin routines for a float. */ + +extern void abort(void); + +float v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. comparing against integral values and adding/subtracting + * integral values should not introduce any floating point imprecision, + * since every value used here is a small whole number that is exactly + * representable; the testcases pass, but this has not been verified + * exhaustively. Under IEEE semantics the behaviour should in any case be + * the same across architectures. */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation.
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0, __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf128.c b/gcc/testsuite/gcc.dg/atomic-op-fpf128.c new file mode 100644 index 00000000000..93962953ef0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf128.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float128 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float128 } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float128. */ + +extern void abort(void); + +_Float128 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. comparing against integral values and adding/subtracting + * integral values should not introduce any floating point imprecision, + * since every value used here is a small whole number that is exactly + * representable; the testcases pass, but this has not been verified + * exhaustively. Under IEEE semantics the behaviour should in any case be + * the same across architectures.
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0, __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf16.c b/gcc/testsuite/gcc.dg/atomic-op-fpf16.c new file mode 100644 index 00000000000..3e9e2f24180 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf16.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float16 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float16 } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float16. */ + +extern void abort(void); + +_Float16 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. comparing against integral values and adding/subtracting + * integral values should not introduce any floating point imprecision, + * since every value used here is a small whole number that is exactly + * representable; the testcases pass, but this has not been verified + * exhaustively. Under IEEE semantics the behaviour should in any case be + * the same across architectures. */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation.
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0, __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf16b.c b/gcc/testsuite/gcc.dg/atomic-op-fpf16b.c new file mode 100644 index 00000000000..4800bc696cc --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf16b.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on __bf16 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync___bf16 } */ + +/* Test the execution of the __atomic_*OP builtin routines for a __bf16. */ + +extern void abort(void); + +__bf16 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. comparing against integral values and adding/subtracting + * integral values should not introduce any floating point imprecision, + * since every value used here is a small whole number that is exactly + * representable; the testcases pass, but this has not been verified + * exhaustively. Under IEEE semantics the behaviour should in any case be + * the same across architectures.
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf32.c b/gcc/testsuite/gcc.dg/atomic-op-fpf32.c new file mode 100644 index 00000000000..87775a6f255 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf32.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float32 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float32 } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float32. */ + +extern void abort(void); + +_Float32 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
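+   For example, starting from v == 4, __atomic_add_fetch (&v, 1, model)
+   leaves v == 5 and returns 5, whereas __atomic_fetch_add (&v, 1, model)
+   would also leave v == 5 but return the original 4.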
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf32x.c b/gcc/testsuite/gcc.dg/atomic-op-fpf32x.c new file mode 100644 index 00000000000..eb4e9e5bac4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf32x.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float32x + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float32x } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float32x. */ + +extern void abort(void); + +_Float32x v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf64.c b/gcc/testsuite/gcc.dg/atomic-op-fpf64.c new file mode 100644 index 00000000000..78137fddaf2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf64.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float64 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float64 } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float64. */ + +extern void abort(void); + +_Float64 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpf64x.c b/gcc/testsuite/gcc.dg/atomic-op-fpf64x.c new file mode 100644 index 00000000000..30c0438491e --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpf64x.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on _Float64x + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync__Float64x } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float64x. */ + +extern void abort(void); + +_Float64x v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
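+ * Note also that the bare integer literals used below (such as the
+ * plain 1) should simply be converted to _Float64x before the
+ * addition or subtraction is performed, just as they would be for
+ * the equivalent non-atomic compound assignment.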
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/gcc.dg/atomic-op-fpl.c b/gcc/testsuite/gcc.dg/atomic-op-fpl.c new file mode 100644 index 00000000000..a3f0edeeb90 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic-op-fpl.c @@ -0,0 +1,204 @@ +/* Test __atomic routines for existence and proper execution on long double + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ +/* { dg-require-effective-target sync_long_double } */ + +/* Test the execution of the __atomic_*OP builtin routines for a long double. */ + +extern void abort(void); + +long double v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index d368251ef9a..f2e96e87575 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -9747,7 +9747,468 @@ proc check_effective_target_sync_char_short { } { || [check_effective_target_mips_llsc] }}] } -# Return 1 if thread_fence does not rely on __sync_synchronize +# proc check_effective_target_sync_ {} { +# set template_string { +# extern int printf(const char *, ...); +# int main () { +# _Static_assert (sizeof() == 1 +# || sizeof() == 2 +# || sizeof() == 4 +# || sizeof() == 8 +# || sizeof() == 16 +# , " size not found"); +# if (sizeof() == 1 || sizeof() == 2) { +# printf("sync_char_short"); +# } else if (sizeof() == 4) { +# printf("sync_int_long"); +# } else if (sizeof() == 8) { +# printf("sync_long_long_runtime"); +# } else if (sizeof() == 16) { +# printf("sync_int_128_runtime"); +# } +# return 0; +# } +# } +# proc atomic__evalcheck {template} { +# set tempvar [check_compile size_ executable \ +# $template ""] +# # If failed to compile then the type isn't available this 
floating type +# # is certainly not atomic on this target. +# if { ![string match "" [lindex $tempvar 0]] } { +# return 0; +# } +# # Now want to run the binary and see what it prints out. +# set output [gcc_load "./[lindex $tempvar 1]"] +# remote_file build delete [lindex $tempvar 1] +# if { [lindex $output 0] != "pass" } { +# return 0; +# } +# return [check_effective_target_[lindex $output 1]] +# } +# return [check_cached_effective_target sync_ \ +# "atomic__evalcheck {$template_string}"] +# } + +proc check_effective_target_sync_float {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(float) == 1 + || sizeof(float) == 2 + || sizeof(float) == 4 + || sizeof(float) == 8 + || sizeof(float) == 16 + , "float size not found"); + if (sizeof(float) == 1 || sizeof(float) == 2) { + printf("sync_char_short"); + } else if (sizeof(float) == 4) { + printf("sync_int_long"); + } else if (sizeof(float) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(float) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic_float_evalcheck {template} { + set tempvar [check_compile size_float executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync_float \ + "atomic_float_evalcheck {$template_string}"] +} + +proc check_effective_target_sync_double {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(double) == 1 + || sizeof(double) == 2 + || sizeof(double) == 4 + || sizeof(double) == 8 + || sizeof(double) == 16 + , "double size not found"); + if (sizeof(double) == 1 || sizeof(double) == 2) { + printf("sync_char_short"); + } else if (sizeof(double) == 4) { + printf("sync_int_long"); + } else if (sizeof(double) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(double) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic_double_evalcheck {template} { + set tempvar [check_compile size_double executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. 
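+ # gcc_load returns a list whose first element is the load status
+ # ("pass" when the binary ran and exited successfully) and whose
+ # second element is the program's output -- here the name of the
+ # effective-target check to defer to.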
+ set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync_double \ + "atomic_double_evalcheck {$template_string}"] +} + +proc check_effective_target_sync_long_double {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(long double) == 1 + || sizeof(long double) == 2 + || sizeof(long double) == 4 + || sizeof(long double) == 8 + || sizeof(long double) == 16 + , "long double size not found"); + if (sizeof(long double) == 1 || sizeof(long double) == 2) { + printf("sync_char_short"); + } else if (sizeof(long double) == 4) { + printf("sync_int_long"); + } else if (sizeof(long double) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(long double) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic_long_double_evalcheck {template} { + set tempvar [check_compile size_long_double executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync_long_double \ + "atomic_long_double_evalcheck {$template_string}"] +} + +proc check_effective_target_sync___bf16 {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(__bf16) == 1 + || sizeof(__bf16) == 2 + || sizeof(__bf16) == 4 + || sizeof(__bf16) == 8 + || sizeof(__bf16) == 16 + , "__bf16 size not found"); + if (sizeof(__bf16) == 1 || sizeof(__bf16) == 2) { + printf("sync_char_short"); + } else if (sizeof(__bf16) == 4) { + printf("sync_int_long"); + } else if (sizeof(__bf16) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(__bf16) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic___bf16_evalcheck {template} { + set tempvar [check_compile size___bf16 executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. 
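+ # Note that this check has to execute a test binary, so it can only
+ # pass in configurations that are able to run target executables.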
+ set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync___bf16 \ + "atomic___bf16_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float16 {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float16) == 1 + || sizeof(_Float16) == 2 + || sizeof(_Float16) == 4 + || sizeof(_Float16) == 8 + || sizeof(_Float16) == 16 + , "_Float16 size not found"); + if (sizeof(_Float16) == 1 || sizeof(_Float16) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float16) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float16) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float16) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float16_evalcheck {template} { + set tempvar [check_compile size__Float16 executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float16 \ + "atomic__Float16_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float32 {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float32) == 1 + || sizeof(_Float32) == 2 + || sizeof(_Float32) == 4 + || sizeof(_Float32) == 8 + || sizeof(_Float32) == 16 + , "_Float32 size not found"); + if (sizeof(_Float32) == 1 || sizeof(_Float32) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float32) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float32) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float32) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float32_evalcheck {template} { + set tempvar [check_compile size__Float32 executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. 
+ set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float32 \ + "atomic__Float32_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float64 {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float64) == 1 + || sizeof(_Float64) == 2 + || sizeof(_Float64) == 4 + || sizeof(_Float64) == 8 + || sizeof(_Float64) == 16 + , "_Float64 size not found"); + if (sizeof(_Float64) == 1 || sizeof(_Float64) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float64) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float64) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float64) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float64_evalcheck {template} { + set tempvar [check_compile size__Float64 executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float64 \ + "atomic__Float64_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float128 {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float128) == 1 + || sizeof(_Float128) == 2 + || sizeof(_Float128) == 4 + || sizeof(_Float128) == 8 + || sizeof(_Float128) == 16 + , "_Float128 size not found"); + if (sizeof(_Float128) == 1 || sizeof(_Float128) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float128) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float128) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float128) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float128_evalcheck {template} { + set tempvar [check_compile size__Float128 executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. 
+ set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float128 \ + "atomic__Float128_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float32x {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float32x) == 1 + || sizeof(_Float32x) == 2 + || sizeof(_Float32x) == 4 + || sizeof(_Float32x) == 8 + || sizeof(_Float32x) == 16 + , "_Float32x size not found"); + if (sizeof(_Float32x) == 1 || sizeof(_Float32x) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float32x) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float32x) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float32x) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float32x_evalcheck {template} { + set tempvar [check_compile size__Float32x executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float32x \ + "atomic__Float32x_evalcheck {$template_string}"] +} + +proc check_effective_target_sync__Float64x {} { + set template_string { + extern int printf(const char *, ...); + int main () { + _Static_assert (sizeof(_Float64x) == 1 + || sizeof(_Float64x) == 2 + || sizeof(_Float64x) == 4 + || sizeof(_Float64x) == 8 + || sizeof(_Float64x) == 16 + , "_Float64x size not found"); + if (sizeof(_Float64x) == 1 || sizeof(_Float64x) == 2) { + printf("sync_char_short"); + } else if (sizeof(_Float64x) == 4) { + printf("sync_int_long"); + } else if (sizeof(_Float64x) == 8) { + printf("sync_long_long_runtime"); + } else if (sizeof(_Float64x) == 16) { + printf("sync_int_128_runtime"); + } + return 0; + } + } + proc atomic__Float64x_evalcheck {template} { + set tempvar [check_compile size__Float64x executable \ + $template ""] + # If failed to compile then the type isn't available this floating type + # is certainly not atomic on this target. + if { ![string match "" [lindex $tempvar 0]] } { + return 0; + } + # Now want to run the binary and see what it prints out. + set output [gcc_load "./[lindex $tempvar 1]"] + remote_file build delete [lindex $tempvar 1] + if { [lindex $output 0] != "pass" } { + return 0; + } + return [check_effective_target_[lindex $output 1]] + } + return [check_cached_effective_target sync__Float64x \ + "atomic__Float64x_evalcheck {$template_string}"] +} + # library function proc check_effective_target_thread_fence {} { diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fp.c b/libatomic/testsuite/libatomic.c/atomic-op-fp.c new file mode 100644 index 00000000000..70ebb2767d0 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fp.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on double + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a double. 
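+   Note that unlike the gcc.dg copies of these tests there is no
+   sync_* effective-target requirement here: when the target lacks
+   native support the calls are expected to resolve to libatomic's
+   own (possibly lock-based) fallback routines.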
*/ + +extern void abort(void); + +double v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf.c new file mode 100644 index 00000000000..9a9abecbd77 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on float + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a float. */ + +extern void abort(void); + +float v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf128.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf128.c new file mode 100644 index 00000000000..892c8cdd4ac --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf128.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float128 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float128. */ + +extern void abort(void); + +_Float128 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf16.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf16.c new file mode 100644 index 00000000000..c88e32b18e7 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf16.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float16 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float16. */ + +extern void abort(void); + +_Float16 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf16b.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf16b.c new file mode 100644 index 00000000000..23f026788da --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf16b.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on __bf16 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a __bf16. */ + +extern void abort(void); + +__bf16 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf32.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf32.c new file mode 100644 index 00000000000..2987a1b5151 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf32.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float32 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float32. */ + +extern void abort(void); + +_Float32 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf32x.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf32x.c new file mode 100644 index 00000000000..7d6adc49f16 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf32x.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float32x + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float32x. */ + +extern void abort(void); + +_Float32x v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf64.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf64.c new file mode 100644 index 00000000000..478a78ef15c --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf64.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float64 + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float64. */ + +extern void abort(void); + +_Float64 v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. */ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpf64x.c b/libatomic/testsuite/libatomic.c/atomic-op-fpf64x.c new file mode 100644 index 00000000000..fdfaa177197 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpf64x.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on _Float64x + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a _Float64x. */ + +extern void abort(void); + +_Float64x v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). 
*/ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. */ + +void +test_add_fetch () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3) + abort (); + + if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5) + abort (); + + if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6) + abort (); +} + + +void +test_sub_fetch () +{ + v = res = 20; + count = 0.0; + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res) + abort (); + + if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res) + abort (); +} + +/* Test the OP routines with a result which isn't used. Use both variations + within each function. 
*/ + +void +test_add () +{ + v = res = 0.0; + count = 1.0; + + __atomic_add_fetch (&v, count, __ATOMIC_RELAXED); + if (v != 1) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_CONSUME); + if (v != 2) + abort (); + + __atomic_add_fetch (&v, 1.0 , __ATOMIC_ACQUIRE); + if (v != 3) + abort (); + + __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE); + if (v != 4) + abort (); + + __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL); + if (v != 5) + abort (); + + __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST); + if (v != 6) + abort (); +} + + +void +test_sub() +{ + v = res = 20; + count = 0.0; + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE); + if (v != --res) + abort (); + + __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL); + if (v != --res) + abort (); + + __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST); + if (v != --res) + abort (); +} + +int +main () +{ + test_fetch_add (); + test_fetch_sub (); + + test_add_fetch (); + test_sub_fetch (); + + test_add (); + test_sub (); + + return 0; +} diff --git a/libatomic/testsuite/libatomic.c/atomic-op-fpl.c b/libatomic/testsuite/libatomic.c/atomic-op-fpl.c new file mode 100644 index 00000000000..cb9c35b44f5 --- /dev/null +++ b/libatomic/testsuite/libatomic.c/atomic-op-fpl.c @@ -0,0 +1,203 @@ +/* Test __atomic routines for existence and proper execution on long double + values with each valid memory model. */ +/* { dg-do run } */ +/* { dg-additional-options "-std=c23" } */ + +/* Test the execution of the __atomic_*OP builtin routines for a long double. */ + +extern void abort(void); + +long double v, count, res; + +/* The fetch_op routines return the original value before the operation. + * TODO N.b. I don't *believe* the checking against integral values and + * addition/subtraction of integral values could trigger any floating point + * confusion, and the testcases seem to pass, but haven't yet gotten confident + * that it could never happen (though I guess if we're following ieee standards + * then the behaviour on one architecture and at one time should match that on + * other architectures -- so I guess I should be fine). */ + +void +test_fetch_add () +{ + v = res = 0.0; + count = 1.0; + + if (__atomic_fetch_add (&v, count, __ATOMIC_RELAXED) != 0) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_CONSUME) != 1) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQUIRE) != 2) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_RELEASE) != 3) + abort (); + + if (__atomic_fetch_add (&v, count, __ATOMIC_ACQ_REL) != 4) + abort (); + + if (__atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST) != 5) + abort (); +} + + +void +test_fetch_sub() +{ + v = res = 20; + count = 0.0; + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_RELAXED) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_CONSUME) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQUIRE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE) != res--) + abort (); + + if (__atomic_fetch_sub (&v, count + 1, __ATOMIC_ACQ_REL) != res--) + abort (); + + if (__atomic_fetch_sub (&v, 1, __ATOMIC_SEQ_CST) != res--) + abort (); +} + +/* The OP_fetch routines return the new value after the operation. 
*/
+
+void
+test_add_fetch ()
+{
+  v = res = 0.0;
+  count = 1.0;
+
+  if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1)
+    abort ();
+
+  if (__atomic_add_fetch (&v, 1, __ATOMIC_CONSUME) != 2)
+    abort ();
+
+  if (__atomic_add_fetch (&v, count, __ATOMIC_ACQUIRE) != 3)
+    abort ();
+
+  if (__atomic_add_fetch (&v, 1, __ATOMIC_RELEASE) != 4)
+    abort ();
+
+  if (__atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL) != 5)
+    abort ();
+
+  if (__atomic_add_fetch (&v, count, __ATOMIC_SEQ_CST) != 6)
+    abort ();
+}
+
+
+void
+test_sub_fetch ()
+{
+  v = res = 20;
+  count = 0.0;
+
+  if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED) != --res)
+    abort ();
+
+  if (__atomic_sub_fetch (&v, 1, __ATOMIC_CONSUME) != --res)
+    abort ();
+
+  if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQUIRE) != --res)
+    abort ();
+
+  if (__atomic_sub_fetch (&v, 1, __ATOMIC_RELEASE) != --res)
+    abort ();
+
+  if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL) != --res)
+    abort ();
+
+  if (__atomic_sub_fetch (&v, count + 1, __ATOMIC_SEQ_CST) != --res)
+    abort ();
+}
+
+/* Test the OP routines with a result which isn't used.  Use both variations
+   within each function.  */
+
+void
+test_add ()
+{
+  v = res = 0.0;
+  count = 1.0;
+
+  __atomic_add_fetch (&v, count, __ATOMIC_RELAXED);
+  if (v != 1)
+    abort ();
+
+  __atomic_fetch_add (&v, count, __ATOMIC_CONSUME);
+  if (v != 2)
+    abort ();
+
+  __atomic_add_fetch (&v, 1.0, __ATOMIC_ACQUIRE);
+  if (v != 3)
+    abort ();
+
+  __atomic_fetch_add (&v, 1.0, __ATOMIC_RELEASE);
+  if (v != 4)
+    abort ();
+
+  __atomic_add_fetch (&v, count, __ATOMIC_ACQ_REL);
+  if (v != 5)
+    abort ();
+
+  __atomic_fetch_add (&v, count, __ATOMIC_SEQ_CST);
+  if (v != 6)
+    abort ();
+}
+
+
+void
+test_sub ()
+{
+  v = res = 20;
+  count = 0.0;
+
+  __atomic_sub_fetch (&v, count + 1, __ATOMIC_RELAXED);
+  if (v != --res)
+    abort ();
+
+  __atomic_fetch_sub (&v, count + 1, __ATOMIC_CONSUME);
+  if (v != --res)
+    abort ();
+
+  __atomic_sub_fetch (&v, 1, __ATOMIC_ACQUIRE);
+  if (v != --res)
+    abort ();
+
+  __atomic_fetch_sub (&v, 1, __ATOMIC_RELEASE);
+  if (v != --res)
+    abort ();
+
+  __atomic_sub_fetch (&v, count + 1, __ATOMIC_ACQ_REL);
+  if (v != --res)
+    abort ();
+
+  __atomic_fetch_sub (&v, count + 1, __ATOMIC_SEQ_CST);
+  if (v != --res)
+    abort ();
+}
+
+int
+main ()
+{
+  test_fetch_add ();
+  test_fetch_sub ();
+
+  test_add_fetch ();
+  test_sub_fetch ();
+
+  test_add ();
+  test_sub ();
+
+  return 0;
+}
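A side note on the TODO comment repeated at the top of these testcases: the
integral constants involved cannot introduce any floating-point rounding,
because every value the tests use is exactly representable even in the
narrowest types exercised.  __bf16 has an 8-bit significand, so all integers
up to 256 convert exactly; _Float16 has an 11-bit significand, giving exact
integers up to 2048; the tests never go beyond 20.  A minimal standalone
sketch of that argument (not part of the patch series; it assumes a toolchain
and target where __bf16 and _Float16 are available as arithmetic types, which
the testcases themselves already assume):

extern void abort (void);

int
main ()
{
  /* Every integer the atomic-op-fp* tests compare against is <= 20, far
     below the exact-integer range of even __bf16 (2^8) or _Float16 (2^11),
     so conversions to and from int round-trip exactly.  */
  for (int i = 0; i <= 20; i++)
    {
      __bf16 b = i;
      _Float16 h = i;
      if ((int) b != i || (int) h != i)
        abort ();
      /* Integral addends of 1 are likewise exact in this range, mirroring
         the fetch_add/fetch_sub testcases' mixing of integer and
         floating-point operands.  */
      if (b + 1 != i + 1 || h + 1 != i + 1)
        abort ();
    }
  return 0;
}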
From patchwork Thu Sep 19 13:12:03 2024
X-Patchwork-Submitter: Matthew Malcomson
X-Patchwork-Id: 97703
From: 
To: 
CC: Jonathan Wakely , Joseph Myers , Richard Biener , Matthew Malcomson
Subject: [PATCH 7/8] [RFC] Mention floating point atomic fetch_add etc in docs
Date: Thu, 19 Sep 2024 14:12:03 +0100
Message-ID: <20240919131204.3865854-8-mmalcomson@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
MIME-Version: 1.0
From: Matthew Malcomson

Signed-off-by: Matthew Malcomson
---
 gcc/doc/extend.texi | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 66c99ef7a66..a3e3e7da5d6 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13501,6 +13501,18 @@ the same format with the addition of a @samp{size_t} parameter inserted as
 the first parameter indicating the size of the object being pointed to.
 All objects must be the same size.
 
+Moreover, the @samp{__atomic_fetch_add}, @samp{__atomic_fetch_sub},
+@samp{__atomic_add_fetch} and @samp{__atomic_sub_fetch} builtins can all
+accept floating point types of @code{float}, @code{double}, @code{long double},
+@code{bfloat16}, @code{_Float16}, @code{_Float32}, @code{_Float64},
+@code{_Float128}, @code{_Float32x} and @code{_Float64x}.  These use a lock-free
+built-in function if the size of the floating point type makes that possible
+and otherwise leave an external call to be resolved at run time.  This external
+call is of the same format but specialised to the given floating point type.
+The specialised versions of these functions are denoted by one of the
+suffixes @code{_fpf}, @code{_fp}, @code{_fpl}, @code{_fpf16b}, @code{_fpf16},
+@code{_fpf32}, @code{_fpf64}, @code{_fpf128}, @code{_fpf32x}, @code{_fpf64x}.
+
 There are 6 different memory orders that can be specified.  These map to the
 C++11 memory orders with the same names, see the C++11 standard or the
 @uref{https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync,GCC wiki
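To make the documented behaviour concrete, here is a minimal usage sketch
(mine, not part of the patch; it assumes a compiler with this series
applied).  On targets where a double-width atomic is lock-free the operation
is inlined; otherwise, going by the suffix list above, an external call
specialised for double (the _fp suffix, i.e. __atomic_fetch_add_fp) would be
left for the runtime to resolve:

#include <stdio.h>

double total = 0.0;

void
add_sample (double x)
{
  /* With this series the object and addend may both be floating point.  */
  __atomic_fetch_add (&total, x, __ATOMIC_RELAXED);
}

int
main ()
{
  add_sample (1.5);
  add_sample (2.5);

  /* The generic __atomic_load works for any type, including double.  */
  double snapshot;
  __atomic_load (&total, &snapshot, __ATOMIC_ACQUIRE);
  printf ("%f\n", snapshot);	/* Prints 4.000000.  */
  return 0;
}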
From patchwork Thu Sep 19 13:12:04 2024
X-Patchwork-Submitter: Matthew Malcomson
X-Patchwork-Id: 97707
From: 
To: 
CC: Jonathan Wakely , Joseph Myers , Richard Biener , Matthew Malcomson
Subject: [PATCH 8/8] [RFC] Add demo implementation of one of the operations
Date: Thu, 19 Sep 2024 14:12:04 +0100
Message-ID: <20240919131204.3865854-9-mmalcomson@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240919131204.3865854-1-mmalcomson@nvidia.com>
References: <20240919131204.3865854-1-mmalcomson@nvidia.com>
MIME-Version: 1.0
From: Matthew Malcomson

Do the demo implementation in AArch64 since that's the backend I'm most
familiar with.  Nothing much else to say -- though it is nice to see that the
demo implementation works as expected (it gets used for fetch_add, add_fetch
and sub_fetch even though it is only defined for fetch_sub).

The demo implementation ensures that I can run some execution tests.  It is
added behind a flag so that the testsuite can be run in different variants
(with the flag and without),
ensuring that the functionality works both with the fallback and when this
optab is implemented (also checking the two different fallbacks of either
calling into libatomic or inlining a CAS loop).  In order to run with both
this and the fallback implementation I use the following flag in
RUNTESTFLAGS:
  --target_board='unix {unix/-mtesting-fp-atomics}'

Signed-off-by: Matthew Malcomson
---
 gcc/config/aarch64/aarch64.h   |  2 ++
 gcc/config/aarch64/aarch64.opt |  5 +++++
 gcc/config/aarch64/atomics.md  | 15 +++++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index fac1882bcb3..c2f37545cd7 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -119,6 +119,8 @@ of LSE instructions.  */
 
 #define TARGET_OUTLINE_ATOMICS (aarch64_flag_outline_atomics)
 
+#define TARGET_TESTING_FP_ATOMICS (aarch64_flag_testing_fp_atomics)
+
 /* Align definitions of arrays, unions and structures so that
    initializations and copies can be made more efficient.  This is not
    ABI-changing, so it only affects places where we can see the

diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 6356c419399..ed031258575 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -332,6 +332,11 @@ moutline-atomics
 Target Var(aarch64_flag_outline_atomics) Init(2) Save
 Generate local calls to out-of-line atomic operations.
 
+mtesting-fp-atomics
+Target Var(aarch64_flag_testing_fp_atomics) Init(0) Save
+Use the demonstration implementation of atomic_fetch_sub_<mode> for floating
+point modes.
+
 -param=aarch64-vect-compare-costs=
 Target Joined UInteger Var(aarch64_vect_compare_costs) Init(1) IntegerRange(0, 1) Param
 When vectorizing, consider using multiple different approaches and use

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 32a0a723732..ee8fbcd6c58 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -368,6 +368,21 @@
 ;; However we also implement the acquire memory barrier with DMB LD,
 ;; and so the ST is not blocked by the barrier.
 
+(define_insn "atomic_fetch_sub<mode>"
+  [(set (match_operand:GPF 0 "register_operand" "=&w")
+	(match_operand:GPF 1 "aarch64_sync_memory_operand" "+Q"))
+   (set (match_dup 1)
+	(unspec_volatile:GPF
+	  [(minus:GPF (match_dup 1)
+		      (match_operand:GPF 2 "register_operand" "w"))
+	   (match_operand:SI 3 "const_int_operand")]
+	  UNSPECV_ATOMIC_LDOP_PLUS))
+   (clobber (match_scratch:GPF 4 "=w"))]
+  "TARGET_TESTING_FP_ATOMICS"
+  "// Here's your sandwich.\;ldr %0, %1\;fsub %4, %0, %2\;str %4, %1\;// END"
+)
+
+
 (define_insn "aarch64_atomic_<atomic_optab><mode>_lse"
   [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
     (unspec_volatile:ALLI
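For illustration, a short sketch of how the demo pattern above gets exercised
(mine, not from the patch; register allocation and exact scheduling are of
course up to the compiler).  Note that the template's ldr/fsub/str sequence
has no load/store-exclusive pair and no barrier, so it is not actually atomic
-- consistent with this being a testing-only implementation hidden behind
-mtesting-fp-atomics rather than something intended for real use:

/* Compile on AArch64 with:  gcc -O2 -mtesting-fp-atomics demo.c
   (assuming a toolchain with this series applied).  The generated assembly
   for the fetch_sub should then contain the template above, i.e. the
   "// Here's your sandwich." marker followed by an ldr of the old value,
   an fsub into the scratch register, and an str of the result back to
   memory.  Without the flag, the generic expansion (a CAS loop or a
   libatomic call) is used instead.  */
double v = 20.0;

double
demo (double x)
{
  /* Matches the GPF (SF/DF mode) atomic_fetch_sub pattern when
     -mtesting-fp-atomics is given.  */
  return __atomic_fetch_sub (&v, x, __ATOMIC_RELAXED);
}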