From patchwork Sat Nov 18 19:30:56 2023
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 80226
Date: Sat, 18 Nov 2023 20:30:56 +0100
From: Jakub Jelinek
To: Richard Biener, "Joseph S. Myers"
Cc: gcc-patches@gcc.gnu.org, Florian Weimer
Subject: [PATCH] c-family, middle-end: Add __builtin_c[lt]zg (arg, 0ULL) exception
Reply-To: Jakub Jelinek

Hi!

In https://sourceware.org/pipermail/libc-alpha/2023-November/152819.html
Florian Weimer raised the concern that the type-generic stdbit.h macros
currently being considered suffer from a similar problem to the old tgmath.h
implementation: the macros expand their arguments multiple times during
preprocessing, and if one nests these stdbit.h type-generic macros several
times, that can result in extremely large preprocessed source and long
compile times, even when the argument is only actually evaluated once at
runtime for side-effects.

As I'd strongly prefer not to add new builtins for all the 14 stdbit.h
type-generic macros, I think it is better to build the macros from
smaller building blocks.  The following patch adds the first one.

While one can implement e.g. the stdc_leading_zeros(value) macro as
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
that expands the argument 3 times, and even if it just used
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) -1)))
relying on two's complement, that still expands it twice.

I'd prefer not to add an optional 3rd argument to these, but given that the
second argument, if specified, currently has to have signed int type, the
following patch adds an exception: it allows 0ULL as a magic value for that
argument, meaning "fill in the precision of the first argument".

Ok for trunk if it passes bootstrap/regtest?

2023-11-18  Jakub Jelinek

	PR c/111309
gcc/
	* builtins.cc (fold_builtin_bit_query): If arg1 is 0ULL, use
	TYPE_PRECISION (arg0_type) instead of it.
	* fold-const-call.cc (fold_const_call_sss): Rename arg_type
	argument to arg0_type, add arg1_type argument, if for CLZ/CTZ
	second argument is unsigned long long, use
	TYPE_PRECISION (arg0_type).
	(fold_const_call_1): Pass also TREE_TYPE (arg1) to
	fold_const_call_sss.
	* doc/extend.texi (__builtin_clzg, __builtin_ctzg): Document
	behavior for second argument 0ULL.
gcc/c-family/
	* c-common.cc (check_builtin_function_arguments): If args[1] is
	0ULL, use TYPE_PRECISION (TREE_TYPE (args[0])) instead of it.
gcc/testsuite/
	* c-c++-common/pr111309-3.c: New test.
	* gcc.dg/torture/bitint-43.c: Add tests with 0ULL second argument.

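To put rough numbers on the preprocessing concern described above, here is a minimal sketch of how the preprocessed size of a nested macro grows with the number of times each level copies its argument. The helper name expanded_size and the unit sizes are hypothetical and chosen only for illustration; nothing here is measured from a real stdbit.h implementation.

```c
#include <assert.h>

/* Hypothetical model: size, in arbitrary units, of the preprocessed text of
   a macro nested DEPTH times, when every level pastes its argument COPIES
   times and contributes OVERHEAD units of its own.  LEAF is the size of the
   innermost argument.  */
static unsigned long long
expanded_size (unsigned depth, unsigned copies,
               unsigned long long overhead, unsigned long long leaf)
{
  assert (copies >= 1);
  unsigned long long size = leaf;
  /* Each nesting level multiplies what it wraps by COPIES.  */
  for (unsigned i = 0; i < depth; ++i)
    size = size * copies + overhead;
  return size;
}
```

With the assumed values overhead = 10 and leaf = 1, ten levels of nesting give 101 units when the argument is expanded once per level, but 354289 units when it is expanded three times per level, which is why a building block that expands its argument only once matters.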
	Jakub

--- gcc/builtins.cc.jj	2023-11-14 10:52:16.170276318 +0100
+++ gcc/builtins.cc	2023-11-18 13:55:02.996395917 +0100
@@ -9591,6 +9591,10 @@ fold_builtin_bit_query (location_t loc,
     case BUILT_IN_CLZG:
       if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
         return NULL_TREE;
+      if (arg1
+          && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
+              == long_long_unsigned_type_node))
+        arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
       ifn = IFN_CLZ;
       fcodei = BUILT_IN_CLZ;
       fcodel = BUILT_IN_CLZL;
@@ -9599,6 +9603,10 @@ fold_builtin_bit_query (location_t loc,
     case BUILT_IN_CTZG:
       if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
         return NULL_TREE;
+      if (arg1
+          && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
+              == long_long_unsigned_type_node))
+        arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
       ifn = IFN_CTZ;
       fcodei = BUILT_IN_CTZ;
       fcodel = BUILT_IN_CTZL;
--- gcc/fold-const-call.cc.jj	2023-11-14 10:52:16.186276097 +0100
+++ gcc/fold-const-call.cc	2023-11-18 13:49:57.514641417 +0100
@@ -1543,13 +1543,13 @@ fold_const_call_sss (real_value *result,
 
       *RESULT = FN (ARG0, ARG1)
 
-   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
-   the result.  Return true on success.  */
+   where ARG0_TYPE is the type of ARG0, ARG1_TYPE is the type of ARG1 and
+   PRECISION is the number of bits in the result.  Return true on success.  */
 
 static bool
 fold_const_call_sss (wide_int *result, combined_fn fn,
                      const wide_int_ref &arg0, const wide_int_ref &arg1,
-                     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
+                     unsigned int precision, tree arg0_type, tree arg1_type)
 {
   switch (fn)
     {
@@ -1559,6 +1559,8 @@ fold_const_call_sss (wide_int *result, c
         int tmp;
         if (wi::ne_p (arg0, 0))
           tmp = wi::clz (arg0);
+        else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
+          tmp = TYPE_PRECISION (arg0_type);
         else
           tmp = arg1.to_shwi ();
         *result = wi::shwi (tmp, precision);
@@ -1571,6 +1573,8 @@ fold_const_call_sss (wide_int *result, c
         int tmp;
         if (wi::ne_p (arg0, 0))
           tmp = wi::ctz (arg0);
+        else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
+          tmp = TYPE_PRECISION (arg0_type);
         else
           tmp = arg1.to_shwi ();
         *result = wi::shwi (tmp, precision);
@@ -1625,7 +1629,7 @@ fold_const_call_1 (combined_fn fn, tree
           wide_int result;
           if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
                                    wi::to_wide (arg1), TYPE_PRECISION (type),
-                                   TREE_TYPE (arg0)))
+                                   TREE_TYPE (arg0), TREE_TYPE (arg1)))
             return wide_int_to_tree (type, result);
         }
       return NULL_TREE;
--- gcc/doc/extend.texi.jj	2023-11-16 17:27:39.838028110 +0100
+++ gcc/doc/extend.texi	2023-11-18 13:17:40.982551766 +0100
@@ -15031,6 +15031,9 @@ optional second argument with int type.
 are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
+As an exception, if two arguments are specified and the second argument
+is 0ULL, it is as if the second argument was the bit width of the first
+argument.
 @enddefbuiltin
 
 @defbuiltin{int __builtin_ctzg (...)}
@@ -15040,6 +15043,9 @@ optional second argument with int type.
 are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
+As an exception, if two arguments are specified and the second argument
+is 0ULL, it is as if the second argument was the bit width of the first
+argument.
 @enddefbuiltin
 
 @defbuiltin{int __builtin_clrsbg (...)}
--- gcc/c-family/c-common.cc.jj	2023-11-14 18:26:05.193616416 +0100
+++ gcc/c-family/c-common.cc	2023-11-18 13:20:55.751844490 +0100
@@ -6540,6 +6540,13 @@ check_builtin_function_arguments (locati
                         "%qE does not have integral type", 2, fndecl);
               return false;
             }
+          if (integer_zerop (args[1])
+              && (TYPE_MAIN_VARIANT (TREE_TYPE (args[1]))
+                  == long_long_unsigned_type_node))
+            args[1] = build_int_cst (integer_type_node,
+                                     INTEGRAL_TYPE_P (TREE_TYPE (args[0]))
+                                     ? TYPE_PRECISION (TREE_TYPE (args[0]))
+                                     : 0);
           if ((TYPE_PRECISION (TREE_TYPE (args[1]))
                > TYPE_PRECISION (integer_type_node))
               || (TYPE_PRECISION (TREE_TYPE (args[1]))
--- gcc/testsuite/c-c++-common/pr111309-3.c.jj	2023-11-18 13:22:22.084644472 +0100
+++ gcc/testsuite/c-c++-common/pr111309-3.c	2023-11-18 13:26:12.894436233 +0100
@@ -0,0 +1,26 @@
+/* PR c/111309 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+int
+main ()
+{
+  if (__builtin_clzg ((unsigned char) 0, 0ULL) != __CHAR_BIT__
+      || __builtin_clzg ((unsigned short) 0, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__
+      || __builtin_clzg (0U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__
+      || __builtin_clzg (0UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__
+      || __builtin_clzg (0ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
+#ifdef __SIZEOF_INT128__
+      || __builtin_clzg ((unsigned __int128) 0, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__
+#endif
+      || __builtin_clzg ((unsigned char) 1, 0ULL) != __CHAR_BIT__ - 1
+      || __builtin_clzg ((unsigned short) 2, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
+      || __builtin_clzg (4U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__ - 3
+      || __builtin_clzg (8UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 4
+      || __builtin_clzg (16ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 5
+#ifdef __SIZEOF_INT128__
+      || __builtin_clzg ((unsigned __int128) 32, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__ - 6
+#endif
+      || 0)
+    __builtin_abort ();
+}
--- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-14 10:52:16.191276028 +0100
+++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-18 13:28:49.335261722 +0100
@@ -141,7 +141,9 @@ main ()
       || parity156 (0) != 0
       || popcount156 (0) != 0
       || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
+      || __builtin_clzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
       || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
+      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
       || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
       || __builtin_ffsg ((_BitInt(156)) 0) != 0
       || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
@@ -159,8 +161,10 @@ main ()
       || popcount156 (-1) != 156
       || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
       || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
+      || __builtin_clzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
       || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
       || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
+      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
       || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
       || __builtin_ffsg ((_BitInt(156)) -1) != 1
       || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
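
For readers without a compiler carrying this patch, the expectations in pr111309-3.c can be cross-checked against a portable model of the proposed rule. This is only a sketch under stated assumptions: MODEL_WIDTH, model_clz and MODEL_CLZG are hypothetical names, not GCC APIs; the model handles only types up to unsigned long long (no __int128 or _BitInt); it keys on the type of the second argument alone, whereas the c-common.cc hunk also requires the value to be zero; and, unlike the builtin, the macro expands its first argument more than once.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for TYPE_PRECISION: bit width of the unpromoted
   type of X.  */
#define MODEL_WIDTH(x) ((int) (sizeof (x) * CHAR_BIT))

/* Count leading zeros of VALUE viewed as a WIDTH-bit unsigned integer;
   return FALLBACK when VALUE is 0.  */
static int
model_clz (unsigned long long value, int width, int fallback)
{
  assert (width > 0 && width <= MODEL_WIDTH (value));
  if (value == 0)
    return fallback;
  int count = 0;
  for (int bit = width - 1; bit >= 0 && ((value >> bit) & 1ULL) == 0; --bit)
    ++count;
  return count;
}

/* Model of the proposed exception: a second argument of type
   unsigned long long selects the width of the first argument as the
   fallback; any other integer type is used as the fallback directly.  */
#define MODEL_CLZG(value, fallback)                                    \
  model_clz ((value), MODEL_WIDTH (value),                             \
             _Generic ((fallback),                                     \
                       unsigned long long: MODEL_WIDTH (value),        \
                       default: (int) (fallback)))
```

For example, MODEL_CLZG ((unsigned char) 0, 0ULL) yields CHAR_BIT, matching the first check in the new test, while MODEL_CLZG (0U, 42) yields 42, matching the existing two-argument behavior.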