Message ID | 20201222153039.17722-1-rzinsly@linux.ibm.com |
---|---|
State | Superseded |
Delegated to: | Tulio Magno Quites Machado Filho |
Series | powerpc: Add optimized ilogbf128 for POWER9 |
Commit Message
Raphael M Zinsly
Dec. 22, 2020, 3:30 p.m. UTC
The instruction xsxexpqp introduced on POWER9 extracts the exponent
from a quad-precision floating-point, thus it can be used to improve
ilogbf128 and llogbf128.

---
 .../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
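To make the idea in the commit message concrete, here is a minimal portable sketch of what "extract the exponent and subtract the bias" means. It is an analogue only: it uses binary64 (`double`, 11-bit exponent, bias 1023) so that it builds on any host, whereas the patch operates on binary128 (15-bit exponent, bias 0x3fff) with a single POWER9 instruction. The helper names `biased_exponent` and `ilogb_normal` are hypothetical and do not appear in glibc.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the biased exponent field of an
   IEEE 754 binary64 value by reading its bit pattern.  */
static unsigned
biased_exponent (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);   /* Well-defined type punning.  */
  return (unsigned) ((bits >> 52) & 0x7ff);
}

/* Hypothetical helper: ilogb for NORMAL inputs only.  Zero,
   subnormal, Inf and NaN need separate handling, which is why the
   patch tests the data class before taking the fast path.  */
static int
ilogb_normal (double x)
{
  return (int) biased_exponent (x) - 1023;
}
```

For a normal input such as 8.0 the biased field is 1026, so the result is 3; the sign bit is masked off, so negative inputs behave the same way.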
Comments
Benchtests results with and without this patch on a POWER9:

without:

    "ilogbf128": {
     "subnormal": {
      "duration": 5.09834e+08,
      "iterations": 2.8146e+07,
      "max": 38.979,
      "min": 2.939,
      "mean": 18.1139
     },
     "normal": {
      "duration": 4.99378e+08,
      "iterations": 1.6151e+08,
      "max": 16.698,
      "min": 2.942,
      "mean": 3.09193
     }
    }

with:

    "ilogbf128": {
     "subnormal": {
      "duration": 5.09989e+08,
      "iterations": 2.5978e+07,
      "max": 41.027,
      "min": 4.674,
      "mean": 19.6316
     },
     "normal": {
      "duration": 4.98105e+08,
      "iterations": 1.77912e+08,
      "max": 12.663,
      "min": 2.792,
      "mean": 2.79972
     }
    }

Best Regards,
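As a reading aid for the numbers above, the following sketch computes the relative change in the reported "mean" values. The helper `percent_change` is hypothetical; the inputs are copied from the benchtests output. Under that reading, the normal case improves by roughly 9% while the subnormal case gets roughly 8% slower, which would be consistent with the extra data-class test running before the fallback.

```c
/* Hypothetical helper: relative change of AFTER versus BEFORE in
   percent.  A negative result means the patched build is faster.  */
static double
percent_change (double before, double after)
{
  return (after - before) / before * 100.0;
}

/* "mean" values copied from the benchtests output above.  */
static const double normal_mean_without = 3.09193;
static const double normal_mean_with = 2.79972;
static const double subnormal_mean_without = 18.1139;
static const double subnormal_mean_with = 19.6316;
```

Calling `percent_change (normal_mean_without, normal_mean_with)` gives the change for normal inputs, and likewise for the subnormal pair.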
On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
> The instruction xsxexpqp introduced on POWER9 extracts the exponent
> from a quad-precision floating-point, thus it can be used to improve
> ilogbf128 and llogbf128.
> ---
>  .../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
>  1 file changed, 22 insertions(+)
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> new file mode 100644
> index 0000000000..47558bbadc
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> @@ -0,0 +1,22 @@
> +#ifdef _ARCH_PWR9
> +int _ilogbf128 (_Float128 __x);

This should be a locally (static) scoped function.

> +
> +int
> +#if defined(_F128_ENABLE_IFUNC)
> +__ieee754_ilogbf128_power9 (_Float128 __x)
> +#else
> +__ieee754_ilogbf128 (_Float128 __x)
> +#endif
> +{
> +  /* Check for exceptional cases. */
> +  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
> +    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
> +  else
> +    /* Fallback to the generic ilogb if __x is NaN, Inf or subnormal. */
> +    return _ilogbf128(__x);
> +}
> +
> +#define __ieee754_ilogbf128 _ilogbf128
> +#endif
> +
> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>

A space seems to be missing between include and <.

Otherwise, LGTM.

As a side note, I think the benchtests are not too impressive. I am
surprised normal values don't show better results.
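The first review comment can be illustrated with a small single-file sketch of the rename-and-fallback pattern the patch uses. It is a portable analogue under stated assumptions: it works on `double` instead of `_Float128`, the names `ieee754_ilogb` and `generic_ilogb` are hypothetical, and the generic implementation is written inline where the patch would `#include` the generic source file. The point it shows is that a `static` forward declaration gives the renamed generic function internal linkage, even though its later definition carries no storage-class specifier.

```c
#include <limits.h>
#include <math.h>    /* FP_ILOGB0, FP_ILOGBNAN (macros only).  */
#include <stdint.h>
#include <string.h>

/* Static forward declaration: the fallback stays local to this file.  */
static int generic_ilogb (double x);

/* Fast path, standing in for the POWER9 version in the patch.  */
int
ieee754_ilogb (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  unsigned e = (unsigned) ((bits >> 52) & 0x7ff);
  if (e != 0 && e != 0x7ff)        /* Normal number.  */
    return (int) e - 1023;
  return generic_ilogb (x);        /* Zero, subnormal, Inf or NaN.  */
}

/* Rename the generic entry point, as the patch does before including
   the generic source.  The definition below has no "static", yet it
   inherits internal linkage from the declaration above.  */
#define ieee754_ilogb generic_ilogb
int
ieee754_ilogb (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  unsigned e = (unsigned) ((bits >> 52) & 0x7ff);
  uint64_t frac = bits & 0x000fffffffffffffULL;
  if (e == 0x7ff)
    return frac ? FP_ILOGBNAN : INT_MAX;
  if (e != 0)
    return (int) e - 1023;
  if (frac == 0)
    return FP_ILOGB0;
  int exp = -1022;                 /* Exponent of the smallest normal.  */
  while (!(frac & (1ULL << 52)))   /* Normalize the subnormal.  */
    {
      frac <<= 1;
      exp--;
    }
  return exp;
}
#undef ieee754_ilogb
```

After the `#undef`, callers of `ieee754_ilogb` reach the fast version, and `generic_ilogb` is not visible outside the translation unit. The sketch ignores errno and floating-point exception handling.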
On 1/4/21 5:20 PM, Paul E Murphy via Libc-alpha wrote:
>
>
> On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
>> The instruction xsxexpqp introduced on POWER9 extracts the exponent
>> from a quad-precision floating-point, thus it can be used to improve
>> ilogbf128 and llogbf128.
>> ---
>>  .../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>>
>> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> new file mode 100644
>> index 0000000000..47558bbadc
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> @@ -0,0 +1,22 @@
>> +#ifdef _ARCH_PWR9
>> +int _ilogbf128 (_Float128 __x);
>
> This should be a locally (static) scoped function.
>
>> +
>> +int
>> +#if defined(_F128_ENABLE_IFUNC)
>> +__ieee754_ilogbf128_power9 (_Float128 __x)
>> +#else
>> +__ieee754_ilogbf128 (_Float128 __x)
>> +#endif
>> +{
>> +  /* Check for exceptional cases. */
>> +  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
>> +    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
>> +  else
>> +    /* Fallback to the generic ilogb if __x is NaN, Inf or
>> subnormal. */
>> +    return _ilogbf128(__x);
>> +}
>> +
>> +#define __ieee754_ilogbf128 _ilogbf128
>> +#endif
>> +
>> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>
>
> A space seems to be missing between include and <.
>
> Otherwise, LGTM.
>
> As a side note, I think the benchtests are not too impressive. I am
> surprised normal values don't show better results.

After spending a little time looking at this, the call overhead of the
wrapper is hiding most of the improvement. Similarly, power9 adds
similar instructions for float32/float64. I would recommend refactoring
this patch to provide an override to w_ilogb_template.c so all three
formats can use these new instructions without the call overhead for
normal numbers.
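The refactoring suggested above could look roughly like the following sketch: a macro stamps out one inline wrapper per format, so normal numbers are handled without a function call and only the special cases reach an out-of-line slow path. This is an illustration, not the real `w_ilogb_template.c`, which also deals with errno and exceptions. The names `DEFINE_ILOGB`, `ilogb_special`, `my_ilogbf` and `my_ilogb` are hypothetical, portable bit operations stand in for the POWER9 instructions, and binary128 is left out to keep the example buildable on any host.

```c
#include <limits.h>
#include <math.h>    /* FP_ILOGB0, FP_ILOGBNAN (macros only).  */
#include <stdint.h>
#include <string.h>

/* Hypothetical out-of-line slow path shared by all formats.  It is
   reached only for zero, subnormal, Inf and NaN inputs.  */
static int
ilogb_special (uint64_t frac, int is_max_exp, int mant_bits, int bias)
{
  if (is_max_exp)
    return frac ? FP_ILOGBNAN : INT_MAX;
  if (frac == 0)
    return FP_ILOGB0;
  int exp = 1 - bias;              /* Exponent of the smallest normal.  */
  while (!(frac >> mant_bits))     /* Normalize the subnormal.  */
    {
      frac <<= 1;
      exp--;
    }
  return exp;
}

/* Hypothetical template: the fast path for normal numbers is inlined
   into the wrapper, so it pays no call overhead.  */
#define DEFINE_ILOGB(NAME, FLOAT_T, UINT_T, MANT_BITS, EXP_BITS)      \
  static inline int                                                   \
  NAME (FLOAT_T x)                                                    \
  {                                                                   \
    UINT_T bits;                                                      \
    memcpy (&bits, &x, sizeof bits);                                  \
    const unsigned emax = (1u << (EXP_BITS)) - 1;                     \
    const int bias = (int) (emax >> 1);                               \
    unsigned e = (unsigned) (bits >> (MANT_BITS)) & emax;             \
    uint64_t frac = bits & (((UINT_T) 1 << (MANT_BITS)) - 1);         \
    if (e != 0 && e != emax)                                          \
      return (int) e - bias;                                          \
    return ilogb_special (frac, e == emax, (MANT_BITS), bias);        \
  }

DEFINE_ILOGB (my_ilogbf, float, uint32_t, 23, 8)
DEFINE_ILOGB (my_ilogb, double, uint64_t, 52, 11)
```

The design choice is to keep the common case in the caller and pay for a call only on rare inputs, which is the overhead the review identifies as hiding most of the improvement.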
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
new file mode 100644
index 0000000000..47558bbadc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
@@ -0,0 +1,22 @@
+#ifdef _ARCH_PWR9
+int _ilogbf128 (_Float128 __x);
+
+int
+#if defined(_F128_ENABLE_IFUNC)
+__ieee754_ilogbf128_power9 (_Float128 __x)
+#else
+__ieee754_ilogbf128 (_Float128 __x)
+#endif
+{
+  /* Check for exceptional cases. */
+  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
+    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
+  else
+    /* Fallback to the generic ilogb if __x is NaN, Inf or subnormal. */
+    return _ilogbf128(__x);
+}
+
+#define __ieee754_ilogbf128 _ilogbf128
+#endif
+
+#include<sysdeps/ieee754/float128/e_ilogbf128.c>