From patchwork Thu Sep 22 15:56:35 2022
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 57913
Date: Thu, 22 Sep 2022 17:56:35 +0200
From: Jakub Jelinek
To: Hongtao Liu, Jonathan Wakely, "Joseph S. Myers", Richard Earnshaw, Kyrylo Tkachov, richard.sandiford@arm.com
Cc: gcc-patches@gcc.gnu.org
Subject: [RFC PATCH] __trunc{tf,xf,df,sf,hf}bf2, __truncbfhf2 and __extendbfsf2

On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote:
> > > The question is (mainly for aarch64, arm and x86 backend maintainers) if we
> > > shouldn't support it, in the PR there is a partial patch to do so, but
> > > the big question is if it should be supported as the __bf16 type those
> > > 3 targets use with u6__bf16 mangling and remove those *_invalid_* cases
> > > and add conversions to/from at least SFmode but probably also DFmode, TFmode
> > > and XFmode on x86 and implement arithmetics on those through conversion to
> > > SFmode, performing arithmetics there and conversion back.
> > > Conversion from BFmode to SFmode is easy, left shift by 16 and ought to be
> > > implemented inline, SFmode -> BFmode conversion is harder,
> > > I think it is roughly:
> > I'm not sure if there should be any floating point exceptions for
> > BFmode operation.
> > For x86, there's no floating point exceptions for AVX512_BF16 related
> > instructions
> As long as __bf16 is just an extension, supporting or not supporting
> exceptions on sNaNs is just fine I think, but I'm afraid it is different
> for std::bfloat16_t. If we claim we support it (define that type
> in <stdfloat>, predefine __STD_BFLOAT16_TYPE__), then it needs to follow
> ISO/IEC/IEEE 60559, and I'm afraid that means also exceptions and the like.
> While the IEEE spec doesn't cover the exact bfloat16 format, C++ talks about
> a format with these and these number of bits here and there that behaves
> like in IEEE otherwise.
> Whether we support std::bfloat16_t at all is our choice, if we do support
> it, whether we support it with __bf16 underlying type or come up with
> something different, it is up to us, and with -ffast-math/-Ofast etc.
> we can certainly use hw instructions for it which don't raise exceptions.
>
> At least that is my limited understanding of it...

I've been playing with this a little bit and here is a soft-fp version
of IMHO everything we need for proper bfloat16 support.

In particular, I think we need all the truncating conversions from other
floating formats that a target with BFmode floating point support
(currently arm, aarch64 and x86) has, a truncating conversion from BFmode
to HFmode (GCC seems to consider conversions truncating when the precision
is the same) and an extension from BFmode to SFmode.
Extensions from BFmode to DF/XF/TFmode are IMHO best implemented inside of
GCC by performing the BFmode to SFmode conversion first and then converting
SFmode to those other formats. Other arithmetic on BFmode should be
implemented simply by widening to SFmode, doing the arithmetic there and
then converting back.

The BFmode to SFmode extension can also be implemented simply by shifting
the VIEW_CONVERT_EXPRed (VCEd) value up by 16 bits and VCEing the result,
either if the flags say sNaNs don't need to be handled, or IMHO if we use
the extended result in some arithmetic operation that will itself handle
the sNaN signaling and the conversion into qNaN. Similarly, for SFmode to
BFmode conversions we can use hw instructions if they are available and we
don't care about sNaNs.

The C FE has the advantage that it has excess precision support; there we
should arrange for BFmode to always be promoted to SFmode excess precision.
The C++ FE doesn't have that.

Also, a question to the ARM/AArch64/x86 maintainers is whether it is ok to
add conversion and arithmetic support to the __bf16 type, or whether that
type should remain useless and there should be another type (some keyword
or just float __attribute__((__mode__ (__BF__)))) that we'd have that
support for. Whatever type we'd use as std::bfloat16_t should mangle as
DFb16_ rather than the u6__bf16 that __bf16 currently mangles to, though.

Thoughts on this?

And for Joseph, sure, the libgcc/soft-fp/ part should probably go into
glibc first and be copied from there afterwards.

Perhaps __truncbfhf2 could be dropped and we could just emit a left shift
by 16 on the compiler side before calling __truncsfhf2.
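To make the two bit-level shortcuts discussed above concrete, here is a
minimal sketch in plain C. It is not the soft-fp code from the patch. It
assumes float is IEEE binary32, works on raw uint16_t bit patterns instead
of __bf16, always rounds to nearest-even regardless of the dynamic rounding
mode, and raises no floating-point exceptions, so it only corresponds to
the case where sNaNs don't need to be handled. The helper names
bf16_bits_to_float, float_to_bf16_bits and bf16_self_check are made up for
this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to float by placing it
   in the upper 16 bits of a binary32 word.  Nothing is signaled here.  */
static float
bf16_bits_to_float (uint16_t b)
{
  uint32_t w = (uint32_t) b << 16;
  float f;
  memcpy (&f, &w, sizeof f);
  return f;
}

/* Hypothetical helper: narrow a float to a bfloat16 bit pattern using
   round-to-nearest-even.  Exception flags are ignored.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t w;
  memcpy (&w, &f, sizeof w);
  if ((w & 0x7fffffffu) > 0x7f800000u)
    /* NaN: keep the sign and the top payload bits, force the quiet bit.  */
    return (uint16_t) ((w >> 16) | 0x0040u);
  /* Adding 0x7fff plus the lowest kept bit makes ties round to even.  */
  w += 0x7fffu + ((w >> 16) & 1u);
  return (uint16_t) (w >> 16);
}

/* Small self-check with bit patterns worked out by hand.  */
static void
bf16_self_check (void)
{
  /* e as binary32 is 0x402df854; the discarded half 0xf854 rounds up.  */
  assert (float_to_bf16_bits (2.7182817f) == 0x402e);
  /* Widening and then narrowing again is lossless for non-NaN values.  */
  assert (float_to_bf16_bits (bf16_bits_to_float (0xc2f7)) == 0xc2f7);
  /* Infinity stays infinity.  */
  assert (float_to_bf16_bits (bf16_bits_to_float (0x7f80)) == 0x7f80);
}
```

Calling bf16_self_check () from a test driver exercises both directions.
The soft-fp routines in the patch differ from this sketch in that they
honor the current rounding mode and raise the appropriate exceptions.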
Jakub

extern __bf16 __trunctfbf2 (_Float128);
extern __bf16 __truncxfbf2 (__float80);
extern __bf16 __truncdfbf2 (_Float64);
extern __bf16 __truncsfbf2 (_Float32);
extern __bf16 __trunchfbf2 (_Float16);
extern _Float16 __truncbfhf2 (__bf16);
extern _Float32 __extendbfsf2 (__bf16);

int
main ()
{
  volatile _Float128 tf;
  volatile __float80 xf;
  volatile _Float64 df;
  volatile _Float32 sf;
  volatile _Float16 hf;
  union { _Float32 f; unsigned int i; } u1;
  union { __bf16 f; unsigned short i; } u2;
  tf = 2.718281828459045235360287471352662498F128;
  u1.f = tf;
  u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = 2.718281828459045235360287471352662498W;
  u1.f = xf;
  u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = 2.718281828459045235360287471352662498F64;
  u1.f = df;
  u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = 2.718281828459045235360287471352662498F32;
  u1.f = sf;
  u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = 2.718281828459045235360287471352662498F16;
  u1.f = hf;
  u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  tf = __builtin_inff128 ();
  u1.f = tf;
  u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = -__builtin_infl ();
  u1.f = xf;
  u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = __builtin_inff64 ();
  u1.f = df;
  u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = -__builtin_inff32 ();
  u1.f = sf;
  u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = __builtin_inff16 ();
  u1.f = hf;
  u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  tf = __builtin_nanf128 ("");
  u1.f = tf;
  u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = __builtin_nanl ("");
  u1.f = xf;
  u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = __builtin_nanf64 ("");
  u1.f = df;
  u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = __builtin_nanf32 ("");
  u1.f = sf;
  u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = __builtin_nanf16 ("");
  u1.f = hf;
  u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  return 0;
}

--- libgcc/soft-fp/brain.h.jj 2022-09-22 15:28:04.865171729 +0200
+++ libgcc/soft-fp/brain.h 2022-09-22 15:35:11.970374554 +0200
@@ -0,0 +1,172 @@
+/* Software floating-point emulation.
+   Definitions for Brain Floating Point format (bfloat16).
+   Copyright (C) 1997-2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef SOFT_FP_BRAIN_H
+#define SOFT_FP_BRAIN_H 1
+
+#if _FP_W_TYPE_SIZE < 32
+# error "Here's a nickel kid. Go buy yourself a real computer."
+#endif
+
+#define _FP_FRACTBITS_B (_FP_W_TYPE_SIZE)
+
+#define _FP_FRACTBITS_DW_B (_FP_W_TYPE_SIZE)
+
+#define _FP_FRACBITS_B 8
+#define _FP_FRACXBITS_B (_FP_FRACTBITS_B - _FP_FRACBITS_B)
+#define _FP_WFRACBITS_B (_FP_WORKBITS + _FP_FRACBITS_B)
+#define _FP_WFRACXBITS_B (_FP_FRACTBITS_B - _FP_WFRACBITS_B)
+#define _FP_EXPBITS_B 8
+#define _FP_EXPBIAS_B 127
+#define _FP_EXPMAX_B 255
+
+#define _FP_QNANBIT_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2))
+#define _FP_QNANBIT_SH_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2+_FP_WORKBITS))
+#define _FP_IMPLBIT_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1))
+#define _FP_IMPLBIT_SH_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1+_FP_WORKBITS))
+#define _FP_OVERFLOW_B ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_B))
+
+#define _FP_WFRACBITS_DW_B (2 * _FP_WFRACBITS_B)
+#define _FP_WFRACXBITS_DW_B (_FP_FRACTBITS_DW_B - _FP_WFRACBITS_DW_B)
+#define _FP_HIGHBIT_DW_B \
+  ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_DW_B - 1) % _FP_W_TYPE_SIZE)
+
+/* The implementation of _FP_MUL_MEAT_B and _FP_DIV_MEAT_B should be
+   chosen by the target machine. */
+
+typedef float BFtype __attribute__ ((mode (BF)));
+
+union _FP_UNION_B
+{
+  BFtype flt;
+  struct _FP_STRUCT_LAYOUT
+  {
+#if __BYTE_ORDER == __BIG_ENDIAN
+    unsigned sign : 1;
+    unsigned exp : _FP_EXPBITS_B;
+    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
+#else
+    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
+    unsigned exp : _FP_EXPBITS_B;
+    unsigned sign : 1;
+#endif
+  } bits;
+};
+
+#define FP_DECL_B(X) _FP_DECL (1, X)
+#define FP_UNPACK_RAW_B(X, val) _FP_UNPACK_RAW_1 (B, X, (val))
+#define FP_UNPACK_RAW_BP(X, val) _FP_UNPACK_RAW_1_P (B, X, (val))
+#define FP_PACK_RAW_B(val, X) _FP_PACK_RAW_1 (B, (val), X)
+#define FP_PACK_RAW_BP(val, X) \
+  do \
+    { \
+      if (!FP_INHIBIT_RESULTS) \
+        _FP_PACK_RAW_1_P (B, (val), X); \
+    } \
+  while (0)
+
+#define FP_UNPACK_B(X, val) \
+  do \
+    { \
+      _FP_UNPACK_RAW_1 (B, X, (val)); \
+      _FP_UNPACK_CANONICAL (B, 1, X); \
+    } \
+  while (0)
+
+#define FP_UNPACK_BP(X, val) \
+  do \
+    { \
+      _FP_UNPACK_RAW_1_P (B, X, (val)); \
+      _FP_UNPACK_CANONICAL (B, 1, X); \
+    } \
+  while (0)
+
+#define FP_UNPACK_SEMIRAW_B(X, val) \
+  do \
+    { \
+      _FP_UNPACK_RAW_1 (B, X, (val)); \
+      _FP_UNPACK_SEMIRAW (B, 1, X); \
+    } \
+  while (0)
+
+#define FP_UNPACK_SEMIRAW_BP(X, val) \
+  do \
+    { \
+      _FP_UNPACK_RAW_1_P (B, X, (val)); \
+      _FP_UNPACK_SEMIRAW (B, 1, X); \
+    } \
+  while (0)
+
+#define FP_PACK_B(val, X) \
+  do \
+    { \
+      _FP_PACK_CANONICAL (B, 1, X); \
+      _FP_PACK_RAW_1 (B, (val), X); \
+    } \
+  while (0)
+
+#define FP_PACK_BP(val, X) \
+  do \
+    { \
+      _FP_PACK_CANONICAL (B, 1, X); \
+      if (!FP_INHIBIT_RESULTS) \
+        _FP_PACK_RAW_1_P (B, (val), X); \
+    } \
+  while (0)
+
+#define FP_PACK_SEMIRAW_B(val, X) \
+  do \
+    { \
+      _FP_PACK_SEMIRAW (B, 1, X); \
+      _FP_PACK_RAW_1 (B, (val), X); \
+    } \
+  while (0)
+
+#define FP_PACK_SEMIRAW_BP(val, X) \
+  do \
+    { \
+      _FP_PACK_SEMIRAW (B, 1, X); \
+      if (!FP_INHIBIT_RESULTS) \
+        _FP_PACK_RAW_1_P (B, (val), X); \
+    } \
+  while (0)
+
+#define FP_TO_INT_B(r, X, rsz, rsg) _FP_TO_INT (B, 1, (r), X, (rsz), (rsg))
+#define FP_TO_INT_ROUND_B(r, X, rsz, rsg) \
+  _FP_TO_INT_ROUND (B, 1, (r), X, (rsz), (rsg))
+#define FP_FROM_INT_B(X, r, rs, rt) _FP_FROM_INT (B, 1, X, (r), (rs), rt)
+
+/* BFmode arithmetic is not implemented. */
+
+#define _FP_FRAC_HIGH_B(X) _FP_FRAC_HIGH_1 (X)
+#define _FP_FRAC_HIGH_RAW_B(X) _FP_FRAC_HIGH_1 (X)
+#define _FP_FRAC_HIGH_DW_B(X) _FP_FRAC_HIGH_1 (X)
+
+#define FP_CMP_EQ_B(r, X, Y, ex) _FP_CMP_EQ (B, 1, (r), X, Y, (ex))
+
+#endif /* !SOFT_FP_BRAIN_H */
--- libgcc/soft-fp/truncsfbf2.c.jj 2022-09-22 15:43:46.345386049 +0200
+++ libgcc/soft-fp/truncsfbf2.c 2022-09-22 16:02:19.940226518 +0200
@@ -0,0 +1,48 @@
+/* Software floating-point emulation.
+   Truncate IEEE single into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "single.h"
+
+BFtype
+__truncsfbf2 (SFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_S (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_S (A, a);
+  FP_TRUNC (B, S, 1, 1, R, A);
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncbfhf2.c.jj 2022-09-22 16:13:28.894300765 +0200
+++ libgcc/soft-fp/truncbfhf2.c 2022-09-22 17:12:11.459004531 +0200
@@ -0,0 +1,75 @@
+/* Software floating-point emulation.
+   Truncate bfloat16 into IEEE half.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "half.h"
+#include "brain.h"
+#include "single.h"
+
+/* BFtype and HFtype are unordered, neither is a superset or subset
+   of each other. Convert BFtype to SFtype (lossless) and then
+   truncate to HFtype. */
+
+HFtype
+__truncbfhf2 (BFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_H (A);
+  FP_DECL_S (B);
+  FP_DECL_B (R);
+  SFtype b;
+  HFtype r;
+
+  FP_INIT_ROUNDMODE;
+  /* Optimize BFtype to SFtype conversion to simple left shift
+     by 16 if possible, we don't need to raise exceptions on sNaN
+     here as the SFtype to HFtype truncation should do that too. */
+  if (sizeof (BFtype) == 2
+      && sizeof (unsigned short) == 2
+      && sizeof (SFtype) == 4
+      && sizeof (unsigned int) == 4)
+    {
+      union { BFtype a; unsigned short b; } u1;
+      union { SFtype a; unsigned int b; } u2;
+      u1.a = a;
+      u2.b = (u1.b << 8) << 8;
+      b = u2.a;
+    }
+  else
+    {
+      FP_UNPACK_RAW_B (A, a);
+      FP_EXTEND (S, B, 1, 1, B, A);
+      FP_PACK_RAW_S (b, B);
+    }
+  FP_UNPACK_SEMIRAW_S (B, b);
+  FP_TRUNC (H, S, 1, 1, R, B);
+  FP_PACK_SEMIRAW_H (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncxfbf2.c.jj 2022-09-22 15:45:56.211621629 +0200
+++ libgcc/soft-fp/truncxfbf2.c 2022-09-22 16:02:03.205454405 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE extended into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "extended.h"
+
+BFtype
+__truncxfbf2 (XFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_E (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_E (A, a);
+#if _FP_W_TYPE_SIZE < 64
+  FP_TRUNC (B, E, 1, 4, R, A);
+#else
+  FP_TRUNC (B, E, 1, 2, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/trunchfbf2.c.jj 2022-09-22 15:59:01.321931320 +0200
+++ libgcc/soft-fp/trunchfbf2.c 2022-09-22 17:11:28.729588880 +0200
@@ -0,0 +1,58 @@
+/* Software floating-point emulation.
+   Truncate IEEE half into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "half.h"
+#include "single.h"
+
+/* BFtype and HFtype are unordered, neither is a superset or subset
+   of each other. Convert HFtype to SFtype (lossless) and then
+   truncate to BFtype. */
+
+BFtype
+__trunchfbf2 (HFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_H (A);
+  FP_DECL_S (B);
+  FP_DECL_B (R);
+  SFtype b;
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_RAW_H (A, a);
+  FP_EXTEND (S, H, 1, 1, B, A);
+  FP_PACK_RAW_S (b, B);
+  FP_UNPACK_SEMIRAW_S (B, b);
+  FP_TRUNC (B, S, 1, 1, R, B);
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncdfbf2.c.jj 2022-09-22 15:40:15.303253337 +0200
+++ libgcc/soft-fp/truncdfbf2.c 2022-09-22 15:41:55.083897689 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE double into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "double.h"
+
+BFtype
+__truncdfbf2 (DFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_D (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_D (A, a);
+#if _FP_W_TYPE_SIZE < _FP_FRACBITS_D
+  FP_TRUNC (B, D, 1, 2, R, A);
+#else
+  FP_TRUNC (B, D, 1, 1, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/trunctfbf2.c.jj 2022-09-22 15:44:14.924997754 +0200
+++ libgcc/soft-fp/trunctfbf2.c 2022-09-22 15:44:45.694579708 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE quad into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "quad.h"
+
+BFtype
+__trunctfbf2 (TFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_Q (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_Q (A, a);
+#if _FP_W_TYPE_SIZE < 64
+  FP_TRUNC (B, Q, 1, 4, R, A);
+#else
+  FP_TRUNC (B, Q, 1, 2, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/extendbfsf2.c.jj 2022-09-22 16:27:01.378339625 +0200
+++ libgcc/soft-fp/extendbfsf2.c 2022-09-22 16:27:46.379725593 +0200
@@ -0,0 +1,49 @@
+/* Software floating-point emulation.
+   Return a bfloat16 converted to IEEE single
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file. (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#define FP_NO_EXACT_UNDERFLOW
+#include "soft-fp.h"
+#include "brain.h"
+#include "single.h"
+
+SFtype
+__extendbfsf2 (BFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_B (A);
+  FP_DECL_S (R);
+  SFtype r;
+
+  FP_INIT_EXCEPTIONS;
+  FP_UNPACK_RAW_B (A, a);
+  FP_EXTEND (S, B, 1, 1, R, A);
+  FP_PACK_RAW_S (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/config/i386/t-softfp.jj 2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/t-softfp 2022-09-22 16:38:31.639921214 +0200
@@ -6,8 +6,9 @@ LIB2FUNCS_EXCLUDE += $(libgcc2-hf-functi
 libgcc2-hf-extras = $(addsuffix .c, $(libgcc2-hf-functions))
 LIB2ADD += $(addprefix $(srcdir)/config/i386/, $(libgcc2-hf-extras))
 
-softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf
-softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf
+softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf bfsf
+softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf \
+                      tfbf xfbf dfbf sfbf hfbf bfhf
 
 softfp_extras += eqhf2
 
@@ -20,6 +21,8 @@ CFLAGS-truncsfhf2.c += -msse2
 CFLAGS-truncdfhf2.c += -msse2
 CFLAGS-truncxfhf2.c += -msse2
 CFLAGS-trunctfhf2.c += -msse2
+CFLAGS-truncbfhf2.c += -msse2
+CFLAGS-trunchfbf2.c += -msse2
 
 CFLAGS-eqhf2.c += -msse2
 CFLAGS-_divhc3.c += -msse2
--- libgcc/config/i386/libgcc-glibc.ver.jj 2022-01-11 23:11:23.723271422 +0100
+++ libgcc/config/i386/libgcc-glibc.ver 2022-09-22 16:41:26.599448819 +0200
@@ -214,3 +214,14 @@ GCC_12.0.0 {
   __trunctfhf2
   __truncxfhf2
 }
+
+%inherit GCC_13.0.0 GCC_12.0.0
+GCC_13.0.0 {
+  __extendbfsf2
+  __truncdfbf2
+  __truncsfbf2
+  __trunctfbf2
+  __truncxfbf2
+  __trunchfbf2
+  __truncbfhf2
+}
--- libgcc/config/i386/64/sfp-machine.h.jj 2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/64/sfp-machine.h 2022-09-22 16:44:45.897627866 +0200
@@ -14,6 +14,7 @@ typedef unsigned int UTItype __attribute
 #define _FP_DIV_MEAT_Q(R,X,Y) _FP_DIV_MEAT_2_udiv(Q,R,X,Y)
 
 #define _FP_NANFRAC_H _FP_QNANBIT_H
+#define _FP_NANFRAC_B _FP_QNANBIT_B
 #define _FP_NANFRAC_S _FP_QNANBIT_S
 #define _FP_NANFRAC_D _FP_QNANBIT_D
 #define _FP_NANFRAC_E _FP_QNANBIT_E, 0
--- libgcc/config/i386/sfp-machine.h.jj 2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/sfp-machine.h 2022-09-22 16:46:16.130350681 +0200
@@ -18,6 +18,7 @@ typedef int __gcc_CMPtype __attribute__
 #define _FP_QNANNEGATEDP 0
 
 #define _FP_NANSIGN_H 1
+#define _FP_NANSIGN_B 1
 #define _FP_NANSIGN_S 1
 #define _FP_NANSIGN_D 1
 #define _FP_NANSIGN_E 1
--- libgcc/config/i386/32/sfp-machine.h.jj 2021-12-30 15:12:44.110138070 +0100
+++ libgcc/config/i386/32/sfp-machine.h 2022-09-22 16:44:26.786898371 +0200
@@ -87,6 +87,7 @@
 #define _FP_DIV_MEAT_Q(R,X,Y) _FP_DIV_MEAT_4_udiv(Q,R,X,Y)
 
 #define _FP_NANFRAC_H _FP_QNANBIT_H
+#define _FP_NANFRAC_B _FP_QNANBIT_B
 #define _FP_NANFRAC_S _FP_QNANBIT_S
 #define _FP_NANFRAC_D _FP_QNANBIT_D, 0
 /* Even if XFmode is 12byte, we have to pad it to