From patchwork Thu Sep 22 15:56:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek <jakub@redhat.com>
X-Patchwork-Id: 57913
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 628913857BBC
	for <patchwork@sourceware.org>; Thu, 22 Sep 2022 15:58:30 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 628913857BBC
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1663862310;
	bh=jIV2KxCUsklYxb7De7QLJuHrLxC78qwwFxK9UUOdR+o=;
	h=Date:To:Subject:References:In-Reply-To:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:
	 From;
	b=Pli8TdFQU/3lK2vrjVY3mqd4CrFHJ1nAO4w6MhkZAEdOFH1L+GKOvA9h1L+saYaUK
	 YF3O2vnIJeNb6joQimrr1JLRuTLOIdbe5HYDPUMMB1ci0WUhLE8DmUrExtFbym+nmv
	 y+7tFm6TVknjjTojoYFgfVES+/ye/lewddZa1ZHw=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTPS id ACD00385701F
 for <gcc-patches@gcc.gnu.org>; Thu, 22 Sep 2022 15:56:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org ACD00385701F
Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com
 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-356-sVHMpYJkMV2446cEN_5elg-1; Thu, 22 Sep 2022 11:56:41 -0400
X-MC-Unique: sVHMpYJkMV2446cEN_5elg-1
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com
 [10.11.54.6])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4636138173C4;
 Thu, 22 Sep 2022 15:56:41 +0000 (UTC)
Received: from tucnak.zalov.cz (unknown [10.39.192.194])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id 9E8D32166B2B;
 Thu, 22 Sep 2022 15:56:40 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1])
 by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 28MFubgI2554954
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT);
 Thu, 22 Sep 2022 17:56:38 +0200
Received: (from jakub@localhost)
 by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 28MFuaU32554953;
 Thu, 22 Sep 2022 17:56:36 +0200
Date: Thu, 22 Sep 2022 17:56:35 +0200
To: Hongtao Liu <crazylht@gmail.com>, Jonathan Wakely <jwakely@redhat.com>,
 "Joseph S. Myers" <joseph@codesourcery.com>,
 Richard Earnshaw <richard.earnshaw@arm.com>,
 Kyrylo Tkachov <kyrylo.tkachov@arm.com>, richard.sandiford@arm.com
Subject: [RFC PATCH] __trunc{tf,xf,df,sf,hf}bf2,
 __truncbfhf2 and __extendbfsf2
Message-ID: <YyyFs7w3npTxkci7@tucnak>
References: <Yx7oXsjWwmgsaj/A@tucnak>
 <CAMZc-bxB2fVAxVbBmqOj-ixTJybuX-V9dT2FvjrQR-wWRR7HkQ@mail.gmail.com>
 <Yyl/BkcvjpWW3Jyz@tucnak>
MIME-Version: 1.0
In-Reply-To: <Yyl/BkcvjpWW3Jyz@tucnak>
X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Disposition: inline
X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_MANYTO,
 KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Jakub Jelinek via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Jakub Jelinek <jakub@redhat.com>
Reply-To: Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote:
> > > The question is (mainly for aarch64, arm and x86 backend maintainers) if we
> > > shouldn't support it, in the PR there is a partial patch to do so, but
> > > the big question is if it should be supported as the __bf16 type those
> > > 3 targets use with u6__bf16 mangling and remove those *_invalid_* cases
> > > and add conversions to/from at least SFmode but probably also DFmode, TFmode
> > > and XFmode on x86 and implement arithmetics on those through conversion to
> > > SFmode, performing arithmetics there and conversion back.
> > > Conversion from BFmode to SFmode is easy, left shift by 16 and ought to be
> > > implemented inline, SFmode -> BFmode conversion is harder,
> > > I think it is roughly:
> > I'm not sure if there should be any floating point exceptions for
> > BFmode operation.
> > For x86, there's no floating point exceptions for AVX512_BF16 related
> > instructions
> 
> As long as __bf16 is just an extension, supporting or not supporting
> exceptions on sNaNs is just fine I think, but I'm afraid it is different
> for std::bfloat16_t.  If we claim we support it (define that type
> in <stdfloat>, predefine __STD_BFLOAT16_TYPE__), then it needs to follow
> ISO/IEC/IEEE 60559, and I'm afraid that means also exceptions and the like.
> While the IEEE spec doesn't cover the exact bfloat16 format, C++ talks about
> a format with these and these number of bits here and there that behaves
> like in IEEE otherwise.
> Whether we support std::bfloat16_t at all is our choice, if we do support
> it, whether we support it with __bf16 underlying type or come up with
> something different, it is up to us, and with -ffast-math/-Ofast etc.
> we can certainly use hw instructions for it which don't raise exceptions.
> 
> At least that is my limited understanding of it...

I've been playing with this a little bit and here is a soft-fp version of
IMHO everything we need for proper bfloat16 support.
In particular, I think we need all the truncating conversions from other
floating formats that a target with BFmode floating point support (currently
arm, aarch64 and x86) has, truncating conversion from BFmode to HFmode
(seems GCC when precision is the same considers conversions truncating)
and an extension from BFmode to SFmode.  Extensions from BFmode to
SF/DF/XF/TFmode are IMHO best implemented inside of GCC by performing
BFmode to SFmode conversion first and then converting SFmode to those
other formats, other arithmetics on BFmode should be implemented simply
by widening to SFmode, doing arithmetics there and then converting back.
The BF to SFmode extension can be also implemented simply by shifting
the VCEd value up by 16 bits and VCEing the result if flags say
sNaNs don't need to be handled, or IMHO if we use the extended result
in some arithmetic operation that will handle the sNaN signaling +
conversion into qNaN, similarly for SFmode to BFmode conversions
we can use hw instructions if available and we don't care about sNaNs.

The C FE has the advantage that it has excess precision support, there
we should arrange for BFmode to be always promoted to SFmode excess
precision, but C++ FE doesn't.

Also, question to ARM/AArch64/x86 maintainers is if it is ok to
add conversion and arithmetic support to the __bf16 type, or if
that type should keep to be useless and there should be another
type (some keyword or just float __attribute__((__mode__ (__BF__))))
that we'd have that support for.  Whatever type we'd use as
std::bfloat16_t should mangle as DFb16_ rather than u6__bf16 that
__bf16 currently mangles to though.

Thoughts on this?

And for Joseph, sure, the libgcc/soft-fp/ part should probably go
into glibc first and be copied from there afterwards.

Perhaps the __truncbfhf2 could be dropped and we could just on
the compiler side emit shift left by 16 before calling __truncsfhf2.


	Jakub
extern __bf16 __trunctfbf2 (_Float128);
extern __bf16 __truncxfbf2 (__float80);
extern __bf16 __truncdfbf2 (_Float64);
extern __bf16 __truncsfbf2 (_Float32);
extern __bf16 __trunchfbf2 (_Float16);
extern _Float16 __truncbfhf2 (__bf16);
extern _Float32 __extendbfsf2 (__bf16);

int
main ()
{
  volatile _Float128 tf;
  volatile __float80 xf;
  volatile _Float64 df;
  volatile _Float32 sf;
  volatile _Float16 hf;
  union { _Float32 f; unsigned int i; } u1;
  union { __bf16 f; unsigned short i; } u2;
  tf = 2.718281828459045235360287471352662498F128;
  u1.f = tf; u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = 2.718281828459045235360287471352662498W;
  u1.f = xf; u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = 2.718281828459045235360287471352662498F64;
  u1.f = df; u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = 2.718281828459045235360287471352662498F32;
  u1.f = sf; u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = 2.718281828459045235360287471352662498F16;
  u1.f = hf; u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  tf = __builtin_inff128 ();
  u1.f = tf; u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = -__builtin_infl ();
  u1.f = xf; u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = __builtin_inff64 ();
  u1.f = df; u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = -__builtin_inff32 ();
  u1.f = sf; u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = __builtin_inff16 ();
  u1.f = hf; u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  tf = __builtin_nanf128 ("");
  u1.f = tf; u2.f = __trunctfbf2 (tf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  xf = __builtin_nanl ("");
  u1.f = xf; u2.f = __truncxfbf2 (xf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  df = __builtin_nanf64 ("");
  u1.f = df; u2.f = __truncdfbf2 (df);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  sf = __builtin_nanf32 ("");
  u1.f = sf; u2.f = __truncsfbf2 (sf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  hf = __builtin_nanf16 ("");
  u1.f = hf; u2.f = __trunchfbf2 (hf);
  __builtin_printf ("%08x %04x\n", u1.i, u2.i);
  return 0;
}

--- libgcc/soft-fp/brain.h.jj	2022-09-22 15:28:04.865171729 +0200
+++ libgcc/soft-fp/brain.h	2022-09-22 15:35:11.970374554 +0200
@@ -0,0 +1,172 @@
+/* Software floating-point emulation.
+   Definitions for Brain Floating Point format (bfloat16).
+   Copyright (C) 1997-2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef SOFT_FP_BRAIN_H
+#define SOFT_FP_BRAIN_H	1
+
+#if _FP_W_TYPE_SIZE < 32
+# error "Here's a nickel kid.  Go buy yourself a real computer."
+#endif
+
+#define _FP_FRACTBITS_B		(_FP_W_TYPE_SIZE)
+
+#define _FP_FRACTBITS_DW_B	(_FP_W_TYPE_SIZE)
+
+#define _FP_FRACBITS_B		8
+#define _FP_FRACXBITS_B		(_FP_FRACTBITS_B - _FP_FRACBITS_B)
+#define _FP_WFRACBITS_B		(_FP_WORKBITS + _FP_FRACBITS_B)
+#define _FP_WFRACXBITS_B	(_FP_FRACTBITS_B - _FP_WFRACBITS_B)
+#define _FP_EXPBITS_B		8
+#define _FP_EXPBIAS_B		127
+#define _FP_EXPMAX_B		255
+
+#define _FP_QNANBIT_B		((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2))
+#define _FP_QNANBIT_SH_B	((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2+_FP_WORKBITS))
+#define _FP_IMPLBIT_B		((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1))
+#define _FP_IMPLBIT_SH_B	((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1+_FP_WORKBITS))
+#define _FP_OVERFLOW_B		((_FP_W_TYPE) 1 << (_FP_WFRACBITS_B))
+
+#define _FP_WFRACBITS_DW_B	(2 * _FP_WFRACBITS_B)
+#define _FP_WFRACXBITS_DW_B	(_FP_FRACTBITS_DW_B - _FP_WFRACBITS_DW_B)
+#define _FP_HIGHBIT_DW_B	\
+  ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_DW_B - 1) % _FP_W_TYPE_SIZE)
+
+/* The implementation of _FP_MUL_MEAT_B and _FP_DIV_MEAT_B should be
+   chosen by the target machine.  */
+
+typedef float BFtype __attribute__ ((mode (BF)));
+
+union _FP_UNION_B
+{
+  BFtype flt;
+  struct _FP_STRUCT_LAYOUT
+  {
+#if __BYTE_ORDER == __BIG_ENDIAN
+    unsigned sign : 1;
+    unsigned exp  : _FP_EXPBITS_B;
+    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
+#else
+    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
+    unsigned exp  : _FP_EXPBITS_B;
+    unsigned sign : 1;
+#endif
+  } bits;
+};
+
+#define FP_DECL_B(X)		_FP_DECL (1, X)
+#define FP_UNPACK_RAW_B(X, val)	_FP_UNPACK_RAW_1 (B, X, (val))
+#define FP_UNPACK_RAW_BP(X, val)	_FP_UNPACK_RAW_1_P (B, X, (val))
+#define FP_PACK_RAW_B(val, X)	_FP_PACK_RAW_1 (B, (val), X)
+#define FP_PACK_RAW_BP(val, X)			\
+  do						\
+    {						\
+      if (!FP_INHIBIT_RESULTS)			\
+	_FP_PACK_RAW_1_P (B, (val), X);		\
+    }						\
+  while (0)
+
+#define FP_UNPACK_B(X, val)			\
+  do						\
+    {						\
+      _FP_UNPACK_RAW_1 (B, X, (val));		\
+      _FP_UNPACK_CANONICAL (B, 1, X);		\
+    }						\
+  while (0)
+
+#define FP_UNPACK_BP(X, val)			\
+  do						\
+    {						\
+      _FP_UNPACK_RAW_1_P (B, X, (val));		\
+      _FP_UNPACK_CANONICAL (B, 1, X);		\
+    }						\
+  while (0)
+
+#define FP_UNPACK_SEMIRAW_B(X, val)		\
+  do						\
+    {						\
+      _FP_UNPACK_RAW_1 (B, X, (val));		\
+      _FP_UNPACK_SEMIRAW (B, 1, X);		\
+    }						\
+  while (0)
+
+#define FP_UNPACK_SEMIRAW_BP(X, val)		\
+  do						\
+    {						\
+      _FP_UNPACK_RAW_1_P (B, X, (val));		\
+      _FP_UNPACK_SEMIRAW (B, 1, X);		\
+    }						\
+  while (0)
+
+#define FP_PACK_B(val, X)			\
+  do						\
+    {						\
+      _FP_PACK_CANONICAL (B, 1, X);		\
+      _FP_PACK_RAW_1 (B, (val), X);		\
+    }						\
+  while (0)
+
+#define FP_PACK_BP(val, X)			\
+  do						\
+    {						\
+      _FP_PACK_CANONICAL (B, 1, X);		\
+      if (!FP_INHIBIT_RESULTS)			\
+	_FP_PACK_RAW_1_P (B, (val), X);		\
+    }						\
+  while (0)
+
+#define FP_PACK_SEMIRAW_B(val, X)		\
+  do						\
+    {						\
+      _FP_PACK_SEMIRAW (B, 1, X);		\
+      _FP_PACK_RAW_1 (B, (val), X);		\
+    }						\
+  while (0)
+
+#define FP_PACK_SEMIRAW_BP(val, X)		\
+  do						\
+    {						\
+      _FP_PACK_SEMIRAW (B, 1, X);		\
+      if (!FP_INHIBIT_RESULTS)			\
+	_FP_PACK_RAW_1_P (B, (val), X);		\
+    }						\
+  while (0)
+
+#define FP_TO_INT_B(r, X, rsz, rsg)	_FP_TO_INT (B, 1, (r), X, (rsz), (rsg))
+#define FP_TO_INT_ROUND_B(r, X, rsz, rsg)	\
+  _FP_TO_INT_ROUND (B, 1, (r), X, (rsz), (rsg))
+#define FP_FROM_INT_B(X, r, rs, rt)	_FP_FROM_INT (B, 1, X, (r), (rs), rt)
+
+/* BFmode arithmetic is not implemented.  */
+
+#define _FP_FRAC_HIGH_B(X)	_FP_FRAC_HIGH_1 (X)
+#define _FP_FRAC_HIGH_RAW_B(X)	_FP_FRAC_HIGH_1 (X)
+#define _FP_FRAC_HIGH_DW_B(X)	_FP_FRAC_HIGH_1 (X)
+
+#define FP_CMP_EQ_B(r, X, Y, ex)       _FP_CMP_EQ (B, 1, (r), X, Y, (ex))
+
+#endif /* !SOFT_FP_BRAIN_H */
--- libgcc/soft-fp/truncsfbf2.c.jj	2022-09-22 15:43:46.345386049 +0200
+++ libgcc/soft-fp/truncsfbf2.c	2022-09-22 16:02:19.940226518 +0200
@@ -0,0 +1,48 @@
+/* Software floating-point emulation.
+   Truncate IEEE single into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "single.h"
+
+BFtype
+__truncsfbf2 (SFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_S (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_S (A, a);
+  FP_TRUNC (B, S, 1, 1, R, A);
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncbfhf2.c.jj	2022-09-22 16:13:28.894300765 +0200
+++ libgcc/soft-fp/truncbfhf2.c	2022-09-22 17:12:11.459004531 +0200
@@ -0,0 +1,75 @@
+/* Software floating-point emulation.
+   Truncate bfloat16 into IEEE half.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "half.h"
+#include "brain.h"
+#include "single.h"
+
+/* BFtype and HFtype are unordered, neither is a superset or subset
+   of each other.  Convert BFtype to SFtype (lossless) and then
+   truncate to HFtype.  */
+
+HFtype
+__truncbfhf2 (BFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_H (A);
+  FP_DECL_S (B);
+  FP_DECL_B (R);
+  SFtype b;
+  HFtype r;
+
+  FP_INIT_ROUNDMODE;
+  /* Optimize BFtype to SFtype conversion to simple left shift
+     by 16 if possible, we don't need to raise exceptions on sNaN
+     here as the SFtype to HFtype truncation should do that too.  */
+  if (sizeof (BFtype) == 2
+      && sizeof (unsigned short) == 2
+      && sizeof (SFtype) == 4
+      && sizeof (unsigned int) == 4)
+    {
+      union { BFtype a; unsigned short b; } u1;
+      union { SFtype a; unsigned int b; } u2;
+      u1.a = a;
+      u2.b = (u1.b << 8) << 8;
+      b = u2.a;
+    }
+  else
+    {
+      FP_UNPACK_RAW_B (A, a);
+      FP_EXTEND (S, B, 1, 1, B, A);
+      FP_PACK_RAW_S (b, B);
+    }
+  FP_UNPACK_SEMIRAW_S (B, b);
+  FP_TRUNC (H, S, 1, 1, R, B);
+  FP_PACK_SEMIRAW_H (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncxfbf2.c.jj	2022-09-22 15:45:56.211621629 +0200
+++ libgcc/soft-fp/truncxfbf2.c	2022-09-22 16:02:03.205454405 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE extended into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "extended.h"
+
+BFtype
+__truncxfbf2 (XFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_E (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_E (A, a);
+#if _FP_W_TYPE_SIZE < 64
+  FP_TRUNC (B, E, 1, 4, R, A);
+#else
+  FP_TRUNC (B, E, 1, 2, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/trunchfbf2.c.jj	2022-09-22 15:59:01.321931320 +0200
+++ libgcc/soft-fp/trunchfbf2.c	2022-09-22 17:11:28.729588880 +0200
@@ -0,0 +1,58 @@
+/* Software floating-point emulation.
+   Truncate IEEE half into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "half.h"
+#include "single.h"
+
+/* BFtype and HFtype are unordered, neither is a superset or subset
+   of each other.  Convert HFtype to SFtype (lossless) and then
+   truncate to BFtype.  */
+
+BFtype
+__trunchfbf2 (HFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_H (A);
+  FP_DECL_S (B);
+  FP_DECL_B (R);
+  SFtype b;
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_RAW_H (A, a);
+  FP_EXTEND (S, H, 1, 1, B, A);
+  FP_PACK_RAW_S (b, B);
+  FP_UNPACK_SEMIRAW_S (B, b);
+  FP_TRUNC (B, S, 1, 1, R, B);
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/truncdfbf2.c.jj	2022-09-22 15:40:15.303253337 +0200
+++ libgcc/soft-fp/truncdfbf2.c	2022-09-22 15:41:55.083897689 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE double into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "double.h"
+
+BFtype
+__truncdfbf2 (DFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_D (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_D (A, a);
+#if _FP_W_TYPE_SIZE < _FP_FRACBITS_D
+  FP_TRUNC (B, D, 1, 2, R, A);
+#else
+  FP_TRUNC (B, D, 1, 1, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/trunctfbf2.c.jj	2022-09-22 15:44:14.924997754 +0200
+++ libgcc/soft-fp/trunctfbf2.c	2022-09-22 15:44:45.694579708 +0200
@@ -0,0 +1,52 @@
+/* Software floating-point emulation.
+   Truncate IEEE quad into bfloat16.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include "soft-fp.h"
+#include "brain.h"
+#include "quad.h"
+
+BFtype
+__trunctfbf2 (TFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_Q (A);
+  FP_DECL_B (R);
+  BFtype r;
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_SEMIRAW_Q (A, a);
+#if _FP_W_TYPE_SIZE < 64
+  FP_TRUNC (B, Q, 1, 4, R, A);
+#else
+  FP_TRUNC (B, Q, 1, 2, R, A);
+#endif
+  FP_PACK_SEMIRAW_B (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/soft-fp/extendbfsf2.c.jj	2022-09-22 16:27:01.378339625 +0200
+++ libgcc/soft-fp/extendbfsf2.c	2022-09-22 16:27:46.379725593 +0200
@@ -0,0 +1,49 @@
+/* Software floating-point emulation.
+   Return an bfloat16 converted to IEEE single
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define FP_NO_EXACT_UNDERFLOW
+#include "soft-fp.h"
+#include "brain.h"
+#include "single.h"
+
+SFtype
+__extendbfsf2 (BFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_B (A);
+  FP_DECL_S (R);
+  SFtype r;
+
+  FP_INIT_EXCEPTIONS;
+  FP_UNPACK_RAW_B (A, a);
+  FP_EXTEND (S, B, 1, 1, R, A);
+  FP_PACK_RAW_S (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
--- libgcc/config/i386/t-softfp.jj	2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/t-softfp	2022-09-22 16:38:31.639921214 +0200
@@ -6,8 +6,9 @@ LIB2FUNCS_EXCLUDE += $(libgcc2-hf-functi
 libgcc2-hf-extras = $(addsuffix .c, $(libgcc2-hf-functions))
 LIB2ADD += $(addprefix $(srcdir)/config/i386/, $(libgcc2-hf-extras))
 
-softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf
-softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf
+softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf bfsf
+softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf \
+		      tfbf xfbf dfbf sfbf hfbf bfhf
 
 softfp_extras += eqhf2
 
@@ -20,6 +21,8 @@ CFLAGS-truncsfhf2.c += -msse2
 CFLAGS-truncdfhf2.c += -msse2
 CFLAGS-truncxfhf2.c += -msse2
 CFLAGS-trunctfhf2.c += -msse2
+CFLAGS-truncbfhf2.c += -msse2
+CFLAGS-trunchfbf2.c += -msse2
 
 CFLAGS-eqhf2.c += -msse2
 CFLAGS-_divhc3.c += -msse2
--- libgcc/config/i386/libgcc-glibc.ver.jj	2022-01-11 23:11:23.723271422 +0100
+++ libgcc/config/i386/libgcc-glibc.ver	2022-09-22 16:41:26.599448819 +0200
@@ -214,3 +214,14 @@ GCC_12.0.0 {
   __trunctfhf2
   __truncxfhf2
 }
+
+%inherit GCC_13.0.0 GCC_12.0.0
+GCC_13.0.0 {
+  __extendbfsf2
+  __truncdfbf2
+  __truncsfbf2
+  __trunctfbf2
+  __truncxfbf2
+  __trunchfbf2
+  __truncbfhf2
+}
--- libgcc/config/i386/64/sfp-machine.h.jj	2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/64/sfp-machine.h	2022-09-22 16:44:45.897627866 +0200
@@ -14,6 +14,7 @@ typedef unsigned int UTItype __attribute
 #define _FP_DIV_MEAT_Q(R,X,Y)   _FP_DIV_MEAT_2_udiv(Q,R,X,Y)
 
 #define _FP_NANFRAC_H		_FP_QNANBIT_H
+#define _FP_NANFRAC_B		_FP_QNANBIT_B
 #define _FP_NANFRAC_S		_FP_QNANBIT_S
 #define _FP_NANFRAC_D		_FP_QNANBIT_D
 #define _FP_NANFRAC_E		_FP_QNANBIT_E, 0
--- libgcc/config/i386/sfp-machine.h.jj	2021-12-30 15:12:44.111138056 +0100
+++ libgcc/config/i386/sfp-machine.h	2022-09-22 16:46:16.130350681 +0200
@@ -18,6 +18,7 @@ typedef int __gcc_CMPtype __attribute__
 #define _FP_QNANNEGATEDP 0
 
 #define _FP_NANSIGN_H		1
+#define _FP_NANSIGN_B		1
 #define _FP_NANSIGN_S		1
 #define _FP_NANSIGN_D		1
 #define _FP_NANSIGN_E		1
--- libgcc/config/i386/32/sfp-machine.h.jj	2021-12-30 15:12:44.110138070 +0100
+++ libgcc/config/i386/32/sfp-machine.h	2022-09-22 16:44:26.786898371 +0200
@@ -87,6 +87,7 @@
 #define _FP_DIV_MEAT_Q(R,X,Y)   _FP_DIV_MEAT_4_udiv(Q,R,X,Y)
 
 #define _FP_NANFRAC_H		_FP_QNANBIT_H
+#define _FP_NANFRAC_B		_FP_QNANBIT_B
 #define _FP_NANFRAC_S		_FP_QNANBIT_S
 #define _FP_NANFRAC_D		_FP_QNANBIT_D, 0
 /* Even if XFmode is 12byte,  we have to pad it to