From patchwork Mon Feb 26 14:26:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bhushan Attarde <bhushan.attarde@imgtec.com>
X-Patchwork-Id: 86382
Return-Path: <gdb-patches-bounces+patchwork=sourceware.org@sourceware.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 801943858429
	for <patchwork@sourceware.org>; Mon, 26 Feb 2024 14:27:28 +0000 (GMT)
X-Original-To: gdb-patches@sourceware.org
Delivered-To: gdb-patches@sourceware.org
Received: from mx08-00376f01.pphosted.com (mx08-00376f01.pphosted.com
 [91.207.212.86])
 by sourceware.org (Postfix) with ESMTPS id 7BA3C3858D28
 for <gdb-patches@sourceware.org>; Mon, 26 Feb 2024 14:26:47 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7BA3C3858D28
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=imgtec.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=imgtec.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7BA3C3858D28
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=91.207.212.86
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708957610; cv=none;
 b=CGUBmvNNYife/KL5icand4Re+hDZIoF4azbCmAvwdACWtyc2lp3dZYXNW+MLSJd5E0HbbuogWbpinmKM7Nol7C/leSrockCoxxi04lp5M33mT8PwPpPRj5JFoxqQekI0Nc9c0qZ+dAlhqjtNPt65T+7c+EmuCcFpad16tYvLcIo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1708957610; c=relaxed/simple;
 bh=igwUGyVdw/xWvxmH2E/ZtsfQxW6UTyjCSc+l5/yXhDA=;
 h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;
 b=cdFVTKRp2yJpZhCD4ZHPOcJ6kcyIKV9SMMlj3lfuDWSuxYDs1AiDcP48B0I46e2VZRfK7mt7GvHIQRLtdaCDEkll+rQ1YOzJ/VmwdsNO+FF9E548adXLQrTxyrp14LBQ0/kyDWlFnXLGuxTV5DbWarLtpAWwJUDL7Nunk7dMn9o=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from pps.filterd (m0168888.ppops.net [127.0.0.1])
 by mx08-00376f01.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id
 41Q86eF5005999; Mon, 26 Feb 2024 14:26:42 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=imgtec.com; h=
 from:to:cc:subject:date:message-id:mime-version
 :content-transfer-encoding:content-type; s=dk201812; bh=GNPJRUu9
 pW8DXYa+Vh/aYrLX2HIsg3gESwkUJh8zdaM=; b=uGTICBrMbvvNDTEH9vz2SsKK
 8Awa6sOjEk5GMzAfy04TyIBd2JQoBd7Bc5rCe8Ci4o4aCi9z73A4CtjmZczz1vL1
 P65P2w08MRG21xWERYOoIYNsCp5bSyIkBOZMF7Elc1lK1RILbsb34FLAeyhl9Ero
 vUM7WaCYu2yKoM56fYb41Esf1xyyiI2sUePG9VeJx6zsEoZpivuzYlAR2awQU1je
 mWcmLtXx0y1mnBGXiut4j194NBz+rC7Pt005Ej5SFAMt9igQaH21wDM+lobmSeos
 cE/Tg5yIeaOYh/oInFTtnrViQanUZTqn+oOHFlL0URNQi+w5M48YEaF0pD7jRQ==
Received: from hhmail05.hh.imgtec.org ([217.156.249.195])
 by mx08-00376f01.pphosted.com (PPS) with ESMTPS id 3wf7kssr69-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT);
 Mon, 26 Feb 2024 14:26:42 +0000 (GMT)
Received: from hhbattarde.hh.imgtec.org (10.100.136.78) by
 HHMAIL05.hh.imgtec.org (10.100.10.120) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.2507.35; Mon, 26 Feb 2024 14:26:41 +0000
From: <bhushan.attarde@imgtec.com>
To: <gdb-patches@sourceware.org>
CC: <aburgess@redhat.com>, <vapier@gentoo.org>, <Jaydeep.Patil@imgtec.com>,
 Bhushan Attarde <bhushan.attarde@imgtec.com>
Subject: [PATCH 04/11] sim: riscv: Add single precision floating-point MAC
 instructions
Date: Mon, 26 Feb 2024 14:26:21 +0000
Message-ID: <20240226142628.1629048-1-bhushan.attarde@imgtec.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
X-Originating-IP: [10.100.136.78]
X-ClientProxiedBy: HHMAIL05.hh.imgtec.org (10.100.10.120) To
 HHMAIL05.hh.imgtec.org (10.100.10.120)
X-EXCLAIMER-MD-CONFIG: 15a78312-3e47-46eb-9010-2e54d84a9631
X-Proofpoint-GUID: YLR_MtrpW4FAQb9_3i7Gz89Eko4OIRIH
X-Proofpoint-ORIG-GUID: YLR_MtrpW4FAQb9_3i7Gz89Eko4OIRIH
X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org

From: Bhushan Attarde <bhushan.attarde@imgtec.com>

Add simulation of the following single-precision floating-point
multiply-accumulate instructions: fmadd.s, fnmadd.s, fmsub.s and fnmsub.s.

Add test file sim/testsuite/riscv/s-basic-arith.s to test these instructions.
---
 sim/riscv/sim-main.c                | 171 +++++++++++++++++++++++++++-
 sim/testsuite/riscv/s-basic-arith.s |  80 +++++++++++++
 2 files changed, 246 insertions(+), 5 deletions(-)
 create mode 100644 sim/testsuite/riscv/s-basic-arith.s

diff --git a/sim/riscv/sim-main.c b/sim/riscv/sim-main.c
index 0e873895f76..dd91431ad12 100644
--- a/sim/riscv/sim-main.c
+++ b/sim/riscv/sim-main.c
@@ -25,6 +25,7 @@
 #include "defs.h"
 
 #include <math.h>
+#include <fenv.h>
 #include <inttypes.h>
 #include <time.h>
 
@@ -71,12 +72,25 @@ static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
 #define FCSR_DZ		0x8	/* Divide by zero.  */
 #define FCSR_NV		0x10	/* Invalid Operation.  */
 
+#define RNE	0x000
+#define RTZ	0x001
+#define RDN	0x002
+#define RUP	0x003
+#define RMM	0x004
+#define DYN	0x007
+
+#define MASK_RM (OP_MASK_RM << OP_SH_RM)
+
 #define FEQ	1
 #define FLT	2
 #define FLE	3
 #define FCLASS	4
 #define FMIN	5
 #define FMAX	6
+#define FMADD	7
+#define FMSUB	8
+#define FNMADD	9
+#define FNMSUB	10
 
 static INLINE void
 store_rd (SIM_CPU *cpu, int rd, unsigned_word val)
@@ -721,6 +735,45 @@ mulhsu (int64_t a, uint64_t b)
   return negate ? ~res + (a * b == 0) : res;
 }
 
+/* Handle the rounding modes.  */
+static int
+set_riscv_rounding_mode (int rm)
+{
+  int old_rm = fegetround ();
+
+  /* Instruction does not set rm.  */
+  if (rm == -1)
+    return old_rm;
+
+  if (rm == RNE)
+    fesetround (FE_TONEAREST);
+  else if (rm == RTZ)
+    fesetround (FE_TOWARDZERO);
+  else if (rm == RDN)
+    fesetround (FE_DOWNWARD);
+  else if (rm == RUP)
+    fesetround (FE_UPWARD);
+  else
+    {
+      /* No direct match for RMM.  Simulate it.  */
+    }
+
+  return old_rm;
+}
+
+/* Checks whether the fractional part of a floating-point number is
+   exactly 0.5.  If the fractional part is 0.5, the function returns 1;
+   otherwise, it returns 0.  */
+static int
+is_float_halfway (float value)
+{
+  float frac_part, int_part;
+  frac_part = modff (value, &int_part);
+  if (fabsf (frac_part) == 0.5f)
+    return 1;
+  return 0;
+}
+
 /* Handle single precision floating point compare instructions.  */
 static void
 float32_compare (SIM_CPU *cpu, int rd, int rs1, int rs2, int flags)
@@ -829,22 +882,58 @@ float32_compare (SIM_CPU *cpu, int rd, int rs1, int rs2, int flags)
 
 /* Handle single precision floating point math instructions.  */
 static void
-float32_math (SIM_CPU *cpu, int rd, int rs1, int rs2, int flags)
+float32_math (SIM_CPU *cpu, int rd, int rs1, int rs2,
+	      int rs3, int rm, int flags)
 {
   struct riscv_sim_cpu *riscv_cpu = RISCV_SIM_CPU (cpu);
-  float a, b, result = 0;
-  uint32_t rs1_bits, rs2_bits, rd_bits;
+  float a, b, c, result = 0;
+  int old_rm, old_except, new_except;
+  uint32_t rs1_bits, rs2_bits, rs3_bits, rd_bits;
   const char *frd_name = riscv_fpr_names_abi[rd];
   const char *frs1_name = riscv_fpr_names_abi[rs1];
   const char *frs2_name = riscv_fpr_names_abi[rs2];
+  const char *frs3_name = riscv_fpr_names_abi[rs3];
+
+  if (rm == DYN)
+    rm = riscv_cpu->csr.frm;
+
+  old_rm = set_riscv_rounding_mode (rm);
+  old_except = fetestexcept (FE_ALL_EXCEPT);
 
   rs1_bits = (uint32_t) riscv_cpu->fpregs[rs1];
   memcpy (&a, &rs1_bits, sizeof (a));
   rs2_bits = (uint32_t) riscv_cpu->fpregs[rs2];
   memcpy (&b, &rs2_bits, sizeof (b));
 
+  if (flags == FMADD || flags == FNMADD
+      || flags == FMSUB || flags == FNMSUB)
+    {
+      rs3_bits = (uint32_t) riscv_cpu->fpregs[rs3];
+      memcpy (&c, &rs3_bits, sizeof (c));
+    }
+
   switch (flags)
     {
+    case FMADD:
+      TRACE_INSN (cpu, "fmadd.s %s, %s, %s, %s, rm=%d;",
+		  frd_name, frs1_name, frs2_name, frs3_name, rm);
+      result = fmaf (a, b, c);
+      break;
+    case FNMADD:
+      TRACE_INSN (cpu, "fnmadd.s %s, %s, %s, %s, rm=%d;",
+		  frd_name, frs1_name, frs2_name, frs3_name, rm);
+      result = fmaf (-a, b, -c);	/* -(a * b) - c.  */
+      break;
+    case FMSUB:
+      TRACE_INSN (cpu, "fmsub.s %s, %s, %s, %s, rm=%d;",
+		  frd_name, frs1_name, frs2_name, frs3_name, rm);
+      result = fmaf (a, b, -c);
+      break;
+    case FNMSUB:
+      TRACE_INSN (cpu, "fnmsub.s %s, %s, %s, %s, rm=%d;",
+		  frd_name, frs1_name, frs2_name, frs3_name, rm);
+      result = fmaf (-a, b, c);	/* -(a * b) + c.  */
+      break;
     case FMAX:
       TRACE_INSN (cpu, "fmax.s %s, %s, %s;", frd_name, frs1_name, frs2_name);
       result = fmaxf (a, b);
@@ -855,9 +944,63 @@ float32_math (SIM_CPU *cpu, int rd, int rs1, int rs2, int flags)
       break;
     }
 
+  if (rm == RMM)
+    {
+      if (is_float_halfway (result))
+	{
+	  if (result > 0)
+	    result = nextafterf (result, INFINITY);
+	  else
+	    result = nextafterf (result, -INFINITY);
+	}
+    }
+
   /* Store result.  */
   memcpy (&rd_bits, &result, sizeof (result));
   store_fp (cpu, rd, rd_bits);
+
+  /* Restore rounding mode.  */
+  fesetround (old_rm);
+
+  /* Set exception.  */
+  new_except = fetestexcept (FE_ALL_EXCEPT);
+
+  if (old_except != new_except)
+    {
+      if (new_except & FE_OVERFLOW)
+	{
+	  riscv_cpu->csr.fcsr |= FCSR_OF;
+	  riscv_cpu->csr.fflags |= FCSR_OF;
+	  TRACE_REGISTER (cpu, "wrote CSR fcsr |= OF");
+	}
+      if (new_except & FE_UNDERFLOW)
+	{
+	  riscv_cpu->csr.fcsr |= FCSR_UF;
+	  riscv_cpu->csr.fflags |= FCSR_UF;
+	  TRACE_REGISTER (cpu, "wrote CSR fcsr |= UF");
+	}
+      if (new_except & FE_INEXACT)
+	{
+	  riscv_cpu->csr.fcsr |= FCSR_NX;
+	  riscv_cpu->csr.fflags |= FCSR_NX;
+	  TRACE_REGISTER (cpu, "wrote CSR fcsr |= NX");
+	}
+      if (new_except & FE_DIVBYZERO)
+	{
+	  riscv_cpu->csr.fcsr |= FCSR_DZ;
+	  riscv_cpu->csr.fflags |= FCSR_DZ;
+	  TRACE_REGISTER (cpu, "wrote CSR fcsr |= DZ");
+	}
+      if (new_except & FE_INVALID)
+	{
+	  riscv_cpu->csr.fcsr |= FCSR_NV;
+	  riscv_cpu->csr.fflags |= FCSR_NV;
+	  TRACE_REGISTER (cpu, "wrote CSR fcsr |= NV");
+	}
+
+      feclearexcept (FE_ALL_EXCEPT);
+      feraiseexcept (old_except);
+    }
 }
 
 /* Simulate single precision floating point instructions.  */
@@ -869,6 +1012,8 @@ execute_f (SIM_CPU *cpu, unsigned_word iw, const struct riscv_opcode *op)
   int rd = (iw >> OP_SH_RD) & OP_MASK_RD;
   int rs1 = (iw >> OP_SH_RS1) & OP_MASK_RS1;
   int rs2 = (iw >> OP_SH_RS2) & OP_MASK_RS2;
+  int rs3 = (iw >> OP_SH_RS3) & OP_MASK_RS3;
+  int rm = (iw >> OP_SH_RM) & OP_MASK_RM;
   const char *frd_name = riscv_fpr_names_abi[rd];
   const char *rd_name = riscv_gpr_names_abi[rd];
   const char *frs1_name = riscv_fpr_names_abi[rs1];
@@ -959,10 +1104,10 @@ execute_f (SIM_CPU *cpu, unsigned_word iw, const struct riscv_opcode *op)
 	break;
       }
     case MATCH_FMIN_S:
-      float32_math (cpu, rd, rs1, rs2, FMIN);
+      float32_math (cpu, rd, rs1, rs2, 0, -1, FMIN);
       break;
     case MATCH_FMAX_S:
-      float32_math (cpu, rd, rs1, rs2, FMAX);
+      float32_math (cpu, rd, rs1, rs2, 0, -1, FMAX);
       break;
     case MATCH_FRCSR:
       TRACE_INSN (cpu, "frcsr %s;", rd_name);
@@ -1012,6 +1157,22 @@ execute_f (SIM_CPU *cpu, unsigned_word iw, const struct riscv_opcode *op)
       riscv_cpu->csr.fcsr |= rs1 & 0x1f;
       TRACE_REGISTER (cpu, "wrote CSR fcsr = %#" PRIxTW, riscv_cpu->csr.fcsr);
       break;
+    case MATCH_FMADD_S:
+    case MATCH_FMADD_S | MASK_RM:
+      float32_math (cpu, rd, rs1, rs2, rs3, rm, FMADD);
+      break;
+    case MATCH_FNMADD_S:
+    case MATCH_FNMADD_S | MASK_RM:
+      float32_math (cpu, rd, rs1, rs2, rs3, rm, FNMADD);
+      break;
+    case MATCH_FMSUB_S:
+    case MATCH_FMSUB_S | MASK_RM:
+      float32_math (cpu, rd, rs1, rs2, rs3, rm, FMSUB);
+      break;
+    case MATCH_FNMSUB_S:
+    case MATCH_FNMSUB_S | MASK_RM:
+      float32_math (cpu, rd, rs1, rs2, rs3, rm, FNMSUB);
+      break;
     default:
       TRACE_INSN (cpu, "UNHANDLED INSN: %s", op->name);
       sim_engine_halt (sd, cpu, NULL, riscv_cpu->pc, sim_signalled, SIM_SIGILL);
diff --git a/sim/testsuite/riscv/s-basic-arith.s b/sim/testsuite/riscv/s-basic-arith.s
new file mode 100644
index 00000000000..a05a0d0a2c3
--- /dev/null
+++ b/sim/testsuite/riscv/s-basic-arith.s
@@ -0,0 +1,80 @@
+# Single precision basic arithmetic tests.
+# mach: riscv32 riscv64
+# sim(riscv32): --model RV32IF
+# sim(riscv64): --model RV64ID
+# ld(riscv32): -m elf32lriscv
+# ld(riscv64): -m elf64lriscv
+# as(riscv32): -march=rv32if
+# as(riscv64): -march=rv64id
+
+.include "testutils.inc"
+
+	.section	.data
+	.align 2
+
+_arg1:
+	.float -12.5
+
+_arg2:
+	.float 2.5
+
+_arg3:
+	.float 7.45
+
+_result:
+	.float -23.799999	# fmadd.s:  (a * b) + c
+	.float 23.7999992	# fnmadd.s: -(a * b) - c
+	.float -38.7000008	# fmsub.s:  (a * b) - c
+	.float 38.7000008	# fnmsub.s: -(a * b) + c
+
+	start
+	.option push
+	.option norelax
+	la	a0,_arg1
+	la	a1,_arg2
+	la	a2,_arg3
+	la	a3,_result
+	li	a4,1
+	.option pop
+
+	# Test fmadd instruction.
+	flw	fa0,0(a0)
+	flw	fa1,0(a1)
+	flw	fa2,0(a2)
+	flw	fa3,0(a3)
+	fmadd.s	fa4,fa0,fa1,fa2,rne
+	feq.s	a5,fa4,fa3
+	bne	a5,a4,test_fail
+
+	# Test fnmadd instruction.
+	flw	fa0,0(a0)
+	flw	fa1,0(a1)
+	flw	fa2,0(a2)
+	flw	fa3,4(a3)
+	fnmadd.s	fa4,fa0,fa1,fa2,rne
+	feq.s	a5,fa4,fa3
+	bne	a5,a4,test_fail
+
+	# Test fmsub instruction.
+	flw	fa0,0(a0)
+	flw	fa1,0(a1)
+	flw	fa2,0(a2)
+	flw	fa3,8(a3)
+	fmsub.s	fa4,fa0,fa1,fa2,rne
+	feq.s	a5,fa4,fa3
+	bne	a5,a4,test_fail
+
+	# Test fnmsub instruction.
+	flw	fa0,0(a0)
+	flw	fa1,0(a1)
+	flw	fa2,0(a2)
+	flw	fa3,12(a3)
+	fnmsub.s	fa4,fa0,fa1,fa2,rne
+	feq.s	a5,fa4,fa3
+	bne	a5,a4,test_fail
+
+test_pass:
+	pass
+
+test_fail:
+	fail