From patchwork Mon May 23 18:12:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vineet Gupta <vineetg@rivosinc.com>
X-Patchwork-Id: 54306
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id E24B63856250
	for <patchwork@sourceware.org>; Mon, 23 May 2022 18:12:31 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-pf1-x436.google.com (mail-pf1-x436.google.com
 [IPv6:2607:f8b0:4864:20::436])
 by sourceware.org (Postfix) with ESMTPS id 5A1E53858C56
 for <gcc-patches@gcc.gnu.org>; Mon, 23 May 2022 18:12:14 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5A1E53858C56
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-pf1-x436.google.com with SMTP id 202so7390998pfu.0
 for <gcc-patches@gcc.gnu.org>; Mon, 23 May 2022 11:12:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=from:to:cc:subject:date:message-id:mime-version
 :content-transfer-encoding;
 bh=JWJNS+IthRebg0KC6SxeBRJJmONvQ54Z1gO8RExodD8=;
 b=Ybm29PBkjJcOvcH4tvCyAwv6ATTO02YA4TbupHRB8yCLCAB8z5JU5bCWLhm4Qy7KSY
 jNgQq9ANmfBSdYss7byHHumBfMuJov43hKLDvVtDN9tZtFQzPqkhGSWGw1Ct4FLibehD
 KZ1xe50arIaqB2Ke3zuIwo1fXMhUW3Fm66UkooAfJ2PLkI81BQM5MKpRw2TG0WKVGYiC
 slbhd6SUw+Y/nKzG0FJErt4bWVpmkEyiFINJXZ/FtNZ3UawudNaSIVaoEn6ho3tkN7KP
 hs/EHjjOUi+yq577LWgpuDO94JL/D/IzMVlmHbv0JzkzBziYbX6HZHgTpXDqfBxZ7Dx3
 c4Vg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version
 :content-transfer-encoding;
 bh=JWJNS+IthRebg0KC6SxeBRJJmONvQ54Z1gO8RExodD8=;
 b=NwIeXnxY5RG+xQK9IFJRl6Y9aOcooJ0OKfQWIw9YyE/ItIrloAw4F/KWqwyJYXNSXL
 wI0KSMQQS+07LSxfsekJDqDA7HeGhCbC1LxsodZ7tTXeALu3YC126z+deb8n4kYx2jmZ
 TPmuvHlWNXbVpS6I7V7R3UuSclTB4iY4aOjsnO+D4W3v74Mit+IyUSdar4K3zgw2tOgE
 kLk5Tl84VYiD30Wg3fZ2ChsoJrr4OvCBtJ1kyGICi/Ne+0ca+ucOK1XdZm+O9Tj3i8yy
 nbhjytk6HHbdHGbzcYUuiH9xO/ncqen9+7qsGb2DizYUtQX5qSqgdxJSy9ANeL6X9QMr
 BA6Q==
X-Gm-Message-State: AOAM5330+8MT+3IybTJpK5gaE/6DbgtREQskV+hTx9MfbifJ3mekN9Uu
 Y9dUxSJvUZiAGakebIPDUKV5oUqwmR1Z3g==
X-Google-Smtp-Source: 
 ABdhPJymA31wMVvihHoMNHJb28lArkZb1yyCGZTR7/xlS9CvkgB1GVERigtEGedvYA+LkQE5oo2qsw==
X-Received: by 2002:a63:5:0:b0:3c6:dcb2:428 with SMTP id
 5-20020a630005000000b003c6dcb20428mr21146615pga.73.1653329533009;
 Mon, 23 May 2022 11:12:13 -0700 (PDT)
Received: from vineetg-framework.ba.rivosinc.com
 (c-24-4-73-83.hsd1.ca.comcast.net. [24.4.73.83])
 by smtp.gmail.com with ESMTPSA id
 b4-20020a170902d50400b0015e8d4eb1dcsm5511563plg.38.2022.05.23.11.12.11
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 23 May 2022 11:12:12 -0700 (PDT)
From: Vineet Gupta <vineetg@rivosinc.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] [PR/target 105666] RISC-V: Inhibit FP <--> int register moves
 via tune param
Date: Mon, 23 May 2022 11:12:09 -0700
Message-Id: <20220523181209.2208136-1-vineetg@rivosinc.com>
X-Mailer: git-send-email 2.32.0
MIME-Version: 1.0
X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Cc: Andrew Waterman <andrew@sifive.com>, Vineet Gupta <vineetg@rivosinc.com>,
 kito.cheng@gmail.com, Philipp Tomsich <philipp.tomsich@vrull.eu>,
 gnu-toolchain@rivosinc.com
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

Under extreme register pressure, compiler can use FP <--> int
moves as a cheap alternate to spilling to memory.
This was seen with SPEC2017 FP benchmark 507.cactu:
ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()

|	fmv.d.x	fa5,s9	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
| .LVL325:
|	ld	s9,184(sp)		# _12469, %sfp
| ...
| .LVL339:
|	fmv.x.d	s4,fa5	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
|

The FMV instructions could be costlier (than stack spill) on certain
micro-architectures, thus this needs to be a per-cpu tunable
(default being to inhibit on all existing RV cpus).

Testsuite run with new test reports 10 failures without the fix
corresponding to the build variations of pr105666.c

| 		=== gcc Summary ===
|
| # of expected passes		123318   (+10)
| # of unexpected failures	34       (-10)
| # of unexpected successes	4
| # of expected failures	780
| # of unresolved testcases	4
| # of unsupported tests	2796

gcc/Changelog:

	* config/riscv/riscv.cc: (struct riscv_tune_param): Add
	  fmv_cost.
	(rocket_tune_info): Add default fmv_cost 8.
	(sifive_7_tune_info): Ditto.
	(thead_c906_tune_info): Ditto.
	(optimize_size_tune_info): Ditto.
	(riscv_register_move_cost): Use fmv_cost for int<->fp moves.

gcc/testsuite/Changelog:

	* gcc.target/riscv/pr105666.c: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
 gcc/config/riscv/riscv.cc                 |  9 ++++
 gcc/testsuite/gcc.target/riscv/pr105666.c | 55 +++++++++++++++++++++++
 2 files changed, 64 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr105666.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ee756aab6940..f3ac0d8865f0 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -220,6 +220,7 @@ struct riscv_tune_param
   unsigned short issue_rate;
   unsigned short branch_cost;
   unsigned short memory_cost;
+  unsigned short fmv_cost;
   bool slow_unaligned_access;
 };
 
@@ -285,6 +286,7 @@ static const struct riscv_tune_param rocket_tune_info = {
   1,						/* issue_rate */
   3,						/* branch_cost */
   5,						/* memory_cost */
+  8,						/* fmv_cost */
   true,						/* slow_unaligned_access */
 };
 
@@ -298,6 +300,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   2,						/* issue_rate */
   4,						/* branch_cost */
   3,						/* memory_cost */
+  8,						/* fmv_cost */
   true,						/* slow_unaligned_access */
 };
 
@@ -311,6 +314,7 @@ static const struct riscv_tune_param thead_c906_tune_info = {
   1,            /* issue_rate */
   3,            /* branch_cost */
   5,            /* memory_cost */
+  8,		/* fmv_cost */
   false,            /* slow_unaligned_access */
 };
 
@@ -324,6 +328,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
   1,						/* issue_rate */
   1,						/* branch_cost */
   2,						/* memory_cost */
+  8,						/* fmv_cost */
   false,					/* slow_unaligned_access */
 };
 
@@ -4737,6 +4742,10 @@ static int
 riscv_register_move_cost (machine_mode mode,
 			  reg_class_t from, reg_class_t to)
 {
+  if ((from == FP_REGS && to == GR_REGS) ||
+      (from == GR_REGS && to == FP_REGS))
+    return tune_param->fmv_cost;
+
   return riscv_secondary_memory_needed (mode, from, to) ? 8 : 2;
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/pr105666.c b/gcc/testsuite/gcc.target/riscv/pr105666.c
new file mode 100644
index 000000000000..904f3bc0763f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr105666.c
@@ -0,0 +1,55 @@
+/* Shamelessly plugged off gcc/testsuite/gcc.c-torture/execute/pr28982a.c.  
+
+   The idea is to induce high register pressure for both int/fp registers
+   so that they spill. By default FMV instructions would be used to stash
+   int reg to a fp reg (and vice-versa) but that could be costlier than
+   spilling to stack.  */
+
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g -ffast-math" } */
+
+#define NITER 4
+#define NVARS 20
+#define MULTI(X) \
+  X( 0), X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 7), X( 8), X( 9), \
+  X(10), X(11), X(12), X(13), X(14), X(15), X(16), X(17), X(18), X(19)
+
+#define DECLAREI(INDEX) inc##INDEX = incs[INDEX]
+#define DECLAREF(INDEX) *ptr##INDEX = ptrs[INDEX], result##INDEX = 5
+#define LOOP(INDEX) result##INDEX += result##INDEX * (*ptr##INDEX), ptr##INDEX += inc##INDEX
+#define COPYOUT(INDEX) results[INDEX] = result##INDEX
+
+double *ptrs[NVARS];
+double results[NVARS];
+int incs[NVARS];
+
+void __attribute__((noinline))
+foo (int n)
+{
+  int MULTI (DECLAREI);
+  double MULTI (DECLAREF);
+  while (n--)
+    MULTI (LOOP);
+  MULTI (COPYOUT);
+}
+
+double input[NITER * NVARS];
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < NVARS; i++)
+    ptrs[i] = input + i, incs[i] = i;
+  for (i = 0; i < NITER * NVARS; i++)
+    input[i] = i;
+  foo (NITER);
+  for (i = 0; i < NVARS; i++)
+    if (results[i] != i * NITER * (NITER + 1) / 2)
+      return 1;
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
+/* { dg-final { scan-assembler-not "\tfmv\\.x\\.d\t" } } */