From patchwork Thu Nov 10 02:51:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 60328
Date: Wed, 9 Nov 2022 21:51:48 -0500
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool,
 "Kewen.Lin", David Edelsohn, Peter Bergner, Will Schmidt
Subject: [PATCH 5/6] PowerPC: Switch to dense math names for all MMA
 operations.
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Michael Meissner via Gcc-patches
From: Michael Meissner
Reply-To: Michael Meissner

This patch changes the assembler instruction names for MMA instructions from
the original names used on power10 to the new names used with the dense math
system.  For example, xvf64gerpp becomes dmxvf64gerpp.  The assembler will emit
the same bits for either spelling.

The patches have been tested on the following platforms.
I added the patches for PR target/107299 that I submitted on November 2nd
before doing the builds so that GCC would build on systems using IEEE 128-bit
long double.

    *	https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604834.html

Bootstrap builds and regression test runs showed no regressions on:

    1)	Power10 LE using --with-cpu=power10 --with-long-double-format=ieee;
    2)	Power10 LE using --with-cpu=power10 --with-long-double-format=ibm;
    3)	Power9 LE using --with-cpu=power9 --with-long-double-format=ibm; and
    4)	Power8 BE using --with-cpu=power8 (both 32-bit & 64-bit tested).

Can I check this patch into the GCC 13 master branch?

2022-11-09   Michael Meissner

gcc/

	* config/rs6000/mma.md (vvi4i4i8_dm): New int attribute.
	(avvi4i4i8_dm): Likewise.
	(vvi4i4i2_dm): Likewise.
	(avvi4i4i2_dm): Likewise.
	(vvi4i4_dm): Likewise.
	(avvi4i4_dm): Likewise.
	(pvi4i2_dm): Likewise.
	(apvi4i2_dm): Likewise.
	(vvi4i4i4_dm): Likewise.
	(avvi4i4i4_dm): Likewise.
	(mma_): Add support for running on DMF systems, generating the dense
	math instruction and using the dense math accumulators.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/dm-double-test.c: New test.
	* lib/target-supports.exp
	(check_effective_target_powerpc_dense_math_ok): New target test.
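
As a reading aid only (this is not part of the patch), the naming rule that
the new "_dm" int attributes spell out by hand can be sketched in C.  The
helper name dense_math_name is hypothetical, and the rule is inferred solely
from the strings listed in mma.md: the prefixed "pm..." forms get "dm"
inserted after "pm", and the non-prefixed forms get "dm" prepended.  It does
not cover renames such as xxsetaccz to dmsetaccz.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: derive the dense math
   spelling of a power10 MMA outer-product mnemonic.  Returns 0 on
   success, -1 if the result does not fit in OUT.  */
int
dense_math_name (const char *mma, char *out, size_t size)
{
  int len;

  /* Prefixed forms start with "pm"; "dm" goes after that prefix.  */
  if (strncmp (mma, "pm", 2) == 0)
    len = snprintf (out, size, "pmdm%s", mma + 2);
  else
    len = snprintf (out, size, "dm%s", mma);

  return (len < 0 || (size_t) len >= size) ? -1 : 0;
}
```

Called with "xvf64gerpp" this produces "dmxvf64gerpp", and with "pmxvi4ger8"
it produces "pmdmxvi4ger8", matching the strings in the int attributes.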
---
 gcc/config/rs6000/mma.md                      |  98 +++++++--
 .../gcc.target/powerpc/dm-double-test.c       | 194 ++++++++++++++++++
 gcc/testsuite/lib/target-supports.exp         |  19 ++
 3 files changed, 299 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/dm-double-test.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 835f34e8e00..cca1fa71f75 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -227,13 +227,22 @@ (define_int_attr apv [(UNSPEC_MMA_XVF64GERPP "xvf64gerpp")
 
 (define_int_attr vvi4i4i8       [(UNSPEC_MMA_PMXVI4GER8         "pmxvi4ger8")])
 
+(define_int_attr vvi4i4i8_dm    [(UNSPEC_MMA_PMXVI4GER8         "pmdmxvi4ger8")])
+
 (define_int_attr avvi4i4i8      [(UNSPEC_MMA_PMXVI4GER8PP       "pmxvi4ger8pp")])
 
+(define_int_attr avvi4i4i8_dm   [(UNSPEC_MMA_PMXVI4GER8PP       "pmdmxvi4ger8pp")])
+
 (define_int_attr vvi4i4i2       [(UNSPEC_MMA_PMXVI16GER2        "pmxvi16ger2")
                                  (UNSPEC_MMA_PMXVI16GER2S       "pmxvi16ger2s")
                                  (UNSPEC_MMA_PMXVF16GER2        "pmxvf16ger2")
                                  (UNSPEC_MMA_PMXVBF16GER2       "pmxvbf16ger2")])
 
+(define_int_attr vvi4i4i2_dm    [(UNSPEC_MMA_PMXVI16GER2        "pmdmxvi16ger2")
+                                 (UNSPEC_MMA_PMXVI16GER2S       "pmdmxvi16ger2s")
+                                 (UNSPEC_MMA_PMXVF16GER2        "pmdmxvf16ger2")
+                                 (UNSPEC_MMA_PMXVBF16GER2       "pmdmxvbf16ger2")])
+
 (define_int_attr avvi4i4i2      [(UNSPEC_MMA_PMXVI16GER2PP      "pmxvi16ger2pp")
                                  (UNSPEC_MMA_PMXVI16GER2SPP     "pmxvi16ger2spp")
                                  (UNSPEC_MMA_PMXVF16GER2PP      "pmxvf16ger2pp")
@@ -245,25 +254,54 @@ (define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP "pmxvi16ger2pp")
                                  (UNSPEC_MMA_PMXVBF16GER2NP     "pmxvbf16ger2np")
                                  (UNSPEC_MMA_PMXVBF16GER2NN     "pmxvbf16ger2nn")])
 
+(define_int_attr avvi4i4i2_dm   [(UNSPEC_MMA_PMXVI16GER2PP      "pmdmxvi16ger2pp")
+                                 (UNSPEC_MMA_PMXVI16GER2SPP     "pmdmxvi16ger2spp")
+                                 (UNSPEC_MMA_PMXVF16GER2PP      "pmdmxvf16ger2pp")
+                                 (UNSPEC_MMA_PMXVF16GER2PN      "pmdmxvf16ger2pn")
+                                 (UNSPEC_MMA_PMXVF16GER2NP      "pmdmxvf16ger2np")
+                                 (UNSPEC_MMA_PMXVF16GER2NN      "pmdmxvf16ger2nn")
+                                 (UNSPEC_MMA_PMXVBF16GER2PP     "pmdmxvbf16ger2pp")
+                                 (UNSPEC_MMA_PMXVBF16GER2PN     "pmdmxvbf16ger2pn")
+                                 (UNSPEC_MMA_PMXVBF16GER2NP     "pmdmxvbf16ger2np")
+                                 (UNSPEC_MMA_PMXVBF16GER2NN     "pmdmxvbf16ger2nn")])
+
 (define_int_attr vvi4i4         [(UNSPEC_MMA_PMXVF32GER         "pmxvf32ger")])
 
+(define_int_attr vvi4i4_dm      [(UNSPEC_MMA_PMXVF32GER         "pmdmxvf32ger")])
+
 (define_int_attr avvi4i4        [(UNSPEC_MMA_PMXVF32GERPP       "pmxvf32gerpp")
                                  (UNSPEC_MMA_PMXVF32GERPN       "pmxvf32gerpn")
                                  (UNSPEC_MMA_PMXVF32GERNP       "pmxvf32gernp")
                                  (UNSPEC_MMA_PMXVF32GERNN       "pmxvf32gernn")])
 
+(define_int_attr avvi4i4_dm     [(UNSPEC_MMA_PMXVF32GERPP       "pmdmxvf32gerpp")
+                                 (UNSPEC_MMA_PMXVF32GERPN       "pmdmxvf32gerpn")
+                                 (UNSPEC_MMA_PMXVF32GERNP       "pmdmxvf32gernp")
+                                 (UNSPEC_MMA_PMXVF32GERNN       "pmdmxvf32gernn")])
+
 (define_int_attr pvi4i2         [(UNSPEC_MMA_PMXVF64GER         "pmxvf64ger")])
 
+(define_int_attr pvi4i2_dm      [(UNSPEC_MMA_PMXVF64GER         "pmdmxvf64ger")])
+
 (define_int_attr apvi4i2        [(UNSPEC_MMA_PMXVF64GERPP       "pmxvf64gerpp")
                                  (UNSPEC_MMA_PMXVF64GERPN       "pmxvf64gerpn")
                                  (UNSPEC_MMA_PMXVF64GERNP       "pmxvf64gernp")
                                  (UNSPEC_MMA_PMXVF64GERNN       "pmxvf64gernn")])
 
+(define_int_attr apvi4i2_dm     [(UNSPEC_MMA_PMXVF64GERPP       "pmdmxvf64gerpp")
+                                 (UNSPEC_MMA_PMXVF64GERPN       "pmdmxvf64gerpn")
+                                 (UNSPEC_MMA_PMXVF64GERNP       "pmdmxvf64gernp")
+                                 (UNSPEC_MMA_PMXVF64GERNN       "pmdmxvf64gernn")])
+
 (define_int_attr vvi4i4i4       [(UNSPEC_MMA_PMXVI8GER4         "pmxvi8ger4")])
 
+(define_int_attr vvi4i4i4_dm    [(UNSPEC_MMA_PMXVI8GER4         "pmdmxvi8ger4")])
+
 (define_int_attr avvi4i4i4      [(UNSPEC_MMA_PMXVI8GER4PP       "pmxvi8ger4pp")
                                  (UNSPEC_MMA_PMXVI8GER4SPP      "pmxvi8ger4spp")])
 
+(define_int_attr avvi4i4i4_dm   [(UNSPEC_MMA_PMXVI8GER4PP       "pmdmxvi8ger4pp")
+                                 (UNSPEC_MMA_PMXVI8GER4SPP      "pmdmxvi8ger4spp")])
 
 ;; Vector pair support.  OOmode can only live in VSRs.
 (define_expand "movoo"
@@ -615,7 +653,10 @@ (define_insn "mma_"
                     (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")]
                     MMA_VV))]
   "TARGET_MMA"
-  " %A0,%x1,%x2"
+  "@
+   dm %A0,%x1,%x2
+    %A0,%x1,%x2
+    %A0,%x1,%x2"
   [(set_attr "type" "mma")
    (set_attr "isa" "dm,not_dm,not_dm")])
 
@@ -636,7 +677,10 @@ (define_insn "mma_"
                     (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")]
                     MMA_PV))]
   "TARGET_MMA"
-  " %A0,%x1,%x2"
+  "@
+   dm %A0,%x1,%x2
+    %A0,%x1,%x2
+    %A0,%x1,%x2"
   [(set_attr "type" "mma")
    (set_attr "isa" "dm,not_dm,not_dm")])
 
@@ -647,7 +691,10 @@ (define_insn "mma_"
                     (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")]
                     MMA_APV))]
   "TARGET_MMA"
-  " %A0,%x2,%x3"
+  "@
+   dm %A0,%x2,%x3
+    %A0,%x2,%x3
+    %A0,%x2,%x3"
   [(set_attr "type" "mma")
    (set_attr "isa" "dm,not_dm,not_dm")])
 
@@ -660,7 +707,10 @@ (define_insn "mma_"
                     (match_operand:SI 5 "u8bit_cint_operand" "n,n,n")]
                     MMA_VVI4I4I8))]
   "TARGET_MMA"
-  " %A0,%x1,%x2,%3,%4,%5"
+  "@
+   dm %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -689,7 +739,10 @@ (define_insn "mma_"
                     (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")]
                     MMA_VVI4I4I2))]
   "TARGET_MMA"
-  " %A0,%x1,%x2,%3,%4,%5"
+  "@
+    %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -704,7 +757,10 @@ (define_insn "mma_"
                     (match_operand:SI 6 "const_0_to_3_operand" "n,n,n")]
                     MMA_AVVI4I4I2))]
   "TARGET_MMA"
-  " %A0,%x2,%x3,%4,%5,%6"
+  "@
+    %A0,%x2,%x3,%4,%5,%6
+    %A0,%x2,%x3,%4,%5,%6
+    %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -717,7 +773,10 @@ (define_insn "mma_"
                     (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")]
                     MMA_VVI4I4))]
   "TARGET_MMA"
-  " %A0,%x1,%x2,%3,%4"
+  "@
+    %A0,%x1,%x2,%3,%4
+    %A0,%x1,%x2,%3,%4
+    %A0,%x1,%x2,%3,%4"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -731,7 +790,10 @@ (define_insn "mma_"
                     (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")]
                     MMA_AVVI4I4))]
   "TARGET_MMA"
-  " %A0,%x2,%x3,%4,%5"
+  "@
+    %A0,%x2,%x3,%4,%5
+    %A0,%x2,%x3,%4,%5
+    %A0,%x2,%x3,%4,%5"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -744,7 +806,10 @@ (define_insn "mma_"
                     (match_operand:SI 4 "const_0_to_3_operand" "n,n,n")]
                     MMA_PVI4I2))]
   "TARGET_MMA"
-  " %A0,%x1,%x2,%3,%4"
+  "@
+    %A0,%x1,%x2,%3,%4
+    %A0,%x1,%x2,%3,%4
+    %A0,%x1,%x2,%3,%4"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -758,7 +823,10 @@ (define_insn "mma_"
                     (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")]
                     MMA_APVI4I2))]
   "TARGET_MMA"
-  " %A0,%x2,%x3,%4,%5"
+  "@
+    %A0,%x2,%x3,%4,%5
+    %A0,%x2,%x3,%4,%5
+    %A0,%x2,%x3,%4,%5"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -772,7 +840,10 @@ (define_insn "mma_"
                     (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")]
                     MMA_VVI4I4I4))]
   "TARGET_MMA"
-  " %A0,%x1,%x2,%3,%4,%5"
+  "@
+    %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5
+    %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
@@ -787,7 +858,10 @@ (define_insn "mma_"
                     (match_operand:SI 6 "const_0_to_15_operand" "n,n,n")]
                     MMA_AVVI4I4I4))]
   "TARGET_MMA"
-  " %A0,%x2,%x3,%4,%5,%6"
+  "@
+    %A0,%x2,%x3,%4,%5,%6
+    %A0,%x2,%x3,%4,%5,%6
+    %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
    (set_attr "prefixed" "yes")
    (set_attr "isa" "dm,not_dm,not_dm")])
diff --git a/gcc/testsuite/gcc.target/powerpc/dm-double-test.c b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
new file mode 100644
index 00000000000..eaa01426c78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
@@ -0,0 +1,194 @@
+/* Test derived from mma-double-1.c, modified for dense math.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_dense_math_ok } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+#include
+#include
+#include
+
+typedef unsigned char vec_t __attribute__ ((vector_size (16)));
+typedef double v4sf_t __attribute__ ((vector_size (16)));
+#define SAVE_ACC(ACC, ldc, J)  \
+  __builtin_mma_disassemble_acc (result, ACC); \
+  rowC = (v4sf_t *) &CO[0*ldc+J]; \
+  rowC[0] += result[0]; \
+  rowC = (v4sf_t *) &CO[1*ldc+J]; \
+  rowC[0] += result[1]; \
+  rowC = (v4sf_t *) &CO[2*ldc+J]; \
+  rowC[0] += result[2]; \
+  rowC = (v4sf_t *) &CO[3*ldc+J]; \
+  rowC[0] += result[3];
+
+void
+DM (int m, int n, int k, double *A, double *B, double *C)
+{
+  __vector_quad acc0, acc1, acc2, acc3, acc4, acc5, acc6, acc7;
+  v4sf_t result[4];
+  v4sf_t *rowC;
+  for (int l = 0; l < n; l += 4)
+    {
+      double *CO;
+      double *AO;
+      AO = A;
+      CO = C;
+      C += m * 4;
+      for (int j = 0; j < m; j += 16)
+        {
+          double *BO = B;
+          __builtin_mma_xxsetaccz (&acc0);
+          __builtin_mma_xxsetaccz (&acc1);
+          __builtin_mma_xxsetaccz (&acc2);
+          __builtin_mma_xxsetaccz (&acc3);
+          __builtin_mma_xxsetaccz (&acc4);
+          __builtin_mma_xxsetaccz (&acc5);
+          __builtin_mma_xxsetaccz (&acc6);
+          __builtin_mma_xxsetaccz (&acc7);
+          unsigned long i;
+
+          for (i = 0; i < k; i++)
+            {
+              vec_t *rowA = (vec_t *) & AO[i * 16];
+              __vector_pair rowB;
+              vec_t *rb = (vec_t *) & BO[i * 4];
+              __builtin_mma_assemble_pair (&rowB, rb[1], rb[0]);
+              __builtin_mma_xvf64gerpp (&acc0, rowB, rowA[0]);
+              __builtin_mma_xvf64gerpp (&acc1, rowB, rowA[1]);
+              __builtin_mma_xvf64gerpp (&acc2, rowB, rowA[2]);
+              __builtin_mma_xvf64gerpp (&acc3, rowB, rowA[3]);
+              __builtin_mma_xvf64gerpp (&acc4, rowB, rowA[4]);
+              __builtin_mma_xvf64gerpp (&acc5, rowB, rowA[5]);
+              __builtin_mma_xvf64gerpp (&acc6, rowB, rowA[6]);
+              __builtin_mma_xvf64gerpp (&acc7, rowB, rowA[7]);
+            }
+          SAVE_ACC (&acc0, m, 0);
+          SAVE_ACC (&acc2, m, 4);
+          SAVE_ACC (&acc1, m, 2);
+          SAVE_ACC (&acc3, m, 6);
+          SAVE_ACC (&acc4, m, 8);
+          SAVE_ACC (&acc6, m, 12);
+          SAVE_ACC (&acc5, m, 10);
+          SAVE_ACC (&acc7, m, 14);
+          AO += k * 16;
+          BO += k * 4;
+          CO += 16;
+        }
+      B += k * 4;
+    }
+}
+
+void
+init (double *matrix, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+    {
+      for (int i = 0; i < row; i++)
+        {
+          matrix[j * row + i] = (i * 16 + 2 + j) / 0.123;
+        }
+    }
+}
+
+void
+init0 (double *matrix, double *matrix1, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+    for (int i = 0; i < row; i++)
+      matrix[j * row + i] = matrix1[j * row + i] = 0;
+}
+
+
+void
+print (const char *name, const double *matrix, int row, int column)
+{
+  printf ("Matrix %s has %d rows and %d columns:\n", name, row, column);
+  for (int i = 0; i < row; i++)
+    {
+      for (int j = 0; j < column; j++)
+        {
+          printf ("%f ", matrix[j * row + i]);
+        }
+      printf ("\n");
+    }
+  printf ("\n");
+}
+
+int
+main (int argc, char *argv[])
+{
+  int rowsA, colsB, common;
+  int i, j, k;
+  int ret = 0;
+
+  for (int t = 16; t <= 128; t += 16)
+    {
+      for (int t1 = 4; t1 <= 16; t1 += 4)
+        {
+          rowsA = t;
+          colsB = t1;
+          common = 1;
+          /* printf ("Running test for rows = %d,cols = %d\n", t, t1); */
+          double A[rowsA * common];
+          double B[common * colsB];
+          double C[rowsA * colsB];
+          double D[rowsA * colsB];
+
+
+          init (A, rowsA, common);
+          init (B, common, colsB);
+          init0 (C, D, rowsA, colsB);
+          DM (rowsA, colsB, common, A, B, C);
+
+          for (i = 0; i < colsB; i++)
+            {
+              for (j = 0; j < rowsA; j++)
+                {
+                  D[i * rowsA + j] = 0;
+                  for (k = 0; k < common; k++)
+                    {
+                      D[i * rowsA + j] +=
+                        A[k * rowsA + j] * B[k + common * i];
+                    }
+                }
+            }
+          for (i = 0; i < colsB; i++)
+            {
+              for (j = 0; j < rowsA; j++)
+                {
+                  for (k = 0; k < common; k++)
+                    {
+                      if (D[i * rowsA + j] != C[i * rowsA + j])
+                        {
+                          printf ("Error %d,%d,%d\n",i,j,k);
+                          ret++;
+                        }
+                    }
+                }
+            }
+          if (ret)
+            {
+              print ("A", A, rowsA, common);
+              print ("B", B, common, colsB);
+              print ("C", C, rowsA, colsB);
+              print ("D", D, rowsA, colsB);
+            }
+        }
+    }
+
+#ifdef VERBOSE
+  if (ret)
+    printf ("DM double test fail: %d errors\n",ret);
+  else
+    printf ("DM double test success: 0 DM errors\n");
+#else
+  if (ret)
+    abort();
+#endif
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler {\mdmsetaccz\M} } } */
+/* { dg-final { scan-assembler {\mdmxvf64gerpp\M} } } */
+/* { dg-final { scan-assembler {\mdmxxextfdmr512\M} } } */
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c7f583d6d14..b70ebf963f9 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6534,6 +6534,25 @@ proc check_effective_target_power10_ok { } {
     }
 }
 
+# Return 1 if this is a PowerPC target supporting -mcpu=future or -mdense-math
+# which enables the dense math operations.
+proc check_effective_target_powerpc_dense_math_ok { } {
+    return [check_no_compiler_messages_nocache powerpc_dense_math_ok assembly {
+        __vector_quad vq;
+        void test (void)
+        {
+        #ifndef __PPC_DMR__
+        #error "target does not have dense math support."
+        #else
+        /* Make sure we have dense math support.  */
+          __vector_quad dmr;
+          __asm__ ("dmsetaccz %A0" : "=wD" (dmr));
+          vq = dmr;
+        #endif
+        }
+    } "-mcpu=future"]
+}
+
 # Return 1 if this is a PowerPC target supporting -mfloat128 via either
 # software emulation on power7/power8 systems or hardware support on power9.