From patchwork Fri Mar 24 15:30:39 2023
X-Patchwork-Submitter: Frederik Harwath
X-Patchwork-Id: 66858
From: Frederik Harwath
Subject: [PATCH 1/7] openmp: Add Fortran support for "omp unroll" directive
Date: Fri, 24 Mar 2023 16:30:39 +0100
Message-ID: <20230324153046.3996092-2-frederik@codesourcery.com>
In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com>
References: <20230324153046.3996092-1-frederik@codesourcery.com>

This commit implements the OpenMP 5.1 "omp unroll" directive for
Fortran.  The Fortran front end changes cover parsing and the
verification of nesting restrictions.  The actual loop transformation
is implemented in a new language-independent "omp_transform_loops"
pass which runs before omp lowering.  No attempt is made to re-use
existing unrolling optimizations because a separate implementation
allows for better control of the unrolling.  The new pass will also
serve as a foundation for the implementation of further OpenMP loop
transformations.

This commit only implements support for "omp unroll" on the
outermost loop of a loop nest.  Support for inner loops will be
added later.

gcc/ChangeLog:

	* Makefile.in: Add omp-transform-loops.o.
	* gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full" and
	"partial" clauses.
	* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP.
	* gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL.
	(gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL,
	OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
	(gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL,
	OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
	(gimplify_omp_for): Handle OMP_UNROLL.
	(gimplify_expr): Likewise.
	* params.opt: Add omp-unroll-full-max-iteration and
	omp-unroll-default-factor.
	* passes.def: Add pass_omp_transform_loop before pass_lower_omp.
	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_UNROLL_NONE,
	OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL.
	* tree-pass.h (make_pass_omp_transform_loops): Declare.
	* tree-pretty-print.cc (dump_omp_clause): Handle
	OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and
	OMP_CLAUSE_UNROLL_PARTIAL.
	(dump_generic_node): Handle OMP_UNROLL.
	* tree.cc (omp_clause_num_ops): Add number of operands for
	OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
	OMP_CLAUSE_UNROLL_PARTIAL.
	(omp_clause_code_names): Add name strings for
	OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
	OMP_CLAUSE_UNROLL_PARTIAL.
	* tree.def (OMP_UNROLL): Define.
	* tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define.
	* omp-transform-loops.cc: New file.
	* omp-general.cc (omp_loop_transform_clause_p): New function.
	* omp-general.h (omp_loop_transform_clause_p): New declaration.

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_clauses): Handle "unroll full" and
	"unroll partial".
	(show_omp_node): Handle EXEC_OMP_UNROLL.
	(show_code_node): Handle EXEC_OMP_UNROLL.
	* gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL,
	ST_OMP_END_UNROLL.
	(enum gfc_exec_op): Add EXEC_OMP_UNROLL.
	* match.h (gfc_match_omp_unroll): Declare.
	* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL,
	OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL.
	(gfc_match_omp_clauses): Handle "omp unroll partial".
	(OMP_UNROLL_CLAUSES): New macro definition.
	(gfc_match_omp_unroll): Match "full" clause.
	(omp_unroll_removes_loop_nest): New function.
	(resolve_omp_unroll): New function.
	(resolve_omp_do): Accept and verify "omp unroll" directives
	between directive and loop.
	(omp_code_to_statement): Handle EXEC_OMP_UNROLL.
	(gfc_resolve_omp_directive): Likewise.
	* parse.cc (decode_omp_directive): Handle "unroll" and
	"end unroll".
	(next_statement): Handle ST_OMP_UNROLL.
	(gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL.
	(parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL
	before/after loop.
	(parse_executable): Handle ST_OMP_UNROLL.
	* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL.
	(gfc_resolve_code): Likewise.
	* st.cc (gfc_free_statement): Likewise.
	* trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses.
	(gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL,
	OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation.
	(gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL.
	* trans.cc (trans_code): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test.
	* testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test.
---
 gcc/Makefile.in                               |    1 +
 gcc/fortran/dump-parse-tree.cc                |   15 +
 gcc/fortran/gfortran.h                        |    9 +-
 gcc/fortran/match.h                           |    1 +
 gcc/fortran/openmp.cc                         |  174 +-
 gcc/fortran/parse.cc                          |   37 +-
 gcc/fortran/resolve.cc                        |    3 +
 gcc/fortran/st.cc                             |    1 +
 gcc/fortran/trans-openmp.cc                   |   71 +-
 gcc/fortran/trans.cc                          |    1 +
 gcc/gimple-pretty-print.cc                    |    6 +
 gcc/gimple.h                                  |    1 +
 gcc/gimplify.cc                               |   40 +-
 gcc/omp-general.cc                            |   14 +
 gcc/omp-general.h                             |    1 +
 gcc/omp-transform-loops.cc                    | 1401 +++++++++++++++++
 gcc/params.opt                                |    9 +
 gcc/passes.def                                |    1 +
 .../gomp/loop-transforms/unroll-1.f90         |  277 ++++
 .../gomp/loop-transforms/unroll-10.f90        |    7 +
 .../gomp/loop-transforms/unroll-11.f90        |   75 +
 .../gomp/loop-transforms/unroll-12.f90        |   29 +
 .../gomp/loop-transforms/unroll-2.f90         |   22 +
 .../gomp/loop-transforms/unroll-3.f90         |   17 +
 .../gomp/loop-transforms/unroll-4.f90         |   18 +
 .../gomp/loop-transforms/unroll-5.f90         |   18 +
 .../gomp/loop-transforms/unroll-6.f90         |   19 +
 .../gomp/loop-transforms/unroll-7.f90         |   62 +
 .../gomp/loop-transforms/unroll-8.f90         |   22 +
 .../gomp/loop-transforms/unroll-9.f90         |   18 +
 .../loop-transforms/unroll-no-clause-1.f90    |   20 +
 .../loop-transforms/unroll-no-clause-2.f90    |   21 +
 .../loop-transforms/unroll-no-clause-3.f90    |   23 +
 .../gomp/loop-transforms/unroll-simd-1.f90    |  244 +++
 .../gomp/loop-transforms/unroll-simd-2.f90    |   57 +
 gcc/tree-core.h                               |    9 +
 gcc/tree-pass.h                               |    1 +
 gcc/tree-pretty-print.cc                      |   20 +
 gcc/tree.cc                                   |    6 +
 gcc/tree.def                                  |    6 +
 gcc/tree.h                                    |    3 +
 .../loop-transforms/unroll-1.f90              |   52 +
 .../loop-transforms/unroll-2.f90              |   88 ++
 .../loop-transforms/unroll-3.f90              |   59 +
 .../loop-transforms/unroll-4.f90              |   72 +
 .../loop-transforms/unroll-5.f90              |   55 +
 .../loop-transforms/unroll-6.f90              |  105 ++
 .../loop-transforms/unroll-7.f90              |  198 +++
 .../loop-transforms/unroll-7a.f90             |    7 +
 .../loop-transforms/unroll-7b.f90             |    7 +
 .../loop-transforms/unroll-7c.f90             |    7 +
 .../loop-transforms/unroll-8.f90              |   38 +
 .../loop-transforms/unroll-simd-1.f90         |   33 +
 53 files changed, 3484 insertions(+), 17 deletions(-)
 create mode 100644 gcc/omp-transform-loops.cc
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90

--
2.36.1

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d8b76d83d68..8e203f68bd7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1540,6 +1540,7 @@ OBJS = \
 	omp-expand.o \
 	omp-general.o \
 	omp-low.o \
+	omp-transform-loops.o \
 	omp-oacc-kernels-decompose.o \
 	omp-oacc-neuter-broadcast.o \
 	omp-simd-clone.o \
diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 3b24bdc1a6c..e069aca1f1d 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -2052,6 +2052,16 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
     }
   if (omp_clauses->assume)
     show_omp_assumes (omp_clauses->assume);
+  if (omp_clauses->unroll_full)
+    {
+      fputs (" FULL", dumpfile);
+    }
+  if (omp_clauses->unroll_partial)
+    {
+      fputs (" PARTIAL", dumpfile);
+      if (omp_clauses->unroll_partial_factor > 0)
+	fprintf (dumpfile, "(%u)", omp_clauses->unroll_partial_factor);
+    }
 }
 
 /* Show a single OpenMP or OpenACC directive node and everything underneath it
@@ -2162,6 +2172,7 @@ show_omp_node (int level, gfc_code *c)
       name = "TEAMS DISTRIBUTE PARALLEL DO SIMD"; break;
     case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: name = "TEAMS DISTRIBUTE SIMD"; break;
     case EXEC_OMP_TEAMS_LOOP: name = "TEAMS LOOP"; break;
+    case EXEC_OMP_UNROLL: name = 
"UNROLL"; break; case EXEC_OMP_WORKSHARE: name = "WORKSHARE"; break; default: gcc_unreachable (); @@ -2238,6 +2249,7 @@ show_omp_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: omp_clauses = c->ext.omp_clauses; break; @@ -2299,6 +2311,8 @@ show_omp_node (int level, gfc_code *c) d = d->block; } } + else if (c->op == EXEC_OMP_UNROLL) + show_code (level + 1, c->block != NULL ? c->block->next : c->next); else show_code (level + 1, c->block->next); if (c->op == EXEC_OMP_ATOMIC) @@ -3477,6 +3491,7 @@ show_code_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: show_omp_node (level, c); break; diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 9bab2c40ead..5ef4a8907b0 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -319,7 +319,8 @@ enum gfc_statement ST_OMP_END_MASKED_TASKLOOP_SIMD, ST_OMP_SCOPE, ST_OMP_END_SCOPE, ST_OMP_ERROR, ST_OMP_ASSUME, ST_OMP_END_ASSUME, ST_OMP_ASSUMES, /* Note: gfc_match_omp_nothing returns ST_NONE. */ - ST_OMP_NOTHING, ST_NONE + ST_OMP_NOTHING, ST_NONE, + ST_OMP_UNROLL, ST_OMP_END_UNROLL }; /* Types of interfaces that we can have. 
Assignment interfaces are @@ -1561,6 +1562,8 @@ typedef struct gfc_omp_clauses unsigned order_unconstrained:1, order_reproducible:1, capture:1; unsigned grainsize_strict:1, num_tasks_strict:1, compare:1, weak:1; unsigned non_rectangular:1, order_concurrent:1; + unsigned unroll_full:1, unroll_none:1, unroll_partial:1; + unsigned unroll_partial_factor; ENUM_BITFIELD (gfc_omp_sched_kind) sched_kind:3; ENUM_BITFIELD (gfc_omp_device_type) device_type:2; ENUM_BITFIELD (gfc_omp_memorder) memorder:3; @@ -2974,6 +2977,7 @@ enum gfc_exec_op EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED, EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD, EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE, + EXEC_OMP_UNROLL, EXEC_OMP_ERROR }; @@ -3868,6 +3872,9 @@ void gfc_generate_module_code (gfc_namespace *); /* trans-intrinsic.cc */ bool gfc_inline_intrinsic_function_p (gfc_expr *); +/* trans-openmp.cc */ +bool loop_transform_p (gfc_exec_op op); + /* bbt.cc */ typedef int (*compare_fn) (void *, void *); void gfc_insert_bbt (void *, void *, compare_fn); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 4430aff001c..5640c725f09 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -226,6 +226,7 @@ match gfc_match_omp_teams_distribute_parallel_do_simd (void); match gfc_match_omp_teams_distribute_simd (void); match gfc_match_omp_teams_loop (void); match gfc_match_omp_threadprivate (void); +match gfc_match_omp_unroll (void); match gfc_match_omp_workshare (void); match gfc_match_omp_end_critical (void); match gfc_match_omp_end_nowait (void); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index abca146d78e..e54f016b170 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -1051,6 +1051,9 @@ enum omp_mask1 /* More OpenMP clauses and OpenACC 2.0+ specific clauses. */ enum omp_mask2 { + OMP_CLAUSE_UNROLL_FULL, /* OpenMP 5.1. */ + OMP_CLAUSE_UNROLL_NONE, /* OpenMP 5.1. 
*/ + OMP_CLAUSE_UNROLL_PARTIAL, /* OpenMP 5.1. */ OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, @@ -2523,6 +2526,15 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, NULL, &head, true, true) == MATCH_YES)) continue; + if ((mask & OMP_CLAUSE_UNROLL_FULL) + && (m = gfc_match_dupl_check (!c->unroll_full, "full")) + != MATCH_NO) + { + if (m == MATCH_ERROR) + goto error; + c->unroll_full = needs_space = true; + continue; + } break; case 'g': if ((mask & OMP_CLAUSE_GANG) @@ -3156,10 +3168,36 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, } break; case 'p': - if ((mask & OMP_CLAUSE_COPY) - && gfc_match ("pcopy ( ") == MATCH_YES + if (mask & OMP_CLAUSE_UNROLL_PARTIAL) + { + if ((m = gfc_match_dupl_check (!c->unroll_partial, "partial")) + != MATCH_NO) + { + int unroll_factor; + if (m == MATCH_ERROR) + goto error; + + c->unroll_partial = true; + + gfc_expr *cexpr = NULL; + m = gfc_match (" ( %e )", &cexpr); + if (m == MATCH_NO) + ; + else if (m == MATCH_YES + && !gfc_extract_int (cexpr, &unroll_factor, -1) + && unroll_factor > 0) + c->unroll_partial_factor = unroll_factor; + else + gfc_error_now ("PARTIAL clause argument not constant " + "positive integer at %C"); + gfc_free_expr (cexpr); + continue; + } + } + if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM, true, allow_derived)) + OMP_MAP_TOFROM, true, + allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("pcopyin ( ") == MATCH_YES @@ -4270,6 +4308,8 @@ cleanup: (omp_mask (OMP_CLAUSE_AT) | OMP_CLAUSE_MESSAGE | OMP_CLAUSE_SEVERITY) #define OMP_WORKSHARE_CLAUSES \ omp_mask (OMP_CLAUSE_NOWAIT) +#define OMP_UNROLL_CLAUSES \ + (omp_mask (OMP_CLAUSE_UNROLL_FULL) | OMP_CLAUSE_UNROLL_PARTIAL) static match @@ -6369,6 +6409,20 @@ gfc_match_omp_teams_distribute_simd (void) | OMP_SIMD_CLAUSES); } +match +gfc_match_omp_unroll (void) +{ + match m = match_omp 
(EXEC_OMP_UNROLL, OMP_UNROLL_CLAUSES); + + /* Add an internal clause as a marker to indicate that this "unroll" + directive had no clause. */ + if (new_st.ext.omp_clauses + && !new_st.ext.omp_clauses->unroll_full + && !new_st.ext.omp_clauses->unroll_partial) + new_st.ext.omp_clauses->unroll_none = true; + + return m; +} match gfc_match_omp_workshare (void) @@ -9235,6 +9289,75 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause) } } + +static bool +omp_unroll_removes_loop_nest (gfc_code *code) +{ + gcc_assert (code->op == EXEC_OMP_UNROLL); + if (!code->ext.omp_clauses) + return true; + + if (code->ext.omp_clauses->unroll_none) + { + gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " + "into a non-loop", + &code->loc); + return true; + } + if (code->ext.omp_clauses->unroll_full) + { + gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " + "non-loop", + &code->loc); + return true; + } + return false; +} + +static void +resolve_loop_transform_generic (gfc_code *code, const char *descr) +{ + gcc_assert (code->block); + + if (code->block->op == EXEC_OMP_UNROLL + && !omp_unroll_removes_loop_nest (code->block)) + return; + + if (code->block->next->op == EXEC_OMP_UNROLL + && !omp_unroll_removes_loop_nest (code->block->next)) + return; + + if (code->block->next->op == EXEC_DO_WHILE) + { + gfc_error ("%s invalid around DO WHILE or DO without loop " + "control at %L", descr, &code->loc); + return; + } + if (code->block->next->op == EXEC_DO_CONCURRENT) + { + gfc_error ("%s invalid around DO CONCURRENT loop at %L", + descr, &code->loc); + return; + } + + gfc_error ("missing canonical loop nest after %s at %L", + descr, &code->loc); + +} + +static void +resolve_omp_unroll (gfc_code *code) +{ + if (!code->block || code->block->op == EXEC_DO) + return; + + if (code->block->next->op == EXEC_DO) + return; + + resolve_loop_transform_generic (code, "!$OMP UNROLL"); +} + + static void handle_local_var (gfc_symbol *sym) { 
@@ -9259,6 +9382,13 @@ is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) { int i; gfc_code *do_code = code->block->next; + while (loop_transform_p (do_code->op)) { + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (!loop_transform_p (do_code->op)); for (i = 1; i < depth; i++) { @@ -9277,6 +9407,13 @@ expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) { int i; gfc_code *do_code = code->block->next; + while (loop_transform_p (do_code->op)) { + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (!loop_transform_p (do_code->op)); for (i = 1; i < depth; i++) { @@ -9454,6 +9591,7 @@ resolve_omp_do (gfc_code *code) is_simd = true; break; case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; + case EXEC_OMP_UNROLL: name = "!$OMP UNROLL"; break; default: gcc_unreachable (); } @@ -9461,6 +9599,23 @@ resolve_omp_do (gfc_code *code) resolve_omp_clauses (code, code->ext.omp_clauses, NULL); do_code = code->block->next; + /* Move forward over any loop transformation directives to find the loop. 
*/ + bool error = false; + while (do_code->op == EXEC_OMP_UNROLL) + { + if (!error && omp_unroll_removes_loop_nest (do_code)) + { + gfc_error ("missing canonical loop nest after %s at %L", name, + &code->loc); + error = true; + } + if (do_code->block) + do_code = do_code->block->next; + else + do_code = do_code->next; + } + gcc_assert (do_code->op != EXEC_OMP_UNROLL); + if (code->ext.omp_clauses->orderedc) collapse = code->ext.omp_clauses->orderedc; else @@ -9490,6 +9645,14 @@ resolve_omp_do (gfc_code *code) &do_code->loc); break; } + if (do_code->op != EXEC_DO) + { + gfc_error ("%s must be DO loop at %L", name, + &do_code->loc); + break; + } + + gcc_assert (do_code->op != EXEC_OMP_UNROLL); gcc_assert (do_code->op == EXEC_DO); if (do_code->ext.iterator->var->ts.type != BT_INTEGER) gfc_error ("%s iteration variable must be of type integer at %L", @@ -9726,6 +9889,8 @@ omp_code_to_statement (gfc_code *code) return ST_OMP_PARALLEL_LOOP; case EXEC_OMP_DEPOBJ: return ST_OMP_DEPOBJ; + case EXEC_OMP_UNROLL: + return ST_OMP_UNROLL; default: gcc_unreachable (); } @@ -10155,6 +10320,9 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_TEAMS_LOOP: resolve_omp_do (code); break; + case EXEC_OMP_UNROLL: + resolve_omp_unroll (code); + break; case EXEC_OMP_ASSUME: case EXEC_OMP_CANCEL: case EXEC_OMP_ERROR: diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc index f1e55316e5b..094678436b4 100644 --- a/gcc/fortran/parse.cc +++ b/gcc/fortran/parse.cc @@ -1008,6 +1008,7 @@ decode_omp_directive (void) ST_OMP_END_TEAMS_DISTRIBUTE); matcho ("end teams loop", gfc_match_omp_eos_error, ST_OMP_END_TEAMS_LOOP); matcho ("end teams", gfc_match_omp_eos_error, ST_OMP_END_TEAMS); + matchs ("end unroll", gfc_match_omp_eos_error, ST_OMP_END_UNROLL); matcho ("end workshare", gfc_match_omp_end_nowait, ST_OMP_END_WORKSHARE); break; @@ -1137,6 +1138,9 @@ decode_omp_directive (void) matchdo ("threadprivate", gfc_match_omp_threadprivate, ST_OMP_THREADPRIVATE); break; + 
case 'u': + matchs ("unroll", gfc_match_omp_unroll, ST_OMP_UNROLL); + break; case 'w': matcho ("workshare", gfc_match_omp_workshare, ST_OMP_WORKSHARE); break; @@ -1724,6 +1728,7 @@ next_statement (void) case ST_OMP_LOOP: case ST_OMP_PARALLEL_LOOP: case ST_OMP_TEAMS_LOOP: \ case ST_OMP_TARGET_PARALLEL_LOOP: case ST_OMP_TARGET_TEAMS_LOOP: \ case ST_OMP_ASSUME: \ + case ST_OMP_UNROLL: \ case ST_CRITICAL: \ case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \ case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \ @@ -2096,6 +2101,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_END_UNION: p = "END UNION"; break; + case ST_OMP_END_UNROLL: + p = "!$OMP END UNROLL"; + break; case ST_END_MAP: p = "END MAP"; break; @@ -2766,6 +2774,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_OMP_THREADPRIVATE: p = "!$OMP THREADPRIVATE"; break; + case ST_OMP_UNROLL: + p = "!$OMP UNROLL"; + break; case ST_OMP_WORKSHARE: p = "!$OMP WORKSHARE"; break; @@ -5180,6 +5191,7 @@ parse_omp_do (gfc_statement omp_st) gfc_statement st; gfc_code *cp, *np; gfc_state_data s; + int num_unroll = 0; accept_statement (omp_st); @@ -5196,6 +5208,12 @@ parse_omp_do (gfc_statement omp_st) unexpected_eof (); else if (st == ST_DO) break; + else if (st == ST_OMP_UNROLL) + { + accept_statement (st); + num_unroll++; + continue; + } else unexpected_statement (st); } @@ -5221,6 +5239,17 @@ parse_omp_do (gfc_statement omp_st) pop_state (); st = next_statement (); + for (; num_unroll > 0; num_unroll--) + { + if (st == ST_OMP_END_UNROLL) + { + gfc_clear_new_st (); + gfc_commit_symbols (); + gfc_warning_check (); + st = next_statement (); + } + } + gfc_statement omp_end_st = ST_OMP_END_DO; switch (omp_st) { @@ -5234,7 +5263,9 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_DISTRIBUTE_SIMD: omp_end_st = ST_OMP_END_DISTRIBUTE_SIMD; break; - case ST_OMP_DO: omp_end_st = ST_OMP_END_DO; break; + case ST_OMP_DO: + omp_end_st = ST_OMP_END_DO; + 
break; case ST_OMP_DO_SIMD: omp_end_st = ST_OMP_END_DO_SIMD; break; case ST_OMP_LOOP: omp_end_st = ST_OMP_END_LOOP; break; case ST_OMP_PARALLEL_DO: omp_end_st = ST_OMP_END_PARALLEL_DO; break; @@ -5307,6 +5338,9 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_TEAMS_LOOP: omp_end_st = ST_OMP_END_TEAMS_LOOP; break; + case ST_OMP_UNROLL: + omp_end_st = ST_OMP_END_UNROLL; + break; default: gcc_unreachable (); } if (st == omp_end_st) @@ -5991,6 +6025,7 @@ parse_executable (gfc_statement st) case ST_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case ST_OMP_TEAMS_DISTRIBUTE_SIMD: case ST_OMP_TEAMS_LOOP: + case ST_OMP_UNROLL: st = parse_omp_do (st); if (st == ST_IMPLIED_ENDDO) return st; diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index f6ec76acb0b..46988ff281d 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11041,6 +11041,7 @@ gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_LOOP: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: break; @@ -12197,6 +12198,7 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TARGET_SIMD: + case EXEC_OMP_UNROLL: gfc_resolve_omp_do_blocks (code, ns); break; case EXEC_SELECT_TYPE: @@ -12693,6 +12695,7 @@ start: case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_resolve_omp_directive (code, ns); break; diff --git a/gcc/fortran/st.cc b/gcc/fortran/st.cc index 657bc9deebf..6112831e621 100644 --- a/gcc/fortran/st.cc +++ b/gcc/fortran/st.cc @@ -277,6 +277,7 @@ gfc_free_statement (gfc_code *p) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_free_omp_clauses (p->ext.omp_clauses); break; diff --git a/gcc/fortran/trans-openmp.cc 
b/gcc/fortran/trans-openmp.cc index 84c0184f48e..c4a23f6e247 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3890,6 +3890,29 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, omp_clauses = gfc_trans_add_clause (c, omp_clauses); } + if (clauses->unroll_full) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_FULL); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_none) + { + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_NONE); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + + if (clauses->unroll_partial) + { + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) + = clauses->unroll_partial_factor ? build_int_cst ( + integer_type_node, clauses->unroll_partial_factor) + : NULL_TREE; + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + } + if (clauses->ordered) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_ORDERED); @@ -5080,6 +5103,12 @@ gfc_trans_omp_cancel (gfc_code *code) return gfc_finish_block (&block); } +bool +loop_transform_p (gfc_exec_op op) +{ + return op == EXEC_OMP_UNROLL; +} + static tree gfc_trans_omp_cancellation_point (gfc_code *code) { @@ -5257,7 +5286,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, { gfc_se se; tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls; - tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses; + tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses, loop_transform_clauses; stmtblock_t block; stmtblock_t body; gfc_omp_clauses *clauses = code->ext.omp_clauses; @@ -5268,6 +5297,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, vec *saved_doacross_steps = doacross_steps; gfc_expr_list *tile = do_clauses ? 
do_clauses->tile_list : clauses->tile_list; gfc_code *orig_code = code; + locus top_loc = code->loc; /* Both collapsed and tiled loops are lowered the same way. In OpenACC, those clauses are not compatible, so prioritize the tile @@ -5285,7 +5315,25 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, if (collapse <= 0) collapse = 1; + if (pblock == NULL) + { + gfc_start_block (&block); + pblock = &block; + } code = code->block->next; + gcc_assert (code->op == EXEC_DO || code->op == EXEC_OMP_UNROLL); + /* Loop transformation directives surrounding the associated loop of an "omp + do" (or similar directive) are represented as clauses on the "omp do". */ + loop_transform_clauses = NULL; + while (code->op == EXEC_OMP_UNROLL) + { + tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, + code->loc); + loop_transform_clauses = chainon (loop_transform_clauses, clauses); + + code = code->block ? code->block->next : code->next; + } + gcc_assert (code->op != EXEC_OMP_UNROLL); gcc_assert (code->op == EXEC_DO); init = make_tree_vec (collapse); @@ -5293,18 +5341,21 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, incr = make_tree_vec (collapse); orig_decls = clauses->ordered ? make_tree_vec (collapse) : NULL_TREE; - if (pblock == NULL) - { - gfc_start_block (&block); - pblock = &block; - } - /* simd schedule modifier is only useful for composite do simd and other constructs including that, where gfc_trans_omp_do is only called on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + if (op == EXEC_OMP_UNROLL) + { + /* This is a loop transformation on a loop which is not associated with + any other directive. Use the directive location instead of the loop + location for the clauses. 
*/ + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); + } + else + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = chainon (omp_clauses, loop_transform_clauses); for (i = 0; i < collapse; i++) { @@ -5558,7 +5609,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } gcc_assert (local_dovar == dovar || c != NULL); } - if (local_dovar != dovar) + if (local_dovar != dovar && op != EXEC_OMP_UNROLL) { if (op != EXEC_OMP_SIMD || dovar_found == 1) tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE); @@ -5644,6 +5695,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break; case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break; case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break; + case EXEC_OMP_UNROLL: stmt = make_node (OMP_LOOP_TRANS); break; default: gcc_unreachable (); } @@ -7741,6 +7793,7 @@ gfc_trans_omp_directive (gfc_code *code) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TASKLOOP: + case EXEC_OMP_UNROLL: return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses, NULL); case EXEC_OMP_DISTRIBUTE_PARALLEL_DO: diff --git a/gcc/fortran/trans.cc b/gcc/fortran/trans.cc index f7745add045..56ec59fe80e 100644 --- a/gcc/fortran/trans.cc +++ b/gcc/fortran/trans.cc @@ -2520,6 +2520,7 @@ trans_code (gfc_code * code, tree cond) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: res = gfc_trans_omp_directive (code); break; diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index 300e9d7ed1e..24ef60059fe 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -1478,6 +1478,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: kind = " simd"; break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + kind 
= " unroll"; + break; default: gcc_unreachable (); } @@ -1515,6 +1518,9 @@ dump_gimple_omp_for (pretty_printer *buffer, const gomp_for *gs, int spc, case GF_OMP_FOR_KIND_SIMD: pp_string (buffer, "#pragma omp simd"); break; + case GF_OMP_FOR_KIND_TRANSFORM_LOOP: + pp_string (buffer, "#pragma omp loop_transform"); + break; default: gcc_unreachable (); } diff --git a/gcc/gimple.h b/gcc/gimple.h index 081d18e425a..213cfc58abb 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -159,6 +159,7 @@ enum gf_mask { GF_OMP_FOR_KIND_TASKLOOP = 2, GF_OMP_FOR_KIND_OACC_LOOP = 4, GF_OMP_FOR_KIND_SIMD = 5, + GF_OMP_FOR_KIND_TRANSFORM_LOOP = 6, GF_OMP_FOR_COMBINED = 1 << 3, GF_OMP_FOR_COMBINED_INTO = 1 << 4, GF_OMP_TARGET_KIND_MASK = (1 << 5) - 1, diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index ade6e335da7..2c160686533 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -5949,6 +5949,7 @@ is_gimple_stmt (tree t) case OACC_CACHE: case OMP_PARALLEL: case OMP_FOR: + case OMP_LOOP_TRANS: case OMP_SIMD: case OMP_DISTRIBUTE: case OMP_LOOP: @@ -12101,6 +12102,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } break; + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + break; case OMP_CLAUSE_NOHOST: default: gcc_unreachable (); @@ -13071,6 +13076,9 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: case OMP_CLAUSE_EXCLUSIVE: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: break; case OMP_CLAUSE_NOHOST: @@ -13797,6 +13805,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_SIMD: ort = ORT_SIMD; break; + case OMP_LOOP_TRANS: + break; default: gcc_unreachable (); } @@ -14158,8 +14168,19 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) n->value &= ~GOVD_LASTPRIVATE_CONDITIONAL; } } - else - omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN); + else { + if 
(TREE_CODE(orig_for_stmt) == OMP_LOOP_TRANS) + { + /* This loop is not going to be associated with any + directive after its transformation in + pass-omp_transform_loops. It will be lowered there + and the loop iteration variable will be used in the + context. */ + omp_notice_variable(gimplify_omp_ctxp, decl, true); + } + else + omp_add_variable(gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN); + } /* If DECL is not a gimple register, create a temporary variable to act as an iteration counter. This is valid, since DECL cannot be @@ -14200,7 +14221,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) c2 = NULL_TREE; } } - else + else if (TREE_CODE (orig_for_stmt) != OMP_LOOP_TRANS) omp_add_variable (gimplify_omp_ctxp, var, GOVD_PRIVATE | GOVD_SEEN); } @@ -14481,6 +14502,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break; case OMP_TASKLOOP: kind = GF_OMP_FOR_KIND_TASKLOOP; break; case OACC_LOOP: kind = GF_OMP_FOR_KIND_OACC_LOOP; break; + case OMP_LOOP_TRANS: kind = GF_OMP_FOR_KIND_TRANSFORM_LOOP; break; default: gcc_unreachable (); } @@ -14665,6 +14687,13 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gtask_clauses_ptr = &OMP_CLAUSE_CHAIN (c); } break; + /* Move loop transformations to inner loop */ + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + case OMP_CLAUSE_UNROLL_PARTIAL: + *gfor_clauses_ptr = c; + gfor_clauses_ptr = &OMP_CLAUSE_CHAIN (c); + break; default: gcc_unreachable (); } @@ -15105,6 +15134,10 @@ gimplify_omp_loop (tree *expr_p, gimple_seq *pre_p) } pc = &OMP_CLAUSE_CHAIN (*pc); break; + case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: + break; default: gcc_unreachable (); } @@ -16886,6 +16919,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, case OMP_FOR: case OMP_DISTRIBUTE: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: ret = gimplify_omp_for (expr_p, pre_p); break; diff --git 
a/gcc/omp-general.cc b/gcc/omp-general.cc index eefdcb54590..e29d695dcba 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -2253,6 +2253,20 @@ omp_declare_variant_remove_hook (struct cgraph_node *node, void *) } } +/* Return true if C is a clause that represents an OpenMP loop transformation + directive, false otherwise. */ + +bool +omp_loop_transform_clause_p (tree c) +{ + if (c == NULL) + return false; + + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + return (code == OMP_CLAUSE_UNROLL_FULL || code == OMP_CLAUSE_UNROLL_PARTIAL + || code == OMP_CLAUSE_UNROLL_NONE); +} + /* Try to resolve declare variant, return the variant decl if it should be used instead of base, or base otherwise. */ diff --git a/gcc/omp-general.h b/gcc/omp-general.h index 92717db1628..8d6390ad6f6 100644 --- a/gcc/omp-general.h +++ b/gcc/omp-general.h @@ -113,6 +113,7 @@ extern int omp_context_selector_matches (tree); extern int omp_context_selector_set_compare (const char *, tree, tree); extern tree omp_get_context_selector (tree, const char *, const char *); extern tree omp_resolve_declare_variant (tree); +extern bool omp_loop_transform_clause_p (tree); extern tree oacc_launch_pack (unsigned code, tree device, unsigned op); extern tree oacc_replace_fn_attrib_attr (tree attribs, tree dims); extern void oacc_replace_fn_attrib (tree fn, tree dims); diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc new file mode 100644 index 00000000000..d845d0e4798 --- /dev/null +++ b/gcc/omp-transform-loops.cc @@ -0,0 +1,1401 @@ +/* OMP loop transformation pass. Transforms loops according to + loop transformations directives such as "omp unroll". + + Copyright (C) 2023 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. 
+ +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "pretty-print.h" +#include "diagnostic-core.h" +#include "backend.h" +#include "target.h" +#include "tree.h" +#include "tree-inline.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "tree-pass.h" +#include "gimple-walk.h" +#include "gimple-pretty-print.h" +#include "gimplify.h" +#include "ssa.h" +#include "tree-into-ssa.h" +#include "fold-const.h" +#include "print-tree.h" +#include "omp-general.h" + +/* Context information for walk_omp_for_loops. */ +struct walk_ctx +{ + /* The most recently visited gomp_for that has been transformed and + for which gimple_omp_for_combined_into_p returned true. */ + gomp_for *inner_combined_loop; + + /* The innermost bind enclosing the currently visited node. */ + gbind *bind; +}; + +static unsigned int walk_omp_for_loops (gimple_seq *, walk_ctx *); +static enum tree_code omp_adjust_neq_condition (tree v, tree step); + +static bool +non_rectangular_p (const gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + for (size_t i = 0; i < collapse; i++) + { + if (TREE_CODE (gimple_omp_for_final (omp_for, i)) == TREE_VEC + || TREE_CODE (gimple_omp_for_initial (omp_for, i)) == TREE_VEC) + return true; + } + + return false; +} + +/* Callback for subst_var.
*/ + +static tree +subst_var_in_op (tree *t, int *subtrees ATTRIBUTE_UNUSED, void *data) +{ + + auto *wi = (struct walk_stmt_info *)data; + auto from_to = (std::pair<tree, tree> *)wi->info; + + if (*t == from_to->first) + { + *t = from_to->second; + wi->changed = true; + } + + return NULL_TREE; +} + +/* Substitute all occurrences of FROM in the operands of the GIMPLE statements + in SEQ by TO. */ + +static void +subst_var (gimple_seq *seq, tree from, tree to) +{ + gcc_assert (VAR_P (from)); + gcc_assert (VAR_P (to)); + + std::pair<tree, tree> from_to (from, to); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + + walk_gimple_seq_mod (seq, NULL, subst_var_in_op, &wi); +} + +/* Return the type that should be used for computing the iteration count of a + loop with the given index VAR and upper/lower bound FINAL according to + OpenMP 5.1. */ + +tree +gomp_for_iter_count_type (tree var, tree final) +{ + tree var_type = TREE_TYPE (var); + + if (POINTER_TYPE_P (var_type)) + return ptrdiff_type_node; + + tree operand_type = TREE_TYPE (final); + if (TYPE_UNSIGNED (var_type) && !TYPE_UNSIGNED (operand_type)) + return signed_type_for (operand_type); + + return var_type; +} + +extern tree +gimple_assign_rhs_to_tree (gimple *stmt); + +/* Substitute all definitions from SEQ bottom-up into EXPR. This is used to + reconstruct a tree for a gimplified expression for determining whether or not + the number of iterations of a loop is constant. */ + +tree +subst_defs (tree expr, gimple_seq seq) +{ + gimple_seq_node last = gimple_seq_last (seq); + gimple_seq_node first = gimple_seq_first (seq); + for (auto n = last; n != NULL; n = n != first ?
n->prev : NULL) + { + if (!is_gimple_assign (n)) + continue; + + tree lhs = gimple_assign_lhs (n); + tree rhs = gimple_assign_rhs_to_tree (n); + std::pair<tree, tree> from_to (lhs, rhs); + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = (void *)&from_to; + walk_tree (&expr, subst_var_in_op, &wi, NULL); + expr = fold (expr); + } + + return expr; +} + +/* Return an expression for the number of iterations of the loop at nesting + depth LEVEL of OMP_FOR. */ + +tree +gomp_for_number_of_iterations (const gomp_for *omp_for, size_t level) +{ + gcc_assert (!non_rectangular_p (omp_for)); + + tree init = gimple_omp_for_initial (omp_for, level); + tree final = gimple_omp_for_final (omp_for, level); + tree_code cond = gimple_omp_for_cond (omp_for, level); + tree index = gimple_omp_for_index (omp_for, level); + tree type = gomp_for_iter_count_type (index, final); + tree step = TREE_OPERAND (gimple_omp_for_incr (omp_for, level), 1); + + init = subst_defs (init, gimple_omp_for_pre_body (omp_for)); + init = fold (init); + final = subst_defs (final, gimple_omp_for_pre_body (omp_for)); + final = fold (final); + + tree_code minus_code = MINUS_EXPR; + tree diff_type = type; + if (POINTER_TYPE_P (TREE_TYPE (final))) + { + minus_code = POINTER_DIFF_EXPR; + diff_type = ptrdiff_type_node; + } + + tree diff; + if (cond == GT_EXPR) + diff = fold_build2 (minus_code, diff_type, init, final); + else if (cond == LT_EXPR) + diff = fold_build2 (minus_code, diff_type, final, init); + else + gcc_unreachable (); + + diff = fold_build2 (CEIL_DIV_EXPR, type, diff, step); + diff = fold_build1 (ABS_EXPR, type, diff); + + return diff; +} + +/* Return true if the expression representing the number of iterations for + OMP_FOR is a constant expression, false otherwise.
*/ + +bool +gomp_for_constant_iterations_p (gomp_for *omp_for, + unsigned HOST_WIDE_INT *iterations) +{ + tree t = gomp_for_number_of_iterations (omp_for, 0); + if (!TREE_CONSTANT (t) + || !tree_fits_uhwi_p (t)) + return false; + + *iterations = tree_to_uhwi (t); + return true; +} + +/* Split a gomp_for that represents a collapsed loop-nest into single + loops. The result is a gomp_for of the same kind which is not collapsed + (i.e. gimple_omp_for_collapse (OMP_FOR) == 1) and which contains nested, + non-collapsed gomp_for loops whose kind is GF_OMP_FOR_KIND_TRANSFORM_LOOP + (i.e. they will be lowered into plain, non-omp loops by this pass) for each + of the loops of OMP_FOR. All loops whose depth is strictly less than + FROM_DEPTH are left collapsed. */ + +static gomp_for* +gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0) +{ + int collapse = gimple_omp_for_collapse (omp_for); + gcc_assert (from_depth < collapse); + + if (collapse <= 1) + return omp_for; + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Uncollapsing loop:\n %G\n", + static_cast (omp_for)); + + gimple_seq body = gimple_omp_body (omp_for); + gomp_for *level_omp_for = omp_for; + for (int level = collapse - 1; level >= from_depth; level--) + { + level_omp_for = gimple_build_omp_for (body, + GF_OMP_FOR_KIND_TRANSFORM_LOOP, + NULL, 1, NULL); + gimple_omp_for_set_cond (level_omp_for, 0, + gimple_omp_for_cond (omp_for, level)); + gimple_omp_for_set_initial (level_omp_for, 0, + gimple_omp_for_initial (omp_for, level)); + gimple_omp_for_set_final (level_omp_for, 0, + gimple_omp_for_final (omp_for, level)); + gimple_omp_for_set_incr (level_omp_for, 0, + gimple_omp_for_incr (omp_for, level)); + gimple_omp_for_set_index (level_omp_for, 0, + gimple_omp_for_index (omp_for, level)); + + body = level_omp_for; + } + + omp_for->collapse = from_depth; + + if (from_depth > 0) + { + gimple_omp_set_body (omp_for, body); + return omp_for; + } + + 
gimple_omp_for_set_clauses (level_omp_for, gimple_omp_for_clauses (omp_for)); + gimple_omp_for_set_pre_body (level_omp_for, gimple_omp_for_pre_body (omp_for)); + gimple_omp_for_set_combined_into_p (level_omp_for, + gimple_omp_for_combined_into_p (omp_for)); + gimple_omp_for_set_combined_p (level_omp_for, + gimple_omp_for_combined_p (omp_for)); + + return level_omp_for; +} + +static tree +build_loop_exit_cond (tree index, tree_code cond, tree final, gimple_seq *seq) +{ + tree exit_cond + = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, + fold_build2 (cond, boolean_type_node, index, final)); + tree res = create_tmp_var (boolean_type_node); + gimplify_assign (res, exit_cond, seq); + + return res; +} + +/* Returns a register that contains the final value of a loop as described by + FINAL. This is necessary for non-rectangular loops. */ + +static tree +build_loop_final (tree final, gimple_seq *seq) +{ + if (TREE_CODE (final) != TREE_VEC) /* rectangular loop-nest */ + return final; + + tree coeff = TREE_VEC_ELT (final, 0); + tree outer_var = TREE_VEC_ELT (final, 1); + tree constt = TREE_VEC_ELT (final, 2); + + tree type = TREE_TYPE (outer_var); + tree val = fold_build2 (MULT_EXPR, type, coeff, outer_var); + val = fold_build2 (PLUS_EXPR, type, val, constt); + + tree res = create_tmp_var (type); + gimplify_assign (res, val, seq); + + return res; +} + +/* Unroll the loop BODY UNROLL_FACTOR times, replacing the INDEX + variable by a local copy in each copy of the body that will be + incremented as specified by INCR. If BUILD_EXIT_CONDS is true, + insert a test of the loop exit condition given COND and FINAL + before each copy of the body that will exit the loop if the value + of the local index variable satisfies the loop exit condition. + + For example, the unrolling with BUILD_EXIT_CONDS == true turns + + for (i = 0; i < 3; i = i + 1) + { + BODY + } + + into + + for (i = 0; i < n; i = i + 1) + { + i.0 = i + if (!(i_0 < n)) + goto exit + BODY_COPY_1[i/i.0] i.e. 
index var i replaced by i.0 + i.1 = i.0 + 1 + if (!(i.1 < n)) + goto exit + BODY_COPY_2[i/i.1] + i.2 = i.1 + 1 + if (!(i.2 < n)) + goto exit + BODY_COPY_3[i/i.2] + exit: + } + */ +static gimple_seq +build_unroll_body (gimple_seq body, tree unroll_factor, tree index, tree incr, + bool build_exit_conds = false, tree final = NULL_TREE, + tree_code *cond = NULL) +{ + gcc_assert ((!build_exit_conds && !final && !cond) + || (build_exit_conds && final && cond)); + + gimple_seq new_body = NULL; + + push_gimplify_context (); + + if (build_exit_conds) + final = build_loop_final (final, &new_body); + + tree local_index = create_tmp_var (TREE_TYPE (index)); + subst_var (&body, index, local_index); + tree local_incr = unshare_expr (incr); + TREE_OPERAND (local_incr, 0) = local_index; + + tree exit_label = create_artificial_label (gimple_location (body)); + + unsigned HOST_WIDE_INT n = tree_to_uhwi (unroll_factor); + for (unsigned HOST_WIDE_INT i = 0; i < n; i++) + { + if (i == 0) + gimplify_assign (local_index, index, &new_body); + else + gimplify_assign (local_index, local_incr, &new_body); + + tree body_copy_label = create_artificial_label (gimple_location (body)); + + if (build_exit_conds) + { + tree exit_cond + = build_loop_exit_cond (local_index, *cond, final, &new_body); + gimple_seq_add_stmt ( + &new_body, + gimple_build_cond (EQ_EXPR, exit_cond, boolean_true_node, + exit_label, body_copy_label)); + } + + gimple_seq body_copy = copy_gimple_seq_and_replace_locals (body); + gimple_seq_add_stmt (&new_body, gimple_build_label (body_copy_label)); + gimple_seq_add_seq (&new_body, body_copy); + } + + + gbind *bind = gimple_build_bind (NULL, new_body, NULL); + pop_gimplify_context (bind); + + gimple_seq result = NULL; + gimple_seq_add_stmt (&result, bind); + gimple_seq_add_stmt (&result, gimple_build_label (exit_label)); + return result; +} + +static gimple_seq transform_gomp_for (gomp_for *, tree, walk_ctx *ctx); + +/* Execute the partial unrolling transformation for OMP_FOR
with the given + UNROLL_FACTOR and return the resulting gimple bind. LOC is the location for + diagnostic messages. + + Example + -------- + + Original loop + ------------- + + #pragma omp for unroll_partial(3) + for (i = 0; i < 100; i = i + 1) + { + BODY + } + + gets, roughly, translated to + + { + #pragma omp for + for (i = 0; i < 100; i = i + 3) + { + i.0 = i + if i.0 >= 100: + goto exit_label + BODY_COPY_1[i/i.0] i.e. index var replaced + i.1 = i + 1 + if i.1 >= 100: + goto exit_label + BODY_COPY_2[i/i.1] + i.2 = i + 2 + if i.2 >= 100: + goto exit_label + BODY_COPY_3[i/i.2] + + exit_label: + } + } + */ + +/* FIXME The value of the loop counter of the transformed loop is +currently unspecified. OpenMP 5.2 does not define what the value +should be. There is an open OpenMP spec issue ("Loop counter value +after transform: Misc 6.0: Loop transformations #3440") in the +non-public OpenMP spec repository. */ + +static gimple_seq +partial_unroll (gomp_for *omp_for, tree unroll_factor, + location_t loc, tree transformation_clauses, walk_ctx *ctx) +{ + gcc_assert (unroll_factor); + gcc_assert ( + OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_PARTIAL + || OMP_CLAUSE_CODE (transformation_clauses) == OMP_CLAUSE_UNROLL_NONE); + + /* Partial unrolling reduces the loop nest depth of a canonical loop nest to 1 + hence outer directives cannot require a greater collapse.
*/ + gcc_assert (gimple_omp_for_collapse (omp_for) <= 1); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Partially unrolling loop:\n %G\n", + static_cast (omp_for)); + + gomp_for *unrolled_for = as_a (copy_gimple_seq_and_replace_locals (omp_for)); + + tree final = gimple_omp_for_final (unrolled_for, 0); + tree incr = gimple_omp_for_incr (unrolled_for, 0); + tree index = gimple_omp_for_index (unrolled_for, 0); + gimple_seq body = gimple_omp_body (unrolled_for); + + tree_code cond = gimple_omp_for_cond (unrolled_for, 0); + tree step = TREE_OPERAND (incr, 1); + gimple_omp_set_body (unrolled_for, + build_unroll_body (body, unroll_factor, index, incr, + true, final, &cond)); + + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); + + push_gimplify_context (); + + tree scaled_step + = fold_build2 (MULT_EXPR, TREE_TYPE (step), + fold_convert (TREE_TYPE (step), unroll_factor), step); + + /* For combined constructs, step will be gimplified on the outer + gomp_for. 
*/ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (scaled_step)) + { + tree var = create_tmp_var (TREE_TYPE (step), ".omp_unroll_step"); + gimplify_assign (var, scaled_step, + gimple_omp_for_pre_body_ptr (unrolled_for)); + scaled_step = var; + } + TREE_OPERAND (incr, 1) = scaled_step; + gimple_omp_for_set_incr (unrolled_for, 0, incr); + + pop_gimplify_context (result_bind); + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = unrolled_for; + + tree remaining_clauses = OMP_CLAUSE_CHAIN (transformation_clauses); + gimple_seq_add_stmt ( + gimple_bind_body_ptr (result_bind), + transform_gomp_for (unrolled_for, remaining_clauses, ctx)); + + return result_bind; +} + +static gimple_seq +full_unroll (gomp_for *omp_for, location_t loc, walk_ctx *ctx ATTRIBUTE_UNUSED) +{ + tree init = gimple_omp_for_initial (omp_for, 0); + unsigned HOST_WIDE_INT niter = 0; + if (!gomp_for_constant_iterations_p (omp_for, &niter)) + { + error_at (loc, "Cannot apply full unrolling to loop with " + "non-constant number of iterations"); + return omp_for; + } + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Fully unrolling loop with " + HOST_WIDE_INT_PRINT_UNSIGNED + " iterations :\n %G\n", niter, + static_cast (omp_for)); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + + tree unroll_factor = build_int_cst (TREE_TYPE (init), niter); + + gimple_seq unrolled = NULL; + gimple_seq_add_seq (&unrolled, gimple_omp_for_pre_body (omp_for)); + push_gimplify_context (); + gimple_seq_add_seq (&unrolled, + build_unroll_body (body, unroll_factor, index, incr)); + + gbind *result_bind = gimple_build_bind (NULL, unrolled, NULL); + pop_gimplify_context (result_bind); + return result_bind; +} + +/* Decides if the OMP_FOR for which the user did not specify the type of + unrolling to apply 
in the 'unroll' directive represented by the TRANSFORM + clause should be fully unrolled. */ + +static bool +assign_unroll_full_clause_p (gomp_for *omp_for, tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + gcc_assert (OMP_CLAUSE_CHAIN (transform) == NULL); + + /* Full unrolling turns the loop into a non-loop and hence + the following transformations would fail. */ + if (TREE_CHAIN (transform) != NULL_TREE) + return false; + + unsigned HOST_WIDE_INT num_iters; + if (!gomp_for_constant_iterations_p (omp_for, &num_iters) + || num_iters + > (unsigned HOST_WIDE_INT)param_omp_unroll_full_max_iterations) + return false; + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "assigned % clause to % with small " + "constant number of iterations\n"); + } + + return true; +} + +/* If the OMP_FOR for which the user did not specify the type of unrolling in + the 'unroll' directive in the TRANSFORM clause should be partially unrolled, + return the unroll factor, otherwise return null. */ + +static tree +assign_unroll_partial_clause_p (gomp_for *omp_for ATTRIBUTE_UNUSED, + tree transform) +{ + gcc_assert (OMP_CLAUSE_CODE (transform) == OMP_CLAUSE_UNROLL_NONE); + + if (param_omp_unroll_default_factor == 0) + return NULL; + + tree unroll_factor + = build_int_cst (integer_type_node, param_omp_unroll_default_factor); + + if (dump_enabled_p ()) + { + auto loc = dump_user_location_t::from_location_t ( + OMP_CLAUSE_LOCATION (transform)); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, + "added % clause to % directive\n", + param_omp_unroll_default_factor); + } + + return unroll_factor; +} + +/* Generate the code for an OMP_FOR that represents the result of a + loop transformation which is not associated with any directive and + which will hence not be lowered in the omp-expansion. 
*/ + +static gimple_seq +expand_transformed_loop (gomp_for *omp_for) +{ + gcc_assert (gimple_omp_for_kind (omp_for) + == GF_OMP_FOR_KIND_TRANSFORM_LOOP); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, omp_for, + "Expanding loop:\n %G\n", + static_cast (omp_for)); + + push_gimplify_context (); + + omp_for = gomp_for_uncollapse (omp_for); + + tree incr = gimple_omp_for_incr (omp_for, 0); + tree index = gimple_omp_for_index (omp_for, 0); + tree init = gimple_omp_for_initial (omp_for, 0); + tree final = gimple_omp_for_final (omp_for, 0); + tree_code cond = gimple_omp_for_cond (omp_for, 0); + gimple_seq body = gimple_omp_body (omp_for); + gimple_seq pre_body = gimple_omp_for_pre_body (omp_for); + + gimple_seq loop = NULL; + + tree exit_label = create_artificial_label (UNKNOWN_LOCATION); + tree cycle_label = create_artificial_label (UNKNOWN_LOCATION); + tree body_label = create_artificial_label (UNKNOWN_LOCATION); + + gimple_seq_add_seq (&loop, pre_body); + gimplify_assign (index, init, &loop); + tree final_var = final; + if (TREE_CODE (final) != VAR_DECL) + { + final_var = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (final_var, final, &loop); + } + + gimple_seq_add_stmt (&loop, gimple_build_label (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_cond (cond, index, final_var, + body_label, exit_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (body_label)); + gimple_seq_add_seq (&loop, body); + gimplify_assign (index, incr, &loop); + gimple_seq_add_stmt (&loop, gimple_build_goto (cycle_label)); + gimple_seq_add_stmt (&loop, gimple_build_label (exit_label)); + + gbind *bind = gimple_build_bind (NULL, loop, NULL); + pop_gimplify_context (bind); + + return bind; +} + +static enum tree_code +omp_adjust_neq_condition (tree v, tree step) +{ + gcc_assert (TREE_CODE (step) == INTEGER_CST); + if (TREE_CODE (TREE_TYPE (v)) == INTEGER_TYPE) + { + if (integer_onep (step)) + return LT_EXPR; + else + { + gcc_assert 
(integer_minus_onep (step)); + return GT_EXPR; + } + } + else + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + gcc_assert (TREE_CODE (unit) == INTEGER_CST); + if (tree_int_cst_equal (unit, step)) + return LT_EXPR; + else + { + gcc_assert (wi::neg (wi::to_widest (unit)) + == wi::to_widest (step)); + return GT_EXPR; + } + } +} + +/* Adjust *COND_CODE and *N2 so that the former is either LT_EXPR or GT_EXPR, + given that V is the loop index variable and STEP is the loop step. + + This function has been derived from omp_adjust_for_condition. + In contrast to the original function it does not add 1 or + -1 to the final value when converting <=,>= to <,> + for a pointer-type index variable. Instead, this function + adds or subtracts the type size in bytes. This is necessary + to determine the number of iterations correctly. */ + +void +omp_adjust_for_condition2 (location_t loc, enum tree_code *cond_code, tree *n2, + tree v, tree step) +{ + switch (*cond_code) + { + case LT_EXPR: + case GT_EXPR: + break; + + case NE_EXPR: + *cond_code = omp_adjust_neq_condition (v, step); + break; + + case LE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, type_unit); + } + else + *n2 = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = LT_EXPR; + break; + case GE_EXPR: + if (POINTER_TYPE_P (TREE_TYPE (*n2))) + { + tree unit = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (v))); + HOST_WIDE_INT type_unit = tree_to_shwi (unit); + *n2 = fold_build_pointer_plus_hwi_loc (loc, *n2, -1 * type_unit); + } + else + *n2 = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (*n2), *n2, + build_int_cst (TREE_TYPE (*n2), 1)); + *cond_code = GT_EXPR; + break; + default: + gcc_unreachable (); + } +} + +/* Transform the condition of OMP_FOR to either LT_EXPR or GT_EXPR and adjust + the final
value as necessary. */ + +static bool +canonicalize_conditions (gomp_for *omp_for) +{ + size_t collapse = gimple_omp_for_collapse (omp_for); + location_t loc = gimple_location (omp_for); + bool new_decls = false; + + gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (omp_for); + for (size_t l = 0; l < collapse; l++) + { + enum tree_code cond = gimple_omp_for_cond (omp_for, l); + + if (cond == LT_EXPR || cond == GT_EXPR) + continue; + + tree incr = gimple_omp_for_incr (omp_for, l); + tree step = omp_get_for_step_from_incr (loc, incr); + tree index = gimple_omp_for_index (omp_for, l); + tree final = gimple_omp_for_final (omp_for, l); + tree orig_final = final; + /* If final refers to the index variable of an outer level, i.e. + the loop nest is non-rectangular, only convert NE_EXPR. This + is necessary for unrolling. Unrolling needs to multiply the + step by the unrolling factor, but non-constant step values + are impossible with NE_EXPR. */ + if (TREE_CODE (final) == TREE_VEC) + { + cond = omp_adjust_neq_condition (TREE_VEC_ELT (final, 1), + TREE_OPERAND (incr, 1)); + gimple_omp_for_set_cond (omp_for, l, cond); + continue; + } + + omp_adjust_for_condition2 (loc, &cond, &final, index, step); + + gimple_omp_for_set_cond (omp_for, l, cond); + if (final == orig_final) + continue; + + /* If this is a combined construct, gimplify the final on the + outer construct. */ + if (TREE_CODE (final) != INTEGER_CST + && !gimple_omp_for_combined_into_p (omp_for)) + { + tree new_final = create_tmp_var (TREE_TYPE (final)); + gimplify_assign (new_final, final, pre_body); + final = new_final; + new_decls = true; + } + + gimple_omp_for_set_final (omp_for, l, final); + } + + return new_decls; +} + +/* Combined distribute or taskloop constructs are represented by two + or more nested gomp_for constructs which are created during + gimplification. Loop transformations on the combined construct are + executed on the innermost gomp_for. 
This function adjusts the loop + header of an outer OMP_FOR loop to the changes made by the + transformations on the inner loop which is provided by the CTX. */ + +static gimple_seq +adjust_combined_loop (gomp_for *omp_for, walk_ctx *ctx) +{ + gcc_assert (gimple_omp_for_combined_p (omp_for)); + gcc_assert (ctx->inner_combined_loop); + + gomp_for *inner_omp_for = ctx->inner_combined_loop; + size_t collapse = gimple_omp_for_collapse (inner_omp_for); + + int kind = gimple_omp_for_kind (omp_for); + if (kind == GF_OMP_FOR_KIND_DISTRIBUTE || kind == GF_OMP_FOR_KIND_TASKLOOP) + { + for (size_t level = 0; level < collapse; ++level) + { + tree outer_incr = gimple_omp_for_incr (omp_for, level); + tree inner_incr = gimple_omp_for_incr (inner_omp_for, level); + gcc_assert (TREE_TYPE (inner_incr) == TREE_TYPE (outer_incr)); + + tree inner_final = gimple_omp_for_final (inner_omp_for, level); + enum tree_code inner_cond + = gimple_omp_for_cond (inner_omp_for, level); + gimple_omp_for_set_cond (omp_for, level, inner_cond); + + tree inner_step = TREE_OPERAND (inner_incr, 1); + /* If this omp_for is the outermost loop belonging to a + combined construct, gimplify the step into its + prebody. Otherwise, just gimplify the step on the inner + gomp_for and move the ungimplified step expression + here. 
*/ + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + TREE_OPERAND (outer_incr, 1) = step; + } + else + TREE_OPERAND (outer_incr, 1) = inner_step; + + if (!gimple_omp_for_combined_into_p (omp_for) + && !TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_final), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (omp_for)); + pop_gimplify_context (ctx->bind); + gimple_omp_for_set_final (omp_for, level, final); + } + else + gimple_omp_for_set_final (omp_for, level, inner_final); + + /* Gimplify the step on the inner loop of the combined construct. */ + if (!TREE_CONSTANT (inner_step)) + { + push_gimplify_context (); + tree step = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_step"); + gimplify_assign (step, inner_step, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + TREE_OPERAND (inner_incr, 1) = step; + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = step; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + + /* Gimplify the final on the inner loop of the combined construct. 
*/ + if (!TREE_CONSTANT (inner_final)) + { + push_gimplify_context (); + tree final = create_tmp_var (TREE_TYPE (inner_incr), + ".omp_combined_final"); + gimplify_assign (final, inner_final, + gimple_omp_for_pre_body_ptr (inner_omp_for)); + gimple_omp_for_set_final (inner_omp_for, level, final); + pop_gimplify_context (ctx->bind); + + tree private_clause = build_omp_clause ( + gimple_location (omp_for), OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (private_clause) = final; + tree *clauses = gimple_omp_for_clauses_ptr (inner_omp_for); + *clauses = chainon (*clauses, private_clause); + } + } + } + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = omp_for; + else + ctx->inner_combined_loop = NULL; + + return omp_for; +} + +/* Transform OMP_FOR recursively according to the clause chain + TRANSFORMATION. Return the resulting sequence of gimple statements. + + This function dispatches OMP_FOR to the handler function for the + TRANSFORMATION clause. The handler function is responsible for invoking this + function recursively for executing the remaining transformations. 
*/ + +static gimple_seq +transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) +{ + if (!transformation) + { + if (gimple_omp_for_kind (omp_for) == GF_OMP_FOR_KIND_TRANSFORM_LOOP) + return expand_transformed_loop (omp_for); + + return omp_for; + } + + push_gimplify_context (); + + bool added_decls = canonicalize_conditions (omp_for); + + gimple_seq result = NULL; + location_t loc = OMP_CLAUSE_LOCATION (transformation); + auto dump_loc = dump_user_location_t::from_location_t (loc); + switch (OMP_CLAUSE_CODE (transformation)) + { + case OMP_CLAUSE_UNROLL_FULL: + gcc_assert (TREE_CHAIN (transformation) == NULL); + result = full_unroll (omp_for, loc, ctx); + break; + case OMP_CLAUSE_UNROLL_NONE: + gcc_assert (TREE_CHAIN (transformation) == NULL); + if (assign_unroll_full_clause_p (omp_for, transformation)) + { + result = full_unroll (omp_for, loc, ctx); + } + else if (tree unroll_factor + = assign_unroll_partial_clause_p (omp_for, transformation)) + { + result = partial_unroll (omp_for, unroll_factor, loc, + transformation, ctx); + } + else { + if (dump_enabled_p ()) + { + /* TODO Try to inform the unrolling pass that the user + wants to unroll this loop. This could relax some + restrictions there, e.g. on the code size? */ + dump_printf_loc ( + MSG_MISSED_OPTIMIZATION, dump_loc, + "not unrolling loop with % directive. Add " + "clause to specify unrolling type or invoke the " + "compiler with --param=omp-unroll-default-factor=n for some " + "constant integer n"); + } + result = transform_gomp_for (omp_for, NULL, ctx); + } + + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (transformation); + if (!unroll_factor) + { + // TODO Use target architecture dependent constants? + unsigned factor = param_omp_unroll_default_factor > 0 + ?
param_omp_unroll_default_factor + : 5; + unroll_factor = build_int_cst (integer_type_node, factor); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "%<partial%> clause without unrolling " + "factor turned into %<partial(%u)%> clause\n", + factor); + } + result = partial_unroll (omp_for, unroll_factor, loc, transformation, + ctx); + } + break; + default: + gcc_unreachable (); + } + + if (added_decls && gimple_code (result) != GIMPLE_BIND) + result = gimple_build_bind (NULL, result, NULL); + pop_gimplify_context (added_decls ? result : NULL); /* for decls from canonicalize_conditions */ + + return result; +} + +/* Remove all loop transformation clauses from the clauses of OMP_FOR and + return a new tree chain containing just those clauses. + + The clauses correspond to transformation *directives* associated with the + OMP_FOR's loop. The returned clauses are ordered from the innermost + directive to the outermost, i.e. in the order in which the transformations + should execute. + + Example: + -------- + -------- + + The loop + + #pragma omp for nowait + #pragma omp unroll partial(5) + #pragma omp tile sizes(2,2) + LOOP + + is represented as + + #pragma omp for nowait unroll_partial(5) tile_sizes(2,2) + LOOP + + Gimplification may add clauses after the transformation clauses added + by the front ends. This function will leave only the "nowait" clause on + OMP_FOR and return the clauses "tile_sizes(2,2) unroll_partial(5)".
*/ + +static tree +gomp_for_remove_transformation_clauses (gomp_for *omp_for) +{ + tree *clauses = gimple_omp_for_clauses_ptr (omp_for); + tree trans_clauses = NULL; + tree last_other_clause = NULL; + + for (tree c = gimple_omp_for_clauses (omp_for); c != NULL_TREE;) + { + tree chain_tail = OMP_CLAUSE_CHAIN (c); + if (omp_loop_transform_clause_p (c)) + { + if (last_other_clause) + OMP_CLAUSE_CHAIN (last_other_clause) = chain_tail; + else + *clauses = OMP_CLAUSE_CHAIN (c); + + OMP_CLAUSE_CHAIN (c) = NULL; + trans_clauses = chainon (trans_clauses, c); + } + else + { + /* There should be no other clauses between loop transformations ... */ + gcc_assert (!trans_clauses || !last_other_clause + || TREE_CHAIN (last_other_clause) == c); + /* ... and hence stop if transformations were found before the + non-transformation clause C. */ + if (trans_clauses) + break; + last_other_clause = c; + } + + c = chain_tail; + } + + return nreverse (trans_clauses); +} + +static void +print_optimized_unroll_partial_msg (tree c) +{ + gcc_assert (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_PARTIAL); + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + tree unroll_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, dump_loc, + "replaced consecutive % directives by " + "%\n", tree_to_uhwi (unroll_factor)); +} + +/* Optimize CLAUSES by removing and merging redundant clauses. Return the + optimized clause chain. */ + +static tree +optimize_transformation_clauses (tree clauses) +{ + /* The last unroll_partial clause seen in clauses, if any, + or the last merged unroll partial clause. */ + tree unroll_partial = NULL; + /* The last clause that was not an unroll_partial clause, if any. + unroll_full and unroll_none are not relevant because + they appear only at the end of a chain.
*/ + tree last_non_unroll = NULL; + /* Indicates that at least two unroll_partial clauses have been merged + since last_non_unroll was seen. */ + bool merged_unroll_partial = false; + + for (tree c = clauses; c != NULL_TREE; c = OMP_CLAUSE_CHAIN (c)) + { + enum omp_clause_code code = OMP_CLAUSE_CODE (c); + + switch (code) + { + case OMP_CLAUSE_UNROLL_NONE: + /* 'unroll' without a clause cannot be followed by any + transformations because its result does not have canonical loop + nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_FULL: + /* 'unroll full' cannot be followed by any transformations because + its result does not have canonical loop nest form. */ + gcc_assert (OMP_CLAUSE_CHAIN (c) == NULL); + + /* Previous 'unroll partial' directives are useless. */ + if (unroll_partial) + { + if (last_non_unroll) + OMP_CLAUSE_CHAIN (last_non_unroll) = c; + else + clauses = c; + + if (dump_enabled_p ()) + { + location_t loc = OMP_CLAUSE_LOCATION (c); + dump_user_location_t dump_loc; + dump_loc = dump_user_location_t::from_location_t (loc); + + dump_printf_loc ( + MSG_OPTIMIZED_LOCATIONS, dump_loc, + "removed useless % directives " + "preceding 'omp unroll full'\n"); + } + } + unroll_partial = NULL; + merged_unroll_partial = false; + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + { + /* Merge a sequence of consecutive 'unroll partial' directives. + Note that it is impossible for 'unroll full' or 'unroll' to + appear in between the 'unroll partial' clauses because they + remove the loop-nest.
*/ + if (unroll_partial) + { + tree factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial); + tree c_factor = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + if (factor && c_factor) + factor = fold_build2 (MULT_EXPR, TREE_TYPE (factor), factor, + c_factor); + else if (!factor && c_factor) + factor = c_factor; + + gcc_assert (!factor || TREE_CODE (factor) == INTEGER_CST); + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (unroll_partial) = factor; + OMP_CLAUSE_CHAIN (unroll_partial) = OMP_CLAUSE_CHAIN (c); + OMP_CLAUSE_LOCATION (unroll_partial) = OMP_CLAUSE_LOCATION (c); + merged_unroll_partial = true; + } + else + unroll_partial = c; + } + break; + default: + gcc_unreachable (); + } + } + + if (merged_unroll_partial && dump_enabled_p ()) + print_optimized_unroll_partial_msg (unroll_partial); + + return clauses; +} + +/* Visit the current statement in GSI_P in the walk_omp_for_loops walk and + execute all loop transformations found on it. */ + +void +process_omp_for (gomp_for *omp_for, gimple_seq *containing_seq, walk_ctx *ctx) +{ + auto gsi_p = gsi_for_stmt (omp_for, containing_seq); + tree transform_clauses = gomp_for_remove_transformation_clauses (omp_for); + + /* Do not attempt to transform broken code which might violate the + assumptions of the loop transformation implementations. + + Transformation clauses must be dropped first because following + passes do not handle them. 
*/ + if (seen_error ()) + return; + + transform_clauses = optimize_transformation_clauses (transform_clauses); + + gimple *transformed = omp_for; + if (gimple_omp_for_combined_p (omp_for) + && ctx->inner_combined_loop) + transformed = adjust_combined_loop (omp_for, ctx); + else + transformed = transform_gomp_for (omp_for, transform_clauses, ctx); + + if (transformed == omp_for) + return; + + gsi_replace_with_seq (&gsi_p, transformed, true); + + if (!dump_enabled_p () || !(dump_flags & TDF_DETAILS)) + return; + + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, transformed, + "Transformed loop: %G\n\n", transformed); +} + +/* Traverse SEQ in depth-first order and apply the loop transformation + found on gomp_for statements. */ + +static unsigned int +walk_omp_for_loops (gimple_seq *seq, walk_ctx *ctx) +{ + gimple_stmt_iterator gsi; + for (gsi = gsi_start (*seq); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + switch (gimple_code (stmt)) + { + case GIMPLE_OMP_CRITICAL: + case GIMPLE_OMP_MASTER: + case GIMPLE_OMP_MASKED: + case GIMPLE_OMP_TASKGROUP: + case GIMPLE_OMP_ORDERED: + case GIMPLE_OMP_SCAN: + case GIMPLE_OMP_SECTION: + case GIMPLE_OMP_PARALLEL: + case GIMPLE_OMP_TASK: + case GIMPLE_OMP_SCOPE: + case GIMPLE_OMP_SECTIONS: + case GIMPLE_OMP_SINGLE: + case GIMPLE_OMP_TARGET: + case GIMPLE_OMP_TEAMS: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_OMP_FOR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_omp_for_pre_body_ptr (stmt), ctx); + walk_omp_for_loops (gimple_omp_body_ptr (stmt), ctx); + ctx->bind = bind; + process_omp_for (as_a <gomp_for *> (stmt), seq, ctx); + break; + } + case GIMPLE_BIND: + { + gbind *bind = as_a <gbind *> (stmt); + ctx->bind = bind; + walk_omp_for_loops (gimple_bind_body_ptr (bind), ctx); + ctx->bind = bind; + break; + } + case GIMPLE_TRY: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_try_eval_ptr (as_a <gtry *> (stmt)), + ctx);
+ walk_omp_for_loops (gimple_try_cleanup_ptr (as_a <gtry *> (stmt)), + ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_CATCH: + { + gbind *bind = ctx->bind; + walk_omp_for_loops ( + gimple_catch_handler_ptr (as_a <gcatch *> (stmt)), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_FILTER: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_eh_filter_failure_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_EH_ELSE: + { + gbind *bind = ctx->bind; + geh_else *eh_else_stmt = as_a <geh_else *> (stmt); + walk_omp_for_loops (gimple_eh_else_n_body_ptr (eh_else_stmt), ctx); + walk_omp_for_loops (gimple_eh_else_e_body_ptr (eh_else_stmt), ctx); + ctx->bind = bind; + break; + } + break; + + case GIMPLE_WITH_CLEANUP_EXPR: + { + gbind *bind = ctx->bind; + walk_omp_for_loops (gimple_wce_cleanup_ptr (stmt), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_TRANSACTION: + { + gbind *bind = ctx->bind; + auto trans = as_a <gtransaction *> (stmt); + walk_omp_for_loops (gimple_transaction_body_ptr (trans), ctx); + ctx->bind = bind; + break; + } + + case GIMPLE_ASSUME: + break; + + default: + gcc_assert (!gimple_has_substatements (stmt)); + continue; + } + } + + return 0; +} + +static unsigned int +execute_omp_transform_loops () +{ + gimple_seq body = gimple_body (current_function_decl); + walk_ctx ctx; + ctx.inner_combined_loop = NULL; + ctx.bind = NULL; + walk_omp_for_loops (&body, &ctx); + + return 0; +} + +namespace +{ + +const pass_data pass_data_omp_transform_loops = { + GIMPLE_PASS, /* type */ + "omp_transform_loops", /* name */ + OPTGROUP_OMP, /* optinfo_flags */ + TV_NONE, /* tv_id */ + PROP_gimple_any, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_omp_transform_loops : public gimple_opt_pass +{ +public: + pass_omp_transform_loops (gcc::context *ctxt) + : gimple_opt_pass (pass_data_omp_transform_loops, ctxt) + { + } + + /* opt_pass methods: */ + virtual unsigned
int + execute (function *) + { + return execute_omp_transform_loops (); + } + virtual bool + gate (function *) + { + return flag_openmp || flag_openmp_simd; + } + +}; // class pass_omp_transform_loops + +} // anon namespace + +gimple_opt_pass * +make_pass_omp_transform_loops (gcc::context *ctxt) +{ + return new pass_omp_transform_loops (ctxt); +} diff --git a/gcc/params.opt b/gcc/params.opt index 41d8bef245e..cf5e09bf9e0 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -820,6 +820,15 @@ Enum(openacc_privatization) String(quiet) Value(OPENACC_PRIVATIZATION_QUIET) EnumValue Enum(openacc_privatization) String(noisy) Value(OPENACC_PRIVATIZATION_NOISY) +-param=omp-unroll-full-max-iterations= +Common Joined UInteger Var(param_omp_unroll_full_max_iterations) Init(5) Param Optimization +The maximum number of iterations of a loop for which an 'omp unroll' directive on the loop without a +clause will be turned into an 'omp unroll full'. + +-param=omp-unroll-default-factor= +Common Joined UInteger Var(param_omp_unroll_default_factor) Init(0) Param Optimization +The unroll factor that will be used for loops that have an 'omp unroll partial' directive without an explicit unroll factor. + -param=parloops-chunk-size= Common Joined UInteger Var(param_parloops_chunk_size) Param Optimization Chunk size of omp schedule for loops parallelized by parloops. diff --git a/gcc/passes.def b/gcc/passes.def index c9a8f19747b..5a5f3616cf8 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. 
If not see NEXT_PASS (pass_diagnose_omp_blocks); NEXT_PASS (pass_diagnose_tm_blocks); NEXT_PASS (pass_omp_oacc_kernels_decompose); + NEXT_PASS (pass_omp_transform_loops); NEXT_PASS (pass_lower_omp); NEXT_PASS (pass_lower_cf); NEXT_PASS (pass_lower_tm); diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..4cfac4c5e26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-1.f90 @@ -0,0 +1,277 @@ +subroutine test1 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i,j + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp do + !$omp unroll partial(-10) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 + +subroutine test20 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test20 + +subroutine test21 + implicit none + integer :: i + + !$omp unroll partial ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i = 1:100) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test21 + +subroutine test22 + implicit none + integer :: i + + !$omp do + !$omp unroll partial + do concurrent (i = 1:100) ! { dg-error {\!\$OMP DO cannot be a DO CONCURRENT loop at \(1\)} } + call dummy(i) ! 
{ dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do + !$omp end unroll +end subroutine test22 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 new file mode 100644 index 00000000000..2c4a45d3054 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-10.f90 @@ -0,0 +1,7 @@ +subroutine test(i) + ! TODO The checking that produces this message comes too late. Not important, but would be nice to have. + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} "" { xfail *-*-* } } + call dummy0 ! { dg-error {Unexpected CALL statement at \(1\)} } +end subroutine test ! { dg-error {Unexpected END statement at \(1\)} } + +! { dg-error "Unexpected end of file" "" { target "*-*-*" } 0 } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 new file mode 100644 index 00000000000..3f0d5981e9b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-11.f90 @@ -0,0 +1,75 @@ +subroutine test1(i) + implicit none + integer :: i + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +subroutine test3(i) + implicit none + integer :: i + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll full + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test3 + +subroutine test4(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,10 + call dummy(i) + end do +end subroutine test4 + +subroutine test5(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test5 + +subroutine test6(i) + implicit none + integer :: i + !$omp do ! { dg-error {missing canonical loop nest after \!\$OMP DO at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test6 + +subroutine test7(i) + implicit none + integer :: i + !$omp loop ! { dg-error {missing canonical loop nest after \!\$OMP LOOP at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test7 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 new file mode 100644 index 00000000000..0d8f3f5a2c0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-12.f90 @@ -0,0 +1,29 @@ +subroutine test1 + implicit none + integer :: i + !$omp unroll ! 
{ dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do while (i < 10) + call dummy(i) + i = i + 1 + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO WHILE or DO without loop control at \(1\)} } + do + call dummy(i) + i = i + 1 + if (i >= 10) exit + end do +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + !$omp unroll ! { dg-error {\!\$OMP UNROLL invalid around DO CONCURRENT loop at \(1\)} } + do concurrent (i=1:10) + call dummy(i) ! { dg-error {Subroutine call to 'dummy' in DO CONCURRENT block at \(1\) is not PURE} } + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..8496f9eefe0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-2.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_none" 1 "original" } } +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_full" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..0d233c9ab6f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-3.f90 @@ -0,0 +1,17 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! 
{ dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full + do i = 1,10 + call dummy(i) + end do +end subroutine test1 + +! Loop should be removed with 10 copies of the body remaining + +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..fcccdb0bcf8 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-4.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should not be unrolled, but the internal representation should be lowered + +! { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..ee82b4d150c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-5.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! 
{ dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 5 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..237e6b83087 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-6.f90 @@ -0,0 +1,19 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll partial ! { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(10\)' clause} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 10 times and the internal representation should be lowered. + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } +! 
{ dg-final { scan-tree-dump-times {if \(i\.[0-9]+ < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..8feaf7dc4d3 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-7.f90 @@ -0,0 +1,62 @@ +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i,j + !$omp parallel do + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + + !$omp taskloop + !$omp unroll partial(10) + do i = 1,100 + !$omp parallel do + do j = 1,100 + call dummy(i,j) + end do + end do + +end subroutine test1 + +! For the "parallel do", there should be 11 "omp for" loops, 10 for the inner loop, 1 for outer, +! for the "taskloop", there should be 10 "omp for" loops for the unrolled loop +! { dg-final { scan-tree-dump-times {#pragma omp for} 21 "omp_transform_loops" } } +! ... and two outer taskloops plus the one taskloop +! { dg-final { scan-tree-dump-times {#pragma omp taskloop} 3 "omp_transform_loops" } } + + +subroutine test2 + implicit none + integer :: i,j + do i = 1,100 + !$omp teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do + + do i = 1,100 + !$omp target teams distribute + !$omp unroll partial(10) + do j = 1,100 + call dummy(i,j) + end do + end do +end subroutine test2 + +! { dg-final { scan-tree-dump-times {#pragma omp distribute} 2 "omp_transform_loops" } } + +! After unrolling there should be 10 copies of each loop body for each loop-nest +! { dg-final { scan-tree-dump-times "dummy" 40 "omp_transform_loops" } } + +! 
{ dg-final { scan-tree-dump-not {#pragma omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(10\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp distribute private\(j\) unroll_partial\(10\)} 2 "original" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..9b91e5c5f98 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! { dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp parallel do collapse(1) + !$omp unroll partial(4) ! { dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll auto\(24\)'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! Loop should be unrolled 1 * 2 * 3 * 4 = 24 times + +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\) unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 new file mode 100644 index 00000000000..849d4e77984 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -0,0 +1,18 @@ +! { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +! 
{ dg-additional-options "-fdump-tree-original" } + +subroutine test1 + implicit none + integer :: i + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll auto' directives preceding 'omp unroll full'} } + !$omp unroll partial(3) + !$omp unroll partial(2) + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 new file mode 100644 index 00000000000..079c0fdd75b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90 @@ -0,0 +1,20 @@ +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,5 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,6 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 new file mode 100644 index 00000000000..4893ba46e4e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90 @@ -0,0 +1,21 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=20" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! 
{ dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll + do i = 1,21 + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 new file mode 100644 index 00000000000..60f25d3abe6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "--param=omp-unroll-full-max-iterations=10" } +! { dg-additional-options "--param=omp-unroll-default-factor=10" } +! { dg-additional-options "-fopt-info-optimized -fdump-tree-omp_transform_loops-details" } + +subroutine test + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,20 + do j = 1,10 + call dummy3(i,j) + end do + end do + !$omp end unroll + + !$omp unroll ! { dg-optimized {added 'partial\(10\)' clause to 'omp unroll' directive} } + do i = 1,21 + !$omp unroll ! { dg-optimized {assigned 'full' clause to 'omp unroll' with small constant number of iterations} } + do j = 1,6 + call dummy3(i,j) + end do + end do + !$omp end unroll +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..f22debbb78f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,244 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } + +subroutine test1 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test1 + +subroutine test2 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test2 + +subroutine test3 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end do +end subroutine test3 + +subroutine test4 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end do +end subroutine test4 + +subroutine test5 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test5 + +subroutine test6 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test6 + +subroutine test7 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll +end subroutine test7 + +subroutine test8 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test8 + +subroutine test9 + implicit none + integer :: i + + !$omp unroll full ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll full ! { dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do +end subroutine test9 + +subroutine test10 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test10 + +subroutine test11 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + call dummy(i) ! { dg-error {Unexpected CALL statement at \(1\)} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + end do +end subroutine test11 + +subroutine test12 + implicit none + integer :: i,j + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! 
{ dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do j = 1,100 + call dummy2(i,j) + end do + call dummy(i) + end do +end subroutine test12 + +subroutine test13 + implicit none + integer :: i + + !$omp unroll ! { dg-error {missing canonical loop nest after \!\$OMP UNROLL at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test13 + +subroutine test14 + implicit none + integer :: i + + !$omp simd ! { dg-error {missing canonical loop nest after \!\$OMP SIMD at \(1\)} } + !$omp unroll ! { dg-warning {\!\$OMP UNROLL without PARTIAL clause at \(1\) turns loop into a non-loop} } + !$omp unroll + do i = 1,100 + call dummy(i) + end do + !$omp end unroll + !$omp end unroll + !$omp end unroll ! { dg-error {Unexpected \!\$OMP END UNROLL statement at \(1\)} } +end subroutine test14 + +subroutine test15 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(1) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test15 + +subroutine test16 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(2) + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test16 + +subroutine test17 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(0) ! { dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test17 + +subroutine test18 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial(-10) ! 
{ dg-error {PARTIAL clause argument not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test18 + +subroutine test19 + implicit none + integer :: i + + !$omp simd + !$omp unroll partial + do i = 1,100 + call dummy(i) + end do + !$omp end unroll +end subroutine test19 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 new file mode 100644 index 00000000000..faaa37c5d7e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-simd-2.f90 @@ -0,0 +1,57 @@ +! { dg-do run } +! { dg-options "-O2 -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp simd + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if +end program + +! { dg-final { scan-tree-dump {omp loop_transform} "original" } } +! { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } diff --git a/gcc/tree-core.h b/gcc/tree-core.h index fd2be57b78c..e563408877e 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -525,6 +525,15 @@ enum omp_clause_code { /* OpenACC clause: nohost. */ OMP_CLAUSE_NOHOST, + + /* Internal representation for an "omp unroll full" directive. 
*/ + OMP_CLAUSE_UNROLL_FULL, + + /* Internal representation for an "omp unroll" directive without a clause. */ + OMP_CLAUSE_UNROLL_NONE, + + /* Internal representation for an "omp unroll partial" directive. */ + OMP_CLAUSE_UNROLL_PARTIAL, }; #undef DEFTREESTRUCT diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 6cdaed7d4b2..813176a912f 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -425,6 +425,7 @@ extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_omp_oacc_kernels_decompose (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_omp_transform_loops (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_diagnose_omp_blocks (gcc::context *ctxt); extern gimple_opt_pass *make_pass_expand_omp (gcc::context *ctxt); diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 7947f9647a1..588a992bcf3 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -505,6 +505,22 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case OMP_CLAUSE_EXCLUSIVE: name = "exclusive"; goto print_remap; + case OMP_CLAUSE_UNROLL_FULL: + pp_string (pp, "unroll_full"); + break; + case OMP_CLAUSE_UNROLL_NONE: + pp_string (pp, "unroll_none"); + break; + case OMP_CLAUSE_UNROLL_PARTIAL: + pp_string (pp, "unroll_partial"); + if (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause)) + { + pp_left_paren (pp); + dump_generic_node (pp, OMP_CLAUSE_UNROLL_PARTIAL_EXPR (clause), spc, flags, + false); + pp_right_paren (pp); + } + break; case OMP_CLAUSE__LOOPTEMP_: name = "_looptemp_"; goto print_remap; @@ -3581,6 +3597,10 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, pp_string (pp, "#pragma omp distribute"); goto dump_omp_loop; + case OMP_LOOP_TRANS: + 
pp_string (pp, "#pragma omp loop_transform"); + goto dump_omp_loop; + case OMP_TASKLOOP: pp_string (pp, "#pragma omp taskloop"); goto dump_omp_loop; diff --git a/gcc/tree.cc b/gcc/tree.cc index 207293c48cb..53e44367977 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -326,6 +326,9 @@ unsigned const char omp_clause_num_ops[] = 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ + 0, /* OMP_CLAUSE_UNROLL_FULL */ + 0, /* OMP_CLAUSE_UNROLL_NONE */ + 1 /* OMP_CLAUSE_UNROLL_PARTIAL */ }; const char * const omp_clause_code_name[] = @@ -417,6 +420,9 @@ const char * const omp_clause_code_name[] = "if_present", "finalize", "nohost", + "unroll_full", + "unroll_none", + "unroll_partial" }; /* Unless specific to OpenACC, we tend to internally maintain OpenMP-centric diff --git a/gcc/tree.def b/gcc/tree.def index e639a039db9..a47e4b8dbda 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1166,6 +1166,12 @@ DEFTREECODE (OMP_TASK, "omp_task", tcc_statement, 2) unspecified by the standards. */ DEFTREECODE (OMP_FOR, "omp_for", tcc_statement, 7) +/* OpenMP - A loop nest to which a loop transformation such as #pragma omp + unroll should be applied, but which is not associated with another directive + such as #pragma omp for. The kinds of loop transformations to be applied are + internally represented by clauses. Operands like for OMP_FOR. */ +DEFTREECODE (OMP_LOOP_TRANS, "omp_loop_trans", tcc_statement, 7) + /* OpenMP - #pragma omp simd [clause1 ... clauseN] Operands like for OMP_FOR.
*/ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 7) diff --git a/gcc/tree.h b/gcc/tree.h index abcdb5638d4..f33f815b712 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1787,6 +1787,9 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_USE_DEVICE_PTR)->base.public_flag) +#define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 0) + #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind) diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 new file mode 100644 index 00000000000..f07aab898fa --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-1.f90 @@ -0,0 +1,52 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp do + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + !$omp parallel do reduction(+:sum) + !$omp unroll partial(2) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 
16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 new file mode 100644 index 00000000000..2ce44d4d044 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-2.f90 @@ -0,0 +1,88 @@ +! { dg-additional-options "-fdump-tree-original -g" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 1,10,3 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum2() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = -20,1,3 + sum = sum + 1 + end do + end function compute_sum2 + + + integer function compute_sum3() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 30,1,-3 + sum = sum + 1 + end do + end function compute_sum3 + + + integer function compute_sum4() result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll full + do i = 50,-60,-10 + sum = sum + 1 + end do + end function compute_sum4 + +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 4) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 8) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 10) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 12) then + call abort + end if + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 new file mode 100644 index 00000000000..55e5cc568a5 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-3.f90 @@ -0,0 +1,59 @@ +! 
Test lowering of the internal representation of "omp unroll" loops +! which are not unrolled. + +! { dg-additional-options "-O0" } +! { dg-additional-options "--param=omp-unroll-full-max-iterations=0" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll + do i = 0,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + do i = 0,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (1, 100) + if (result .ne. 101) then + call abort + end if + + result = compute_sum3 (2, 100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum3 (-2, -100) + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 new file mode 100644 index 00000000000..52a214f1049 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-4.f90 @@ -0,0 +1,72 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! 
{ dg-do run } + +module test_functions +contains + integer function compute_sum1 () result(sum) + implicit none + + integer :: i + + sum = 0 + !$omp unroll partial(2) + do i = 1,50 + sum = sum + 1 + end do + end function compute_sum1 + + integer function compute_sum3 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum3 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum3 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum3 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum3 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum3 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 new file mode 100644 index 00000000000..d6a4e739675 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-5.f90 @@ -0,0 +1,55 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! 
{ dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp do + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + write (*,*) result + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + write (*,*) result + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + write (*,*) result + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + write (*,*) result + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + write (*,*) result + if (result .ne. 51) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 new file mode 100644 index 00000000000..1df8ce8d5bb --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-6.f90 @@ -0,0 +1,105 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! { dg-do run } + +module test_functions +contains + integer function compute_sum4 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum4 + + integer function compute_sum5 (step,n) result(sum) + implicit none + integer :: i, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + !$omp unroll partial(5) ! 
{ dg-optimized {replaced consecutive 'omp unroll' directives by 'omp unroll auto\(50\)'} } + !$omp unroll partial(10) + do i = 1,n,step + sum = sum + 1 + end do + end function compute_sum5 + + integer function compute_sum6 (step,n) result(sum) + implicit none + integer :: i, j, step, n + + sum = 0 + !$omp parallel do reduction(+:sum) lastprivate(i) + do i = 1,n,step + !$omp unroll full ! { dg-optimized {removed useless 'omp unroll auto' directives preceding 'omp unroll full'} } + !$omp unroll partial(10) + do j = 1, 1000 + sum = sum + 1 + end do + end do + end function compute_sum6 +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum4 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum4 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum4 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum4 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum4 (-2, -100) + if (result .ne. 51) then + call abort + end if + + result = compute_sum5 (1, 100) + if (result .ne. 100) then + call abort + end if + + result = compute_sum5 (1, 9) + if (result .ne. 9) then + call abort + end if + + result = compute_sum5 (2, 96) + if (result .ne. 48) then + call abort + end if + + result = compute_sum5 (-2, -98) + if (result .ne. 50) then + call abort + end if + + result = compute_sum5 (-2, -100) + if (result .ne. 51) then + call abort + end if + + +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 new file mode 100644 index 00000000000..d25f18002ae --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7.f90 @@ -0,0 +1,198 @@ +! { dg-additional-options "-O0 -cpp" } +! 
{ dg-do run } + +#ifndef UNROLL_FACTOR +#define UNROLL_FACTOR 1 +#endif +module test_functions +contains + subroutine copy (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 1, 100 + array1(i) = array2(i) + end do + end subroutine + + subroutine copy2 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do + !$omp unroll partial(UNROLL_FACTOR) + do i = 0,99 + array1(i+1) = array2(i+1) + end do + end subroutine copy2 + + subroutine copy3 (array1, array2) + implicit none + + integer :: array1(100) + integer :: array2(100) + integer :: i + + !$omp parallel do lastprivate(i) + !$omp unroll partial(UNROLL_FACTOR) + do i = -49,50 + if (i < 0) then + array1((-1)*i) = array2((-1)*i) + else + array1(50+i) = array2(50+i) + endif + end do + end subroutine copy3 + + subroutine copy4 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 2, 200, 2 + array1(i/2) = array2(i/2) + end do + end subroutine copy4 + + subroutine copy5 (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = 200, 2, -2 + array1(i/2) = array2(i/2) + end do + end subroutine + + subroutine copy6 (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + !$omp do + !$omp unroll partial(UNROLL_FACTOR) + do i = lower, upper, step + array1 (i) = array2(i) + end do + end subroutine + + subroutine prepare (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + + array1 = 2 + array2 = 0 + end subroutine + + subroutine check_equal (array1, array2) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: 
i + + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_equal_at_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i + + do i=lower, upper, step + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do + end subroutine + + subroutine check_unchanged_at_non_steps (array1, array2, lower, upper, step) + implicit none + + integer :: array1(:) + integer :: array2(:) + integer :: lower, upper, step + integer :: i, j + + do i=lower, upper,step + do j=i,i+step-1 + if (array2(j) /= 0) then + write (*,*) i + call abort + end if + end do + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + + call prepare (array1, array2) + call copy (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy2 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy3 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy4 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy5 (array1, array2) + call check_equal (array1, array2) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 100, 5) + call check_equal_at_steps (array1, array2, 1, 100, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 100, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 1, 50, 5) + call check_equal_at_steps (array1, array2, 1, 50, 5) + call check_unchanged_at_non_steps (array1, array2, 1, 50, 5) + + call prepare (array1, array2) + call copy6 (array1, array2, 3, 18, 7) + call check_equal_at_steps (array1, array2, 3 , 18, 7) + call check_unchanged_at_non_steps (array1, array2, 3, 18, 7) +end program diff --git 
a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 new file mode 100644 index 00000000000..02328464c0d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that divides the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 5 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 new file mode 100644 index 00000000000..60866ef33fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that does not divide the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 3 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 new file mode 100644 index 00000000000..6d8a2ef7bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90 @@ -0,0 +1,7 @@ +! { dg-additional-options "-O0 -g -cpp" } +! { dg-do run } + +! Check an unroll factor that is larger than the number of iterations +! of the loops in the test implementation. +#define UNROLL_FACTOR 113 +#include "unroll-7.f90" diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 new file mode 100644 index 00000000000..40506025aa3 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-8.f90 @@ -0,0 +1,38 @@ +! { dg-additional-options "-O0 -g" } +! { dg-additional-options "-fdump-tree-omp_transform_loops-details -fopt-info-optimized" } +! 
{ dg-do run } + +module test_functions +contains + subroutine copy (array1, array2, step, n) + implicit none + + integer :: array1(n) + integer :: array2(n) + integer :: i, step, n + + call omp_set_num_threads (4) + !$omp parallel do shared(array1) shared(array2) schedule(static, 4) + !$omp unroll partial(2) + do i = 1,n + array1(i) = array2(i) + end do + end subroutine +end module test_functions + +program test + use test_functions + implicit none + + integer :: array1(100), array2(100) + integer :: i + + array1 = 2 + call copy(array1, array2, 1, 100) + do i=1,100 + if (array1(i) /= array2(i)) then + write (*,*) i + call abort + end if + end do +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 new file mode 100644 index 00000000000..5fb64ddd6fd --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90 @@ -0,0 +1,33 @@ +! { dg-options "-fno-openmp -fopenmp-simd" } +! { dg-additional-options "-fdump-tree-original" } +! { dg-do run } + +module test_functions + contains + integer function compute_sum() result(sum) + implicit none + + integer :: i,j + + !$omp simd + do i = 1,10,3 + !$omp unroll full + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function compute_sum +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum () + write (*,*) result + if (result .ne. 
16) then + call abort + end if +end program From patchwork Fri Mar 24 15:30:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Frederik Harwath X-Patchwork-Id: 66857 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 806A03882166 for ; Fri, 24 Mar 2023 15:33:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 8A7AE3858291 for ; Fri, 24 Mar 2023 15:33:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8A7AE3858291 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,288,1673942400"; d="scan'208";a="324687" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 24 Mar 2023 07:31:09 -0800 IronPort-SDR: lsb2wNYMdtH0XpBBY3uUbFl9IQG5kWvXclcPgp7V1k8yHR493GLOXyAdyGeJj1i7yXe6nJ032b +NX/2LgLqXyCuzaXCyBLRZp0vUZ7WsL2KYZMrL7YTreXykND+BhNYO6AvNOb01Cm8CLmPq5aSy teaQL1XYlZF2EziKZKSXz7a8RIrV0Jgx+fHlaub4/emMaG05NVqFno9tSCfmf3UlIz/nQkmgKU bDZ0xp3/tq1z+SGUfHr0etKa4vkGz0+xWjiNf3OyCF49+AXU4wxfFzX01C40VjVTnq0fXN08/6 ewc= From: Frederik Harwath To: , , , , Subject: [PATCH 2/7] openmp: Add C/C++ support for "omp unroll" directive Date: Fri, 24 Mar 2023 16:30:40 +0100 Message-ID: <20230324153046.3996092-3-frederik@codesourcery.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To 
svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This commit implements the C and the C++ front end changes to support the "omp unroll" directive. The execution of the loop transformation relies on the pass that has been added as a part of the earlier Fortran patch. gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): Handle OMP_UNROLL. * c-omp.cc: Add "unroll" to c_omp_directives[]. * c-pragma.cc: Add "unroll" to omp_pragmas_simd[]. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_UNROLL to pragma_kind and adjust PRAGMA_OMP__LAST_. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_name): Handle "full" and "partial" clauses. (check_no_duplicate_clause): Change return type to bool and return check result. (c_parser_omp_clause_unroll_full): New function for parsing the "full" clause. (c_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. (c_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (OMP_UNROLL_CLAUSE_MASK): New definition. (c_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (c_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. 
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_gimplify_expr): Handle OMP_UNROLL. (cp_fold_r): Likewise. (cp_genericize_r): Likewise. * parser.cc (cp_parser_omp_clause_name): Handle "full" and "partial" clauses. (check_no_duplicate_clause): Change return type to bool and return check result. (cp_parser_omp_clause_unroll_full): New function for parsing the "full" clause. (cp_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. (cp_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (cp_parser_omp_for_loop): Handle "omp unroll" directives between directive and loop. (OMP_UNROLL_CLAUSE_MASK): New definition. (cp_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (cp_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. (cp_parser_pragma): Handle PRAGMA_OMP_UNROLL. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_NONE. (tsubst_expr): Handle OMP_UNROLL. * semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/unroll-1.C: New test. * testsuite/libgomp.c++/loop-transforms/unroll-2.C: New test. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: New test. * c-c++-common/gomp/loop-transforms/unroll-3.c: New test. * c-c++-common/gomp/loop-transforms/unroll-4.c: New test. * c-c++-common/gomp/loop-transforms/unroll-5.c: New test. * c-c++-common/gomp/loop-transforms/unroll-6.c: New test. 
* g++.dg/gomp/loop-transforms/unroll-1.C: New test. * g++.dg/gomp/loop-transforms/unroll-2.C: New test. * g++.dg/gomp/loop-transforms/unroll-3.C: New test. --- gcc/c-family/c-gimplify.cc | 1 + gcc/c-family/c-omp.cc | 6 +- gcc/c-family/c-pragma.cc | 1 + gcc/c-family/c-pragma.h | 5 +- gcc/c/c-parser.cc | 161 ++++++++++++++++- gcc/c/c-typeck.cc | 8 + gcc/cp/cp-gimplify.cc | 3 + gcc/cp/parser.cc | 164 +++++++++++++++++- gcc/cp/pt.cc | 4 + gcc/cp/semantics.cc | 56 ++++++ .../gomp/loop-transforms/unroll-1.c | 133 ++++++++++++++ .../gomp/loop-transforms/unroll-2.c | 99 +++++++++++ .../gomp/loop-transforms/unroll-3.c | 18 ++ .../gomp/loop-transforms/unroll-4.c | 19 ++ .../gomp/loop-transforms/unroll-5.c | 19 ++ .../gomp/loop-transforms/unroll-6.c | 20 +++ .../gomp/loop-transforms/unroll-7.c | 144 +++++++++++++++ .../gomp/loop-transforms/unroll-simd-1.c | 84 +++++++++ .../g++.dg/gomp/loop-transforms/unroll-1.C | 42 +++++ .../g++.dg/gomp/loop-transforms/unroll-2.C | 47 +++++ .../g++.dg/gomp/loop-transforms/unroll-3.C | 37 ++++ .../libgomp.c++/loop-transforms/unroll-1.C | 73 ++++++++ .../libgomp.c++/loop-transforms/unroll-2.C | 34 ++++ .../loop-transforms/unroll-1.c | 76 ++++++++ 24 files changed, 1246 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C create mode 100644 
gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c -- 2.36.1 diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc index ef5c7d919fc..82c88bd70e1 100644 --- a/gcc/c-family/c-gimplify.cc +++ b/gcc/c-family/c-gimplify.cc @@ -506,6 +506,7 @@ c_genericize_control_stmt (tree *stmt_p, int *walk_subtrees, void *data, case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: genericize_omp_for_stmt (stmt_p, walk_subtrees, data, func, lh); break; diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index f72ca4c6acd..85ba9c528c8 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -3212,9 +3212,9 @@ const struct c_omp_directive c_omp_directives[] = { { "teams", nullptr, nullptr, PRAGMA_OMP_TEAMS, C_OMP_DIR_CONSTRUCT, true }, { "threadprivate", nullptr, nullptr, PRAGMA_OMP_THREADPRIVATE, - C_OMP_DIR_DECLARATIVE, false } - /* { "unroll", nullptr, nullptr, PRAGMA_OMP_UNROLL, - C_OMP_DIR_CONSTRUCT, false }, */ + C_OMP_DIR_DECLARATIVE, false }, + { "unroll", nullptr, nullptr, PRAGMA_OMP_UNROLL, + C_OMP_DIR_CONSTRUCT, false }, }; /* Find (non-combined/composite) OpenMP directive (if any) which starts diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc index 0d2b333cebb..96a28ac1b0c 100644 --- a/gcc/c-family/c-pragma.cc +++ b/gcc/c-family/c-pragma.cc @@ -1593,6 +1593,7 @@ static const struct omp_pragma_def omp_pragmas_simd[] = { { 
"target", PRAGMA_OMP_TARGET }, { "taskloop", PRAGMA_OMP_TASKLOOP }, { "teams", PRAGMA_OMP_TEAMS }, + { "unroll", PRAGMA_OMP_UNROLL }, }; void diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index 9cc95ab3ee3..6686abdc94d 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -81,8 +81,9 @@ enum pragma_kind { PRAGMA_OMP_TASKYIELD, PRAGMA_OMP_THREADPRIVATE, PRAGMA_OMP_TEAMS, + PRAGMA_OMP_UNROLL, /* PRAGMA_OMP__LAST_ should be equal to the last PRAGMA_OMP_* code. */ - PRAGMA_OMP__LAST_ = PRAGMA_OMP_TEAMS, + PRAGMA_OMP__LAST_ = PRAGMA_OMP_UNROLL, PRAGMA_GCC_PCH_PREPROCESS, PRAGMA_IVDEP, @@ -118,6 +119,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_FIRSTPRIVATE, PRAGMA_OMP_CLAUSE_FOR, PRAGMA_OMP_CLAUSE_FROM, + PRAGMA_OMP_CLAUSE_FULL, PRAGMA_OMP_CLAUSE_GRAINSIZE, PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR, PRAGMA_OMP_CLAUSE_HINT, @@ -140,6 +142,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_ORDER, PRAGMA_OMP_CLAUSE_ORDERED, PRAGMA_OMP_CLAUSE_PARALLEL, + PRAGMA_OMP_CLAUSE_PARTIAL, PRAGMA_OMP_CLAUSE_PRIORITY, PRAGMA_OMP_CLAUSE_PRIVATE, PRAGMA_OMP_CLAUSE_PROC_BIND, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 21bc3167ce2..9d875befccc 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -13471,6 +13471,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE; else if (!strcmp ("from", p)) result = PRAGMA_OMP_CLAUSE_FROM; + else if (!strcmp ("full", p)) + result = PRAGMA_OMP_CLAUSE_FULL; break; case 'g': if (!strcmp ("gang", p)) @@ -13545,6 +13547,8 @@ c_parser_omp_clause_name (c_parser *parser) case 'p': if (!strcmp ("parallel", p)) result = PRAGMA_OMP_CLAUSE_PARALLEL; + else if (!strcmp ("partial", p)) + result = PRAGMA_OMP_CLAUSE_PARTIAL; else if (!strcmp ("present", p)) result = PRAGMA_OACC_CLAUSE_PRESENT; /* As of OpenACC 2.5, these are now aliases of the non-present_or @@ -13639,12 +13643,15 @@ c_parser_omp_clause_name (c_parser *parser) /* Validate that a clause of the given type does not already 
exist. */ -static void +static bool check_no_duplicate_clause (tree clauses, enum omp_clause_code code, const char *name) { - if (tree c = omp_find_clause (clauses, code)) + tree c = omp_find_clause (clauses, code); + if (c) error_at (OMP_CLAUSE_LOCATION (c), "too many %qs clauses", name); + + return c == NULL_TREE; } /* OpenACC 2.0 @@ -17448,6 +17455,65 @@ c_parser_omp_clause_uniform (c_parser *parser, tree list) return list; } +/* OpenMP 5.1 + full */ + +static tree +c_parser_omp_clause_unroll_full (c_parser *parser, tree list) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_FULL, "full")) + return list; + + location_t loc = c_parser_peek_token (parser)->location; + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_CHAIN (c) = list; + return c; +} + +/* OpenMP 5.1 + partial ( constant-expression ) */ + +static tree +c_parser_omp_clause_unroll_partial (c_parser *parser, tree list) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_PARTIAL, "partial")) + return list; + + tree c, num = error_mark_node; + HOST_WIDE_INT n; + location_t loc; + + loc = c_parser_peek_token (parser)->location; + c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = NULL_TREE; + OMP_CLAUSE_CHAIN (c) = list; + + if (!c_parser_next_token_is (parser, CPP_OPEN_PAREN)) + return c; + + matching_parens parens; + parens.consume_open (parser); + num = c_parser_expr_no_commas (parser, NULL).value; + parens.skip_until_found_close (parser); + + if (num == error_mark_node) + return list; + + mark_exp_read (num); + num = c_fully_fold (num, false, NULL); + if (!INTEGRAL_TYPE_P (TREE_TYPE (num)) || !tree_fits_shwi_p (num) + || (n = tree_to_shwi (num)) <= 0 || (int)n != n) + { + error_at (loc, + "partial argument needs positive constant integer expression"); + return list; + } + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = num; + + return c; +} + /* OpenMP 5.0: detach ( event-handle ) */ @@ -18042,6 +18108,14 @@ 
c_parser_omp_all_clauses (c_parser *parser, omp_clause_mask mask, clauses); c_name = "enter"; break; + case PRAGMA_OMP_CLAUSE_FULL: + c_name = "full"; + clauses = c_parser_omp_clause_unroll_full (parser, clauses); + break; + case PRAGMA_OMP_CLAUSE_PARTIAL: + c_name = "partial"; + clauses = c_parser_omp_clause_unroll_partial (parser, clauses); + break; default: c_parser_error (parser, "expected %<#pragma omp%> clause"); goto saw_error; @@ -20169,6 +20243,8 @@ c_parser_omp_scan_loop_body (c_parser *parser, bool open_brace_parsed) "expected %<}%>"); } +static bool c_parser_nested_omp_unroll_clauses (c_parser *, tree &); + /* Parse the restricted form of loop statements allowed by OpenACC and OpenMP. The real trick here is to determine the loop control variable early so that we can push a new decl if necessary to make it private. @@ -20227,6 +20303,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, condv = make_tree_vec (count); incrv = make_tree_vec (count); + if (c_parser_nested_omp_unroll_clauses (parser, clauses) + && count > 1) + { + error_at (loc, "collapse cannot be larger than 1 on an unrolled loop"); + return NULL; + } + if (!c_parser_next_token_is_keyword (parser, RID_FOR)) { c_parser_error (parser, "for statement expected"); @@ -23858,6 +23941,76 @@ c_parser_omp_taskloop (location_t loc, c_parser *parser, return ret; } +#define OMP_UNROLL_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) + +/* Parse zero or more '#pragma omp unroll' that follow + another directive that requires a canonical loop nest. 
*/ + +static bool +c_parser_nested_omp_unroll_clauses (c_parser *parser, tree &clauses) +{ + static const char *p_name = "#pragma omp unroll"; + c_token *tok; + bool found_unroll = false; + while (c_parser_next_token_is (parser, CPP_PRAGMA) + && (tok = c_parser_peek_token (parser), + tok->pragma_kind == PRAGMA_OMP_UNROLL)) + { + c_parser_consume_pragma (parser); + tree c = c_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + p_name, true); + if (c) + { + gcc_assert (!TREE_CHAIN (c)); + found_unroll = true; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_FULL) + { + error_at (tok->location, "%<full%> clause is invalid here; " + "turns loop into non-loop"); + continue; + } + } + else + { + error_at (tok->location, "%<#pragma omp unroll%> without " + "%<partial%> clause is invalid here; " + "turns loop into non-loop"); + continue; + } + + clauses = chainon (clauses, c); + } + + return found_unroll; +} + +static tree +c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) +{ + tree block, ret; + static const char *p_name = "#pragma omp unroll"; + omp_clause_mask mask = OMP_UNROLL_CLAUSE_MASK; + + tree clauses = c_parser_omp_all_clauses (parser, mask, p_name, false); + c_parser_nested_omp_unroll_clauses (parser, clauses); + + if (!clauses) + { + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_CHAIN (c) = clauses; + clauses = c; + } + + block = c_begin_compound_stmt (true); + ret = c_parser_omp_for_loop (loc, parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = c_end_compound_stmt (loc, block, true); + add_stmt (block); + + return ret; +} + /* OpenMP 5.1 #pragma omp nothing new-line */ @@ -24249,6 +24402,7 @@ c_parser_omp_construct (c_parser *parser, bool *if_p) p_kind = c_parser_peek_token (parser)->pragma_kind; c_parser_consume_pragma (parser); + gcc_assert (parser->in_pragma); switch (p_kind) { case PRAGMA_OACC_ATOMIC: @@ -24342,6 +24496,9 @@ c_parser_omp_construct (c_parser *parser, bool *if_p) case PRAGMA_OMP_ASSUME: c_parser_omp_assume 
(parser, if_p); return; + case PRAGMA_OMP_UNROLL: + stmt = c_parser_omp_unroll (loc, parser, if_p); + break; default: gcc_unreachable (); } diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 45bacc06c47..bffea79b441 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -15916,6 +15916,14 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) pc = &OMP_CLAUSE_CHAIN (c); continue; + case OMP_CLAUSE_UNROLL_FULL: + pc = &OMP_CLAUSE_CHAIN (c); + continue; + + case OMP_CLAUSE_UNROLL_PARTIAL: + pc = &OMP_CLAUSE_CHAIN (c); + continue; + case OMP_CLAUSE_INBRANCH: case OMP_CLAUSE_NOTINBRANCH: if (branch_seen) diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index 4fecd5616bd..bf81097d780 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -638,6 +638,7 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: ret = cp_gimplify_omp_for (expr_p, pre_p); break; @@ -1097,6 +1098,7 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_) case OMP_DISTRIBUTE: case OMP_LOOP: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: cp_walk_tree (&OMP_FOR_BODY (stmt), cp_fold_r, data, NULL); cp_walk_tree (&OMP_FOR_CLAUSES (stmt), cp_fold_r, data, NULL); @@ -1855,6 +1857,7 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data) case OMP_FOR: case OMP_SIMD: case OMP_LOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: case STATEMENT_LIST: /* These cases are handled by shared code. 
*/ diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index a277003ea58..7034fdf49a4 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -37204,6 +37204,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE; else if (!strcmp ("from", p)) result = PRAGMA_OMP_CLAUSE_FROM; + else if (!strcmp ("full", p)) + result = PRAGMA_OMP_CLAUSE_FULL; break; case 'g': if (!strcmp ("gang", p)) @@ -37278,6 +37280,8 @@ cp_parser_omp_clause_name (cp_parser *parser) case 'p': if (!strcmp ("parallel", p)) result = PRAGMA_OMP_CLAUSE_PARALLEL; + if (!strcmp ("partial", p)) + result = PRAGMA_OMP_CLAUSE_PARTIAL; else if (!strcmp ("present", p)) result = PRAGMA_OACC_CLAUSE_PRESENT; else if (!strcmp ("present_or_copy", p) @@ -37368,12 +37372,15 @@ cp_parser_omp_clause_name (cp_parser *parser) /* Validate that a clause of the given type does not already exist. */ -static void +static bool check_no_duplicate_clause (tree clauses, enum omp_clause_code code, const char *name, location_t location) { - if (omp_find_clause (clauses, code)) + bool found = omp_find_clause (clauses, code); + if (found) error_at (location, "too many %qs clauses", name); + + return !found; } /* OpenMP 2.5: @@ -39459,6 +39466,56 @@ cp_parser_omp_clause_thread_limit (cp_parser *parser, tree list, return c; } +/* OpenMP 5.1 + full */ + +static tree +cp_parser_omp_clause_unroll_full (tree list, location_t loc) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_FULL, "full", loc)) + return list; + + tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_CHAIN (c) = list; + return c; +} + +/* OpenMP 5.1 + partial ( constant-expression ) */ + +static tree +cp_parser_omp_clause_unroll_partial (cp_parser *parser, tree list, + location_t loc) +{ + if (!check_no_duplicate_clause (list, OMP_CLAUSE_UNROLL_PARTIAL, "partial", + loc)) + return list; + + tree c, num = error_mark_node; + c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_UNROLL_PARTIAL_EXPR 
(c) = NULL_TREE; + OMP_CLAUSE_CHAIN (c) = list; + + if (!cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) + return c; + + matching_parens parens; + parens.consume_open (parser); + num = cp_parser_constant_expression (parser); + cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true, + /*or_comma=*/false, + /*consume_paren=*/true); + + if (num == error_mark_node) + return list; + + mark_exp_read (num); + num = fold_non_dependent_expr (num); + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = num; + return c; +} + /* OpenMP 4.0: aligned ( variable-list ) aligned ( variable-list : constant-expression ) */ @@ -41441,6 +41498,15 @@ cp_parser_omp_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses); c_name = "enter"; break; + case PRAGMA_OMP_CLAUSE_PARTIAL: + clauses = cp_parser_omp_clause_unroll_partial (parser, clauses, + token->location); + c_name = "partial"; + break; + case PRAGMA_OMP_CLAUSE_FULL: + clauses = cp_parser_omp_clause_unroll_full(clauses, token->location); + c_name = "full"; + break; default: cp_parser_error (parser, "expected %<#pragma omp%> clause"); goto saw_error; @@ -43565,6 +43631,8 @@ cp_parser_omp_scan_loop_body (cp_parser *parser) braces.require_close (parser); } +static bool cp_parser_nested_omp_unroll_clauses (cp_parser *, tree &); + /* Parse the restricted form of the for statement allowed by OpenMP. 
*/ + static tree @@ -43622,6 +43690,15 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, loc_first = cp_lexer_peek_token (parser->lexer)->location; + if (cp_parser_nested_omp_unroll_clauses (parser, clauses) + && count > 1) + { + error_at (loc_first, + "collapse cannot be larger than 1 on an unrolled loop"); + return NULL; + } + + for (i = 0; i < count; i++) { int bracecount = 0; @@ -45657,6 +45734,79 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, return true; } +#define OMP_UNROLL_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) + +/* Parse zero or more '#pragma omp unroll' that follow + another directive that requires a canonical loop nest. */ + +static bool +cp_parser_nested_omp_unroll_clauses (cp_parser *parser, tree &clauses) +{ + static const char *p_name = "#pragma omp unroll"; + cp_token *tok; + bool unroll_found = false; + while (cp_lexer_next_token_is (parser->lexer, CPP_PRAGMA) + && (tok = cp_lexer_peek_token (parser->lexer), + cp_parser_pragma_kind (tok) == PRAGMA_OMP_UNROLL)) + { + cp_lexer_consume_token (parser->lexer); + gcc_assert (tok->type == CPP_PRAGMA); + parser->lexer->in_pragma = true; + tree c = cp_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + p_name, tok); + if (c) + { + gcc_assert (!TREE_CHAIN (c)); + unroll_found = true; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_FULL) + { + error_at (tok->location, "%<full%> clause is invalid here; " + "turns loop into non-loop"); + continue; + } + + c = finish_omp_clauses (c, C_ORT_OMP); + } + else + { + error_at (tok->location, "%<#pragma omp unroll%> without " + "%<partial%> clause is invalid here; " + "turns loop into non-loop"); + continue; + } + clauses = chainon (clauses, c); + } + return unroll_found; +} + +static tree +cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) +{ + tree block, ret; + static const char *p_name = "#pragma omp unroll"; + omp_clause_mask 
mask = OMP_UNROLL_CLAUSE_MASK; + + tree clauses = cp_parser_omp_all_clauses (parser, mask, p_name, tok, false); + + if (!clauses) + { + tree c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_CHAIN (c) = clauses; + clauses = c; + } + + cp_parser_nested_omp_unroll_clauses (parser, clauses); + + block = begin_omp_structured_block (); + ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = finish_omp_structured_block (block); + add_stmt (block); + + return ret; +} + /* OpenACC 2.0: # pragma acc cache (variable-list) new-line */ @@ -48750,6 +48900,9 @@ cp_parser_omp_construct (cp_parser *parser, cp_token *pragma_tok, bool *if_p) case PRAGMA_OMP_ASSUME: cp_parser_omp_assume (parser, pragma_tok, if_p); return; + case PRAGMA_OMP_UNROLL: + stmt = cp_parser_omp_unroll (parser, pragma_tok, if_p); + break; default: gcc_unreachable (); } @@ -49376,6 +49529,13 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p) cp_parser_omp_construct (parser, pragma_tok, if_p); pop_omp_privatization_clauses (stmt); return true; + case PRAGMA_OMP_UNROLL: + if (context != pragma_stmt && context != pragma_compound) + goto bad_stmt; + stmt = push_omp_privatization_clauses (false); + cp_parser_omp_construct (parser, pragma_tok, if_p); + pop_omp_privatization_clauses (stmt); + return true; case PRAGMA_OMP_REQUIRES: if (context != pragma_external) diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 40deedc9ba9..63b2d1f7a45 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18086,6 +18086,7 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, case OMP_CLAUSE_ASYNC: case OMP_CLAUSE_WAIT: case OMP_CLAUSE_DETACH: + case OMP_CLAUSE_UNROLL_PARTIAL: OMP_CLAUSE_OPERAND (nc, 0) = tsubst_expr (OMP_CLAUSE_OPERAND (oc, 0), args, complain, in_decl); break; @@ -18169,6 +18170,8 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_NOHOST: + case 
OMP_CLAUSE_UNROLL_FULL: + case OMP_CLAUSE_UNROLL_NONE: break; default: gcc_unreachable (); @@ -19437,6 +19440,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl) case OMP_SIMD: case OMP_DISTRIBUTE: case OMP_TASKLOOP: + case OMP_LOOP_TRANS: case OACC_LOOP: { tree clauses, body, pre_body; diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 99a76e3ed65..ac49502eea4 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -6779,6 +6779,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) bool mergeable_seen = false; bool implicit_moved = false; bool target_in_reduction_seen = false; + bool unroll_full_seen = false; bitmap_obstack_initialize (NULL); bitmap_initialize (&generic_head, &bitmap_default_obstack); @@ -8822,6 +8823,61 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } break; + case OMP_CLAUSE_UNROLL_FULL: + if (unroll_full_seen) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<full%> appears more than once"); + remove = true; + } + unroll_full_seen = true; + break; + + case OMP_CLAUSE_UNROLL_PARTIAL: + { + + tree t = OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c); + + if (!t) + break; + + if (t == error_mark_node) + remove = true; + else if (!type_dependent_expression_p (t) + && !INTEGRAL_TYPE_P (TREE_TYPE (t))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "partial argument needs integral type"); + remove = true; + } + else + { + t = mark_rvalue_use (t); + if (!processing_template_decl) + { + t = maybe_constant_value (t); + + int n; + if (!INTEGRAL_TYPE_P (TREE_TYPE (t)) + || !tree_fits_shwi_p (t) + || (n = tree_to_shwi (t)) <= 0 || (int)n != n) + { + error_at (OMP_CLAUSE_LOCATION (c), + "partial argument needs positive constant " + "integer expression"); + remove = true; + } + t = fold_build_cleanup_point_expr (TREE_TYPE (t), t); + } + } + + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = t; + } + break; + + case OMP_CLAUSE_UNROLL_NONE: + break; + default: gcc_unreachable (); } diff --git 
a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c new file mode 100644 index 00000000000..d496dc29053 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-1.c @@ -0,0 +1,133 @@ +extern void dummy (int); + +void +test1 () +{ +#pragma omp unroll partial + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test2 () +{ +#pragma omp unroll partial(10) + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test3 () +{ +#pragma omp unroll full + for (int i = 0; i < 100; ++i) + dummy (i); +} + +void +test4 () +{ +#pragma omp unroll full + for (int i = 0; i > 100; ++i) + dummy (i); +} + +void +test5 () +{ +#pragma omp unroll full + for (int i = 1; i <= 100; ++i) + dummy (i); +} + +void +test6 () +{ +#pragma omp unroll full + for (int i = 200; i >= 100; i--) + dummy (i); +} + +void +test7 () +{ +#pragma omp unroll full + for (int i = -100; i > 100; ++i) + dummy (i); +} + +void +test8 () +{ +#pragma omp unroll full + for (int i = 100; i > -200; --i) + dummy (i); +} + +void +test9 () +{ +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test10 () +{ +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test12 () +{ +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int i = -300; i != 100; ++i) + dummy (i); +} + +void +test13 () +{ + for (int i = 0; i < 100; ++i) +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test14 () +{ + #pragma omp for + for (int i = 0; i < 100; ++i) +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j = -300; j != 100; ++j) + dummy (i); +} + +void +test15 () +{ + #pragma omp for + for (int i = 0; i < 100; ++i) + { + + dummy (i); + +#pragma omp unroll full +#pragma omp unroll partial +#pragma omp unroll partial + for (int j 
= -300; j != 100; ++j) + dummy (j); + + dummy (i); + } + } diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c new file mode 100644 index 00000000000..8f7c3088a2e --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c @@ -0,0 +1,99 @@ +/* { dg-prune-output "error: invalid controlling predicate" } */ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ +#pragma omp unroll partial +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ +#pragma omp unroll partial + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ +#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial partial /* { dg-error {too many 'partial' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll full full /* { dg-error {too many 'full' clauses} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial +#pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + + int i; +#pragma omp for +#pragma omp unroll( /* { dg-error {expected '#pragma omp' clause before '\(' token} } */ + /* { dg-error {'#pragma omp 
unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll foo /* { dg-error {expected '#pragma omp' clause before 'foo'} } */ + /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial( /* { dg-error {expected expression before end of line} "" { target c } } */ + /* { dg-error {expected primary-expression before end of line} "" { target c++ } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial() /* { dg-error {expected expression before '\)' token} "" { target c } } */ + /* { dg-error {expected primary-expression before '\)' token} "" { target c++ } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll partial(i) + /* { dg-error {the value of 'i' is not usable in a constant expression} "" { target c++ } .-1 } */ + /* { dg-error {partial argument needs positive constant integer expression} "" { target c } .-2 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp unroll parti /* { dg-error {expected '#pragma omp' clause before 'parti'} } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(1) +#pragma omp unroll parti /* { dg-error {expected '#pragma omp' clause before 'parti'} } */ + /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ + for (int i = -300; i != 100; ++i) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(1) +#pragma omp unroll parti /* { dg-error {expected '#pragma omp' clause before 'parti'} } */ + /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ + for (int i = -300; i != 
100; ++i) + dummy (i); + +int sum = 0; +#pragma omp parallel for reduction(+ : sum) collapse(2) /* { dg-error {collapse cannot be larger than 1 on an unrolled loop} "" { target c } } */ +#pragma omp unroll partial(1) /* { dg-error {collapse cannot be larger than 1 on an unrolled loop} "" { target c++ } } */ + for (int i = 3; i < 10; ++i) + for (int j = -2; j < 7; ++j) + sum++; +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c new file mode 100644 index 00000000000..7ace5657b26 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-3.c @@ -0,0 +1,18 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll full + for (int i = 0; i < 10; i++) + dummy (i); +} + + /* Loop should be removed with 10 copies of the body remaining + * { dg-final { scan-tree-dump-times "dummy" 10 "omp_transform_loops" } } + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c new file mode 100644 index 00000000000..5e473a099d3 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-4.c @@ -0,0 +1,19 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should not be unrolled, but the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { 
scan-tree-dump-times "dummy" 1 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c new file mode 100644 index 00000000000..9d5101bdc60 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-5.c @@ -0,0 +1,19 @@ +/* { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll partial /* { dg-optimized {'partial' clause without unrolling factor turned into 'partial\(5\)' clause} } */ + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should be unrolled 5 times and the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times "dummy" 5 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c new file mode 100644 index 00000000000..ee2d000239d --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-6.c @@ -0,0 +1,20 @@ +/* { dg-additional-options "--param=omp-unroll-default-factor=100" } + * { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } + * { dg-additional-options "-fdump-tree-original" } */ + +extern void dummy (int); + +void +test1 () +{ + int i; +#pragma omp unroll /* { dg-optimized {added 'partial\(100\)' clause to 'omp unroll' directive} } */ + for (int i = 0; i < 100; i++) + dummy (i); +} + +/* Loop should be unrolled 100 times 
and the internal representation should be lowered + * { dg-final { scan-tree-dump "#pragma omp loop_transform unroll_none" "original" } } + * { dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } + * { dg-final { scan-tree-dump-times {if \(i < .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } */ diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c new file mode 100644 index 00000000000..0458cb030a9 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-7.c @@ -0,0 +1,144 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +test1 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i < data + 10 ; i++) + { + ASSERT_EQ (*i, data[iter]); + ASSERT_EQ_PTR (i, data + iter); + iter++; + } + + return iter; +} + +int +test2 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i < data + 10 ; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + return iter; +} + +int +test3 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i <= data + 9 ; i=i+2) + { + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + return iter; +} + +int +test4 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(8) + for (i = data; i != data + 10 ; i=i+1) + { + ASSERT_EQ (*i, data[iter]); + iter++; + } + + return iter; +} + +int +test5 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp 
unroll partial(7) + for (i = data + 9; i >= data ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +test6 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp unroll partial(7) + for (i = data + 9; i > data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +test7 (int data[10]) +{ + int iter = 0; + #pragma omp unroll partial(7) + for (int *i = data + 9; i != data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; + + iter_count = test1 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test2 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test3 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test4 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test5 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test6 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test7 (data); + ASSERT_EQ (iter_count, 10); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c new file mode 100644 index 00000000000..1cd4d6e7322 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-simd-1.c @@ -0,0 +1,84 @@ +/* { dg-options "-fno-openmp -fopenmp-simd" } */ +/* { dg-do run } */ +/* { dg-additional-options "-fdump-tree-original" } */ +/* { dg-additional-options "-fdump-tree-omp_transform_loops" } */ + +#include + +int compute_sum1 () +{ + int sum = 0; + int i,j; + +#pragma omp simd reduction(+:sum) + for (i = 3; i < 10; ++i) + #pragma omp unroll full + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int compute_sum2() +{ + int sum = 0; + int i,j; +#pragma omp simd reduction(+:sum) +#pragma omp unroll partial(5) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + 
return sum; +} + +int compute_sum3() +{ + int sum = 0; + int i,j; +#pragma omp simd reduction(+:sum) +#pragma omp unroll partial(1) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int main () +{ + int result = compute_sum1 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum2 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum3 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +} + +/* { dg-final { scan-tree-dump {omp loop_transform} "original" } } */ +/* { dg-final { scan-tree-dump-not {omp loop_transform} "omp_transform_loops" } } */ diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C new file mode 100644 index 00000000000..cba37c88ebe --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-1.C @@ -0,0 +1,42 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } +#include <vector> + +extern void dummy (int); + +void +test1 () +{ + std::vector<int> v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp unroll partial(5) + for (int i : v) + dummy (i); +} + +void +test2 () +{ + std::vector<std::vector<int>> v; + + for (unsigned i = 0; i < 10; i++) + { + std::vector<int> u; + for (unsigned j = 0; j < 10; j++) + u.push_back (j); + v.push_back (u); + } + +#pragma omp for +#pragma omp unroll partial(5) + for (auto u : v) + for (int i : u) + dummy (i); +} diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C new file mode 100644 index 00000000000..f606f3de757 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-2.C @@ -0,0 
+1,47 @@ +// { dg-do link } +// { dg-additional-options "-std=c++11" } +#include + +extern void dummy (int); + +template void +test_template () +{ + std::vector v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp unroll partial(U1) + for (T i : v) + dummy (i); + +#pragma omp unroll partial(U2) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp unroll partial(U3) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(U1) + for (T i : v) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(U2) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); + +#pragma omp for +#pragma omp unroll partial(U3) // { dg-error {partial argument needs positive constant integer expression} } + for (T i : v) + dummy (i); +} + +void test () { test_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C new file mode 100644 index 00000000000..ae9f5500360 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/unroll-3.C @@ -0,0 +1,37 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-fdump-tree-omp_transform_loops -fopt-info-omp-optimized-missed" } +// { dg-additional-options "-fdump-tree-original" } +#include + +extern void dummy (int); + +constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 1 : fib (n-1) + fib (n-2); +} + +void +test1 () +{ + std::vector v; + + for (unsigned i = 0; i < 1000; i++) + v.push_back (i); + +#pragma omp unroll partial(fib(10)) + for (int i : v) + dummy (i); +} + + +// Loop should be unrolled fib(10) = 55 times +// ! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_partial\(55\)} "original" } } +// ! 
{ dg-final { scan-tree-dump-not "#pragma omp" "omp_transform_loops" } } +// ! { dg-final { scan-tree-dump-times "dummy" 55 "omp_transform_loops" } } + +// There should be one loop that fills the vector ... +// ! { dg-final { scan-tree-dump-times {if \(i.*? <= .+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } + +// ... and one resulting from the lowering of the unrolled loop +// ! { dg-final { scan-tree-dump-times {if \(D\.[0-9]+ < retval.+?.+goto.+else goto.*?$} 1 "omp_transform_loops" } } diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C new file mode 100644 index 00000000000..004eef91649 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-1.C @@ -0,0 +1,73 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include +#include + +constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 1 : fib (n-1) + fib (n-2); +} + +int +test1 () +{ + std::vector v; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (1); + + int sum = 0; + for (int k = 0; k < 10; k++) +#pragma omp unroll partial(fib(3)) + for (int i : v) { + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +test2 () +{ + std::vector v; + + for (unsigned i = 0; i <= 10; i++) + v.push_back (i); + + int sum = 0; +#pragma omp parallel for reduction(+:sum) + for (int k = 0; k < 10; k++) +#pragma omp unroll +#pragma omp unroll partial(fib(4)) + for (int i : v) + { + #pragma omp unroll full + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 1000) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + + result = test2 (); + if (result != 5500) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C 
b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C new file mode 100644 index 00000000000..90d2775c95b --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-2.C @@ -0,0 +1,34 @@ +// { dg-do run } +// { dg-additional-options "-std=c++11" } +#include +#include + +int +main () +{ + std::vector> v; + std::vector w; + + for (unsigned i = 0; i < 10; i++) + { + std::vector u; + for (unsigned j = 0; j < 10; j++) + u.push_back (j); + v.push_back (u); + } + +#pragma omp for +#pragma omp unroll partial(7) + for (auto u : v) + for (int x : u) + w.push_back (x); + + std::size_t l = w.size (); + for (std::size_t i = 0; i < l; i++) + { + if (w[i] != i % 10) + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c new file mode 100644 index 00000000000..2ac0fff16af --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c @@ -0,0 +1,76 @@ +#include + +int compute_sum1 () +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) lastprivate(j) +#pragma omp unroll partial + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int compute_sum2() +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) lastprivate(j) +#pragma omp unroll partial(5) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int compute_sum3() +{ + int sum = 0; + int i,j; +#pragma omp parallel for reduction(+:sum) +#pragma omp unroll partial(1) + for (i = 3; i < 10; ++i) + for (j = -2; j < 7; ++j) + sum++; + + if (j != 7) + __builtin_abort; + + return sum; +} + +int main () +{ + int result; + result = compute_sum1 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum2 (); + if 
(result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + result = compute_sum3 (); + if (result != 7 * 9) + { + fprintf (stderr, "%d: Wrong result %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +}

From patchwork Fri Mar 24 15:30:41 2023
X-Patchwork-Submitter: Frederik Harwath
X-Patchwork-Id: 66859
From: Frederik Harwath
Subject: [PATCH 3/7] openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE
Date: Fri, 24 Mar 2023 16:30:41 +0100
Message-ID: <20230324153046.3996092-4-frederik@codesourcery.com>
In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com>
References: <20230324153046.3996092-1-frederik@codesourcery.com>

OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation
construct "omp tile".

gcc/ChangeLog:

        * tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE to
        OMP_CLAUSE_OACC_TILE.
        * tree.h (OMP_CLAUSE_TILE_LIST): Rename to ...
        (OMP_CLAUSE_OACC_TILE_LIST): ... this.
        (OMP_CLAUSE_TILE_ITERVAR): Rename to ...
        (OMP_CLAUSE_OACC_TILE_ITERVAR): ... this.
        (OMP_CLAUSE_TILE_COUNT): Rename to ...
        (OMP_CLAUSE_OACC_TILE_COUNT): ... this.
        * gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings.
        (gimplify_adjust_omp_clauses): Likewise.
        (gimplify_omp_for): Likewise.
        * omp-general.cc (omp_extract_for_data): Likewise.
        * omp-low.cc (scan_sharing_clauses): Likewise.
        (lower_oacc_head_mark): Likewise.
        * tree-nested.cc (convert_nonlocal_omp_clauses): Likewise.
        (convert_local_omp_clauses): Likewise.
        * tree-pretty-print.cc (dump_omp_clause): Likewise.
        * tree.cc: Likewise.

gcc/c-family/ChangeLog:

        * c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings.

gcc/c/ChangeLog:

        * c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings.
        (c_parser_oacc_clause_tile): Likewise.
        (c_parser_omp_for_loop): Likewise.
        * c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

        * parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings.
        (cp_parser_omp_clause_collapse): Likewise.
        (cp_parser_omp_for_loop): Likewise.
        * pt.cc (tsubst_omp_clauses): Likewise.
        * semantics.cc (finish_omp_clauses): Likewise.
        (finish_omp_for): Likewise.

gcc/fortran/ChangeLog:

        * openmp.cc (enum omp_mask2): Adjust to renamings.
        (gfc_match_omp_clauses): Likewise.
        * trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
---
 gcc/c-family/c-omp.cc       |  2 +-
 gcc/c/c-parser.cc           | 12 ++++++------
 gcc/c/c-typeck.cc           |  2 +-
 gcc/cp/parser.cc            | 12 ++++++------
 gcc/cp/pt.cc                |  2 +-
 gcc/cp/semantics.cc         |  8 ++++----
 gcc/fortran/openmp.cc       |  6 +++---
 gcc/fortran/trans-openmp.cc |  4 ++--
 gcc/gimplify.cc             |  8 ++++----
 gcc/omp-general.cc          |  8 ++++----
 gcc/omp-low.cc              |  6 +++---
 gcc/tree-core.h             |  2 +-
 gcc/tree-nested.cc          |  4 ++--
 gcc/tree-pretty-print.cc    |  4 ++--
 gcc/tree.cc                 |  2 +-
 gcc/tree.h                  | 12 ++++++------
 16 files changed, 47 insertions(+), 47 deletions(-)

--
2.36.1

diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index 85ba9c528c8..fec7f337772 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -1749,7 +1749,7 @@ c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses, { /* Loop clauses. 
*/ case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 9d875befccc..e7c9da99552 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -14183,7 +14183,7 @@ c_parser_omp_clause_collapse (c_parser *parser, tree list) location_t loc; check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse"); - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile"); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile"); loc = c_parser_peek_token (parser)->location; matching_parens parens; @@ -15349,7 +15349,7 @@ c_parser_oacc_clause_tile (c_parser *parser, tree list) location_t loc; tree tile = NULL_TREE; - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile"); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile"); check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse"); loc = c_parser_peek_token (parser)->location; @@ -15401,9 +15401,9 @@ c_parser_oacc_clause_tile (c_parser *parser, tree list) /* Consume the trailing ')'. 
*/ c_parser_consume_token (parser); - c = build_omp_clause (loc, OMP_CLAUSE_TILE); + c = build_omp_clause (loc, OMP_CLAUSE_OACC_TILE); tile = nreverse (tile); - OMP_CLAUSE_TILE_LIST (c) = tile; + OMP_CLAUSE_OACC_TILE_LIST (c) = tile; OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -20270,10 +20270,10 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); - else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_TILE) + else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { tiling = true; - collapse = list_length (OMP_CLAUSE_TILE_LIST (cl)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED && OMP_CLAUSE_ORDERED_EXPR (cl)) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index bffea79b441..40df7bb0069 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -15872,7 +15872,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_NOHOST: diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 7034fdf49a4..90af40c4dbc 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -37981,7 +37981,7 @@ cp_parser_oacc_clause_tile (cp_parser *parser, location_t clause_loc, tree list) so, but the spec authors never considered such a case and have differing opinions on what it might mean, including 'not allowed'.) 
*/ - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile", clause_loc); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile", clause_loc); check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse", clause_loc); @@ -38010,9 +38010,9 @@ cp_parser_oacc_clause_tile (cp_parser *parser, location_t clause_loc, tree list) /* Consume the trailing ')'. */ cp_lexer_consume_token (parser->lexer); - c = build_omp_clause (clause_loc, OMP_CLAUSE_TILE); + c = build_omp_clause (clause_loc, OMP_CLAUSE_OACC_TILE); tile = nreverse (tile); - OMP_CLAUSE_TILE_LIST (c) = tile; + OMP_CLAUSE_OACC_TILE_LIST (c) = tile; OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -38125,7 +38125,7 @@ cp_parser_omp_clause_collapse (cp_parser *parser, tree list, location_t location } check_no_duplicate_clause (list, OMP_CLAUSE_COLLAPSE, "collapse", location); - check_no_duplicate_clause (list, OMP_CLAUSE_TILE, "tile", location); + check_no_duplicate_clause (list, OMP_CLAUSE_OACC_TILE, "tile", location); c = build_omp_clause (loc, OMP_CLAUSE_COLLAPSE); OMP_CLAUSE_CHAIN (c) = list; OMP_CLAUSE_COLLAPSE_EXPR (c) = num; @@ -43654,10 +43654,10 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); - else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_TILE) + else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { tiling = true; - collapse = list_length (OMP_CLAUSE_TILE_LIST (cl)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED && OMP_CLAUSE_ORDERED_EXPR (cl)) diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 63b2d1f7a45..16197b17e5a 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18061,7 +18061,7 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, = tsubst_expr (OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR (oc), args, complain, in_decl); /* FALLTHRU */ - case 
OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF: case OMP_CLAUSE_NUM_THREADS: case OMP_CLAUSE_SCHEDULE: diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index ac49502eea4..c87e252ff06 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -8729,8 +8729,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) mergeable_seen = true; break; - case OMP_CLAUSE_TILE: - for (tree list = OMP_CLAUSE_TILE_LIST (c); !remove && list; + case OMP_CLAUSE_OACC_TILE: + for (tree list = OMP_CLAUSE_OACC_TILE_LIST (c); !remove && list; list = TREE_CHAIN (list)) { t = TREE_VALUE (list); @@ -10498,9 +10498,9 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, { tree c; - c = omp_find_clause (clauses, OMP_CLAUSE_TILE); + c = omp_find_clause (clauses, OMP_CLAUSE_OACC_TILE); if (c) - collapse = list_length (OMP_CLAUSE_TILE_LIST (c)); + collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (c)); else { c = omp_find_clause (clauses, OMP_CLAUSE_COLLAPSE); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index e54f016b170..ec707d977cd 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -1075,7 +1075,7 @@ enum omp_mask2 OMP_CLAUSE_WAIT, OMP_CLAUSE_DELETE, OMP_CLAUSE_AUTO, - OMP_CLAUSE_TILE, + OMP_CLAUSE_OACC_TILE, OMP_CLAUSE_IF_PRESENT, OMP_CLAUSE_FINALIZE, OMP_CLAUSE_ATTACH, @@ -3478,7 +3478,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, c->threads = needs_space = true; continue; } - if ((mask & OMP_CLAUSE_TILE) + if ((mask & OMP_CLAUSE_OACC_TILE) && !c->tile_list && match_oacc_expr_list ("tile (", &c->tile_list, true) == MATCH_YES) @@ -3677,7 +3677,7 @@ error: (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER \ | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT \ | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_REDUCTION | OMP_CLAUSE_AUTO \ - | OMP_CLAUSE_TILE) + | OMP_CLAUSE_OACC_TILE) #define OACC_PARALLEL_LOOP_CLAUSES \ (OACC_LOOP_CLAUSES | OACC_PARALLEL_CLAUSES) #define 
OACC_KERNELS_LOOP_CLAUSES \ diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index c4a23f6e247..73c416c951d 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -4371,8 +4371,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, for (el = clauses->tile_list; el; el = el->next) vec_safe_push (tvec, gfc_convert_expr_to_tree (block, el->expr)); - c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_TILE); - OMP_CLAUSE_TILE_LIST (c) = build_tree_list_vec (tvec); + c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_OACC_TILE); + OMP_CLAUSE_OACC_TILE_LIST (c) = build_tree_list_vec (tvec); omp_clauses = gfc_trans_add_clause (c, omp_clauses); tvec->truncate (0); } diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 2c160686533..14616eb5316 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -11923,7 +11923,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, case OMP_CLAUSE_ORDERED: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: case OMP_CLAUSE_INDEPENDENT: @@ -13071,7 +13071,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_VECTOR: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: @@ -13970,9 +13970,9 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_COLLAPSE); if (c) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (c)); - c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_TILE); + c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_OACC_TILE); if (c) - tile = list_length (OMP_CLAUSE_TILE_LIST (c)); + tile = list_length (OMP_CLAUSE_OACC_TILE_LIST (c)); c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ALLOCATE); 
hash_set<tree> *allocate_uids = NULL; if (c) diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index e29d695dcba..0f326128874 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -271,12 +271,12 @@ omp_extract_for_data (gomp_for *for_stmt, struct omp_for_data *fd, collapse_count = &OMP_CLAUSE_COLLAPSE_COUNT (t); } break; - case OMP_CLAUSE_TILE: - fd->tiling = OMP_CLAUSE_TILE_LIST (t); + case OMP_CLAUSE_OACC_TILE: + fd->tiling = OMP_CLAUSE_OACC_TILE_LIST (t); fd->collapse = list_length (fd->tiling); gcc_assert (fd->collapse); - collapse_iter = &OMP_CLAUSE_TILE_ITERVAR (t); - collapse_count = &OMP_CLAUSE_TILE_COUNT (t); + collapse_iter = &OMP_CLAUSE_OACC_TILE_ITERVAR (t); + collapse_count = &OMP_CLAUSE_OACC_TILE_COUNT (t); break; case OMP_CLAUSE__REDUCTEMP_: fd->have_reductemp = true; diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 1818132830f..b5b2134ab17 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -1744,7 +1744,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE__SIMT_: case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_NONTEMPORAL: @@ -1963,7 +1963,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE__SIMT_: case OMP_CLAUSE_IF_PRESENT: case OMP_CLAUSE_FINALIZE: @@ -8376,7 +8376,7 @@ lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses, tag |= OLF_INDEPENDENT; break; - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: tag |= OLF_TILE; break; diff --git a/gcc/tree-core.h b/gcc/tree-core.h index e563408877e..f1429824158 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -515,7 +515,7 @@ enum omp_clause_code { OMP_CLAUSE_VECTOR_LENGTH, /* OpenACC clause: tile ( size-expr-list ). */ - OMP_CLAUSE_TILE, + OMP_CLAUSE_OACC_TILE, /* OpenACC clause: if_present.
*/ OMP_CLAUSE_IF_PRESENT, diff --git a/gcc/tree-nested.cc b/gcc/tree-nested.cc index 1418e1f7f56..ed115b5eb3f 100644 --- a/gcc/tree-nested.cc +++ b/gcc/tree-nested.cc @@ -1474,7 +1474,7 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_COPYIN: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_MERGEABLE: case OMP_CLAUSE_PROC_BIND: @@ -2270,7 +2270,7 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_DEFAULT: case OMP_CLAUSE_COPYIN: case OMP_CLAUSE_COLLAPSE: - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: case OMP_CLAUSE_UNTIED: case OMP_CLAUSE_MERGEABLE: case OMP_CLAUSE_PROC_BIND: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 588a992bcf3..cae81719e68 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -1416,9 +1416,9 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case OMP_CLAUSE_INDEPENDENT: pp_string (pp, "independent"); break; - case OMP_CLAUSE_TILE: + case OMP_CLAUSE_OACC_TILE: pp_string (pp, "tile("); - dump_generic_node (pp, OMP_CLAUSE_TILE_LIST (clause), + dump_generic_node (pp, OMP_CLAUSE_OACC_TILE_LIST (clause), spc, flags, false); pp_right_paren (pp); break; diff --git a/gcc/tree.cc b/gcc/tree.cc index 53e44367977..fc7e22d352f 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -322,7 +322,7 @@ unsigned const char omp_clause_num_ops[] = 1, /* OMP_CLAUSE_NUM_GANGS */ 1, /* OMP_CLAUSE_NUM_WORKERS */ 1, /* OMP_CLAUSE_VECTOR_LENGTH */ - 3, /* OMP_CLAUSE_TILE */ + 3, /* OMP_CLAUSE_OACC_TILE */ 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ diff --git a/gcc/tree.h b/gcc/tree.h index f33f815b712..6f7a6e7017a 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1963,12 +1963,12 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_ENTER_TO(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, 
OMP_CLAUSE_ENTER)->base.public_flag) -#define OMP_CLAUSE_TILE_LIST(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0) -#define OMP_CLAUSE_TILE_ITERVAR(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 1) -#define OMP_CLAUSE_TILE_COUNT(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 2) +#define OMP_CLAUSE_OACC_TILE_LIST(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 0) +#define OMP_CLAUSE_OACC_TILE_ITERVAR(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 1) +#define OMP_CLAUSE_OACC_TILE_COUNT(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_OACC_TILE), 2) /* _CONDTEMP_ holding temporary with iteration count. */ #define OMP_CLAUSE__CONDTEMP__ITER(NODE) \ From patchwork Fri Mar 24 15:30:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Frederik Harwath X-Patchwork-Id: 66862 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 005123888833 for ; Fri, 24 Mar 2023 15:52:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 4641A3858C50; Fri, 24 Mar 2023 15:51:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4641A3858C50 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,288,1673942400"; d="scan'208";a="274531" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 24 Mar 2023 07:31:21 -0800 IronPort-SDR: 
7JtpxPoxVWZGGnlJ8/XVAKNkjDX4KRJWys6wszrpD70XOKGncTINR1M48i5Jbyk+lhEqnH8mlQ wbXWV0qe5KnSXzffU2A5yyffn43EgGqBOqD+RGC7z4SBwcT37rx9OgiWcjSN9ILenMzjD6wEWj WIlFm3lvg98muRN6FL+QPR1yt5cELEG1hz139gmj5ThhJUUF3NRH7rdOZdUxIOnWgMsmu7BBuv 1VTTUk+3mJq3NfWOuvpqlZno1SENZ06sUphyzi+PWjd5ZDNJNcVUAY/SGcLuJVqWzRW+qfg3bl CvQ= From: Frederik Harwath To: , , , Subject: [PATCH 4/7] openmp: Add Fortran support for "omp tile" Date: Fri, 24 Mar 2023 16:30:42 +0100 Message-ID: <20230324153046.3996092-5-frederik@codesourcery.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-14.mgc.mentorg.com (139.181.222.14) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This commit implements the Fortran front end support for the "omp tile" directive and the corresponding middle end transformation. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE. (enum gfc_exec_op): Add EXEC_OMP_TILE. (loop_transform_p): New declaration. (struct gfc_omp_clauses): Add "tile_sizes" field. * dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping. (show_omp_node): Handle EXEC_OMP_TILE. (show_code_node): Likewise. * match.h (gfc_match_omp_tile): New declaration. * openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field. 
(match_tile_sizes): New function. (OMP_TILE_CLAUSES): New macro. (gfc_match_omp_tile): New function. (resolve_omp_do): Handle EXEC_OMP_TILE. (resolve_omp_tile): New function. (omp_code_to_statement): Handle EXEC_OMP_TILE. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE and ST_OMP_TILE. (next_statement): Handle ST_OMP_TILE. (gfc_ascii_statement): Likewise. (parse_omp_do): Likewise. (parse_executable): Likewise. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field. (loop_transform_p): New function. (gfc_expr_list_len): New function. (gfc_trans_omp_do): Handle EXEC_OMP_TILE. (gfc_trans_omp_directive): Likewise. * trans.cc (trans_code): Likewise. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_loop): Likewise. * omp-transform-loops.cc (walk_omp_for_loops): New declaration. (subst_var_in_op): New function. (subst_var): New function. (gomp_for_number_of_iterations): Adjust. (gomp_for_iter_count_type): New function. (gimple_assign_rhs_to_tree): New function. (subst_defs): New function. (gomp_for_uncollapse): Adjust. (transformation_clause_p): Add OMP_CLAUSE_TILE. (tile): New function. (transform_gomp_for): Handle OMP_CLAUSE_TILE. (optimize_transformation_clauses): Handle OMP_CLAUSE_TILE. * omp-general.cc (omp_loop_transform_clauses_p): Add OMP_CLAUSE_TILE. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE. * tree.cc: Add OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_SIZES): New macro. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test. 
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-2.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-3.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-4.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test. --- gcc/fortran/dump-parse-tree.cc | 17 +- gcc/fortran/gfortran.h | 7 +- gcc/fortran/match.h | 1 + gcc/fortran/openmp.cc | 373 +++++++++++++----- gcc/fortran/parse.cc | 15 + gcc/fortran/resolve.cc | 3 + gcc/fortran/st.cc | 1 + gcc/fortran/trans-openmp.cc | 86 ++-- gcc/fortran/trans.cc | 1 + gcc/gimplify.cc | 3 + gcc/omp-general.cc | 2 +- gcc/omp-transform-loops.cc | 340 +++++++++++++++- .../gomp/loop-transforms/tile-1.f90 | 163 ++++++++ .../gomp/loop-transforms/tile-1a.f90 | 10 + .../gomp/loop-transforms/tile-2.f90 | 80 ++++ .../gomp/loop-transforms/tile-3.f90 | 18 + .../gomp/loop-transforms/tile-4.f90 | 95 +++++ .../gomp/loop-transforms/tile-unroll-1.f90 | 57 +++ .../gomp/loop-transforms/unroll-tile-1.f90 | 37 ++ .../gomp/loop-transforms/unroll-tile-2.f90 | 41 ++ gcc/tree-core.h | 3 + gcc/tree-pretty-print.cc | 8 + gcc/tree.cc | 7 +- gcc/tree.h | 3 + .../loop-transforms/unroll-full-tile.C | 84 ++++ .../loop-transforms/tile-1.f90 | 71 ++++ .../loop-transforms/tile-2.f90 | 117 ++++++ .../loop-transforms/tile-unroll-1.f90 | 112 ++++++ .../loop-transforms/tile-unroll-2.f90 | 71 ++++ 
.../loop-transforms/tile-unroll-3.f90 | 77 ++++ .../loop-transforms/tile-unroll-4.f90 | 75 ++++ .../loop-transforms/unroll-tile-1.f90 | 112 ++++++ .../loop-transforms/unroll-tile-2.f90 | 71 ++++ 33 files changed, 2042 insertions(+), 119 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 -- 2.36.1 diff --git a/gcc/fortran/dump-parse-tree.cc
b/gcc/fortran/dump-parse-tree.cc index e069aca1f1d..82183285954 100644 --- a/gcc/fortran/dump-parse-tree.cc +++ b/gcc/fortran/dump-parse-tree.cc @@ -2062,6 +2062,18 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses) if (omp_clauses->unroll_partial_factor > 0) fprintf (dumpfile, "(%u)", omp_clauses->unroll_partial_factor); } + if (omp_clauses->tile_sizes) + { + gfc_expr_list *sizes; + fputs (" TILE SIZES(", dumpfile); + for (sizes = omp_clauses->tile_sizes; sizes; sizes = sizes->next) + { + show_expr (sizes->expr); + if (sizes->next) + fputs (", ", dumpfile); + } + fputc (')', dumpfile); + } } /* Show a single OpenMP or OpenACC directive node and everything underneath it @@ -2172,6 +2184,7 @@ show_omp_node (int level, gfc_code *c) name = "TEAMS DISTRIBUTE PARALLEL DO SIMD"; break; case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: name = "TEAMS DISTRIBUTE SIMD"; break; case EXEC_OMP_TEAMS_LOOP: name = "TEAMS LOOP"; break; + case EXEC_OMP_TILE: name = "TILE"; break; case EXEC_OMP_UNROLL: name = "UNROLL"; break; case EXEC_OMP_WORKSHARE: name = "WORKSHARE"; break; default: @@ -2249,6 +2262,7 @@ show_omp_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: omp_clauses = c->ext.omp_clauses; @@ -2311,7 +2325,7 @@ show_omp_node (int level, gfc_code *c) d = d->block; } } - else if (c->op == EXEC_OMP_UNROLL) + else if (c->op == EXEC_OMP_UNROLL || c->op == EXEC_OMP_TILE) show_code (level + 1, c->block != NULL ? 
c->block->next : c->next); else show_code (level + 1, c->block->next); @@ -3491,6 +3505,7 @@ show_code_node (int level, gfc_code *c) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: show_omp_node (level, c); diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 5ef4a8907b0..8b4eadf9b4d 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -320,7 +320,8 @@ enum gfc_statement ST_OMP_ERROR, ST_OMP_ASSUME, ST_OMP_END_ASSUME, ST_OMP_ASSUMES, /* Note: gfc_match_omp_nothing returns ST_NONE. */ ST_OMP_NOTHING, ST_NONE, - ST_OMP_UNROLL, ST_OMP_END_UNROLL + ST_OMP_UNROLL, ST_OMP_END_UNROLL, + ST_OMP_TILE, ST_OMP_END_TILE }; /* Types of interfaces that we can have. Assignment interfaces are @@ -1550,6 +1551,7 @@ typedef struct gfc_omp_clauses struct gfc_expr *dist_chunk_size; struct gfc_expr *message; struct gfc_omp_assumptions *assume; + struct gfc_expr_list *tile_sizes; const char *critical_name; enum gfc_omp_default_sharing default_sharing; enum gfc_omp_atomic_op atomic_op; @@ -2977,7 +2979,7 @@ enum gfc_exec_op EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED, EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD, EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE, - EXEC_OMP_UNROLL, + EXEC_OMP_UNROLL, EXEC_OMP_TILE, EXEC_OMP_ERROR }; @@ -3874,6 +3876,7 @@ bool gfc_inline_intrinsic_function_p (gfc_expr *); /* trans-openmp.cc */ bool loop_transform_p (gfc_exec_op op); +int gfc_expr_list_len (gfc_expr_list *); /* bbt.cc */ typedef int (*compare_fn) (void *, void *); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 5640c725f09..d04e1cd66a4 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -226,6 +226,7 @@ match gfc_match_omp_teams_distribute_parallel_do_simd (void); match gfc_match_omp_teams_distribute_simd (void); match 
gfc_match_omp_teams_loop (void); match gfc_match_omp_threadprivate (void); +match gfc_match_omp_tile (void); match gfc_match_omp_unroll (void); match gfc_match_omp_workshare (void); match gfc_match_omp_end_critical (void); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index ec707d977cd..1de61029768 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -191,6 +191,7 @@ gfc_free_omp_clauses (gfc_omp_clauses *c) i == OMP_LIST_ALLOCATE); gfc_free_expr_list (c->wait_list); gfc_free_expr_list (c->tile_list); + gfc_free_expr_list (c->tile_sizes); free (CONST_CAST (char *, c->critical_name)); if (c->assume) { @@ -977,6 +978,76 @@ cleanup: return MATCH_ERROR; } +static match +match_tile_sizes (gfc_expr_list **list) +{ + gfc_expr_list *head, *tail, *p; + locus old_loc; + gfc_expr *expr; + match m; + + head = tail = NULL; + + old_loc = gfc_current_locus; + + m = gfc_match_char ('('); + if (m != MATCH_YES) + goto syntax; + + for (;;) + { + m = gfc_match_expr (&expr); + if (m == MATCH_YES) + { + p = gfc_get_expr_list (); + if (head == NULL) + head = tail = p; + else + { + tail->next = p; + tail = tail->next; + } + int size = 0; + if (m == MATCH_YES) + { + if (gfc_extract_int (expr, &size, 1)) + goto cleanup; + else if (size < 1) + { + gfc_error_now ("tile size not constant " + "positive integer at %C"); + goto cleanup; + } + tail->expr = expr; + } + goto next_item; + } + if (m == MATCH_ERROR) + goto cleanup; + goto syntax; + + next_item: + if (gfc_match_char (')') == MATCH_YES) + break; + if (gfc_match_char (',') != MATCH_YES) + goto syntax; + } + + while (*list) + list = &(*list)->next; + + *list = head; + return MATCH_YES; + +syntax: + gfc_error ("Syntax error in 'tile sizes' list at %C"); + +cleanup: + gfc_free_expr_list (head); + gfc_current_locus = old_loc; + return MATCH_ERROR; +} + /* OpenMP clauses. */ enum omp_mask1 { @@ -1054,6 +1125,7 @@ enum omp_mask2 OMP_CLAUSE_UNROLL_FULL, /* OpenMP 5.1. */ OMP_CLAUSE_UNROLL_NONE, /* OpenMP 5.1. 
*/ OMP_CLAUSE_UNROLL_PARTIAL, /* OpenMP 5.1. */ + OMP_CLAUSE_TILE, /* OpenMP 5.1. */ OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, @@ -4310,7 +4382,8 @@ cleanup: omp_mask (OMP_CLAUSE_NOWAIT) #define OMP_UNROLL_CLAUSES \ (omp_mask (OMP_CLAUSE_UNROLL_FULL) | OMP_CLAUSE_UNROLL_PARTIAL) - +#define OMP_TILE_CLAUSES \ + (omp_mask (OMP_CLAUSE_TILE)) static match match_omp (gfc_exec_op op, const omp_mask mask) @@ -6409,6 +6482,16 @@ gfc_match_omp_teams_distribute_simd (void) | OMP_SIMD_CLAUSES); } +match +gfc_match_omp_tile (void) +{ + gfc_omp_clauses *c = gfc_get_omp_clauses(); + new_st.op = EXEC_OMP_TILE; + new_st.ext.omp_clauses = c; + + return match_tile_sizes (&c->tile_sizes); +} + match gfc_match_omp_unroll (void) { @@ -9289,75 +9372,6 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause) } } - -static bool -omp_unroll_removes_loop_nest (gfc_code *code) -{ - gcc_assert (code->op == EXEC_OMP_UNROLL); - if (!code->ext.omp_clauses) - return true; - - if (code->ext.omp_clauses->unroll_none) - { - gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " - "into a non-loop", - &code->loc); - return true; - } - if (code->ext.omp_clauses->unroll_full) - { - gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " - "non-loop", - &code->loc); - return true; - } - return false; -} - -static void -resolve_loop_transform_generic (gfc_code *code, const char *descr) -{ - gcc_assert (code->block); - - if (code->block->op == EXEC_OMP_UNROLL - && !omp_unroll_removes_loop_nest (code->block)) - return; - - if (code->block->next->op == EXEC_OMP_UNROLL - && !omp_unroll_removes_loop_nest (code->block->next)) - return; - - if (code->block->next->op == EXEC_DO_WHILE) - { - gfc_error ("%s invalid around DO WHILE or DO without loop " - "control at %L", descr, &code->loc); - return; - } - if (code->block->next->op == EXEC_DO_CONCURRENT) - { - gfc_error ("%s invalid around DO CONCURRENT loop at %L", - descr, &code->loc); - 
return; - } - - gfc_error ("missing canonical loop nest after %s at %L", - descr, &code->loc); - -} - -static void -resolve_omp_unroll (gfc_code *code) -{ - if (!code->block || code->block->op == EXEC_DO) - return; - - if (code->block->next->op == EXEC_DO) - return; - - resolve_loop_transform_generic (code, "!$OMP UNROLL"); -} - - static void handle_local_var (gfc_symbol *sym) { @@ -9488,6 +9502,106 @@ bound_expr_is_canonical (gfc_code *code, int depth, gfc_expr *expr, return false; } +static bool +omp_unroll_removes_loop_nest (gfc_code *code) +{ + gcc_assert (code->op == EXEC_OMP_UNROLL); + if (!code->ext.omp_clauses) + return true; + + if (code->ext.omp_clauses->unroll_none) + { + gfc_warning (0, "!$OMP UNROLL without PARTIAL clause at %L turns loop " + "into a non-loop", + &code->loc); + return true; + } + if (code->ext.omp_clauses->unroll_full) + { + gfc_warning (0, "!$OMP UNROLL with FULL clause at %L turns loop into a " + "non-loop", + &code->loc); + return true; + } + return false; +} + +static gfc_code * +resolve_nested_loop_transforms (gfc_code *code, const char *name, + int required_depth, locus *loc) +{ + if (!code) + return code; + + bool error = false; + while (loop_transform_p (code->op)) + { + if (!error && code->op == EXEC_OMP_UNROLL) + { + if (omp_unroll_removes_loop_nest (code)) + { + gfc_error ("missing canonical loop nest after %s at %L", name, + loc); + error = true; + } + else if (required_depth > 1) + { + gfc_error ("loop nest depth after !$OMP UNROLL at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + } + else if (!error && code->op == EXEC_OMP_TILE + && required_depth > gfc_expr_list_len (code->ext.omp_clauses->tile_sizes)) + { + gfc_error ("loop nest depth after !$OMP TILE at %L is insufficient " + "for outer %s", &code->loc, name); + error = true; + } + + if (code->block) + code = code->block->next; + else + code = code->next; + } + gcc_assert (!loop_transform_p (code->op)); + + return code; +} + +static 
void +resolve_omp_unroll (gfc_code *code) +{ + const char *descr = "!$OMP UNROLL"; + locus *loc = &code->loc; + + if (!code->block || code->block->op == EXEC_DO) + return; + + code = resolve_nested_loop_transforms (code->block->next, descr, 1, + &code->loc); + + if (code->op == EXEC_DO) + return; + + if (code->op == EXEC_DO_WHILE) + { + gfc_error ("%s invalid around DO WHILE or DO without loop " + "control at %L", descr, loc); + return; + } + + if (code->op == EXEC_DO_CONCURRENT) + { + gfc_error ("%s invalid around DO CONCURRENT loop at %L", + descr, loc); + return; + } + + gfc_error ("missing canonical loop nest after %s at %L", + descr, loc); +} + static void resolve_omp_do (gfc_code *code) { @@ -9592,30 +9706,13 @@ resolve_omp_do (gfc_code *code) break; case EXEC_OMP_TEAMS_LOOP: name = "!$OMP TEAMS LOOP"; break; case EXEC_OMP_UNROLL: name = "!$OMP UNROLL"; break; + case EXEC_OMP_TILE: name = "!$OMP TILE"; break; default: gcc_unreachable (); } if (code->ext.omp_clauses) resolve_omp_clauses (code, code->ext.omp_clauses, NULL); - do_code = code->block->next; - /* Move forward over any loop transformation directives to find the loop. */ - bool error = false; - while (do_code->op == EXEC_OMP_UNROLL) - { - if (!error && omp_unroll_removes_loop_nest (do_code)) - { - gfc_error ("missing canonical loop nest after %s at %L", name, - &code->loc); - error = true; - } - if (do_code->block) - do_code = do_code->block->next; - else - do_code = do_code->next; - } - gcc_assert (do_code->op != EXEC_OMP_UNROLL); - if (code->ext.omp_clauses->orderedc) collapse = code->ext.omp_clauses->orderedc; else @@ -9630,6 +9727,9 @@ resolve_omp_do (gfc_code *code) depth and treats any further inner loops as the final-loop-body. So here we also check canonical loop nest form only for the number of outer loops specified by the COLLAPSE clause too. 
*/ + do_code = resolve_nested_loop_transforms (code->block->next, name, collapse, + &code->loc); + for (i = 1; i <= collapse; i++) { gfc_symbol *start_var = NULL, *end_var = NULL; @@ -9745,6 +9845,98 @@ resolve_omp_do (gfc_code *code) } } +static void +resolve_omp_tile (gfc_code *code) +{ + gfc_code *do_code, *c; + gfc_symbol *dovar; + const char *name = "!$OMP TILE"; + + unsigned num_loops = 0; + gcc_assert (code->ext.omp_clauses->tile_sizes); + for (gfc_expr_list *el = code->ext.omp_clauses->tile_sizes; el; + el = el->next) + num_loops++; + + do_code = resolve_nested_loop_transforms (code, name, num_loops, &code->loc); + + for (unsigned i = 1; i <= num_loops; i++) + { + if (do_code->op == EXEC_DO_WHILE) + { + gfc_error ("%s cannot be a DO WHILE or DO without loop control " + "at %L", name, &do_code->loc); + break; + } + if (do_code->op == EXEC_DO_CONCURRENT) + { + gfc_error ("%s cannot be a DO CONCURRENT loop at %L", name, + &do_code->loc); + break; + } + if (do_code->op != EXEC_DO) + { + gfc_error ("%s must be DO loop at %L", name, + &do_code->loc); + break; + } + + gcc_assert (do_code->op != EXEC_OMP_UNROLL); + gcc_assert (do_code->op == EXEC_DO); + dovar = do_code->ext.iterator->var->symtree->n.sym; + if (i > 1) + { + gfc_code *do_code2 = code; + while (loop_transform_p (do_code2->op)) + { + if (do_code2->block) + do_code2 = do_code2->block->next; + else + do_code2 = do_code2->next; + } + gcc_assert (!loop_transform_p (do_code2->op)); + + for (unsigned j = 1; j < i; j++) + { + gfc_symbol *ivar = do_code2->ext.iterator->var->symtree->n.sym; + if (dovar == ivar + || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->start) + || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->end) + || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->step)) + { + gfc_error ("%s loops don't form rectangular " + "iteration space at %L", name, &do_code->loc); + break; + } + do_code2 = do_code2->block->next; + } + } + for (c = do_code->next; c; c = c->next) + if (c->op != 
EXEC_NOP && c->op != EXEC_CONTINUE) + { + gfc_error ("%s loops not perfectly nested at %L", + name, &c->loc); + break; + } + if (i == num_loops || c) + break; + do_code = do_code->block; + if (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE) + { + gfc_error ("not enough DO loops for %s at %L", + name, &code->loc); + break; + } + do_code = do_code->next; + if (do_code == NULL + || (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE)) + { + gfc_error ("not enough DO loops for %s at %L", + name, &code->loc); + break; + } + } +} static gfc_statement omp_code_to_statement (gfc_code *code) @@ -9889,6 +10081,8 @@ omp_code_to_statement (gfc_code *code) return ST_OMP_PARALLEL_LOOP; case EXEC_OMP_DEPOBJ: return ST_OMP_DEPOBJ; + case EXEC_OMP_TILE: + return ST_OMP_TILE; case EXEC_OMP_UNROLL: return ST_OMP_UNROLL; default: @@ -10320,6 +10514,9 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_TEAMS_LOOP: resolve_omp_do (code); break; + case EXEC_OMP_TILE: + resolve_omp_tile (code); + break; case EXEC_OMP_UNROLL: resolve_omp_unroll (code); break; diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc index 094678436b4..1cc5200f35a 100644 --- a/gcc/fortran/parse.cc +++ b/gcc/fortran/parse.cc @@ -1009,6 +1009,7 @@ decode_omp_directive (void) matcho ("end teams loop", gfc_match_omp_eos_error, ST_OMP_END_TEAMS_LOOP); matcho ("end teams", gfc_match_omp_eos_error, ST_OMP_END_TEAMS); matchs ("end unroll", gfc_match_omp_eos_error, ST_OMP_END_UNROLL); + matchs ("end tile", gfc_match_omp_eos_error, ST_OMP_END_TILE); matcho ("end workshare", gfc_match_omp_end_nowait, ST_OMP_END_WORKSHARE); break; @@ -1137,6 +1138,7 @@ decode_omp_directive (void) matcho ("teams", gfc_match_omp_teams, ST_OMP_TEAMS); matchdo ("threadprivate", gfc_match_omp_threadprivate, ST_OMP_THREADPRIVATE); + matchs ("tile sizes", gfc_match_omp_tile, ST_OMP_TILE); break; case 'u': matchs ("unroll", gfc_match_omp_unroll, ST_OMP_UNROLL); @@ -1729,6 +1731,7 @@ next_statement (void) 
case ST_OMP_TARGET_PARALLEL_LOOP: case ST_OMP_TARGET_TEAMS_LOOP: \ case ST_OMP_ASSUME: \ case ST_OMP_UNROLL: \ + case ST_OMP_TILE: \ case ST_CRITICAL: \ case ST_OACC_PARALLEL_LOOP: case ST_OACC_PARALLEL: case ST_OACC_KERNELS: \ case ST_OACC_DATA: case ST_OACC_HOST_DATA: case ST_OACC_LOOP: \ @@ -2774,6 +2777,9 @@ gfc_ascii_statement (gfc_statement st, bool strip_sentinel) case ST_OMP_THREADPRIVATE: p = "!$OMP THREADPRIVATE"; break; + case ST_OMP_TILE: + p = "!$OMP TILE"; + break; case ST_OMP_UNROLL: p = "!$OMP UNROLL"; break; @@ -5214,6 +5220,11 @@ parse_omp_do (gfc_statement omp_st) num_unroll++; continue; } + else if (st == ST_OMP_TILE) + { + accept_statement (st); + continue; + } else unexpected_statement (st); } @@ -5338,6 +5349,9 @@ parse_omp_do (gfc_statement omp_st) case ST_OMP_TEAMS_LOOP: omp_end_st = ST_OMP_END_TEAMS_LOOP; break; + case ST_OMP_TILE: + omp_end_st = ST_OMP_END_TILE; + break; case ST_OMP_UNROLL: omp_end_st = ST_OMP_END_UNROLL; break; @@ -6025,6 +6039,7 @@ parse_executable (gfc_statement st) case ST_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case ST_OMP_TEAMS_DISTRIBUTE_SIMD: case ST_OMP_TEAMS_LOOP: + case ST_OMP_TILE: case ST_OMP_UNROLL: st = parse_omp_do (st); if (st == ST_IMPLIED_ENDDO) diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 46988ff281d..182aa18053c 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -11041,6 +11041,7 @@ gfc_resolve_blocks (gfc_code *b, gfc_namespace *ns) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_LOOP: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: break; @@ -12198,6 +12199,7 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TARGET_SIMD: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: gfc_resolve_omp_do_blocks (code, ns); break; @@ -12695,6 +12697,7 @@ start: case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: 
case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_resolve_omp_directive (code, ns); diff --git a/gcc/fortran/st.cc b/gcc/fortran/st.cc index 6112831e621..cea874e4474 100644 --- a/gcc/fortran/st.cc +++ b/gcc/fortran/st.cc @@ -277,6 +277,7 @@ gfc_free_statement (gfc_code *p) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: gfc_free_omp_clauses (p->ext.omp_clauses); diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 73c416c951d..6936cd7f5ee 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3913,6 +3913,24 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, omp_clauses = gfc_trans_add_clause (c, omp_clauses); } + if (clauses->tile_sizes) + { + vec<tree, va_gc> *tvec; + gfc_expr_list *el; + + vec_alloc (tvec, 4); + + for (el = clauses->tile_sizes; el; el = el->next) + vec_safe_push (tvec, gfc_convert_expr_to_tree (block, el->expr)); + + c = build_omp_clause (gfc_get_location (&where), + OMP_CLAUSE_TILE); + OMP_CLAUSE_TILE_SIZES (c) = build_tree_list_vec (tvec); + omp_clauses = gfc_trans_add_clause (c, omp_clauses); + + tvec->truncate (0); + } + if (clauses->ordered) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_ORDERED); @@ -5106,7 +5124,7 @@ gfc_trans_omp_cancel (gfc_code *code) bool loop_transform_p (gfc_exec_op op) { - return op == EXEC_OMP_UNROLL; + return op == EXEC_OMP_UNROLL || op == EXEC_OMP_TILE; } static tree @@ -5280,6 +5298,16 @@ gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n, return true; } +int +gfc_expr_list_len (gfc_expr_list *list) +{ + unsigned len = 0; + for (; list; list = list->next) + len++; + + return len; +} + static tree gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_omp_clauses *do_clauses, tree par_clauses) @@ -5295,25 +5323,14 @@ 
gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, dovar_init *di; unsigned ix; vec<tree, va_gc> *saved_doacross_steps = doacross_steps; - gfc_expr_list *tile = do_clauses ? do_clauses->tile_list : clauses->tile_list; gfc_code *orig_code = code; locus top_loc = code->loc; - - /* Both collapsed and tiled loops are lowered the same way. In - OpenACC, those clauses are not compatible, so prioritize the tile - clause, if present. */ - if (tile) - { - collapse = 0; - for (gfc_expr_list *el = tile; el; el = el->next) - collapse++; - } - - doacross_steps = NULL; - if (clauses->orderedc) - collapse = clauses->orderedc; - if (collapse <= 0) - collapse = 1; + gfc_expr_list *oacc_tile + = do_clauses ? do_clauses->tile_list : clauses->tile_list; + gfc_expr_list *omp_tile + = do_clauses ? do_clauses->tile_sizes : clauses->tile_sizes; + gcc_assert (!omp_tile || op == EXEC_OMP_TILE); + gcc_assert (!(oacc_tile && omp_tile)); if (pblock == NULL) { @@ -5321,21 +5338,42 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, pblock = &block; } code = code->block->next; - gcc_assert (code->op == EXEC_DO || code->op == EXEC_OMP_UNROLL); + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); /* Loop transformation directives surrounding the associated loop of an "omp do" (or similar directive) are represented as clauses on the "omp do". */ loop_transform_clauses = NULL; - while (code->op == EXEC_OMP_UNROLL) + int omp_tile_depth = gfc_expr_list_len (omp_tile); + while (loop_transform_p (code->op)) { tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, code->loc); - loop_transform_clauses = chainon (loop_transform_clauses, clauses); + /* There might be several "!$omp tile" transformations surrounding the + loop. Use the innermost one which must have the largest tiling depth. + If an inner directive has a smaller tiling depth than an outer + directive, an error will be emitted in pass-omp_transform_loops. 
*/ + omp_tile_depth = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + + loop_transform_clauses = chainon (loop_transform_clauses, clauses); code = code->block ? code->block->next : code->next; } - gcc_assert (code->op != EXEC_OMP_UNROLL); + gcc_assert (!loop_transform_p (code->op)); gcc_assert (code->op == EXEC_DO); + /* Both collapsed and tiled loops are lowered the same way. In + OpenACC, those clauses are not compatible, so prioritize the tile + clause, if present. */ + if (oacc_tile) + collapse = gfc_expr_list_len (oacc_tile); + + doacross_steps = NULL; + if (clauses->orderedc) + collapse = clauses->orderedc; + if (collapse <= 0) + collapse = 1; + + collapse = MAX (collapse, omp_tile_depth); + init = make_tree_vec (collapse); cond = make_tree_vec (collapse); incr = make_tree_vec (collapse); @@ -5346,7 +5384,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - if (op == EXEC_OMP_UNROLL) + if (loop_transform_p (op)) { /* This is a loop transformation on a loop which is not associated with any other directive. 
Use the directive location instead of the loop @@ -5695,6 +5733,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break; case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break; case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break; + case EXEC_OMP_TILE: stmt = make_node (OMP_LOOP_TRANS); break; case EXEC_OMP_UNROLL: stmt = make_node (OMP_LOOP_TRANS); break; default: gcc_unreachable (); } @@ -7793,6 +7832,7 @@ gfc_trans_omp_directive (gfc_code *code) case EXEC_OMP_LOOP: case EXEC_OMP_SIMD: case EXEC_OMP_TASKLOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses, NULL); diff --git a/gcc/fortran/trans.cc b/gcc/fortran/trans.cc index 56ec59fe80e..94b23c3b77a 100644 --- a/gcc/fortran/trans.cc +++ b/gcc/fortran/trans.cc @@ -2520,6 +2520,7 @@ trans_code (gfc_code * code, tree cond) case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD: case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD: case EXEC_OMP_TEAMS_LOOP: + case EXEC_OMP_TILE: case EXEC_OMP_UNROLL: case EXEC_OMP_WORKSHARE: res = gfc_trans_omp_directive (code); diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 14616eb5316..4d504a12451 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -12105,6 +12105,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, case OMP_CLAUSE_UNROLL_FULL: case OMP_CLAUSE_UNROLL_NONE: case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_TILE: break; case OMP_CLAUSE_NOHOST: default: @@ -13076,6 +13077,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_FINALIZE: case OMP_CLAUSE_INCLUSIVE: case OMP_CLAUSE_EXCLUSIVE: + case OMP_CLAUSE_TILE: case OMP_CLAUSE_UNROLL_FULL: case OMP_CLAUSE_UNROLL_NONE: case OMP_CLAUSE_UNROLL_PARTIAL: @@ -15134,6 +15136,7 @@ gimplify_omp_loop (tree *expr_p, gimple_seq *pre_p) } pc = &OMP_CLAUSE_CHAIN (*pc); break; + case OMP_CLAUSE_TILE: case OMP_CLAUSE_UNROLL_PARTIAL: case 
OMP_CLAUSE_UNROLL_FULL: case OMP_CLAUSE_UNROLL_NONE: diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 0f326128874..e568ba0703e 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -2264,7 +2264,7 @@ omp_loop_transform_clause_p (tree c) enum omp_clause_code code = OMP_CLAUSE_CODE (c); return (code == OMP_CLAUSE_UNROLL_FULL || code == OMP_CLAUSE_UNROLL_PARTIAL - || code == OMP_CLAUSE_UNROLL_NONE); + || code == OMP_CLAUSE_UNROLL_NONE || code == OMP_CLAUSE_TILE); } /* Try to resolve declare variant, return the variant decl if it should diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc index d845d0e4798..858a271261a 100644 --- a/gcc/omp-transform-loops.cc +++ b/gcc/omp-transform-loops.cc @@ -211,6 +211,9 @@ gomp_for_constant_iterations_p (gomp_for *omp_for, return true; } +static gimple_seq +expand_transformed_loop (gomp_for *omp_for); + /* Split a gomp_for that represents a collapsed loop-nest into single loops. The result is a gomp_for of the same kind which is not collapsed (i.e. gimple_omp_for_collapse (OMP_FOR) == 1) and which contains nested, @@ -220,7 +223,7 @@ gomp_for_constant_iterations_p (gomp_for *omp_for, FROM_DEPTH are left collapsed. */ static gomp_for* -gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0) +gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0, bool expand = false) { int collapse = gimple_omp_for_collapse (omp_for); gcc_assert (from_depth < collapse); @@ -251,7 +254,11 @@ gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0) gimple_omp_for_set_index (level_omp_for, 0, gimple_omp_for_index (omp_for, level)); - body = level_omp_for; + + if (expand) + body = expand_transformed_loop (level_omp_for); + else + body = level_omp_for; } omp_for->collapse = from_depth; @@ -808,6 +815,316 @@ canonicalize_conditions (gomp_for *omp_for) return new_decls; } +/* Execute the tiling transformation for OMP_FOR with the given TILE_SIZES and + return the resulting gimple bind. 
TILE_SIZES must be a non-empty tree chain + of integer constants and the collapse of OMP_FOR must be at least the length + of TILE_SIZES. TRANSFORMATION_CLAUSES are the loop transformations that + must be applied to OMP_FOR. Those are applied on the result of the tiling + transformation. LOC is the location for diagnostic messages. + + Example 1 + --------- + --------- + + Original loop + ------------- + + #pragma omp for + #pragma omp tile sizes(3) + for (i = 1; i <= n; i = i + 1) + { + body; + } + + Internally, the tile directive is represented as a clause on the + omp for, i.e. as #pragma omp for tile_sizes(3). + + Transformed loop + ---------------- + + #pragma omp for + for (.omp_tile_index = 1; .omp_tile_index < ceil(n/3); .omp_tile_index = .omp_tile_index + 3) + { + D.4287 = .omp_tile_index + 3 + 1 + #pragma omp loop_transform + for (i = .omp_tile_index; i < D.4287; i = i + 1) + { + if (i.0 > n) + goto L.0 + body; + } + L_0: + } + + The outer loop is the "floor loop" and the inner loop is the "tile + loop". The tile loop is never in canonical loop nest form and + hence it cannot be associated with any loop construct. The + GCC-internal "omp loop transform" construct will be lowered after + the tiling transformation. 
+ */ + +static gimple_seq +tile (gomp_for *omp_for, location_t loc, tree tile_sizes, + tree transformation_clauses, walk_ctx *ctx) +{ + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, + dump_user_location_t::from_location_t (loc), + "Executing tile transformation %T:\n %G\n", + transformation_clauses, static_cast <gimple *> (omp_for)); + + gimple_seq tile_loops = copy_gimple_seq_and_replace_locals (omp_for); + gimple_seq floor_loops = copy_gimple_seq_and_replace_locals (omp_for); + + size_t collapse = gimple_omp_for_collapse (omp_for); + size_t tiling_depth = list_length (tile_sizes); + tree clauses = gimple_omp_for_clauses (omp_for); + size_t clause_collapse = 1; + tree collapse_clause = NULL; + + if (tree c = omp_find_clause (clauses, OMP_CLAUSE_ORDERED)) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<ordered%> invalid in conjunction with %<omp tile%>"); + return omp_for; + } + + if (tree c = omp_find_clause (clauses, OMP_CLAUSE_COLLAPSE)) + { + tree expr = OMP_CLAUSE_COLLAPSE_EXPR (c); + clause_collapse = tree_to_uhwi (expr); + collapse_clause = c; + } + + /* The 'omp tile' construct creates a canonical loop-nest whose nesting depth + equals tiling_depth. The whole loop-nest has depth at least 2 * + omp_tile_depth, but the 'tile loops' at levels + omp_tile_depth+1...2*omp_tile_depth are not in canonical loop-nest form + and hence cannot be associated with a loop construct. 
*/ + if (clause_collapse > tiling_depth) + { + error_at (OMP_CLAUSE_LOCATION (collapse_clause), + "collapse cannot extend below the floor loops " + "generated by the % construct"); + OMP_CLAUSE_COLLAPSE_EXPR (collapse_clause) + = build_int_cst (unsigned_type_node, tiling_depth); + return transform_gomp_for (omp_for, NULL, ctx); + } + + if (tiling_depth > collapse) + return transform_gomp_for (omp_for, NULL, ctx); + + gcc_assert (collapse >= clause_collapse); + + push_gimplify_context (); + + /* Create the index variables for iterating the tiles in the floor + loops first tiling_depth loops transformed loop nest. */ + gimple_seq floor_loops_pre_body = NULL; + size_t tile_level = 0; + auto_vec<tree> sizes_vec; + for (tree el = tile_sizes; el; el = TREE_CHAIN (el), tile_level++) + { + size_t nest_level = tile_level; + tree index = gimple_omp_for_index (omp_for, nest_level); + tree init = gimple_omp_for_initial (omp_for, nest_level); + tree incr = gimple_omp_for_incr (omp_for, nest_level); + tree step = TREE_OPERAND (incr, 1); + + /* Initialize original index variables in the pre-body. The + loop lowering will not initialize them because of the changed + index variables. */ + gimplify_assign (index, init, &floor_loops_pre_body); + + tree tile_size = fold_convert (TREE_TYPE (step), TREE_VALUE (el)); + sizes_vec.safe_push (tile_size); + tree tile_index = create_tmp_var (TREE_TYPE (index), ".omp_tile_index"); + gimplify_assign (tile_index, init, &floor_loops_pre_body); + + /* Floor loops */ + step = fold_build2 (MULT_EXPR, TREE_TYPE (step), step, tile_size); + tree tile_step = step; + /* For combined constructs, step will be gimplified on the outer + gomp_for. 
*/ + if (!gimple_omp_for_combined_into_p (omp_for) && !TREE_CONSTANT (step)) + { + tile_step = create_tmp_var (TREE_TYPE (step), ".omp_tile_step"); + gimplify_assign (tile_step, step, &floor_loops_pre_body); + } + incr = fold_build2 (TREE_CODE (incr), TREE_TYPE (incr), tile_index, + tile_step); + gimple_omp_for_set_incr (floor_loops, nest_level, incr); + gimple_omp_for_set_index (floor_loops, nest_level, tile_index); + } + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); + pop_gimplify_context (result_bind); + gimple_seq_add_seq (gimple_omp_for_pre_body_ptr (floor_loops), + floor_loops_pre_body); + + /* The tiling loops will not form a perfect loop nest because the + loop for each tiling dimension needs to check if the current tile + is incomplete and this check is intervening code. Since OpenMP + 5.1 does not allow the collapse of the loop-nest to extend beyond + the floor loops, this is not a problem. + + "Uncollapse" the tiling loop nest, i.e. split the loop nest into + nested separate gomp_for structures for each level. This allows + to add the incomplete tile checks to each level loop. */ + + tile_loops = gomp_for_uncollapse (as_a <gomp_for *> (tile_loops)); + gimple_omp_for_set_kind (as_a <gomp_for *> (tile_loops), + GF_OMP_FOR_KIND_TRANSFORM_LOOP); + gimple_omp_for_set_clauses (tile_loops, NULL_TREE); + gimple_omp_for_set_pre_body (tile_loops, NULL); + + /* Transform the loop bodies of the "uncollapsed" tiling loops and + add them to the body of the floor loops. At this point, the + loop nest consists of perfectly nested gimple_omp_for constructs, + each representing a single loop. 
*/ + gimple_seq floor_loops_body = NULL; + gimple *level_loop = tile_loops; + gimple_seq_add_stmt (&floor_loops_body, tile_loops); + gimple_seq *surrounding_seq = &floor_loops_body; + + push_gimplify_context (); + + tree break_label = create_artificial_label (UNKNOWN_LOCATION); + gimple_seq_add_stmt (surrounding_seq, gimple_build_label (break_label)); + for (size_t level = 0; level < tiling_depth; level++) + { + tree original_index = gimple_omp_for_index (omp_for, level); + tree original_final = gimple_omp_for_final (omp_for, level); + + tree tile_index = gimple_omp_for_index (floor_loops, level); + tree tile_size = sizes_vec[level]; + tree type = TREE_TYPE (tile_index); + tree plus_type = type; + + tree incr = gimple_omp_for_incr (omp_for, level); + tree step = omp_get_for_step_from_incr (gimple_location (omp_for), incr); + + gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (level_loop); + gimple_seq level_body = gimple_omp_body (level_loop); + gcc_assert (gimple_omp_for_collapse (level_loop) == 1); + tree_code original_cond = gimple_omp_for_cond (omp_for, level); + + gimple_omp_for_set_initial (level_loop, 0, tile_index); + + tree tile_final = create_tmp_var (type); + tree scaled_tile_size = fold_build2 (MULT_EXPR, TREE_TYPE (tile_size), + tile_size, step); + + tree_code plus_code = PLUS_EXPR; + if (POINTER_TYPE_P (TREE_TYPE (tile_index))) + { + plus_code = POINTER_PLUS_EXPR; + int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scaled_tile_size)); + plus_type = signed_or_unsigned_type_for (unsignedp, ptrdiff_type_node); + } + + scaled_tile_size = fold_convert (plus_type, scaled_tile_size); + gimplify_assign (tile_final, + fold_build2 (plus_code, type, + tile_index, scaled_tile_size), + pre_body); + gimple_omp_for_set_final (level_loop, 0, tile_final); + + /* Redefine the original loop index variable of OMP_FOR in terms of the + floor loop and the tiling loop index variable for the current + dimension/level at the top of the loop. 
*/ + gimple_seq level_preamble = NULL; + + push_gimplify_context (); + + tree body_label = create_artificial_label (UNKNOWN_LOCATION); + + /* Handle partial tiles, i.e. add a check that breaks from the tile loop + if the new index value does not belong to the iteration space of the + original loop. */ + gimple_seq_add_stmt (&level_preamble, + gimple_build_cond (original_cond, original_index, + original_final, body_label, + break_label)); + gimple_seq_add_stmt (&level_preamble, gimple_build_label (body_label)); + + auto gsi = gsi_start (level_body); + gsi_insert_seq_before (&gsi, level_preamble, GSI_SAME_STMT); + gbind *level_bind = gimple_build_bind (NULL, NULL, NULL); + pop_gimplify_context (level_bind); + gimple_bind_set_body (level_bind, level_body); + gimple_omp_set_body (level_loop, level_bind); + + surrounding_seq = &level_body; + level_loop = gsi_stmt (gsi); + + /* The label for jumping out of the loop at the next nesting + level. For the outermost level, the label is put after the + loop-nest, for the last one it is not necessary. */ + if (level != tiling_depth - 1) + { + break_label = create_artificial_label (UNKNOWN_LOCATION); + gsi_insert_after (&gsi, gimple_build_label (break_label), + GSI_NEW_STMT); + } + } + + gbind *tile_loops_bind; + tile_loops_bind = gimple_build_bind (NULL, tile_loops, NULL); + pop_gimplify_context (tile_loops_bind); + + gimple_omp_set_body (floor_loops, tile_loops_bind); + + tree remaining_clauses = OMP_CLAUSE_CHAIN (transformation_clauses); + + /* Collapsing of the OMP_FOR is used both for the "omp tile" + implementation and for the actual "collapse" clause. If the + tiling depth was greater than the collapse depth required by the + clauses on OMP_FOR, the collapse of OMP_FOR must be adjusted to + the latter value and all loops below the new collapse depth must + be transformed to GF_OMP_FOR_KIND_TRANSFORM_LOOP to ensure their + lowering in this pass. 
*/ + size_t new_collapse = clause_collapse; + + /* Keep the omp_for collapsed if there are further transformations */ + if (remaining_clauses) + { + size_t next_transform_depth = 1; + if (OMP_CLAUSE_CODE (remaining_clauses) == OMP_CLAUSE_TILE) + next_transform_depth + = list_length (OMP_CLAUSE_TILE_SIZES (remaining_clauses)); + + /* The current "omp tile" transformation reduces the nesting depth + of the canonical loop-nest to TILING_DEPTH. + Hence the following "omp tile" transformation is invalid if + it requires a greater nesting depth. */ + gcc_assert (next_transform_depth <= tiling_depth); + if (next_transform_depth > new_collapse) + new_collapse = next_transform_depth; + } + + if (collapse > new_collapse) + floor_loops = gomp_for_uncollapse (as_a <gomp_for *> (floor_loops), + new_collapse, true); + + /* Lower the uncollapsed tile loops. */ + walk_omp_for_loops (gimple_bind_body_ptr (tile_loops_bind), ctx); + + gcc_assert (remaining_clauses || !collapse_clause + || gimple_omp_for_collapse (floor_loops) + == (size_t)clause_collapse); + + if (gimple_omp_for_combined_into_p (omp_for)) + ctx->inner_combined_loop = as_a <gomp_for *> (floor_loops); + + /* Apply remaining transformation clauses and assemble the transformation + result. */ + gimple_bind_set_body (result_bind, + transform_gomp_for (as_a <gomp_for *> (floor_loops), + remaining_clauses, ctx)); + + return result_bind; +} + /* Combined distribute or taskloop constructs are represented by two or more nested gomp_for constructs which are created during gimplification. 
Loop transformations on the combined construct are @@ -999,6 +1316,10 @@ transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) ctx); } break; + case OMP_CLAUSE_TILE: + result = tile (omp_for, loc, OMP_CLAUSE_TILE_SIZES (transformation), + transformation, ctx); + break; default: gcc_unreachable (); } @@ -1177,6 +1498,21 @@ optimize_transformation_clauses (tree clauses) unroll_partial = c; } break; + case OMP_CLAUSE_TILE: + { + /* No optimization for those clauses yet, but they end any chain of + "unroll partial" clauses. */ + if (merged_unroll_partial && dump_enabled_p ()) + print_optimized_unroll_partial_msg (unroll_partial); + + if (unroll_partial) + OMP_CLAUSE_CHAIN (unroll_partial) = c; + + unroll_partial = NULL; + merged_unroll_partial = false; + last_non_unroll = c; + } + break; default: gcc_unreachable (); } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..84ea93300fa --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1.f90 @@ -0,0 +1,163 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + + !$omp tile sizes(1) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2+3) + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(-21) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(0) ! { dg-error {tile size not constant positive integer at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes ! 
{ dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes( ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2 ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes() ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(2,) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,2) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(,i) ! { dg-error {Syntax error in 'tile sizes' list at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(i,) ! { dg-error {Constant expression required at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do i = 1,100 + call dummy(i) + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + call dummy(i) ! { dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} } + end do + !$end omp tile + + !$omp tile sizes(1,2,1) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + call dummy(j) ! 
{ dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} } + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do i = 1,100 + call dummy(i) + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 new file mode 100644 index 00000000000..29d7532bc37 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-1a.f90 @@ -0,0 +1,10 @@ + +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do i = 1,100 + do j = 1,100 + call dummy(i) + end do + end do + !$end omp tile +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..8a5eae3a188 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-2.f90 @@ -0,0 +1,80 @@ +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp taskloop collapse(2) + !$omp tile sizes (3,4) + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp 
tile + !$omp end taskloop + + !$omp taskloop simd + !$omp tile sizes (8) + !$omp tile sizes (1,2) + !$omp tile sizes (1,2,3) + do i = 1,100 + do j = 1,100 + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop simd +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + + !$omp taskloop collapse(3) ! { dg-error {not enough DO loops for collapsed \!\$OMP TASKLOOP at \(1\)} } + !$omp tile sizes (1,2) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TASKLOOP} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$omp end taskloop +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 new file mode 100644 index 00000000000..eaa7895eaa0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 @@ -0,0 +1,18 @@ +subroutine test + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) + !$omp tile sizes (1,2) + do i = 1,100 ! { dg-error {'ordered' invalid in conjunction with 'omp tile'} } + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp target + +end subroutine test diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 new file mode 100644 index 00000000000..b2dca0bbec6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-4.f90 @@ -0,0 +1,95 @@ + +subroutine test1 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test1 + +subroutine test2 + implicit none + integer :: i, j, k + + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test2 + +subroutine test3 + implicit none + integer :: i, j, k + + !$omp target teams distribute + !$omp tile sizes (1,2) + !$omp tile sizes (1) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test3 + +subroutine test4 + implicit none + integer :: i, j, k + + !$omp target teams distribute collapse(2) + !$omp tile sizes (8) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TARGET TEAMS DISTRIBUTE} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + +end subroutine test4 + +subroutine test5 + implicit none + integer :: i, j, k + + !$omp parallel do collapse(2) ordered(2) + !$omp tile sizes (8) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + !$omp tile sizes (1,2) + do i = 1,100 + do j = 1,100 + call dummy(j) + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile + !$end omp tile + !$end omp target + +end subroutine test5 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..27920701b36 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,57 @@ +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} "" { target *-*-*} .-1 } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8,8) + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp tile sizes (8) + !$omp unroll partial(1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + + !$omp parallel do collapse(2) ! { dg-error {missing canonical loop nest after \!\$OMP PARALLEL DO at \(1\)} } + !$omp tile sizes (8,8) ! { dg-error {missing canonical loop nest after \!\$OMP TILE at \(1\)} } + !$omp unroll full ! 
{ dg-warning {\!\$OMP UNROLL with FULL clause at \(1\) turns loop into a non-loop} } + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..cda878f3037 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,37 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do + !$omp unroll partial(1) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(1\) tile sizes\(8, 8\)} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! Tiling adds two floor and two tile loops. + +! Number of conditional statements after tiling: +! 5 +! = 2 (lowering of 2 tile loops) +! + 2 (partial tile handling in 2 tile loops) +! + 1 (lowering of non-associated floor loop) + +! The unrolling with unroll factor 1 currently gets executed (TODO could/should be skipped?) + +! { dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 5 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..00615011856 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,41 @@ +! 
{ dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + c = 0 + + !$omp target + !$omp parallel do + !$omp unroll partial(2) + !$omp tile sizes (8,8,4) + do i = 1,m + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + !$omp end target +end function mult + +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(2\) tile sizes\(8, 8, 4\)} 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } + +! Check the number of loops + +! Tiling adds three tile and three floor loops. +! The outermost floor loop is associated with the "!$omp parallel do" +! and hence it isn't lowered in the transformation pass. +! Number of conditional statements after tiling: +! 8 +! = 2 (inner floor loop lowering) +! + 3 (partial tile handling in 3 tile loops) +! + 3 (lowering of 3 tile loops) +! +! Unrolling creates 2 copies of the tiled loop nest. + +! { dg-final { scan-tree-dump-times {if \([A-Za-z0-9_.]+ < } 16 "omp_transform_loops" } } diff --git a/gcc/tree-core.h b/gcc/tree-core.h index f1429824158..b241e144515 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -534,6 +534,9 @@ enum omp_clause_code { /* Internal representation for an "omp unroll partial" directive. */ OMP_CLAUSE_UNROLL_PARTIAL, + + /* Represents a "tile" directive internally. 
*/ + OMP_CLAUSE_TILE }; #undef DEFTREESTRUCT diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index cae81719e68..02c207d87a0 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -521,6 +521,14 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) pp_right_paren (pp); } break; + case OMP_CLAUSE_TILE: + pp_string (pp, "tile sizes"); + pp_left_paren (pp); + gcc_assert (OMP_CLAUSE_TILE_SIZES (clause)); + dump_generic_node (pp, OMP_CLAUSE_TILE_SIZES (clause), spc, flags, + false); + pp_right_paren (pp); + break; case OMP_CLAUSE__LOOPTEMP_: name = "_looptemp_"; goto print_remap; diff --git a/gcc/tree.cc b/gcc/tree.cc index fc7e22d352f..893f509fa3a 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -327,8 +327,10 @@ unsigned const char omp_clause_num_ops[] = 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ 0, /* OMP_CLAUSE_UNROLL_FULL */ + 0, /* OMP_CLAUSE_UNROLL_NONE */ - 1 /* OMP_CLAUSE_UNROLL_PARTIAL */ + 1, /* OMP_CLAUSE_UNROLL_PARTIAL */ + 1 /* OMP_CLAUSE_TILE */ }; const char * const omp_clause_code_name[] = @@ -422,7 +424,8 @@ const char * const omp_clause_code_name[] = "nohost", "unroll_full", "unroll_none", - "unroll_partial" + "unroll_partial", + "tile" }; /* Unless specific to OpenACC, we tend to internally maintain OpenMP-centric diff --git a/gcc/tree.h b/gcc/tree.h index 6f7a6e7017a..8f4d2761d1a 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1790,6 +1790,9 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 0) +#define OMP_CLAUSE_TILE_SIZES(NODE) \ + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0) + #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind) diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C 
b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C new file mode 100644 index 00000000000..8970bfa7fd8 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/unroll-full-tile.C @@ -0,0 +1,84 @@ +#include +#include + +template +int sum () +{ + int sum = 0; +#pragma omp unroll full +#pragma omp tile sizes(dim0, dim1) + for (unsigned i = 0; i < 4; i++) + for (unsigned j = 0; j < 5; j++) + sum++; + + return sum; +} + +int main () +{ + if (sum <1,1> () != 20) + __builtin_abort (); + if (sum <1,2> () != 20) + __builtin_abort (); + if (sum <1,3> () != 20) + __builtin_abort (); + if (sum <1,4> () != 20) + __builtin_abort (); + if (sum <1,5> () != 20) + __builtin_abort (); + + if (sum <2,1> () != 20) + __builtin_abort (); + if (sum <2,2> () != 20) + __builtin_abort (); + if (sum <2,3> () != 20) + __builtin_abort (); + if (sum <2,4> () != 20) + __builtin_abort (); + if (sum <2,5> () != 20) + __builtin_abort (); + + if (sum <3,1> () != 20) + __builtin_abort (); + if (sum <3,2> () != 20) + __builtin_abort (); + if (sum <3,3> () != 20) + __builtin_abort (); + if (sum <3,4> () != 20) + __builtin_abort (); + if (sum <3,5> () != 20) + __builtin_abort (); + + if (sum <4,1> () != 20) + __builtin_abort (); + if (sum <4,2> () != 20) + __builtin_abort (); + if (sum <4,3> () != 20) + __builtin_abort (); + if (sum <4,4> () != 20) + __builtin_abort (); + if (sum <4,5> () != 20) + __builtin_abort (); + + if (sum <5,1> () != 20) + __builtin_abort (); + if (sum <5,2> () != 20) + __builtin_abort (); + if (sum <5,3> () != 20) + __builtin_abort (); + if (sum <5,4> () != 20) + __builtin_abort (); + if (sum <5,5> () != 20) + __builtin_abort (); + + if (sum <6,1> () != 20) + __builtin_abort (); + if (sum <6,2> () != 20) + __builtin_abort (); + if (sum <6,3> () != 20) + __builtin_abort (); + if (sum <6,4> () != 20) + __builtin_abort (); + if (sum <6,5> () != 20) + __builtin_abort (); +} diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 
b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 new file mode 100644 index 00000000000..bb48c31224e --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-1.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + do j = 1,n + inner = 0 + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 new file mode 100644 index 00000000000..6aedbf4724f --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-2.f90 @@ -0,0 +1,117 @@ +! { dg-additional-options "-fdump-tree-original" } +! 
{ dg-do run } + +module test_functions + contains + integer function compute_sum1() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(2) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum2() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(16) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum3() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + do i = 1,10,3 + !$omp tile sizes(100) + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum4() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp do + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function + + integer function compute_sum5() result(sum) + implicit none + + integer :: i,j + + sum = 0 + !$omp parallel do collapse(2) + !$omp tile sizes(6,10) + do i = 1,10,3 + do j = 1,10,3 + sum = sum + 1 + end do + end do + end function +end module test_functions + +program test + use test_functions + implicit none + + integer :: result + + result = compute_sum1 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum2 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum3 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum4 () + write (*,*) result + if (result .ne. 16) then + call abort + end if + + result = compute_sum5 () + write (*,*) result + if (result .ne. 
16) then + call abort + end if +end program diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! 
end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. 
a(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 new file mode 100644 index 00000000000..518968f1335 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + ! omp do private(inner) + do i = 1,m + !$omp unroll partial(4) + !$omp tile sizes (5) + do j = 1,n + do k = 1, n + write (*,*) "i", i, "j", j, "k", k + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. 
c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 new file mode 100644 index 00000000000..807135df5e8 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90 @@ -0,0 +1,75 @@ +module matrix + implicit none + integer :: n = 4 + integer :: m = 4 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,m + do j = 1,n + c(j, i) = 0 + end do + end do + + !$omp parallel do + do i = 1,m + !$omp tile sizes (5,2) + do j = 1,n + do k = 1, n + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. 
c(i,j)) call abort () + end do + end do + + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 new file mode 100644 index 00000000000..2f2f014ead9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90 @@ -0,0 +1,112 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(10) + !$omp tile sizes(1, 3) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult + + function mult2 (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes(1,2) + do i = 1,10 + do j = 1,n + do k = 1, n + write (*,*) i, j, k + c(j,i) = c(j,i) + a(k, i) * b(j, k) + end do + end do + end do + end function mult2 + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + ! c = mult (a, b) + + ! call print_matrix (a) + ! call print_matrix (b) + ! call print_matrix (c) + + ! do i = 1,n + ! do j = 1,m + ! if (b(i,j) .ne. c(i,j)) call abort () + ! end do + ! 
end do + + + c = mult2 (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + +end program main diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 new file mode 100644 index 00000000000..1b5b623b838 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90 @@ -0,0 +1,71 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + + function copy (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + do i = 1,10 + do j = 1,n + c(j,i) = 0 + end do + end do + + !$omp unroll partial(2) + !$omp tile sizes (1,5) + do i = 1,10 + do j = 1,n + c(j,i) = c(j,i) + a(j, i) + end do + end do + end function copy + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = 1 + end do + end do + + c = copy (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (c(i,j) .ne. 
a(i,j)) call abort () + end do + end do + +end program main From patchwork Fri Mar 24 15:30:43 2023 X-Patchwork-Submitter: Frederik Harwath X-Patchwork-Id: 66863 From: Frederik Harwath Subject: [PATCH 5/7] openmp: Add C/C++ support for "omp tile" Date: Fri, 24 Mar 2023 16:30:43 +0100 Message-ID: <20230324153046.3996092-6-frederik@codesourcery.com> In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com>
This commit adds the C and C++ front end support for the "omp tile" directive. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE. * c-pragma.cc (omp_pragmas_simd): Likewise. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_TILE gcc/c/ChangeLog: * c-parser.cc (c_parser_nested_omp_unroll_clauses): Rename and generalize ... (c_parser_omp_nested_loop_transform_clauses): ... to this. (c_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (c_parser_omp_tile_sizes): Parse single "sizes" clause. (c_parser_omp_loop_transform_clause): New function. (c_parser_omp_tile): New function for parsing "omp tile" (c_parser_omp_unroll): Adjust to renaming. (c_parser_omp_construct): Handle PRAGMA_OMP_TILE. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_unroll_partial): Adjust. (cp_parser_nested_omp_unroll_clauses): Rename ... (cp_parser_omp_nested_loop_transform_clauses): ... to this. (cp_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (cp_parser_omp_tile_sizes): New function, parses single "sizes" clause (cp_parser_omp_tile): New function for parsing "omp tile". (cp_parser_omp_loop_transform_clause): New function. (cp_parser_omp_unroll): Adjust to renaming. (cp_parser_omp_construct): Handle PRAGMA_OMP_TILE. (cp_parser_pragma): Likewise.
* pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE. * semantics.cc (finish_omp_clauses): Likewise. gcc/ChangeLog: * gimplify.cc (omp_for_drop_tile_clauses): New function, ... (gimplify_omp_for): ... used here. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/tile-1.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-2.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-3.C: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/tile-1.c: New test. * c-c++-common/gomp/loop-transforms/tile-2.c: New test. * c-c++-common/gomp/loop-transforms/tile-3.c: New test. * c-c++-common/gomp/loop-transforms/tile-4.c: New test. * c-c++-common/gomp/loop-transforms/tile-5.c: New test. * c-c++-common/gomp/loop-transforms/tile-6.c: New test. * c-c++-common/gomp/loop-transforms/tile-7.c: New test. * c-c++-common/gomp/loop-transforms/tile-8.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: Adapt. * g++.dg/gomp/loop-transforms/tile-1.h: New test. * g++.dg/gomp/loop-transforms/tile-1a.C: New test. * g++.dg/gomp/loop-transforms/tile-1b.C: New test. 
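Illustration only, not part of the patch: a minimal C sketch of the loop structure that a two-dimensional "omp tile sizes(t0, t1)" is expected to produce, written out by hand so that it runs without OpenMP support. The names count_tiled and min_int are hypothetical helpers, and the sketch assumes positive tile sizes, matching the positive integral constant that the "sizes" parser requires. It says nothing about how the omp_transform_loops pass generates its code.

```c
/* Hypothetical helper: smaller of two ints.  */
static int
min_int (int a, int b)
{
  return a < b ? a : b;
}

/* Walk an n0 x n1 iteration space in tiles of t0 x t1.
   The two outer "floor" loops step from tile origin to tile origin;
   the two inner "tile" loops walk one tile, clamped at the edge of
   the iteration space so that partial tiles are handled.
   Assumes t0 > 0 and t1 > 0.
   Returns the number of logical iterations executed.  */
int
count_tiled (int n0, int n1, int t0, int t1)
{
  int count = 0;
  for (int i0 = 0; i0 < n0; i0 += t0)   /* floor loop, dimension 0 */
    for (int j0 = 0; j0 < n1; j0 += t1) /* floor loop, dimension 1 */
      for (int i = i0; i < min_int (i0 + t0, n0); i++)   /* tile loop 0 */
        for (int j = j0; j < min_int (j0 + t1, n1); j++) /* tile loop 1 */
          count++;
  return count;
}
```

Every (i, j) pair is visited exactly once whatever the tile sizes, which is the property the unroll-full-tile.C run test checks for a 4 x 5 nest by comparing the iteration count against 20.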
--- gcc/c-family/c-omp.cc | 4 +- gcc/c-family/c-pragma.cc | 1 + gcc/c-family/c-pragma.h | 2 + gcc/c/c-parser.cc | 277 ++++++++++++--- gcc/cp/parser.cc | 289 +++++++++++++--- gcc/cp/pt.cc | 1 + gcc/cp/semantics.cc | 40 +++ gcc/gimplify.cc | 28 ++ .../gomp/loop-transforms/tile-1.c | 164 +++++++++ .../gomp/loop-transforms/tile-2.c | 183 ++++++++++ .../gomp/loop-transforms/tile-3.c | 117 +++++++ .../gomp/loop-transforms/tile-4.c | 322 ++++++++++++++++++ .../gomp/loop-transforms/tile-5.c | 150 ++++++++ .../gomp/loop-transforms/tile-6.c | 34 ++ .../gomp/loop-transforms/tile-7.c | 31 ++ .../gomp/loop-transforms/tile-8.c | 40 +++ .../gomp/loop-transforms/unroll-2.c | 12 +- .../g++.dg/gomp/loop-transforms/tile-1.h | 27 ++ .../g++.dg/gomp/loop-transforms/tile-1a.C | 27 ++ .../g++.dg/gomp/loop-transforms/tile-1b.C | 27 ++ .../libgomp.c++/loop-transforms/tile-1.C | 52 +++ .../libgomp.c++/loop-transforms/tile-2.C | 69 ++++ .../libgomp.c++/loop-transforms/tile-3.C | 28 ++ 23 files changed, 1823 insertions(+), 102 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C create mode 100644 gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C 
create mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C -- 2.36.1 diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index fec7f337772..2ab7faea2cc 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -3207,8 +3207,8 @@ const struct c_omp_directive c_omp_directives[] = { C_OMP_DIR_STANDALONE, false }, { "taskyield", nullptr, nullptr, PRAGMA_OMP_TASKYIELD, C_OMP_DIR_STANDALONE, false }, - /* { "tile", nullptr, nullptr, PRAGMA_OMP_TILE, - C_OMP_DIR_CONSTRUCT, false }, */ + { "tile", nullptr, nullptr, PRAGMA_OMP_TILE, + C_OMP_DIR_CONSTRUCT, false }, { "teams", nullptr, nullptr, PRAGMA_OMP_TEAMS, C_OMP_DIR_CONSTRUCT, true }, { "threadprivate", nullptr, nullptr, PRAGMA_OMP_THREADPRIVATE, diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc index 96a28ac1b0c..75d5cabbafd 100644 --- a/gcc/c-family/c-pragma.cc +++ b/gcc/c-family/c-pragma.cc @@ -1593,6 +1593,7 @@ static const struct omp_pragma_def omp_pragmas_simd[] = { { "target", PRAGMA_OMP_TARGET }, { "taskloop", PRAGMA_OMP_TASKLOOP }, { "teams", PRAGMA_OMP_TEAMS }, + { "tile", PRAGMA_OMP_TILE }, { "unroll", PRAGMA_OMP_UNROLL }, }; diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index 6686abdc94d..c0476f74441 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -81,6 +81,7 @@ enum pragma_kind { PRAGMA_OMP_TASKYIELD, PRAGMA_OMP_THREADPRIVATE, PRAGMA_OMP_TEAMS, + PRAGMA_OMP_TILE, PRAGMA_OMP_UNROLL, /* PRAGMA_OMP__LAST_ should be equal to the last PRAGMA_OMP_* code.
*/ PRAGMA_OMP__LAST_ = PRAGMA_OMP_UNROLL, @@ -157,6 +158,7 @@ enum pragma_omp_clause { PRAGMA_OMP_CLAUSE_TASKGROUP, PRAGMA_OMP_CLAUSE_THREAD_LIMIT, PRAGMA_OMP_CLAUSE_THREADS, + PRAGMA_OMP_CLAUSE_TILE, PRAGMA_OMP_CLAUSE_TO, PRAGMA_OMP_CLAUSE_UNIFORM, PRAGMA_OMP_CLAUSE_UNTIED, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index e7c9da99552..aac23dec9c0 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -20243,7 +20243,8 @@ c_parser_omp_scan_loop_body (c_parser *parser, bool open_brace_parsed) "expected %<}%>"); } -static bool c_parser_nested_omp_unroll_clauses (c_parser *, tree &); +static int c_parser_omp_nested_loop_transform_clauses (c_parser *, tree &, int, + const char *); /* Parse the restricted form of loop statements allowed by OpenACC and OpenMP. The real trick here is to determine the loop control variable early @@ -20263,16 +20264,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, bool fail = false, open_brace_parsed = false; int i, collapse = 1, ordered = 0, count, nbraces = 0; location_t for_loc; - bool tiling = false; + bool oacc_tiling = false; bool inscan = false; vec *for_block = make_tree_vector (); for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) - collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); + { + collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); + } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { - tiling = true; + oacc_tiling = true; collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED @@ -20295,21 +20298,31 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, ordered = collapse; } - gcc_assert (tiling || (collapse >= 1 && ordered >= 0)); + c_parser_omp_nested_loop_transform_clauses (parser, clauses, collapse, + "loop collapse"); + + /* Find the depth of the loop nest affected by "omp tile" + directives. 
There can be several such directives, but the tiling + depth of the outer ones may not be larger than the depth of the + innermost directive. */ + int omp_tile_depth = 0; + for (tree c = clauses; c; c = TREE_CHAIN (c)) + { + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + omp_tile_depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + } + + gcc_assert (oacc_tiling || (collapse >= 1 && ordered >= 0)); count = ordered ? ordered : collapse; + count = MAX (count, omp_tile_depth); declv = make_tree_vec (count); initv = make_tree_vec (count); condv = make_tree_vec (count); incrv = make_tree_vec (count); - if (c_parser_nested_omp_unroll_clauses (parser, clauses) - && count > 1) - { - error_at (loc, "collapse cannot be larger than 1 on an unrolled loop"); - return NULL; - } - if (!c_parser_next_token_is_keyword (parser, RID_FOR)) { c_parser_error (parser, "for statement expected"); @@ -23945,47 +23958,224 @@ c_parser_omp_taskloop (location_t loc, c_parser *parser, ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) -/* Parse zero or more '#pragma omp unroll' that follow - another directive that requires a canonical loop nest. 
*/ +/* OpenMP 5.1: Parse sizes list for "omp tile sizes" + sizes ( size-expr-list ) */ +static tree +c_parser_omp_tile_sizes (c_parser *parser, location_t loc) +{ + tree sizes = NULL_TREE; -static bool -c_parser_nested_omp_unroll_clauses (c_parser *parser, tree &clauses) + c_token *tok = c_parser_peek_token (parser); + if (tok->type != CPP_NAME + || strcmp ("sizes", IDENTIFIER_POINTER (tok->value))) + { + c_parser_error (parser, "expected %"); + return error_mark_node; + } + c_parser_consume_token (parser); + + if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>")) + return error_mark_node; + + do + { + if (sizes && !c_parser_require (parser, CPP_COMMA, "expected %<,%>")) + return error_mark_node; + + location_t expr_loc = c_parser_peek_token (parser)->location; + c_expr cexpr = c_parser_expr_no_commas (parser, NULL); + cexpr = convert_lvalue_to_rvalue (expr_loc, cexpr, false, true); + tree expr = cexpr.value; + + if (expr == error_mark_node) + { + c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, + "expected %<)%>"); + return error_mark_node; + } + + expr = c_fully_fold (expr, false, NULL); + + if (!INTEGRAL_TYPE_P (TREE_TYPE (expr)) || !tree_fits_shwi_p (expr) + || tree_to_shwi (expr) <= 0) + { + c_parser_error (parser, "% argument needs positive" + " integral constant"); + expr = integer_zero_node; + } + + sizes = tree_cons (NULL_TREE, expr, sizes); + } + while (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN)); + c_parser_consume_token (parser); + + gcc_assert (sizes); + tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TILE_SIZES (c) = sizes; + + return c; +} + +/* Parse a single OpenMP loop transformation directive and return the + clause that is used internally to represent the directive. 
*/ + +static tree +c_parser_omp_loop_transform_clause (c_parser *parser) { - static const char *p_name = "#pragma omp unroll"; - c_token *tok; - bool found_unroll = false; - while (c_parser_next_token_is (parser, CPP_PRAGMA) - && (tok = c_parser_peek_token (parser), - tok->pragma_kind == PRAGMA_OMP_UNROLL)) + c_token *tok = c_parser_peek_token (parser); + if (tok->type != CPP_PRAGMA) + return NULL_TREE; + + tree c; + switch (tok->pragma_kind) { + case PRAGMA_OMP_UNROLL: c_parser_consume_pragma (parser); - tree c = c_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, - p_name, true); - if (c) + c = c_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + "#pragma omp unroll", false, true); + if (!c) { - gcc_assert (!TREE_CHAIN (c)); - found_unroll = true; - if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_FULL) - { - error_at (tok->location, "%<full%> clause is invalid here; " - "turns loop into non-loop"); - continue; - } + if (c_parser_next_token_is (parser, CPP_PRAGMA_EOL)) + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + else + c = error_mark_node; } - else + c_parser_skip_to_pragma_eol (parser); + break; + + case PRAGMA_OMP_TILE: + c_parser_consume_pragma (parser); + c = c_parser_omp_tile_sizes (parser, tok->location); + c_parser_skip_to_pragma_eol (parser); + break; + + default: + c = NULL_TREE; + break; + } + + gcc_assert (!c || !TREE_CHAIN (c)); + return c; +} + +/* Parse zero or more OpenMP loop transformation directives that + follow another directive that requires a canonical loop nest and + append all to CLAUSES. Return the nesting depth + of the transformed loop nest. + + REQUIRED_DEPTH is the nesting depth of the loop nest required by + the preceding directive. OUTER_DESCR is a description of the + language construct that requires the loop nest depth (e.g. "loop + collapse", "outer transformation") that is used for error + messages.
*/ + +static int +c_parser_omp_nested_loop_transform_clauses (c_parser *parser, tree &clauses, + int required_depth, + const char *outer_descr) +{ + tree c = NULL_TREE; + tree last_c = tree_last (clauses); + + /* The depth of the loop nest, counting from LEVEL, after the + transformations. That is, the nesting depth left by the outermost + transformation which is the first to be parsed, but the last to be + executed. */ + int transformed_depth = 0; + + /* The minimum nesting depth required by the last parsed transformation. */ + int last_depth = required_depth; + while ((c = c_parser_omp_loop_transform_clause (parser))) + { + /* The nesting depth left after the current transformation */ + int depth = 1; + if (TREE_CODE (c) == ERROR_MARK) + goto error; + + gcc_assert (!TREE_CHAIN (c)); + switch (OMP_CLAUSE_CODE (c)) { - error_at (tok->location, "%<#pragma omp unroll%> without " - "%<partial%> clause is invalid here; " - "turns loop into non-loop"); - continue; + case OMP_CLAUSE_UNROLL_FULL: + error_at (OMP_CLAUSE_LOCATION (c), + "%<full%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_NONE: + error_at (OMP_CLAUSE_LOCATION (c), + "%<#pragma omp unroll%> without " + "%<partial%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_PARTIAL: + depth = 1; + break; + case OMP_CLAUSE_TILE: + depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + break; + default: + gcc_unreachable (); + } + + if (depth < last_depth) + { + bool is_outermost_clause = !transformed_depth; + error_at (OMP_CLAUSE_LOCATION (c), + "nesting depth left after this transformation too low " + "for %s", + is_outermost_clause ?
outer_descr + : "outer transformation"); + goto error; } - clauses = chainon (clauses, c); + last_depth = depth; + + if (!transformed_depth) + transformed_depth = last_depth; + + if (!clauses) + clauses = c; + else if (last_c) + TREE_CHAIN (last_c) = c; + + last_c = c; } - return found_unroll; + return transformed_depth; + +error: + while (c_parser_omp_loop_transform_clause (parser)) + ; + clauses = NULL_TREE; + return -1; } +/* OpenMP 5.1: + tile sizes ( size-expr-list ) */ + +static tree +c_parser_omp_tile (location_t loc, c_parser *parser, bool *if_p) +{ + tree block; + tree ret = error_mark_node; + + tree clauses = c_parser_omp_tile_sizes (parser, loc); + c_parser_skip_to_pragma_eol (parser); + + if (!clauses || clauses == error_mark_node) + return error_mark_node; + + int required_depth = list_length (OMP_CLAUSE_TILE_SIZES (clauses)); + c_parser_omp_nested_loop_transform_clauses (parser, clauses, required_depth, + "outer transformation"); + + block = c_begin_compound_stmt (true); + ret = c_parser_omp_for_loop (loc, parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = c_end_compound_stmt (loc, block, true); + add_stmt (block); + + return ret; + } + static tree c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) { @@ -23994,7 +24184,9 @@ c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) omp_clause_mask mask = OMP_UNROLL_CLAUSE_MASK; tree clauses = c_parser_omp_all_clauses (parser, mask, p_name, false); - c_parser_nested_omp_unroll_clauses (parser, clauses); + int required_depth = 1; + c_parser_omp_nested_loop_transform_clauses (parser, clauses, required_depth, + "outer transformation"); if (!clauses) { @@ -24496,6 +24688,9 @@ c_parser_omp_construct (c_parser *parser, bool *if_p) case PRAGMA_OMP_ASSUME: c_parser_omp_assume (parser, if_p); return; + case PRAGMA_OMP_TILE: + stmt = c_parser_omp_tile (loc, parser, if_p); + break; case PRAGMA_OMP_UNROLL: stmt = c_parser_omp_unroll (loc, parser, if_p); break; diff --git 
a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 90af40c4dbc..084ecd3ada5 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -43631,7 +43631,8 @@ cp_parser_omp_scan_loop_body (cp_parser *parser) braces.require_close (parser); } -static bool cp_parser_nested_omp_unroll_clauses (cp_parser *, tree &); +static int cp_parser_omp_nested_loop_transform_clauses (cp_parser *, tree &, + int, const char *); /* Parse the restricted form of the for statement allowed by OpenMP. */ @@ -43643,20 +43644,20 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, tree orig_decl; tree real_decl, initv, condv, incrv, declv, orig_declv; tree this_pre_body, cl, ordered_cl = NULL_TREE; - location_t loc_first; bool collapse_err = false; int i, collapse = 1, ordered = 0, count, nbraces = 0; releasing_vec for_block; auto_vec orig_inits; - bool tiling = false; + bool oacc_tiling = false; bool inscan = false; + location_t loc_first = cp_lexer_peek_token (parser->lexer)->location; for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) collapse = tree_to_shwi (OMP_CLAUSE_COLLAPSE_EXPR (cl)); else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_OACC_TILE) { - tiling = true; + oacc_tiling = true; collapse = list_length (OMP_CLAUSE_OACC_TILE_LIST (cl)); } else if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_ORDERED @@ -43679,26 +43680,33 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, ordered = collapse; } - gcc_assert (tiling || (collapse >= 1 && ordered >= 0)); + + gcc_assert (oacc_tiling || (collapse >= 1 && ordered >= 0)); count = ordered ? ordered : collapse; + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, count, + "loop collapse"); + + /* Find the depth of the loop nest affected by "omp tile" + directives. There can be several such directives, but the tiling + depth of the outer ones may not be larger than the depth of the + innermost directive. 
*/ + int omp_tile_depth = 0; + for (tree c = clauses; c; c = TREE_CHAIN (c)) + { + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + omp_tile_depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + } + count = MAX (count, omp_tile_depth); + declv = make_tree_vec (count); initv = make_tree_vec (count); condv = make_tree_vec (count); incrv = make_tree_vec (count); orig_declv = NULL_TREE; - loc_first = cp_lexer_peek_token (parser->lexer)->location; - - if (cp_parser_nested_omp_unroll_clauses (parser, clauses) - && count > 1) - { - error_at (loc_first, - "collapse cannot be larger than 1 on an unrolled loop"); - return NULL; - } - - for (i = 0; i < count; i++) { int bracecount = 0; @@ -45734,51 +45742,224 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, return true; } +/* OpenMP 5.1: Parse sizes list for "omp tile sizes" + sizes ( size-expr-list ) */ +static tree +cp_parser_omp_tile_sizes (cp_parser *parser, location_t loc) +{ + tree sizes = NULL_TREE; + cp_lexer *lexer = parser->lexer; + + cp_token *tok = cp_lexer_peek_token (lexer); + if (tok->type != CPP_NAME + || strcmp ("sizes", IDENTIFIER_POINTER (tok->u.value))) + { + cp_parser_error (parser, "expected %<sizes%>"); + return error_mark_node; + } + cp_lexer_consume_token (lexer); + + if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN)) + return error_mark_node; + + do + { + if (sizes && !cp_parser_require (parser, CPP_COMMA, RT_COMMA)) + return error_mark_node; + + tree expr = cp_parser_constant_expression (parser); + if (expr == error_mark_node) + { + cp_parser_skip_to_closing_parenthesis (parser, + /*recovering=*/true, + /*or_comma=*/false, + /*consume_paren=*/ + true); + return error_mark_node; + } + + sizes = tree_cons (NULL_TREE, expr, sizes); + } + while (cp_lexer_next_token_is_not (lexer, CPP_CLOSE_PAREN)); + cp_lexer_consume_token (lexer); + + gcc_assert (sizes); + tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TILE_SIZES (c) = sizes; + + return c; +} + +/* OpenMP
5.1: + tile sizes ( size-expr-list ) */ + +static tree +cp_parser_omp_tile (cp_parser *parser, cp_token *tok, bool *if_p) +{ + tree block; + tree ret = error_mark_node; + + tree clauses = cp_parser_omp_tile_sizes (parser, tok->location); + cp_parser_require_pragma_eol (parser, tok); + + if (!clauses || clauses == error_mark_node) + return error_mark_node; + + int required_depth = list_length (OMP_CLAUSE_TILE_SIZES (clauses)); + cp_parser_omp_nested_loop_transform_clauses ( + parser, clauses, required_depth, "outer transformation"); + + block = begin_omp_structured_block (); + clauses = finish_omp_clauses (clauses, C_ORT_OMP); + + ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); + block = finish_omp_structured_block (block); + add_stmt (block); + + return ret; +} + #define OMP_UNROLL_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PARTIAL) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FULL) ) -/* Parse zero or more '#pragma omp unroll' that follow - another directive that requires a canonical loop nest. */ +/* Parse a single OpenMP loop transformation directive and return the + clause that is used internally to represent the directive. 
*/ -static bool -cp_parser_nested_omp_unroll_clauses (cp_parser *parser, tree &clauses) +static tree +cp_parser_omp_loop_transform_clause (cp_parser *parser) { - static const char *p_name = "#pragma omp unroll"; - cp_token *tok; - bool unroll_found = false; - while (cp_lexer_next_token_is (parser->lexer, CPP_PRAGMA) - && (tok = cp_lexer_peek_token (parser->lexer), - cp_parser_pragma_kind (tok) == PRAGMA_OMP_UNROLL)) + cp_lexer *lexer = parser->lexer; + cp_token *tok = cp_lexer_peek_token (lexer); + if (tok->type != CPP_PRAGMA) + return NULL_TREE; + + tree c; + switch (cp_parser_pragma_kind (tok)) { - cp_lexer_consume_token (parser->lexer); - gcc_assert (tok->type == CPP_PRAGMA); - parser->lexer->in_pragma = true; - tree c = cp_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, - p_name, tok); - if (c) - { - gcc_assert (!TREE_CHAIN (c)); - unroll_found = true; - if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_UNROLL_FULL) - { - error_at (tok->location, "%<full%> clause is invalid here; " - "turns loop into non-loop"); - continue; - } + case PRAGMA_OMP_UNROLL: + cp_lexer_consume_token (lexer); + lexer->in_pragma = true; + c = cp_parser_omp_all_clauses (parser, OMP_UNROLL_CLAUSE_MASK, + "#pragma omp unroll", tok, + false, true); + if (!c) + { + if (cp_lexer_next_token_is (lexer, CPP_PRAGMA_EOL)) + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + else + c = error_mark_node; + } + cp_parser_skip_to_pragma_eol (parser, tok); + break; - c = finish_omp_clauses (c, C_ORT_OMP); + case PRAGMA_OMP_TILE: + cp_lexer_consume_token (lexer); + lexer->in_pragma = true; + c = cp_parser_omp_tile_sizes (parser, tok->location); + cp_parser_require_pragma_eol (parser, tok); + break; + + default: + c = NULL_TREE; + break; + } + + gcc_assert (!c || !TREE_CHAIN (c)); + return c; +} + +/* Parse zero or more OpenMP loop transformation directives that + follow another directive that requires a canonical loop nest and + append all to CLAUSES.
Return the nesting depth + of the transformed loop nest. + + REQUIRED_DEPTH is the nesting depth of the loop nest required by + the preceding directive. OUTER_DESCR is a description of the + language construct that requires the loop nest depth (e.g. "loop + collapse", "outer transformation") that is used for error + messages. */ + +static int +cp_parser_omp_nested_loop_transform_clauses (cp_parser *parser, tree &clauses, + int required_depth, + const char *outer_descr) +{ + tree c = NULL_TREE; + tree last_c = tree_last (clauses); + + /* The depth of the loop nest after the transformations. That is, + the nesting depth left by the outermost transformation which is + the first to be parsed, but the last to be executed. */ + int transformed_depth = 0; + + /* The minimum nesting depth required by the last parsed transformation. */ + int last_depth = required_depth; + + while ((c = cp_parser_omp_loop_transform_clause (parser))) + { + /* The nesting depth left after the current transformation */ + int depth = 1; + if (TREE_CODE (c) == ERROR_MARK) + goto error; + + gcc_assert (!TREE_CHAIN (c)); + switch (OMP_CLAUSE_CODE (c)) + { + case OMP_CLAUSE_UNROLL_FULL: + error_at (OMP_CLAUSE_LOCATION (c), + "%<full%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_NONE: + error_at (OMP_CLAUSE_LOCATION (c), + "%<#pragma omp unroll%> without " + "%<partial%> clause is invalid here; " + "turns loop into non-loop"); + goto error; + case OMP_CLAUSE_UNROLL_PARTIAL: + depth = 1; + break; + case OMP_CLAUSE_TILE: + depth = list_length (OMP_CLAUSE_TILE_SIZES (c)); + break; + default: + gcc_unreachable (); } - else + + if (depth < last_depth) { - error_at (tok->location, "%<#pragma omp unroll%> without " - "%<partial%> clause is invalid here; " - "turns loop into non-loop"); - continue; + bool is_outermost_clause = !transformed_depth; + error_at (OMP_CLAUSE_LOCATION (c), + "nesting depth left after this transformation too low " + "for %s", + is_outermost_clause ?
outer_descr + : "outer transformation"); + goto error; } - clauses = chainon (clauses, c); + + last_depth = depth; + + if (!transformed_depth) + transformed_depth = last_depth; + + c = finish_omp_clauses (c, C_ORT_OMP); + + if (!clauses) + clauses = c; + else if (last_c) + TREE_CHAIN (last_c) = c; + + last_c = c; } - return unroll_found; + + return transformed_depth; + +error: + while (cp_parser_omp_loop_transform_clause (parser)) + ; + clauses = NULL_TREE; + return -1; } static tree @@ -45788,7 +45969,7 @@ cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) static const char *p_name = "#pragma omp unroll"; omp_clause_mask mask = OMP_UNROLL_CLAUSE_MASK; - tree clauses = cp_parser_omp_all_clauses (parser, mask, p_name, tok, false); + tree clauses = cp_parser_omp_all_clauses (parser, mask, p_name, tok, true); if (!clauses) { @@ -45797,7 +45978,9 @@ cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) clauses = c; } - cp_parser_nested_omp_unroll_clauses (parser, clauses); + int required_depth = 1; + cp_parser_omp_nested_loop_transform_clauses ( + parser, clauses, required_depth, "outer transformation"); block = begin_omp_structured_block (); ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); @@ -48900,6 +49083,9 @@ cp_parser_omp_construct (cp_parser *parser, cp_token *pragma_tok, bool *if_p) case PRAGMA_OMP_ASSUME: cp_parser_omp_assume (parser, pragma_tok, if_p); return; + case PRAGMA_OMP_TILE: + stmt = cp_parser_omp_tile (parser, pragma_tok, if_p); + break; case PRAGMA_OMP_UNROLL: stmt = cp_parser_omp_unroll (parser, pragma_tok, if_p); break; @@ -49529,6 +49715,7 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p) cp_parser_omp_construct (parser, pragma_tok, if_p); pop_omp_privatization_clauses (stmt); return true; + case PRAGMA_OMP_TILE: case PRAGMA_OMP_UNROLL: if (context != pragma_stmt && context != pragma_compound) goto bad_stmt; diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 
16197b17e5a..a9d36d66caf 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18087,6 +18087,7 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, case OMP_CLAUSE_WAIT: case OMP_CLAUSE_DETACH: case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_TILE: OMP_CLAUSE_OPERAND (nc, 0) = tsubst_expr (OMP_CLAUSE_OPERAND (oc, 0), args, complain, in_decl); break; diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index c87e252ff06..15f7c7e6dc4 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -8769,6 +8769,46 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } break; + case OMP_CLAUSE_TILE: + for (tree list = OMP_CLAUSE_TILE_SIZES (c); !remove && list; + list = TREE_CHAIN (list)) + { + t = TREE_VALUE (list); + + if (t == error_mark_node) + remove = true; + else if (!type_dependent_expression_p (t) + && !INTEGRAL_TYPE_P (TREE_TYPE (t))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<tile sizes%> argument needs integral type"); + remove = true; + } + else + { + t = mark_rvalue_use (t); + if (!processing_template_decl) + { + t = maybe_constant_value (t); + int n; + if (!tree_fits_shwi_p (t) + || !INTEGRAL_TYPE_P (TREE_TYPE (t)) + || (n = tree_to_shwi (t)) <= 0 || (int)n != n) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%<tile sizes%> argument needs positive " + "integral constant"); + remove = true; + } + t = fold_build_cleanup_point_expr (TREE_TYPE (t), t); + } + } + + /* Update list item. */ + TREE_VALUE (list) = t; + } + break; + case OMP_CLAUSE_ORDERED: ordered_seen = true; break; diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 4d504a12451..365897afb61 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -13572,6 +13572,29 @@ find_standalone_omp_ordered (tree *tp, int *walk_subtrees, void *) return NULL_TREE; } +static void omp_for_drop_tile_clauses (tree for_stmt) +{ + /* Drop erroneous loop transformation clauses to avoid follow up errors + in pass-omp_transform_loops.
*/ + tree last_c = NULL_TREE; + for (tree c = OMP_FOR_CLAUSES (for_stmt); c; + c = OMP_CLAUSE_CHAIN (c)) + { + + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_TILE) + continue; + + if (last_c) + TREE_CHAIN (last_c) = TREE_CHAIN (c); + else + OMP_FOR_CLAUSES (for_stmt) = TREE_CHAIN (c); + + error_at (OMP_CLAUSE_LOCATION (c), + "'tile' loop transformation may not appear on " + "non-rectangular for"); + } +} + /* Gimplify the gross structure of an OMP_FOR statement. */ static enum gimplify_status @@ -13763,6 +13786,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_FOR: if (OMP_FOR_NON_RECTANGULAR (inner_for_stmt ? inner_for_stmt : for_stmt)) { + omp_for_drop_tile_clauses (for_stmt); + if (omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_SCHEDULE)) error_at (EXPR_LOCATION (for_stmt), @@ -13808,6 +13833,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) ort = ORT_SIMD; break; case OMP_LOOP_TRANS: + if (OMP_FOR_NON_RECTANGULAR (inner_for_stmt ? inner_for_stmt : for_stmt)) + omp_for_drop_tile_clauses (for_stmt); break; default: gcc_unreachable (); @@ -14693,6 +14720,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_CLAUSE_UNROLL_FULL: case OMP_CLAUSE_UNROLL_NONE: case OMP_CLAUSE_UNROLL_PARTIAL: + case OMP_CLAUSE_TILE: *gfor_clauses_ptr = c; gfor_clauses_ptr = &OMP_CLAUSE_CHAIN (c); break; diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c new file mode 100644 index 00000000000..8a2f2126af4 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-1.c @@ -0,0 +1,164 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(0) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(-1) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for 
(int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes() /* { dg-error {expected expression before} "" { target c} } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(,) /* { dg-error {expected expression before} "" { target c } } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1,2 /* { dg-error {expected '\,' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes /* { dg-error {expected '\(' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) sizes(1) /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(5, 6) + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll partia /* { dg-error {expected '#pragma omp' clause before 'partia'} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll full /* { 
dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1) + #pragma omp unroll partial + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + /* { dg-error {'i' was not declared in this scope} "" { target c++ } .-2 } */ + + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; ++j) + dummy (i); + } + + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + { + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for 
loops to collapse} "" { target c++ } .-1 } */ + for (int j = 0; j < 100; ++j) + dummy (j); + } + + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); /* { dg-error {collapsed loops not perfectly nested before 'dummy'} "" { target c} } */ + /* { dg-error {collapsed loops not perfectly nested} "" { target c++ } .-1 } */ + } + + int s; + #pragma omp tile sizes(s) /* { dg-error {'tile sizes' argument needs positive integral constant} "" { target { ! c++98_only } } } */ + /* { dg-error {the value of 's' is not usable in a constant expression} "" { target { c++ && { ! c++98_only } } } .-1 } */ + /* { dg-error {'s' cannot appear in a constant-expression} "" { target c++98_only } .-2 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp tile sizes(42.0) /* { dg-error {'tile sizes' argument needs positive integral constant} "" { target c } } */ + /* { dg-error {'tile sizes' argument needs integral type} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c new file mode 100644 index 00000000000..51d62552945 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-2.c @@ -0,0 +1,183 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(0) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(-1) /* { dg-error {'tile sizes' argument needs positive integral constant} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes() /* { dg-error {expected expression before} "" { target c} } */ + /* { dg-error {expected 
primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(,) /* { dg-error {expected expression before} "" { target c } } */ + /* { dg-error {expected primary-expression before} "" { target c++ } .-1 } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1,2 /* { dg-error {expected '\,' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes /* { dg-error {expected '\(' before end of line} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) sizes(1) /* { dg-error {expected end of line before 'sizes'} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(5, 6) + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll partia /* { dg-error {expected '#pragma omp' clause before 'partia'} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} } */ 
+ for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1) + #pragma omp unroll partial + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(8,8) + #pragma omp unroll partial /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 2; j < i; ++j) + dummy (i); + + #pragma omp parallel for + #pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + /* { dg-error {'i' was not declared in this scope} "" { target c++ } .-2 } */ + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + dummy (i); + for (int j = 0; j < 100; 
++j) + dummy (i); + } + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); + } + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + { + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + for (int j = 0; j < 100; ++j) + dummy (j); + } + + #pragma omp parallel for + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + { + for (int j = 0; j < 100; ++j) + dummy (j); + dummy (i); /* { dg-error {collapsed loops not perfectly nested before 'dummy'} "" { target c} } */ + /* { dg-error {collapsed loops not perfectly nested} "" { target c++ } .-1 } */ + } + + #pragma omp parallel for + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c new file mode 100644 index 00000000000..7fffc72b335 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-3.c @@ -0,0 +1,117 @@ +extern void dummy (int); + +void +test () +{ + #pragma omp for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = i; j < 100; ++j) + dummy (i); + + #pragma omp for + #pragma omp tile sizes(1, 2) /* { dg-error {'tile' loop transformation may not appear on non-rectangular for} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < i; ++j) + dummy (i); + + +#pragma omp for collapse(1) + #pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) + #pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) 
+ for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) + #pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(3) + #pragma omp tile sizes(1, 2) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } .-1 } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-2 } */ + /* { dg-error {'i' was not declared in this scope} "" { target c++ } .-3 } */ + +#pragma omp for collapse(1) +#pragma omp tile sizes(1) +#pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + +#pragma omp for collapse(2) +#pragma omp tile sizes(5, 6) +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + + +#pragma omp for collapse(1) +#pragma omp tile sizes(1) +#pragma omp tile sizes(1) + for (int i = 0; i < 100; ++i) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp 
tile sizes(1) /* { dg-error {nesting depth left after this transformation too low for outer transformation} } */ + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(1, 2) +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); + +#pragma omp for collapse(2) +#pragma omp tile sizes(5, 6) +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); + +#pragma omp for collapse(3) +#pragma omp tile sizes(1, 2) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ +#pragma omp tile sizes(1, 2) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + dummy (i); /* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } } */ + /* { dg-error {not enough for loops to collapse} "" { target c++ } .-1 } */ + +#pragma omp for collapse(3) +#pragma omp tile sizes(5, 6) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ +#pragma omp tile sizes(1, 2, 3) + for (int i = 0; i < 100; ++i) + for (int j = 0; j < 100; ++j) + for (int k = 0; k < 100; ++k) + dummy (i); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c new file mode 100644 index 00000000000..d46bb0cb642 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-4.c @@ -0,0 +1,322 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d, expected %d\n", __FILE__, __LINE__, var, val); \ + __builtin_abort (); } + +int +test1 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(3) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int 
+test2 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(3) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int +test3 () +{ + int iter = 0; + int i; +#pragma omp tile sizes(8) + for (i = 0; i < 10; i=i+2) + { + ASSERT_EQ (i, iter) + iter = iter + 2; + } + + ASSERT_EQ (i, 10) + return iter; +} + +int +test4 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(8) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + ASSERT_EQ (i, 0) + return iter; +} + +int +test5 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(71) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + + ASSERT_EQ (i, 0) + return iter; +} + +int +test6 () +{ + int iter = 10; + int i; +#pragma omp tile sizes(1) + for (i = 10; i > 0; i=i-2) + { + ASSERT_EQ (i, iter) + iter = iter - 2; + } + ASSERT_EQ (i, 0) + return iter; +} + +int +test7 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(2) + for (i = 5; i < -5; i=i-3) + { + fprintf (stderr, "%d\n", i); + __builtin_abort (); + iter = iter - 3; + } + + ASSERT_EQ (i, 5) + + /* No iteration expected */ + return iter; +} + +int +test8 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(2) + for (i = 5; i > -5; i=i-3) + { + ASSERT_EQ (i, iter) + /* Expect only first iteration of the last tile to execute */ + if (iter != -4) + iter = iter - 3; + } + + ASSERT_EQ (i, -7) + return iter; +} + + +int +test9 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(5) + for (i = 5; i >= -5; i=i-4) + { + ASSERT_EQ (i, iter) + /* Expect only first iteration of the last tile to execute */ + if (iter != - 3) + iter = iter - 4; + } + + ASSERT_EQ (i, -7) + return iter; +} + +int +test10 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(5) + for (i = 5; i >= -5; i--) + { + ASSERT_EQ (i, iter) + iter--; + } + + ASSERT_EQ (i, -6) + return iter; +} + +int +test11 () +{ + int iter = 5; + int i; +#pragma omp tile sizes(15) + 
for (i = 5; i != -5; i--) + { + ASSERT_EQ (i, iter) + iter--; + } + ASSERT_EQ (i, -5) + return iter; +} + +int +test12 () +{ + int iter = 0; + unsigned i; +#pragma omp tile sizes(3) + for (i = 0; i != 5; i++) + { + ASSERT_EQ (i, iter) + iter++; + } + + ASSERT_EQ (i, 5) + return iter; +} + +int +test13 () +{ + int iter = -5; + long long unsigned int i; +#pragma omp tile sizes(15) + for (int i = -5; i < 5; i=i+3) + { + ASSERT_EQ (i, iter) + iter++; + } + + ASSERT_EQ (i, 5) + return iter; +} + +int +test14 (unsigned init, int step) +{ + int iter = init; + long long unsigned int i; +#pragma omp tile sizes(8) + for (i = init; i < 2*init; i=i+step) + iter++; + + ASSERT_EQ (i, 2*init) + return iter; +} + +int +test15 (unsigned init, int step) +{ + int iter = init; + int i; +#pragma omp tile sizes(8) + for (unsigned i = init; i > 2* init; i=i+step) + iter++; + + return iter; +} + +int +main () +{ + int last_iter; + + last_iter = test1 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test2 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test3 (); + ASSERT_EQ (last_iter, 10); + + last_iter = test4 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test5 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test6 (); + ASSERT_EQ (last_iter, 0); + + last_iter = test7 (); + ASSERT_EQ (last_iter, 5); + + last_iter = test8 (); + ASSERT_EQ (last_iter, -4); + + last_iter = test9 (); + ASSERT_EQ (last_iter, -3); + + last_iter = test10 (); + ASSERT_EQ (last_iter, -6); + return 0; + + last_iter = test11 (); + ASSERT_EQ (last_iter, -4); + return 0; + + last_iter = test12 (); + ASSERT_EQ (last_iter, 5); + return 0; + + last_iter = test13 (); + ASSERT_EQ (last_iter, 4); + return 0; + + last_iter = test14 (0, 1); + ASSERT_EQ (last_iter, 0); + return 0; + + last_iter = test14 (0, -1); + ASSERT_EQ (last_iter, 0); + return 0; + + last_iter = test14 (8, 2); + ASSERT_EQ (last_iter, 16); + return 0; + + last_iter = test14 (5, 3); + ASSERT_EQ (last_iter, 9); + return 0; + + last_iter = test15 (8, -1); + 
ASSERT_EQ (last_iter, 9); + return 0; + + last_iter = test15 (8, -2); + ASSERT_EQ (last_iter, 10); + return 0; + + last_iter = test15 (5, -3); + ASSERT_EQ (last_iter, 6); + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c new file mode 100644 index 00000000000..815318ab27a --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-5.c @@ -0,0 +1,150 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +test1 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i < data + 10 ; i++) + { + ASSERT_EQ (*i, data[iter]); + ASSERT_EQ_PTR (i, data + iter); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test2 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i < data + 10 ; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test3 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i <= data + 9 ; i=i+2) + { + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test4 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(5) + for (i = data; i != data + 10 ; i=i+1) + { + ASSERT_EQ (*i, data[iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data + 10) + return iter; +} + +int +test5 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(3) + for (i = data + 9; i >= data ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + 
ASSERT_EQ_PTR (i, data - 1) + return iter; +} + +int +test6 (int data[10]) +{ + int iter = 0; + int *i; + #pragma omp tile sizes(3) + for (i = data + 9; i > data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + ASSERT_EQ_PTR (i, data - 1) + return iter; +} + +int +test7 (int data[10]) +{ + int iter = 0; + #pragma omp tile sizes(1) + for (int *i = data + 9; i != data - 1 ; i--) + { + ASSERT_EQ (*i, data[9 - iter]); + iter++; + } + + return iter; +} + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; + + iter_count = test1 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test2 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test3 (data); + ASSERT_EQ (iter_count, 5); + + iter_count = test4 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test5 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test6 (data); + ASSERT_EQ (iter_count, 10); + + iter_count = test7 (data); + ASSERT_EQ (iter_count, 10); +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c new file mode 100644 index 00000000000..8132128a5a8 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-6.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include + +int +test1 () +{ + int sum = 0; +for (int k = 0; k < 10; k++) + { +#pragma omp tile sizes(5,7) + for (int i = 0; i < 10; i++) + for (int j = 0; j < 10; j=j+2) + { + sum = sum + 1; + } + } + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 500) + { + fprintf (stderr, "Wrong result: %d\n", result); + __builtin_abort (); + } + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c new file mode 100644 index 00000000000..cd25a62c5c0 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-7.c @@ -0,0 +1,31 @@ +/* { dg-do run 
} */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +#define ASSERT_EQ_PTR(var, ptr) if (var != ptr) { fprintf (stderr, "%s:%d: Unexpected value %p\n", __FILE__, __LINE__, var); \ + __builtin_abort (); } + +int +main () +{ + int iter_count; + int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; + + int iter = 0; + int *i; + #pragma omp tile sizes(1) + for (i = data; i < data + 10; i=i+2) + { + ASSERT_EQ_PTR (i, data + 2 * iter); + ASSERT_EQ (*i, data[2 * iter]); + iter++; + } + + unsigned long real_iter_count = ((unsigned long)i - (unsigned long)data) / (sizeof (int) * 2); + ASSERT_EQ (real_iter_count, 5); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c new file mode 100644 index 00000000000..c26e03d7e74 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/tile-8.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#include + +#define ASSERT_EQ(var, val) if (var != val) { fprintf (stderr, "%s:%d: Unexpected value %d, expected %d\n", __FILE__, __LINE__, var, val); \ + __builtin_abort (); } + +int +main () +{ + int iter_j = 0, iter_k = 0; + unsigned i, j, k; +#pragma omp tile sizes(3,5,8) + for (i = 0; i < 2; i=i+2) + for (j = 0; j < 3; j=j+1) + for (k = 0; k < 5; k=k+3) + { + /* fprintf (stderr, "i=%d j=%d k=%d\n", i, j, k); + * fprintf (stderr, "iter_j=%d iter_k=%d\n", iter_j, iter_k); */ + ASSERT_EQ (i, 0); + if (k == 0) + { + ASSERT_EQ (j, iter_j); + iter_k = 0; + } + + ASSERT_EQ (k, iter_k); + + iter_k = iter_k + 3; + if (k == 3) + iter_j++; + } + + ASSERT_EQ (i, 2); + ASSERT_EQ (j, 3); + ASSERT_EQ (k, 6); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c index 
8f7c3088a2e..e4fee72c04d 100644 --- a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-2.c @@ -19,7 +19,7 @@ test () #pragma omp for #pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ -#pragma omp unroll full /* { dg-error {'full' clause is invalid here; turns loop into non-loop} } */ +#pragma omp unroll full for (int i = -300; i != 100; ++i) dummy (i); @@ -45,13 +45,11 @@ test () int i; #pragma omp for #pragma omp unroll( /* { dg-error {expected '#pragma omp' clause before '\(' token} } */ - /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ for (int i = -300; i != 100; ++i) dummy (i); #pragma omp for #pragma omp unroll foo /* { dg-error {expected '#pragma omp' clause before 'foo'} } */ - /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ for (int i = -300; i != 100; ++i) dummy (i); @@ -67,7 +65,7 @@ test () #pragma omp unroll partial(i) /* { dg-error {the value of 'i' is not usable in a constant expression} "" { target c++ } .-1 } */ - /* { dg-error {partial argument needs positive constant integer expression} "" { target c } .-2 } */ + /* { dg-error {partial argument needs positive constant integer expression} "" { target *-*-* } .-2 } */ for (int i = -300; i != 100; ++i) dummy (i); @@ -78,20 +76,18 @@ test () #pragma omp for #pragma omp unroll partial(1) #pragma omp unroll parti /* { dg-error {expected '#pragma omp' clause before 'parti'} } */ - /* { dg-error {'#pragma omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ for (int i = -300; i != 100; ++i) dummy (i); #pragma omp for #pragma omp unroll partial(1) #pragma omp unroll parti /* { dg-error {expected '#pragma omp' clause before 'parti'} } */ - /* { dg-error {'#pragma 
omp unroll' without 'partial' clause is invalid here; turns loop into non-loop} "" { target *-*-* } .-1 } */ for (int i = -300; i != 100; ++i) dummy (i); int sum = 0; -#pragma omp parallel for reduction(+ : sum) collapse(2) /* { dg-error {collapse cannot be larger than 1 on an unrolled loop} "" { target c } } */ -#pragma omp unroll partial(1) /* { dg-error {collapse cannot be larger than 1 on an unrolled loop} "" { target c++ } } */ +#pragma omp parallel for reduction(+ : sum) collapse(2) +#pragma omp unroll partial(1) /* { dg-error {nesting depth left after this transformation too low for loop collapse} } */ for (int i = 3; i < 10; ++i) for (int j = -2; j < 7; ++j) sum++; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h new file mode 100644 index 00000000000..166d1d48677 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1.h @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp tile sizes (U, 10, V) + for (T i : v) + for (T j : v) + for (T k : v) + dummy (i); +} + +void test () { test1_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C new file mode 100644 index 00000000000..1ee76da3d4a --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1a.C @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp teams distribute parallel for num_teams(V) + for (int i : v) + dummy (i); + +#pragma omp tile sizes (V, U) + for (T i : v) + for (T j : v) + for (T k : v) + 
dummy (i); +} + +void test () { test1_template (); }; diff --git a/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C new file mode 100644 index 00000000000..263c9b301c6 --- /dev/null +++ b/gcc/testsuite/g++.dg/gomp/loop-transforms/tile-1b.C @@ -0,0 +1,27 @@ +// { dg-do compile } +// { dg-additional-options "-std=c++11" } + +#include + +extern void dummy (int); + +template void +test1_template () +{ + std::vector v; + + for (unsigned i = 0; i < 10; i++) + v.push_back (i); + +#pragma omp for + for (int i : v) + dummy (i); + +#pragma omp tile sizes (U, 10, V) // { dg-error {'tile sizes' argument needs positive integral constant} } + for (T i : v) + for (T j : v) + for (T k : v) + dummy (i); +} + +void test () { test1_template (); }; diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C new file mode 100644 index 00000000000..2a4d760720d --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C @@ -0,0 +1,52 @@ +#include +#include +#include + +void +mult (float *matrix1, float *matrix2, float *result, unsigned dim0, + unsigned dim1) +{ + memset (result, 0, sizeof (float) * dim0 * dim1); +#pragma omp target parallel for collapse(3) map(tofrom:result[0:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) +#pragma omp tile sizes(8, 16, 4) + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + for (unsigned k = 0; k < dim1; k++) + result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; +} + +int +main () +{ + unsigned dim0 = 20; + unsigned dim1 = 20; + + float *result = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + matrix1[i * dim1 + j] = j; + + for (unsigned i = 0; i < 
dim1; i++) + for (unsigned j = 0; j < dim0; j++) + if (i == j) + matrix2[i * dim0 + j] = 1; + else + matrix2[i * dim0 + j] = 0; + + mult (matrix1, matrix2, result, dim0, dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + { + if (matrix1[i * dim1 + j] != result[i * dim1 + j]) + { + printf ("ERROR at %d, %d\n", i, j); + __builtin_abort (); + } + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C new file mode 100644 index 00000000000..780421fa4c7 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-2.C @@ -0,0 +1,69 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include +#include + +constexpr unsigned fib (unsigned n) +{ + return n <= 2 ? 1 : fib (n-1) + fib (n-2); +} + +int +test1 () +{ + std::vector v; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (1); + + int sum = 0; + for (int k = 0; k < 10; k++) + #pragma omp tile sizes(fib(4)) + for (int i : v) { + for (int j = 8; j != -2; --j) + sum = sum + i; + } + + return sum; +} + +int +test2 () +{ + std::vector v; + + for (unsigned i = 0; i <= 10; i++) + v.push_back (i); + + int sum = 0; + for (int k = 0; k < 10; k++) +#pragma omp parallel for collapse(2) reduction(+:sum) +#pragma omp tile sizes(fib(4), 1) + for (int i : v) + for (int j = 8; j > -2; --j) + sum = sum + i; + + return sum; +} + +int +main () +{ + int result = test1 (); + + if (result != 1000) + { + fprintf (stderr, "%d: Wrong result: %d\n", __LINE__, result); + __builtin_abort (); + } + + result = test2 (); + if (result != 5500) + { + fprintf (stderr, "%d: Wrong result: %d\n", __LINE__, result); + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C new file mode 100644 index 00000000000..91ec8f5c137 --- /dev/null +++ 
b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-3.C @@ -0,0 +1,28 @@ +// { dg-additional-options "-std=c++11" } +// { dg-additional-options "-O0" } + +#include + +int +main () +{ + std::vector v; + std::vector w; + + for (unsigned i = 0; i <= 9; i++) + v.push_back (i); + + int iter = 0; +#pragma omp for +#pragma omp tile sizes(5) + for (int i : v) + { + w.push_back (iter); + iter++; + } + + for (int i = 0; i < w.size (); i++) + if (w[i] != i) + __builtin_abort (); + return 0; +} From patchwork Fri Mar 24 15:30:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Frederik Harwath X-Patchwork-Id: 66865 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F00563881D25 for ; Fri, 24 Mar 2023 15:53:20 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id BAFF0383FB94; Fri, 24 Mar 2023 15:51:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BAFF0383FB94 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,288,1673942400"; d="scan'208";a="274548" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 24 Mar 2023 07:31:26 -0800 IronPort-SDR: f7z0Mukx/xVpIbMmh6yATJaH3WQSY2TGvCVSKYvR0yrdbd1cKZdR6WT4Ml983GyWztwvk61qPr CHOJGEjLcNDvIsJ2OnvLcixxVzsLdV1M5vk/Y9TclpTuq+4qLFwi+cuc+EUNhAl3cogNiZ7FkN M8WCJ8vqNUXRI7NeXzT0ZHmrqvWsJOi/EuPFG5awMaJXODKz3GN5v+JRwu9EYaxj8QEHaBbxLN pLHR2tSjvm/3tzs/lhs+hldJaySsmK9d9S1lJgNS5gvRnZ6XbxZDtLRKNKrsynq+S11PgNwHv3 fYQ= From: Frederik Harwath To: , , , Subject: [PATCH 6/7] openmp: 
Add Fortran support for loop transformations on inner loops
Date: Fri, 24 Mar 2023 16:30:44 +0100
Message-ID: <20230324153046.3996092-7-frederik@codesourcery.com>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com>
References: <20230324153046.3996092-1-frederik@codesourcery.com>
MIME-Version: 1.0

So far the implementation of the "omp tile" and "omp unroll"
directives restricted their use to the outermost loop of a loop nest.

This commit changes the Fortran front end to parse and verify the
directives on inner loops.  The transformation clauses are extended
to carry the information about the level of the loop nest at which a
transformation should be applied.  The middle end transformation pass
is adjusted to apply the transformations at the correct level of a
loop nest and to take their effect on the loop nest depth into
account.

gcc/fortran/ChangeLog:

	* openmp.cc (omp_unroll_removes_loop_nest): Move down in file.
	(resolve_loop_transform_generic): Remove, and ...
	(resolve_omp_unroll): ... inline and adapt here.  Move function.
	(find_nested_loop_in_block): New function.
	(find_nested_loop_in_chain): New function, used ...
	(is_outer_iteration_variable): ... here, and ...
	(expr_is_invariant): ... here.
	(resolve_omp_do): Adjust code for resolving loop transformations.
	(resolve_omp_tile): Likewise.
	* trans-openmp.cc (gfc_trans_omp_clauses): Set
	OMP_CLAUSE_TRANSFORM_LEVEL on new clause.
	(compute_transformed_depth): New function to compute the depth
	("collapse") of a transformed loop nest, used ...
	(gfc_trans_omp_do): ... here.

gcc/ChangeLog:

	* omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix typo in
	comment.
	(gomp_for_uncollapse): Adjust "collapse" value after uncollapse.
	(partial_unroll): Add argument for the loop nest level to be
	transformed.
	(tile): Likewise.
	(transform_gomp_for): Pass level to transformation functions.
	(optimize_transformation_clauses): Handle transformation clauses
	for all levels recursively.
	* tree-pretty-print.cc (dump_omp_clause): Print
	OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL,
	OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
	* tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL,
	OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
	* tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access clause
	operand 0.
	(OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0.
	(OMP_CLAUSE_TILE_SIZES): Likewise.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_unroll_full): Set new
	OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
	(cp_parser_omp_clause_unroll_partial): Likewise.
	(cp_parser_omp_tile_sizes): Likewise.
	(cp_parser_omp_loop_transform_clause): Likewise.
	(cp_parser_omp_nested_loop_transform_clauses): Likewise.
	(cp_parser_omp_unroll): Likewise.
	* pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL
	and OMP_CLAUSE_TILE handling to changed number of operands.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_unroll_full): Set new
	OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
	(c_parser_omp_clause_unroll_partial): Likewise.
	(c_parser_omp_tile_sizes): Likewise.
	(c_parser_omp_loop_transform_clause): Likewise.
	(c_parser_omp_nested_loop_transform_clauses): Likewise.
	(c_parser_omp_unroll): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust.
	* gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust.
	* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust.
	* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust.
	* gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test.
	* gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test.
	* gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to changed
	diagnostic messages.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.
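
For illustration only, here is a rough hand-written C sketch of what it
means to apply a tiling transformation to an inner loop while the outer
loop is left alone.  It is not code from this patch and does not claim
to match the code generated by the pass.  The helper name
tiled_inner_loop, its parameters, and the choice to count the outermost
loop as level 0 are assumptions made for this example (level 0 is the
default value the C and C++ parsers store in the new operand).

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch.  Runs a nest of depth 2
   in which only the inner loop (level 1, if the outermost loop is
   counted as level 0) has been tiled by hand with tile size TILE.
   Returns the number of times the loop body was executed.  */
static int
tiled_inner_loop (int n, int m, int tile)
{
  int count = 0;

  for (int i = 0; i < n; i++)               /* Level 0: unchanged.  */
    {
      int expected_j = 0;

      /* Loop over the tiles of the original inner loop.  */
      for (int jt = 0; jt < m; jt += tile)
        {
          /* The last tile may be partial, so clamp its end to M.  */
          int end = jt + tile < m ? jt + tile : m;

          /* Loop over the iterations inside one tile.  */
          for (int j = jt; j < end; j++)
            {
              /* Tiling a single loop keeps the original order.  */
              assert (j == expected_j);
              expected_j++;
              count++;
            }
        }
    }

  return count;
}
```

With n = 10, m = 10 and tile = 3 the body runs 100 times, the same as
in the untransformed nest; only the shape of the inner loop changes.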
---
 gcc/c/c-parser.cc | 10 +-
 gcc/cp/parser.cc | 12 +-
 gcc/cp/pt.cc | 12 +-
 gcc/fortran/openmp.cc | 173 ++++++++++++------
 gcc/fortran/trans-openmp.cc | 74 ++++++--
 gcc/omp-transform-loops.cc | 138 ++++++++------
 .../gomp/loop-transforms/inner-loops.f90 | 124 +++++++++++++
 .../gomp/loop-transforms/tile-3.f90 | 4 +-
 .../loop-transforms/tile-imperfect-nest.f90 | 93 ++++++++++
 .../loop-transforms/tile-inner-loops-1.f90 | 16 ++
 .../loop-transforms/tile-inner-loops-2.f90 | 23 +++
 .../loop-transforms/tile-inner-loops-3.f90 | 22 +++
 .../loop-transforms/tile-inner-loops-3a.f90 | 31 ++++
 .../loop-transforms/tile-inner-loops-4.f90 | 30 +++
 .../loop-transforms/tile-inner-loops-4a.f90 | 26 +++
 .../loop-transforms/tile-inner-loops-5.f90 | 123 +++++++++++++
 .../tile-non-rectangular-1.f90 | 71 +++++++
 .../tile-non-rectangular-2.f90 | 12 ++
 .../gomp/loop-transforms/unroll-8.f90 | 2 +-
 .../gomp/loop-transforms/unroll-9.f90 | 2 +-
 .../loop-transforms/unroll-inner-loop.f90 | 57 ++++++
 .../loop-transforms/unroll-non-rect-1.f90 | 31 ++++
 .../gomp/loop-transforms/unroll-tile-1.f90 | 2 +-
 .../gomp/loop-transforms/unroll-tile-2.f90 | 2 +-
 .../loop-transforms/unroll-tile-inner-1.f90 | 25 +++
 gcc/tree-pretty-print.cc | 24 +++
 gcc/tree.cc | 8 +-
 gcc/tree.h | 9 +-
 .../loop-transforms/inner-1.f90 | 77 ++++++++
 29 files changed, 1103 insertions(+), 130 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90
 create mode 100644
gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 -- 2.36.1 ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index aac23dec9c0..41f9fb90037 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -17466,6 +17466,7 @@ c_parser_omp_clause_unroll_full (c_parser *parser, tree list) location_t loc = c_parser_peek_token (parser)->location; tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -17486,6 +17487,7 @@ c_parser_omp_clause_unroll_partial (c_parser *parser, tree list) loc = c_parser_peek_token (parser)->location; c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = NULL_TREE; + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = list; if (!c_parser_next_token_is (parser, CPP_OPEN_PAREN)) @@ -24011,6 +24013,7 @@ c_parser_omp_tile_sizes (c_parser *parser, location_t loc) gcc_assert (sizes); tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = 
build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_TILE_SIZES (c) = sizes; return c; @@ -24036,7 +24039,11 @@ c_parser_omp_loop_transform_clause (c_parser *parser) if (!c) { if (c_parser_next_token_is (parser, CPP_PRAGMA_EOL)) - c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + { + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = + build_int_cst (unsigned_type_node, 0); + } else c = error_mark_node; } @@ -24191,6 +24198,7 @@ c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) if (!clauses) { tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = clauses; clauses = c; } diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 084ecd3ada5..8219c476153 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -39476,6 +39476,7 @@ cp_parser_omp_clause_unroll_full (tree list, location_t loc) return list; tree c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = list; return c; } @@ -39494,6 +39495,7 @@ cp_parser_omp_clause_unroll_partial (cp_parser *parser, tree list, tree c, num = error_mark_node; c = build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = NULL_TREE; + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = list; if (!cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) @@ -45786,6 +45788,8 @@ cp_parser_omp_tile_sizes (cp_parser *parser, location_t loc) gcc_assert (sizes); tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); OMP_CLAUSE_TILE_SIZES (c) = sizes; + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, 0); return c; } @@ -45846,7 +45850,11 @@ cp_parser_omp_loop_transform_clause (cp_parser *parser) if (!c) { if (cp_lexer_next_token_is (lexer, CPP_PRAGMA_EOL)) - c = build_omp_clause 
(tok->location, OMP_CLAUSE_UNROLL_NONE); + { + c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, 0); + } else c = error_mark_node; } @@ -45926,6 +45934,7 @@ cp_parser_omp_nested_loop_transform_clauses (cp_parser *parser, tree &clauses, default: gcc_unreachable (); } + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); if (depth < last_depth) { @@ -45974,6 +45983,7 @@ cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) if (!clauses) { tree c = build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_CHAIN (c) = clauses; clauses = c; } diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index a9d36d66caf..aeea36b24d7 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18086,11 +18086,19 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort, case OMP_CLAUSE_ASYNC: case OMP_CLAUSE_WAIT: case OMP_CLAUSE_DETACH: - case OMP_CLAUSE_UNROLL_PARTIAL: - case OMP_CLAUSE_TILE: OMP_CLAUSE_OPERAND (nc, 0) = tsubst_expr (OMP_CLAUSE_OPERAND (oc, 0), args, complain, in_decl); break; + case OMP_CLAUSE_UNROLL_PARTIAL: + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (nc) + = tsubst_expr (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (oc), args, complain, + in_decl); + break; + case OMP_CLAUSE_TILE: + OMP_CLAUSE_TILE_SIZES (nc) + = tsubst_expr (OMP_CLAUSE_TILE_SIZES (oc), args, complain, + in_decl); + break; case OMP_CLAUSE_REDUCTION: case OMP_CLAUSE_IN_REDUCTION: case OMP_CLAUSE_TASK_REDUCTION: diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 1de61029768..86e9e4ead0e 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -9389,27 +9389,79 @@ gfc_resolve_omp_local_vars (gfc_namespace *ns) gfc_traverse_ns (ns, handle_local_var); } + +/* Forward declaration for mutually recursive functions. 
*/ +static gfc_code * +find_nested_loop_in_block (gfc_code *block); + +/* Return the first nested DO loop in CHAIN, or NULL if there + isn't one. Does no error checking on intervening code. */ + +static gfc_code * +find_nested_loop_in_chain (gfc_code *chain) +{ + gfc_code *code; + + if (!chain) + return NULL; + + for (code = chain; code; code = code->next) + { + if (code->op == EXEC_DO) + return code; + else if (loop_transform_p (code->op) && code->block) + { + code = code->block; + continue; + } + else if (code->op == EXEC_BLOCK) + { + gfc_code *c = find_nested_loop_in_block (code); + if (c) + return c; + } + } + return NULL; +} + +/* Return the first nested DO loop in BLOCK, or NULL if there + isn't one. Does no error checking on intervening code. */ +static gfc_code * +find_nested_loop_in_block (gfc_code *block) +{ + gfc_namespace *ns; + gcc_assert (block->op == EXEC_BLOCK); + ns = block->ext.block.ns; + gcc_assert (ns); + return find_nested_loop_in_chain (ns->code); +} /* CODE is an OMP loop construct. Return true if VAR matches an iteration variable outer to level DEPTH. 
*/ static bool is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) { int i; - gfc_code *do_code = code->block->next; - while (loop_transform_p (do_code->op)) { - if (do_code->block) - do_code = do_code->block->next; - else - do_code = do_code->next; - } - gcc_assert (!loop_transform_p (do_code->op)); + gfc_code *chain; + if (code->block) + chain = code->block->next; + else + { + gcc_assert (loop_transform_p (code->op)); + chain = code; + while (loop_transform_p (chain->op)) + chain = chain->next; + } for (i = 1; i < depth; i++) { + gfc_code *do_code = find_nested_loop_in_chain (chain); + gcc_assert (do_code != code); + gcc_assert (do_code && do_code->op == EXEC_DO); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (var == ivar) return true; - do_code = do_code->block->next; + + chain = do_code->block->next; } return false; } @@ -9420,21 +9472,22 @@ static bool expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) { int i; - gfc_code *do_code = code->block->next; - while (loop_transform_p (do_code->op)) { - if (do_code->block) - do_code = do_code->block->next; - else - do_code = do_code->next; - } - gcc_assert (!loop_transform_p (do_code->op)); + gfc_code *do_code = code; + + /* Move over loop transformations until the + loop is found. It may also be represented by a + transformation construct (but then with a block) + if it is not associated with any other construct. 
*/ + while (loop_transform_p (do_code->op) && !do_code->block) + do_code = do_code->next; for (i = 1; i < depth; i++) { + do_code = find_nested_loop_in_chain (do_code->block->next); + gcc_assert (do_code); gfc_symbol *ivar = do_code->ext.iterator->var->symtree->n.sym; if (gfc_find_sym_in_expr (ivar, expr)) return false; - do_code = do_code->block->next; } return true; } @@ -9828,6 +9881,8 @@ resolve_omp_do (gfc_code *code) if (i == collapse || c) break; do_code = do_code->block; + do_code = resolve_nested_loop_transforms (do_code, name, collapse - i, + &code->loc); if (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE) { gfc_error ("not enough DO loops for collapsed %s at %L", @@ -9835,6 +9890,8 @@ resolve_omp_do (gfc_code *code) break; } do_code = do_code->next; + do_code = resolve_nested_loop_transforms (do_code, name, collapse - i, + &code->loc); if (do_code == NULL || (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE)) { @@ -9848,7 +9905,7 @@ resolve_omp_do (gfc_code *code) static void resolve_omp_tile (gfc_code *code) { - gfc_code *do_code, *c; + gfc_code *do_code, *next; gfc_symbol *dovar; const char *name = "!$OMP TILE"; @@ -9862,65 +9919,78 @@ resolve_omp_tile (gfc_code *code) for (unsigned i = 1; i <= num_loops; i++) { + + gfc_symbol *start_var = NULL, *end_var = NULL; + if (do_code->op == EXEC_DO_WHILE) { gfc_error ("%s cannot be a DO WHILE or DO without loop control " "at %L", name, &do_code->loc); - break; + return; } if (do_code->op == EXEC_DO_CONCURRENT) { gfc_error ("%s cannot be a DO CONCURRENT loop at %L", name, &do_code->loc); - break; + return; } if (do_code->op != EXEC_DO) { gfc_error ("%s must be DO loop at %L", name, &do_code->loc); - break; + return; } gcc_assert (do_code->op != EXEC_OMP_UNROLL); gcc_assert (do_code->op == EXEC_DO); dovar = do_code->ext.iterator->var->symtree->n.sym; - if (i > 1) + if (is_outer_iteration_variable (code, i, dovar)) { - gfc_code *do_code2 = code; - while (loop_transform_p (do_code2->op)) - { - if 
(do_code2->block) - do_code2 = do_code2->block->next; - else - do_code2 = do_code2->next; - } - gcc_assert (!loop_transform_p (do_code2->op)); - - for (unsigned j = 1; j < i; j++) - { - gfc_symbol *ivar = do_code2->ext.iterator->var->symtree->n.sym; - if (dovar == ivar - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->start) - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->end) - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->step)) - { - gfc_error ("%s loops don't form rectangular " - "iteration space at %L", name, &do_code->loc); - break; - } - do_code2 = do_code2->block->next; - } + gfc_error ("%s iteration variable used in more than one loop at %L (depth %d)", + name, &do_code->loc, i); + return; } - for (c = do_code->next; c; c = c->next) - if (c->op != EXEC_NOP && c->op != EXEC_CONTINUE) + else if (!bound_expr_is_canonical (code, i, + do_code->ext.iterator->start, + &start_var)) + { + gfc_error ("%s loop start expression not in canonical form at %L", + name, &do_code->loc); + return; + } + else if (!bound_expr_is_canonical (code, i, + do_code->ext.iterator->end, + &end_var)) + { + gfc_error ("%s loop end expression not in canonical form at %L", + name, &do_code->loc); + return; + } + else if (start_var && end_var && start_var != end_var) + { + gfc_error ("%s loop bounds reference different " + "iteration variables at %L", name, &do_code->loc); + return; + } + else if (!expr_is_invariant (code, i, do_code->ext.iterator->step)) + { + gfc_error ("%s loop increment not in canonical form at %L", + name, &do_code->loc); + return; + } + if (start_var || end_var) + code->ext.omp_clauses->non_rectangular = 1; + for (next = do_code->next; next; next = next->next) + if (next->op != EXEC_NOP && next->op != EXEC_CONTINUE) { gfc_error ("%s loops not perfectly nested at %L", - name, &c->loc); + name, &next->loc); break; } - if (i == num_loops || c) + if (i == num_loops || next) break; do_code = do_code->block; + do_code = resolve_nested_loop_transforms 
(do_code, name, num_loops - i, &code->loc); if (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE) { gfc_error ("not enough DO loops for %s at %L", @@ -9928,6 +9998,7 @@ resolve_omp_tile (gfc_code *code) break; } do_code = do_code->next; + do_code = resolve_nested_loop_transforms (do_code, name, num_loops - i, &code->loc); if (do_code == NULL || (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE)) { diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 6936cd7f5ee..0cef3a8ba3a 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3893,12 +3893,14 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, if (clauses->unroll_full) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); omp_clauses = gfc_trans_add_clause (c, omp_clauses); } if (clauses->unroll_none) { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); omp_clauses = gfc_trans_add_clause (c, omp_clauses); } @@ -3906,6 +3908,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) = clauses->unroll_partial_factor ? 
build_int_cst ( integer_type_node, clauses->unroll_partial_factor) @@ -3926,6 +3929,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_TILE); OMP_CLAUSE_TILE_SIZES (c) = build_tree_list_vec (tvec); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); omp_clauses = gfc_trans_add_clause (c, omp_clauses); tvec->truncate (0); @@ -5308,6 +5312,29 @@ gfc_expr_list_len (gfc_expr_list *list) return len; } +/* Traverse the loops with nesting depth at most + COLLAPSE from CODE and determine the largest + loop nest depth required by the loop transformations + found on the loops. */ +int compute_transformed_depth (gfc_code *code, int collapse) +{ + int new_collapse = collapse; + for (int i = 0; i < new_collapse; i++) + { + gcc_assert (code->op == EXEC_DO || loop_transform_p (code->op)); + while (loop_transform_p (code->op)) + { + int tile_depth + = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + new_collapse = MAX (new_collapse, i + tile_depth); + code = code->block ? code->block->next : code->next; + } + code = code->block->next; + } + + return new_collapse; +} + static tree gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_omp_clauses *do_clauses, tree par_clauses) @@ -5343,6 +5370,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, do" (or similar directive) are represented as clauses on the "omp do". */ loop_transform_clauses = NULL; int omp_tile_depth = gfc_expr_list_len (omp_tile); + tree clauses_tail = NULL; while (loop_transform_p (code->op)) { tree clauses = gfc_trans_omp_clauses (pblock, code->ext.omp_clauses, @@ -5354,7 +5382,14 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, directive, an error will be emitted in pass-omp_transform_loops. 
*/ omp_tile_depth = gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); - loop_transform_clauses = chainon (loop_transform_clauses, clauses); + if (!loop_transform_clauses) + { + loop_transform_clauses = clauses; + clauses_tail = tree_last (clauses); + } + else + clauses_tail = chainon (clauses_tail, clauses); + code = code->block ? code->block->next : code->next; } gcc_assert (!loop_transform_p (code->op)); @@ -5371,9 +5406,12 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, collapse = clauses->orderedc; if (collapse <= 0) collapse = 1; - collapse = MAX (collapse, omp_tile_depth); + gfc_code *first_loop = loop_transform_p (orig_code->op) ? + orig_code : orig_code->block->next; + int transform_depth = compute_transformed_depth (first_loop, collapse); + collapse = transform_depth; init = make_tree_vec (collapse); cond = make_tree_vec (collapse); incr = make_tree_vec (collapse); @@ -5384,15 +5422,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - if (loop_transform_p (op)) - { - /* This is a loop transformation on a loop which is not associated with - any other directive. Use the directive location instead of the loop - location for the clauses. 
*/ - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); - } - else - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = NULL; + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, top_loc); omp_clauses = chainon (omp_clauses, loop_transform_clauses); for (i = 0; i < collapse; i++) @@ -5665,7 +5696,26 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, } if (i + 1 < collapse) - code = code->block->next; + { + code = code->block->next; + + loop_transform_clauses = NULL; + clauses_tail = omp_clauses; + while (loop_transform_p (code->op)) + { + loop_transform_clauses = gfc_trans_omp_clauses ( + pblock, code->ext.omp_clauses, code->loc); + for (tree c = loop_transform_clauses; c; + c = OMP_CLAUSE_CHAIN (c)) + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, i + 1); + + clauses_tail = chainon (clauses_tail, loop_transform_clauses); + clauses_tail = tree_last (loop_transform_clauses); + + code = code->block ? code->block->next : code->next; + } + } } if (pblock != &block) diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc index 858a271261a..517faea537c 100644 --- a/gcc/omp-transform-loops.cc +++ b/gcc/omp-transform-loops.cc @@ -127,7 +127,7 @@ extern tree gimple_assign_rhs_to_tree (gimple *stmt); /* Substitute all definitions from SEQ bottom-up into EXPR. This is used to - reconstruct a tree for a gimplified expression for determinig whether or not + reconstruct a tree from a gimplified expression for determining whether or not the number of iterations of a loop is constant.
*/ tree @@ -227,6 +227,7 @@ gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0, bool expand = false) { int collapse = gimple_omp_for_collapse (omp_for); gcc_assert (from_depth < collapse); + gcc_assert (from_depth >= 0); if (collapse <= 1) return omp_for; @@ -266,6 +267,7 @@ gomp_for_uncollapse (gomp_for *omp_for, int from_depth = 0, bool expand = false) if (from_depth > 0) { gimple_omp_set_body (omp_for, body); + omp_for->collapse = from_depth; return omp_for; } @@ -453,7 +455,7 @@ after transform: Misc 6.0: Loop transformations #3440") in the non-public OpenMP spec repository. */ static gimple_seq -partial_unroll (gomp_for *omp_for, tree unroll_factor, +partial_unroll (gomp_for *omp_for, size_t level, tree unroll_factor, location_t loc, tree transformation_clauses, walk_ctx *ctx) { gcc_assert (unroll_factor); @@ -463,7 +465,7 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor, /* Partial unrolling reduces the loop nest depth of a canonical loop nest to 1 hence outer directives cannot require a greater collapse. 
*/ - gcc_assert (gimple_omp_for_collapse (omp_for) <= 1); + gcc_assert (gimple_omp_for_collapse (omp_for) <= level + 1); if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, @@ -473,12 +475,12 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor, gomp_for *unrolled_for = as_a (copy_gimple_seq_and_replace_locals (omp_for)); - tree final = gimple_omp_for_final (unrolled_for, 0); - tree incr = gimple_omp_for_incr (unrolled_for, 0); - tree index = gimple_omp_for_index (unrolled_for, 0); + tree final = gimple_omp_for_final (unrolled_for, level); + tree incr = gimple_omp_for_incr (unrolled_for, level); + tree index = gimple_omp_for_index (unrolled_for, level); gimple_seq body = gimple_omp_body (unrolled_for); - tree_code cond = gimple_omp_for_cond (unrolled_for, 0); + tree_code cond = gimple_omp_for_cond (unrolled_for, level); tree step = TREE_OPERAND (incr, 1); gimple_omp_set_body (unrolled_for, build_unroll_body (body, unroll_factor, index, incr, @@ -503,7 +505,7 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor, scaled_step = var; } TREE_OPERAND (incr, 1) = scaled_step; - gimple_omp_for_set_incr (unrolled_for, 0, incr); + gimple_omp_for_set_incr (unrolled_for, level, incr); pop_gimplify_context (result_bind); @@ -864,7 +866,7 @@ canonicalize_conditions (gomp_for *omp_for) */ static gimple_seq -tile (gomp_for *omp_for, location_t loc, tree tile_sizes, +tile (gomp_for *omp_for, location_t loc, size_t start_level, tree tile_sizes, tree transformation_clauses, walk_ctx *ctx) { if (dump_enabled_p ()) @@ -896,22 +898,21 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, collapse_clause = c; } - /* The 'omp tile' construct creates a canonical loop-nest whose nesting depth - equals tiling_depth. The whole loop-nest has depth at least 2 * - omp_tile_depth, but the 'tile loops' at levels - omp_tile_depth+1...2*omp_tile_depth are not in canonical loop-nest form - and hence cannot be associated with a loop construct. 
*/ - if (clause_collapse > tiling_depth) + /* The tiled loop nest is a canonical loop nest with nesting depth + tiling_depth. The tile loops below that level are not in + canonical loop nest form and hence cannot be associated with a + loop construct. */ + if (clause_collapse > tiling_depth + start_level) { error_at (OMP_CLAUSE_LOCATION (collapse_clause), "collapse cannot extend below the floor loops " "generated by the % construct"); OMP_CLAUSE_COLLAPSE_EXPR (collapse_clause) - = build_int_cst (unsigned_type_node, tiling_depth); + = build_int_cst (unsigned_type_node, start_level + tiling_depth); return transform_gomp_for (omp_for, NULL, ctx); } - if (tiling_depth > collapse) + if (start_level + tiling_depth > collapse) return transform_gomp_for (omp_for, NULL, ctx); gcc_assert (collapse >= clause_collapse); @@ -919,13 +920,15 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, push_gimplify_context (); /* Create the index variables for iterating the tiles in the floor - loops first tiling_depth loops transformed loop nest. */ + loops which will be the loops at levels start_level + ... start_level + tiling_depth of the transformed loop nest. The + loops at level 0 ... start_level - 1 are left unchanged. 
*/ gimple_seq floor_loops_pre_body = NULL; size_t tile_level = 0; auto_vec sizes_vec; for (tree el = tile_sizes; el; el = TREE_CHAIN (el), tile_level++) { - size_t nest_level = tile_level; + size_t nest_level = start_level + tile_level; tree index = gimple_omp_for_index (omp_for, nest_level); tree init = gimple_omp_for_initial (omp_for, nest_level); tree incr = gimple_omp_for_incr (omp_for, nest_level); @@ -956,6 +959,7 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, gimple_omp_for_set_incr (floor_loops, nest_level, incr); gimple_omp_for_set_index (floor_loops, nest_level, tile_index); } + gbind *result_bind = gimple_build_bind (NULL, NULL, NULL); pop_gimplify_context (result_bind); gimple_seq_add_seq (gimple_omp_for_pre_body_ptr (floor_loops), @@ -972,6 +976,9 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, to add the incomplete tile checks to each level loop. */ tile_loops = gomp_for_uncollapse (as_a (tile_loops)); + for (size_t i = 0; i < start_level; i++) + tile_loops = gimple_omp_body (tile_loops); + gimple_omp_for_set_kind (as_a (tile_loops), GF_OMP_FOR_KIND_TRANSFORM_LOOP); gimple_omp_for_set_clauses (tile_loops, NULL_TREE); @@ -990,50 +997,51 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, tree break_label = create_artificial_label (UNKNOWN_LOCATION); gimple_seq_add_stmt (surrounding_seq, gimple_build_label (break_label)); - for (size_t level = 0; level < tiling_depth; level++) + for (size_t tile_level = 0; tile_level < tiling_depth; tile_level++) { - tree original_index = gimple_omp_for_index (omp_for, level); - tree original_final = gimple_omp_for_final (omp_for, level); + gimple_seq level_preamble = NULL; + gimple_seq level_body = gimple_omp_body (level_loop); + auto gsi = gsi_start (level_body); - tree tile_index = gimple_omp_for_index (floor_loops, level); - tree tile_size = sizes_vec[level]; + int nest_level = start_level + tile_level; + tree original_index = gimple_omp_for_index (omp_for, nest_level); + tree 
original_final = gimple_omp_for_final (omp_for, nest_level); + + tree tile_index + = gimple_omp_for_index (floor_loops, nest_level); + tree tile_size = sizes_vec[tile_level]; tree type = TREE_TYPE (tile_index); tree plus_type = type; - tree incr = gimple_omp_for_incr (omp_for, level); + tree incr = gimple_omp_for_incr (omp_for, nest_level); tree step = omp_get_for_step_from_incr (gimple_location (omp_for), incr); gimple_seq *pre_body = gimple_omp_for_pre_body_ptr (level_loop); - gimple_seq level_body = gimple_omp_body (level_loop); gcc_assert (gimple_omp_for_collapse (level_loop) == 1); - tree_code original_cond = gimple_omp_for_cond (omp_for, level); + tree_code original_cond = gimple_omp_for_cond (omp_for, nest_level); gimple_omp_for_set_initial (level_loop, 0, tile_index); tree tile_final = create_tmp_var (type); - tree scaled_tile_size = fold_build2 (MULT_EXPR, TREE_TYPE (tile_size), - tile_size, step); + tree scaled_tile_size + = fold_build2 (MULT_EXPR, TREE_TYPE (tile_size), tile_size, step); tree_code plus_code = PLUS_EXPR; if (POINTER_TYPE_P (TREE_TYPE (tile_index))) { plus_code = POINTER_PLUS_EXPR; int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scaled_tile_size)); - plus_type = signed_or_unsigned_type_for (unsignedp, ptrdiff_type_node); + plus_type + = signed_or_unsigned_type_for (unsignedp, ptrdiff_type_node); } scaled_tile_size = fold_convert (plus_type, scaled_tile_size); - gimplify_assign (tile_final, - fold_build2 (plus_code, type, - tile_index, scaled_tile_size), - pre_body); + gimplify_assign ( + tile_final, + fold_build2 (plus_code, type, tile_index, scaled_tile_size), + pre_body); gimple_omp_for_set_final (level_loop, 0, tile_final); - /* Redefine the original loop index variable of OMP_FOR in terms of the - floor loop and the tiling loop index variable for the current - dimension/level at the top of the loop. 
*/ - gimple_seq level_preamble = NULL; - push_gimplify_context (); tree body_label = create_artificial_label (UNKNOWN_LOCATION); @@ -1047,7 +1055,6 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, break_label)); gimple_seq_add_stmt (&level_preamble, gimple_build_label (body_label)); - auto gsi = gsi_start (level_body); gsi_insert_seq_before (&gsi, level_preamble, GSI_SAME_STMT); gbind *level_bind = gimple_build_bind (NULL, NULL, NULL); pop_gimplify_context (level_bind); @@ -1057,10 +1064,10 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, surrounding_seq = &level_body; level_loop = gsi_stmt (gsi); - /* The label for jumping out of the loop at the next nesting - level. For the outermost level, the label is put after the - loop-nest, for the last one it is not necessary. */ - if (level != tiling_depth - 1) + /* The label for jumping out of the loop at the next + nesting level. For the outermost level, the label is put + after the loop-nest, for the last one it is not necessary. */ + if (tile_level != tiling_depth - 1) { break_label = create_artificial_label (UNKNOWN_LOCATION); gsi_insert_after (&gsi, gimple_build_label (break_label), @@ -1093,13 +1100,15 @@ tile (gomp_for *omp_for, location_t loc, tree tile_sizes, next_transform_depth = list_length (OMP_CLAUSE_TILE_SIZES (remaining_clauses)); + size_t next_level + = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (remaining_clauses)); /* The current "omp tile" transformation reduces the nesting depth of the canonical loop-nest to TILING_DEPTH. Hence the following "omp tile" transformation is invalid if it requires a greater nesting depth. 
*/ - gcc_assert (next_transform_depth <= tiling_depth); - if (next_transform_depth > new_collapse) - new_collapse = next_transform_depth; + gcc_assert (next_level + next_transform_depth <= start_level + tiling_depth); + if (next_level + next_transform_depth > new_collapse) + new_collapse = next_level + next_transform_depth; } if (collapse > new_collapse) @@ -1260,14 +1269,17 @@ transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) gimple_seq result = NULL; location_t loc = OMP_CLAUSE_LOCATION (transformation); auto dump_loc = dump_user_location_t::from_location_t (loc); + size_t level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (transformation)); switch (OMP_CLAUSE_CODE (transformation)) { case OMP_CLAUSE_UNROLL_FULL: gcc_assert (TREE_CHAIN (transformation) == NULL); + gcc_assert (level == 0); result = full_unroll (omp_for, loc, ctx); break; case OMP_CLAUSE_UNROLL_NONE: gcc_assert (TREE_CHAIN (transformation) == NULL); + gcc_assert (level == 0); if (assign_unroll_full_clause_p (omp_for, transformation)) { result = full_unroll (omp_for, loc, ctx); @@ -1275,7 +1287,7 @@ transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) else if (tree unroll_factor = assign_unroll_partial_clause_p (omp_for, transformation)) { - result = partial_unroll (omp_for, unroll_factor, loc, + result = partial_unroll (omp_for, level, unroll_factor, loc, transformation, ctx); } else { @@ -1312,12 +1324,14 @@ transform_gomp_for (gomp_for *omp_for, tree transformation, walk_ctx *ctx) "factor turned into % clause\n", factor); } - result = partial_unroll (omp_for, unroll_factor, loc, transformation, - ctx); + + result = partial_unroll (omp_for, level, + unroll_factor, loc, transformation, ctx); } break; case OMP_CLAUSE_TILE: - result = tile (omp_for, loc, OMP_CLAUSE_TILE_SIZES (transformation), + result = tile (omp_for, loc, level, + OMP_CLAUSE_TILE_SIZES (transformation), transformation, ctx); break; default: @@ -1418,6 +1432,9 @@ 
print_optimized_unroll_partial_msg (tree c) static tree optimize_transformation_clauses (tree clauses) { + if (!clauses) + return NULL_TREE; + /* The last unroll_partial clause seen in clauses, if any, or the last merged unroll partial clause. */ tree unroll_partial = NULL; @@ -1429,6 +1446,7 @@ optimize_transformation_clauses (tree clauses) since last_non_unroll was seen. */ bool merged_unroll_partial = false; + size_t level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (clauses)); for (tree c = clauses; c != NULL_TREE; c = OMP_CLAUSE_CHAIN (c)) { enum omp_clause_code code = OMP_CLAUSE_CODE (c); @@ -1516,6 +1534,24 @@ optimize_transformation_clauses (tree clauses) default: gcc_unreachable (); } + + /* The transformations are ordered by the level of the loop-nest to which + they apply in decreasing order. Handle the different levels separately + as long as we do not implement optimizations across the levels. */ + tree next_c = OMP_CLAUSE_CHAIN (c); + if (!next_c) + break; + + size_t next_level = tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (next_c)); + if (next_level != level) + { + gcc_assert (next_level < level); + tree tail = optimize_transformation_clauses (next_c); + OMP_CLAUSE_CHAIN (c) = tail; + break; + } + else level = next_level; + } if (merged_unroll_partial && dump_enabled_p ()) diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 new file mode 100644 index 00000000000..f9ee5184dab --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 @@ -0,0 +1,124 @@ +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp unroll partial(2) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +subroutine test2 + !$omp parallel do collapse(3) + do i=0,100 + !$omp unroll partial(2) ! 
{ dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test2 + +subroutine test3 +!$omp parallel do collapse(3) +do i=0,100 + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test3 + +subroutine test4 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test4 + +subroutine test5 + !$omp parallel do collapse(3) + !$omp tile sizes(3,2) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do i=0,100 + do j=-300,100 + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test5 + +subroutine test6 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,2) + do j=-300,100 + !$omp unroll partial(2) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test6 + +subroutine test7 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test7 + +subroutine test8 +!$omp parallel do collapse(1) +do i=0,100 + !$omp tile sizes(3,3) + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test8 + +subroutine test9 +!$omp parallel do collapse(3) +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do j=-300,100 + !$omp tile sizes(5) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test9 + +subroutine test10 +!$omp parallel do +do i=0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP TILE at \(1\)} } + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + call dummy (k) + end do +end do +end do +end subroutine test10 + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 index eaa7895eaa0..308e3b3e4d0 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 @@ -2,9 +2,9 @@ subroutine test implicit none integer :: i, j, k - !$omp parallel do collapse(2) ordered(2) + !$omp parallel do collapse(2) ordered(2) ! { dg-error {'ordered' invalid in conjunction with 'omp tile'} } !$omp tile sizes (1,2) - do i = 1,100 ! 
{ dg-error {'ordered' invalid in conjunction with 'omp tile'} } + do i = 1,100 do j = 1,100 call dummy(j) do k = 1,100 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 new file mode 100644 index 00000000000..3ec1671f01f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90 @@ -0,0 +1,93 @@ +subroutine test0 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + end do + end do + end do +end subroutine test0 + +subroutine test0m + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner ! { dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} } + end do + end do +end subroutine test0m + +subroutine test1 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner ! 
{ dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} "TODO Fix with upcoming imperfect loop nest handling" { xfail *-*-* } } + end do + end do +end subroutine test1 + + +subroutine test2 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner ! { dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} } + end do + end do +end subroutine test2 + +subroutine test3 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i = 1,m + !$omp tile sizes (8, 1) + do j = 1,n + do k = 1, n + if (k == 1) then + inner = 0 + endif + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner ! { dg-error {\!\$OMP TILE loops not perfectly nested at \(1\)} } + end do + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 new file mode 100644 index 00000000000..6474b9da1e2 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 @@ -0,0 +1,16 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test1 + !$omp parallel do collapse(2) + do i=0,100 + !$omp tile sizes(4) + do j=-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +! Collapse of the gimple_omp_for should be unaffected by the transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) tile sizes\(4\).1\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)} 1 "original" } } +! 
{ dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\) private\(j.0\) private\(j\)\n +for \(i = 0; i < 101; i = i \+ 1\)\n +for \(.omp_tile_index.\d = -300; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 4\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 new file mode 100644 index 00000000000..0d462debd72 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test2 + !$omp parallel do + !$omp tile sizes(3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2 + +! One gimple_omp_for should cover the outer two loops, another the inner two loops +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 new file mode 100644 index 00000000000..3ce87ad8a4b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } +! 
{ dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait private\(l.0\) private\(k\)\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 new file mode 100644 index 00000000000..2c06d2094ba --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90 @@ -0,0 +1,31 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test + +! gimple_omp_for collapse should be extended to cover all loops affected by the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i = 0; i <= 100; i = i \+ 1\)\n +for \(j = -300; j <= 100; j = j \+ 1\)\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + +! 
The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } + +! Third level is tiled first by the inner construct. The resulting floor loop is tiled by the outer construct. +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.1} 2 "omp_transform_loops" } } + +! All other levels are tiled once +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.2} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.3} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.4} 1 "omp_transform_loops" } } + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 new file mode 100644 index 00000000000..355d977fe35 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 @@ -0,0 +1,30 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! The outer gimple_omp_for should not cover the loop with the tile transformation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! After transformations, the outer loop should be a floor loop created +! by the tiling and the outer construct type and non-transformation +! clauses should be unaffected by the tiling +! 
{ dg-final { scan-tree-dump {\#pragma omp for nowait\n +for \(.omp_tile_index.\d = 0; .omp_tile_index.\d < 101; .omp_tile_index.\d = .omp_tile_index.\d \+ 3\)} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp} 2 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp parallel} 1 "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 new file mode 100644 index 00000000000..0c83da660f5 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90 @@ -0,0 +1,26 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp tile sizes(3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(3,3) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! There should be separate gimple_omp_for constructs for the tile constructs because the tiling depth +! of the outer construct does not reach the level of the inner construct +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3\)@0\n +for \(i = 0; i <= 100; i = i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile sizes\(3, 3\)@0\n +for \(k = -300; k <= 100; k = k \+ 1\)\n +for \(l = 0; l <= 100; l = l \+ 1\)} 1 "original" } } + + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } +! 
{ dg-final { scan-tree-dump-times {if \(.omp_tile_index} 3 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 new file mode 100644 index 00000000000..670e14caa12 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 @@ -0,0 +1,123 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1a + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2a + +subroutine test3a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3a + +subroutine test4a + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! 
{ dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4a + +subroutine test1b + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) + do k=-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1b + +subroutine test2b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2b + +subroutine test3b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3b + +subroutine test4b + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=-300,100 + do m=-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 new file mode 100644 index 00000000000..169c2b10e54 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1.f90 @@ -0,0 +1,71 @@ +subroutine test1 + !$omp tile sizes(1) + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + !$omp tile sizes(1,2) ! 
{ dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test2 + +subroutine test3 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,i + do k = 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test3 + +subroutine test4 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test4 + +subroutine test5 + !$omp tile sizes(1,2) + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test5 + +subroutine test6 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } + do i = 1,100 + do j = 1,100 + do k = 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test6 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 new file mode 100644 index 00000000000..d5352e5a117 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2.f90 @@ -0,0 +1,12 @@ +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not appear on non-rectangular for} } ! 
{ dg-error {'tile' loop transformation may not appear on non-rectangular for } } + do i = 1,100 + do j = 1,100 + do k = 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 index 9b91e5c5f98..fd687890ee6 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -16,7 +16,7 @@ end subroutine test1 ! Loop should be unrolled 1 * 2 * 3 * 4 = 24 times -! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\) unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll_partial\(4\).0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\)} "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transform_loops" } } ! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } ! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 index 849d4e77984..928ca44e811 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -13,6 +13,6 @@ subroutine test1 end do end subroutine test1 -! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full.0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\).0} "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loops" } } ! 
{ dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 new file mode 100644 index 00000000000..efcc691185d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 @@ -0,0 +1,57 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1a + +subroutine test1b + !$omp tile sizes(3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1b + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2a + +subroutine test2b + !$omp tile sizes(3,3,3,3) + do i=0,100 + do j=-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OMP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=-300,100 + do l=0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 new file mode 100644 index 00000000000..3da99158cc0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 @@ -0,0 +1,31 @@ +subroutine test + implicit none + + integer :: i, j, k + !$omp target parallel do collapse(2) ! 
{ dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 2 step '5'} } + do i = -300, 100 + !$omp unroll partial + do j = i,i*2 + call dummy (i) + end do + end do + + !$omp target parallel do collapse(3) ! { dg-error {invalid OpenMP non-rectangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 3 step '5'} } + do i = -300, 100 + do j = 1,10 + !$omp unroll partial + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do + + !$omp unroll full + do i = -3, 5 + do j = 1,10 + do k = j,j*2 + 1 + call dummy (i) + end do + end do + end do +end subroutine diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 index cda878f3037..20617e25105 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 @@ -21,7 +21,7 @@ function mult (a, b) result (c) end do end function mult -! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(1\) tile sizes\(8, 8\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(1\)@0 tile sizes\(8, 8\)@0} 1 "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } ! Tiling adds two floor and two tile loops. diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 index 00615011856..c1e7f356a87 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 @@ -22,7 +22,7 @@ function mult (a, b) result (c) !$omp end target end function mult -! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(2\) tile sizes\(8, 8, 4\)} 1 "original" } } +! 
{ dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial\(2\)@0 tile sizes\(8, 8, 4\)@0} 1 "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } ! Check the number of loops diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 new file mode 100644 index 00000000000..bc7a890df17 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90 @@ -0,0 +1,25 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + do i = 1,m + do j = 1,n + inner = 0 + !$omp unroll partial(10) + do k = 1, n + inner = inner + a(k, i) * b(j, k) + end do + c(j, i) = inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_partial" 1 "original" } } +! 
{ dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_partial" "omp_transform_loops" } } diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 02c207d87a0..510f65311b5 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -507,9 +507,21 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) goto print_remap; case OMP_CLAUSE_UNROLL_FULL: pp_string (pp, "unroll_full"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_UNROLL_NONE: pp_string (pp, "unroll_none"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_UNROLL_PARTIAL: pp_string (pp, "unroll_partial"); @@ -520,6 +532,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) false); pp_right_paren (pp); } + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_TILE: pp_string (pp, "tile sizes"); @@ -528,6 +546,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) dump_generic_node (pp, OMP_CLAUSE_TILE_SIZES (clause), spc, flags, false); pp_right_paren (pp); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE__LOOPTEMP_: name = "_looptemp_"; diff --git a/gcc/tree.cc b/gcc/tree.cc index 893f509fa3a..38478a0ad46 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -326,11 +326,11 @@ unsigned const char omp_clause_num_ops[] = 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ - 0, /* OMP_CLAUSE_UNROLL_FULL */ + 1, /* OMP_CLAUSE_UNROLL_FULL */ - 0, /* 
OMP_CLAUSE_UNROLL_NONE */ - 1, /* OMP_CLAUSE_UNROLL_PARTIAL */ - 1 /* OMP_CLAUSE_TILE */ + 1, /* OMP_CLAUSE_UNROLL_NONE */ + 2, /* OMP_CLAUSE_UNROLL_PARTIAL */ + 2 /* OMP_CLAUSE_TILE */ }; const char * const omp_clause_code_name[] = diff --git a/gcc/tree.h b/gcc/tree.h index 8f4d2761d1a..0f8aebab89f 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1787,11 +1787,16 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_USE_DEVICE_PTR)->base.public_flag) +/* The level of a collapsed loop nest at which the transformation represented + by this clause should be applied. */ +#define OMP_CLAUSE_TRANSFORM_LEVEL(NODE) \ + OMP_CLAUSE_OPERAND (NODE, 0) + #define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 0) + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PARTIAL), 1) #define OMP_CLAUSE_TILE_SIZES(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0) + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 1) #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind) diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 new file mode 100644 index 00000000000..1db97feb34d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n = 10 + integer :: m = 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp target parallel do collapse(2) private(inner) map(to:a,b) map(from:c) + !$omp tile sizes (8, 1) + do i = 1,m + !$omp tile sizes (8) + do j = 1,n + !$omp unroll partial(10) + do k = 1, n + if (k == 1) then + inner = 
0 + endif + inner = inner + a(k, i) * b(j, k) + if (k == n) then + c(j, i) = inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n = size (m, 1) + do i = 1,n + do j = 1,n + write (*, fmt="(i4)", advance='no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i = 1,n + do j = 1,m + a(j,i) = merge(1,0, i.eq.j) + b(j,i) = j + end do + end do + + c = mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i = 1,n + do j = 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main
From patchwork Fri Mar 24 15:30:45 2023
X-Patchwork-Submitter: Frederik Harwath
X-Patchwork-Id: 66866
IronPort-SDR: 4yVjyfMBcHN6fmslC5vDloiMG894k3xzcJu5ZgNYBx8pirFILiylgBVSiRl7LrkJCvP2YejfM2 TsOqSHETLUbBAr2XtaftY+56r4G5CvWuoM12itFYN7Xja4ZDSlmxoJ+NVlFjEkO4l9feKPxYCL o8SX/Q1WfZUofhUDxNVexrAD5ZdvuBJrTfjvLkZFs8Gv1vXhS/zOHlEoCvg0bvpDVMvGmbfXSm mrhXtB3yj2SOvxNmWzDQqKsIIZ7k+wZxefR008coEMhkIbRh6bgiS/B4qmohVkjHj0j7jHLD+s Vzs= From: Frederik Harwath To: , , , , Subject: [PATCH 7/7] openmp: Add C/C++ support for loop transformations on inner loops Date: Fri, 24 Mar 2023 16:30:45 +0100 Message-ID: <20230324153046.3996092-8-frederik@codesourcery.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-14.mgc.mentorg.com (139.181.222.14) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Add the parsing of loop transformations on inner loops of a loop-nest. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (c_parser_omp_tile): ... adjust use here, (c_parser_omp_unroll): ... and here, (c_parser_omp_for_loop): ... and here. Stop treating loop transformations like intervening code, parse them, and adjust the loop-nest depth if necessary for tiling. gcc/cp/ChangeLog: * parser.cc (cp_parser_is_pragma): New function. 
(cp_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (cp_parser_omp_tile): ... adjust use here, (cp_parser_omp_unroll): ... and here, (cp_parser_omp_for_loop): ... and here. Stop treating loop gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New test. libgomp/ChangeLog * testsuite/libgomp.c++/loop-transforms/tile-1.C: Deleted, replaced by matrix-* tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h: New header file for new tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h: New test. 
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: New test. --- gcc/c/c-parser.cc | 35 +++- gcc/cp/parser.cc | 88 ++++++-- .../loop-transforms/imperfect-loop-nest.c | 12 ++ .../gomp/loop-transforms/unroll-inner-1.c | 15 ++ .../gomp/loop-transforms/unroll-inner-2.c | 31 +++ .../gomp/loop-transforms/unroll-non-rect-1.c | 37 ++++ .../gomp/loop-transforms/unroll-non-rect-2.c | 22 ++ .../libgomp.c++/loop-transforms/tile-1.C | 52 ----- .../loop-transforms/matrix-1.h | 70 +++++++ .../loop-transforms/matrix-constant-iter.h | 71 +++++++ .../loop-transforms/matrix-helper.h | 19 ++ .../loop-transforms/matrix-no-directive-1.c | 11 + .../matrix-no-directive-unroll-full-1.c | 13 ++ .../matrix-omp-distribute-parallel-for-1.c | 6 + .../loop-transforms/matrix-omp-for-1.c | 13 ++ .../matrix-omp-parallel-for-1.c | 13 ++ .../matrix-omp-parallel-masked-taskloop-1.c | 6 + ...trix-omp-parallel-masked-taskloop-simd-1.c | 6 + .../matrix-omp-target-parallel-for-1.c | 13 ++ ...p-target-teams-distribute-parallel-for-1.c | 6 + .../loop-transforms/matrix-omp-taskloop-1.c | 6 + ...trix-omp-teams-distribute-parallel-for-1.c | 6 + .../loop-transforms/matrix-simd-1.c | 6 + .../matrix-transform-variants-1.h | 191 ++++++++++++++++++ .../loop-transforms/unroll-non-rect-1.c | 129 ++++++++++++ 25 files changed, 801 insertions(+), 76 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c create mode 100644 gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c delete mode 100644 libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h create mode 100644 
libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h create mode 100644 libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c -- 2.36.1 diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index
41f9fb90037..b32f5f7547f 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -20246,7 +20246,7 @@ c_parser_omp_scan_loop_body (c_parser *parser, bool open_brace_parsed) } static int c_parser_omp_nested_loop_transform_clauses (c_parser *, tree &, int, - const char *); + int, const char *); /* Parse the restricted form of loop statements allowed by OpenACC and OpenMP. The real trick here is to determine the loop control variable early @@ -20300,7 +20300,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, ordered = collapse; } - c_parser_omp_nested_loop_transform_clauses (parser, clauses, collapse, + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, collapse, "loop collapse"); /* Find the depth of the loop nest affected by "omp tile" @@ -20489,6 +20489,22 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, else if (bracecount && c_parser_next_token_is (parser, CPP_SEMICOLON)) c_parser_consume_token (parser); + else if (c_parser_peek_token (parser)->pragma_kind + == PRAGMA_OMP_UNROLL + || c_parser_peek_token (parser)->pragma_kind + == PRAGMA_OMP_TILE) + { + int depth = c_parser_omp_nested_loop_transform_clauses ( + parser, clauses, i + 1, count - i - 1, "loop collapse"); + if (i + 1 + depth > count) + { + count = i + 1 + depth; + declv = grow_tree_vec (declv, count); + initv = grow_tree_vec (initv, count); + condv = grow_tree_vec (condv, count); + incrv = grow_tree_vec (incrv, count); + } + } else { c_parser_error (parser, "not enough perfectly nested loops"); @@ -20500,7 +20516,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, fail = true; count = 0; break; - } + } } while (1); @@ -24066,9 +24082,9 @@ c_parser_omp_loop_transform_clause (c_parser *parser) } /* Parse zero or more OpenMP loop transformation directives that - follow another directive that requires a canonical loop nest and - append all to CLAUSES. 
Return the nesting depth - of the transformed loop nest. + follow another directive that requires a canonical loop nest, + append all to CLAUSES and record the LEVEL at which the clauses + appear in the loop nest in each clause. REQUIRED_DEPTH is the nesting depth of the loop nest required by the preceding directive. OUTER_DESCR is a description of the @@ -24078,7 +24094,7 @@ c_parser_omp_loop_transform_clause (c_parser *parser) static int c_parser_omp_nested_loop_transform_clauses (c_parser *parser, tree &clauses, - int required_depth, + int level, int required_depth, const char *outer_descr) { tree c = NULL_TREE; @@ -24139,6 +24155,7 @@ c_parser_omp_nested_loop_transform_clauses (c_parser *parser, tree &clauses, if (!transformed_depth) transformed_depth = last_depth; + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, level); if (!clauses) clauses = c; else if (last_c) @@ -24172,7 +24189,7 @@ c_parser_omp_tile (location_t loc, c_parser *parser, bool *if_p) return error_mark_node; int required_depth = list_length (OMP_CLAUSE_TILE_SIZES (clauses)); - c_parser_omp_nested_loop_transform_clauses (parser, clauses, required_depth, + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, required_depth, "outer transformation"); block = c_begin_compound_stmt (true); @@ -24192,7 +24209,7 @@ c_parser_omp_unroll (location_t loc, c_parser *parser, bool *if_p) tree clauses = c_parser_omp_all_clauses (parser, mask, p_name, false); int required_depth = 1; - c_parser_omp_nested_loop_transform_clauses (parser, clauses, required_depth, + c_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, required_depth, "outer transformation"); if (!clauses) diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 8219c476153..2b65ce909fb 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -2974,6 +2974,14 @@ cp_parser_is_keyword (cp_token* token, enum rid keyword) return token->keyword == keyword; } +/* Returns nonzero if TOKEN is a pragma of the indicated 
KIND. */ + +static bool +cp_parser_is_pragma (cp_token* token, enum pragma_kind kind) +{ + return cp_parser_pragma_kind (token) == kind; +} + /* Helper function for cp_parser_error. Having peeked a token of kind TOK1_KIND that might signify a conflict marker, peek successor tokens to determine @@ -43634,7 +43642,8 @@ cp_parser_omp_scan_loop_body (cp_parser *parser) } static int cp_parser_omp_nested_loop_transform_clauses (cp_parser *, tree &, - int, const char *); + int, int, + const char *); /* Parse the restricted form of the for statement allowed by OpenMP. */ @@ -43686,7 +43695,7 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, gcc_assert (oacc_tiling || (collapse >= 1 && ordered >= 0)); count = ordered ? ordered : collapse; - cp_parser_omp_nested_loop_transform_clauses (parser, clauses, count, + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, count, "loop collapse"); /* Find the depth of the loop nest affected by "omp tile" @@ -43956,19 +43965,42 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, cp_parser_parse_tentatively (parser); for (;;) { - if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR)) + cp_token *tok = cp_lexer_peek_token (parser->lexer); + if (cp_parser_is_keyword (tok, RID_FOR)) break; - else if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)) + else if (tok->type == CPP_OPEN_BRACE) { cp_lexer_consume_token (parser->lexer); bracecount++; } - else if (bracecount - && cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)) + else if (bracecount && tok->type == CPP_SEMICOLON) cp_lexer_consume_token (parser->lexer); + else if (cp_parser_is_pragma (tok, PRAGMA_OMP_UNROLL) + || cp_parser_is_pragma (tok, PRAGMA_OMP_TILE)) + { + int depth = cp_parser_omp_nested_loop_transform_clauses ( + parser, clauses, i + 1, count - i - 1, "loop collapse"); + + /* Adjust the loop nest depth to the requirements of the + loop transformations. 
The collapse will be reduced + to the value requested by the "collapse" and "ordered" + clauses after the execution of the loop transformations + in the middle end. */ + if (i + 1 + depth > count) + { + count = i + 1 + depth; + if (declv) + declv = grow_tree_vec (declv, count); + initv = grow_tree_vec (initv, count); + condv = grow_tree_vec (condv, count); + incrv = grow_tree_vec (incrv, count); + if (orig_declv) + orig_declv = grow_tree_vec (orig_declv, count); + } + } else { - loc = cp_lexer_peek_token (parser->lexer)->location; + loc = tok->location; error_at (loc, "not enough for loops to collapse"); collapse_err = true; cp_parser_abort_tentative_parse (parser); @@ -44027,6 +44059,27 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses, } else if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)) cp_lexer_consume_token (parser->lexer); + else if (cp_parser_is_pragma (cp_lexer_peek_token (parser->lexer), + PRAGMA_OMP_UNROLL) + || cp_parser_is_pragma (cp_lexer_peek_token (parser->lexer), + PRAGMA_OMP_TILE)) + { + int depth = + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, + i + 1, count - i - 1, + "loop collapse"); + if (i + 1 + depth > count) + { + count = i + 1 + depth; + if (declv) + declv = grow_tree_vec (declv, count); + initv = grow_tree_vec (initv, count); + condv = grow_tree_vec (condv, count); + incrv = grow_tree_vec (incrv, count); + if (orig_declv) + orig_declv = grow_tree_vec (orig_declv, count); + } + } else { if (!collapse_err) @@ -45787,6 +45840,7 @@ cp_parser_omp_tile_sizes (cp_parser *parser, location_t loc) gcc_assert (sizes); tree c = build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); OMP_CLAUSE_TILE_SIZES (c) = sizes; OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); @@ -45810,8 +45864,9 @@ cp_parser_omp_tile (cp_parser *parser, cp_token *tok, bool *if_p) return error_mark_node; int required_depth = list_length
(OMP_CLAUSE_TILE_SIZES (clauses)); - cp_parser_omp_nested_loop_transform_clauses ( - parser, clauses, required_depth, "outer transformation"); + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); block = begin_omp_structured_block (); clauses = finish_omp_clauses (clauses, C_ORT_OMP); @@ -45878,8 +45933,9 @@ cp_parser_omp_loop_transform_clause (cp_parser *parser) } /* Parse zero or more OpenMP loop transformation directives that - follow another directive that requires a canonical loop nest and - append all to CLAUSES. Return the nesting depth + follow another directive that requires a canonical loop nest, + append all to CLAUSES, and record the level at which the clause + appears in the loop nest in each clause. Return the nesting depth of the transformed loop nest. REQUIRED_DEPTH is the nesting depth of the loop nest required by @@ -45890,7 +45946,7 @@ cp_parser_omp_loop_transform_clause (cp_parser *parser) static int cp_parser_omp_nested_loop_transform_clauses (cp_parser *parser, tree &clauses, - int required_depth, + int level, int required_depth, const char *outer_descr) { tree c = NULL_TREE; @@ -45934,7 +45990,8 @@ cp_parser_omp_nested_loop_transform_clauses (cp_parser *parser, tree &clauses, default: gcc_unreachable (); } - OMP_CLAUSE_TRANSFORM_LEVEL (c) = build_int_cst (unsigned_type_node, 0); + OMP_CLAUSE_TRANSFORM_LEVEL (c) + = build_int_cst (unsigned_type_node, level); if (depth < last_depth) { @@ -45989,8 +46046,9 @@ cp_parser_omp_unroll (cp_parser *parser, cp_token *tok, bool *if_p) } int required_depth = 1; - cp_parser_omp_nested_loop_transform_clauses ( - parser, clauses, required_depth, "outer transformation"); + cp_parser_omp_nested_loop_transform_clauses (parser, clauses, 0, + required_depth, + "outer transformation"); block = begin_omp_structured_block (); ret = cp_parser_omp_for_loop (parser, OMP_LOOP_TRANS, clauses, NULL, if_p); diff --git
a/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c new file mode 100644 index 00000000000..57e72dffa03 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c @@ -0,0 +1,12 @@ +void test () +{ +#pragma omp tile sizes (2,4,6) + for (unsigned i = 0; i < 10; i++) + for (unsigned j = 0; j < 10; j++) + { + float intervening_decl = 0; /* { dg-bogus "not enough for loops to collapse" "TODO C/C++ imperfect loop nest handling" { xfail c++ } } */ + /* { dg-bogus "not enough perfectly nested loops" "TODO C/C++ imperfect loop nest handling" { xfail c } .-1 } */ +#pragma omp unroll partial(2) + for (unsigned k = 0; k < 10; k++); + } +} diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c new file mode 100644 index 00000000000..c365d942591 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-1.c @@ -0,0 +1,15 @@ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) + #pragma omp unroll partial + for (int j = 0; j != 100; ++j) + dummy (i); +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c new file mode 100644 index 00000000000..3f8fbf2d45a --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-inner-2.c @@ -0,0 +1,31 @@ +/* { dg-additional-options "-std=c++11" { target c++} } */ + +extern void dummy (int); + +void +test () +{ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp tile sizes(2) + for (int j = 0; j != 100; ++j) + dummy (i); + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp tile 
sizes(2, 3) + for (int j = 0; j != 100; ++j) + dummy (i); /* { dg-error {not enough for loops to collapse} "" { target c++ } } */ +/* { dg-error {'i' was not declared in this scope} "" { target c++ } .-1 } */ +/* { dg-error {not enough perfectly nested loops before 'dummy'} "" { target c } .-2 } */ + +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp tile sizes(2, 3) + for (int j = 0; j != 100; ++j) + for (int k = 0; k != 100; ++k) + dummy (i); +} + + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c new file mode 100644 index 00000000000..40e7f8e4bfb --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-1.c @@ -0,0 +1,37 @@ +extern void dummy (int); + +void +test1 () +{ +#pragma omp target parallel for collapse(2) + for (int i = -300; i != 100; ++i) +#pragma omp unroll partial(2) + for (int j = i * 2; j <= i * 4 + 1; ++j) + dummy (i); + +#pragma omp target parallel for collapse(3) + for (int i = -300; i != 100; ++i) + for (int j = i; j != i * 2; ++j) + #pragma omp unroll partial + for (int k = 2; k != 100; ++k) + dummy (i); + +#pragma omp unroll full + for (int i = -300; i != 100; ++i) + for (int j = i; j != i * 2; ++j) + for (int k = 2; k != 100; ++k) + dummy (i); + + for (int i = -300; i != 100; ++i) +#pragma omp unroll full + for (int j = i; j != i + 10; ++j) + for (int k = 2; k != 100; ++k) + dummy (i); + + for (int i = -300; i != 100; ++i) +#pragma omp unroll full + for (int j = i; j != i + 10; ++j) + for (int k = j; k != 100; ++k) + dummy (i); +} + diff --git a/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c new file mode 100644 index 00000000000..7696e5d5fab --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/loop-transforms/unroll-non-rect-2.c @@ -0,0 +1,22 @@ +extern void dummy (int); + +void 
+test1 () +{ +#pragma omp target parallel for collapse(2) /* { dg-error {invalid OpenMP non-rectangular loop step; \'\(1 - 0\) \* 1\' is not a multiple of loop 2 step \'5\'} "" { target c } } */ + for (int i = -300; i != 100; ++i) /* { dg-error {invalid OpenMP non-rectangular loop step; \'\(1 - 0\) \* 1\' is not a multiple of loop 2 step \'5\'} "" { target c++ } } */ +#pragma omp unroll partial + for (int j = 2; j != i; ++j) + dummy (i); +} + +void +test2 () +{ + int i,j; +#pragma omp target parallel for collapse(2) + for (i = -300; i != 100; ++i) + #pragma omp unroll partial + for (j = 2; j != i; ++j) + dummy (i); +} diff --git a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C b/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C deleted file mode 100644 index 2a4d760720d..00000000000 --- a/libgomp/testsuite/libgomp.c++/loop-transforms/tile-1.C +++ /dev/null @@ -1,52 +0,0 @@ -#include -#include -#include - -void -mult (float *matrix1, float *matrix2, float *result, unsigned dim0, - unsigned dim1) -{ - memset (result, 0, sizeof (float) * dim0 * dim1); -#pragma omp target parallel for collapse(3) map(tofrom:result[0:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) -#pragma omp tile sizes(8, 16, 4) - for (unsigned i = 0; i < dim0; i++) - for (unsigned j = 0; j < dim1; j++) - for (unsigned k = 0; k < dim1; k++) - result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; -} - -int -main () -{ - unsigned dim0 = 20; - unsigned dim1 = 20; - - float *result = (float *)malloc (sizeof (float) * dim0 * dim1); - float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); - float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); - - for (unsigned i = 0; i < dim0; i++) - for (unsigned j = 0; j < dim1; j++) - matrix1[i * dim1 + j] = j; - - for (unsigned i = 0; i < dim1; i++) - for (unsigned j = 0; j < dim0; j++) - if (i == j) - matrix2[i * dim0 + j] = 1; - else - matrix2[i * dim0 + j] = 0; - - mult (matrix1, matrix2, result, 
dim0, dim1); - - for (unsigned i = 0; i < dim0; i++) - for (unsigned j = 0; j < dim1; j++) - { - if (matrix1[i * dim1 + j] != result[i * dim1 + j]) - { - printf ("ERROR at %d, %d\n", i, j); - __builtin_abort (); - } - } - - return 0; -} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h new file mode 100644 index 00000000000..b9b865cf554 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h @@ -0,0 +1,70 @@ +#include +#include +#include +#include + +#ifndef FUN_NAME_SUFFIX +#define FUN_NAME_SUFFIX +#endif + +#ifdef MULT +#undef MULT +#endif +#define MULT CAT(mult, FUN_NAME_SUFFIX) + +#ifdef MAIN +#undef MAIN +#endif +#define MAIN CAT(main, FUN_NAME_SUFFIX) + +void MULT (float *matrix1, float *matrix2, float *result, + unsigned dim0, unsigned dim1) +{ + unsigned i; + + memset (result, 0, sizeof (float) * dim0 * dim1); + DIRECTIVE + TRANSFORMATION1 + for (i = 0; i < dim0; i++) + TRANSFORMATION2 + for (unsigned j = 0; j < dim1; j++) + TRANSFORMATION3 + for (unsigned k = 0; k < dim1; k++) + result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; +} + +int MAIN () +{ + unsigned dim0 = 20; + unsigned dim1 = 20; + + float *result = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + matrix1[i * dim1 + j] = j; + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + if (i == j) + matrix2[i * dim1 + j] = 1; + else + matrix2[i * dim1 + j] = 0; + + MULT (matrix1, matrix2, result, dim0, dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) { + if (matrix1[i * dim1 + j] != result[i * dim1 + j]) { + print_matrix (matrix1, dim0, dim1); + print_matrix (matrix2, dim0, dim1); + 
print_matrix (result, dim0, dim1); + fprintf(stderr, "%s: ERROR at %d, %d\n", __FUNCTION__, i, j); + abort(); + } + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h new file mode 100644 index 00000000000..769c04044c3 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h @@ -0,0 +1,71 @@ +#include +#include +#include +#include + +#ifndef FUN_NAME_SUFFIX +#define FUN_NAME_SUFFIX +#endif + +#ifdef MULT +#undef MULT +#endif +#define MULT CAT(mult, FUN_NAME_SUFFIX) + +#ifdef MAIN +#undef MAIN +#endif +#define MAIN CAT(main, FUN_NAME_SUFFIX) + +void MULT (float *matrix1, float *matrix2, float *result) +{ + const unsigned dim0 = 20; + const unsigned dim1 = 20; + + memset (result, 0, sizeof (float) * dim0 * dim1); + DIRECTIVE + TRANSFORMATION1 + for (unsigned i = 0; i < dim0; i++) + TRANSFORMATION2 + for (unsigned j = 0; j < dim1; j++) + TRANSFORMATION3 + for (unsigned k = 0; k < dim1; k++) + result[i * dim1 + j] += matrix1[i * dim1 + k] * matrix2[k * dim0 + j]; +} + +int MAIN () +{ + const unsigned dim0 = 20; + const unsigned dim1 = 20; + + float *result = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix1 = (float *)malloc (sizeof (float) * dim0 * dim1); + float *matrix2 = (float *)malloc (sizeof (float) * dim0 * dim1); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + matrix1[i * dim1 + j] = j; + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) + if (i == j) + matrix2[i * dim1 + j] = 1; + else + matrix2[i * dim1 + j] = 0; + + MULT (matrix1, matrix2, result); + + for (unsigned i = 0; i < dim0; i++) + for (unsigned j = 0; j < dim1; j++) { + if (matrix1[i * dim1 + j] != result[i * dim1 + j]) { + __builtin_printf("%s: error at %d, %d\n", __FUNCTION__, i, j); + print_matrix (matrix1, dim0, dim1); + print_matrix (matrix2, 
dim0, dim1); + print_matrix (result, dim0, dim1); + __builtin_printf("\n"); + __builtin_abort(); + } + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h new file mode 100644 index 00000000000..4f69463d9dd --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h @@ -0,0 +1,19 @@ +#include +#include + +#define CAT(x,y) XCAT(x,y) +#define XCAT(x,y) x ## y +#define DO_PRAGMA(x) XDO_PRAGMA(x) +#define XDO_PRAGMA(x) _Pragma (#x) + + +void print_matrix (float *matrix, unsigned dim0, unsigned dim1) +{ + for (unsigned i = 0; i < dim0; i++) + { + for (unsigned j = 0; j < dim1; j++) + fprintf (stderr, "%f ", matrix[i * dim1 + j]); + fprintf (stderr, "\n"); + } + fprintf (stderr, "\n"); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c new file mode 100644 index 00000000000..9f7f02041b0 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c @@ -0,0 +1,11 @@ +/* { dg-additional-options {-fdump-tree-original} } */ + +#define COMMON_DIRECTIVE +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. 
*/ +/* { dg-final { scan-tree-dump-times "unroll_partial" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c new file mode 100644 index 00000000000..5dd0b5d2989 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options {-fdump-tree-original} } */ + +#define COMMON_DIRECTIVE +#define COMMON_TOP_TRANSFORM omp unroll full +#define COLLAPSE_1 +#define COLLAPSE_2 +#define COLLAPSE_3 +#define IMPLEMENTATION_FILE "matrix-constant-iter.h" + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "unroll_full" 13 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c new file mode 100644 index 00000000000..d855857e5ee --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE "omp teams distribute parallel for" +#define COLLAPSE_1 "collapse(1)" +#define COLLAPSE_2 "collapse(2)" +#define COLLAPSE_3 "collapse(3)" + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c new file mode 100644 index 00000000000..f2a2b80b2fd --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options {-fdump-tree-original} } */ + +#define COMMON_DIRECTIVE omp for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include 
"matrix-transform-variants-1.h" + + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "omp for" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c new file mode 100644 index 00000000000..2c5701efca4 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options {-fdump-tree-original} } */ + +#define COMMON_DIRECTIVE omp parallel for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" + + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "omp parallel" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 9 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c new file mode 100644 index 00000000000..e2def212725 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE omp parallel masked taskloop +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c new file mode 100644 index 00000000000..ce601555cfb --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c @@ -0,0 +1,6 @@ 
+#define COMMON_DIRECTIVE omp parallel masked taskloop simd +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c new file mode 100644 index 00000000000..365b39ba385 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c @@ -0,0 +1,13 @@ +/* { dg-additional-options {-fdump-tree-original} } */ + +#define COMMON_DIRECTIVE omp target parallel for map(tofrom:result[0:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" + +/* A consistency check to prevent broken macro usage. */ +/* { dg-final { scan-tree-dump-times "omp target" 13 "original" } } */ +/* { dg-final { scan-tree-dump-times "collapse" 9 "original" } } */ +/* { dg-final { scan-tree-dump-times "unroll_partial" 12 "original" } } */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c new file mode 100644 index 00000000000..8afe34874c9 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE omp target teams distribute parallel for map(tofrom:result[:dim0*dim1]) map(to:matrix1[0:dim0*dim1], matrix2[0:dim0*dim1]) +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c 
b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c new file mode 100644 index 00000000000..bbc78b39db0 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE omp taskloop +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c new file mode 100644 index 00000000000..3a58e479374 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE omp teams distribute parallel for +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c new file mode 100644 index 00000000000..e5155dcf76d --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c @@ -0,0 +1,6 @@ +#define COMMON_DIRECTIVE omp simd +#define COLLAPSE_1 collapse(1) +#define COLLAPSE_2 collapse(2) +#define COLLAPSE_3 collapse(3) + +#include "matrix-transform-variants-1.h" diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h new file mode 100644 index 00000000000..24c3d073024 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h @@ -0,0 +1,191 @@ +#include "matrix-helper.h" + +#ifndef COMMON_TOP_TRANSFORM +#define COMMON_TOP_TRANSFORM +#endif + +#ifndef IMPLEMENTATION_FILE 
+#define IMPLEMENTATION_FILE "matrix-1.h"
+#endif
+
+#define FUN_NAME_SUFFIX 1
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp unroll partial(2)") _Pragma("omp tile sizes(10)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 2
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8,16,4)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 3
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 4
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 5
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(8, 8, 8)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 6
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(10)") _Pragma("omp unroll partial(2)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 7
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)")
+#define TRANSFORMATION2 _Pragma("omp unroll partial(7)")
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 8
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)")
+#define TRANSFORMATION2 _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(7)")
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 9
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp tile sizes(7, 11)")
+#define TRANSFORMATION2 _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)")
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 10
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_1)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM) _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)")
+#define TRANSFORMATION2
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 11
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_2)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM)
+#define TRANSFORMATION2 _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)")
+#define TRANSFORMATION3
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 12
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM)
+#define TRANSFORMATION2
+#define TRANSFORMATION3 _Pragma("omp unroll partial(5)") _Pragma("omp tile sizes(7)") _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)")
+#include IMPLEMENTATION_FILE
+
+#undef DIRECTIVE
+#undef TRANSFORMATION1
+#undef TRANSFORMATION2
+#undef TRANSFORMATION3
+#undef FUN_NAME_SUFFIX
+
+#define FUN_NAME_SUFFIX 13
+#define DIRECTIVE DO_PRAGMA(COMMON_DIRECTIVE COLLAPSE_3)
+#define TRANSFORMATION1 DO_PRAGMA(COMMON_TOP_TRANSFORM)
+#define TRANSFORMATION2 _Pragma("omp tile sizes(7,8)")
+#define TRANSFORMATION3 _Pragma("omp unroll partial(3)") _Pragma("omp tile sizes(7)")
+#include IMPLEMENTATION_FILE
+
+int main ()
+{
+  main1 ();
+  main2 ();
+  main3 ();
+  main4 ();
+  main5 ();
+  main6 ();
+  main7 ();
+  main8 ();
+  main9 ();
+  main10 ();
+  main11 ();
+  main12 ();
+  main13 ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c
new file mode 100644
index 00000000000..2f9924aea1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c
@@ -0,0 +1,129 @@
+#include <stdio.h>
+#include <stdlib.h>
+
+void test1 ()
+{
+  int sum = 0;
+  for (int i = -3; i != 1; ++i)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test2 ()
+{
+  int sum = 0;
+  #pragma omp unroll partial
+  for (int i = -3; i != 1; ++i)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test3 ()
+{
+  int sum = 0;
+  #pragma omp unroll partial
+  for (int i = -3; i != 1; ++i)
+    #pragma omp unroll partial
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test4 ()
+{
+  int sum = 0;
+#pragma omp for
+#pragma omp unroll partial(5)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test5 ()
+{
+  int sum = 0;
+#pragma omp parallel for reduction(+:sum)
+#pragma omp unroll partial(2)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test6 ()
+{
+  int sum = 0;
+#pragma omp target parallel for reduction(+:sum)
+#pragma omp unroll partial(7)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+void test7 ()
+{
+  int sum = 0;
+#pragma omp target teams distribute parallel for reduction(+:sum)
+#pragma omp unroll partial(7)
+  for (int i = -3; i != 1; ++i)
+#pragma omp unroll partial(2)
+    for (int j = -2; j < i * -1; ++j)
+      sum++;
+
+  if (sum != 14)
+    {
+      fprintf (stderr, "%s: Wrong sum: %d\n", __FUNCTION__, sum);
+      abort ();
+    }
+}
+
+int
+main ()
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  test4 ();
+  test5 ();
+  test6 ();
+  test7 ();
+
+  return 0;
+}