From patchwork Wed Nov 17 16:03:22 2021
X-Patchwork-Submitter: Frederik Harwath
X-Patchwork-Id: 47828
From: Frederik Harwath
To: gcc-patches@gcc.gnu.org
Subject: [OG11][committed][PATCH 14/22] openacc: Add data optimization pass
Date: Wed, 17 Nov 2021 17:03:22 +0100
Message-ID: <20211117160330.20029-14-frederik@codesourcery.com>
In-Reply-To: <20211117160330.20029-1-frederik@codesourcery.com>
References: <20211117160330.20029-1-frederik@codesourcery.com>

From: Andrew Stubbs

Address PR90591 "Avoid unnecessary data transfer out of OMP construct"
for simple (but common) cases.

This commit adds a pass that optimizes data mapping clauses.  Currently,
it can optimize copy/map(tofrom) clauses involving scalars to
copyin/map(to), and further to "private".  The pass is restricted to
"kernels" regions, but could be extended to other types of regions.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add omp-data-optimize.o.
	* doc/gimple.texi: TODO.
	* gimple-walk.c (walk_gimple_seq_mod): Adjust for backward walking.
	* gimple-walk.h (struct walk_stmt_info): Add 'backward' field.
	* passes.def: Add pass_omp_data_optimize.
	* tree-pass.h (make_pass_omp_data_optimize): New declaration.
	* omp-data-optimize.cc: New file.

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
	Expect optimization messages.
	* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
	* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
	Likewise.
	* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
	* c-c++-common/goacc/uninit-copy-clause.c: Likewise.
	* gfortran.dg/goacc/uninit-copy-clause.f95: Likewise.
	* c-c++-common/goacc/omp_data_optimize-1.c: New test.
	* g++.dg/goacc/omp_data_optimize-1.C: New test.
	* gfortran.dg/goacc/omp_data_optimize-1.f90: New test.

Co-Authored-By: Thomas Schwinge
---
 gcc/Makefile.in | 1 +
 gcc/doc/gimple.texi | 2 +
 gcc/gimple-walk.c | 15 +-
 gcc/gimple-walk.h | 6 +
 gcc/omp-data-optimize.cc | 951 ++++++++++++++++++
 gcc/passes.def | 1 +
 .../goacc/note-parallelism-1-kernels-loops.c | 7 +-
 ...note-parallelism-1-kernels-straight-line.c | 9 +-
 .../goacc/note-parallelism-kernels-loops.c | 10 +-
 .../c-c++-common/goacc/omp_data_optimize-1.c | 677 +++++++++++++
 .../c-c++-common/goacc/uninit-copy-clause.c | 6 +
 .../g++.dg/goacc/omp_data_optimize-1.C | 169 ++++
 .../gfortran.dg/goacc/omp_data_optimize-1.f90 | 588 +++++++++++
 .../gfortran.dg/goacc/uninit-copy-clause.f95 | 2 +
 gcc/tree-pass.h | 1 +
 .../kernels-decompose-1.c | 2 +
 .../libgomp.oacc-fortran/pr94358-1.f90 | 4 +
 17 files changed, 2444 insertions(+), 7 deletions(-)
 create mode 100644 gcc/omp-data-optimize.cc
 create mode 100644 gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c
 create mode 100644 gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90
-- 
2.33.0

diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 4ebdcdbc5f8c..8c02b85d2a96 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1507,6 +1507,7 @@ OBJS = \ omp-low.o \ omp-oacc-kernels-decompose.o \ omp-simd-clone.o \ + omp-data-optimize.o \ opt-problem.o \ optabs.o \ optabs-libfuncs.o \ diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi index 4b3d7d7452e3..a83e17f71a40 100644 --- a/gcc/doc/gimple.texi +++ b/gcc/doc/gimple.texi @@ -2778,4 +2778,6 @@ calling @code{walk_gimple_stmt} on each one. @code{WI} is as in @code{walk_gimple_stmt}. If @code{walk_gimple_stmt} returns non-@code{NULL}, the walk is stopped and the value returned. Otherwise, all the statements are walked and @code{NULL_TREE} returned. + +TODO update for forward vs. backward. @end deftypefn diff --git a/gcc/gimple-walk.c b/gcc/gimple-walk.c index cd287860994e..66fd491844d7 100644 --- a/gcc/gimple-walk.c +++ b/gcc/gimple-walk.c @@ -32,6 +32,8 @@ along with GCC; see the file COPYING3. If not see /* Walk all the statements in the sequence *PSEQ calling walk_gimple_stmt on each one. WI is as in walk_gimple_stmt. + TODO update for forward vs. backward. + If walk_gimple_stmt returns non-NULL, the walk is stopped, and the value is stored in WI->CALLBACK_RESULT.
Also, the statement that produced the value is returned if this statement has not been @@ -44,9 +46,10 @@ gimple * walk_gimple_seq_mod (gimple_seq *pseq, walk_stmt_fn callback_stmt, walk_tree_fn callback_op, struct walk_stmt_info *wi) { - gimple_stmt_iterator gsi; + bool forward = !(wi && wi->backward); - for (gsi = gsi_start (*pseq); !gsi_end_p (gsi); ) + gimple_stmt_iterator gsi = forward ? gsi_start (*pseq) : gsi_last (*pseq); + for (; !gsi_end_p (gsi); ) { tree ret = walk_gimple_stmt (&gsi, callback_stmt, callback_op, wi); if (ret) @@ -60,7 +63,13 @@ walk_gimple_seq_mod (gimple_seq *pseq, walk_stmt_fn callback_stmt, } if (!wi->removed_stmt) - gsi_next (&gsi); + { + if (forward) + gsi_next (&gsi); + else //TODO Correct? + gsi_prev (&gsi); + //TODO This could do with some unit testing (see other 'gcc/*-tests.c' files for inspiration), to make sure all the corner cases (removing first/last, for example) work correctly. + } } if (wi) diff --git a/gcc/gimple-walk.h b/gcc/gimple-walk.h index f471f10088df..4ebc71d73ddf 100644 --- a/gcc/gimple-walk.h +++ b/gcc/gimple-walk.h @@ -71,6 +71,12 @@ struct walk_stmt_info /* True if we've removed the statement that was processed. */ BOOL_BITFIELD removed_stmt : 1; + + /*TODO True if we're walking backward instead of forward. */ + //TODO This flag is only applicable for 'walk_gimple_seq'. + //TODO Instead of this somewhat mis-placed (?) flag here, may be able to factor out the walking logic woult of 'walk_gimple_stmt', and do the backward walking in a separate function? + //TODO + BOOL_BITFIELD backward : 1; }; /* Callback for walk_gimple_stmt. Called for every statement found diff --git a/gcc/omp-data-optimize.cc b/gcc/omp-data-optimize.cc new file mode 100644 index 000000000000..31f615c1d2bd --- /dev/null +++ b/gcc/omp-data-optimize.cc @@ -0,0 +1,951 @@ +/* OMP data optimize + + Copyright (C) 2021 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +/* This pass tries to optimize OMP data movement. + + The purpose is two-fold: (1) simply avoid redundant data movement, and (2) + as an enabler for other compiler optimizations. + + Currently, the focus is on OpenACC 'kernels' constructs, but this may be + done more generally later: other compute constructs, but also structured + 'data' constructs, for example. + + Currently, this implements: + - Convert "copy/map(tofrom)" to "copyin/map(to)", where the variable is + known to be dead on exit. + - Further optimize to "private" where the variable is also known to be + dead on entry. + + Future improvements may include: + - Optimize mappings that do not start as "copy/map(tofrom)". + - Optimize mappings to "copyout/map(from)" where the variable is dead on + entry, but not exit. + - Improved data liveness checking. + - Etc. + + As long as we make sure to not violate user-expected OpenACC semantics, we + may do "anything". + + The pass runs too early to use the full data flow analysis tools, so this + uses some simplified rules. 
The analysis could certainly be improved. + + A variable is dead on exit if + 1. Nothing reads it between the end of the target region and the end + of the function. + 2. It is not global, static, external, or otherwise persistent. + 3. It is not addressable (and therefore cannot be aliased). + 4. There are no backward jumps following the target region (and therefore + there can be no loop around the target region). + + A variable is dead on entry if the first occurrence of the variable within + the target region is a write. The algorithm attempts to check all possible + code paths, but may give up where control flow is too complex. No attempt + is made to evaluate conditionals, so it is likely that it will miss cases + where the user might declare private manually. + + Future improvements: + 1. Allow backward jumps (loops) where the target is also after the end of + the target region. + 2. Detect dead-on-exit variables when there is a write following the + target region (tricky, in the presence of conditionals). + 3. Ignore reads in the "else" branch of conditionals where the target + region is in the "then" branch. + 4. Optimize global/static/external variables that are provably dead on + entry or exit. + (Most of this can be achieved by unifying the two DF algorithms in this + file; the one for scanning inside the target regions had to be made more + capable, with propagation of live state across blocks, but that's more + effort than I have time right now to do the rework.) +*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tree-pass.h" +#include "options.h" +#include "tree.h" +#include "function.h" +#include "basic-block.h" +#include "gimple.h" +#include "gimplify.h" +#include "gimple-iterator.h" +#include "gimple-walk.h" +#include "gomp-constants.h" +#include "gimple-pretty-print.h" + +#define DUMP_LOC(STMT) \ + dump_user_location_t::from_location_t (OMP_CLAUSE_LOCATION (STMT)) + +/* These types track why we could *not* optimize a variable mapping. The + main reason for differentiating the different reasons is diagnostics. */ + +enum inhibit_kinds { + INHIBIT_NOT, // "optimize" + INHIBIT_USE, + INHIBIT_JMP, + INHIBIT_BAD +}; + +struct inhibit_descriptor +{ + enum inhibit_kinds kind; + gimple *stmt; +}; + +/* OMP Data Optimize walk state tables. */ +struct ODO_State { + hash_map candidates; + hash_set visited_labels; + bool lhs_scanned; +}; + +/* These types track whether a variable can be full private, or not. + + These are ORDERED in ascending precedence; when combining two values + (at a conditional or switch), the higher value is used. */ + +enum access_kinds { + ACCESS_NONE, /* Variable not accessed. */ + ACCESS_DEF_FIRST, /* Variable is defined before use. */ + ACCESS_UNKNOWN, /* Status is yet to be determined. */ + ACCESS_UNSUPPORTED, /* Variable is array or reference. */ + ACCESS_USE_FIRST /* Variable is used without definition (live on entry). */ +}; + +struct ODO_BB { + access_kinds access; + gimple *foot_stmt; +}; + +struct ODO_Target_state { + tree var; + + const void *bb_id; /* A unique id for the BB (use a convenient pointer). */ + ODO_BB bb; + bool lhs_scanned; + bool can_short_circuit; + + hash_map scanned_bb; +}; + +/* Classify a newly discovered variable, and add it to the candidate list. 
*/ + +static void +omp_data_optimize_add_candidate (const dump_user_location_t &loc, tree var, + ODO_State *state) +{ + inhibit_descriptor in; + in.stmt = NULL; + + if (DECL_EXTERNAL (var)) + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, loc, + " -> unsuitable variable: %<%T%> is external\n", var); + + in.kind = INHIBIT_BAD; + } + else if (TREE_STATIC (var)) + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, loc, + " -> unsuitable variable: %<%T%> is static\n", var); + + in.kind = INHIBIT_BAD; + } + else if (TREE_ADDRESSABLE (var)) + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, loc, + " -> unsuitable variable: %<%T%> is addressable\n", + var); + + in.kind = INHIBIT_BAD; + } + else + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, loc, " -> candidate variable: %<%T%>\n", + var); + + in.kind = INHIBIT_NOT; + } + + if (state->candidates.put (var, in)) + gcc_unreachable (); +} + +/* Add all the variables in a gimple bind statement to the list of + optimization candidates. */ + +static void +omp_data_optimize_stmt_bind (const gbind *bind, ODO_State *state) +{ + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, bind, "considering scope\n"); + + tree vars = gimple_bind_vars (bind); + for (tree var = vars; var; var = TREE_CHAIN (var)) + omp_data_optimize_add_candidate (bind, var, state); +} + +/* Assess a control flow statement to see if it prevents us from optimizing + OMP variable mappings. A conditional jump usually won't, but a loop + means a much more complicated liveness algorithm than this would be needed + to reason effectively. */ + +static void +omp_data_optimize_stmt_jump (gimple *stmt, ODO_State *state) +{ + /* In the general case, in presence of looping/control flow, we cannot make + any promises about (non-)uses of 'var's -- so we have to inhibit + optimization. */ + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, stmt, "loop/control encountered: %G\n", stmt); + + bool forward = false; + switch (gimple_code (stmt)) + { + case GIMPLE_COND: + if (state->visited_labels.contains (gimple_cond_true_label + (as_a (stmt))) + && state->visited_labels.contains (gimple_cond_false_label + (as_a (stmt)))) + forward = true; + break; + case GIMPLE_GOTO: + if (state->visited_labels.contains (gimple_goto_dest + (as_a (stmt)))) + forward = true; + break; + case GIMPLE_SWITCH: + { + gswitch *sw = as_a (stmt); + forward = true; + for (unsigned i = 0; i < gimple_switch_num_labels (sw); i++) + if (!state->visited_labels.contains (CASE_LABEL + (gimple_switch_label (sw, + i)))) + { + forward = false; + break; + } + break; + } + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a (stmt); + forward = true; + for (unsigned i = 0; i < gimple_asm_nlabels (asm_stmt); i++) + if (!state->visited_labels.contains (TREE_VALUE + (gimple_asm_label_op + (asm_stmt, i)))) + { + forward = false; + break; + } + break; + } + default: + gcc_unreachable (); + } + if (forward) + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, stmt, + " -> forward jump; candidates remain valid\n"); + + return; + } + + /* If we get here then control flow has invalidated all current optimization + candidates. 
*/ + for (hash_map::iterator it = state->candidates.begin (); + it != state->candidates.end (); + ++it) + { + if ((*it).second.kind == INHIBIT_BAD) + continue; + + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, stmt, " -> discarding candidate: %T\n", + (*it).first); + + /* We're walking backward: this earlier instance ("earlier" in + 'gimple_seq' forward order) overrides what we may have had before. */ + (*it).second.kind = INHIBIT_JMP; + (*it).second.stmt = stmt; + } +} + +/* A helper callback for omp_data_optimize_can_be_private. + Check if an operand matches the specific one we're looking for, and + assess the context in which it appears. */ + +static tree +omp_data_optimize_scan_target_op (tree *tp, int *walk_subtrees, void *data) +{ + struct walk_stmt_info *wi = (struct walk_stmt_info *) data; + ODO_Target_state *state = (ODO_Target_state *)wi->info; + tree op = *tp; + + if (wi->is_lhs && !state->lhs_scanned + && state->bb.access != ACCESS_USE_FIRST) + { + /* We're at the top level of the LHS operand. Anything we scan inside + (array indices etc.) should be treated as RHS. */ + state->lhs_scanned = 1; + + /* Writes to arrays and references are unhandled, as yet. */ + tree base = get_base_address (op); + if (base && base != op && base == state->var) + { + state->bb.access = ACCESS_UNSUPPORTED; + *walk_subtrees = 0; + } + /* Write to scalar variable. */ + else if (op == state->var) + { + state->bb.access = ACCESS_DEF_FIRST; + *walk_subtrees = 0; + } + } + else if (op == state->var) + { + state->bb.access = ACCESS_USE_FIRST; + *walk_subtrees = 0; + } + return NULL; +} + +/* A helper callback for omp_data_optimize_can_be_private, this assesses a + statement inside a target region to see how it affects the data flow of the + operands. A set of basic blocks is recorded, each with the observed access + details for the given variable. */ + +static tree +omp_data_optimize_scan_target_stmt (gimple_stmt_iterator *gsi_p, + bool *handled_ops_p, + struct walk_stmt_info *wi) +{ + ODO_Target_state *state = (ODO_Target_state *) wi->info; + gimple *stmt = gsi_stmt (*gsi_p); + + /* If an access was found in the previous statement then we're done. */ + if (state->bb.access != ACCESS_NONE && state->can_short_circuit) + { + *handled_ops_p = true; + return (tree)1; /* Return non-NULL, otherwise ignored. */ + } + + /* If the first def/use is already found then don't check more operands. */ + *handled_ops_p = state->bb.access != ACCESS_NONE; + + switch (gimple_code (stmt)) + { + /* These will be the last statement in a basic block, and will always + be followed by a label or the end of scope. */ + case GIMPLE_COND: + case GIMPLE_GOTO: + case GIMPLE_SWITCH: + if (state->bb.access == ACCESS_NONE) + state->bb.access = ACCESS_UNKNOWN; + state->bb.foot_stmt = stmt; + state->can_short_circuit = false; + break; + + /* asm goto statements are not necessarily followed by a label. */ + case GIMPLE_ASM: + if (gimple_asm_nlabels (as_a (stmt)) > 0) + { + if (state->bb.access == ACCESS_NONE) + state->bb.access = ACCESS_UNKNOWN; + state->bb.foot_stmt = stmt; + state->scanned_bb.put (state->bb_id, state->bb); + + /* Start a new fake BB using the asm string as a unique id. */ + state->bb_id = gimple_asm_string (as_a (stmt)); + state->bb.access = ACCESS_NONE; + state->bb.foot_stmt = NULL; + state->can_short_circuit = false; + } + break; + + /* A label is the beginning of a new basic block, and possibly the end + of the previous, in the case of a fall-through. 
*/ + case GIMPLE_LABEL: + if (state->bb.foot_stmt == NULL) + state->bb.foot_stmt = stmt; + if (state->bb.access == ACCESS_NONE) + state->bb.access = ACCESS_UNKNOWN; + state->scanned_bb.put (state->bb_id, state->bb); + + state->bb_id = gimple_label_label (as_a (stmt)); + state->bb.access = ACCESS_NONE; + state->bb.foot_stmt = NULL; + break; + + /* These should not occur inside target regions?? */ + case GIMPLE_RETURN: + gcc_unreachable (); + + default: + break; + } + + /* Now walk the operands. */ + state->lhs_scanned = false; + return NULL; +} + +/* Check every operand under a gimple statement to see if a specific variable + is dead on entry to an OMP TARGET statement. If so, then we can make the + variable mapping PRIVATE. */ + +static bool +omp_data_optimize_can_be_private (tree var, gimple *target_stmt) +{ + ODO_Target_state state; + state.var = var; + void *root_id = var; /* Any non-null pointer will do for the unique ID. */ + state.bb_id = root_id; + state.bb.access = ACCESS_NONE; + state.bb.foot_stmt = NULL; + state.lhs_scanned = false; + state.can_short_circuit = true; + + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.info = &state; + + /* Walk the target region and build the BB list. */ + gimple_seq target_body = *gimple_omp_body_ptr (target_stmt); + walk_gimple_seq (target_body, omp_data_optimize_scan_target_stmt, + omp_data_optimize_scan_target_op, &wi); + + /* Calculate the liveness data for the whole region. */ + if (state.can_short_circuit) + ; /* state.access has the answer already. */ + else + { + /* There's some control flow to navigate. */ + + /* First enter the final BB into the table. */ + state.scanned_bb.put (state.bb_id, state.bb); + + /* Propagate the known access findings to the parent BBs. + + For each BB that does not have a known liveness value, combine + the liveness data from its descendent BBs, if known. Repeat until + there are no more changes to make. */ + bool changed; + do { + changed = false; + for (hash_map::iterator it = state.scanned_bb.begin (); + it != state.scanned_bb.end (); + ++it) + { + ODO_BB *bb = &(*it).second; + tree label; + const void *bb_id1, *bb_id2; + ODO_BB *chain_bb1, *chain_bb2; + unsigned num_labels; + + /* The foot statement is NULL, in the exit block. + Blocks that already have liveness data are done. */ + if (bb->foot_stmt == NULL + || bb->access != ACCESS_UNKNOWN) + continue; + + /* If we get here then bb->access == ACCESS_UNKNOWN. */ + switch (gimple_code (bb->foot_stmt)) + { + /* If the final statement of a block is the label statement + then we have a fall-through. The liveness data can be simply + copied from the next block. */ + case GIMPLE_LABEL: + bb_id1 = gimple_label_label (as_a (bb->foot_stmt)); + chain_bb1 = state.scanned_bb.get (bb_id1); + if (chain_bb1->access != ACCESS_UNKNOWN) + { + bb->access = chain_bb1->access; + changed = true; + } + break; + + /* Combine the liveness data from both branches of a conditional + statement. The access values are ordered such that the + higher value takes precedence. */ + case GIMPLE_COND: + bb_id1 = gimple_cond_true_label (as_a + (bb->foot_stmt)); + bb_id2 = gimple_cond_false_label (as_a + (bb->foot_stmt)); + chain_bb1 = state.scanned_bb.get (bb_id1); + chain_bb2 = state.scanned_bb.get (bb_id2); + bb->access = (chain_bb1->access > chain_bb2->access + ? chain_bb1->access + : chain_bb2->access); + if (bb->access != ACCESS_UNKNOWN) + changed = true; + break; + + /* Copy the liveness data from the destination block. 
*/ + case GIMPLE_GOTO: + bb_id1 = gimple_goto_dest (as_a (bb->foot_stmt)); + chain_bb1 = state.scanned_bb.get (bb_id1); + if (chain_bb1->access != ACCESS_UNKNOWN) + { + bb->access = chain_bb1->access; + changed = true; + } + break; + + /* Combine the liveness data from all the branches of a switch + statement. The access values are ordered such that the + highest value takes precedence. */ + case GIMPLE_SWITCH: + num_labels = gimple_switch_num_labels (as_a + (bb->foot_stmt)); + bb->access = ACCESS_NONE; /* Lowest precedence value. */ + for (unsigned i = 0; i < num_labels; i++) + { + label = gimple_switch_label (as_a + (bb->foot_stmt), i); + chain_bb1 = state.scanned_bb.get (CASE_LABEL (label)); + bb->access = (bb->access > chain_bb1->access + ? bb->access + : chain_bb1->access); + } + if (bb->access != ACCESS_UNKNOWN) + changed = true; + break; + + /* Combine the liveness data from all the branches of an asm goto + statement. The access values are ordered such that the + highest value takes precedence. */ + case GIMPLE_ASM: + num_labels = gimple_asm_nlabels (as_a (bb->foot_stmt)); + bb->access = ACCESS_NONE; /* Lowest precedence value. */ + /* Loop through all the labels and the fall-through block. */ + for (unsigned i = 0; i < num_labels + 1; i++) + { + if (i < num_labels) + bb_id1 = TREE_VALUE (gimple_asm_label_op + (as_a (bb->foot_stmt), i)); + else + /* The fall-through fake-BB uses the string for an ID. */ + bb_id1 = gimple_asm_string (as_a + (bb->foot_stmt)); + chain_bb1 = state.scanned_bb.get (bb_id1); + bb->access = (bb->access > chain_bb1->access + ? bb->access + : chain_bb1->access); + } + if (bb->access != ACCESS_UNKNOWN) + changed = true; + break; + + /* No other statement kinds should appear as foot statements. */ + default: + gcc_unreachable (); + } + } + } while (changed); + + /* The access status should now be readable from the initial BB, + if one could be determined. */ + state.bb = *state.scanned_bb.get (root_id); + } + + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + { + for (hash_map::iterator it = state.scanned_bb.begin (); + it != state.scanned_bb.end (); + ++it) + { + ODO_BB *bb = &(*it).second; + dump_printf_loc (MSG_NOTE, bb->foot_stmt, + "%<%T%> is %s on entry to block ending here\n", var, + (bb->access == ACCESS_NONE + || bb->access == ACCESS_DEF_FIRST ? "dead" + : bb->access == ACCESS_USE_FIRST ? "live" + : bb->access == ACCESS_UNSUPPORTED + ? "unknown (unsupported op)" + : "unknown (complex control flow)")); + } + /* If the answer was found early then then the last BB to be scanned + will not have been entered into the table. */ + if (state.can_short_circuit) + dump_printf_loc (MSG_NOTE, target_stmt, + "%<%T%> is %s on entry to target region\n", var, + (state.bb.access == ACCESS_NONE + || state.bb.access == ACCESS_DEF_FIRST ? "dead" + : state.bb.access == ACCESS_USE_FIRST ? "live" + : state.bb.access == ACCESS_UNSUPPORTED + ? "unknown (unsupported op)" + : "unknown (complex control flow)")); + } + + if (state.bb.access != ACCESS_DEF_FIRST + && dump_enabled_p () && dump_flags & TDF_DETAILS) + dump_printf_loc (MSG_NOTE, target_stmt, "%<%T%> is not suitable" + " for private optimization; %s\n", var, + (state.bb.access == ACCESS_USE_FIRST + ? "live on entry" + : state.bb.access == ACCESS_UNKNOWN + ? 
"complex control flow" + : "unknown reason")); + + return state.bb.access == ACCESS_DEF_FIRST; +} + +/* Inspect a tree operand, from a gimple walk, and check to see if it is a + variable use that might mean the variable is not a suitable candidate for + optimization in a prior target region. + + This algorithm is very basic and can be easily fooled by writes with + subsequent reads, but it should at least err on the safe side. */ + +static void +omp_data_optimize_inspect_op (tree op, ODO_State *state, bool is_lhs, + gimple *stmt) +{ + if (is_lhs && !state->lhs_scanned) + { + /* We're at the top level of the LHS operand. + Anything we scan inside should be treated as RHS. */ + state->lhs_scanned = 1; + + /* Writes to variables are not yet taken into account, beyond not + invalidating the optimization, but not everything on the + left-hand-side is a write (array indices, etc.), and if one element of + an array is written to then we should assume the rest is live. */ + tree base = get_base_address (op); + if (base && base == op) + return; /* Writes to scalars are not a "use". */ + } + + if (!DECL_P (op)) + return; + + /* If we get here then we have found a use of a variable. */ + tree var = op; + + inhibit_descriptor *id = state->candidates.get (var); + if (id && id->kind != INHIBIT_BAD) + { + if (dump_enabled_p () && dump_flags & TDF_DETAILS) + { + if (gimple_code (stmt) == GIMPLE_OMP_TARGET) + dump_printf_loc (MSG_NOTE, id->stmt, + "encountered variable use in target stmt\n"); + else + dump_printf_loc (MSG_NOTE, id->stmt, + "encountered variable use: %G\n", stmt); + dump_printf_loc (MSG_NOTE, id->stmt, + " -> discarding candidate: %T\n", op); + } + + /* We're walking backward: this earlier instance ("earlier" in + 'gimple_seq' forward order) overrides what we may have had before. */ + id->kind = INHIBIT_USE; + id->stmt = stmt; + } +} + +/* Optimize the data mappings of a target region, where our backward gimple + walk has identified that the variable is definitely dead on exit. */ + +static void +omp_data_optimize_stmt_target (gimple *stmt, ODO_State *state) +{ + for (tree *pc = gimple_omp_target_clauses_ptr (stmt); *pc; + pc = &OMP_CLAUSE_CHAIN (*pc)) + { + if (OMP_CLAUSE_CODE (*pc) != OMP_CLAUSE_MAP) + continue; + + tree var = OMP_CLAUSE_DECL (*pc); + if (OMP_CLAUSE_MAP_KIND (*pc) == GOMP_MAP_FORCE_TOFROM + || OMP_CLAUSE_MAP_KIND (*pc) == GOMP_MAP_TOFROM) + { + /* The dump_printf_loc format code %T does not print + the head clause of a clause chain but the whole chain. + Print the last considered clause manually. */ + char *c_s_prev = NULL; + if (dump_enabled_p ()) + c_s_prev = print_omp_clause_to_str (*pc); + + inhibit_descriptor *id = state->candidates.get (var); + if (!id) { + /* The variable was not a parameter or named in any bind, so it + must be in an external scope, and therefore live-on-exit. */ + if (dump_enabled_p ()) + dump_printf_loc(MSG_MISSED_OPTIMIZATION, DUMP_LOC (*pc), + "%qs not optimized: %<%T%> is unsuitable" + " for privatization\n", + c_s_prev, var); + continue; + } + + switch (id->kind) + { + case INHIBIT_NOT: /* Don't inhibit optimization. */ + + /* Change map type from "tofrom" to "to". */ + OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_TO); + + if (dump_enabled_p ()) + { + char *c_s_opt = print_omp_clause_to_str (*pc); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, DUMP_LOC (*pc), + "%qs optimized to %qs\n", c_s_prev, c_s_opt); + free (c_s_prev); + c_s_prev = c_s_opt; + } + + /* Variables that are dead-on-entry and dead-on-loop can be + further optimized to private. 
*/ + if (omp_data_optimize_can_be_private (var, stmt)) + { + tree c_f = (build_omp_clause + (OMP_CLAUSE_LOCATION (*pc), + OMP_CLAUSE_PRIVATE)); + OMP_CLAUSE_DECL (c_f) = var; + OMP_CLAUSE_CHAIN (c_f) = OMP_CLAUSE_CHAIN (*pc); + //TODO Copy "implicit" flag from 'var'. + *pc = c_f; + + if (dump_enabled_p ()) + { + char *c_s_opt = print_omp_clause_to_str (*pc); + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, DUMP_LOC (*pc), + "%qs further optimized to %qs\n", + c_s_prev, c_s_opt); + free (c_s_prev); + c_s_prev = c_s_opt; + } + } + break; + + case INHIBIT_USE: /* Optimization inhibited by a variable use. */ + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, DUMP_LOC (*pc), + "%qs not optimized: %<%T%> used...\n", + c_s_prev, var); + dump_printf_loc (MSG_MISSED_OPTIMIZATION, id->stmt, + "... here\n"); + } + break; + + case INHIBIT_JMP: /* Optimization inhibited by control flow. */ + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, DUMP_LOC (*pc), + "%qs not optimized: %<%T%> disguised by" + " looping/control flow...\n", c_s_prev, var); + dump_printf_loc (MSG_MISSED_OPTIMIZATION, id->stmt, + "... here\n"); + } + break; + + case INHIBIT_BAD: /* Optimization inhibited by properties. */ + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, DUMP_LOC (*pc), + "%qs not optimized: %<%T%> is unsuitable" + " for privatization\n", c_s_prev, var); + } + break; + + default: + gcc_unreachable (); + } + + if (dump_enabled_p ()) + free (c_s_prev); + } + } + + /* Variables used by target regions cannot be optimized from earlier + target regions. */ + for (tree c = *gimple_omp_target_clauses_ptr (stmt); + c; c = OMP_CLAUSE_CHAIN (c)) + { + /* This needs to include all the mapping clauses listed in + OMP_TARGET_CLAUSE_MASK in c-parser.c. */ + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP + && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_PRIVATE + && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_FIRSTPRIVATE) + continue; + + tree var = OMP_CLAUSE_DECL (c); + omp_data_optimize_inspect_op (var, state, false, stmt); + } +} + +/* Call back for gimple walk. Scan the statement for target regions and + variable uses or control flow that might prevent us optimizing offload + data copies. */ + +static tree +omp_data_optimize_callback_stmt (gimple_stmt_iterator *gsi_p, + bool *handled_ops_p, + struct walk_stmt_info *wi) +{ + ODO_State *state = (ODO_State *) wi->info; + + *handled_ops_p = false; + state->lhs_scanned = false; + + gimple *stmt = gsi_stmt (*gsi_p); + + switch (gimple_code (stmt)) + { + /* A bind introduces a new variable scope that might include optimizable + variables. */ + case GIMPLE_BIND: + omp_data_optimize_stmt_bind (as_a (stmt), state); + break; + + /* Tracking labels allows us to understand control flow better. */ + case GIMPLE_LABEL: + state->visited_labels.add (gimple_label_label (as_a (stmt))); + break; + + /* Statements that might constitute some looping/control flow pattern + may inhibit optimization of target mappings. */ + case GIMPLE_COND: + case GIMPLE_GOTO: + case GIMPLE_SWITCH: + case GIMPLE_ASM: + omp_data_optimize_stmt_jump (stmt, state); + break; + + /* A target statement that will have variables for us to optimize. */ + case GIMPLE_OMP_TARGET: + /* For now, only look at OpenACC 'kernels' constructs. */ + if (gimple_omp_target_kind (stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS) + { + omp_data_optimize_stmt_target (stmt, state); + + /* Don't walk inside the target region; use of private variables + inside the private region does not stop them being private! 
+ NOTE: we *do* want to walk target statement types that are not + (yet) handled by omp_data_optimize_stmt_target as the uses there + must not be missed. */ + // TODO add tests for mixed kernels/parallels + *handled_ops_p = true; + } + break; + + default: + break; + } + + return NULL; +} + +/* Call back for gimple walk. Scan the operand for variable uses. */ + +static tree +omp_data_optimize_callback_op (tree *tp, int *walk_subtrees, void *data) +{ + struct walk_stmt_info *wi = (struct walk_stmt_info *) data; + + omp_data_optimize_inspect_op (*tp, (ODO_State *)wi->info, wi->is_lhs, + wi->stmt); + + *walk_subtrees = 1; + return NULL; +} + +/* Main pass entry point. See comments at head of file. */ + +static unsigned int +omp_data_optimize (void) +{ + /* Capture the function arguments so that they can be optimized. */ + ODO_State state; + for (tree decl = DECL_ARGUMENTS (current_function_decl); + decl; + decl = DECL_CHAIN (decl)) + { + const dump_user_location_t loc = dump_user_location_t::from_function_decl (decl); + omp_data_optimize_add_candidate (loc, decl, &state); + } + + /* Scan and optimize the function body, from bottom to top. */ + struct walk_stmt_info wi; + memset (&wi, 0, sizeof (wi)); + wi.backward = true; + wi.info = &state; + gimple_seq body = gimple_body (current_function_decl); + walk_gimple_seq (body, omp_data_optimize_callback_stmt, + omp_data_optimize_callback_op, &wi); + + return 0; +} + + +namespace { + +const pass_data pass_data_omp_data_optimize = +{ + GIMPLE_PASS, /* type */ + "omp_data_optimize", /* name */ + OPTGROUP_OMP, /* optinfo_flags */ + TV_NONE, /* tv_id */ + PROP_gimple_any, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_omp_data_optimize : public gimple_opt_pass +{ +public: + pass_omp_data_optimize (gcc::context *ctxt) + : gimple_opt_pass (pass_data_omp_data_optimize, ctxt) + {} + + /* opt_pass methods: */ + virtual bool gate (function *) + { + return (flag_openacc + && param_openacc_kernels == OPENACC_KERNELS_DECOMPOSE); + } + virtual unsigned int execute (function *) + { + return omp_data_optimize (); + } + +}; // class pass_omp_data_optimize + +} // anon namespace + +gimple_opt_pass * +make_pass_omp_data_optimize (gcc::context *ctxt) +{ + return new pass_omp_data_optimize (ctxt); +} diff --git a/gcc/passes.def b/gcc/passes.def index 9220fdc8ca75..48c9821011f0 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. 
If not see NEXT_PASS (pass_warn_unused_result); NEXT_PASS (pass_diagnose_omp_blocks); NEXT_PASS (pass_diagnose_tm_blocks); + NEXT_PASS (pass_omp_data_optimize); NEXT_PASS (pass_omp_oacc_kernels_decompose); NEXT_PASS (pass_lower_omp); NEXT_PASS (pass_lower_cf); diff --git a/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-loops.c b/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-loops.c index 3bcb7f430f4d..6cf51904e7ad 100644 --- a/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-loops.c +++ b/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-loops.c @@ -12,7 +12,12 @@ main () { int x, y, z; -#pragma acc kernels /* { dg-message "optimized: assigned OpenACC seq loop parallelism" } */ +#pragma acc kernels +/* { dg-optimized {'map\(force_tofrom:z \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:z \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-1 }*/ +/* { dg-optimized {'map\(force_tofrom:y \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:y \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-2 }*/ +/* { dg-optimized {'map\(force_tofrom:x \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:x \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-3 }*/ +/* { dg-optimized {'map\(to:x \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(x\)'} "" { target *-*-* } .-4 } */ + /* Strangely indented to keep this similar to other test cases. */ { for (x = 0; x < 10; x++) /* { dg-message "note: beginning .Graphite. region in OpenACC .kernels. construct" } */ diff --git a/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c b/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c index 829811e1ec7f..d4cb2364737c 100644 --- a/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c +++ b/gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c @@ -28,7 +28,14 @@ main () { int x, y, z; -#pragma acc kernels /* { dg-warning "region contains gang partitioned code but is not gang partitioned" } */ +#pragma acc kernels /* { dg-line l_pragma_kernels } */ + /* The variables aren't loop variables (on explicit or implicit 'loop' directives), so don't get (implicit) 'private' clauses, but they're (implicit) 'copy' -- which we then see get optimized: */ + /* { dg-optimized {'map\(force_tofrom:z \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:z \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_pragma_kernels } */ + /* { dg-optimized {'map\(to:z \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(z\)'} "" { target *-*-* } l_pragma_kernels } */ + /* { dg-optimized {'map\(force_tofrom:y \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:y \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_pragma_kernels } */ + /* { dg-optimized {'map\(to:y \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(y\)\'} "" { target *-*-* } l_pragma_kernels } */ + /* { dg-optimized {'map\(force_tofrom:x \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:x \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_pragma_kernels } */ + /* { dg-optimized {'map\(to:x \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(x\)'} "" { target *-*-* } l_pragma_kernels } */ { x = 0; /* { dg-message "note: beginning .gang-single. part in OpenACC .kernels. 
region" } */ y = x < 10; diff --git a/gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops.c b/gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops.c index 8d82c21c1aa9..92accdf27fa2 100644 --- a/gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops.c +++ b/gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops.c @@ -41,8 +41,14 @@ main () for (z = 0; z < 10; z++) ; -#pragma acc kernels /* { dg-message "optimized: assigned OpenACC seq loop parallelism" } */ - for (x = 0; x < 10; x++) /* { dg-message "note: beginning .Graphite. region in OpenACC .kernels. construct" } */ +#pragma acc kernels + /* { dg-optimized {'map\(force_tofrom:x \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:x \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-1 } */ + /* { dg-optimized {'map\(force_tofrom:y \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:y \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-2 } */ + /* { dg-optimized {'map\(force_tofrom:z \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:z \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } .-3 } */ + /* { dg-optimized {'map\(to:x \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(x\)'} "" { target *-*-* } .-4 } */ + /* { dg-optimized {'map\(to:y \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(y\)'} "" { target *-*-* } .-5 } */ + /* { dg-optimized {'map\(to:z \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(z\)'} "TODO-kernels z not privatized?" { xfail *-*-* } .-6 } */ + for (x = 0; x < 10; x++) /* { dg-message "note: beginning .Graphite. part in OpenACC .kernels. region" } */ for (y = 0; y < 10; y++) for (z = 0; z < 10; z++) ; diff --git a/gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c b/gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c new file mode 100644 index 000000000000..c90031a40b71 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c @@ -0,0 +1,677 @@ +/* Test 'gcc/omp-data-optimize.c'. */ + +/* { dg-additional-options "-fdump-tree-gimple-raw" } */ +/* { dg-additional-options "-fopt-info-omp-all" } */ + +/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName' + passed to 'incr' may be unset, and in that case, it will be set to [...]", + so to maintain compatibility with earlier Tcl releases, we manually + initialize counter variables: + { dg-line l_compute[variable c_compute 0] } + { dg-message "dummy" "" { target iN-VAl-Id } l_compute } to avoid + "WARNING: dg-line var l_compute defined, but not used". + { dg-line l_use[variable c_use 0] } + { dg-message "dummy" "" { target iN-VAl-Id } l_use } to avoid + "WARNING: dg-line var l_use defined, but not used". + { dg-line l_lcf[variable c_lcf 0] } + { dg-message "dummy" "" { target iN-VAl-Id } l_lcf } to avoid + "WARNING: dg-line var l_lcf defined, but not used". */ + +extern int ef1(int); + + +/* Optimization happens. 
*/ + +long opt_1_gvar1; +extern short opt_1_evar1; +static long opt_1_svar1; + +static int opt_1(int opt_1_pvar1) +{ + int opt_1_lvar1; + extern short opt_1_evar2; + static long opt_1_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + int dummy1 = opt_1_pvar1; + int dummy2 = opt_1_lvar1; + int dummy3 = opt_1_evar2; + int dummy4 = opt_1_svar2; + + int dummy5 = opt_1_gvar1; + int dummy6 = opt_1_evar1; + int dummy7 = opt_1_svar1; + } + + return 0; + +/* { dg-optimized {'map\(force_tofrom:opt_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_1_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:opt_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_1_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_1_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_1_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_1_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_1_svar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +long opt_2_gvar1; +extern short opt_2_evar1; +static long opt_2_svar1; + +static int opt_2(int opt_2_pvar1) +{ + int opt_2_lvar1; + extern short opt_2_evar2; + static long opt_2_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + int dummy1 = opt_2_pvar1; + int dummy2 = opt_2_lvar1; + int dummy3 = opt_2_evar2; + int dummy4 = opt_2_svar2; + + int dummy5 = opt_2_gvar1; + int dummy6 = opt_2_evar1; + int dummy7 = opt_2_svar1; + } + + /* A write does not inhibit optimization. 
*/ + + opt_2_pvar1 = 0; + opt_2_lvar1 = 1; + opt_2_evar2 = 2; + opt_2_svar2 = 3; + + opt_2_gvar1 = 10; + opt_2_evar1 = 11; + opt_2_svar1 = 12; + + return 0; + +/* { dg-optimized {'map\(force_tofrom:opt_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_2_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:opt_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_2_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_2_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_2_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_2_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-missed {'map\(force_tofrom:opt_2_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-missed {'map\(force_tofrom:opt_2_svar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +long opt_3_gvar1; +extern short opt_3_evar1; +static long opt_3_svar1; + +static int opt_3(int opt_3_pvar1) +{ + int opt_3_lvar1; + extern short opt_3_evar2; + static long opt_3_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* A write inside the kernel inhibits optimization to firstprivate. + TODO: optimize to private where the variable is dead-on-entry. 
*/ + + opt_3_pvar1 = 1; + opt_3_lvar1 = 2; + opt_3_evar2 = 3; + opt_3_svar2 = 4; + + opt_3_gvar1 = 5; + opt_3_evar1 = 6; + opt_3_svar1 = 7; + } + + return 0; + +/* { dg-optimized {'map\(force_tofrom:opt_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_3_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:opt_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(opt_3_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(opt_3_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_3_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_3_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:opt_3_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-missed {'map\(force_tofrom:opt_3_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-missed {'map\(force_tofrom:opt_3_svar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void opt_4() +{ + int opt_4_larray1[10]; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + int dummy1 = opt_4_larray1[4]; + int dummy2 = opt_4_larray1[8]; + } + +/* { dg-optimized {'map\(tofrom:opt_4_larray1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_4_larray1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:opt_4_larray1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'firstprivate\(opt_4_larray1\)'} "" { target *-*-* } l_compute$c_compute } */ +} + +static void opt_5 (int opt_5_pvar1) +{ + int opt_5_larray1[10]; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + opt_5_larray1[opt_5_pvar1] = 1; + opt_5_pvar1[opt_5_larray1] = 2; + } + +/* { dg-optimized {'map\(force_tofrom:opt_5_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_5_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ + +/* TODO: this probably should be optimizable. */ +/* { dg-missed {'map\(tofrom:opt_5_larray1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_5_larray1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + + +/* Similar, but with optimization inhibited because of variable use. */ + +static int use_1(int use_1_pvar1) +{ + float use_1_lvar1; + extern char use_1_evar2; + static double use_1_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + use_1_pvar1 = 0; + use_1_lvar1 = 1; + use_1_evar2 = 2; + use_1_svar2 = 3; + } + + int s = 0; + s += use_1_pvar1; /* { dg-missed {\.\.\. here} "" { target *-*-* } } */ + s += use_1_lvar1; /* { dg-missed {\.\.\. here} "" { target *-*-* } } */ + s += use_1_evar2; /* { dg-bogus {note: \.\.\. 
here} "" { target *-*-* } } */ + s += use_1_svar2; /* { dg-bogus {note: \.\.\. here} "" { target *-*-* } } */ + + return s; + +/* { dg-missed {'map\(force_tofrom:use_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_pvar1' used\.\.\.} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:use_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_lvar1' used\.\.\.} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:use_1_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:use_1_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +extern int use_2_a1[]; + +static int use_2(int use_2_pvar1) +{ + int use_2_lvar1; + extern int use_2_evar2; + static int use_2_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + use_2_pvar1 = 0; + use_2_lvar1 = 1; + use_2_evar2 = 2; + use_2_svar2 = 3; + } + + int s = 0; + s += use_2_a1[use_2_pvar1]; /* { dg-missed {\.\.\. here} "" { target *-*-* } } */ + s += use_2_a1[use_2_lvar1]; /* { dg-missed {\.\.\. here} "" { target *-*-* } } */ + s += use_2_a1[use_2_evar2]; + s += use_2_a1[use_2_svar2]; + + return s; + +/*TODO The following GIMPLE dump scanning maybe too fragile (across + different GCC configurations)? The idea is to verify that we're indeed + doing the "deep scanning", as discussed in + . */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-missed {'map\(force_tofrom:use_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_pvar1' used\.\.\.} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-missed {'map\(force_tofrom:use_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_lvar1' used\.\.\.} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-final { scan-tree-dump-times {(?n) gimple_assign $} 1 "gimple" } } */ +/* { dg-missed {'map\(force_tofrom:use_2_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:use_2_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void use_3 () +{ + int use_5_lvar1; + int use_5_larray1[10]; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + use_5_lvar1 = 5; + } + + use_5_larray1[use_5_lvar1] = 1; /* { dg-line l_use[incr c_use] } */ + +/* { dg-missed {'map\(force_tofrom:use_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_5_lvar1' used\.\.\.} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} + + +/* Similar, but with the optimization inhibited because of looping/control flow. 
*/ + +static void lcf_1(int lcf_1_pvar1) +{ + float lcf_1_lvar1; + extern char lcf_1_evar2; + static double lcf_1_svar2; + + for (int i = 0; i < ef1(i); ++i) /* { dg-line l_lcf[incr c_lcf] } */ + { +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_1_pvar1 = 0; + lcf_1_lvar1 = 1; + lcf_1_evar2 = 2; + lcf_1_svar2 = 3; + } + } + +/* { dg-missed {'map\(force_tofrom:lcf_1_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_1_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_pvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_lvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_lcf$c_lcf } */ +} + +static void lcf_2(int lcf_2_pvar1) +{ + float lcf_2_lvar1; + extern char lcf_2_evar2; + static double lcf_2_svar2; + + if (ef1 (0)) + return; + + repeat: +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_2_pvar1 = 0; + lcf_2_lvar1 = 1; + lcf_2_evar2 = 2; + lcf_2_svar2 = 3; + } + + goto repeat; /* { dg-line l_lcf[incr c_lcf] } */ + +/* { dg-missed {'map\(force_tofrom:lcf_2_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_2_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_pvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } +/* { dg-missed {'map\(force_tofrom:lcf_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_lvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_lcf$c_lcf } */ +} + +static void lcf_3(int lcf_3_pvar1) +{ + float lcf_3_lvar1; + extern char lcf_3_evar2; + static double lcf_3_svar2; + + if (ef1 (0)) + return; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_3_pvar1 = 0; + lcf_3_lvar1 = 1; + lcf_3_evar2 = 2; + lcf_3_svar2 = 3; + } + + // Backward jump after kernel + repeat: + goto repeat; /* { dg-line l_lcf[incr c_lcf] } */ + +/* { dg-missed {'map\(force_tofrom:lcf_3_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_3_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_pvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } +/* { dg-missed {'map\(force_tofrom:lcf_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_lvar1' disguised by looping/control flow\.\.\.} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. 
here} "" { target *-*-* } l_lcf$c_lcf } */ +} + +static void lcf_4(int lcf_4_pvar1) +{ + float lcf_4_lvar1; + extern char lcf_4_evar2; + static double lcf_4_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_4_pvar1 = 0; + lcf_4_lvar1 = 1; + lcf_4_evar2 = 2; + lcf_4_svar2 = 3; + } + + // Forward jump after kernel + goto out; + + out: + return; + +/* { dg-missed {'map\(force_tofrom:lcf_4_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_4_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_4_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_4_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_4_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_4_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_4_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_4_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_4_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +} + +static void lcf_5(int lcf_5_pvar1) +{ + float lcf_5_lvar1; + extern char lcf_5_evar2; + static double lcf_5_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_5_pvar1 = 0; + lcf_5_lvar1 = 1; + lcf_5_evar2 = 2; + lcf_5_svar2 = 3; + } + + if (ef1 (-1)) + ; + + return; + +/* { dg-optimized {'map\(force_tofrom:lcf_5_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_5_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_5_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_5_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_5_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_5_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_5_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_5_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_5_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void lcf_6(int lcf_6_pvar1) +{ + float lcf_6_lvar1; + extern char lcf_6_evar2; + static double lcf_6_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_6_pvar1 = 0; + lcf_6_lvar1 = 1; + lcf_6_evar2 = 2; + lcf_6_svar2 = 3; + } + + int x = ef1 (-2) ? 
1 : -1; + + return; + +/* { dg-optimized {'map\(force_tofrom:lcf_6_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_6_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_6_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_6_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_6_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_6_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_6_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_6_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_6_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void lcf_7(int lcf_7_pvar1) +{ + float lcf_7_lvar1; + extern char lcf_7_evar2; + static double lcf_7_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_7_pvar1 = 0; + lcf_7_lvar1 = 1; + lcf_7_evar2 = 2; + lcf_7_svar2 = 3; + } + + switch (ef1 (-2)) + { + case 0: ef1 (10); break; + case 2: ef1 (11); break; + default: ef1 (12); break; + } + + return; + +/* { dg-optimized {'map\(force_tofrom:lcf_7_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_7_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_7_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_7_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_7_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_7_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_7_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_7_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_7_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_7_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_7_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_7_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void lcf_8(int lcf_8_pvar1) +{ + float lcf_8_lvar1; + extern char lcf_8_evar2; + static double lcf_8_svar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + lcf_8_pvar1 = 0; + lcf_8_lvar1 = 1; + lcf_8_evar2 = 2; + lcf_8_svar2 = 3; + } + + asm goto ("" :::: out); + +out: + return; + +/* { dg-optimized {'map\(force_tofrom:lcf_8_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_8_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_8_pvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_8_pvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:lcf_8_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_8_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:lcf_8_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_8_lvar1\)'} "" 
{ target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_8_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_8_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:lcf_8_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_8_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +/* Ensure that variables are promoted to private properly. */ + +static void priv_1 () +{ + int priv_1_lvar1, priv_1_lvar2, priv_1_lvar3, priv_1_lvar4, priv_1_lvar5; + int priv_1_lvar6, priv_1_lvar7, priv_1_lvar8, priv_1_lvar9, priv_1_lvar10; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + priv_1_lvar1 = 1; + int dummy = priv_1_lvar2; + + if (priv_1_lvar2) + { + priv_1_lvar3 = 1; + } + else + { + priv_1_lvar3 = 2; + } + + priv_1_lvar5 = priv_1_lvar3; + + if (priv_1_lvar2) + { + priv_1_lvar4 = 1; + int dummy = priv_1_lvar4; + } + + switch (priv_1_lvar2) + { + case 0: + priv_1_lvar5 = 1; + dummy = priv_1_lvar6; + break; + case 1: + priv_1_lvar5 = 2; + priv_1_lvar6 = 3; + break; + default: + break; + } + + asm goto ("" :: "r"(priv_1_lvar7) :: label1, label2); + if (0) + { +label1: + priv_1_lvar8 = 1; + priv_1_lvar9 = 2; + } + if (0) + { +label2: + dummy = priv_1_lvar9; + dummy = priv_1_lvar10; + } + } + +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar2\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar3\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar4\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar5\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar6 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar6 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:priv_1_lvar6 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar6\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar7 \[len: 
[0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar7 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:priv_1_lvar7 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar7\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar8 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar8 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:priv_1_lvar8 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar8\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar9 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar9 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:priv_1_lvar9 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar9\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:priv_1_lvar10 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar10 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-bogus {'map\(to:priv_1_lvar10 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar10\)'} "" { target *-*-* } l_compute$c_compute } */ +} + +static void multiple_kernels_1 () +{ +#pragma acc kernels + { + int multiple_kernels_1_lvar1 = 1; + } + + int multiple_kernels_2_lvar1; +#pragma acc kernels + { + int multiple_kernels_2_lvar1 = 1; + } + +#pragma acc parallel + { + multiple_kernels_2_lvar1++; + } +} + +static int ref_1 () +{ + int *ref_1_ref1; + int ref_1_lvar1; + + ref_1_ref1 = &ref_1_lvar1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + ref_1_lvar1 = 1; + } + + return *ref_1_ref1; + +/* { dg-missed {'map\(force_tofrom:ref_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_1_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static int ref_2 () +{ + int *ref_2_ref1; + int ref_2_lvar1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + ref_2_lvar1 = 1; + } + + ref_2_ref1 = &ref_2_lvar1; + return *ref_2_ref1; + +/* { dg-missed {'map\(force_tofrom:ref_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_2_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void ref_3 () +{ + int ref_3_lvar1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + // FIXME: could be optimized + { + int *ref_3_ref1 = &ref_3_lvar1; + ref_3_lvar1 = 1; + } + +/* { dg-missed {'map\(force_tofrom:ref_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_3_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void ref_4 () +{ + int ref_4_lvar1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + // FIXME: could be optimized + { + int *ref_4_ref1 = &ref_4_lvar1; + *ref_4_ref1 = 1; + } + +/* { dg-missed {'map\(force_tofrom:ref_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_4_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static void conditional_1 (int conditional_1_pvar1) +{ + int conditional_1_lvar1 = 1; + + if (conditional_1_pvar1) + { + // TODO: should be optimizable, but isn't due to later usage in the + // linear scan. 
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + int dummy = conditional_1_lvar1; + } + } + else + { + int dummy = conditional_1_lvar1; /* { dg-line l_use[incr c_use] } */ + } + +/* { dg-missed {'map\(force_tofrom:conditional_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'conditional_1_lvar1' used...} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} + +static void conditional_2 (int conditional_2_pvar1) +{ + int conditional_2_lvar1 = 1; + + if (conditional_2_pvar1) + { + int dummy = conditional_2_lvar1; + } + else + { +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + int dummy = conditional_2_lvar1; + } + } + +/* { dg-optimized {'map\(force_tofrom:conditional_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:conditional_2_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +} diff --git a/gcc/testsuite/c-c++-common/goacc/uninit-copy-clause.c b/gcc/testsuite/c-c++-common/goacc/uninit-copy-clause.c index b3cc4459328f..628b84940a1c 100644 --- a/gcc/testsuite/c-c++-common/goacc/uninit-copy-clause.c +++ b/gcc/testsuite/c-c++-common/goacc/uninit-copy-clause.c @@ -7,6 +7,12 @@ foo (void) int i; #pragma acc kernels + /* { dg-warning "'i' is used uninitialized in this function" "" { target *-*-* } .-1 } */ + /*TODO With the 'copy' -> 'firstprivate' optimization, the original implicit 'copy(i)' clause gets optimized into a 'firstprivate(i)' clause -- and the expected (?) warning diagnostic appears. + Have to read up on the history behind these test cases. + Should this test remain here in this file even if now testing 'firstprivate'? + Or, should the optimization be disabled for such testing? + Or, should the testing be duplicated for both variants? */ { i = 1; } diff --git a/gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C b/gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C new file mode 100644 index 000000000000..5483e5682410 --- /dev/null +++ b/gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C @@ -0,0 +1,169 @@ +/* Test 'gcc/omp-data-optimize.cc'. */ + +/* { dg-additional-options "-std=c++11" } */ +/* { dg-additional-options "-fdump-tree-gimple-raw" } */ +/* { dg-additional-options "-fopt-info-omp-all" } */ + +/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName' + passed to 'incr' may be unset, and in that case, it will be set to [...]", + so to maintain compatibility with earlier Tcl releases, we manually + initialize counter variables: + { dg-line l_compute[variable c_compute 0] } + { dg-message "dummy" "" { target iN-VAl-Id } l_compute } to avoid + "WARNING: dg-line var l_compute defined, but not used". + { dg-line l_use[variable c_use 0] } + { dg-message "dummy" "" { target iN-VAl-Id } l_use } to avoid + "WARNING: dg-line var l_use defined, but not used". 
*/ + +static int closure_1 (int closure_1_pvar1) +{ + int closure_1_lvar1 = 1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_1_lvar1 = closure_1_pvar1; + } + + auto lambda = [closure_1_lvar1]() {return closure_1_lvar1;}; /* { dg-line l_use[incr c_use] } */ + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_1_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:closure_1_lvar1 \[len: [0-9]\]\[implicit\]\)' not optimized: 'closure_1_lvar1' used...} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} + +static int closure_2 (int closure_2_pvar1) +{ + int closure_2_lvar1 = 1; + + auto lambda = [closure_2_lvar1]() {return closure_2_lvar1;}; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_2_lvar1 = closure_2_pvar1; + } + + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_2_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:closure_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_2_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:closure_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(closure_2_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +} + +static int closure_3 (int closure_3_pvar1) +{ + int closure_3_lvar1 = 1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_3_lvar1 = closure_3_pvar1; + } + + auto lambda = [&]() {return closure_3_lvar1;}; + + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_3_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {map\(force_tofrom:closure_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'closure_3_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static int closure_4 (int closure_4_pvar1) +{ + int closure_4_lvar1 = 1; + + auto lambda = [&]() {return closure_4_lvar1;}; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_4_lvar1 = closure_4_pvar1; + } + + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_4_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_4_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {map\(force_tofrom:closure_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'closure_4_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } */ +} + +static int closure_5 (int closure_5_pvar1) +{ + int closure_5_lvar1 = 1; + + auto lambda = [=]() {return closure_5_lvar1;}; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + 
/* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_5_lvar1 = closure_5_pvar1; + } + + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_5_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_5_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-optimized {'map\(force_tofrom:closure_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_5_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + { dg-optimized {'map\(to:closure_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(closure_5_lvar1\)'} "" { target *-*-* } l_compute$c_compute } */ +} + +static int closure_6 (int closure_6_pvar1) +{ + int closure_6_lvar1 = 1; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + closure_6_lvar1 = closure_6_pvar1; + } + + auto lambda = [=]() {return closure_6_lvar1;}; /* { dg-line l_use[incr c_use] } */ + + return lambda(); + +/* { dg-optimized {'map\(force_tofrom:closure_6_pvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:closure_6_pvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } */ +/* { dg-missed {'map\(force_tofrom:closure_6_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'closure_6_lvar1' used...} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} + +static int try_1 () +{ + int try_1_lvar1, try_1_lvar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + try_1_lvar1 = 1; + } + + try { + try_1_lvar2 = try_1_lvar1; /* { dg-line l_use[incr c_use] } */ + } catch (...) {} + + return try_1_lvar2; + +/* { dg-missed {'map\(force_tofrom:try_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'try_1_lvar1' used...} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} + +static int try_2 () +{ + int try_2_lvar1, try_2_lvar2; + +#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ + { + /* { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } */ + try_2_lvar1 = 1; + } + + try { + try_2_lvar2 = 1; + } catch (...) { + try_2_lvar2 = try_2_lvar1; /* { dg-line l_use[incr c_use] } */ + } + + return try_2_lvar2; + +/* { dg-missed {'map\(force_tofrom:try_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'try_2_lvar1' used...} "" { target *-*-* } l_compute$c_compute } + { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } */ +} diff --git a/gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90 b/gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90 new file mode 100644 index 000000000000..ce3e556faf26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90 @@ -0,0 +1,588 @@ +! { dg-additional-options "-fdump-tree-gimple-raw" } +! { dg-additional-options "-fopt-info-omp-all" } + +! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName' +! passed to 'incr' may be unset, and in that case, it will be set to [...]", +! so to maintain compatibility with earlier Tcl releases, we manually +! initialize counter variables: +! { dg-line l_compute[variable c_compute 0] } +! 
{ dg-message "dummy" "" { target iN-VAl-Id } l_compute } to avoid +! "WARNING: dg-line var l_compute defined, but not used". +! { dg-line l_use[variable c_use 0] } +! { dg-message "dummy" "" { target iN-VAl-Id } l_use } to avoid +! "WARNING: dg-line var l_use defined, but not used". + +module globals + use ISO_C_BINDING + implicit none + integer :: opt_1_gvar1 = 1 + integer(C_INT), bind(C) :: opt_1_evar1 + integer :: opt_2_gvar1 = 1 + integer(C_INT), bind(C) :: opt_2_evar1 + integer :: opt_3_gvar1 = 1 + integer(C_INT), bind(C) :: opt_3_evar1 + integer :: use_1_gvar1 = 1 + integer(C_INT), bind(C) :: use_1_evar1 + integer :: use_2_gvar1 = 1 + integer(C_INT), bind(C) :: use_2_evar1 + integer :: use_2_a1(100) + integer(C_INT), bind(C) :: lcf_1_evar2 + integer(C_INT), bind(C) :: lcf_2_evar2 + integer(C_INT), bind(C) :: lcf_3_evar2 + integer(C_INT), bind(C) :: lcf_4_evar2 + integer(C_INT), bind(C) :: lcf_5_evar2 + integer(C_INT), bind(C) :: lcf_6_evar2 + save +end module globals + +subroutine opt_1 (opt_1_pvar1) + use globals + implicit none + integer :: opt_1_pvar1 + integer :: opt_1_lvar1 + integer, save :: opt_1_svar1 = 3 + integer :: dummy1, dummy2, dummy3, dummy4, dummy5 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + dummy1 = opt_1_pvar1; + dummy2 = opt_1_lvar1; + + dummy3 = opt_1_gvar1; + dummy4 = opt_1_evar1; + dummy5 = opt_1_svar1; + !$acc end kernels + +! Parameter is pass-by-reference +! { dg-missed {'map\(force_tofrom:\*opt_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*opt_1_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-optimized {'map\(force_tofrom:opt_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! +! { dg-missed {'map\(force_tofrom:opt_1_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_1_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_1_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_1_svar1 \[len: 4\]\[implicit\]\)' not optimized: 'opt_1_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! +! { dg-optimized {'map\(force_tofrom:dummy1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy2 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy2\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy3 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy3 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy3 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy3\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy4 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy4 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! 
{ dg-optimized {'map\(to:dummy4 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy4\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy5 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy5 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy5 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy5\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine opt_1 + +subroutine opt_2 (opt_2_pvar1) + use globals + implicit none + integer :: opt_2_pvar1 + integer :: opt_2_lvar1 + integer, save :: opt_2_svar1 = 3 + integer :: dummy1, dummy2, dummy3, dummy4, dummy5 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + dummy1 = opt_2_pvar1; + dummy2 = opt_2_lvar1; + + dummy3 = opt_2_gvar1; + dummy4 = opt_2_evar1; + dummy5 = opt_2_svar1; + !$acc end kernels + + ! A write does not inhibit optimization. + opt_2_pvar1 = 0; + opt_2_lvar1 = 1; + + opt_2_gvar1 = 10; + opt_2_evar1 = 11; + opt_2_svar1 = 12; + +! { dg-missed {'map\(force_tofrom:\*opt_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*opt_2_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-optimized {'map\(force_tofrom:opt_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_2_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_2_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_2_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_2_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_2_svar1 \[len: 4\]\[implicit\]\)' not optimized: 'opt_2_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-optimized {'map\(force_tofrom:dummy1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy2 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy2\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy3 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy3 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy3 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy3\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy4 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy4 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy4 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy4\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy5 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy5 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! 
{ dg-optimized {'map\(to:dummy5 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy5\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine opt_2 + +subroutine opt_3 (opt_3_pvar1) + use globals + implicit none + integer :: opt_3_pvar1 + integer :: opt_3_lvar1 + integer, save :: opt_3_svar1 = 3 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + opt_3_pvar1 = 0; + opt_3_lvar1 = 1; + + opt_3_gvar1 = 10; + opt_3_evar1 = 11; + opt_3_svar1 = 12; + !$acc end kernels + +! Parameter is pass-by-reference +! { dg-missed {'map\(force_tofrom:\*opt_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*opt_3_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-optimized {'map\(force_tofrom:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:opt_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(opt_3_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +! +! { dg-missed {'map\(force_tofrom:opt_3_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_3_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_3_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } + +! { dg-missed {'map\(force_tofrom:opt_3_svar1 \[len: 4\]\[implicit\]\)' not optimized: 'opt_3_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine opt_3 + +subroutine opt_4 () + implicit none + integer, dimension(10) :: opt_4_larray1 + integer :: dummy1, dummy2 + + ! TODO Fortran local arrays are addressable (and may be visible to nested + ! functions, etc.), so they are not optimizable yet. + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + dummy1 = opt_4_larray1(4) + dummy2 = opt_4_larray1(8) + !$acc end kernels + +! { dg-missed {'map\(tofrom:opt_4_larray1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_4_larray1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! +! { dg-optimized {'map\(force_tofrom:dummy1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy2 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy2 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy2\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine opt_4 + +subroutine opt_5 (opt_5_pvar1) + implicit none + integer, dimension(10) :: opt_5_larray1 + integer :: opt_5_lvar1, opt_5_pvar1 + + opt_5_lvar1 = opt_5_pvar1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + opt_5_larray1(opt_5_lvar1) = 1 + !$acc end kernels + +! { dg-missed {'map\(tofrom:opt_5_larray1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'opt_5_larray1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! +! 
{ dg-optimized {'map\(force_tofrom:opt_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:opt_5_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine opt_5 + +subroutine use_1 (use_1_pvar1) + use globals + implicit none + integer :: use_1_pvar1 + integer :: use_1_lvar1 + integer, save :: use_1_svar1 = 3 + integer :: s + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + use_1_pvar1 = 0; + use_1_lvar1 = 1; + + ! FIXME: svar is optimized: should not be + use_1_gvar1 = 10; + use_1_evar1 = 11; + use_1_svar1 = 12; + !$acc end kernels + + s = 0 + s = s + use_1_pvar1 + s = s + use_1_lvar1 ! { dg-missed {\.\.\. here} "" { target *-*-* } } + s = s + use_1_gvar1 + s = s + use_1_evar1 + s = s + use_1_svar1 + +! { dg-missed {'map\(force_tofrom:\*use_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*use_1_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_lvar1' used...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_1_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_1_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_1_svar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_1_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine use_1 + +subroutine use_2 (use_2_pvar1) + use globals + implicit none + integer :: use_2_pvar1 + integer :: use_2_lvar1 + integer, save :: use_2_svar1 = 3 + integer :: s + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + use_2_pvar1 = 0; + use_2_lvar1 = 1; + use_2_gvar1 = 10; + use_2_evar1 = 11; + use_2_svar1 = 12; + !$acc end kernels + + s = 0 + s = s + use_2_a1(use_2_pvar1) + s = s + use_2_a1(use_2_lvar1) ! { dg-missed {\.\.\. here} "" { target *-*-* } } + s = s + use_2_a1(use_2_gvar1) + s = s + use_2_a1(use_2_evar1) + s = s + use_2_a1(use_2_svar1) + +! { dg-missed {'map\(force_tofrom:\*use_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*use_2_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_lvar1' used...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_2_gvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_gvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_2_evar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_evar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:use_2_svar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'use_2_svar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine use_2 + +! Optimization inhibited because of looping/control flow. + +subroutine lcf_1 (lcf_1_pvar1, iter) + use globals + implicit none + real :: lcf_1_pvar1 + real :: lcf_1_lvar1 + real, save :: lcf_1_svar2 + integer :: i, iter + + do i = 1, iter ! { dg-line l_use[incr c_use] } + !$acc kernels ! 
{ dg-line l_compute[incr c_compute] } + lcf_1_pvar1 = 0 + lcf_1_lvar1 = 1 + lcf_1_evar2 = 2 + lcf_1_svar2 = 3 + !$acc end kernels + end do + +! { dg-missed {'map\(force_tofrom:\*lcf_1_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_1_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_lvar1' disguised by looping/control flow...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_1_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_1_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_1_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } +end subroutine lcf_1 + +subroutine lcf_2 (lcf_2_pvar1) + use globals + implicit none + real :: lcf_2_pvar1 + real :: lcf_2_lvar1 + real, save :: lcf_2_svar2 + integer :: dummy + +10 dummy = 1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + lcf_2_pvar1 = 0 + lcf_2_lvar1 = 1 + lcf_2_evar2 = 2 + lcf_2_svar2 = 3 + !$acc end kernels + + go to 10 ! { dg-line l_use[incr c_use] } + +! { dg-missed {'map\(force_tofrom:\*lcf_2_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_2_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_lvar1' disguised by looping/control flow...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_2_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_2_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_2_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } +end subroutine lcf_2 + +subroutine lcf_3 (lcf_3_pvar1) + use globals + implicit none + real :: lcf_3_pvar1 + real :: lcf_3_lvar1 + real, save :: lcf_3_svar2 + integer :: dummy + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + lcf_3_pvar1 = 0 + lcf_3_lvar1 = 1 + lcf_3_evar2 = 2 + lcf_3_svar2 = 3 + !$acc end kernels + + ! Backward jump after kernel +10 dummy = 1 + go to 10 ! { dg-line l_use[incr c_use] } + +! { dg-missed {'map\(force_tofrom:\*lcf_3_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_3_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_lvar1' disguised by looping/control flow...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_3_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_3_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_3_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } +end subroutine lcf_3 + +subroutine lcf_4 (lcf_4_pvar1) + use globals + implicit none + real :: lcf_4_pvar1 + real :: lcf_4_lvar1 + real, save :: lcf_4_svar2 + integer :: dummy + + !$acc kernels ! 
{ dg-line l_compute[incr c_compute] } + lcf_4_pvar1 = 0 + lcf_4_lvar1 = 1 + lcf_4_evar2 = 2 + lcf_4_svar2 = 3 + !$acc end kernels + + ! Forward jump after kernel + go to 10 +10 dummy = 1 + +! { dg-missed {'map\(force_tofrom:\*lcf_4_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_4_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:lcf_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_4_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_4_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_4_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_4_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_4_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine lcf_4 + +subroutine lcf_5 (lcf_5_pvar1, lcf_5_pvar2) + use globals + implicit none + real :: lcf_5_pvar1 + real :: lcf_5_pvar2 + real :: lcf_5_lvar1 + real, save :: lcf_5_svar2 + integer :: dummy + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + lcf_5_pvar1 = 0 + lcf_5_lvar1 = 1 + lcf_5_evar2 = 2 + lcf_5_svar2 = 3 + !$acc end kernels + + if (lcf_5_pvar2 > 0) then + dummy = 1 + end if + +! { dg-missed {'map\(force_tofrom:\*lcf_5_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_5_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:lcf_5_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_5_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_5_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_5_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_5_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_5_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine lcf_5 + +subroutine lcf_6 (lcf_6_pvar1, lcf_6_pvar2) + use globals + implicit none + real :: lcf_6_pvar1 + real :: lcf_6_pvar2 + real :: lcf_6_lvar1 + real, save :: lcf_6_svar2 + integer :: dummy + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + lcf_6_pvar1 = 0 + lcf_6_lvar1 = 1 + lcf_6_evar2 = 2 + lcf_6_svar2 = 3 + !$acc end kernels + + dummy = merge(1,0, lcf_6_pvar2 > 0) + +! { dg-missed {'map\(force_tofrom:\*lcf_6_pvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*lcf_6_pvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:lcf_6_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(lcf_6_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:lcf_6_evar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_6_evar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! 
{ dg-missed {'map\(force_tofrom:lcf_6_svar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'lcf_6_svar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine lcf_6 + +subroutine priv_1 () + implicit none + integer :: priv_1_lvar1, priv_1_lvar2, priv_1_lvar3, priv_1_lvar4 + integer :: priv_1_lvar5, priv_1_lvar6, dummy + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + ! { dg-message {note: beginning 'Graphite' part in OpenACC 'kernels' region} "" { target *-*-* } .+1 } + priv_1_lvar1 = 1 + dummy = priv_1_lvar2 + + if (priv_1_lvar2 > 0) then + priv_1_lvar3 = 1 + else + priv_1_lvar3 = 2 + end if + + priv_1_lvar5 = priv_1_lvar3 + + if (priv_1_lvar2 > 0) then + priv_1_lvar4 = 1 + dummy = priv_1_lvar4 + end if + !$acc end kernels + +! { dg-optimized {'map\(force_tofrom:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:priv_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-bogus {'map\(to:priv_1_lvar2 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar2\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:priv_1_lvar3 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar3\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:priv_1_lvar4 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar4\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:priv_1_lvar5 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(priv_1_lvar5\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(force_tofrom:dummy \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:dummy \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:dummy \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(dummy\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine priv_1 + +subroutine multiple_kernels_1 () + implicit none + integer :: multiple_kernels_1_lvar1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + multiple_kernels_1_lvar1 = 1 + !$acc end kernels + + !$acc kernels ! { dg-line l_use[incr c_use] } + multiple_kernels_1_lvar1 = multiple_kernels_1_lvar1 + 1 + !$acc end kernels + +! { dg-missed {'map\(force_tofrom:multiple_kernels_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'multiple_kernels_1_lvar1' used...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } + +! 
{ dg-optimized {'map\(force_tofrom:multiple_kernels_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:multiple_kernels_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_use$c_use } +end subroutine multiple_kernels_1 + +subroutine multiple_kernels_2 () + implicit none + integer :: multiple_kernels_2_lvar1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + multiple_kernels_2_lvar1 = 1 + !$acc end kernels + + !$acc parallel + multiple_kernels_2_lvar1 = multiple_kernels_2_lvar1 + 1 ! { dg-line l_use[incr c_use] } + !$acc end parallel + +! { dg-missed {'map\(force_tofrom:multiple_kernels_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'multiple_kernels_2_lvar1' used...} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {\.\.\. here} "" { target *-*-* } l_use$c_use } +end subroutine multiple_kernels_2 + +integer function ref_1 () + implicit none + integer, target :: ref_1_lvar1 + integer, target :: ref_1_lvar2 + integer, pointer :: ref_1_ref1 + + ref_1_ref1 => ref_1_lvar1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + ref_1_lvar1 = 1 + ! FIXME: currently considered unsuitable; but could be optimized + ref_1_lvar2 = 2 + !$acc end kernels + + ref_1 = ref_1_ref1 + +! { dg-missed {'map\(force_tofrom:ref_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_1_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:ref_1_lvar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_1_lvar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end function ref_1 + +integer function ref_2 () + implicit none + integer, target :: ref_2_lvar1 + integer, target :: ref_2_lvar2 + integer, pointer :: ref_2_ref1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + ref_2_lvar1 = 1 + ! FIXME: currently considered unsuitable, but could be optimized + ref_2_lvar2 = 2 + !$acc end kernels + + ref_2_ref1 => ref_2_lvar1 + ref_2 = ref_2_ref1 + +! { dg-missed {'map\(force_tofrom:ref_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_2_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:ref_2_lvar2 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_2_lvar2' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end function ref_2 + +subroutine ref_3 () + implicit none + integer, target :: ref_3_lvar1 + integer, pointer :: ref_3_ref1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + ref_3_ref1 => ref_3_lvar1 + + ! FIXME: currently considered unsuitable, but could be optimized + ref_3_lvar1 = 1 + !$acc end kernels + +! { dg-missed {'map\(force_tofrom:\*ref_3_ref1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*ref_3_ref1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! { dg-missed {'map\(force_tofrom:ref_3_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_3_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine ref_3 + +subroutine ref_4 () + implicit none + integer, target :: ref_4_lvar1 + integer, pointer :: ref_4_ref1 + + !$acc kernels ! { dg-line l_compute[incr c_compute] } + ref_4_ref1 => ref_4_lvar1 + + ! FIXME: currently considered unsuitable, but could be optimized + ref_4_ref1 = 1 + !$acc end kernels + +! { dg-missed {'map\(force_tofrom:\*ref_4_ref1 \[len: [0-9]+\]\[implicit\]\)' not optimized: '\*ref_4_ref1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +! 
{ dg-missed {'map\(force_tofrom:ref_4_lvar1 \[len: [0-9]+\]\[implicit\]\)' not optimized: 'ref_4_lvar1' is unsuitable for privatization} "" { target *-*-* } l_compute$c_compute } +end subroutine ref_4 + +subroutine conditional_1 (conditional_1_pvar1) + implicit none + integer :: conditional_1_pvar1 + integer :: conditional_1_lvar1 + + conditional_1_lvar1 = 1 + + if (conditional_1_pvar1 > 0) then + !$acc kernels ! { dg-line l_compute[incr c_compute] } + conditional_1_lvar1 = 2 + !$acc end kernels + else + conditional_1_lvar1 = 3 + end if + +! { dg-optimized {'map\(force_tofrom:conditional_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:conditional_1_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:conditional_1_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(conditional_1_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine conditional_1 + +subroutine conditional_2 (conditional_2_pvar1) + implicit none + integer :: conditional_2_pvar1 + integer :: conditional_2_lvar1 + + conditional_2_lvar1 = 1 + + if (conditional_2_pvar1 > 0) then + conditional_2_lvar1 = 3 + else + !$acc kernels ! { dg-line l_compute[incr c_compute] } + conditional_2_lvar1 = 2 + !$acc end kernels + end if + +! { dg-optimized {'map\(force_tofrom:conditional_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' optimized to 'map\(to:conditional_2_lvar1 \[len: [0-9]+\]\[implicit\]\)'} "" { target *-*-* } l_compute$c_compute } +! { dg-optimized {'map\(to:conditional_2_lvar1 \[len: [0-9]+\]\[implicit\]\)' further optimized to 'private\(conditional_2_lvar1\)'} "" { target *-*-* } l_compute$c_compute } +end subroutine conditional_2 diff --git a/gcc/testsuite/gfortran.dg/goacc/uninit-copy-clause.f95 b/gcc/testsuite/gfortran.dg/goacc/uninit-copy-clause.f95 index b2aae1df5229..97fbe1268b73 100644 --- a/gcc/testsuite/gfortran.dg/goacc/uninit-copy-clause.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/uninit-copy-clause.f95 @@ -5,6 +5,8 @@ subroutine foo integer :: i !$acc kernels + ! { dg-warning "'i' is used uninitialized in this function" "" { target *-*-* } .-1 } + !TODO See discussion in '../../c-c++-common/goacc/uninit-copy-clause.c'. 
i = 1 !$acc end kernels diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 5ffd2799ac2c..6fde66bdd8de 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -419,6 +419,7 @@ extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_omp_oacc_kernels_decompose (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_omp_data_optimize (gcc::context *ctxt); extern gimple_opt_pass *make_pass_diagnose_omp_blocks (gcc::context *ctxt); extern gimple_opt_pass *make_pass_expand_omp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_expand_omp_ssa (gcc::context *ctxt); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c index dd83557b6aa2..25d545559b15 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c @@ -23,6 +23,8 @@ int main() int b[N] = { 0 }; #pragma acc kernels + /* { dg-missed {'map\(tofrom:b [^)]+\)' not optimized: 'b' is unsuitable for privatization} "" { target *-*-* } .-1 } + { dg-missed {'map\(force_tofrom:a [^)]+\)' not optimized: 'a' is unsuitable for privatization} "" { target *-*-* } .-2 } */ { int c = 234; /* { dg-message "note: beginning 'gang-single' part in OpenACC 'kernels' region" } */ diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr94358-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr94358-1.f90 index 74ee6fde84f8..994a8a35110f 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/pr94358-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/pr94358-1.f90 @@ -17,6 +17,10 @@ subroutine kernel(lo, hi, a, b, c) real, dimension(lo:hi) :: a, b, c !$acc kernels copyin(lo, hi) + ! { dg-optimized {'map\(force_tofrom:offset.[0-9]+ [^)]+\)' optimized to 'map\(to:offset.[0-9]+ [^)]+\)'} "" {target *-*-* } .-1 } + ! { dg-missed {'map\(tofrom:\*c [^)]+\)' not optimized: '\*c' is unsuitable for privatization} "" { target *-*-* } .-2 } + ! { dg-missed {'map\(tofrom:\*b [^)]+\)' not optimized: '\*b' is unsuitable for privatization} "" { target *-*-* } .-3 } + ! { dg-missed {'map\(tofrom:\*a [^)]+\)' not optimized: '\*a' is unsuitable for privatization} "" { target *-*-* } .-4 } !$acc loop independent ! { dg-line l_loop_i[incr c_loop_i] } ! { dg-message "note: parallelized loop nest in OpenACC 'kernels' region" "" { target *-*-* } l_loop_i$c_loop_i } ! { dg-optimized "assigned OpenACC gang vector loop parallelism" "" { target *-*-* } l_loop_i$c_loop_i }