From patchwork Mon Oct 11 10:21:59 2021
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 46064
Date: Mon, 11 Oct 2021 12:21:59 +0200
From: Jakub Jelinek
To: gcc-patches@gcc.gnu.org
Cc: Tobias Burnus
Subject: [committed] openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit
Message-ID: <20211011102159.GT304296@tucnak>

Hi!

OpenMP 5.1 adds env vars and functions to set and query new ICVs used
as fallback if thread_limit or num_teams clauses aren't specified on
the teams construct.

The following patch implements those, though further work will be needed:
1) OpenMP 5.1 also changed the num_teams clause, so that it can specify
   both a lower and an upper limit for how many teams should be created,
   and changed the meaning when only one expression is provided: what
   num_teams(expr) meant in 5.0 is spelled num_teams(1:expr) in 5.1,
   while num_teams(expr) in 5.1 means num_teams(expr:expr).  I.e. while
   previously we could create 1 to expr teams, in 5.1 the lower limit
   defaults to the single expression provided and we may not create
   fewer teams.  For host teams (which we don't currently implement
   efficiently for NUMA hosts) we trivially satisfy it now by always
   honoring what the user asked for, but for the offloading teams I
   think we'll need to rethink the APIs.  Currently the teams construct
   is just a call that returns and possibly lowers the number of teams.
   Whenever possible we try to evaluate num_teams/thread_limit already
   on the target construct, and the GOMP_teams call just sets the number
   of teams to the minimum of provided and requested teams.  In some
   cases, e.g. where target is not combined with teams and the num_teams
   expression calls some functions etc., we need to call those functions
   in the target region and so it is too late to figure out the number
   of teams; the hw could also just limit what it is willing to create.
   In that case I'm afraid we need to run the target body multiple times
   and arrange for omp_get_team_num () returning the right values.
2) We need to finally implement the NUMA handling for GOMP_teams_reg.
3) I now realize I haven't added any testcase coverage, will do that
   incrementally.
4) libgomp.texi needs updates for these new APIs, but also for others
   like the allocator.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-10-11  Jakub Jelinek

gcc/
	* omp-low.c (omp_runtime_api_call): Handle omp_get_max_teams,
	omp_[sg]et_teams_thread_limit and omp_set_num_teams.
libgomp/
	* omp.h.in (omp_set_num_teams, omp_get_max_teams,
	omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
	* omp_lib.f90.in (omp_set_num_teams, omp_get_max_teams,
	omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
	* omp_lib.h.in (omp_set_num_teams, omp_get_max_teams,
	omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
	* libgomp.h (gomp_nteams_var, gomp_teams_thread_limit_var): Declare.
	* libgomp.map (OMP_5.1): Export omp_get_max_teams{,_},
	omp_get_teams_thread_limit{,_}, omp_set_num_teams{,_,_8_} and
	omp_set_teams_thread_limit{,_,_8_}.
	* icv.c (omp_set_num_teams, omp_get_max_teams,
	omp_set_teams_thread_limit, omp_get_teams_thread_limit): New functions.
	* env.c (gomp_nteams_var, gomp_teams_thread_limit_var): Define.
	(omp_display_env): Print OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.
	(initialize_env): Handle OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT env
	vars.
	* teams.c (GOMP_teams_reg): If thread_limit is not specified, use
	gomp_teams_thread_limit_var as fallback if not zero.  If num_teams
	is not specified, use gomp_nteams_var.
	* fortran.c (omp_set_num_teams, omp_get_max_teams,
	omp_set_teams_thread_limit, omp_get_teams_thread_limit): Add
	ialias_redirect.
	(omp_set_num_teams_, omp_set_num_teams_8_, omp_get_max_teams_,
	omp_set_teams_thread_limit_, omp_set_teams_thread_limit_8_,
	omp_get_teams_thread_limit_): New functions.

	Jakub

--- gcc/omp-low.c.jj	2021-09-30 17:12:15.236586906 +0200
+++ gcc/omp-low.c	2021-10-09 14:34:21.119388958 +0200
@@ -3953,6 +3953,7 @@ omp_runtime_api_call (const_tree fndecl)
       "get_level",
       "get_max_active_levels",
       "get_max_task_priority",
+      "get_max_teams",
       "get_max_threads",
       "get_nested",
       "get_num_devices",
@@ -3965,6 +3966,7 @@ omp_runtime_api_call (const_tree fndecl)
       "get_proc_bind",
       "get_supported_active_levels",
       "get_team_num",
+      "get_teams_thread_limit",
       "get_thread_limit",
       "get_thread_num",
       "get_wtick",
@@ -3998,8 +4000,10 @@ omp_runtime_api_call (const_tree fndecl)
       "set_dynamic",
       "set_max_active_levels",
       "set_nested",
+      "set_num_teams",
       "set_num_threads",
-      "set_schedule"
+      "set_schedule",
+      "set_teams_thread_limit"
     };
 
   int mode = 0;
--- libgomp/omp.h.in.jj	2021-10-01 10:32:03.024954096 +0200
+++ libgomp/omp.h.in	2021-10-09 15:06:38.173661594 +0200
@@ -261,6 +261,11 @@ extern int omp_get_max_task_priority (vo
 
 extern void omp_fulfill_event (omp_event_handle_t) __GOMP_NOTHROW;
 
+extern void omp_set_num_teams (int) __GOMP_NOTHROW;
+extern int omp_get_max_teams (void) __GOMP_NOTHROW;
+extern void omp_set_teams_thread_limit (int) __GOMP_NOTHROW;
+extern int omp_get_teams_thread_limit (void) __GOMP_NOTHROW;
+
 extern void *omp_target_alloc (__SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern void omp_target_free (void *, int) __GOMP_NOTHROW;
 extern int omp_target_is_present (const void *, int) __GOMP_NOTHROW;
--- libgomp/omp_lib.f90.in.jj	2021-09-30 17:12:15.251586697 +0200
+++ libgomp/omp_lib.f90.in	2021-10-09 15:06:38.170661637 +0200
@@ -564,6 +564,36 @@
           end function omp_get_max_task_priority
         end interface
 
+        interface omp_set_num_teams
+          subroutine omp_set_num_teams (num_teams)
+            integer (4), intent (in) :: num_teams
+          end subroutine omp_set_num_teams
+          subroutine omp_set_num_teams_8 (num_teams)
+            integer (8), intent (in) :: num_teams
+          end subroutine omp_set_num_teams_8
+        end interface
+
+        interface
+          function omp_get_max_teams ()
+            integer (4) :: omp_get_max_teams
+          end function omp_get_max_teams
+        end interface
+
+        interface omp_set_teams_thread_limit
+          subroutine omp_set_teams_thread_limit (thread_limit)
+            integer (4), intent (in) :: thread_limit
+          end subroutine omp_set_teams_thread_limit
+          subroutine omp_set_teams_thread_limit_8 (thread_limit)
+            integer (8), intent (in) :: thread_limit
+          end subroutine omp_set_teams_thread_limit_8
+        end interface
+
+        interface
+          function omp_get_teams_thread_limit ()
+            integer (4) :: omp_get_teams_thread_limit
+          end function omp_get_teams_thread_limit
+        end interface
+
         interface
           subroutine omp_fulfill_event (event)
             use omp_lib_kinds
--- libgomp/omp_lib.h.in.jj	2021-09-30 17:12:15.257586613 +0200
+++ libgomp/omp_lib.h.in	2021-10-09 15:06:38.167661680 +0200
@@ -252,6 +252,10 @@
       external omp_get_max_task_priority
       integer(4) omp_get_max_task_priority
 
+      external omp_set_num_teams, omp_set_teams_thread_limit
+      external omp_get_max_teams, omp_get_teams_thread_limit
+      integer(4) omp_get_max_teams, omp_get_teams_thread_limit
+
       external omp_fulfill_event
 
       external omp_set_affinity_format, omp_get_affinity_format
--- libgomp/libgomp.h.jj	2021-07-28 12:06:00.535928239 +0200
+++ libgomp/libgomp.h	2021-10-09 15:06:38.154661866 +0200
@@ -458,6 +458,8 @@ extern unsigned long gomp_bind_var_list_
 extern void **gomp_places_list;
 extern unsigned long gomp_places_list_len;
 extern unsigned int gomp_num_teams_var;
+extern int gomp_nteams_var;
+extern int gomp_teams_thread_limit_var;
 extern int gomp_debug_var;
 extern bool gomp_display_affinity_var;
 extern char *gomp_affinity_format_var;
--- libgomp/libgomp.map.jj	2021-09-30 09:29:56.739900855 +0200
+++ libgomp/libgomp.map	2021-10-09 15:06:38.151661909 +0200
@@ -214,6 +214,16 @@ OMP_5.1 {
 	omp_display_env;
 	omp_display_env_;
 	omp_display_env_8_;
+	omp_set_num_teams;
+	omp_set_num_teams_;
+	omp_set_num_teams_8_;
+	omp_get_max_teams;
+	omp_get_max_teams_;
+	omp_set_teams_thread_limit;
+	omp_set_teams_thread_limit_;
+	omp_set_teams_thread_limit_8_;
+	omp_get_teams_thread_limit;
+	omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
 GOMP_1.0 {
--- libgomp/icv.c.jj	2021-10-01 10:42:04.046441605 +0200
+++ libgomp/icv.c	2021-10-09 15:06:38.157661823 +0200
@@ -148,6 +148,32 @@ omp_get_supported_active_levels (void)
   return gomp_supported_active_levels;
 }
 
+void
+omp_set_num_teams (int num_teams)
+{
+  if (num_teams >= 0)
+    gomp_nteams_var = num_teams;
+}
+
+int
+omp_get_max_teams (void)
+{
+  return gomp_nteams_var;
+}
+
+void
+omp_set_teams_thread_limit (int thread_limit)
+{
+  if (thread_limit >= 0)
+    gomp_teams_thread_limit_var = thread_limit;
+}
+
+int
+omp_get_teams_thread_limit (void)
+{
+  return gomp_teams_thread_limit_var;
+}
+
 int
 omp_get_cancellation (void)
 {
@@ -248,6 +274,10 @@ ialias (omp_get_thread_limit)
 ialias (omp_set_max_active_levels)
 ialias (omp_get_max_active_levels)
 ialias (omp_get_supported_active_levels)
+ialias (omp_set_num_teams)
+ialias (omp_get_max_teams)
+ialias (omp_set_teams_thread_limit)
+ialias (omp_get_teams_thread_limit)
 ialias (omp_get_cancellation)
 ialias (omp_get_proc_bind)
 ialias (omp_get_max_task_priority)
--- libgomp/env.c.jj	2021-10-01 10:42:04.011442100 +0200
+++ libgomp/env.c	2021-10-09 15:06:38.160661780 +0200
@@ -90,6 +90,8 @@ unsigned long gomp_places_list_len;
 uintptr_t gomp_def_allocator = omp_default_mem_alloc;
 int gomp_debug_var;
 unsigned int gomp_num_teams_var;
+int gomp_nteams_var;
+int gomp_teams_thread_limit_var;
 bool gomp_display_affinity_var;
 char *gomp_affinity_format_var = "level %L thread %i affinity %A";
 size_t gomp_affinity_format_len;
@@ -1319,6 +1321,9 @@ omp_display_env (int verbose)
            gomp_global_icv.thread_limit_var);
   fprintf (stderr, "  OMP_MAX_ACTIVE_LEVELS = '%u'\n",
            gomp_global_icv.max_active_levels_var);
+  fprintf (stderr, "  OMP_NUM_TEAMS = '%u'\n", gomp_nteams_var);
+  fprintf (stderr, "  OMP_TEAMS_THREAD_LIMIT = '%u'\n",
+           gomp_teams_thread_limit_var);
   fprintf (stderr, "  OMP_CANCELLATION = '%s'\n",
            gomp_cancel_var ? "TRUE" : "FALSE");
 
@@ -1453,6 +1458,8 @@ initialize_env (void)
                                  &gomp_nthreads_var_list,
                                  &gomp_nthreads_var_list_len))
     gomp_global_icv.nthreads_var = gomp_available_cpus;
+  parse_int ("OMP_NUM_TEAMS", &gomp_nteams_var, false);
+  parse_int ("OMP_TEAMS_THREAD_LIMIT", &gomp_teams_thread_limit_var, false);
   bool ignore = false;
   if (parse_bind_var ("OMP_PROC_BIND",
                       &gomp_global_icv.bind_var,
--- libgomp/teams.c.jj	2021-01-04 10:25:56.062038735 +0100
+++ libgomp/teams.c	2021-10-09 15:06:38.148661952 +0200
@@ -37,6 +37,8 @@ GOMP_teams_reg (void (*fn) (void *), voi
   (void) flags;
   (void) num_teams;
   unsigned old_thread_limit_var = 0;
+  if (thread_limit == 0)
+    thread_limit = gomp_teams_thread_limit_var;
   if (thread_limit)
     {
       struct gomp_task_icv *icv = gomp_icv (true);
@@ -45,7 +47,7 @@ GOMP_teams_reg (void (*fn) (void *), voi
         = thread_limit > INT_MAX ? UINT_MAX : thread_limit;
     }
   if (num_teams == 0)
-    num_teams = 3;
+    num_teams = gomp_nteams_var ? gomp_nteams_var : 3;
   gomp_num_teams = num_teams;
   for (gomp_team_num = 0; gomp_team_num < num_teams; gomp_team_num++)
     fn (data);
--- libgomp/fortran.c.jj	2021-10-01 10:42:04.034441774 +0200
+++ libgomp/fortran.c	2021-10-09 15:06:38.163661737 +0200
@@ -67,6 +67,10 @@ ialias_redirect (omp_get_thread_limit)
 ialias_redirect (omp_set_max_active_levels)
 ialias_redirect (omp_get_max_active_levels)
 ialias_redirect (omp_get_supported_active_levels)
+ialias_redirect (omp_set_num_teams)
+ialias_redirect (omp_get_max_teams)
+ialias_redirect (omp_set_teams_thread_limit)
+ialias_redirect (omp_get_teams_thread_limit)
 ialias_redirect (omp_get_level)
 ialias_redirect (omp_get_ancestor_thread_num)
 ialias_redirect (omp_get_team_size)
@@ -478,6 +482,42 @@ omp_in_final_ (void)
   return omp_in_final ();
 }
 
+void
+omp_set_num_teams_ (const int32_t *num_teams)
+{
+  omp_set_num_teams (*num_teams);
+}
+
+void
+omp_set_num_teams_8_ (const int64_t *num_teams)
+{
+  omp_set_num_teams (TO_INT (*num_teams));
+}
+
+int32_t
+omp_get_max_teams_ (void)
+{
+  return omp_get_max_teams ();
+}
+
+void
+omp_set_teams_thread_limit_ (const int32_t *thread_limit)
+{
+  omp_set_teams_thread_limit (*thread_limit);
+}
+
+void
+omp_set_teams_thread_limit_8_ (const int64_t *thread_limit)
+{
+  omp_set_teams_thread_limit (TO_INT (*thread_limit));
+}
+
+int32_t
+omp_get_teams_thread_limit_ (void)
+{
+  return omp_get_teams_thread_limit ();
+}
+
 int32_t
 omp_get_cancellation_ (void)
 {
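
For illustration only, and not part of the patch: the fallback order that the icv.c and teams.c hunks describe can be modelled in plain C without libgomp.  This is a minimal sketch under assumptions.  Every model_* name is hypothetical, the two statics merely stand in for gomp_nteams_var and gomp_teams_thread_limit_var, and the per-task ICV update and the INT_MAX clamp done by GOMP_teams_reg are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gomp_nteams_var and
   gomp_teams_thread_limit_var; 0 means "not set".  */
static int model_nteams_var;
static int model_teams_thread_limit_var;

/* Like omp_set_num_teams in the patch: negative values are ignored.  */
static void
model_set_num_teams (int num_teams)
{
  if (num_teams >= 0)
    model_nteams_var = num_teams;
}

static int
model_get_max_teams (void)
{
  return model_nteams_var;
}

/* Like omp_set_teams_thread_limit: negative values are ignored.  */
static void
model_set_teams_thread_limit (int thread_limit)
{
  if (thread_limit >= 0)
    model_teams_thread_limit_var = thread_limit;
}

static int
model_get_teams_thread_limit (void)
{
  return model_teams_thread_limit_var;
}

/* Values the host teams region would end up using.  */
struct model_teams_config
{
  unsigned int num_teams;
  unsigned int thread_limit;	/* 0 means no limit was applied.  */
};

/* A clause value of 0 means the clause was absent.  An absent
   thread_limit falls back to the ICV (which may itself be 0); an
   absent num_teams falls back to the ICV, or to 3 if that is 0.  */
static void
model_resolve_teams (unsigned int num_teams, unsigned int thread_limit,
		     struct model_teams_config *out)
{
  assert (out != NULL);
  if (thread_limit == 0)
    thread_limit = model_teams_thread_limit_var;
  if (num_teams == 0)
    num_teams = model_nteams_var ? model_nteams_var : 3;
  out->num_teams = num_teams;
  out->thread_limit = thread_limit;
}
```

With nothing set, model_resolve_teams (0, 0, &cfg) yields 3 teams and no thread limit; after model_set_num_teams (8) and model_set_teams_thread_limit (4) the same call yields 8 and 4, while explicit clause values always win over the ICVs.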