Message ID | 20140917100849.GD17454@tucnak.redhat.com |
---|---|
State | Rejected |
Headers |
Received: (qmail 22635 invoked by alias); 17 Sep 2014 10:09:04 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 22625 invoked by uid 89); 17 Sep 2014 10:09:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Date: Wed, 17 Sep 2014 12:08:49 +0200 From: Jakub Jelinek <jakub@redhat.com> To: Andrew Senkevich <andrew.n.senkevich@gmail.com> Cc: "H.J. Lu" <hjl.tools@gmail.com>, "Carlos O'Donell" <carlos@redhat.com>, "Joseph S. Myers" <joseph@codesourcery.com>, libc-alpha <libc-alpha@sourceware.org>, "Zamyatin, Igor" <igor.zamyatin@intel.com>, "Melik-Adamyan, Areg" <areg.melik-adamyan@intel.com> Subject: Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc Message-ID: <20140917100849.GD17454@tucnak.redhat.com> Reply-To: Jakub Jelinek <jakub@redhat.com> References: <CAMXFM3t=ppndDUBzHzSus7xyuF5hTaLFZ5b273jD39NtddSvsw@mail.gmail.com> <Pine.LNX.4.64.1409101549490.12853@digraph.polyomino.org.uk> <5411F8D3.7050001@redhat.com> <CAMXFM3vEbTO1ntx7KOAG21axosPApTG6vwpcnu7B4VVATD+USw@mail.gmail.com> <CAMe9rOqFmwMWYBSsg9gPNeB_nskWZMSpzeWwc=YomsTNzjCn1A@mail.gmail.com> <CAMXFM3uNrRrAHDdS0LnbRZ7QwEFv1yd25cu1Ht2NC8fMBxLsBA@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <CAMXFM3uNrRrAHDdS0LnbRZ7QwEFv1yd25cu1Ht2NC8fMBxLsBA@mail.gmail.com> User-Agent: Mutt/1.5.23 (2014-03-12) |
Commit Message
Jakub Jelinek
Sept. 17, 2014, 10:08 a.m. UTC
On Wed, Sep 17, 2014 at 01:56:06PM +0400, Andrew Senkevich wrote: > > The wiki says: > > > > 3.1. Goal > > > > Main goal is to improve vectorization of GCC with OpenMP4.0 SIMD > > constructs (#2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf > > and Cilk Plus constructs (6-7 in > > http://www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm) > > on x86_64 by adding SSE4, AVX and AVX2 vector implementations of > > several vector math functions (float and double versions). AVX-512 > > versions are planned to be added later. These functions can be also > > used manually (with intrincics) by developers to obtain speedup. > > > > It is the opposite of > > > > https://sourceware.org/ml/libc-alpha/2014-09/msg00277.html > > > > which is for programmers to use them directly in their > > applications, mostly independent of compilers. > > > > We need to come to an agreement on what goal is first. > > > > -- > > H.J. > > Hi H.J., > > of course the first goal is to improve vectorization. Usage with > intrinsics is additional goal and is not very significant. > > Attached first patch corrected according last comments in > https://sourceware.org/ml/libc-alpha/2014-09/msg00182.html. you need all of SSE2, AVX and AVX2 versions, the other two can be thunked (extract arguments and call cos in a loop or similarly, then pass result in vector reg again). Jakub
Comments
Hi Jakub, > +/* For now we have vectorized version only for _Mdouble_ case */ > +#if !defined _Mfloat_ && !defined _Mlong_double_ > +# if defined _OPENMP && _OPENMP >= 201307 > +# define __DECL_SIMD _Pragma ("omp declare simd") > > As the function is provided only on x86_64, it needs to be guarded > by defined __x86_64__ too (or have some way how arch specific > headers can tell what function are elemental). > Also, only the N (notinbranch) version is provided, so you'd > need to use "omp declare simd notinbranch", and furthermore only > the AVX2 version is provided (that is not possible for gcc, > you need all of SSE2, AVX and AVX2 versions, the other two can be > thunked (extract arguments and call cos in a loop or similarly, then > pass result in vector reg again). thank you, this place will look so: #if defined (__x86_64__) && !defined _Mfloat_ && !defined _Mlong_double_ \ && defined _OPENMP && _OPENMP >= 201307 # define __DECL_SIMD _Pragma ("omp declare simd notinbranch") #else # define __DECL_SIMD #endif Only AVX will be thunked, SSE4 we have implemented. It will be added in next patch soon.
On Wed, 17 Sep 2014, Andrew Senkevich wrote: > thank you, this place will look so: > > #if defined (__x86_64__) && !defined _Mfloat_ && !defined _Mlong_double_ \ > && defined _OPENMP && _OPENMP >= 201307 > # define __DECL_SIMD _Pragma ("omp declare simd notinbranch") > #else > # define __DECL_SIMD > #endif No, we never put architecture conditionals like that in architecture-independent headers. You have to use bits/*.h headers (per-architecture) instead to provide all the information about which functions have which vectorized versions. See what I suggested in <https://sourceware.org/ml/libc-alpha/2014-09/msg00182.html> about macros such as __DECL_SIMD_COS_DOUBLE. There's no point in sending more patch revisions until there's consensus on the overall implementation approach. Please go back several steps and start an architecture-independent discussion in a new architecture-independent thread seeking consensus on all the points I raised in <https://sourceware.org/ml/libc-alpha/2014-09/msg00182.html>. State your initial answers to these points so people can agree or disagree with them, but be prepared to change following the results of the discussion. Only once there is such consensus, write up the results on the wiki, seek confirmation on the list that other people agree that what you wrote up is an accurate representation of the consensus, and only then start sending patches (which will probably put functions in libmvec, not libm, based on the discussions so far). The patches have to follow the consensus, not your preference if the consensus goes against what you prefer. <https://sourceware.org/glibc/wiki/libm> does not in any way reflect any sort of consensus. It's *ideas and proposals*. Consensus can only be reached after discussions on libc-alpha (and if no-one comments on something, that's not consensus). In general, proposals are best posted to libc-alpha so it's easy to tell what was being discussed in a particular thread - proposals on wiki pages make it harder to follow the discussion, as you need to work out which version of the wiki page someone was referring to. Note that if we're relying on #pragma omp declare simd meaning a precise set of function versions are available - with a guarantee that no future compiler version will interpret is also meaning an AVX512 version is available, for example, so that it's safely possible to use older glibc with a newer compiler - there should be some sort of ABI document (preferably compiler-independent) stating that this is the meaning of that pragma on x86_64 and that this pragma says nothing about availability of function versions for other vector extensions and that if an ABI is defined for such versions in future, it will use a different pragma to declare their availability. Is there such an ABI document available that defines what this pragma means on x86_64?
--- a/math/bits/mathcalls.h +++ b/math/bits/mathcalls.h @@ -46,6 +46,17 @@ # error "Never include <bits/mathcalls.h> directly; include <math.h> instead." #endif +#undef __DECL_SIMD + +/* For now we have vectorized version only for _Mdouble_ case */ +#if !defined _Mfloat_ && !defined _Mlong_double_ +# if defined _OPENMP && _OPENMP >= 201307 +# define __DECL_SIMD _Pragma ("omp declare simd") As the function is provided only on x86_64, it needs to be guarded by defined __x86_64__ too (or have some way how arch specific headers can tell what function are elemental). Also, only the N (notinbranch) version is provided, so you'd need to use "omp declare simd notinbranch", and furthermore only the AVX2 version is provided (that is not possible for gcc,