Message ID | CAMXFM3uGOKqEAvGYew+9K7bmhObVmnP2u4kUOSh8_Cpwyk8s5g@mail.gmail.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 24160 invoked by alias); 16 Oct 2014 16:37:42 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 24118 invoked by uid 89); 16 Oct 2014 16:37:41 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.0 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-qg0-f49.google.com X-Received: by 10.229.212.202 with SMTP id gt10mr3972909qcb.0.1413477455427; Thu, 16 Oct 2014 09:37:35 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <Pine.LNX.4.64.1410091724031.23641@digraph.polyomino.org.uk> References: <CAMXFM3tjquzniXP1weqxSVFJyhXqsf2PHuyrrrmqp7K0ZzORqA@mail.gmail.com> <CAMXFM3tqiqUNuSU2KXvAFM-QescX3+6xUO9=z5X0Ac6C9qJ7zg@mail.gmail.com> <CAMe9rOq7bZHb8R=opUzSmAMGWjLpX21mR=Sx96cuBph=TTtDXA@mail.gmail.com> <54246CB5.7020908@redhat.com> <CAMe9rOoLmJ2jHWmERoB0M83WNKovJOgh0--Kquw9O86A1tPU0g@mail.gmail.com> <5424733D.6010305@redhat.com> <CAMe9rOpacze055qyBFzz3M-b-GNtXCqZzMmkScBL9a94zVj28g@mail.gmail.com> <54247FAB.6050002@redhat.com> <CAMXFM3v8narOLMHC5U=fvyTFWV6s4ZACN-UrAC4fAcUs9SOFfA@mail.gmail.com> <54257507.9070508@redhat.com> <CAMXFM3vOLspQtHxgJfD_Emht480w2RMbiwnEH6A_LhoS-JZFag@mail.gmail.com> <Pine.LNX.4.64.1409301620020.15186@digraph.polyomino.org.uk> <542AF92E.8090708@lip6.fr> <Pine.LNX.4.64.1409302003410.12188@digraph.polyomino.org.uk> <CAMXFM3tuM_p6Acp4hzoQ2xzR=4BZqtw8NbezqY6h8V4Xx=5hUA@mail.gmail.com> <Pine.LNX.4.64.1410021411420.24886@digraph.polyomino.org.uk> <CAMXFM3uPiuJvSpgmt+8d0B1qh3QSA=TVx0ZExfojDVHzrscL8A@mail.gmail.com> <Pine.LNX.4.64.1410091724031.23641@digraph.polyomino.org.uk> From: Andrew Senkevich <andrew.n.senkevich@gmail.com> Date: Thu, 16 Oct 2014 20:37:05 +0400 Message-ID: <CAMXFM3uGOKqEAvGYew+9K7bmhObVmnP2u4kUOSh8_Cpwyk8s5g@mail.gmail.com> Subject: Re: [RFC] How to add vector math functions to Glibc To: "Joseph S. Myers" <joseph@codesourcery.com> Cc: libc-alpha <libc-alpha@sourceware.org> Content-Type: text/plain; charset=UTF-8 |
Commit Message
Andrew Senkevich
Oct. 16, 2014, 4:37 p.m. UTC
Hi, Joseph, here is the patch with test suite changes with fixes based on your previous comments. >> @@ -6258,7 +6274,11 @@ static const struct test_f_f_data cos_test_data[] = >> static void >> cos_test (void) >> { >> +#ifndef TEST_MATHVEC >> ALL_RM_TEST (cos, 0, cos_test_data, RUN_TEST_LOOP_f_f, END); >> +#else >> + TN_RM_TEST (vector_cos, 0, cos_test_data, RUN_TEST_LOOP_f_f, END); >> +#endif >> } > > And I don't think we want conditionals like this for every function - > indeed, the tests shouldn't need to know which functions have vector > versions at all. Do you mean to use the same *_test function for testing vector (through wrapper)? I have such scheme now but it requires to add macros named as standard function and it also caused changes in START macros. >> +CFLAGS-test-vec-double.c = -fno-inline -ffloat-store -fno-builtin >> -frounding-math -mavx2 -Wno-unused-function > > And since you can't determine at configure > time what host the tests might run on, instruction set features such as > AVX need testing for at runtime (this means building separate source files > for the test with separate options so that you know the compiler won't > generate AVX code before you've tested for AVX availability). Because of vector tests grouped by ISA we have different test driver names containing vector length (test-double-vlen4.c for AVX2). Scalar wrappers (called from test driver) will be in separate files (test-double-vlen4-wrapper.c) and will be built with architecture-specific options specified in sysdeps Makefile. For runtime check we need to insert condition before wrapper start so with help of new macros added in *_test function that condition could be defined in test driver. Now I have built test-double-vlen4 manually and if this way is ok I will prepare that sysdeps Makefile. +#include "libm-test.c" -- WBR, Andrew
Comments
On Thu, 16 Oct 2014, Andrew Senkevich wrote: > >> @@ -6258,7 +6274,11 @@ static const struct test_f_f_data cos_test_data[] = > >> static void > >> cos_test (void) > >> { > >> +#ifndef TEST_MATHVEC > >> ALL_RM_TEST (cos, 0, cos_test_data, RUN_TEST_LOOP_f_f, END); > >> +#else > >> + TN_RM_TEST (vector_cos, 0, cos_test_data, RUN_TEST_LOOP_f_f, END); > >> +#endif > >> } > > > > And I don't think we want conditionals like this for every function - > > indeed, the tests shouldn't need to know which functions have vector > > versions at all. > > Do you mean to use the same *_test function for testing vector > (through wrapper)? Yes. I don't have a full design, but the principle is to change how the macros for running tests expand (or what functions they call do) conditional on what is being tested, so that none of the conditionals are at the level of individual functions if it can be avoided. And I don't think you should need to change calls to START, just the expansion. > Because of vector tests grouped by ISA we have different test driver > names containing vector length (test-double-vlen4.c for AVX2). > Scalar wrappers (called from test driver) will be in separate files > (test-double-vlen4-wrapper.c) and will be built with > architecture-specific options specified in sysdeps Makefile. > For runtime check we need to insert condition before wrapper start so > with help of new macros added in *_test function that condition could > be defined in test driver. I'd think that the check for AVX2 etc. availability could run once in main, rather than in the tests of individual functions. > @@ -6247,7 +6248,7 @@ copysign_test (void) > > > static const struct test_f_f_data cos_test_data[] = > - { > + { This looks like a bogus diff hunk. > + /* Vector trigonometric functions: */ > +#ifdef TEST_MATHVEC > + > + cos_test (); > + > +#else There shouldn't be such conditionals. It should be arranged that if there isn't a relevant vector version of a particular function, running vector tests for that function does nothing - so there are no conditionals on which *_test functions to run, and none inside those functions, just conditionals affecting what the test macros do (by means of conditionals inside them such as if (HAVE_VECTOR_cos_double_vlen4), for example, resulting from appropriate concatenations). > diff --git a/sysdeps/x86_64/fpu/libm-test-ulps > b/sysdeps/x86_64/fpu/libm-test-ulps > index 36e1b76..0e11cd5 100644 > --- a/sysdeps/x86_64/fpu/libm-test-ulps > +++ b/sysdeps/x86_64/fpu/libm-test-ulps > @@ -905,6 +905,12 @@ idouble: 1 > ildouble: 2 > ldouble: 2 > > + > +Function: "vlen4_cos": > +double: 1 > + > Function: "cosh": > double: 1 > float: 1 This looks odd. There shouldn't be the double blank line, and entries should be sorted alphabetically - this file should be updated by "make regen-ulps", and you need to ensure that regen-ulps does include the ulps for the tests of the vector functions.
2014-10-17 1:51 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: >> + /* Vector trigonometric functions: */ >> +#ifdef TEST_MATHVEC >> + >> + cos_test (); >> + >> +#else > > There shouldn't be such conditionals. It should be arranged that if > there isn't a relevant vector version of a particular function, running > vector tests for that function does nothing - so there are no conditionals > on which *_test functions to run, and none inside those functions, just > conditionals affecting what the test macros do (by means of conditionals > inside them such as if (HAVE_VECTOR_cos_double_vlen4), for example, > resulting from appropriate concatenations). With HAVE_VECTOR_cos_double_vlen4 we need to have such macros with zero for all set of not vector functions which is huge. May be more suitable way is to have determined name of vector function wrapper and selection based on function name? I mean to have something like this in test driver test-double-vlen4.c: #define HAVE_VECTOR 1 #define VEC_PREFIX_STR "VECTOR_LEN_" #define cos VECTOR_LEN_4_cos #include "libm-test.c" in test-double-vlen4-wrapper.c: VECTOR_WRAPPER(VECTOR_LEN_4_cos,_ZGVdN4v_cos) and in libm-test.inc: #ifndef HAVE_VECTOR # define HAVE_VECTOR 0 #endif #ifndef VEC_PREFIX_STR # define VEC_PREFIX_STR "" #endif static const char *vec_prefix = VEC_PREFIX_STR; static int is_vector_name(const char *this_func) { if (strncmp(this_func, vec_prefix, strlen(vec_prefix))==0) return 1; return 0; } #define STR_CON(x,y) __STRING(x##y) /* Start and end the tests for a given function. */ #define START(FUNC, SUFF, EXACT) \ const char *this_func = STR_CON (FUNC, SUFF); \ if (HAVE_VECTOR && !is_vector_name(this_func)) return; \ init_max_error (this_func, EXACT) Is this way ok? -- WBR, Andrew
On Tue, 21 Oct 2014, Andrew Senkevich wrote: > > There shouldn't be such conditionals. It should be arranged that if > > there isn't a relevant vector version of a particular function, running > > vector tests for that function does nothing - so there are no conditionals > > on which *_test functions to run, and none inside those functions, just > > conditionals affecting what the test macros do (by means of conditionals > > inside them such as if (HAVE_VECTOR_cos_double_vlen4), for example, > > resulting from appropriate concatenations). > > With HAVE_VECTOR_cos_double_vlen4 we need to have such macros with > zero for all set of not vector functions which is huge. But I'd hope such macros could be generated by gen-libm-test.pl (or some such script, anyway) rather than needing lots of repetitive definitions to be maintained by hand and checked in. Essentially: * The architecture-specific headers (installed headers, or possibly non-installed ones used only by the testsuite in some cases) contain the information about what vector versions of what functions are available. Things are designed so that they only need to contain definitions where vector functions are available, not where they aren't (to avoid needing to repeat slightly different huge lists for each architecture). * Where a default definition to 0 is needed in any cases, the relevant definitions are generated automatically. (Indeed, this might make sense for a header included by bits/mathcalls.h, so that __MATHCALL can expand to include the right __DECL_SIMD_*, which might end up empty, rather than needing lots of #if conditionals before every function declaration there.) (Incidentally, there have been so many different patch fragments posted in this discussion that it's hard to follow what you're proposing, if e.g. in the discussion of testing it's relevant to look at what you're proposing for installed headers. I think it would help if you had a git branch with the current set of proposed changes, that you frequently rebase so it always shows what you currently propose.) > May be more suitable way is to have determined name of vector function > wrapper and selection based on function name? > I mean to have something like this in test driver test-double-vlen4.c: > > #define HAVE_VECTOR 1 > #define VEC_PREFIX_STR "VECTOR_LEN_" > #define cos VECTOR_LEN_4_cos > #include "libm-test.c" > > in test-double-vlen4-wrapper.c: > > VECTOR_WRAPPER(VECTOR_LEN_4_cos,_ZGVdN4v_cos) If you did something like that, I think it would still be desirable to have some form of automatic generation of a list of defines, one per function and conditional as needed on whether the relevant vector version of that function exists.
Hi, Joseph, attach contains current situation in my branch. I have generated additional header with series of definitions, now that header included in libm-test.inc but also only generation can be left in some script. Currently information from math.h is used for selection what test to run in vector case, but it required according changes in math-vector.h. Added tests for cos and cosf (float with no vector function body, just a stub now). Test suite passed with no fails on math tests on non AVX2 target, on AVX2 target also (but math/test-float-vlen8 must fail so it is strange, will look). Let me know if such changes ok in general. -- WBR, Andrew
On Thu, 23 Oct 2014, Andrew Senkevich wrote: > Let me know if such changes ok in general. I'm not clear we yet reached consensus on whether glibc is the right place for this; I think that discussion tailed off without a clear conclusion, and someone needs to reread it, post a careful analysis of the discussion so far and try to help the community reach consensus. Regarding the specific patch: > + [Enable building and installing mathvec @<:@default=yes on x86_64 build, else default=no@:>@])], I don't think the help text in an architecture-independent file should refer to specific architectures like this; just say "default depends on architecture" or similar. > +ifeq ($(build-mathvect),yes) > +# We need to install libm.so as linker script > +# for more comfortable use of vector math library. > +subdir_install: $(inst_libdir)/libm.so.tmp > +$(inst_libdir)/libm.so.tmp: $(common-objpfx)format.lds \ > + $(common-objpfx)math/libm.so$(libm.so-version) \ > + $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \ > + $(+force) > + (echo '/* GNU ld script */';\ > + cat $<; \ > + echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \ > + 'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \ > + ) > $@ > + mv -f $@ $(inst_libdir)/libm.so # TODO do it somehow after all other > +endif Clearly it's necessary to resolve how to disable the normal installation rule for libm.so so it can be cleanly replaced by this new one. > diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h > index 8a94a7e..2d31a11 100644 > --- a/math/bits/mathcalls.h > +++ b/math/bits/mathcalls.h > @@ -60,6 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x)); > __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x)); > > /* Cosine of X. */ > +#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos > +__DECL_SIMD_cos > +#endif > +#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf > +__DECL_SIMD_cosf > +#endif > +#if defined _Mlong_double_ && defined __DECL_SIMD_cosl > +__DECL_SIMD_cosl > +#endif > __MATHCALL (cos,, (_Mdouble_ __x)); As previously noted, I think it would be much better if the definition of __MATHCALL can include all the conditional bits (possibly through a generated header that defines __DECL_SIMD_cos etc. to empty if not defined by bits/math-vector.h). > diff --git a/math/have_vector.h b/math/have_vector.h > new file mode 100644 > index 0000000..94aacf0 > --- /dev/null > +++ b/math/have_vector.h > @@ -0,0 +1,2574 @@ > +/* > +Definitions below are generated with the following bash script: > +for func in $(grep ALL_RM_TEST math/libm-test.inc | awk {'print $2'} | sed -e "s/(//" -e "s/,//"); do Rather than having such a file checked in, makefile rules / scripts to generate it at test time should be checked in. > +static int avx2_usable; /* Set to 1 if AVX2 supported */ Given that we expect multiple architectures to have vector functions, this belongs in some architecture-specific file that libm-test.inc includes, rather than directly in libm-test.inc (which shouldn't refer directly to AVX at all). > -#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \ > - EXCEPTIONS) \ > - do \ > - if (enable_test (EXCEPTIONS)) \ > - { \ > - COMMON_TEST_SETUP (ARG_STR); \ > - check_float (test_name, FUNC (FUNC_NAME) (ARG), EXPECTED, \ > - EXCEPTIONS); \ > - COMMON_TEST_CLEANUP; \ > - } \ > +#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \ > + EXCEPTIONS) \ > + do \ > + if (enable_test (EXCEPTIONS)) \ > + { \ > + COMMON_TEST_SETUP (ARG_STR); \ > + check_float (test_name, \ > + CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), \ > + FUNC (FUNC_NAME)) (ARG), \ > + EXPECTED, \ > + EXCEPTIONS); \ > + COMMON_TEST_CLEANUP; \ > + } \ I think it would be better for FUNC to be defined, in the test file that includes libm-test.inc, in a way that avoids the need for the CONCAT* calls here. (To avoid warnings / errors about undeclared functions, I suppose the generated header might then need to redefine e.g. vec_sin to sin if there isn't a vector version of sin.) > +#if defined __x86_64__ && defined __FAST_MATH__ > +# if defined _OPENMP && _OPENMP >= 201307 > +/* OpenMP case. */ > +# define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch") > +# define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch") Of course we still need the API/ABI documentation providing the stable guarantee about exactly what this pragma means regarding the function versions it is saying are available in glibc. > +# define __DECL_SIMD_cos __DECL_SIMD_AVX2 > +# define __DECL_SIMD_cosf __DECL_SIMD_SSE4 > +# elif defined _CILKPLUS && _CILKPLUS >= 0 > +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */ > +# define __DECL_SIMD_AVX2 __attribute__((__vector__(nomask))) > +# define __DECL_SIMD_SSE4 __attribute__((__vector__(processor(core_i7_sse4_2),\ > + nomask))) And as previously noted, this needs to be fixed to be namespace-clean - using __nomask__, __processor__, __core_i7_sse4_2__. > +#if defined TEST_MATHVEC No, you can't have such conditionals on a macro in the user's namespace in an installed header. > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > index c9f9a51..91c4cdf 100644 > --- a/sysdeps/x86_64/configure.ac > +++ b/sysdeps/x86_64/configure.ac > @@ -5,6 +5,24 @@ AC_CHECK_HEADER([cpuid.h], , > [AC_MSG_ERROR([gcc must provide the <cpuid.h> header])], > [/* No default includes. */]) > > +dnl Check if compiler target is x86_64. Not needed. preconfigure fragments in sysdeps directories need to check the architecture, but configure ones don't (they'll only be run for the relevant architecture, unless one fragment explicitly sources another). > diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile > new file mode 100644 > index 0000000..d585fa0 > --- /dev/null > +++ b/sysdeps/x86_64/fpu/Makefile > @@ -0,0 +1,33 @@ > +ifeq ($(subdir),mathvec) > +libmvec-support += svml_d_cos4_core svml_d_cos_data > +endif > + > +# Rules for libmvec tests > +ifeq ($(subdir),math) > +ifneq ($(PERL),no) > +ifeq ($(build-mathvec),yes) > +libm-tests += test-double-vlen4 test-float-vlen8 > + > +CFLAGS-test-double-vlen4-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ > + -frounding-math -mavx2 > +CFLAGS-test-float-vlen8-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ > + -frounding-math -mavx2 I think the sysdeps makefile should actually just define that double-vlen4 and float-vlen8 are the vector lengths for which testing should take place, with all the other testing rules being arranged in an architecture-independent way. > +/* General constants: > + * lAbsMask > + */ > + .long 0xffffffff > + .long 0x7fffffff My previous point from <https://sourceware.org/ml/libc-alpha/2014-10/msg00324.html> still applies about how to make these tables more readable (one line per "double" constant, more explicitly say what the constants are) and ensure that the offsets in the tables are directly linked to the offsets used in the function implementation. > diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c > new file mode 100644 > index 0000000..0778e23 > --- /dev/null > +++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c This file may well need to be architecture-specific, at least as written, but ... > diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c > new file mode 100644 > index 0000000..4d3d9a3 > --- /dev/null > +++ b/sysdeps/x86_64/fpu/test-double-vlen4.c ... it's not at all clear that this one should need to be. At present it has some architecture-specific bits > +#define CHECK_ARCH_EXT if (!avx2_usable) return; > + > +extern FLOAT WRAPPER_NAME (cos) (FLOAT); but I'd think those are all that needs to go somewhere architecture-specific and the rest is pretty generic to any architecture supporting vector functions for vectors of 4 doubles.
2014-10-24 1:37 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: > On Thu, 23 Oct 2014, Andrew Senkevich wrote: > >> Let me know if such changes ok in general. > > I'm not clear we yet reached consensus on whether glibc is the right place > for this; I think that discussion tailed off without a clear conclusion, > and someone needs to reread it, post a careful analysis of the discussion > so far and try to help the community reach consensus. It was already decided and written in Consensus paragraph on wiki in https://sourceware.org/ml/libc-alpha/2014-09/msg00596.html. Link to wiki - https://sourceware.org/glibc/wiki/libm#Consensus >> +# define __DECL_SIMD_cos __DECL_SIMD_AVX2 >> +# define __DECL_SIMD_cosf __DECL_SIMD_SSE4 >> +# elif defined _CILKPLUS && _CILKPLUS >= 0 >> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */ >> +# define __DECL_SIMD_AVX2 __attribute__((__vector__(nomask))) >> +# define __DECL_SIMD_SSE4 __attribute__((__vector__(processor(core_i7_sse4_2),\ >> + nomask))) > > And as previously noted, this needs to be fixed to be namespace-clean - > using __nomask__, __processor__, __core_i7_sse4_2__. It seems there are no such reserved-namespace word versions now... >> +#if defined TEST_MATHVEC > > No, you can't have such conditionals on a macro in the user's namespace in > an installed header. Then we have to build vector tests with -D__FAST_MATH__ -DTEST_FAST_MATH -D_OPENMP=201307 to be sure we have needed definitions from math.h? >> diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac >> index c9f9a51..91c4cdf 100644 >> --- a/sysdeps/x86_64/configure.ac >> +++ b/sysdeps/x86_64/configure.ac >> @@ -5,6 +5,24 @@ AC_CHECK_HEADER([cpuid.h], , >> [AC_MSG_ERROR([gcc must provide the <cpuid.h> header])], >> [/* No default includes. */]) >> >> +dnl Check if compiler target is x86_64. > > Not needed. preconfigure fragments in sysdeps directories need to check > the architecture, but configure ones don't (they'll only be run for the > relevant architecture, unless one fragment explicitly sources another). Clear, then it can be done in root configure like so: +if test x"$build_mathvec" = xnotset; then + if test x"$machine" = xx86_64/64; then + build_mathvec=yes + else + build_mathvec=no + fi +fi +LIBC_CONFIG_VAR([build-mathvec], [$build_mathvec]) -- WBR, Andrew
On Mon, 27 Oct 2014, Andrew Senkevich wrote: > >> +# define __DECL_SIMD_cos __DECL_SIMD_AVX2 > >> +# define __DECL_SIMD_cosf __DECL_SIMD_SSE4 > >> +# elif defined _CILKPLUS && _CILKPLUS >= 0 > >> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */ > >> +# define __DECL_SIMD_AVX2 __attribute__((__vector__(nomask))) > >> +# define __DECL_SIMD_SSE4 __attribute__((__vector__(processor(core_i7_sse4_2),\ > >> + nomask))) > > > > And as previously noted, this needs to be fixed to be namespace-clean - > > using __nomask__, __processor__, __core_i7_sse4_2__. > > It seems there are no such reserved-namespace word versions now... Then fix the compiler to have such reserved-namespace versions and put appropriate conditionals on a fixed compiler version in the header. It's not OK to put random identifiers into an installed header like that. > >> +#if defined TEST_MATHVEC > > > > No, you can't have such conditionals on a macro in the user's namespace in > > an installed header. > > Then we have to build vector tests with -D__FAST_MATH__ > -DTEST_FAST_MATH -D_OPENMP=201307 to be sure we have needed > definitions from math.h? Yes, -D__FAST_MATH__ is used for some other libm tests. > >> diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > >> index c9f9a51..91c4cdf 100644 > >> --- a/sysdeps/x86_64/configure.ac > >> +++ b/sysdeps/x86_64/configure.ac > >> @@ -5,6 +5,24 @@ AC_CHECK_HEADER([cpuid.h], , > >> [AC_MSG_ERROR([gcc must provide the <cpuid.h> header])], > >> [/* No default includes. */]) > >> > >> +dnl Check if compiler target is x86_64. > > > > Not needed. preconfigure fragments in sysdeps directories need to check > > the architecture, but configure ones don't (they'll only be run for the > > relevant architecture, unless one fragment explicitly sources another). > > Clear, then it can be done in root configure like so: > +if test x"$build_mathvec" = xnotset; then > + if test x"$machine" = xx86_64/64; then No. Such conditionals on particular systems do not go in the toplevel configure script. If you want something for x86_64 (both -m64 and -mx32), it can go in sysdeps/x86_64/configure.ac, without any machine conditionals. And if there's a good reason (please state the reason if so) something won't work for x32, put it in sysdeps/x86_64/64/configure.ac, again with no machine conditionals (and in that case, the implementation files would also go in sysdeps/x86_64/64/ directories).
2014-10-24 1:37 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: >> +static int avx2_usable; /* Set to 1 if AVX2 supported */ > > Given that we expect multiple architectures to have vector functions, > this belongs in some architecture-specific file that libm-test.inc > includes, rather than directly in libm-test.inc (which shouldn't refer > directly to AVX at all). >which shouldn't refer directly to AVX at all Do you mean to place avx2_usable initialization in procedure in architecture-specific *.c file and have generic stub, call it from test main() and change build accordingly? May be simply stay __cpu_features.feature[index_AVX2_Usable] & bit_AVX2_Usable in every test function inserted through macros? It don't require so big changes and don't affect performance significantly. Or insert initialization in test's main() through macros also. >> -#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \ >> - EXCEPTIONS) \ >> - do \ >> - if (enable_test (EXCEPTIONS)) \ >> - { \ >> - COMMON_TEST_SETUP (ARG_STR); \ >> - check_float (test_name, FUNC (FUNC_NAME) (ARG), EXPECTED, \ >> - EXCEPTIONS); \ >> - COMMON_TEST_CLEANUP; \ >> - } \ >> +#define RUN_TEST_f_f(ARG_STR, FUNC_NAME, ARG, EXPECTED, \ >> + EXCEPTIONS) \ >> + do \ >> + if (enable_test (EXCEPTIONS)) \ >> + { \ >> + COMMON_TEST_SETUP (ARG_STR); \ >> + check_float (test_name, \ >> + CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), \ >> + FUNC (FUNC_NAME)) (ARG), \ >> + EXPECTED, \ >> + EXCEPTIONS); \ >> + COMMON_TEST_CLEANUP; \ >> + } \ > > I think it would be better for FUNC to be defined, in the test file that > includes libm-test.inc, in a way that avoids the need for the CONCAT* > calls here. (To avoid warnings / errors about undeclared functions, I > suppose the generated header might then need to redefine e.g. vec_sin to > sin if there isn't a vector version of sin.) Not good idea to change FUNC definition since it used in libm-test.c not only in test macros (so it may cause usage of vector function with not vector parameter). But it is possible to reduce number of concatenation if change generated definitions in way your have proposed. Not all functions tested trough ALL_RM_TEST - cexp, tgamma and jn tested in not all rounding modes, so definitions for them we have to generate in script manually. >> diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile >> new file mode 100644 >> index 0000000..d585fa0 >> --- /dev/null >> +++ b/sysdeps/x86_64/fpu/Makefile >> @@ -0,0 +1,33 @@ >> +ifeq ($(subdir),mathvec) >> +libmvec-support += svml_d_cos4_core svml_d_cos_data >> +endif >> + >> +# Rules for libmvec tests >> +ifeq ($(subdir),math) >> +ifneq ($(PERL),no) >> +ifeq ($(build-mathvec),yes) >> +libm-tests += test-double-vlen4 test-float-vlen8 >> + >> +CFLAGS-test-double-vlen4-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ >> + -frounding-math -mavx2 >> +CFLAGS-test-float-vlen8-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ >> + -frounding-math -mavx2 > > I think the sysdeps makefile should actually just define that double-vlen4 > and float-vlen8 are the vector lengths for which testing should take > place, with all the other testing rules being arranged in an > architecture-independent way. Do you mean to stay in sysdeps/x86_64/fpu/Makefile only CFLAGS-* definitions or to setup some variable which will be used in common makefile for build vector tests? >> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c >> new file mode 100644 >> index 0000000..0778e23 >> --- /dev/null >> +++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c > > This file may well need to be architecture-specific, at least as written, > but ... > >> diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c >> new file mode 100644 >> index 0000000..4d3d9a3 >> --- /dev/null >> +++ b/sysdeps/x86_64/fpu/test-double-vlen4.c > > ... it's not at all clear that this one should need to be. At present it > has some architecture-specific bits > >> +#define CHECK_ARCH_EXT if (!avx2_usable) return; >> + >> +extern FLOAT WRAPPER_NAME (cos) (FLOAT); > > but I'd think those are all that needs to go somewhere > architecture-specific and the rest is pretty generic to any architecture > supporting vector functions for vectors of 4 doubles. Then lets have math/test-double-vlen4.h with common definitions and sysdeps/x86_64/fpu/test-double-vlen4.c containing wrapper. Attached patch contains only discussed here changes. -- WBR, Andrew
On Wed, 29 Oct 2014, Andrew Senkevich wrote: > 2014-10-24 1:37 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: > > >> +static int avx2_usable; /* Set to 1 if AVX2 supported */ > > > > Given that we expect multiple architectures to have vector functions, > > this belongs in some architecture-specific file that libm-test.inc > > includes, rather than directly in libm-test.inc (which shouldn't refer > > directly to AVX at all). > > >which shouldn't refer directly to AVX at all > Do you mean to place avx2_usable initialization in procedure in > architecture-specific *.c file and have generic stub, call it from > test main() and change build accordingly? For example. The aim is to get something clean in accordance with glibc's design principles - such as putting things that are architecture-independent in architecture-independent places, and things that are architecture-specific in architecture-specific places, with a minimum of duplication between architectures. There may be multiple approaches that achieve that. > > I think it would be better for FUNC to be defined, in the test file that > > includes libm-test.inc, in a way that avoids the need for the CONCAT* > > calls here. (To avoid warnings / errors about undeclared functions, I > > suppose the generated header might then need to redefine e.g. vec_sin to > > sin if there isn't a vector version of sin.) > > Not good idea to change FUNC definition since it used in libm-test.c > not only in test macros (so it may cause usage of vector function with > not vector parameter). But it is possible to reduce number of > concatenation if change generated definitions in way your have > proposed. Well, maybe a preliminary refactoring patch is needed that separates FUNC into multiple macros, one for functions used in testsuite infrastructure and one for functions being tested. There are lots of RUN_TEST_* macros (I don't think we should assume that only one of them will only ever be relevant for vector tests) - it seems a bad idea for every one of them to need to repeat something so cryptic as CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), FUNC (FUNC_NAME)). > Not all functions tested trough ALL_RM_TEST - cexp, tgamma and jn > tested in not all rounding modes, so definitions for them we have to > generate in script manually. Yes, conversion of those to ALL_RM_TEST was deferred because of bugs it showed up that need fixing. And in the case of cexp, the bugs appear to be present in other functions as well, but it's not convenient to add tests for them in all cases until csin / csinh have moved to tests in auto-libm-test-in - and for that, I'm waiting for a new MPC release with last December's speedups for mpc_sin / mpc_sinh. I'm doubtful any changes to the arguments to START should be needed, but if they are, then you do indeed need to change the code for those three functions' tests manually. > >> diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile > >> new file mode 100644 > >> index 0000000..d585fa0 > >> --- /dev/null > >> +++ b/sysdeps/x86_64/fpu/Makefile > >> @@ -0,0 +1,33 @@ > >> +ifeq ($(subdir),mathvec) > >> +libmvec-support += svml_d_cos4_core svml_d_cos_data > >> +endif > >> + > >> +# Rules for libmvec tests > >> +ifeq ($(subdir),math) > >> +ifneq ($(PERL),no) > >> +ifeq ($(build-mathvec),yes) > >> +libm-tests += test-double-vlen4 test-float-vlen8 > >> + > >> +CFLAGS-test-double-vlen4-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ > >> + -frounding-math -mavx2 > >> +CFLAGS-test-float-vlen8-wrapper.c = -fno-inline -ffloat-store -fno-builtin \ > >> + -frounding-math -mavx2 > > > > I think the sysdeps makefile should actually just define that double-vlen4 > > and float-vlen8 are the vector lengths for which testing should take > > place, with all the other testing rules being arranged in an > > architecture-independent way. > > Do you mean to stay in sysdeps/x86_64/fpu/Makefile only CFLAGS-* > definitions or to setup some variable which will be used in common > makefile for build vector tests? Only libmvec-support, and a variable containing "double-vlen4 float-vlen8" or similar as a list of vector formats for which to run tests, and a variable containing "-mavx2" as compiler options for building vector tests (all the other options there should be architecture-independent and defined only once in a variable in math/Makefile).
2014-10-29 21:50 GMT+03:00 Joseph S. Myers <joseph@codesourcery.com>: >> > I think it would be better for FUNC to be defined, in the test file that >> > includes libm-test.inc, in a way that avoids the need for the CONCAT* >> > calls here. (To avoid warnings / errors about undeclared functions, I >> > suppose the generated header might then need to redefine e.g. vec_sin to >> > sin if there isn't a vector version of sin.) >> >> Not good idea to change FUNC definition since it used in libm-test.c >> not only in test macros (so it may cause usage of vector function with >> not vector parameter). But it is possible to reduce number of >> concatenation if change generated definitions in way your have >> proposed. > > Well, maybe a preliminary refactoring patch is needed that separates FUNC > into multiple macros, one for functions used in testsuite infrastructure > and one for functions being tested. > > There are lots of RUN_TEST_* macros (I don't think we should assume that > only one of them will only ever be relevant for vector tests) - it seems a > bad idea for every one of them to need to repeat something so cryptic as > CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), FUNC (FUNC_NAME)). But it is already old code, yesterday's patch looks so in this place: FUNC_TEST (FUNC_NAME) (ARG) -- WBR, Andrew
On Thu, 30 Oct 2014, Andrew Senkevich wrote: > > Well, maybe a preliminary refactoring patch is needed that separates FUNC > > into multiple macros, one for functions used in testsuite infrastructure > > and one for functions being tested. > > > > There are lots of RUN_TEST_* macros (I don't think we should assume that > > only one of them will only ever be relevant for vector tests) - it seems a > > bad idea for every one of them to need to repeat something so cryptic as > > CONCAT (CONCAT3_1 (VEC_PREFIX_, FUNC_NAME, FUNC ( )), FUNC (FUNC_NAME)). > > But it is already old code, yesterday's patch looks so in this place: > FUNC_TEST (FUNC_NAME) (ARG) As I said, *preliminary refactoring patch*. Long sequences of variations on the same patch aren't helpful; if you find yourself sending them, you need to step back and think very carefully about how to restructure the submission to make things as clear and as easy to review as possible. That includes separating out any pieces, large or small, that are reasonably separable and can be justified on their own merits. Having separated them, you then need to make *self-contained* submissions (including all relevant rationale and background), and ping those submissions weekly as needed (I haven't seen any pings of the binutils version requirement patch). And please keep the state for your own patches in patchwork.sourceware.org clean; I see six entries there with the same description "[RFC] How to add vector math functions to Glibc", when there should be at most one. If you do need to make multiple submissions of successive versions of the same patch, consider the submission style where each submission contains both the full self-contained description and rationale (that would go in the git log message) and a separate description of what has changed relative to the previous patch version (and number each patch version).
Also, I don't see you in copyright.list, so unless you're covered by a corporate copyright assignment for glibc you should start work on completing the paperwork.
2014-10-30 23:07 GMT+03:00 Joseph S. Myers <joseph@codesourcery.com>: > Also, I don't see you in copyright.list, so unless you're covered by a > corporate copyright assignment for glibc you should start work on > completing the paperwork. Paperwork in progress. -- WBR, Andrew
Hi, Joseph, 2014-10-24 1:37 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: > On Thu, 23 Oct 2014, Andrew Senkevich wrote: >> diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h >> index 8a94a7e..2d31a11 100644 >> --- a/math/bits/mathcalls.h >> +++ b/math/bits/mathcalls.h >> @@ -60,6 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x)); >> __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x)); >> >> /* Cosine of X. */ >> +#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos >> +__DECL_SIMD_cos >> +#endif >> +#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf >> +__DECL_SIMD_cosf >> +#endif >> +#if defined _Mlong_double_ && defined __DECL_SIMD_cosl >> +__DECL_SIMD_cosl >> +#endif >> __MATHCALL (cos,, (_Mdouble_ __x)); > > As previously noted, I think it would be much better if the definition of > __MATHCALL can include all the conditional bits (possibly through a > generated header that defines __DECL_SIMD_cos etc. to empty if not defined > by bits/math-vector.h). proposal is to use separated __MATHCALL_VEC for vector cases, because it reduces number of needed empty definitions and can be simply generated (__MATHCALL case requires a lot of manual search to obtain all affected function names because of redefinitions in some files). >> +#if defined __x86_64__ && defined __FAST_MATH__ >> +# if defined _OPENMP && _OPENMP >= 201307 >> +/* OpenMP case. */ >> +# define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch") >> +# define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch") > > Of course we still need the API/ABI documentation providing the stable > guarantee about exactly what this pragma means regarding the function > versions it is saying are available in glibc. We will follow-up on this soon. I attached patch with almost all infrastructure fixes discussed before. It seems pragma meaning and data tables remain to be done. Patch affects a lot of files and of course will be separated to minimal disjoint parts for submission later. -- WBR, Andrew
2014-11-06 23:51 GMT+03:00 Andrew Senkevich <andrew.n.senkevich@gmail.com>: > Hi, Joseph, > > 2014-10-24 1:37 GMT+04:00 Joseph S. Myers <joseph@codesourcery.com>: > >> On Thu, 23 Oct 2014, Andrew Senkevich wrote: > >>> diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h >>> index 8a94a7e..2d31a11 100644 >>> --- a/math/bits/mathcalls.h >>> +++ b/math/bits/mathcalls.h >>> @@ -60,6 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x)); >>> __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x)); >>> >>> /* Cosine of X. */ >>> +#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos >>> +__DECL_SIMD_cos >>> +#endif >>> +#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf >>> +__DECL_SIMD_cosf >>> +#endif >>> +#if defined _Mlong_double_ && defined __DECL_SIMD_cosl >>> +__DECL_SIMD_cosl >>> +#endif >>> __MATHCALL (cos,, (_Mdouble_ __x)); >> >> As previously noted, I think it would be much better if the definition of >> __MATHCALL can include all the conditional bits (possibly through a >> generated header that defines __DECL_SIMD_cos etc. to empty if not defined >> by bits/math-vector.h). > > proposal is to use separated __MATHCALL_VEC for vector cases, because > it reduces number of needed empty definitions and can be simply > generated (__MATHCALL case requires a lot of manual search to obtain > all affected function names because of redefinitions in some files). > >>> +#if defined __x86_64__ && defined __FAST_MATH__ >>> +# if defined _OPENMP && _OPENMP >= 201307 >>> +/* OpenMP case. */ >>> +# define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch") >>> +# define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch") >> >> Of course we still need the API/ABI documentation providing the stable >> guarantee about exactly what this pragma means regarding the function >> versions it is saying are available in glibc. > > We will follow-up on this soon. > > I attached patch with almost all infrastructure fixes discussed > before. It seems pragma meaning and data tables remain to be done. > Patch affects a lot of files and of course will be separated to > minimal disjoint parts for submission later. Here is the patch updated in part of data table and function code accordingly points mentioned before in this discussion. -- WBR, Andrew
On Fri, 14 Nov 2014, Andrew Senkevich wrote: > +#define __SIMD_DECL(function) __CONCAT(__DECL_SIMD_,function) > + > +#define __MATHCALL_VEC(function,suffix, args) \ > + __SIMD_DECL(__MATH_PRECNAME(function,suffix)) \ > + __MATHCALL(function,suffix, args) Generally, throughout the patch, use GNU style: spaces before open parentheses for calls to functions and function-like macros (not of course in "#define func(args)" where C syntax doesn't allow that space) and after commas. > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > index c9f9a51..0b73d5b 100644 > --- a/sysdeps/x86_64/configure.ac > +++ b/sysdeps/x86_64/configure.ac > @@ -99,6 +99,15 @@ if test $libc_cv_cc_avx2 = yes; then > fi > LIBC_CONFIG_VAR([config-cflags-avx2], [$libc_cv_cc_avx2]) > > +if test x"$build_mathvec" = xnotset; then > + if test x"$base_machine" = xx86_64; then No need for the base_machine test here; this configure fragment will never be called for non-x86_64 machines. It's only preconfigure fragments that need to check for an applicable machine, not configure ones. > +LIBC_CONFIG_VAR([build-mathvec], [$build_mathvec]) I think the LIBC_CONFIG_VAR call belongs in the toplevel configure script (after the sysdeps configure fragments have been run) - as does setting build_mathvec to "no" if it's still "notset" after running the sysdeps configure fragments.
Hi Joseph, attached patch now contain versions of vector cos in SSE4, AVX and AVX2 ISAs. Because of both AVX and AVX2 versions have vector length 4 there are some changes in tests - I put AVX2 make rules in sysdeps makefile and its name changed to test-double-vlen4-avx2 , AVX test stay with old name, in ULPs file specified both versions. If everything is okey, let me know and I will prepare separated patches while document about pragma meaning is preparing (we plan to add it at last steps). -- WBR, Andrew
On Tue, 18 Nov 2014, Andrew Senkevich wrote: > Hi Joseph, > > attached patch now contain versions of vector cos in SSE4, AVX and AVX2 ISAs. > > Because of both AVX and AVX2 versions have vector length 4 there are > some changes in tests - I put AVX2 make rules in sysdeps makefile and > its name changed to test-double-vlen4-avx2 , AVX test stay with old > name, in ULPs file specified both versions. > > If everything is okey, let me know and I will prepare separated > patches while document about pragma meaning is preparing (we plan to > add it at last steps). The overall approach seems reasonable. I fully expect further revisions to be needed to some of the individual patches once they are submitted.
diff --git a/math/libm-test.inc b/math/libm-test.inc index f86a4fa..9ddb77e 100644 --- a/math/libm-test.inc +++ b/math/libm-test.inc @@ -684,7 +684,7 @@ static void test_single_errno (const char *test_name, int errno_value, int expected_value, const char *expected_name) { -#ifndef TEST_INLINE +#if !defined TEST_INLINE && !defined TEST_MATHVEC if (errno_value == expected_value) { if (print_screen (1)) @@ -1691,8 +1691,9 @@ struct test_fFF_11_data ROUND_RESTORE_ ## ROUNDING_MODE /* Start and end the tests for a given function. */ -#define START(FUNC, EXACT) \ - const char *this_func = #FUNC; \ +#define STR_CON(x,y) __STRING(x##y) +#define START(FUNC, SUFF, EXACT) \ + const char *this_func = STR_CON (FUNC, SUFF); \ init_max_error (this_func, EXACT) #define END \ print_max_error (this_func) @@ -1705,28 +1706,28 @@ struct test_fFF_11_data { \ do \ { \ - START (FUNC, EXACT); \ + START (FUNC, , EXACT); \ LOOP_MACRO (FUNC, ARRAY, , ## __VA_ARGS__); \ END_MACRO; \ } \ while (0); \ do \ { \ - START (FUNC ## _downward, EXACT); \ + START (FUNC, _downward, EXACT); \ LOOP_MACRO (FUNC, ARRAY, FE_DOWNWARD, ## __VA_ARGS__); \ END_MACRO; \ } \ while (0); \ do \ { \ - START (FUNC ## _towardzero, EXACT); \ + START (FUNC, _towardzero, EXACT); \ LOOP_MACRO (FUNC, ARRAY, FE_TOWARDZERO, ## __VA_ARGS__); \ END_MACRO; \ } \ while (0); \ do \ { \ - START (FUNC ## _upward, EXACT); \ + START (FUNC, _upward, EXACT); \ LOOP_MACRO (FUNC, ARRAY, FE_UPWARD, ## __VA_ARGS__); \ END_MACRO; \ } \ @@ -6034,7 +6035,7 @@ static const struct test_c_c_data cexp_test_data[] = static void cexp_test (void) { - START (cexp, 0); + START (cexp, , 0); RUN_TEST_LOOP_c_c (cexp, cexp_test_data, ); END_COMPLEX; } @@ -6247,7 +6248,7 @@ copysign_test (void) static const struct test_f_f_data cos_test_data[] = - { + { TEST_f_f (cos, plus_infty, qnan_value, INVALID_EXCEPTION|ERRNO_EDOM), TEST_f_f (cos, minus_infty, qnan_value, INVALID_EXCEPTION|ERRNO_EDOM), TEST_f_f (cos, qnan_value, qnan_value, NO_INEXACT_EXCEPTION|ERRNO_UNCHANGED), @@ -6255,9 +6256,14 @@ static const struct test_f_f_data cos_test_data[] = AUTO_TESTS_f_f (cos), }; +#ifndef CHECKARCH +# define CHECKARCH +#endif + static void cos_test (void) { + CHECKARCH ALL_RM_TEST (cos, 0, cos_test_data, RUN_TEST_LOOP_f_f, END); } @@ -7548,7 +7554,7 @@ static const struct test_if_f_data jn_test_data[] = static void jn_test (void) { - START (jn, 0); + START (jn, , 0); RUN_TEST_LOOP_if_f (jn, jn_test_data, ); END; } @@ -9374,7 +9380,7 @@ static const struct test_f_f_data tgamma_test_data[] = static void tgamma_test (void) { - START (tgamma, 0); + START (tgamma, , 0); RUN_TEST_LOOP_f_f (tgamma, tgamma_test_data, ); END; } @@ -9824,6 +9830,12 @@ main (int argc, char **argv) initialize (); printf (TEST_MSG); + /* Vector trigonometric functions: */ +#ifdef TEST_MATHVEC + + cos_test (); + +#else check_ulp (); /* Keep the tests a wee bit ordered (according to ISO C99). */ @@ -9960,6 +9972,7 @@ main (int argc, char **argv) y0_test (); y1_test (); yn_test (); +#endif if (output_ulps) fclose (ulps_file); diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps index 36e1b76..0e11cd5 100644 --- a/sysdeps/x86_64/fpu/libm-test-ulps +++ b/sysdeps/x86_64/fpu/libm-test-ulps @@ -905,6 +905,12 @@ idouble: 1 ildouble: 2 ldouble: 2 + +Function: "vlen4_cos": +double: 1 + Function: "cosh": double: 1 float: 1 diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c new file mode 100644 index 0000000..35e130e --- /dev/null +++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrapper.c @@ -0,0 +1,40 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define FLOAT double + +// Wrapper from scalar to vector function implemented in AVX2. +#define VECTOR_WRAPPER(scalar_func,vector_func) \ +extern __m256d vector_func(__m256d); \ +FLOAT scalar_func(FLOAT x)\ +{\ + int i;\ + __m256d mx = _mm256_set1_pd(x);\ + __m256d mr = vector_func(mx);\ + for(i=1;i<4;i++)\ + {\ + if (((FLOAT*)&mr)[0]!=((FLOAT*)&mr)[i])\ + {\ + return ((FLOAT*)&mr)[0]+0.1;\ + }\ + }\ + return ((FLOAT*)&mr)[0];\ +} + +#include <immintrin.h> + +VECTOR_WRAPPER(vlen4_cos,_ZGVdN4v_cos) diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c new file mode 100644 index 0000000..ce40c04 --- /dev/null +++ b/sysdeps/x86_64/fpu/test-double-vlen4.c @@ -0,0 +1,44 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define FUNC(function) function +#define FLOAT double +#define TEST_MSG "testing double vector math (without inline functions)\n" +#define MATHCONST(x) x +#define CHOOSE(Clongdouble,Cdouble,Cfloat,Cinlinelongdouble,Cinlinedouble,Cinlinefloat) Cdouble +#define PRINTF_EXPR "e" +#define PRINTF_XEXPR "a" +#define PRINTF_NEXPR "f" +#define TEST_DOUBLE 1 + +#ifndef __NO_MATH_INLINES +# define __NO_MATH_INLINES +#endif + +#define TEST_MATHVEC +#define EXCEPTION_TESTS_double 0 +#define ROUNDING_TESTS_double(MODE) ((MODE) == FE_TONEAREST) + +#define cos vlen4_cos + +#include <init-arch.h> + +#define CHECKARCH \ +__init_cpu_features();\ +if (__cpu_features.feature[index_AVX2_Usable] & bit_AVX2_Usable) +