[RFC] How to add vector math functions to Glibc

Message ID	CAMXFM3vOLspQtHxgJfD_Emht480w2RMbiwnEH6A_LhoS-JZFag@mail.gmail.com
State	New, archived
Headers	Received: (qmail 23041 invoked by alias); 30 Sep 2014 15:00:54 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 23025 invoked by uid 89); 30 Sep 2014 15:00:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.2 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS, URIBL_BLACK autolearn=no version=3.3.2 X-HELO: mail-qc0-f174.google.com X-Received: by 10.224.79.67 with SMTP id o3mr12890341qak.103.1412089245281; Tue, 30 Sep 2014 08:00:45 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <54257507.9070508@redhat.com> References: <CAMXFM3tjquzniXP1weqxSVFJyhXqsf2PHuyrrrmqp7K0ZzORqA@mail.gmail.com> <CAMXFM3sGMNX1DEPAMt7qUR4UREF_xUAQjCG1OjBiZH2aoOFiPA@mail.gmail.com> <Pine.LNX.4.64.1409181551370.31607@digraph.polyomino.org.uk> <CAMXFM3tO7MTYCq8-YFZacdbLvR4iAab_n04AuB+rp2Phs4BvQg@mail.gmail.com> <Pine.LNX.4.64.1409242011260.7597@digraph.polyomino.org.uk> <CAMXFM3tqiqUNuSU2KXvAFM-QescX3+6xUO9=z5X0Ac6C9qJ7zg@mail.gmail.com> <CAMe9rOq7bZHb8R=opUzSmAMGWjLpX21mR=Sx96cuBph=TTtDXA@mail.gmail.com> <54246CB5.7020908@redhat.com> <CAMe9rOoLmJ2jHWmERoB0M83WNKovJOgh0--Kquw9O86A1tPU0g@mail.gmail.com> <5424733D.6010305@redhat.com> <CAMe9rOpacze055qyBFzz3M-b-GNtXCqZzMmkScBL9a94zVj28g@mail.gmail.com> <54247FAB.6050002@redhat.com> <CAMXFM3v8narOLMHC5U=fvyTFWV6s4ZACN-UrAC4fAcUs9SOFfA@mail.gmail.com> <54257507.9070508@redhat.com> From: Andrew Senkevich <andrew.n.senkevich@gmail.com> Date: Tue, 30 Sep 2014 19:00:14 
+0400 Message-ID: <CAMXFM3vOLspQtHxgJfD_Emht480w2RMbiwnEH6A_LhoS-JZFag@mail.gmail.com> Subject: Re: [RFC] How to add vector math functions to Glibc To: "Carlos O'Donell" <carlos@redhat.com> Cc: "Joseph S. Myers" <joseph@codesourcery.com>, libc-alpha <libc-alpha@sourceware.org> Content-Type: text/plain; charset=UTF-8

Commit Message

Andrew Senkevich Sept. 30, 2014, 3 p.m. UTC

  2014-09-26 18:15 GMT+04:00 Carlos O'Donell <carlos@redhat.com>:
> On 09/26/2014 09:45 AM, Andrew Senkevich wrote:
>> So let's discuss Glibc build changes.
>> The build of libmvec (and hence the libm.so installation) needs to be
>> architecture dependent and optional, and some changes were already
>> discussed in https://sourceware.org/ml/libc-alpha/2014-09/msg00578.html.
>> Is it OK to additionally have a configure option --enable-mathvec with
>> default=no, and with default=yes for x86_64 builds?
>
> Under what circumstances would a non-x86_64 target build with
> --enable-mathvec?
>
> When they have their own API/ABI standard to implement and provide
> in libmvec.so?
>
> What's wrong with simply producing a libmvec.so that has no public
> symbols?
>
> It simplifies everything to just always ship libmvec.so, even if
> it's empty.

Based on the previous discussion, we now have the following changes:



--
WBR,
Andrew

Comments

Andreas Schwab Sept. 30, 2014, 3:44 p.m. UTC | #1

Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:

> diff --git a/configure.ac b/configure.ac
> index 82d0896..c5c1758 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -903,7 +903,7 @@ LIBC_PROG_BINUTILS
>  # Accept binutils 2.20 or newer.
>  AC_CHECK_PROG_VER(AS, $AS, --version,
>    [GNU assembler.* \([0-9]*\.[0-9.]*\)],
> -  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
> critic_missing="$critic_missing as")
> +  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=:
> critic_missing="$critic_missing as")

What are you trying to do here?  That doesn't look correct.

Andreas.

Andrew Senkevich Sept. 30, 2014, 3:52 p.m. UTC | #2

2014-09-30 19:44 GMT+04:00 Andreas Schwab <schwab@suse.de>:
> Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:
>
>> diff --git a/configure.ac b/configure.ac
>> index 82d0896..c5c1758 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -903,7 +903,7 @@ LIBC_PROG_BINUTILS
>>  # Accept binutils 2.20 or newer.
>>  AC_CHECK_PROG_VER(AS, $AS, --version,
>>    [GNU assembler.* \([0-9]*\.[0-9.]*\)],
>> -  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
>> critic_missing="$critic_missing as")
>> +  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=:
>> critic_missing="$critic_missing as")
>
> What are you trying to do here?  That doesn't look correct.

It is an update of the minimum required binutils version to 2.22.
And of course the comment needs to be updated as well...


--
WBR,
Andrew

Andreas Schwab Sept. 30, 2014, 4:16 p.m. UTC | #3

Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:

> 2014-09-30 19:44 GMT+04:00 Andreas Schwab <schwab@suse.de>:
>> Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:
>>
>>> diff --git a/configure.ac b/configure.ac
>>> index 82d0896..c5c1758 100644
>>> --- a/configure.ac
>>> +++ b/configure.ac
>>> @@ -903,7 +903,7 @@ LIBC_PROG_BINUTILS
>>>  # Accept binutils 2.20 or newer.
>>>  AC_CHECK_PROG_VER(AS, $AS, --version,
>>>    [GNU assembler.* \([0-9]*\.[0-9.]*\)],
>>> -  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
>>> critic_missing="$critic_missing as")
>>> +  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=:
>>> critic_missing="$critic_missing as")
>>
>> What are you trying to do here?  That doesn't look correct.
>
> It is update of minimum required version of binutils to 2.22.

Along with excluding 2.30, 2.31, 2.40, 2.41 ...

Andreas.

Andrew Senkevich Sept. 30, 2014, 4:30 p.m. UTC | #4

2014-09-30 20:16 GMT+04:00 Andreas Schwab <schwab@suse.de>:
> Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:
>
>> 2014-09-30 19:44 GMT+04:00 Andreas Schwab <schwab@suse.de>:
>>> Andrew Senkevich <andrew.n.senkevich@gmail.com> writes:
>>>
>>>> diff --git a/configure.ac b/configure.ac
>>>> index 82d0896..c5c1758 100644
>>>> --- a/configure.ac
>>>> +++ b/configure.ac
>>>> @@ -903,7 +903,7 @@ LIBC_PROG_BINUTILS
>>>>  # Accept binutils 2.20 or newer.
>>>>  AC_CHECK_PROG_VER(AS, $AS, --version,
>>>>    [GNU assembler.* \([0-9]*\.[0-9.]*\)],
>>>> -  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
>>>> critic_missing="$critic_missing as")
>>>> +  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=:
>>>> critic_missing="$critic_missing as")
>>>
>>> What are you trying to do here?  That doesn't look correct.
>>
>> It is update of minimum required version of binutils to 2.22.
>
> Along with excluding 2.30, 2.31, 2.40, 2.41 ...

Yes, thank you, it needs to be fixed:

-  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
+  [2.1[0-9][0-9]*|2.2[2-9]*|2.[3-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:

The same check also needs to be added for the ld version...
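
The patterns above are shell `case` globs, not regular expressions, so `2.[2-9][2-9]*` reads as "2., a digit 2-9, another digit 2-9, then anything". A candidate pattern can be checked against a few version strings before configure.ac is touched. The sketch below assumes that POSIX fnmatch() matches these simple patterns the same way the shell does; the helper name version_accepted and the two pattern constants are illustrative only and are not part of the patch.

```c
#include <fnmatch.h>
#include <string.h>

/* Pattern from the first version of the patch.  */
static const char first_attempt[] =
  "2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*";

/* Pattern from the follow-up fix.  */
static const char corrected[] =
  "2.1[0-9][0-9]*|2.2[2-9]*|2.[3-9][0-9]*|[3-9].*|[1-9][0-9]*";

/* Return 1 if VERSION matches any of the '|'-separated glob patterns in
   ALTS, mimicking how a shell `case` tries each alternative in turn.  */
static int
version_accepted (const char *alts, const char *version)
{
  char pat[128];

  while (*alts != '\0')
    {
      size_t len = strcspn (alts, "|");

      if (len < sizeof pat)
        {
          memcpy (pat, alts, len);
          pat[len] = '\0';
          if (fnmatch (pat, version, 0) == 0)
            return 1;
        }
      alts += len;
      if (*alts == '|')
        alts++;
    }
  return 0;
}
```

With these definitions, version_accepted (first_attempt, "2.30") is 0, which is the problem Andreas pointed out, while the corrected pattern accepts 2.22 and 2.30 and still rejects 2.21.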


--
WBR,
Andrew

Joseph Myers Sept. 30, 2014, 4:35 p.m. UTC | #5

On Tue, 30 Sep 2014, Andrew Senkevich wrote:

> diff --git a/configure.ac b/configure.ac
> index 82d0896..c5c1758 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -903,7 +903,7 @@ LIBC_PROG_BINUTILS
>  # Accept binutils 2.20 or newer.
>  AC_CHECK_PROG_VER(AS, $AS, --version,
>    [GNU assembler.* \([0-9]*\.[0-9.]*\)],
> -  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=:
> critic_missing="$critic_missing as")
> +  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=:
> critic_missing="$critic_missing as")
>  AC_CHECK_PROG_VER(LD, $LD, --version,
>    [GNU ld.* \([0-9][0-9]*\.[0-9.]*\)],
>    [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], LD=:
> critic_missing="$critic_missing ld")

Any change to required versions needs to include an update to install.texi 
(and the generated INSTALL file).  It should also be proposed in a 
separate thread whose subject describes what is being proposed.

> +# We need to install libm.so as linker script
> +# for more comfortable use of vector math library.
> +subdir_install: $(inst_libdir)/libm.so
> +
> +$(inst_libdir)/libm.so: $(common-objpfx)format.lds \
> + $(common-objpfx)math/libm.so$(libm.so-version) \
> + $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
> + $(+force)
> + (echo '/* GNU ld script */';\
> + cat $<; \
> + echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
> + 'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
> + ) > $@.new
> + mv -f $@.new $@

Do you have ordering issues here?  It seems bad for math/ to install a 
direct symlink and then mathvec/ to change it to something else - all 
installation rules for libm should be in the math/ directory.

Do you need to link libmvec against libm (and if so, I'd expect associated 
Makefile rules, and maybe a Depend file to ensure the directories are 
built in the right order)?

Also, I'm not sure about the empty libmvec option for unsupported
architectures when we consider the case where functions require GCC or
binutils versions newer than we wish to require, so they are optional on
some architecture.  I think having libmvec built or not built on that
architecture, depending on the tools installed, is better than possibly
having it built but empty if the tools are too old.

> diff --git a/sysdeps/unix/sysv/linux/shlib-versions
> b/sysdeps/unix/sysv/linux/shlib-versions
> index 9160557..4a32c8a 100644
> --- a/sysdeps/unix/sysv/linux/shlib-versions
> +++ b/sysdeps/unix/sysv/linux/shlib-versions
> @@ -1,2 +1,3 @@
>  libm=6
>  libc=6
> +libmvec=1

There is nothing Linux-specific about this library, so the toplevel 
shlib-versions seems better.

Did the patch pass the testsuite?  If so, you have a problem - you didn't 
add ABI test baselines for this library (in this version, a default empty 
baseline, and one in sysdeps/unix/sysv/linux/x86_64), so the ABI tests 
should have failed, and you need to find out why they didn't run for this 
library, and fix that.  If it failed for lack of ABI test baselines, add 
them.

> +#if defined __x86_64__ && defined __FAST_MATH__
> +# define __DECL_SIMD_AVX2
> +# define __DECL_SIMD_SSE4

I don't see the need for this initial define to empty and subsequent 
#undef.  Except you should probably have comments explaining exactly what 
these macros mean in terms of what function versions they define to be 
available.

> +# if defined _OPENMP && _OPENMP >= 201307
> +/* OpenMP case. */
> +#  undef __DECL_SIMD_AVX2
> +#  undef __DECL_SIMD_SSE4
> +#  define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
> +#  define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")

I think there should be a comment pointing to the ABI/API documentation 
that says what function versions this pragma defines to be available and 
guaranteeing that it will not be redefined to e.g. say that AVX512 is 
available so that existing headers will work with future compilers (but 
another pragma will be needed if in future AVX512 versions are added).
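
For readers unfamiliar with the pragma, a minimal sketch of what it expresses may help. The names MY_DECL_SIMD, halve and halve_all are hypothetical stand-ins, not taken from the patch. The point is that the bare pragma says "vector variants of this function exist" without naming a vector ISA, which is why the comment above asks for a documented ABI stating exactly which variants a header may promise. In the patch, the promised variant is the symbol _ZGVdN4v_cos; as I read the x86_64 vector function naming, that is the AVX2 ('d'), unmasked ('N'), 4-lane variant with one vector argument ('v').

```c
/* Hypothetical stand-in for the patch's __DECL_SIMD_* macros.  It expands
   to nothing unless the compiler implements OpenMP 4.0 or later.  */
#if defined _OPENMP && _OPENMP >= 201307
# define MY_DECL_SIMD _Pragma ("omp declare simd notinbranch")
#else
# define MY_DECL_SIMD
#endif

/* Declaration as a header would carry it.  An OpenMP compiler seeing
   this may replace calls in vectorized loops with calls to vector
   variants of halve.  */
MY_DECL_SIMD
double halve (double x);

/* Scalar definition.  Repeating the pragma here asks an OpenMP compiler
   to generate the vector variants itself; in the libmvec case they would
   instead be supplied by the library.  */
MY_DECL_SIMD
double
halve (double x)
{
  return x * 0.5;
}

/* A loop in which the compiler is allowed to use the vector variants.  */
void
halve_all (const double *in, double *out, int n)
{
#if defined _OPENMP && _OPENMP >= 201307
# pragma omp simd
#endif
  for (int i = 0; i < n; i++)
    out[i] = halve (in[i]);
}
```

Built without OpenMP support the macro is empty and the code is plain scalar C, which mirrors how the header in the patch degrades.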

> +# elif defined _CILKPLUS && _CILKPLUS >= 0
> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */
> +#  undef __DECL_SIMD_AVX2
> +#  undef __DECL_SIMD_SSE4
> +#  define __DECL_SIMD_AVX2 __attribute__((vector (nomask)))
> +#  define __DECL_SIMD_SSE4 __attribute__((vector (processor(core_i7_sse4_2), \
> +  nomask)))

To be namespace-clean, you have to use reserved-namespace versions of 
attributes.  That is, __vector__, __nomask__, __processor__ and 
__core_i7_sse4_2__.
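
A short sketch of why the reserved spelling matters. Since the Cilk Plus `vector` attribute is not available in every compiler, the example uses the common GCC/Clang attribute `unused` as a stand-in; the mechanism is the same for any attribute name. The function name helper is illustrative only.

```c
/* A user program may define any non-reserved identifier as a macro
   before it includes a system header.  */
#define unused 1

/* Had the header been written with __attribute__ ((unused)), the line
   below would now expand to __attribute__ ((1)), which is not a valid
   attribute.  The double-underscore spelling lies in the namespace
   reserved for the implementation, so the user's macro cannot touch it.  */
static int helper (int x) __attribute__ ((__unused__));

static int
helper (int x)
{
  return x + unused;   /* The user's macro still works as intended.  */
}
```

The same reasoning applies to `vector`, `nomask`, `processor` and `core_i7_sse4_2`, all of which are ordinary identifiers a program could legitimately define.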

> + .align 64
> + .globl __gnu_svml_dcos_data
> +__gnu_svml_dcos_data:
> + .long 4294967295

What are the semantics of the values in this table (please add a comment)?  
How was this table generated?

> + .type __gnu_svml_dcos_data,@object
> + .size __gnu_svml_dcos_data,1600

.size __gnu_svml_dcos_data,.-__gnu_svml_dcos_data

seems better than hardcoding another magic number for the size here.

Christoph Lauter Sept. 30, 2014, 6:40 p.m. UTC | #6

Hi all,

just 2cts from someone who wrote a couple of libm functions already in
his life:

Joseph S. Myers wrote on 30/09/2014 18:35:

>> +# if defined _OPENMP && _OPENMP >= 201307
>> +/* OpenMP case. */
>> +#  undef __DECL_SIMD_AVX2
>> +#  undef __DECL_SIMD_SSE4
>> +#  define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
>> +#  define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")
>
> I think there should be a comment pointing to the ABI/API documentation
> that says what function versions this pragma defines to be available and
> guaranteeing that it will not be redefined to e.g. say that AVX512 is
> available so that existing headers will work with future compilers (but
> another pragma will be needed if in future AVX512 versions are added).
>

Yeah, the ABI/API is not quite self-documenting with functions declared 
as follows:

Andrew Senkevich wrote on 30/09/2014 17:00:
> +#include <sysdep.h>
> +
> + .text
> +ENTRY(_ZGVdN4v_cos)
> +
> +/* ALGORITHM DESCRIPTION:
> + *
> + *    ( low accuracy ( < 4ulp ) or enhanced performance ( half of
> correct mantissa ) implementation )
> + *
> + *    Argument representation:
> + *    arg + Pi/2 = (N*Pi + R)
> + *
> + *    Result calculation:
> + *    cos(arg) = sin(arg+Pi/2) = sin(N*Pi + R) = (-1)^N * sin(R)
> + *    sin(R) is approximated by corresponding polynomial
> + */
> +        pushq     %rbp
> +        movq      %rsp, %rbp
> +        andq      $-64, %rsp
> +        subq      $448, %rsp
> +        movq      __gnu_svml_dcos_data@GOTPCREL(%rip), %rax
> +        vmovapd   %ymm0, %ymm1
> +        vmovupd   192(%rax), %ymm4
> +        vmovupd   256(%rax), %ymm5
> +

Of course, there are comments in the code about how the algorithm works,
but the code is mainly assembly with lots of magic numbers everywhere.
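
For comparison, the reduction described in the quoted comment can be written in a few lines of scalar C. This is only a sketch under stated assumptions: the function name cos_by_reduction is made up, a truncated Taylor series stands in for the patch's tuned polynomial, and the naive reduction is only reasonable for moderate |x|. It is not the code from the patch.

```c
/* Scalar sketch of cos(arg) = sin(arg + Pi/2) = (-1)^N * sin(R),
   where arg + Pi/2 = N*Pi + R and R lies in [-Pi/2, Pi/2].  */
double
cos_by_reduction (double x)
{
  const double pi = 3.14159265358979323846;
  double y = x + pi / 2;                              /* arg + Pi/2 */
  long n = (long) (y / pi + (y >= 0 ? 0.5 : -0.5));   /* N = round (y/Pi) */
  double r = y - (double) n * pi;                     /* R */
  double r2 = r * r;

  /* sin(R) by its Taylor series up to R^13, evaluated in Horner form.
     A real implementation would use coefficients fitted to the
     interval instead.  */
  double s = r * (1.0 + r2 * (-1.0 / 6
                 + r2 * (1.0 / 120
                 + r2 * (-1.0 / 5040
                 + r2 * (1.0 / 362880
                 + r2 * (-1.0 / 39916800
                 + r2 * (1.0 / 6227020800.0)))))));

  return (n % 2 != 0) ? -s : s;                       /* (-1)^N * sin(R) */
}
```

Every constant here has an obvious meaning, which is the property the review is asking the assembly version and its data table to have as well.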

Frankly speaking, I have trouble seeing the difference between that code 
and a binary blob. Yes, this last remark is polemic.


>> +# elif defined _CILKPLUS && _CILKPLUS >= 0
>> +/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */
>> +#  undef __DECL_SIMD_AVX2
>> +#  undef __DECL_SIMD_SSE4
>> +#  define __DECL_SIMD_AVX2 __attribute__((vector (nomask)))
>> +#  define __DECL_SIMD_SSE4 __attribute__((vector (processor(core_i7_sse4_2), \
>> +  nomask)))
>
> To be namespace-clean, you have to use reserved-namespace versions of
> attributes.  That is, __vector__, __nomask__, __processor__ and
> __core_i7_sse4_2__.
>
>> + .align 64
>> + .globl __gnu_svml_dcos_data
>> +__gnu_svml_dcos_data:
>> + .long 4294967295
>
> What are the semantics of the values in this table (please add a comment)?
> How was this table generated?
>

Yeah, who codes floating-point values as (little-endian ?) memory 
notation in decimal? I would understand hexadecimal but decimal?

As is, the code is unmaintainable.

>> + .type __gnu_svml_dcos_data,@object
>> + .size __gnu_svml_dcos_data,1600
>
> .size __gnu_svml_dcos_data,.-__gnu_svml_dcos_data
>
> seems better than hardcoding another magic number for the size here.
>

Yeah, so in conclusion: is there any technical rationale why a compiler
couldn't produce vectorized libm functions suitable for the purpose of
gcc/cilk integration?

Best Regards,

Christoph Lauter

Joseph Myers Sept. 30, 2014, 8:15 p.m. UTC | #7

On Tue, 30 Sep 2014, Christoph Lauter wrote:

> Hi all,
> 
> just 2cts from someone who wrote a couple of libm functions already in his
> life:
> 
> Joseph S. Myers wrote on 30/09/2014 18:35:
> 
> > > +# if defined _OPENMP && _OPENMP >= 201307
> > > +/* OpenMP case. */
> > > +#  undef __DECL_SIMD_AVX2
> > > +#  undef __DECL_SIMD_SSE4
> > > +#  define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
> > > +#  define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")
> > 
> > I think there should be a comment pointing to the ABI/API documentation
> > that says what function versions this pragma defines to be available and
> > guaranteeing that it will not be redefined to e.g. say that AVX512 is
> > available so that existing headers will work with future compilers (but
> > another pragma will be needed if in future AVX512 versions are added).
> > 
> 
> Yeah, the ABI/API is not quite self-documenting with functions declared as
> follows:

What I'm referring to here is somewhat different - it's the ABI/API that
defines the contract between the library and compiler implied by the pragma
(or, in the Cilk Plus case, by the attribute).

That ABI/API will effectively say "this pragma / attribute means that 
versions of this function are available for the following vector ISAs" 
(and then go on to say what the ABI is for each ISA).  It should also say 
explicitly that compilers must not interpret the pragma / attribute as 
meaning that functions are available for any other vector ISAs and that 
new pragmas / attributes will be defined for any new vector ISAs as 
needed.  That avoids future compilers misinterpreting glibc 2.21's headers 
as meaning it provides e.g. AVX512 versions of functions.

This ABI/API should be generically about OpenMP / Cilk Plus on x86_64 
processors, rather than specifically about GCC, to establish an 
interpretation intended to be shared by any compiler that implements those 
features, now or in the future.

(Of course then the patch does actually need to provide all the function 
versions implied by the pragma / attribute.)

> > > + .align 64
> > > + .globl __gnu_svml_dcos_data
> > > +__gnu_svml_dcos_data:
> > > + .long 4294967295
> > 
> > What are the semantics of the values in this table (please add a comment)?
> > How was this table generated?
> > 
> 
> Yeah, who codes floating-point values as (little-endian ?) memory notation in
> decimal? I would understand hexadecimal but decimal?
> 
> As is, the code is unmaintainable.

And, generally, we want to be able to regenerate any such tables if there 
are changes to the algorithms.  This means at a minimum having comments 
giving the semantics of the table (coefficients of whatever polynomial 
approximation to a given function on given intervals, for example), but 
preferably source code to generate the table.
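
As an illustration of what "source code to generate the table" could look like, here is a minimal sketch. It emits each entry as raw IEEE 754 bits in hexadecimal (addressing the decimal-notation complaint) together with a comment naming the entry and giving its value as a C99 hex float. The function names are hypothetical, the `.quad` layout is an assumption about how one might want the table written, and plain Taylor coefficients of sin are used only because they need no external tool; they are not the coefficients in the patch.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* The bit copy below assumes a 64-bit IEEE 754 double.  */
_Static_assert (sizeof (double) == sizeof (uint64_t),
                "double is expected to be 64 bits wide");

/* Format one table entry as an assembler directive with a comment that
   records its name and exact value.  Returns what snprintf returns.  */
int
format_entry (char *buf, size_t size, const char *name, double value)
{
  uint64_t bits;

  memcpy (&bits, &value, sizeof bits);
  return snprintf (buf, size, "\t.quad 0x%016" PRIx64 "\t/* %s = %a */",
                   bits, name, value);
}

/* Write the first TERMS odd Taylor coefficients of sin,
   (-1)^k / (2k+1)!, one directive per line.  */
void
emit_sin_taylor_table (FILE *out, int terms)
{
  double coeff = 1.0;
  char line[128];
  char name[16];

  for (int k = 0; k < terms; k++)
    {
      snprintf (name, sizeof name, "c%d", 2 * k + 1);
      format_entry (line, sizeof line, name, coeff);
      fprintf (out, "%s\n", line);
      /* Next coefficient: divide by (2k+2)(2k+3) and flip the sign.  */
      coeff = -coeff / ((2.0 * k + 2.0) * (2.0 * k + 3.0));
    }
}
```

Calling emit_sin_taylor_table (stdout, 7) prints seven commented directives; a generator for the real table would replace the coefficient loop with whatever approximation procedure was actually used.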

Andrew Senkevich Oct. 1, 2014, 6:46 p.m. UTC | #8

>> +# We need to install libm.so as linker script
>> +# for more comfortable use of vector math library.
>> +subdir_install: $(inst_libdir)/libm.so
>> +
>> +$(inst_libdir)/libm.so: $(common-objpfx)format.lds \
>> + $(common-objpfx)math/libm.so$(libm.so-version) \
>> + $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
>> + $(+force)
>> + (echo '/* GNU ld script */';\
>> + cat $<; \
>> + echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
>> + 'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
>> + ) > $@.new
>> + mv -f $@.new $@
>
> Do you have ordering issues here?  It seems bad for math/ to install a
> direct symlink and then mathvec/ to change it to something else - all
> installation rules for libm should be in the math/ directory.

When inserted in math/Makefile, this rule produces a warning about
overriding the recipe for target libm.so (as far as I can see, a rule for
libm.so is already generated from o-iterator.mk).

If I use a temporary target:

+subdir_install: $(inst_libdir)/libm.so.tmp
+$(inst_libdir)/libm.so.tmp: $(common-objpfx)format.lds \
+       $(common-objpfx)math/libm.so$(libm.so-version) \
+       $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
+       $(+force)
+       (echo '/* GNU ld script */';\
+       cat $<; \
+       echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
+       'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
+       ) > $@
+       mv -f $@ $(inst_libdir)/libm.so

then $(inst_libdir)/libm.so still gets overwritten later.
So I have a temporary file and need to move it to $(inst_libdir)/libm.so
at the very end.

It would be great if someone could advise me how to do that.


--
WBR,
Andrew

Andrew Senkevich Oct. 2, 2014, 11:54 a.m. UTC | #9

>> > > + .align 64
>> > > + .globl __gnu_svml_dcos_data
>> > > +__gnu_svml_dcos_data:
>> > > + .long 4294967295
>> >
>> > What are the semantics of the values in this table (please add a comment)?

These tables contain data of several types: polynomial coefficients,
some constants and lookup tables.

>> > How was this table generated?

The values were calculated with Maple, Mathematica and Sollya.

>> Yeah, who codes floating-point values as (little-endian ?) memory notation in
>> decimal? I would understand hexadecimal but decimal?

What are the requirements for data representation? Let's decide how
values should be represented here.

> And, generally, we want to be able to regenerate any such tables if there
> are changes to the algorithms.  This means at a minimum having comments
> giving the semantics of the table (coefficients of whatever polynomial
> approximation to a given function on given intervals, for example), but
> preferably source code to generate the table.

We can add some comments, but regeneration of these tables is not supported.


--
WBR,
Andrew

Joseph Myers Oct. 2, 2014, 2:21 p.m. UTC | #10

On Thu, 2 Oct 2014, Andrew Senkevich wrote:

> >> > > + .align 64
> >> > > + .globl __gnu_svml_dcos_data
> >> > > +__gnu_svml_dcos_data:
> >> > > + .long 4294967295
> >> >
> >> > What are the semantics of the values in this table (please add a comment)?
> 
> This tables contain data of several types - polynomial coefficients,
> some constants, lookup-tables.

That then indicates that each part of the table should have a comment 
explaining the exact semantics of the values in that part of the table, 
and naming the macro used for the offset of that part of the table from 
the start of the table - and where the code refers to parts of the table, 
it should use those macros for the offsets instead of hardcoding magic 
constants in the relevant instructions.  Furthermore, if you define those 
macros in a common header, the table can do

.if .-__gnu_svml_dcos_data != MACRO_NAME
.err
.endif

at the start of each section of the table, so avoiding the need for 
comments to mention the macro names and making sure the macros are 
accurate.  Then if someone changes part of the function implementation, 
requiring replacing just one section of the table, you don't get silent
breakage from offsets that were not updated - failing to update the
macros correctly will cause an immediate build failure.
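
The same idea can be shown in C, where offsetof and static_assert play the role of the assembler's `.if`/`.err` check and sizeof plays the role of `.size sym, .-sym`. Everything below is hypothetical: the macro names, the section names and the layout are invented for illustration, since the real layout of __gnu_svml_dcos_data is not documented in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offsets, as they might appear in a header shared by the
   table and by the code that loads from it.  */
#define DCOS_ABS_MASK 0
#define DCOS_INV_PI   64
#define DCOS_POLY     128

/* Hypothetical table layout: three sections of 64-byte rows.  */
struct dcos_table
{
  _Alignas (64) uint64_t abs_mask[8];  /* Mask clearing the sign bit.  */
  double inv_pi[8];                    /* 1/Pi, replicated per lane.  */
  double poly[5][8];                   /* Polynomial coefficients.  */
};

/* If a section grows or moves and the macro is not updated, the build
   fails at once instead of the code silently reading the wrong data.  */
static_assert (offsetof (struct dcos_table, abs_mask) == DCOS_ABS_MASK,
               "DCOS_ABS_MASK is stale");
static_assert (offsetof (struct dcos_table, inv_pi) == DCOS_INV_PI,
               "DCOS_INV_PI is stale");
static_assert (offsetof (struct dcos_table, poly) == DCOS_POLY,
               "DCOS_POLY is stale");

/* Size derived from the definition rather than written by hand.  */
enum { DCOS_TABLE_SIZE = sizeof (struct dcos_table) };
```

Changing, say, inv_pi to 16 elements without touching DCOS_POLY makes the third assertion fail at compile time, which is the behaviour the `.if`/`.err` scheme gives the assembly table.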

Patch

diff --git a/configure b/configure
index 89566c5..5456c43 100755
--- a/configure
+++ b/configure
@@ -4521,7 +4521,7 @@  $as_echo_n "checking version of $AS... " >&6; }
   ac_prog_version=`$AS --version 2>&1 | sed -n 's/^.*GNU assembler.* \([0-9]*\.[0-9.]*\).*$/\1/p'`
   case $ac_prog_version in
     '') ac_prog_version="v. ?.??, bad"; ac_verc_fail=yes;;
-    2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*)
+    2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*)
        ac_prog_version="$ac_prog_version, ok"; ac_verc_fail=no;;
     *) ac_prog_version="$ac_prog_version, bad"; ac_verc_fail=yes;;

diff --git a/configure.ac b/configure.ac
index 82d0896..c5c1758 100644
--- a/configure.ac
+++ b/configure.ac
@@ -903,7 +903,7 @@  LIBC_PROG_BINUTILS
 # Accept binutils 2.20 or newer.
 AC_CHECK_PROG_VER(AS, $AS, --version,
   [GNU assembler.* \([0-9]*\.[0-9.]*\)],
-  [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], AS=: critic_missing="$critic_missing as")
+  [2.1[0-9][0-9]*|2.[2-9][2-9]*|[3-9].*|[1-9][0-9]*], AS=: critic_missing="$critic_missing as")
 AC_CHECK_PROG_VER(LD, $LD, --version,
   [GNU ld.* \([0-9][0-9]*\.[0-9.]*\)],
   [2.1[0-9][0-9]*|2.[2-9][0-9]*|[3-9].*|[1-9][0-9]*], LD=: critic_missing="$critic_missing ld")
diff --git a/Makeconfig b/Makeconfig
index 24a3b82..65136d9 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -1018,7 +1018,7 @@  all-subdirs = csu assert ctype locale intl catgets math setjmp signal    \
       stdlib stdio-common libio malloc string wcsmbs time dirent    \
       grp pwd posix io termios resource misc socket sysvipc gmon    \
       gnulib iconv iconvdata wctype manual shadow gshadow po argp   \
-      crypt localedata timezone rt conform debug    \
+      crypt localedata timezone rt conform debug mathvec    \
       $(add-on-subdirs) dlfcn elf

 ifndef avoid-generated
diff --git a/bits/math-vector.h b/bits/math-vector.h
new file mode 100644
index 0000000..1c1c7ba
--- /dev/null
+++ b/bits/math-vector.h
@@ -0,0 +1,22 @@ 
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly; \
+ include <math.h> instead."
+#endif
diff --git a/math/Makefile b/math/Makefile
index 866bc0f..1941b62 100644
--- a/math/Makefile
+++ b/math/Makefile
@@ -26,7 +26,7 @@  headers := math.h bits/mathcalls.h bits/mathinline.h bits/huge_val.h \
    bits/huge_valf.h bits/huge_vall.h bits/inf.h bits/nan.h \
    fpu_control.h complex.h bits/cmathcalls.h fenv.h \
    bits/fenv.h bits/fenvinline.h bits/mathdef.h tgmath.h \
-   bits/math-finite.h
+   bits/math-finite.h bits/math-vector.h

 # FPU support code.
 aux := setfpucw fpu_control
diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
index 8a94a7e..2d31a11 100644
--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -60,6 +60,15 @@  __MATHCALL (atan,, (_Mdouble_ __x));
 __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));

 /* Cosine of X.  */
+#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos
+__DECL_SIMD_cos
+#endif
+#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf
+__DECL_SIMD_cosf
+#endif
+#if defined _Mlong_double_ && defined __DECL_SIMD_cosl
+__DECL_SIMD_cosl
+#endif
 __MATHCALL (cos,, (_Mdouble_ __x));
 /* Sine of X.  */
 __MATHCALL (sin,, (_Mdouble_ __x));
diff --git a/math/math.h b/math/math.h
index 72ec2ca..32a7bec 100644
--- a/math/math.h
+++ b/math/math.h
@@ -27,6 +27,9 @@ 

 __BEGIN_DECLS

+/* Get machine-dependent vector math functions declarations */
+#include <bits/math-vector.h>
+
 /* Get machine-dependent HUGE_VAL value (returned on overflow).
    On all IEEE754 machines, this is +Infinity.  */
 #include <bits/huge_val.h>
diff --git a/mathvec/Makefile b/mathvec/Makefile
new file mode 100644
index 0000000..8aa4937
--- /dev/null
+++ b/mathvec/Makefile
@@ -0,0 +1,45 @@ 
+# Copyright (C) 2014 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <http://www.gnu.org/licenses/>.
+
+# Makefile for the vector math library.
+
+subdir := mathvec
+
+include ../Makeconfig
+
+extra-libs := libmvec
+extra-libs-others = $(extra-libs)
+
+libmvec.so-no-z-defs = yes
+libmvec-routines = $(strip $(libmvec-support))
+
+# We need to install libm.so as linker script
+# for more comfortable use of vector math library.
+subdir_install: $(inst_libdir)/libm.so
+
+$(inst_libdir)/libm.so: $(common-objpfx)format.lds \
+ $(common-objpfx)math/libm.so$(libm.so-version) \
+ $(common-objpfx)mathvec/libmvec.so$(libmvec.so-version) \
+ $(+force)
+ (echo '/* GNU ld script */';\
+ cat $<; \
+ echo 'GROUP ( $(slibdir)/libm.so$(libm.so-version) ' \
+ 'AS_NEEDED ( $(slibdir)/libmvec.so$(libmvec.so-version) ) )' \
+ ) > $@.new
+ mv -f $@.new $@
+
+include ../Rules
diff --git a/sysdeps/unix/sysv/linux/shlib-versions b/sysdeps/unix/sysv/linux/shlib-versions
index 9160557..4a32c8a 100644
--- a/sysdeps/unix/sysv/linux/shlib-versions
+++ b/sysdeps/unix/sysv/linux/shlib-versions
@@ -1,2 +1,3 @@ 
 libm=6
 libc=6
+libmvec=1
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86/fpu/bits/math-vector.h
new file mode 100644
index 0000000..375c176
--- /dev/null
+++ b/sysdeps/x86/fpu/bits/math-vector.h
@@ -0,0 +1,45 @@ 
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly; \
+ include <math.h> instead."
+#endif
+
+#if defined __x86_64__ && defined __FAST_MATH__
+# define __DECL_SIMD_AVX2
+# define __DECL_SIMD_SSE4
+
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case. */
+#  undef __DECL_SIMD_AVX2
+#  undef __DECL_SIMD_SSE4
+#  define __DECL_SIMD_AVX2 _Pragma("omp declare simd notinbranch")
+#  define __DECL_SIMD_SSE4 _Pragma("omp declare simd notinbranch")
+# elif defined _CILKPLUS && _CILKPLUS >= 0
+/* CilkPlus case. TODO _CILKPLUS currently nowhere defined */
+#  undef __DECL_SIMD_AVX2
+#  undef __DECL_SIMD_SSE4
+#  define __DECL_SIMD_AVX2 __attribute__((vector (nomask)))
+#  define __DECL_SIMD_SSE4 __attribute__((vector (processor(core_i7_sse4_2), \
+  nomask)))
+# endif
+
+# define __DECL_SIMD_cos  __DECL_SIMD_AVX2
+# define __DECL_SIMD_cosf __DECL_SIMD_SSE4
+#endif
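The header above only selects a declaration prefix; the mechanism may be easier to see in isolation. The following is a minimal sketch, not part of the patch: EXAMPLE_DECL_SIMD, example_square and example_map are hypothetical names, and the sketch assumes the compiler defines _OPENMP to at least 201307 (OpenMP 4.0) when `declare simd` is available. When the macro expands to the pragma, the prototype that follows is marked as having vector variants, so a vectorizing compiler may call one of them from a loop; otherwise the macro expands to nothing and the code stays purely scalar.

```c
#include <stddef.h>

/* Hypothetical macro mirroring the role of __DECL_SIMD_AVX2 above.  */
#if defined _OPENMP && _OPENMP >= 201307
# define EXAMPLE_DECL_SIMD _Pragma ("omp declare simd notinbranch")
#else
# define EXAMPLE_DECL_SIMD
#endif

/* The prefix is placed directly in front of the prototype, which is
   how <math.h> would attach __DECL_SIMD_cos to cos.  */
EXAMPLE_DECL_SIMD
double example_square (double x);

double
example_square (double x)
{
  return x * x;
}

/* A loop of independent calls: the shape of code that a vectorizer
   can rewrite into calls to a vector variant.  */
void
example_map (const double *in, double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = example_square (in[i]);
}
```

Without OpenMP support the pragma disappears entirely, which is why the header also provides the empty default definitions.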
diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile
new file mode 100644
index 0000000..588f2f8
--- /dev/null
+++ b/sysdeps/x86_64/fpu/Makefile
@@ -0,0 +1,3 @@ 
+ifeq ($(subdir),mathvec)
+libmvec-support += svml_d_cos4_core svml_d_cos_data
+endif
diff --git a/sysdeps/x86_64/fpu/Versions b/sysdeps/x86_64/fpu/Versions
new file mode 100644
index 0000000..3d433d2
--- /dev/null
+++ b/sysdeps/x86_64/fpu/Versions
@@ -0,0 +1,5 @@ 
+libmvec {
+  GLIBC_2.21 {
+    _ZGVdN4v_cos;
+  }
+}
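The exported symbol name follows the vector function ABI mangling scheme as I understand it: the prefix `_ZGV`, an ISA letter (`b` for SSE, `c` for AVX, `d` for AVX2), `N` for an unmasked or `M` for a masked variant, the vector length, one letter per parameter (`v` for a vector argument), then an underscore and the scalar name. A small sketch of that composition follows; vector_variant_name is a hypothetical helper written only for illustration and does not exist in glibc.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: compose a vector-variant
   symbol name of the form _ZGV<isa><mask><vlen><params>_<scalar>.
   Returns the value of snprintf.  */
static int
vector_variant_name (char *buf, size_t size, char isa, int masked,
                     unsigned vlen, const char *params, const char *scalar)
{
  return snprintf (buf, size, "_ZGV%c%c%u%s_%s",
                   isa, masked ? 'M' : 'N', vlen, params, scalar);
}
```

Calling it with isa 'd', masked 0, vlen 4, params "v" and scalar "cos" yields the name listed in the Versions file above.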
diff --git a/sysdeps/x86_64/fpu/svml_d_cos4_core.S b/sysdeps/x86_64/fpu/svml_d_cos4_core.S
new file mode 100644
index 0000000..7316d2b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_cos4_core.S
@@ -0,0 +1,185 @@ 
+/* Function cos vectorized with AVX2.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+ .text
+ENTRY(_ZGVdN4v_cos)
+
+/* ALGORITHM DESCRIPTION:
+ *
+ *    ( low accuracy ( < 4ulp ) or enhanced performance ( half of correct mantissa ) implementation )
+ *
+ *    Argument representation:
+ *    arg + Pi/2 = (N*Pi + R)
+ *
+ *    Result calculation:
+ *    cos(arg) = sin(arg+Pi/2) = sin(N*Pi + R) = (-1)^N * sin(R)
+ *    sin(R) is approximated by corresponding polynomial
+ */
+        pushq     %rbp
+        movq      %rsp, %rbp
+        andq      $-64, %rsp
+        subq      $448, %rsp
+        movq      __gnu_svml_dcos_data@GOTPCREL(%rip), %rax
+        vmovapd   %ymm0, %ymm1
+        vmovupd   192(%rax), %ymm4
+        vmovupd   256(%rax), %ymm5
+
+/* ARGUMENT RANGE REDUCTION:
+ * Add Pi/2 to argument: X' = X+Pi/2
+ */
+        vaddpd    128(%rax), %ymm1, %ymm7
+
+/* Get absolute argument value |X'| for the large-argument check below */
+        vandpd    (%rax), %ymm7, %ymm2
+
+/* Y = X'*InvPi + RS : right shifter add */
+        vfmadd213pd %ymm5, %ymm4, %ymm7
+        vmovupd   1216(%rax), %ymm4
+
+/* Check for large arguments path */
+        vcmpnle_uqpd 64(%rax), %ymm2, %ymm3
+
+/* N = Y - RS : right shifter sub */
+        vsubpd    %ymm5, %ymm7, %ymm6
+        vmovupd   640(%rax), %ymm2
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+        vpsllq    $63, %ymm7, %ymm7
+
+/* N = N - 0.5 */
+        vsubpd    320(%rax), %ymm6, %ymm0
+        vmovmskpd %ymm3, %ecx
+
+/* R = X - N*Pi1 */
+        vmovapd   %ymm1, %ymm3
+        vfnmadd231pd %ymm0, %ymm2, %ymm3
+
+/* R = R - N*Pi2 */
+        vfnmadd231pd 704(%rax), %ymm0, %ymm3
+
+/* R = R - N*Pi3 */
+        vfnmadd132pd 768(%rax), %ymm3, %ymm0
+
+/* POLYNOMIAL APPROXIMATION:
+ * R2 = R*R
+ */
+        vmulpd    %ymm0, %ymm0, %ymm5
+        vfmadd213pd 1152(%rax), %ymm5, %ymm4
+        vfmadd213pd 1088(%rax), %ymm5, %ymm4
+        vfmadd213pd 1024(%rax), %ymm5, %ymm4
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+        vfmadd213pd 960(%rax), %ymm5, %ymm4
+        vfmadd213pd 896(%rax), %ymm5, %ymm4
+        vfmadd213pd 832(%rax), %ymm5, %ymm4
+        vmulpd    %ymm5, %ymm4, %ymm6
+        vfmadd213pd %ymm0, %ymm0, %ymm6
+
+/* RECONSTRUCTION:
+ * Final sign setting: Res = Poly^SignRes
+ */
+        vxorpd    %ymm7, %ymm6, %ymm0
+        testl     %ecx, %ecx
+        jne       _LBL_1_3
+
+_LBL_1_2:
+        movq      %rbp, %rsp
+        popq      %rbp
+        ret
+
+_LBL_1_3:
+        vmovupd   %ymm1, 320(%rsp)
+        vmovupd   %ymm0, 384(%rsp)
+        je        _LBL_1_2
+
+        xorb      %dl, %dl
+        xorl      %eax, %eax
+        vmovups   %ymm8, 224(%rsp)
+        vmovups   %ymm9, 192(%rsp)
+        vmovups   %ymm10, 160(%rsp)
+        vmovups   %ymm11, 128(%rsp)
+        vmovups   %ymm12, 96(%rsp)
+        vmovups   %ymm13, 64(%rsp)
+        vmovups   %ymm14, 32(%rsp)
+        vmovups   %ymm15, (%rsp)
+        movq      %rsi, 264(%rsp)
+        movq      %rdi, 256(%rsp)
+        movq      %r12, 296(%rsp)
+        movb      %dl, %r12b
+        movq      %r13, 288(%rsp)
+        movl      %ecx, %r13d
+        movq      %r14, 280(%rsp)
+        movl      %eax, %r14d
+        movq      %r15, 272(%rsp)
+
+_LBL_1_6:
+        btl       %r14d, %r13d
+        jc        _LBL_1_12
+
+_LBL_1_7:
+        lea       1(%r14), %esi
+        btl       %esi, %r13d
+        jc        _LBL_1_10
+
+_LBL_1_8:
+        incb      %r12b
+        addl      $2, %r14d
+        cmpb      $16, %r12b
+        jb        _LBL_1_6
+
+        vmovups   224(%rsp), %ymm8
+        vmovups   192(%rsp), %ymm9
+        vmovups   160(%rsp), %ymm10
+        vmovups   128(%rsp), %ymm11
+        vmovups   96(%rsp), %ymm12
+        vmovups   64(%rsp), %ymm13
+        vmovups   32(%rsp), %ymm14
+        vmovups   (%rsp), %ymm15
+        vmovupd   384(%rsp), %ymm0
+        movq      264(%rsp), %rsi
+        movq      256(%rsp), %rdi
+        movq      296(%rsp), %r12
+        movq      288(%rsp), %r13
+        movq      280(%rsp), %r14
+        movq      272(%rsp), %r15
+        jmp       _LBL_1_2
+
+_LBL_1_10:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    328(%rsp,%r15), %xmm0
+        vzeroupper
+
+        call      cos@PLT
+
+        vmovsd    %xmm0, 392(%rsp,%r15)
+        jmp       _LBL_1_8
+
+_LBL_1_12:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    320(%rsp,%r15), %xmm0
+        vzeroupper
+
+        call      cos@PLT
+
+        vmovsd    %xmm0, 384(%rsp,%r15)
+        jmp       _LBL_1_7
+END(_ZGVdN4v_cos)
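For readers who prefer C to AVX2 assembly, the fast path above can be sketched for a single lane. This is a simplified illustration under stated assumptions, not a translation of the patch: cos_sketch is a hypothetical name, rounding is done with an integer cast instead of the right-shifter trick, Pi is one double rather than the three-part Pi1/Pi2/Pi3 split, the polynomial uses plain Taylor coefficients for sin instead of the coefficients in the data table, and there is no large-argument fallback. It keeps the same structure: N is the nearest integer to (x + Pi/2)/Pi, R = x - (N - 0.5)*Pi, and the result is (-1)^N * sin(R).

```c
/* Single-lane sketch of the reduction cos(x) = (-1)^N * sin(R).
   Only intended for moderate |x|; the kernel above sends large
   arguments to scalar cos instead.  */
static double
cos_sketch (double x)
{
  const double pi = 3.14159265358979323846;

  /* N = nearest integer to (x + Pi/2) / Pi.  */
  double t = (x + pi / 2) / pi;
  long n = (long) (t + (t >= 0 ? 0.5 : -0.5));

  /* R = x - (N - 0.5) * Pi, so that |R| <= Pi/2.  */
  double r = x - ((double) n - 0.5) * pi;
  double r2 = r * r;

  /* Taylor coefficients of sin(R)/R - 1 in powers of R^2; stand-ins
     for C1..C7, which the patch loads from __gnu_svml_dcos_data.  */
  static const double c[7] = {
    -1.0 / 6.0, 1.0 / 120.0, -1.0 / 5040.0, 1.0 / 362880.0,
    -1.0 / 39916800.0, 1.0 / 6227020800.0, -1.0 / 1307674368000.0
  };

  /* Horner evaluation, then sin(R) ~= R + R * R2 * Poly.  */
  double p = c[6];
  for (int i = 5; i >= 0; i--)
    p = p * r2 + c[i];
  double s = r + r * r2 * p;

  /* Result sign is (-1)^N; the assembly gets it by shifting the low
     bit of Y into the sign position and XORing it in.  */
  return (n & 1) ? -s : s;
}
```

The Taylor stand-in is accurate to roughly 1e-11 on |R| <= Pi/2, which is enough to check the reduction logic but says nothing about the accuracy of the actual kernel.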
diff --git a/sysdeps/x86_64/fpu/svml_d_cos_data.S b/sysdeps/x86_64/fpu/svml_d_cos_data.S
new file mode 100644
index 0000000..7bb1aba
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_cos_data.S
@@ -0,0 +1,426 @@ 
+/* Data for vectorized cos.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+
+ .section .rodata, "a"
+
+ .align 64
+ .globl __gnu_svml_dcos_data
+__gnu_svml_dcos_data:
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 0
+ .long 1096810496
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1413754136
+ .long 1073291771
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1127743488
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 0
+ .long 1071644672
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 1073741824
+ .long 1074340347
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 0
+ .long 1048855597
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 2147483648
+ .long 1023952536
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1880851354
+ .long 998820945
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 1413754136
+ .long 1074340347
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 856972294
+ .long 1017226790
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 688016905
+ .long 962338001
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 1431655591
+ .long 3217380693
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 286303400
+ .long 1065423121
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 430291053
+ .long 3207201184
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 2150694560
+ .long 1053236707
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1174413873
+ .long 3193628213
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 1470296608
+ .long 1038487144
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 135375560
+ .long 3177836758
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 4294967295
+ .long 2147483647
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 1841940611
+ .long 1070882608
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 0
+ .long 1127219200
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 4294967295
+ .long 1127219199
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .long 8388606
+ .long 1127219200
+ .type __gnu_svml_dcos_data,@object
+ .size __gnu_svml_dcos_data,1600
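The table above stores each double as two `.long` words, low word first, repeated eight times so that each constant fills a 64-byte slot. A short sketch of how to decode a pair may help when reviewing the values; double_from_words is a hypothetical helper, and the interpretation of individual slots below is my own reading of the offsets used in svml_d_cos4_core.S rather than something stated in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebuild an IEEE 754 binary64 value from the
   two 32-bit words of a ".long lo / .long hi" pair.  */
static double
double_from_words (uint32_t lo, uint32_t hi)
{
  uint64_t bits = ((uint64_t) hi << 32) | lo;
  double d;
  memcpy (&d, &bits, sizeof d);  /* reinterpret bits without aliasing issues */
  return d;
}
```

Decoded this way, the slot at offset 64 (0, 1096810496) is 2^23, the large-argument threshold; offset 128 (1413754136, 1073291771) is Pi/2; offset 256 (0, 1127743488) is 1.5 * 2^52, the right shifter; and offset 320 (0, 1071644672) is 0.5. The slot at offset 0 is not a number but the mask 0x7fffffffffffffff used to clear the sign bit.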