From patchwork Mon Sep 22 11:48:00 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Senkevich
X-Patchwork-Id: 2943
Received: (qmail 30845 invoked by alias); 22 Sep 2014 11:48:35 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 30836 invoked by uid 89); 22 Sep 2014 11:48:34 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-qg0-f50.google.com
X-Received: by 10.140.43.180 with SMTP id e49mr20345613qga.76.1411386510413; Mon, 22 Sep 2014 04:48:30 -0700 (PDT)
From: Andrew Senkevich
Date: Mon, 22 Sep 2014 15:48:00 +0400
Subject: Re: [RFC] How to add vector math functions to Glibc
To: "Joseph S. Myers"
Cc: libc-alpha

Hi Joseph,

>> > 4. We need to handle different architectures having different sets of
>> > functions vectorized.
>>
>> We need to have some way for #pragma simd declare to be architecture dependent.
>> It gives possibility to have different sets of vectorized functions
>> for different architectures.
>
> I think having a long series of macros such as __DECL_SIMD_COS_DOUBLE as I
> suggested in ,
> that the architecture-specific header may or may not
> define, would work for this.
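
To make the optional-macro idea quoted above concrete, here is a minimal
standalone sketch. It is only an illustration under stated assumptions: every
SKETCH_* and sketch_* name is hypothetical and not a glibc identifier, and the
macro expands to nothing here, where a real architecture header would
presumably expand it to a SIMD pragma.

```c
/* Sketch only: all SKETCH_* and sketch_* names are hypothetical,
   invented for this illustration; none of them exist in glibc.  */

/* Stand-in for an architecture-specific header.  This imaginary
   architecture vectorizes cos only, so it defines one macro and says
   nothing about sin.  The expansion is empty so that the sketch builds
   with any compiler flags; a real header would put a pragma here.  */
#define SKETCH_DECL_SIMD_COS_DOUBLE

/* Stand-in for the architecture-independent header.  Each declaration
   is prefixed by its macro only if the macro is defined, so a header
   that defines nothing still works.  */
#ifdef SKETCH_DECL_SIMD_COS_DOUBLE
SKETCH_DECL_SIMD_COS_DOUBLE
#endif
double sketch_cos (double x);

#ifdef SKETCH_DECL_SIMD_SIN_DOUBLE
SKETCH_DECL_SIMD_SIN_DOUBLE
#endif
double sketch_sin (double x);

/* Report which declarations received a SIMD prefix.  */
int
sketch_has_vector_cos (void)
{
#ifdef SKETCH_DECL_SIMD_COS_DOUBLE
  return 1;
#else
  return 0;
#endif
}

int
sketch_has_vector_sin (void)
{
#ifdef SKETCH_DECL_SIMD_SIN_DOUBLE
  return 1;
#else
  return 0;
#endif
}
```

With these definitions sketch_has_vector_cos () returns 1 and
sketch_has_vector_sin () returns 0. Adding a vectorized sin for this
architecture would mean defining one more macro in its own header, with no
change to any other architecture's file.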
>
> The default bits/math-vector.h, for architectures without such functions,
> should be minimal (that is, it should not be necessary to define long
> series of macros for functions you don't have versions of on your
> architecture, so that adding vectorized versions of a new function for one
> architecture doesn't require you to update the files for lots of other
> architectures to say they don't have vectorized versions of that function;
> architecture-independent files should handle the macros possibly not being
> defined).

Is it OK to have the following scheme:

+# define __DECL_SIMD_cosf __DECL_SIMD_SSE4
+#endif

where the processor() clause is taken from CilkPlus as an example. For OpenMP
we should discuss whether to use a processor clause as well.

diff --git a/bits/math-vector.h b/bits/math-vector.h
new file mode 100644
index 0000000..4a9c786
--- /dev/null
+++ b/bits/math-vector.h
@@ -0,0 +1,21 @@
+/* Platform-specific SIMD declarations for math functions.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly; include <math.h> instead."
+#endif
diff --git a/math/Makefile b/math/Makefile
index 866bc0f..1941b62 100644
--- a/math/Makefile
+++ b/math/Makefile
@@ -26,7 +26,7 @@ headers := math.h bits/mathcalls.h bits/mathinline.h bits/huge_val.h \
            bits/huge_valf.h bits/huge_vall.h bits/inf.h bits/nan.h \
            fpu_control.h complex.h bits/cmathcalls.h fenv.h \
            bits/fenv.h bits/fenvinline.h bits/mathdef.h tgmath.h \
-           bits/math-finite.h
+           bits/math-finite.h bits/math-vector.h
 
 # FPU support code.
 aux := setfpucw fpu_control
diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
index ae94990..2d31a11 100644
--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -70,7 +60,15 @@ __MATHCALL (atan,, (_Mdouble_ __x));
 __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
 
 /* Cosine of X. */
+#if !defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cos
+__DECL_SIMD_cos
+#endif
+#if defined _Mfloat_ && !defined _Mlong_double_ && defined __DECL_SIMD_cosf
+__DECL_SIMD_cosf
+#endif
+#if defined _Mlong_double_ && defined __DECL_SIMD_cosl
+__DECL_SIMD_cosl
+#endif
 __MATHCALL (cos,, (_Mdouble_ __x));
 /* Sine of X. */
 __MATHCALL (sin,, (_Mdouble_ __x));
diff --git a/math/math.h b/math/math.h
index 72ec2ca..32a7bec 100644
--- a/math/math.h
+++ b/math/math.h
@@ -27,6 +27,9 @@
 
 __BEGIN_DECLS
 
+/* Get machine-dependent vector math functions declarations */
+#include <bits/math-vector.h>
+
 /* Get machine-dependent HUGE_VAL value (returned on overflow).
    On all IEEE754 machines, this is +Infinity. */
 #include <bits/huge_val.h>
diff --git a/sysdeps/x86_64/bits/math-vector.h b/sysdeps/x86_64/bits/math-vector.h
new file mode 100644
index 0000000..512f4e4
--- /dev/null
+++ b/sysdeps/x86_64/bits/math-vector.h
@@ -0,0 +1,30 @@
+/* SIMD declarations of math functions for x86_64.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly; include <math.h> instead."
+#endif
+
+#if defined _OPENMP && _OPENMP >= 201307
+# define __DECL_SIMD_AVX _Pragma ("omp declare simd notinbranch processor(core_3rd_gen_avx)")
+# define __DECL_SIMD_SSE4 _Pragma ("omp declare simd notinbranch processor(core_i7_sse4_2)")
+
+# define __DECL_SIMD_cos __DECL_SIMD_AVX