From patchwork Wed Nov 24 19:37:36 2021
X-Patchwork-Submitter: Sunil Pandey
X-Patchwork-Id: 48090
From: Sunil Pandey
Reply-To: Sunil K Pandey
To: libc-alpha@sourceware.org
Cc: andrey.kolesov@intel.com
Subject: [PATCH 11/42] x86-64: Add vector atan2/atan2f implementation to libmvec
Date: Wed, 24 Nov 2021 11:37:36 -0800
Message-Id: <20211124193807.2093208-12-skpgkp2@gmail.com>
In-Reply-To: <20211124193807.2093208-1-skpgkp2@gmail.com>
References: <20211124193807.2093208-1-skpgkp2@gmail.com>

Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and AVX512
versions for libmvec as per vector ABI.  It also contains accuracy and
ABI tests for vector atan2/atan2f with regenerated ulps.
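[Usage note added in editing, not part of the submitted patch: user code
is not expected to call the _ZGV* symbols directly.  With the
__DECL_SIMD_atan2/__DECL_SIMD_atan2f stubs added below, glibc's <math.h>
marks atan2/atan2f with "#pragma omp declare simd", so a vectorizing
compiler can turn an ordinary scalar loop into calls to these entry
points.  In the mangled names, the letter after _ZGV selects the ISA
(b=SSE4, c=AVX, d=AVX2, e=AVX512), N<n> is the unmasked lane count, and
"vv" means two vector arguments; e.g. _ZGVdN4vv_atan2 is the 4-lane AVX2
double version.  A minimal sketch, assuming GCC with OpenMP SIMD support
against glibc 2.35; the file name and build command are illustrative
only:

    /* vec-atan2-demo.c -- illustrative example, not part of this patch.
       Possible build:
         gcc -O2 -fopenmp-simd -ffast-math -mavx2 vec-atan2-demo.c -lmvec -lm
       With these flags GCC may replace the atan2 calls in the simd loop
       with the AVX2 entry point _ZGVdN4vv_atan2 added by this patch.  */
    #include <math.h>
    #include <stdio.h>

    int
    main (void)
    {
      double y[8], x[8], r[8];

      for (int i = 0; i < 8; i++)
        {
          y[i] = i - 3.5;   /* Both signs of y ...  */
          x[i] = 3.5 - i;   /* ... and of x, to hit all quadrants.  */
        }

    #pragma omp simd
      for (int i = 0; i < 8; i++)
        r[i] = atan2 (y[i], x[i]);

      for (int i = 0; i < 8; i++)
        printf ("atan2 (%4.1f, %4.1f) = %f\n", y[i], x[i], r[i]);
      return 0;
    }

Whether the vector variant was actually selected can be checked by
disassembling the loop or by looking for an undefined _ZGV*vv_atan2
symbol in the binary (e.g. with nm -D).]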
--- bits/libm-simd-decl-stubs.h | 11 + math/bits/mathcalls.h | 2 +- .../unix/sysv/linux/x86_64/libmvec.abilist | 8 + sysdeps/x86/fpu/bits/math-vector.h | 4 + sysdeps/x86_64/fpu/Makeconfig | 1 + sysdeps/x86_64/fpu/Versions | 2 + sysdeps/x86_64/fpu/libm-test-ulps | 20 + .../fpu/multiarch/svml_d_atan22_core-sse2.S | 20 + .../x86_64/fpu/multiarch/svml_d_atan22_core.c | 28 + .../fpu/multiarch/svml_d_atan22_core_sse4.S | 3628 +++++++++++++++++ .../fpu/multiarch/svml_d_atan24_core-sse.S | 20 + .../x86_64/fpu/multiarch/svml_d_atan24_core.c | 28 + .../fpu/multiarch/svml_d_atan24_core_avx2.S | 3160 ++++++++++++++ .../fpu/multiarch/svml_d_atan28_core-avx2.S | 20 + .../x86_64/fpu/multiarch/svml_d_atan28_core.c | 28 + .../fpu/multiarch/svml_d_atan28_core_avx512.S | 2310 +++++++++++ .../fpu/multiarch/svml_s_atan2f16_core-avx2.S | 20 + .../fpu/multiarch/svml_s_atan2f16_core.c | 28 + .../multiarch/svml_s_atan2f16_core_avx512.S | 1997 +++++++++ .../fpu/multiarch/svml_s_atan2f4_core-sse2.S | 20 + .../fpu/multiarch/svml_s_atan2f4_core.c | 28 + .../fpu/multiarch/svml_s_atan2f4_core_sse4.S | 2667 ++++++++++++ .../fpu/multiarch/svml_s_atan2f8_core-sse.S | 20 + .../fpu/multiarch/svml_s_atan2f8_core.c | 28 + .../fpu/multiarch/svml_s_atan2f8_core_avx2.S | 2412 +++++++++++ sysdeps/x86_64/fpu/svml_d_atan22_core.S | 29 + sysdeps/x86_64/fpu/svml_d_atan24_core.S | 29 + sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S | 25 + sysdeps/x86_64/fpu/svml_d_atan28_core.S | 25 + sysdeps/x86_64/fpu/svml_s_atan2f16_core.S | 25 + sysdeps/x86_64/fpu/svml_s_atan2f4_core.S | 29 + sysdeps/x86_64/fpu/svml_s_atan2f8_core.S | 29 + sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S | 25 + .../fpu/test-double-libmvec-atan2-avx.c | 1 + .../fpu/test-double-libmvec-atan2-avx2.c | 1 + .../fpu/test-double-libmvec-atan2-avx512f.c | 1 + .../x86_64/fpu/test-double-libmvec-atan2.c | 3 + .../x86_64/fpu/test-double-vlen2-wrappers.c | 1 + .../fpu/test-double-vlen4-avx2-wrappers.c | 1 + .../x86_64/fpu/test-double-vlen4-wrappers.c | 1 + .../x86_64/fpu/test-double-vlen8-wrappers.c | 1 + .../fpu/test-float-libmvec-atan2f-avx.c | 1 + .../fpu/test-float-libmvec-atan2f-avx2.c | 1 + .../fpu/test-float-libmvec-atan2f-avx512f.c | 1 + .../x86_64/fpu/test-float-libmvec-atan2f.c | 3 + .../x86_64/fpu/test-float-vlen16-wrappers.c | 1 + .../x86_64/fpu/test-float-vlen4-wrappers.c | 1 + .../fpu/test-float-vlen8-avx2-wrappers.c | 1 + .../x86_64/fpu/test-float-vlen8-wrappers.c | 1 + 49 files changed, 16745 insertions(+), 1 deletion(-) create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c create mode 100644 
sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S create mode 100644 sysdeps/x86_64/fpu/svml_d_atan22_core.S create mode 100644 sysdeps/x86_64/fpu/svml_d_atan24_core.S create mode 100644 sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S create mode 100644 sysdeps/x86_64/fpu/svml_d_atan28_core.S create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f16_core.S create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f4_core.S create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f8_core.S create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2.c create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c diff --git a/bits/libm-simd-decl-stubs.h b/bits/libm-simd-decl-stubs.h index 3e0aa043b4..bd8019839c 100644 --- a/bits/libm-simd-decl-stubs.h +++ b/bits/libm-simd-decl-stubs.h @@ -153,4 +153,15 @@ #define __DECL_SIMD_atanf32x #define __DECL_SIMD_atanf64x #define __DECL_SIMD_atanf128x + +#define __DECL_SIMD_atan2 +#define __DECL_SIMD_atan2f +#define __DECL_SIMD_atan2l +#define __DECL_SIMD_atan2f16 +#define __DECL_SIMD_atan2f32 +#define __DECL_SIMD_atan2f64 +#define __DECL_SIMD_atan2f128 +#define __DECL_SIMD_atan2f32x +#define __DECL_SIMD_atan2f64x +#define __DECL_SIMD_atan2f128x #endif diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h index f37dbeebfb..b1b11b74ee 100644 --- a/math/bits/mathcalls.h +++ b/math/bits/mathcalls.h @@ -56,7 +56,7 @@ __MATHCALL_VEC (asin,, (_Mdouble_ __x)); /* Arc tangent of X. */ __MATHCALL_VEC (atan,, (_Mdouble_ __x)); /* Arc tangent of Y/X. */ -__MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x)); +__MATHCALL_VEC (atan2,, (_Mdouble_ __y, _Mdouble_ __x)); /* Cosine of X. 
*/ __MATHCALL_VEC (cos,, (_Mdouble_ __x)); diff --git a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist index 2ead94d87e..9b47e83aec 100644 --- a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist @@ -51,38 +51,46 @@ GLIBC_2.35 _ZGVbN2v_acosh F GLIBC_2.35 _ZGVbN2v_asin F GLIBC_2.35 _ZGVbN2v_asinh F GLIBC_2.35 _ZGVbN2v_atan F +GLIBC_2.35 _ZGVbN2vv_atan2 F GLIBC_2.35 _ZGVbN4v_acosf F GLIBC_2.35 _ZGVbN4v_acoshf F GLIBC_2.35 _ZGVbN4v_asinf F GLIBC_2.35 _ZGVbN4v_asinhf F GLIBC_2.35 _ZGVbN4v_atanf F +GLIBC_2.35 _ZGVbN4vv_atan2f F GLIBC_2.35 _ZGVcN4v_acos F GLIBC_2.35 _ZGVcN4v_acosh F GLIBC_2.35 _ZGVcN4v_asin F GLIBC_2.35 _ZGVcN4v_asinh F GLIBC_2.35 _ZGVcN4v_atan F +GLIBC_2.35 _ZGVcN4vv_atan2 F GLIBC_2.35 _ZGVcN8v_acosf F GLIBC_2.35 _ZGVcN8v_acoshf F GLIBC_2.35 _ZGVcN8v_asinf F GLIBC_2.35 _ZGVcN8v_asinhf F GLIBC_2.35 _ZGVcN8v_atanf F +GLIBC_2.35 _ZGVcN8vv_atan2f F GLIBC_2.35 _ZGVdN4v_acos F GLIBC_2.35 _ZGVdN4v_acosh F GLIBC_2.35 _ZGVdN4v_asin F GLIBC_2.35 _ZGVdN4v_asinh F GLIBC_2.35 _ZGVdN4v_atan F +GLIBC_2.35 _ZGVdN4vv_atan2 F GLIBC_2.35 _ZGVdN8v_acosf F GLIBC_2.35 _ZGVdN8v_acoshf F GLIBC_2.35 _ZGVdN8v_asinf F GLIBC_2.35 _ZGVdN8v_asinhf F GLIBC_2.35 _ZGVdN8v_atanf F +GLIBC_2.35 _ZGVdN8vv_atan2f F GLIBC_2.35 _ZGVeN16v_acosf F GLIBC_2.35 _ZGVeN16v_acoshf F GLIBC_2.35 _ZGVeN16v_asinf F GLIBC_2.35 _ZGVeN16v_asinhf F GLIBC_2.35 _ZGVeN16v_atanf F +GLIBC_2.35 _ZGVeN16vv_atan2f F GLIBC_2.35 _ZGVeN8v_acos F GLIBC_2.35 _ZGVeN8v_acosh F GLIBC_2.35 _ZGVeN8v_asin F GLIBC_2.35 _ZGVeN8v_asinh F GLIBC_2.35 _ZGVeN8v_atan F +GLIBC_2.35 _ZGVeN8vv_atan2 F diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86/fpu/bits/math-vector.h index ef0a3fb7ed..67a326566c 100644 --- a/sysdeps/x86/fpu/bits/math-vector.h +++ b/sysdeps/x86/fpu/bits/math-vector.h @@ -78,6 +78,10 @@ # define __DECL_SIMD_atan __DECL_SIMD_x86_64 # undef __DECL_SIMD_atanf # define __DECL_SIMD_atanf __DECL_SIMD_x86_64 +# undef __DECL_SIMD_atan2 +# define __DECL_SIMD_atan2 __DECL_SIMD_x86_64 +# undef __DECL_SIMD_atan2f +# define __DECL_SIMD_atan2f __DECL_SIMD_x86_64 # endif #endif diff --git a/sysdeps/x86_64/fpu/Makeconfig b/sysdeps/x86_64/fpu/Makeconfig index 1364381877..b37aabe83f 100644 --- a/sysdeps/x86_64/fpu/Makeconfig +++ b/sysdeps/x86_64/fpu/Makeconfig @@ -27,6 +27,7 @@ libmvec-funcs = \ asin \ asinh \ atan \ + atan2 \ cos \ exp \ log \ diff --git a/sysdeps/x86_64/fpu/Versions b/sysdeps/x86_64/fpu/Versions index f7ce07574f..57de41e864 100644 --- a/sysdeps/x86_64/fpu/Versions +++ b/sysdeps/x86_64/fpu/Versions @@ -19,10 +19,12 @@ libmvec { _ZGVbN2v_asin; _ZGVcN4v_asin; _ZGVdN4v_asin; _ZGVeN8v_asin; _ZGVbN2v_asinh; _ZGVcN4v_asinh; _ZGVdN4v_asinh; _ZGVeN8v_asinh; _ZGVbN2v_atan; _ZGVcN4v_atan; _ZGVdN4v_atan; _ZGVeN8v_atan; + _ZGVbN2vv_atan2; _ZGVcN4vv_atan2; _ZGVdN4vv_atan2; _ZGVeN8vv_atan2; _ZGVbN4v_acosf; _ZGVcN8v_acosf; _ZGVdN8v_acosf; _ZGVeN16v_acosf; _ZGVbN4v_acoshf; _ZGVcN8v_acoshf; _ZGVdN8v_acoshf; _ZGVeN16v_acoshf; _ZGVbN4v_asinf; _ZGVcN8v_asinf; _ZGVdN8v_asinf; _ZGVeN16v_asinf; _ZGVbN4v_asinhf; _ZGVcN8v_asinhf; _ZGVdN8v_asinhf; _ZGVeN16v_asinhf; _ZGVbN4v_atanf; _ZGVcN8v_atanf; _ZGVdN8v_atanf; _ZGVeN16v_atanf; + _ZGVbN4vv_atan2f; _ZGVcN8vv_atan2f; _ZGVdN8vv_atan2f; _ZGVeN16vv_atan2f; } } diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps index de345e2bf1..329e7f58a2 100644 --- a/sysdeps/x86_64/fpu/libm-test-ulps +++ b/sysdeps/x86_64/fpu/libm-test-ulps @@ -203,6 +203,26 @@ float: 2 float128: 2 ldouble: 1 
+Function: "atan2_vlen16": +float: 2 + +Function: "atan2_vlen2": +double: 1 + +Function: "atan2_vlen4": +double: 1 +float: 2 + +Function: "atan2_vlen4_avx2": +double: 1 + +Function: "atan2_vlen8": +double: 1 +float: 2 + +Function: "atan2_vlen8_avx2": +float: 2 + Function: "atan_downward": double: 1 float: 2 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S new file mode 100644 index 0000000000..6c3ad05a6c --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S @@ -0,0 +1,20 @@ +/* SSE2 version of vectorized atan2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVbN2vv_atan2 _ZGVbN2vv_atan2_sse2 +#include "../svml_d_atan22_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c new file mode 100644 index 0000000000..43f1ee7f33 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2, vector length is 2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define SYMBOL_NAME _ZGVbN2vv_atan2 +#include "ifunc-mathvec-sse4_1.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVbN2vv_atan2, __GI__ZGVbN2vv_atan2, + __redirect__ZGVbN2vv_atan2) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S new file mode 100644 index 0000000000..a74d82503c --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S @@ -0,0 +1,3628 @@ +/* Function atan vectorized with SSE4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *      For 0.0       <= x <= 7.0/16.0:  atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *      For 7.0/16.0  <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *      For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *      For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *      For 39.0/16.0 <= x <= inf:       atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *      Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ */
+
+#include <sysdep.h>
+
+	.text
+ENTRY(_ZGVbN2vv_atan2_sse4)
+        pushq     %rbp
+        cfi_def_cfa_offset(16)
+        movq      %rsp, %rbp
+        cfi_def_cfa(6, 16)
+        cfi_offset(6, -16)
+        andq      $-64, %rsp
+        subq      $256, %rsp
+        xorl      %edx, %edx
+        movups    %xmm8, 112(%rsp)
+        .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22
+        movaps    %xmm0, %xmm8
+
+/*
+ * #define NO_VECTOR_ZERO_ATAN2_ARGS
+ *  Declarations
+ *  Variables
+ *  Constants
+ *  The end of declarations
+ *  Implementation
+ *  Get r0~=1/B
+ *  Cannot be replaced by VQRCP(D, dR0, dB);
+ *  Argument Absolute values
+ */
+        movups    1728+__svml_datan2_data_internal(%rip), %xmm4
+        movups    %xmm9, 96(%rsp)
+        .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22
+        movaps    %xmm1, %xmm9
+        movaps    %xmm4, %xmm1
+        andps     %xmm8, %xmm4
+        andps     %xmm9, %xmm1
+        movaps    %xmm4, %xmm2
+        cmpnltpd  %xmm1, %xmm2
+
+/* Argument signs */
+        movups    1536+__svml_datan2_data_internal(%rip), %xmm3
+        movaps    %xmm2, %xmm0
+        movaps    %xmm3, %xmm7
+        movaps    %xmm3, %xmm6
+
+/*
+ * 1) If y<x then a= y, b=x, PIO2=0
+ * 2) If y>x then a=-x, b=y, PIO2=Pi/2
+ */
+        orps      %xmm1, %xmm3
+        andnps    %xmm4, %xmm0
+        andps     %xmm2, %xmm3
+        andps     %xmm9, %xmm7
+        movups    64+__svml_datan2_data_internal(%rip), %xmm5
+        orps      %xmm3, %xmm0
+        movaps    %xmm2, %xmm3
+        andps     %xmm2, %xmm5
+        andnps    %xmm1, %xmm3
+        andps     %xmm4, %xmm2
+        orps      %xmm2, %xmm3
+        andps     %xmm8, %xmm6
+        divpd     %xmm3, %xmm0
+        movups    %xmm10, 48(%rsp)
+        movq      1600+__svml_datan2_data_internal(%rip), %xmm2
+        .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22
+
+/* Check if y and x are on main path. */
+        pshufd    $221, %xmm1, %xmm10
+        psubd     %xmm2, %xmm10
+        movups    %xmm11, 80(%rsp)
+        movups    %xmm12, 32(%rsp)
+        movups    %xmm4, 16(%rsp)
+        .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22
+        .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22
+        movq      1664+__svml_datan2_data_internal(%rip), %xmm11
+        pshufd    $221, %xmm4, %xmm12
+        movdqa    %xmm10, %xmm4
+        pcmpgtd   %xmm11, %xmm4
+        pcmpeqd   %xmm11, %xmm10
+        por       %xmm10, %xmm4
+
+/* Polynomial.
*/ + movaps %xmm0, %xmm10 + mulpd %xmm0, %xmm10 + psubd %xmm2, %xmm12 + movups %xmm13, 144(%rsp) + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + movdqa %xmm12, %xmm13 + pcmpgtd %xmm11, %xmm13 + pcmpeqd %xmm11, %xmm12 + por %xmm12, %xmm13 + movaps %xmm10, %xmm12 + mulpd %xmm10, %xmm12 + por %xmm13, %xmm4 + movaps %xmm12, %xmm13 + mulpd %xmm12, %xmm13 + movmskps %xmm4, %eax + movups %xmm15, 160(%rsp) + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + movups 256+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm15 + movups 320+__svml_datan2_data_internal(%rip), %xmm11 + movups 384+__svml_datan2_data_internal(%rip), %xmm2 + addpd 512+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 576+__svml_datan2_data_internal(%rip), %xmm11 + addpd 640+__svml_datan2_data_internal(%rip), %xmm2 + addpd 768+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 832+__svml_datan2_data_internal(%rip), %xmm11 + addpd 896+__svml_datan2_data_internal(%rip), %xmm2 + addpd 1024+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 1088+__svml_datan2_data_internal(%rip), %xmm11 + addpd 1152+__svml_datan2_data_internal(%rip), %xmm2 + addpd 1280+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm10, %xmm15 + addpd 1344+__svml_datan2_data_internal(%rip), %xmm11 + addpd 1408+__svml_datan2_data_internal(%rip), %xmm2 + addpd %xmm15, %xmm11 + mulpd %xmm2, %xmm10 + mulpd %xmm11, %xmm12 + movups %xmm14, 176(%rsp) + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + movups 448+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 704+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 960+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 1216+__svml_datan2_data_internal(%rip), %xmm14 + +/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */ + mulpd %xmm14, %xmm13 + addpd %xmm10, %xmm13 + addpd %xmm12, %xmm13 + +/* + * Reconstruction. 
+ * dP=(R+R*dP) + dPIO2 + */ + mulpd %xmm0, %xmm13 + addpd %xmm13, %xmm0 + movups %xmm3, (%rsp) + +/* if x<0, dPI = Pi, else dPI =0 */ + movaps %xmm9, %xmm3 + cmplepd 1792+__svml_datan2_data_internal(%rip), %xmm3 + addpd %xmm5, %xmm0 + andps __svml_datan2_data_internal(%rip), %xmm3 + orps %xmm7, %xmm0 + addpd %xmm3, %xmm0 + +/* Special branch for fast (vector) processing of zero arguments */ + movups 16(%rsp), %xmm11 + orps %xmm6, %xmm0 + testb $3, %al + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + movups 112(%rsp), %xmm8 + cfi_restore(25) + movups 96(%rsp), %xmm9 + cfi_restore(26) + movups 48(%rsp), %xmm10 + cfi_restore(27) + movups 80(%rsp), %xmm11 + cfi_restore(28) + movups 32(%rsp), %xmm12 + cfi_restore(29) + movups 144(%rsp), %xmm13 + cfi_restore(30) + movups 176(%rsp), %xmm14 + cfi_restore(31) + movups 160(%rsp), %xmm15 + cfi_restore(32) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_4: + movups %xmm8, 64(%rsp) + movups %xmm9, 128(%rsp) + movups %xmm0, 192(%rsp) + je .LBL_1_3 + xorl %eax, %eax + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $2, %r12d + jl .LBL_1_8 + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + movups 192(%rsp), %xmm0 + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_11: + lea 64(%rsp,%r12,8), %rdi + lea 
128(%rsp,%r12,8), %rsi + lea 192(%rsp,%r12,8), %rdx + call __svml_datan2_cout_rare_internal + jmp .LBL_1_9 + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + +.LBL_1_12: +/* Check if at least on of Y or Y is zero: iAXAYZERO */ + movups 1792+__svml_datan2_data_internal(%rip), %xmm2 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + movaps %xmm9, %xmm12 + movaps %xmm8, %xmm10 + cmpordpd %xmm9, %xmm12 + cmpordpd %xmm8, %xmm10 + cmpeqpd %xmm2, %xmm1 + cmpeqpd %xmm2, %xmm11 + andps %xmm10, %xmm12 + orps %xmm11, %xmm1 + pshufd $221, %xmm1, %xmm1 + pshufd $221, %xmm12, %xmm11 + +/* Check if at least on of Y or Y is zero and not NaN: iAXAYZEROnotNAN */ + pand %xmm11, %xmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + movdqa %xmm1, %xmm13 + pandn %xmm4, %xmm13 + +/* + * Path for zero arguments (at least one of both) + * Check if both args are zeros (den. is zero) + */ + movups (%rsp), %xmm4 + cmpeqpd %xmm2, %xmm4 + +/* Go to callout */ + movmskps %xmm13, %edx + +/* Set sPIO2 to zero if den. is zero */ + movaps %xmm4, %xmm15 + andps %xmm2, %xmm4 + andnps %xmm5, %xmm15 + andl $3, %edx + orps %xmm4, %xmm15 + pshufd $221, %xmm9, %xmm5 + orps %xmm7, %xmm15 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + pshufd $221, %xmm2, %xmm7 + pcmpgtd %xmm5, %xmm7 + pshufd $80, %xmm7, %xmm14 + andps %xmm3, %xmm14 + addpd %xmm14, %xmm15 + +/* Merge results from main and spec path */ + pshufd $80, %xmm1, %xmm3 + orps %xmm6, %xmm15 + movdqa %xmm3, %xmm6 + andps %xmm3, %xmm15 + andnps %xmm0, %xmm6 + movaps %xmm6, %xmm0 + orps %xmm15, %xmm0 + jmp .LBL_1_2 + +END(_ZGVbN2vv_atan2_sse4) + + .align 16,0x90 + +__svml_datan2_cout_rare_internal: + + cfi_startproc + + movq %rdx, %rcx + movsd 1888+__datan2_la_CoutTab(%rip), %xmm1 + movsd (%rdi), %xmm2 + movsd (%rsi), %xmm0 + mulsd %xmm1, %xmm2 + mulsd %xmm0, %xmm1 + movsd %xmm2, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -42(%rsp), %r9d + andl $32752, %r9d + movb -33(%rsp), %al + movzwl -34(%rsp), %r8d + andb $-128, %al + andl $32752, %r8d + shrl $4, %r9d + movb -41(%rsp), %dl + shrb $7, %dl + shrb $7, %al + shrl $4, %r8d + cmpl $2047, %r9d + je .LBL_2_49 + cmpl $2047, %r8d + je .LBL_2_38 + testl %r9d, %r9d + jne .LBL_2_6 + testl $1048575, -44(%rsp) + jne .LBL_2_6 + cmpl $0, -48(%rsp) + je .LBL_2_31 + +.LBL_2_6: + testl %r8d, %r8d + jne .LBL_2_9 + testl $1048575, -36(%rsp) + jne .LBL_2_9 + cmpl $0, -40(%rsp) + je .LBL_2_29 + +.LBL_2_9: + negl %r8d + movsd %xmm2, -48(%rsp) + addl %r9d, %r8d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r8d + jle .LBL_2_24 + cmpl $54, %r8d + jge .LBL_2_21 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %al, %al + jne .LBL_2_13 + movsd 1976+__datan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_14 + +.LBL_2_13: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + +.LBL_2_14: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %r9d, %r9d + jle .LBL_2_37 + cmpl $2046, %r9d + jge .LBL_2_17 + andl $-32753, %esi + addl $-1023, %r9d + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_18 + +.LBL_2_17: + movsd 1992+__datan2_la_CoutTab(%rip), %xmm3 + movl $1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +.LBL_2_18: + negl %r9d + addl $1023, %r9d + andl $2047, %r9d + movzwl 1894+__datan2_la_CoutTab(%rip), %esi + movsd 
1888+__datan2_la_CoutTab(%rip), %xmm3 + andl $-32753, %esi + shll $4, %r9d + movsd %xmm3, -40(%rsp) + orl %r9d, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm4 + mulsd %xmm4, %xmm2 + comisd 1880+__datan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_20 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %r8d + movl %r8d, %r9d + andl $-524288, %r8d + andl $-1048576, %r9d + addl $262144, %r8d + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %r8d + movsd -72(%rsp), %xmm4 + orl %r8d, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__datan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %edi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %edi + movsd -72(%rsp), %xmm6 + shrl $4, %edi + subsd %xmm6, %xmm5 + movl -12(%rsp), %esi + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, 
-56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) 
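+/* Annotation added in editing, not in the original submission: the long
+   runs of movsd stores and reloads through -72(%rsp)/-64(%rsp) above and
+   below appear to implement double-double arithmetic for this accurate
+   rare-case scalar path -- Veltkamp splitting (multiply by the constant
+   at 2000+__datan2_la_CoutTab, subtract, re-subtract) and compensated
+   add/multiply steps -- forcing each intermediate through memory so it
+   is rounded to double and the hi/lo halves stay exact.  */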
+ movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb $7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_20: + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd 
-64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, 
%al + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_21: + cmpl $74, %r8d + jge .LBL_2_53 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_24: + testb %al, %al + jne .LBL_2_35 + movb %dil, -41(%rsp) + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl $32752, %eax + je .LBL_2_27 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_27: + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_29: + testl %r9d, %r9d + jne .LBL_2_53 + testl $1048575, -44(%rsp) + jne .LBL_2_53 + jmp .LBL_2_57 + +.LBL_2_31: + jne .LBL_2_53 + +.LBL_2_33: + testb %al, %al + jne .LBL_2_35 + +.LBL_2_34: + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_35: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +.LBL_2_36: + xorl %eax, %eax + ret + +.LBL_2_37: + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_18 + +.LBL_2_38: + cmpl $2047, %r9d + je .LBL_2_49 + +.LBL_2_39: + testl $1048575, -36(%rsp) + jne .LBL_2_41 + cmpl $0, -40(%rsp) + je .LBL_2_42 + +.LBL_2_41: + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp .LBL_2_36 + +.LBL_2_42: + cmpl $2047, %r9d + je .LBL_2_46 + testb %al, %al + je .LBL_2_34 + jmp .LBL_2_35 + +.LBL_2_46: + testb %al, %al + jne .LBL_2_48 + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_48: + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb 
$127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_49: + testl $1048575, -44(%rsp) + jne .LBL_2_41 + cmpl $0, -48(%rsp) + jne .LBL_2_41 + cmpl $2047, %r8d + je .LBL_2_39 + +.LBL_2_53: + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_57: + cmpl $0, -48(%rsp) + jne .LBL_2_53 + jmp .LBL_2_33 + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 
0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 
3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 
1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 
3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S new file mode 100644 index 0000000000..0db843a088 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S @@ -0,0 +1,20 @@ +/* SSE version of vectorized atan2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVdN4vv_atan2 _ZGVdN4vv_atan2_sse_wrapper +#include "../svml_d_atan24_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c new file mode 100644 index 0000000000..c2e2611584 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2, vector length is 4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define SYMBOL_NAME _ZGVdN4vv_atan2
+#include "ifunc-mathvec-avx2.h"
+
+libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ());
+
+#ifdef SHARED
+__hidden_ver1 (_ZGVdN4vv_atan2, __GI__ZGVdN4vv_atan2,
+	       __redirect__ZGVdN4vv_atan2)
+  __attribute__ ((visibility ("hidden")));
+#endif
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S
new file mode 100644
index 0000000000..d5ec313a28
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S
@@ -0,0 +1,3160 @@
+/* Function atan2 vectorized with AVX2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *    For 0.0       <= x <=  7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *    For 7.0/16.0  <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *    For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *    For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *    For 39.0/16.0 <= x <=      inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *    Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ *
+ *
+ */
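As a reader's aid, the reduction described in the comment above can be written out in scalar C. This is a minimal sketch, assuming x >= 0 and substituting libm's atan both for the Poly11 approximation and for the per-interval atan(b) table that the assembly keeps in __svml_datan2_data_internal; atan_reduced, bounds and base are illustrative names, not part of this patch:

#include <math.h>
#include <stdio.h>

/* Uses the exact identity atan(x) = atan(b) + atan((x - b)/(1 + b*x))
   for x >= 0, b >= 0, with the base points from the comment above.  */
static double
atan_reduced (double x)
{
  static const double bounds[] = { 7.0/16.0, 11.0/16.0, 19.0/16.0, 39.0/16.0 };
  static const double base[]   = { 0.0, 0.5, 1.0, 1.5 };
  for (int i = 0; i < 4; i++)
    if (x <= bounds[i])
      {
	double s = (x - base[i]) / (1.0 + base[i] * x);
	return atan (base[i]) + atan (s); /* atan(s) is what Poly11 approximates */
      }
  /* x > 39/16: atan(x) = atan(inf) + atan(-1/x) = pi/2 + atan(-1/x).  */
  return atan (INFINITY) + atan (-1.0 / x);
}

int
main (void)
{
  for (double x = 0.125; x < 8.0; x *= 2.0)
    printf ("x=%-6g reduced=%.17g libm=%.17g\n", x, atan_reduced (x), atan (x));
  return 0;
}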
+ * + * + */ + +#include + + .text +ENTRY(_ZGVdN4vv_atan2_avx2) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $384, %rsp + xorl %edx, %edx + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Get r0~=1/B + * Cannot be replaced by VQRCP(D, dR0, dB); + * Argument Absolute values + */ + vmovupd 1728+__svml_datan2_data_internal(%rip), %ymm5 + +/* Argument signs */ + vmovupd 1536+__svml_datan2_data_internal(%rip), %ymm4 + vmovups %ymm8, 32(%rsp) + vmovups %ymm14, 320(%rsp) + vmovups %ymm10, 160(%rsp) + vmovups %ymm9, 96(%rsp) + vmovups %ymm13, 288(%rsp) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + vmovups 1600+__svml_datan2_data_internal(%rip), %xmm13 + vmovups %ymm12, 256(%rsp) + vmovups %ymm11, 224(%rsp) + vmovupd %ymm0, (%rsp) + vmovups %ymm15, 352(%rsp) + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + vmovapd %ymm1, %ymm8 + vandpd %ymm5, %ymm8, %ymm2 + vandpd %ymm5, %ymm0, %ymm1 + vcmpnlt_uqpd %ymm2, %ymm1, %ymm3 + +/* + * 1) If yx then a=-x, b=y, PIO2=Pi/2 + */ + vorpd %ymm4, %ymm2, %ymm6 + vblendvpd %ymm3, %ymm6, %ymm1, %ymm6 + vblendvpd %ymm3, %ymm1, %ymm2, %ymm14 + vmovupd %ymm14, 64(%rsp) + vdivpd %ymm14, %ymm6, %ymm14 + vandpd %ymm4, %ymm8, %ymm5 + vandpd %ymm4, %ymm0, %ymm7 + vandpd 64+__svml_datan2_data_internal(%rip), %ymm3, %ymm4 + vmovups 1664+__svml_datan2_data_internal(%rip), %xmm3 + +/* Check if y and x are on main path. */ + vextractf128 $1, %ymm2, %xmm9 + vextractf128 $1, %ymm1, %xmm10 + vshufps $221, %xmm9, %xmm2, %xmm11 + vshufps $221, %xmm10, %xmm1, %xmm12 + vpsubd %xmm13, %xmm11, %xmm0 + vpsubd %xmm13, %xmm12, %xmm9 + vpcmpgtd %xmm3, %xmm0, %xmm15 + vpcmpeqd %xmm3, %xmm0, %xmm6 + vpcmpgtd %xmm3, %xmm9, %xmm10 + vpcmpeqd %xmm3, %xmm9, %xmm3 + vpor %xmm6, %xmm15, %xmm11 + vpor %xmm3, %xmm10, %xmm12 + +/* Polynomial. 
+/* Polynomial. */
+	vmulpd	%ymm14, %ymm14, %ymm10
+	vpor	%xmm12, %xmm11, %xmm3
+	vmovupd	320+__svml_datan2_data_internal(%rip), %ymm9
+	vmovupd	384+__svml_datan2_data_internal(%rip), %ymm12
+	vmovupd	448+__svml_datan2_data_internal(%rip), %ymm15
+	vmulpd	%ymm10, %ymm10, %ymm11
+
+/* if x<0, dPI = Pi, else dPI =0 */
+	vcmple_oqpd 1792+__svml_datan2_data_internal(%rip), %ymm8, %ymm13
+	vmovmskps %xmm3, %eax
+	vmulpd	%ymm11, %ymm11, %ymm0
+	vandpd	__svml_datan2_data_internal(%rip), %ymm13, %ymm6
+	vmovupd	256+__svml_datan2_data_internal(%rip), %ymm13
+	vfmadd213pd 576+__svml_datan2_data_internal(%rip), %ymm0, %ymm9
+	vfmadd213pd 640+__svml_datan2_data_internal(%rip), %ymm0, %ymm12
+	vfmadd213pd 704+__svml_datan2_data_internal(%rip), %ymm0, %ymm15
+	vfmadd213pd 512+__svml_datan2_data_internal(%rip), %ymm0, %ymm13
+	vfmadd213pd 832+__svml_datan2_data_internal(%rip), %ymm0, %ymm9
+	vfmadd213pd 896+__svml_datan2_data_internal(%rip), %ymm0, %ymm12
+	vfmadd213pd 960+__svml_datan2_data_internal(%rip), %ymm0, %ymm15
+	vfmadd213pd 768+__svml_datan2_data_internal(%rip), %ymm0, %ymm13
+	vfmadd213pd 1088+__svml_datan2_data_internal(%rip), %ymm0, %ymm9
+	vfmadd213pd 1152+__svml_datan2_data_internal(%rip), %ymm0, %ymm12
+	vfmadd213pd 1216+__svml_datan2_data_internal(%rip), %ymm0, %ymm15
+	vfmadd213pd 1024+__svml_datan2_data_internal(%rip), %ymm0, %ymm13
+	vfmadd213pd 1344+__svml_datan2_data_internal(%rip), %ymm0, %ymm9
+	vfmadd213pd 1408+__svml_datan2_data_internal(%rip), %ymm0, %ymm12
+	vfmadd213pd 1280+__svml_datan2_data_internal(%rip), %ymm0, %ymm13
+
+/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */
+	vmulpd	%ymm15, %ymm0, %ymm0
+	vfmadd213pd %ymm9, %ymm10, %ymm13
+	vfmadd213pd %ymm0, %ymm10, %ymm12
+	vfmadd213pd %ymm12, %ymm11, %ymm13
+
+/*
+ * Reconstruction.
+ * dP=(R+R*dP) + dPIO2 + */ + vfmadd213pd %ymm14, %ymm14, %ymm13 + vaddpd %ymm13, %ymm4, %ymm14 + vorpd %ymm5, %ymm14, %ymm0 + vaddpd %ymm0, %ymm6, %ymm9 + vorpd %ymm7, %ymm9, %ymm0 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + vmovups 32(%rsp), %ymm8 + cfi_restore(91) + vmovups 96(%rsp), %ymm9 + cfi_restore(92) + vmovups 160(%rsp), %ymm10 + cfi_restore(93) + vmovups 224(%rsp), %ymm11 + cfi_restore(94) + vmovups 256(%rsp), %ymm12 + cfi_restore(95) + vmovups 288(%rsp), %ymm13 + cfi_restore(96) + vmovups 320(%rsp), %ymm14 + cfi_restore(97) + vmovups 352(%rsp), %ymm15 + cfi_restore(98) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_4: + vmovupd (%rsp), %ymm1 + vmovupd %ymm8, 128(%rsp) + vmovupd %ymm0, 192(%rsp) + vmovupd %ymm1, 64(%rsp) + je .LBL_1_3 + xorl %eax, %eax + vzeroupper + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + movl %edx, %r13d + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $4, %r12d + jl .LBL_1_8 + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + vmovupd 192(%rsp), %ymm0 + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + +.LBL_1_11: + lea 64(%rsp,%r12,8), %rdi + lea 128(%rsp,%r12,8), %rsi + lea 192(%rsp,%r12,8), %rdx + call __svml_datan2_cout_rare_internal + jmp 
.LBL_1_9 + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + +.LBL_1_12: + vmovupd (%rsp), %ymm11 + +/* Check if at least on of Y or Y is zero: iAXAYZERO */ + vmovupd 1792+__svml_datan2_data_internal(%rip), %ymm10 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmpordpd %ymm8, %ymm8, %ymm12 + vcmpordpd %ymm11, %ymm11, %ymm13 + vcmpeqpd %ymm10, %ymm2, %ymm2 + vcmpeqpd %ymm10, %ymm1, %ymm1 + vandpd %ymm13, %ymm12, %ymm14 + vorpd %ymm1, %ymm2, %ymm2 + vextractf128 $1, %ymm14, %xmm15 + vextractf128 $1, %ymm2, %xmm11 + vshufps $221, %xmm15, %xmm14, %xmm9 + vshufps $221, %xmm11, %xmm2, %xmm12 + +/* + * Path for zero arguments (at least one of both) + * Check if both args are zeros (den. is zero) + */ + vcmpeqpd 64(%rsp), %ymm10, %ymm2 + +/* Check if at least on of Y or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpand %xmm9, %xmm12, %xmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandn %xmm3, %xmm1, %xmm3 + +/* Go to callout */ + vmovmskps %xmm3, %edx + +/* Set sPIO2 to zero if den. is zero */ + vblendvpd %ymm2, %ymm10, %ymm4, %ymm4 + vorpd %ymm5, %ymm4, %ymm5 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vextractf128 $1, %ymm10, %xmm2 + vextractf128 $1, %ymm8, %xmm3 + vshufps $221, %xmm2, %xmm10, %xmm4 + vshufps $221, %xmm3, %xmm8, %xmm9 + vpcmpgtd %xmm9, %xmm4, %xmm12 + vpshufd $80, %xmm12, %xmm11 + vpshufd $250, %xmm12, %xmm13 + vinsertf128 $1, %xmm13, %ymm11, %ymm14 + vandpd %ymm6, %ymm14, %ymm6 + vaddpd %ymm6, %ymm5, %ymm2 + vorpd %ymm7, %ymm2, %ymm2 + +/* Merge results from main and spec path */ + vpshufd $80, %xmm1, %xmm7 + vpshufd $250, %xmm1, %xmm1 + vinsertf128 $1, %xmm1, %ymm7, %ymm3 + vblendvpd %ymm3, %ymm2, %ymm0, %ymm0 + jmp .LBL_1_2 + +END(_ZGVdN4vv_atan2_avx2) + + .align 16,0x90 + +__svml_datan2_cout_rare_internal: + + cfi_startproc + + movq %rdx, %rcx + movsd 1888+__datan2_la_CoutTab(%rip), %xmm1 + movsd (%rdi), %xmm2 + movsd (%rsi), %xmm0 + mulsd %xmm1, %xmm2 + mulsd %xmm0, %xmm1 + movsd %xmm2, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -42(%rsp), %r9d + andl $32752, %r9d + movb -33(%rsp), %al + movzwl -34(%rsp), %r8d + andb $-128, %al + andl $32752, %r8d + shrl $4, %r9d + movb -41(%rsp), %dl + shrb $7, %dl + shrb $7, %al + shrl $4, %r8d + cmpl $2047, %r9d + je .LBL_2_49 + cmpl $2047, %r8d + je .LBL_2_38 + testl %r9d, %r9d + jne .LBL_2_6 + testl $1048575, -44(%rsp) + jne .LBL_2_6 + cmpl $0, -48(%rsp) + je .LBL_2_31 + +.LBL_2_6: + testl %r8d, %r8d + jne .LBL_2_9 + testl $1048575, -36(%rsp) + jne .LBL_2_9 + cmpl $0, -40(%rsp) + je .LBL_2_29 + +.LBL_2_9: + negl %r8d + movsd %xmm2, -48(%rsp) + addl %r9d, %r8d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r8d + jle .LBL_2_24 + cmpl $54, %r8d + jge .LBL_2_21 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %al, %al + jne .LBL_2_13 + movsd 1976+__datan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_14 + +.LBL_2_13: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + +.LBL_2_14: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %r9d, %r9d + jle .LBL_2_37 + cmpl $2046, %r9d + jge .LBL_2_17 + andl $-32753, %esi + addl $-1023, %r9d + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_18 + +.LBL_2_17: + movsd 1992+__datan2_la_CoutTab(%rip), %xmm3 + movl $1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + 
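+/* Build the scale factor 2^k by splicing the recomputed biased exponent
+   into the high word of the table's 1.0 constant (offset 1888), then
+   rescale b with it (compensating for the normalization of a above).  */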
+.LBL_2_18: + negl %r9d + addl $1023, %r9d + andl $2047, %r9d + movzwl 1894+__datan2_la_CoutTab(%rip), %esi + movsd 1888+__datan2_la_CoutTab(%rip), %xmm3 + andl $-32753, %esi + shll $4, %r9d + movsd %xmm3, -40(%rsp) + orl %r9d, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm4 + mulsd %xmm4, %xmm2 + comisd 1880+__datan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_20 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %r8d + movl %r8d, %r9d + andl $-524288, %r8d + andl $-1048576, %r9d + addl $262144, %r8d + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %r8d + movsd -72(%rsp), %xmm4 + orl %r8d, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__datan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %edi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %edi + movsd -72(%rsp), %xmm6 + shrl $4, %edi + subsd %xmm6, %xmm5 + movl -12(%rsp), %esi + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd 
%xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd 
%xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb $7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_20: + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), 
%xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd 
%xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, %al + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_21: + cmpl $74, %r8d + jge .LBL_2_53 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_24: + testb %al, %al + jne .LBL_2_35 + movb %dil, -41(%rsp) + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl $32752, %eax + je .LBL_2_27 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_27: + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_29: + testl %r9d, %r9d + jne .LBL_2_53 + testl $1048575, -44(%rsp) + jne .LBL_2_53 + jmp .LBL_2_57 + +.LBL_2_31: + jne .LBL_2_53 + +.LBL_2_33: + testb %al, %al + jne .LBL_2_35 + +.LBL_2_34: + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_35: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +.LBL_2_36: + xorl %eax, %eax + ret + +.LBL_2_37: + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_18 + +.LBL_2_38: + cmpl $2047, %r9d + je .LBL_2_49 + +.LBL_2_39: + testl $1048575, -36(%rsp) + jne .LBL_2_41 + cmpl $0, -40(%rsp) + je .LBL_2_42 + +.LBL_2_41: + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp .LBL_2_36 + +.LBL_2_42: + cmpl $2047, %r9d + je .LBL_2_46 + testb %al, %al + je .LBL_2_34 + jmp .LBL_2_35 + +.LBL_2_46: + testb %al, %al + jne .LBL_2_48 + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_48: + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + 
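+/* Both arguments are infinite and x is negative: the result is +-3*Pi/4,
+   assembled from its high (offset 1952) and low (offset 1960) table
+   halves, with y's sign copied into the result below.  */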
shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_49: + testl $1048575, -44(%rsp) + jne .LBL_2_41 + cmpl $0, -48(%rsp) + jne .LBL_2_41 + cmpl $2047, %r8d + je .LBL_2_39 + +.LBL_2_53: + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_57: + cmpl $0, -48(%rsp) + jne .LBL_2_53 + jmp .LBL_2_33 + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
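+/* Layout note: each constant in this table occupies a 64-byte slot (the
+   double is broadcast four times, 32 bytes, then zero-padded to 64), so
+   the offsets 64, 256, ..., 1792 used by the vector code above step
+   through the table one slot at a time.  */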
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 
3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
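/* The nearby rows are masks and raw words rather than ordinary
+    doubles: 0xBFD5555555555555 above is -1/3 (the leading polynomial
+    coefficient), 0x3FF0000000000000 is 1.0, 0x8000000000000000 the
+    sign-bit mask and 0x7FFFFFFFFFFFFFFF the absolute-value mask; the
+    all-lane 0x80300000 and 0xFDD00000 words appear to be the
+    high-word bias and threshold for the main-path lane check.  */ +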
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 
1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 
1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + 
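/* __datan2_la_CoutTab holds (hi, lo) double-double pairs of atan
+    reference values for the scalar rare-argument path; the repeated
+    high words above (0x3FF921FB...) show the entries converging
+    toward atan(inf) = pi/2 while only the low-order correction words
+    still change.  */ +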
.long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S new file mode 100644 index 0000000000..a8d34a6143 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S @@ -0,0 +1,20 @@ +/* AVX2 version of vectorized atan2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#define _ZGVeN8vv_atan2 _ZGVeN8vv_atan2_avx2_wrapper +#include "../svml_d_atan28_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c new file mode 100644 index 0000000000..a0897e9cf0 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2, vector length is 8. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVeN8vv_atan2 +#include "ifunc-mathvec-avx512-skx.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVeN8vv_atan2, __GI__ZGVeN8vv_atan2, + __redirect__ZGVeN8vv_atan2) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S new file mode 100644 index 0000000000..959a8610da --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S @@ -0,0 +1,2310 @@ +/* Function atan2 vectorized with AVX-512. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
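+ *
+ * A scalar C sketch of this scheme (illustrative only: poly11,
+ * atan_base and pio2 below stand in for the minimax coefficients and
+ * table entries actually stored in __svml_datan2_data_internal):
+ *
+ *   double atan_pos (double x)        // requires x >= 0, finite
+ *   {
+ *     double b, s;
+ *     if      (x < 7.0/16.0)   b = 0.0;
+ *     else if (x < 11.0/16.0)  b = 0.5;
+ *     else if (x < 19.0/16.0)  b = 1.0;
+ *     else if (x < 39.0/16.0)  b = 1.5;
+ *     else { s = -1.0/x; return pio2 + s + s*s*s*poly11 (s*s); }
+ *     s = (x - b)/(1.0 + b*x);
+ *     return atan_base (b) + s + s*s*s*poly11 (s*s);
+ *   }
+ *
+ * atan2 (y, x) is then assembled branch-free below: lanes with
+ * |y| < |x| divide |y|/|x| with PIO2 = 0, the others -|x|/|y| with
+ * PIO2 = pi/2; after the polynomial the code ORs in the sign of x,
+ * adds pi where x < 0, and finally ORs in the sign of y.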
+ * + * + */ + +#include <sysdep.h> + + .text +ENTRY(_ZGVeN8vv_atan2_skx) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $256, %rsp + xorl %edx, %edx + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Get r0~=1/B + * Cannot be replaced by VQRCP(D, dR0, dB); + * Argument Absolute values + */ + vmovups 1728+__svml_datan2_data_internal(%rip), %zmm4 + +/* Argument signs */ + vmovups 1536+__svml_datan2_data_internal(%rip), %zmm6 + +/* + * 1) If y<x then a=y, b=x, PIO2=0 + * 2) If y>x then a=-x, b=y, PIO2=Pi/2 + */ + vmovups 64+__svml_datan2_data_internal(%rip), %zmm3 + vandpd %zmm4, %zmm0, %zmm11 + vmovaps %zmm1, %zmm7 + vandpd %zmm4, %zmm7, %zmm2 + vandpd %zmm6, %zmm7, %zmm5 + vandpd %zmm6, %zmm0, %zmm4 + vorpd %zmm6, %zmm2, %zmm12 + vcmppd $17, {sae}, %zmm2, %zmm11, %k1 + vmovdqu 1664+__svml_datan2_data_internal(%rip), %ymm6 + vmovups %zmm11, 64(%rsp) + +/* Check if y and x are on main path. */ + vpsrlq $32, %zmm2, %zmm9 + vblendmpd %zmm11, %zmm12, %zmm13{%k1} + vblendmpd %zmm2, %zmm11, %zmm15{%k1} + vpsrlq $32, %zmm11, %zmm8 + vmovdqu 1600+__svml_datan2_data_internal(%rip), %ymm12 + vdivpd {rn-sae}, %zmm15, %zmm13, %zmm1 + vmovups %zmm15, (%rsp) + vpmovqd %zmm9, %ymm14 + vpmovqd %zmm8, %ymm10 + vxorpd %zmm3, %zmm3, %zmm3{%k1} + vpsubd %ymm12, %ymm14, %ymm13 + vpsubd %ymm12, %ymm10, %ymm9 + +/* Polynomial. */ + vmulpd {rn-sae}, %zmm1, %zmm1, %zmm12 + vpcmpgtd %ymm6, %ymm13, %ymm15 + vpcmpeqd %ymm6, %ymm13, %ymm11 + vmulpd {rn-sae}, %zmm12, %zmm12, %zmm13 + vpor %ymm11, %ymm15, %ymm8 + vmovups 256+__svml_datan2_data_internal(%rip), %zmm11 + vmovups 512+__svml_datan2_data_internal(%rip), %zmm15 + vpcmpgtd %ymm6, %ymm9, %ymm14 + vpcmpeqd %ymm6, %ymm9, %ymm6 + vpor %ymm6, %ymm14, %ymm10 + vmulpd {rn-sae}, %zmm13, %zmm13, %zmm14 + vmovups 320+__svml_datan2_data_internal(%rip), %zmm9 + vpor %ymm10, %ymm8, %ymm6 + vmovups 384+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd231pd {rn-sae}, %zmm14, %zmm11, %zmm15 + vmovups 576+__svml_datan2_data_internal(%rip), %zmm11 + vmovups 704+__svml_datan2_data_internal(%rip), %zmm8 + vfmadd231pd {rn-sae}, %zmm14, %zmm9, %zmm11 + vmovups 640+__svml_datan2_data_internal(%rip), %zmm9 + vfmadd231pd {rn-sae}, %zmm14, %zmm10, %zmm9 + vmovups 448+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd231pd {rn-sae}, %zmm14, %zmm10, %zmm8 + vmovups 768+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 832+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm11 + vmovups 896+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9 + vmovups 960+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm8 + vmovups 1024+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 1088+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm11 + vmovups 1152+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9 + vmovups 1216+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm8 + vmovups 1280+__svml_datan2_data_internal(%rip), %zmm10 + +/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */ + vmulpd {rn-sae}, %zmm14, %zmm8, %zmm8 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 1344+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14,
%zmm11 + vmovups 1408+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm11, %zmm12, %zmm15 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9 + vfmadd213pd {rn-sae}, %zmm8, %zmm12, %zmm9 + vmovups __svml_datan2_data_internal(%rip), %zmm8 + vfmadd213pd {rn-sae}, %zmm9, %zmm13, %zmm15 + +/* + * Reconstruction. + * dP=(R+R*dP) + dPIO2 + */ + vfmadd213pd {rn-sae}, %zmm1, %zmm1, %zmm15 + vaddpd {rn-sae}, %zmm3, %zmm15, %zmm1 + vorpd %zmm5, %zmm1, %zmm9 + +/* if x<0, dPI = Pi, else dPI =0 */ + vmovups 1792+__svml_datan2_data_internal(%rip), %zmm1 + vcmppd $18, {sae}, %zmm1, %zmm7, %k2 + vaddpd {rn-sae}, %zmm8, %zmm9, %zmm9{%k2} + vmovmskps %ymm6, %eax + vorpd %zmm4, %zmm9, %zmm11 + +/* Special branch for fast (vector) processing of zero arguments */ + vmovups 64(%rsp), %zmm9 + testl %eax, %eax + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + vmovaps %zmm11, %zmm0 + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + +.LBL_1_4: + vmovups %zmm0, 64(%rsp) + vmovups %zmm7, 128(%rsp) + vmovups %zmm11, 192(%rsp) + je .LBL_1_3 + xorl %eax, %eax + vzeroupper + kmovw %k4, 24(%rsp) + kmovw %k5, 16(%rsp) + kmovw %k6, 8(%rsp) + kmovw %k7, (%rsp) + movq %rsi, 40(%rsp) + movq %rdi, 32(%rsp) + movq %r12, 56(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 48(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $8, %r12d + jl .LBL_1_8 + kmovw 24(%rsp), %k4 + cfi_restore(122) + kmovw 16(%rsp), %k5 + cfi_restore(123) + kmovw 8(%rsp), %k6 + cfi_restore(124) + kmovw (%rsp), %k7 + cfi_restore(125) + vmovups 192(%rsp), %zmm11 + movq 40(%rsp), %rsi + cfi_restore(4) + movq 32(%rsp), %rdi + cfi_restore(5) + movq 56(%rsp), %r12 + cfi_restore(12) + movq 48(%rsp), %r13 + cfi_restore(13) + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 
0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_11: + lea 64(%rsp,%r12,8), %rdi + lea 128(%rsp,%r12,8), %rsi + lea 192(%rsp,%r12,8), %rdx + call __svml_datan2_cout_rare_internal + jmp .LBL_1_9 + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + cfi_restore(122) + cfi_restore(123) + cfi_restore(124) + cfi_restore(125) + +.LBL_1_12: +/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmppd $3, {sae}, %zmm7, %zmm7, %k1 + vcmppd $3, {sae}, %zmm0, %zmm0, %k2 + +/* Check if at least one of X or Y is zero: iAXAYZERO */ + vmovups 1792+__svml_datan2_data_internal(%rip), %zmm8 + vpbroadcastq .FLT_31(%rip), %zmm10 + vcmppd $4, {sae}, %zmm8, %zmm2, %k3 + vmovaps %zmm10, %zmm12 + vmovaps %zmm10, %zmm15 + vmovaps %zmm10, %zmm13 + vpandnq %zmm7, %zmm7, %zmm12{%k1} + vcmppd $4, {sae}, %zmm8, %zmm9, %k1 + vpandnq %zmm2, %zmm2, %zmm15{%k3} + vmovaps %zmm10, %zmm2 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vpcmpgtq %zmm7, %zmm8, %k3 + vpandnq %zmm0, %zmm0, %zmm13{%k2} + vpandnq %zmm9, %zmm9, %zmm2{%k1} + vandpd %zmm13, %zmm12, %zmm14 + vorpd %zmm2, %zmm15, %zmm9 + vpsrlq $32, %zmm14, %zmm1 + vpsrlq $32, %zmm9, %zmm2 + vpmovqd %zmm1, %ymm1 + vpmovqd %zmm2, %ymm9 + +/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpand %ymm1, %ymm9, %ymm2 + +/* + * Path for zero arguments (at least one of the two) + * Check if both args are zeros (den. is zero) + */ + vmovups (%rsp), %zmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandn %ymm6, %ymm2, %ymm6 + vcmppd $4, {sae}, %zmm8, %zmm1, %k2 + +/* Go to callout */ + vmovmskps %ymm6, %edx + vpandnq %zmm1, %zmm1, %zmm10{%k2} + +/* Set sPIO2 to zero if den.
is zero */ + vpandnq %zmm3, %zmm10, %zmm3 + vpandq %zmm10, %zmm8, %zmm1 + vporq %zmm1, %zmm3, %zmm3 + vorpd %zmm5, %zmm3, %zmm1 + vmovups __svml_datan2_data_internal(%rip), %zmm5 + vaddpd {rn-sae}, %zmm5, %zmm1, %zmm1{%k3} + vorpd %zmm4, %zmm1, %zmm1 + +/* Merge results from main and spec path */ + vpmovzxdq %ymm2, %zmm4 + vpsllq $32, %zmm4, %zmm2 + vpord %zmm2, %zmm4, %zmm3 + vpandnq %zmm11, %zmm3, %zmm11 + vpandq %zmm3, %zmm1, %zmm1 + vporq %zmm1, %zmm11, %zmm11 + jmp .LBL_1_2 + +END(_ZGVeN8vv_atan2_skx) + + .align 16,0x90 + +__svml_datan2_cout_rare_internal: + + cfi_startproc + + movq %rdx, %rcx + movsd 1888+__datan2_la_CoutTab(%rip), %xmm1 + movsd (%rdi), %xmm2 + movsd (%rsi), %xmm0 + mulsd %xmm1, %xmm2 + mulsd %xmm0, %xmm1 + movsd %xmm2, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -42(%rsp), %r9d + andl $32752, %r9d + movb -33(%rsp), %al + movzwl -34(%rsp), %r8d + andb $-128, %al + andl $32752, %r8d + shrl $4, %r9d + movb -41(%rsp), %dl + shrb $7, %dl + shrb $7, %al + shrl $4, %r8d + cmpl $2047, %r9d + je .LBL_2_49 + cmpl $2047, %r8d + je .LBL_2_38 + testl %r9d, %r9d + jne .LBL_2_6 + testl $1048575, -44(%rsp) + jne .LBL_2_6 + cmpl $0, -48(%rsp) + je .LBL_2_31 + +.LBL_2_6: + testl %r8d, %r8d + jne .LBL_2_9 + testl $1048575, -36(%rsp) + jne .LBL_2_9 + cmpl $0, -40(%rsp) + je .LBL_2_29 + +.LBL_2_9: + negl %r8d + movsd %xmm2, -48(%rsp) + addl %r9d, %r8d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r8d + jle .LBL_2_24 + cmpl $54, %r8d + jge .LBL_2_21 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %al, %al + jne .LBL_2_13 + movsd 1976+__datan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_14 + +.LBL_2_13: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + +.LBL_2_14: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %r9d, %r9d + jle .LBL_2_37 + cmpl $2046, %r9d + jge .LBL_2_17 + andl $-32753, %esi + addl $-1023, %r9d + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_18 + +.LBL_2_17: + movsd 1992+__datan2_la_CoutTab(%rip), %xmm3 + movl $1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +.LBL_2_18: + negl %r9d + addl $1023, %r9d + andl $2047, %r9d + movzwl 1894+__datan2_la_CoutTab(%rip), %esi + movsd 1888+__datan2_la_CoutTab(%rip), %xmm3 + andl $-32753, %esi + shll $4, %r9d + movsd %xmm3, -40(%rsp) + orl %r9d, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm4 + mulsd %xmm4, %xmm2 + comisd 1880+__datan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_20 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %r8d + movl %r8d, %r9d + andl $-524288, %r8d + andl $-1048576, %r9d + addl $262144, %r8d + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %r8d + movsd -72(%rsp), %xmm4 + orl %r8d, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__datan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %edi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, 
%edi + movsd -72(%rsp), %xmm6 + shrl $4, %edi + subsd %xmm6, %xmm5 + movl -12(%rsp), %esi + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + 
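/* The multiplies by 2000+__datan2_la_CoutTab (2^27+1, a Veltkamp
+    splitter) and the store/reload ping-pong through -72(%rsp) and
+    -64(%rsp) split each value into exact hi+lo halves, so the
+    division and the table lookup can be combined in double-double
+    arithmetic on this slow, high-accuracy path.  */ +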
mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + 
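/* The result is now assembled through memory: the double-double sum
+    is stored at -24(%rsp)/-40(%rsp), and the andb/orb sequences below
+    rewrite its top byte (-17(%rsp), -33(%rsp)) to splice the sign
+    bits kept in %dl (from y) and %al (from x) back into the IEEE
+    representation.  */ +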
movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb $7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_20: + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, 
%xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, %al + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_21: + cmpl $74, %r8d + jge .LBL_2_53 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_24: + testb %al, %al + jne .LBL_2_35 + movb %dil, -41(%rsp) + 
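/* In .LBL_2_24 the exponent difference implies |y/x| < 2^-54, so to
+    double precision the result is y/x itself (or +-pi when x is
+    negative, via .LBL_2_35); the extra multiply/add through -72(%rsp)
+    below appears to exist only to raise the proper inexact and
+    underflow flags.  */ +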
movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl $32752, %eax + je .LBL_2_27 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_27: + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_29: + testl %r9d, %r9d + jne .LBL_2_53 + testl $1048575, -44(%rsp) + jne .LBL_2_53 + jmp .LBL_2_57 + +.LBL_2_31: + jne .LBL_2_53 + +.LBL_2_33: + testb %al, %al + jne .LBL_2_35 + +.LBL_2_34: + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_35: + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +.LBL_2_36: + xorl %eax, %eax + ret + +.LBL_2_37: + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_18 + +.LBL_2_38: + cmpl $2047, %r9d + je .LBL_2_49 + +.LBL_2_39: + testl $1048575, -36(%rsp) + jne .LBL_2_41 + cmpl $0, -40(%rsp) + je .LBL_2_42 + +.LBL_2_41: + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp .LBL_2_36 + +.LBL_2_42: + cmpl $2047, %r9d + je .LBL_2_46 + testb %al, %al + je .LBL_2_34 + jmp .LBL_2_35 + +.LBL_2_46: + testb %al, %al + jne .LBL_2_48 + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_48: + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_49: + testl $1048575, -44(%rsp) + jne .LBL_2_41 + cmpl $0, -48(%rsp) + jne .LBL_2_41 + cmpl $2047, %r8d + je .LBL_2_39 + +.LBL_2_53: + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp .LBL_2_36 + +.LBL_2_57: + cmpl $0, -48(%rsp) + jne .LBL_2_53 + jmp .LBL_2_33 + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1073291771 + 
.long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 
3215257506 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + 
.long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 
3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 
1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 
3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 + .align 8 + +.FLT_31: + .long 0xffffffff,0xffffffff + .type .FLT_31,@object + .size .FLT_31,8 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S new file mode 100644 index 0000000000..a2a76e8bfd --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S @@ -0,0 +1,20 @@ +/* AVX2 version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVeN16vv_atan2f _ZGVeN16vv_atan2f_avx2_wrapper +#include "../svml_s_atan2f16_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c new file mode 100644 index 0000000000..6fa806414d --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2f, vector length is 16. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define SYMBOL_NAME _ZGVeN16vv_atan2f
+#include "ifunc-mathvec-avx512-skx.h"
+
+libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ());
+
+#ifdef SHARED
+__hidden_ver1 (_ZGVeN16vv_atan2f, __GI__ZGVeN16vv_atan2f,
+	       __redirect__ZGVeN16vv_atan2f)
+  __attribute__ ((visibility ("hidden")));
+#endif
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S
new file mode 100644
index 0000000000..82c150901a
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S
@@ -0,0 +1,1997 @@
+/* Function atan2f16 vectorized with AVX-512.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *      For    0.0    <= x <=  7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *      For  7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *      For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *      For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *      For 39.0/16.0 <= x <=    inf   : atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *      Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ */
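+
+/*
+ * A minimal scalar C sketch of the reduction above (an illustration,
+ * not part of the vector kernel; `poly11' stands in for the Poly11
+ * minimax polynomial whose coefficients live in
+ * __svml_satan2_data_internal, and the helper/table names are made up
+ * for exposition):
+ *
+ *   static const float base[4]  = { 0.0f, 0.5f, 1.0f, 1.5f };
+ *   static const float bound[4] = { 7.0f/16, 11.0f/16, 19.0f/16, 39.0f/16 };
+ *   // atan(0.0), atan(0.5), atan(1.0), atan(1.5), atan(inf) = Pi/2
+ *   static const float atan_of_base[5] =
+ *     { 0.0f, 0.46364761f, 0.78539816f, 0.98279372f, 1.57079633f };
+ *
+ *   float atanf_reduced (float x)   // assumes x >= 0 and x finite
+ *   {
+ *     int i = 0;
+ *     while (i < 4 && x > bound[i])
+ *       i++;
+ *     float s = (i == 4) ? -1.0f / x
+ *                        : (x - base[i]) / (1.0f + base[i] * x);
+ *     return atan_of_base[i] + (s + s * s * s * poly11 (s * s));
+ *   }
+ */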
+
+#include <sysdep.h>
+
+	.text
+ENTRY(_ZGVeN16vv_atan2f_skx)
+	pushq	%rbp
+	cfi_def_cfa_offset(16)
+	movq	%rsp, %rbp
+	cfi_def_cfa(6, 16)
+	cfi_offset(6, -16)
+	andq	$-64, %rsp
+	subq	$256, %rsp
+	xorl	%edx, %edx
+
+/*
+ * #define NO_VECTOR_ZERO_ATAN2_ARGS
+ *  Declarations
+ * Variables
+ * Constants
+ *  The end of declarations
+ *  Implementation
+ * Arguments signs
+ */
+	vmovups	256+__svml_satan2_data_internal(%rip), %zmm6
+	vmovups	64+__svml_satan2_data_internal(%rip), %zmm3
+
+/* Testing on working interval.  */
+	vmovups	1024+__svml_satan2_data_internal(%rip), %zmm9
+	vmovups	1088+__svml_satan2_data_internal(%rip), %zmm14
+
+/*
+ * 1) If y<x then a=y, b=x, PIO2=0
+ * 2) If y>x then a=-x, b=y, PIO2=Pi/2
+ */
+	vmovups	320+__svml_satan2_data_internal(%rip), %zmm4
+	vpternlogd	$255, %zmm13, %zmm13, %zmm13
+	vmovaps	%zmm1, %zmm8
+	vandps	%zmm6, %zmm8, %zmm2
+	vandps	%zmm6, %zmm0, %zmm1
+	vorps	192+__svml_satan2_data_internal(%rip), %zmm2, %zmm5
+	vpsubd	%zmm9, %zmm2, %zmm10
+	vpsubd	%zmm9, %zmm1, %zmm12
+	vxorps	%zmm2, %zmm8, %zmm7
+	vxorps	%zmm1, %zmm0, %zmm6
+	vcmpps	$17, {sae}, %zmm2, %zmm1, %k1
+	vpcmpgtd	%zmm10, %zmm14, %k2
+	vpcmpgtd	%zmm12, %zmm14, %k3
+	vmovups	576+__svml_satan2_data_internal(%rip), %zmm14
+	vblendmps	%zmm1, %zmm5, %zmm11{%k1}
+	vblendmps	%zmm2, %zmm1, %zmm5{%k1}
+	vxorps	%zmm4, %zmm4, %zmm4{%k1}
+
+/*
+ * Division a/b.
+ * Enabled when FMA is available and
+ * performance is better with NR iteration
+ */
+	vrcp14ps	%zmm5, %zmm15
+	vfnmadd231ps	{rn-sae}, %zmm5, %zmm15, %zmm3
+	vfmadd213ps	{rn-sae}, %zmm15, %zmm3, %zmm15
+	vmulps	{rn-sae}, %zmm15, %zmm11, %zmm3
+	vfnmadd231ps	{rn-sae}, %zmm5, %zmm3, %zmm11
+	vfmadd213ps	{rn-sae}, %zmm3, %zmm11, %zmm15
+	vmovups	448+__svml_satan2_data_internal(%rip), %zmm11
+	vpternlogd	$255, %zmm3, %zmm3, %zmm3
+
+/* Polynomial.  */
+	vmulps	{rn-sae}, %zmm15, %zmm15, %zmm9
+	vpandnd	%zmm10, %zmm10, %zmm13{%k2}
+	vmulps	{rn-sae}, %zmm9, %zmm9, %zmm10
+	vfmadd231ps	{rn-sae}, %zmm10, %zmm11, %zmm14
+	vmovups	640+__svml_satan2_data_internal(%rip), %zmm11
+	vpandnd	%zmm12, %zmm12, %zmm3{%k3}
+	vpord	%zmm3, %zmm13, %zmm3
+	vmovups	704+__svml_satan2_data_internal(%rip), %zmm13
+	vmovups	512+__svml_satan2_data_internal(%rip), %zmm12
+	vptestmd	%zmm3, %zmm3, %k0
+	vfmadd213ps	{rn-sae}, %zmm13, %zmm10, %zmm14
+	vfmadd231ps	{rn-sae}, %zmm10, %zmm12, %zmm11
+	vmovups	768+__svml_satan2_data_internal(%rip), %zmm12
+	vmovups	832+__svml_satan2_data_internal(%rip), %zmm13
+
+/* Special branch for fast (vector) processing of zero arguments */
+	kortestw	%k0, %k0
+	vfmadd213ps	{rn-sae}, %zmm12, %zmm10, %zmm11
+	vmovups	896+__svml_satan2_data_internal(%rip), %zmm12
+	vfmadd213ps	{rn-sae}, %zmm13, %zmm10, %zmm14
+	vmovups	960+__svml_satan2_data_internal(%rip), %zmm13
+	vfmadd213ps	{rn-sae}, %zmm12, %zmm10, %zmm11
+	vfmadd213ps	{rn-sae}, %zmm13, %zmm10, %zmm14
+	vfmadd213ps	{rn-sae}, %zmm14, %zmm9, %zmm11
+
+/* Reconstruction.
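+ * In scalar terms (a sketch of the steps below): with
+ * angle = PIO2 + atan(s) from the reduction above,
+ *   res = copysign (x >= 0.0f ? angle : Pi - angle, y);
+ * the code ORs in the sign of x, adds Pi under the x < 0 mask, and
+ * finally ORs in the sign of y.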
*/ + vfmadd213ps {rn-sae}, %zmm4, %zmm15, %zmm11 + +/* if x<0, sPI = Pi, else sPI =0 */ + vmovups __svml_satan2_data_internal(%rip), %zmm15 + vorps %zmm7, %zmm11, %zmm9 + vcmpps $18, {sae}, %zmm15, %zmm8, %k1 + vmovups 384+__svml_satan2_data_internal(%rip), %zmm11 + vaddps {rn-sae}, %zmm11, %zmm9, %zmm9{%k1} + vorps %zmm6, %zmm9, %zmm10 + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + vmovaps %zmm10, %zmm0 + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + +.LBL_1_4: + vmovups %zmm0, 64(%rsp) + vmovups %zmm8, 128(%rsp) + vmovups %zmm10, 192(%rsp) + je .LBL_1_3 + xorl %eax, %eax + vzeroupper + kmovw %k4, 24(%rsp) + kmovw %k5, 16(%rsp) + kmovw %k6, 8(%rsp) + kmovw %k7, (%rsp) + movq %rsi, 40(%rsp) + movq %rdi, 32(%rsp) + movq %r12, 56(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 48(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $16, %r12d + jl .LBL_1_8 + kmovw 24(%rsp), %k4 + cfi_restore(122) + kmovw 16(%rsp), %k5 + cfi_restore(123) + kmovw 8(%rsp), %k6 + cfi_restore(124) + kmovw (%rsp), %k7 + cfi_restore(125) + vmovups 192(%rsp), %zmm10 + movq 40(%rsp), %rsi + cfi_restore(4) + movq 32(%rsp), %rdi + cfi_restore(5) + movq 56(%rsp), %r12 + cfi_restore(12) + movq 48(%rsp), %r13 + cfi_restore(13) + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_11: + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call __svml_satan2_cout_rare_internal + jmp .LBL_1_9 + cfi_restore(4) + cfi_restore(5) + 
cfi_restore(12) + cfi_restore(13) + cfi_restore(122) + cfi_restore(123) + cfi_restore(124) + cfi_restore(125) + +.LBL_1_12: +/* Check if at least on of Y or Y is zero: iAXAYZERO */ + vmovups __svml_satan2_data_internal(%rip), %zmm9 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmpps $3, {sae}, %zmm8, %zmm8, %k1 + vcmpps $3, {sae}, %zmm0, %zmm0, %k2 + vpcmpd $4, %zmm9, %zmm2, %k3 + vpternlogd $255, %zmm12, %zmm12, %zmm12 + vpternlogd $255, %zmm13, %zmm13, %zmm13 + vpternlogd $255, %zmm14, %zmm14, %zmm14 + vpandnd %zmm8, %zmm8, %zmm12{%k1} + vpcmpd $4, %zmm9, %zmm1, %k1 + vpandnd %zmm0, %zmm0, %zmm13{%k2} + +/* + * Path for zero arguments (at least one of both) + * Check if both args are zeros (den. is zero) + */ + vcmpps $4, {sae}, %zmm9, %zmm5, %k2 + vandps %zmm13, %zmm12, %zmm12 + vpandnd %zmm2, %zmm2, %zmm14{%k3} + vpternlogd $255, %zmm2, %zmm2, %zmm2 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vpcmpgtd %zmm8, %zmm9, %k3 + vpandnd %zmm1, %zmm1, %zmm2{%k1} + vpord %zmm2, %zmm14, %zmm15 + vpternlogd $255, %zmm2, %zmm2, %zmm2 + vpandnd %zmm5, %zmm5, %zmm2{%k2} + +/* Set sPIO2 to zero if den. is zero */ + vpandnd %zmm4, %zmm2, %zmm4 + vpandd %zmm2, %zmm9, %zmm5 + vpord %zmm5, %zmm4, %zmm2 + vorps %zmm7, %zmm2, %zmm7 + vaddps {rn-sae}, %zmm11, %zmm7, %zmm7{%k3} + vorps %zmm6, %zmm7, %zmm6 + +/* Check if at least on of Y or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpandd %zmm12, %zmm15, %zmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandnd %zmm3, %zmm1, %zmm3 + +/* Go to callout */ + vptestmd %zmm3, %zmm3, %k0 + kmovw %k0, %edx + +/* Merge results from main and spec path */ + vpandnd %zmm10, %zmm1, %zmm10 + vpandd %zmm1, %zmm6, %zmm11 + vpord %zmm11, %zmm10, %zmm10 + jmp .LBL_1_2 + +END(_ZGVeN16vv_atan2f_skx) + + .align 16,0x90 + +__svml_satan2_cout_rare_internal: + + cfi_startproc + + pxor %xmm0, %xmm0 + movss (%rdi), %xmm3 + pxor %xmm1, %xmm1 + movss (%rsi), %xmm2 + movq %rdx, %r8 + cvtss2sd %xmm3, %xmm0 + cvtss2sd %xmm2, %xmm1 + movss %xmm3, -32(%rsp) + movss %xmm2, -28(%rsp) + movsd %xmm0, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -30(%rsp), %edi + andl $32640, %edi + movb -25(%rsp), %dl + movzwl -42(%rsp), %eax + andb $-128, %dl + movzwl -34(%rsp), %r9d + andl $32752, %eax + andl $32752, %r9d + shrl $7, %edi + movb -29(%rsp), %cl + shrb $7, %cl + shrb $7, %dl + shrl $4, %eax + shrl $4, %r9d + cmpl $255, %edi + je .LBL_2_35 + movzwl -26(%rsp), %esi + andl $32640, %esi + cmpl $32640, %esi + je .LBL_2_35 + testl %eax, %eax + jne .LBL_2_5 + testl $8388607, -32(%rsp) + je .LBL_2_30 + +.LBL_2_5: + testl %r9d, %r9d + jne .LBL_2_7 + testl $8388607, -28(%rsp) + je .LBL_2_27 + +.LBL_2_7: + negl %r9d + movsd %xmm0, -48(%rsp) + addl %eax, %r9d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r9d + jle .LBL_2_22 + cmpl $54, %r9d + jge .LBL_2_19 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %dl, %dl + jne .LBL_2_11 + movsd 1976+__satan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_12 + +.LBL_2_11: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + +.LBL_2_12: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %eax, %eax + jle .LBL_2_34 + cmpl $2046, %eax + jge .LBL_2_15 + andl $-32753, %esi + addl $-1023, %eax + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_16 + +.LBL_2_15: + movsd 
1992+__satan2_la_CoutTab(%rip), %xmm3 + movl $1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +.LBL_2_16: + negl %eax + movq 1888+__satan2_la_CoutTab(%rip), %rsi + addl $1023, %eax + movq %rsi, -40(%rsp) + andl $2047, %eax + shrq $48, %rsi + shll $4, %eax + andl $-32753, %esi + orl %eax, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm3 + mulsd %xmm3, %xmm2 + comisd 1880+__satan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_18 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + 
movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 
+ addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_18: + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd 
%xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd 
-72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_19: + cmpl $74, %r9d + jge .LBL_2_21 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_21: + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_22: + testb %dl, %dl + jne .LBL_2_32 + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je .LBL_2_25 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp .LBL_2_33 + +.LBL_2_25: + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp .LBL_2_33 + +.LBL_2_27: + testl %eax, %eax + jne .LBL_2_21 + testl $8388607, -32(%rsp) + jne .LBL_2_21 + +.LBL_2_30: + testb %dl, %dl + jne .LBL_2_32 + +.LBL_2_31: + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_32: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + +.LBL_2_33: + xorl %eax, %eax + ret + +.LBL_2_34: + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd 
%xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_16 + +.LBL_2_35: + cmpl $2047, %eax + je .LBL_2_48 + +.LBL_2_36: + cmpl $2047, %r9d + je .LBL_2_46 + +.LBL_2_37: + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne .LBL_2_21 + cmpl $255, %edi + je .LBL_2_43 + testb %dl, %dl + je .LBL_2_31 + jmp .LBL_2_32 + +.LBL_2_43: + testb %dl, %dl + jne .LBL_2_45 + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_45: + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_46: + testl $8388607, -28(%rsp) + je .LBL_2_37 + +.LBL_2_47: + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp .LBL_2_33 + +.LBL_2_48: + testl $8388607, -32(%rsp) + jne .LBL_2_47 + jmp .LBL_2_36 + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 
993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 
3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 
1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 
+ .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __satan2_la_CoutTab,@object + .size __satan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S new file mode 100644 index 0000000000..d1a67facf1 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S @@ -0,0 +1,20 @@ +/* SSE2 version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVbN4vv_atan2f _ZGVbN4vv_atan2f_sse2 +#include "../svml_s_atan2f4_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c new file mode 100644 index 0000000000..ee882b0557 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2f, vector length is 4. 
+ Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVbN4vv_atan2f +#include "ifunc-mathvec-sse4_1.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVbN4vv_atan2f, __GI__ZGVbN4vv_atan2f, + __redirect__ZGVbN4vv_atan2f) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S new file mode 100644 index 0000000000..b75e5be5cd --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S @@ -0,0 +1,2667 @@ +/* Function atan2f4 vectorized with SSE4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0. + * + * + */
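[Editor's note: the reduction described above, as a scalar C sketch for reference. This is illustrative only, not the code the kernel runs; atan() stands in for the Poly11 evaluation, and the vector kernel does all of this branchlessly across the lanes.]

    #include <math.h>

    /* Scalar model of the interval reduction described above.  */
    static float
    atan_reduced (float x)
    {
      double ax = fabs (x), base, s;
      if (ax <= 7.0 / 16.0)
        { base = 0.0; s = ax; }
      else if (ax <= 11.0 / 16.0)
        { base = atan (0.5); s = (ax - 0.5) / (1.0 + 0.5 * ax); }
      else if (ax <= 19.0 / 16.0)
        { base = atan (1.0); s = (ax - 1.0) / (1.0 + ax); }
      else if (ax <= 39.0 / 16.0)
        { base = atan (1.5); s = (ax - 1.5) / (1.0 + 1.5 * ax); }
      else
        { base = M_PI_2; s = -1.0 / ax; }
      return copysign (base + atan (s), x); /* atan(s) ~= s + s^3*Poly11(s^2) */
    }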
+ * + * + */ + +#include + + .text +ENTRY(_ZGVbN4vv_atan2f_sse4) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $256, %rsp + xorl %edx, %edx + movups %xmm9, 176(%rsp) + movups %xmm11, 112(%rsp) + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + movaps %xmm0, %xmm11 + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Arguments signs + */ + movups 256+__svml_satan2_data_internal(%rip), %xmm9 + movups %xmm12, 96(%rsp) + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + movaps %xmm1, %xmm12 + movups %xmm10, 144(%rsp) + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + movaps %xmm9, %xmm10 + andps %xmm11, %xmm9 + andps %xmm12, %xmm10 + movaps %xmm9, %xmm6 + movaps %xmm9, %xmm4 + cmpltps %xmm10, %xmm6 + +/* + * 1) If yx then a=-x, b=y, PIO2=Pi/2 + */ + movups 192+__svml_satan2_data_internal(%rip), %xmm5 + movaps %xmm6, %xmm0 + orps %xmm10, %xmm5 + movaps %xmm10, %xmm1 + andnps %xmm5, %xmm0 + movaps %xmm6, %xmm5 + andps %xmm6, %xmm4 + andnps %xmm9, %xmm5 + andps %xmm6, %xmm1 + orps %xmm4, %xmm0 + orps %xmm1, %xmm5 + movaps %xmm9, %xmm3 + +/* Division a/b. */ + divps %xmm5, %xmm0 + movups %xmm13, 80(%rsp) + +/* if x<0, sPI = Pi, else sPI =0 */ + movaps %xmm12, %xmm4 + movups %xmm14, 48(%rsp) + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + movaps %xmm10, %xmm14 + +/* Testing on working interval. */ + movdqu 1024+__svml_satan2_data_internal(%rip), %xmm13 + movaps %xmm9, %xmm7 + psubd %xmm13, %xmm14 + psubd %xmm13, %xmm3 + movdqu 1088+__svml_satan2_data_internal(%rip), %xmm2 + movdqa %xmm14, %xmm1 + movdqa %xmm3, %xmm13 + pcmpgtd %xmm2, %xmm1 + pcmpeqd %xmm2, %xmm14 + pcmpgtd %xmm2, %xmm13 + pcmpeqd %xmm2, %xmm3 + por %xmm14, %xmm1 + por %xmm3, %xmm13 + pxor %xmm11, %xmm7 + por %xmm13, %xmm1 + +/* Polynomial. */ + movaps %xmm0, %xmm13 + mulps %xmm0, %xmm13 + cmpleps __svml_satan2_data_internal(%rip), %xmm4 + movmskps %xmm1, %eax + movaps %xmm13, %xmm14 + mulps %xmm13, %xmm14 + movups 448+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + movups 512+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 576+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + addps 640+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 704+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + addps 768+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 832+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm2, %xmm14 + addps 896+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm3, %xmm13 + addps 960+__svml_satan2_data_internal(%rip), %xmm14 + andnps 320+__svml_satan2_data_internal(%rip), %xmm6 + addps %xmm13, %xmm14 + +/* Reconstruction. 
+ mulps %xmm14, %xmm0 + movups %xmm8, 160(%rsp) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + movaps %xmm10, %xmm8 + pxor %xmm12, %xmm8 + addps %xmm6, %xmm0 + andps 384+__svml_satan2_data_internal(%rip), %xmm4 + orps %xmm8, %xmm0 + addps %xmm4, %xmm0 + orps %xmm7, %xmm0 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + movups 160(%rsp), %xmm8 + cfi_restore(25) + movups 176(%rsp), %xmm9 + cfi_restore(26) + movups 144(%rsp), %xmm10 + cfi_restore(27) + movups 112(%rsp), %xmm11 + cfi_restore(28) + movups 96(%rsp), %xmm12 + cfi_restore(29) + movups 80(%rsp), %xmm13 + cfi_restore(30) + movups 48(%rsp), %xmm14 + cfi_restore(31) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_4: + movups %xmm11, 64(%rsp) + movups %xmm12, 128(%rsp) + movups %xmm0, 192(%rsp) + je .LBL_1_3 + xorl %eax, %eax + movups %xmm15, (%rsp) + movq %rsi, 24(%rsp) + movq %rdi, 16(%rsp) + movq %r12, 40(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 32(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $4, %r12d + jl .LBL_1_8 + movups (%rsp), %xmm15 + cfi_restore(32) + movq 24(%rsp), %rsi + cfi_restore(4) + movq 16(%rsp), %rdi + cfi_restore(5) + movq 40(%rsp), %r12 + cfi_restore(12) + movq 32(%rsp), %r13 + cfi_restore(13) + movups 192(%rsp), %xmm0 + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22
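[Editor's note: the .LBL_1_11 stub below services the special-case mask one lane at a time; the inputs and the vector result were spilled to the stack above. Roughly, in C, with invented names (the real code calls __svml_satan2_cout_rare_internal on the spilled slots):]

    /* Hypothetical model of the .LBL_1_8/.LBL_1_11 loop.  */
    extern void scalar_atan2f_slot (const float *y, const float *x, float *r);

    static void
    fixup_special_lanes (const float *y, const float *x, float *r, int mask)
    {
      for (int lane = 0; lane < 4; lane++)
        if (mask & (1 << lane))
          scalar_atan2f_slot (&y[lane], &x[lane], &r[lane]);
    }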
+ +.LBL_1_11: + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call __svml_satan2_cout_rare_internal + jmp .LBL_1_9 + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + cfi_restore(32) + +.LBL_1_12: +/* Check if both X & Y are not NaNs: iXYnotNAN */ + movaps %xmm12, %xmm3 + movaps %xmm11, %xmm2 + cmpordps %xmm12, %xmm3 + cmpordps %xmm11, %xmm2 + +/* Check if at least one of X or Y is zero: iAXAYZERO */ + movups __svml_satan2_data_internal(%rip), %xmm13 + andps %xmm2, %xmm3 + +/* + * Path for zero arguments (at least one of both) + * Check if both args are zeros (den. is zero) + */ + cmpeqps %xmm13, %xmm5 + pcmpeqd %xmm13, %xmm10 + pcmpeqd %xmm13, %xmm9 + por %xmm9, %xmm10 + +/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + andps %xmm3, %xmm10 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + movaps %xmm10, %xmm9 + pandn %xmm1, %xmm9 + +/* Set sPIO2 to zero if den. is zero */ + movaps %xmm5, %xmm1 + andnps %xmm6, %xmm1 + andps %xmm13, %xmm5 + orps %xmm5, %xmm1 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + pcmpgtd %xmm12, %xmm13 + orps %xmm8, %xmm1 + andps %xmm4, %xmm13 + +/* Merge results from main and spec path */ + movaps %xmm10, %xmm4 + addps %xmm13, %xmm1 + +/* Go to callout */ + movmskps %xmm9, %edx + orps %xmm7, %xmm1 + andnps %xmm0, %xmm4 + andps %xmm10, %xmm1 + movaps %xmm4, %xmm0 + orps %xmm1, %xmm0 + jmp .LBL_1_2 + +END(_ZGVbN4vv_atan2f_sse4) + + .align 16,0x90 + +__svml_satan2_cout_rare_internal: + + cfi_startproc + + pxor %xmm0, %xmm0 + movss (%rdi), %xmm3 + pxor %xmm1, %xmm1 + movss (%rsi), %xmm2 + movq %rdx, %r8 + cvtss2sd %xmm3, %xmm0 + cvtss2sd %xmm2, %xmm1 + movss %xmm3, -32(%rsp) + movss %xmm2, -28(%rsp) + movsd %xmm0, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -30(%rsp), %edi + andl $32640, %edi + movb -25(%rsp), %dl + movzwl -42(%rsp), %eax + andb $-128, %dl + movzwl -34(%rsp), %r9d + andl $32752, %eax + andl $32752, %r9d + shrl $7, %edi + movb -29(%rsp), %cl + shrb $7, %cl + shrb $7, %dl + shrl $4, %eax + shrl $4, %r9d + cmpl $255, %edi + je .LBL_2_35 + movzwl -26(%rsp), %esi + andl $32640, %esi + cmpl $32640, %esi + je .LBL_2_35 + testl %eax, %eax + jne .LBL_2_5 + testl $8388607, -32(%rsp) + je .LBL_2_30 + +.LBL_2_5: + testl %r9d, %r9d + jne .LBL_2_7 + testl $8388607, -28(%rsp) + je .LBL_2_27 + +.LBL_2_7: + negl %r9d + movsd %xmm0, -48(%rsp) + addl %eax, %r9d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r9d + jle .LBL_2_22 + cmpl $54, %r9d + jge .LBL_2_19 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %dl, %dl + jne .LBL_2_11 + movsd 1976+__satan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_12 + +.LBL_2_11: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + +.LBL_2_12: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %eax, %eax + jle .LBL_2_34 + cmpl $2046, %eax + jge .LBL_2_15 + andl $-32753, %esi + addl $-1023, %eax + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_16 + +.LBL_2_15: + movsd 1992+__satan2_la_CoutTab(%rip), %xmm3 + movl $1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +.LBL_2_16: + negl %eax + movq 1888+__satan2_la_CoutTab(%rip), %rsi + addl $1023, %eax + movq %rsi, -40(%rsp) + andl $2047, %eax + shrq
$48, %rsi + shll $4, %eax + andl $-32753, %esi + orl %eax, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm3 + mulsd %xmm3, %xmm2 + comisd 1880+__satan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_18 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 
2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, 
%xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_18: + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), 
%xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd 
%xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_19: + cmpl $74, %r9d + jge .LBL_2_21 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_21: + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_22: + testb %dl, %dl + jne .LBL_2_32 + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je .LBL_2_25 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp .LBL_2_33 + +.LBL_2_25: + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp .LBL_2_33 + +.LBL_2_27: + testl %eax, %eax + jne .LBL_2_21 + testl $8388607, -32(%rsp) + jne .LBL_2_21 + +.LBL_2_30: + testb %dl, %dl + jne .LBL_2_32 + +.LBL_2_31: + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_32: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + +.LBL_2_33: + xorl %eax, %eax + ret + +.LBL_2_34: + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_16 + +.LBL_2_35: + cmpl $2047, %eax + je .LBL_2_48 + +.LBL_2_36: + cmpl $2047, %r9d + je .LBL_2_46 + +.LBL_2_37: + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne 
.LBL_2_21 + cmpl $255, %edi + je .LBL_2_43 + testb %dl, %dl + je .LBL_2_31 + jmp .LBL_2_32 + +.LBL_2_43: + testb %dl, %dl + jne .LBL_2_45 + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_45: + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_46: + testl $8388607, -28(%rsp) + je .LBL_2_37 + +.LBL_2_47: + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp .LBL_2_33 + +.LBL_2_48: + testl $8388607, -32(%rsp) + jne .LBL_2_47 + jmp .LBL_2_36 + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 
1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + 
.long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __satan2_la_CoutTab,@object + .size __satan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S new file mode 100644 index 0000000000..21b1d3ff63 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S @@ -0,0 +1,20 @@ +/* SSE version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define _ZGVdN8vv_atan2f _ZGVdN8vv_atan2f_sse_wrapper +#include "../svml_s_atan2f8_core.S"
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c new file mode 100644 index 0000000000..7e02050983 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2f, vector length is 8. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVdN8vv_atan2f +#include "ifunc-mathvec-avx2.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVdN8vv_atan2f, __GI__ZGVdN8vv_atan2f, + __redirect__ZGVdN8vv_atan2f) + __attribute__ ((visibility ("hidden"))); +#endif
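[Reviewer note] The ifunc file above picks an ISA-specific kernel at load time; the AVX2 kernel added next states its reduction scheme in the ALGORITHM DESCRIPTION comment below. As a reading aid, here is a scalar C sketch of that scheme. Everything in it is illustrative: the function name and the short polynomial are this note's assumptions (the kernel evaluates a longer, table-driven polynomial), and the inputs that the kernel masks out and sends to __svml_satan2_cout_rare_internal are only hinted at in comments.

  #include <math.h>

  /* Scalar model of the vector main path (a sketch, not glibc's code).
     Reduce atan2f (y, x) to one odd-polynomial evaluation on |s| <= 1
     plus quadrant fixups.  */
  static float
  atan2f_sketch (float y, float x)
  {
    float ay = fabsf (y), ax = fabsf (x);
    /* 1) If |y| < |x| then a = |y|, b = |x|, pio2 = 0.
       2) Otherwise       a = -|x|, b = |y|, pio2 = Pi/2.  */
    int swap = ay >= ax;
    float a = swap ? -ax : ay;
    float b = swap ? ay : ax;
    float pio2 = swap ? (float) M_PI_2 : 0.0f;
    float s = a / b;            /* 0/0 etc. take the rare path instead.  */
    float s2 = s * s;
    /* atan (s) ~ s + s^3 * P (s^2); placeholder minimax coefficients,
       shorter than the kernel's table entries.  */
    float p = -0.33331018f + s2 * (0.19989781f - s2 * 0.14208898f);
    float r = pio2 + (s + s * s2 * p);
    if (x < 0.0f)               /* reflect into the left half-plane */
      r = (float) M_PI - r;
    return copysignf (r, y);    /* atan2 is odd in y */
  }

In the assembly the same structure appears as the vblendvps swap, a single vdivps, the vfmadd213ps chain, and the sign/Pi fixups; lanes whose biased exponents fall outside the tested working interval set bits in the callout mask and are redone by the scalar routine.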
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S new file mode 100644 index 0000000000..b979376e54 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S @@ -0,0 +1,2412 @@ +/* Function atan2f vectorized with AVX2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ +
+/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0. + * + * + */ + +#include <sysdep.h> + + .text +ENTRY(_ZGVdN8vv_atan2f_avx2) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $384, %rsp + xorl %edx, %edx +
+/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Arguments signs + */ + vmovups 256+__svml_satan2_data_internal(%rip), %ymm2 + vmovups %ymm13, 288(%rsp) + vmovups %ymm12, 256(%rsp) + vmovups %ymm15, 352(%rsp) + vmovups %ymm14, 320(%rsp) + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 +
+/* Testing on working interval. */ + vmovups 1024+__svml_satan2_data_internal(%rip), %ymm15 + vmovups %ymm11, 224(%rsp) + vmovups %ymm9, 96(%rsp) + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + vmovups 1088+__svml_satan2_data_internal(%rip), %ymm9 + vmovups %ymm10, 160(%rsp) + vmovups %ymm8, 32(%rsp) +
+/* if x<0, sPI = Pi, else sPI =0 */ + vmovups __svml_satan2_data_internal(%rip), %ymm5 + vmovaps %ymm1, %ymm7 + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + vandps %ymm2, %ymm7, %ymm13 + vandps %ymm2, %ymm0, %ymm12 + vcmplt_oqps %ymm13, %ymm12, %ymm4 + vcmple_oqps %ymm5, %ymm7, %ymm6 + vpsubd %ymm15, %ymm13, %ymm10 + vpsubd %ymm15, %ymm12, %ymm8 +
+/* + * 1) If y<x then a=y, b=x, PIO2=0 + * 2) If y>x then a=-x, b=y, PIO2=Pi/2 + */ + vorps 192+__svml_satan2_data_internal(%rip), %ymm13, %ymm3 + vblendvps %ymm4, %ymm12, %ymm3, %ymm14 + vblendvps %ymm4, %ymm13, %ymm12, %ymm3 +
+/* Division a/b. */ + vdivps %ymm3, %ymm14, %ymm11 + vpcmpgtd %ymm9, %ymm10, %ymm14 + vpcmpeqd %ymm9, %ymm10, %ymm15 + vpor %ymm15, %ymm14, %ymm10 + vmovups 512+__svml_satan2_data_internal(%rip), %ymm15 + vpcmpgtd %ymm9, %ymm8, %ymm14 + vpcmpeqd %ymm9, %ymm8, %ymm8 + vpor %ymm8, %ymm14, %ymm9 + vmovups 448+__svml_satan2_data_internal(%rip), %ymm14 + vpor %ymm9, %ymm10, %ymm10 +
+/* Polynomial.
*/ + vmulps %ymm11, %ymm11, %ymm9 + vmulps %ymm9, %ymm9, %ymm8 + vfmadd213ps 576+__svml_satan2_data_internal(%rip), %ymm8, %ymm14 + vfmadd213ps 640+__svml_satan2_data_internal(%rip), %ymm8, %ymm15 + vfmadd213ps 704+__svml_satan2_data_internal(%rip), %ymm8, %ymm14 + vfmadd213ps 768+__svml_satan2_data_internal(%rip), %ymm8, %ymm15 + vfmadd213ps 832+__svml_satan2_data_internal(%rip), %ymm8, %ymm14 + vfmadd213ps 896+__svml_satan2_data_internal(%rip), %ymm8, %ymm15 + vfmadd213ps 960+__svml_satan2_data_internal(%rip), %ymm8, %ymm14 + vfmadd213ps %ymm14, %ymm9, %ymm15 + vandnps 320+__svml_satan2_data_internal(%rip), %ymm4, %ymm4 + +/* Reconstruction. */ + vfmadd213ps %ymm4, %ymm11, %ymm15 + vxorps %ymm13, %ymm7, %ymm1 + vandps 384+__svml_satan2_data_internal(%rip), %ymm6, %ymm6 + vorps %ymm1, %ymm15, %ymm11 + vaddps %ymm11, %ymm6, %ymm8 + vmovmskps %ymm10, %eax + vxorps %ymm12, %ymm0, %ymm2 + vorps %ymm2, %ymm8, %ymm9 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne .LBL_1_12 + +.LBL_1_2: +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne .LBL_1_4 + +.LBL_1_3: + vmovaps %ymm9, %ymm0 + vmovups 32(%rsp), %ymm8 + cfi_restore(91) + vmovups 96(%rsp), %ymm9 + cfi_restore(92) + vmovups 160(%rsp), %ymm10 + cfi_restore(93) + vmovups 224(%rsp), %ymm11 + cfi_restore(94) + vmovups 256(%rsp), %ymm12 + cfi_restore(95) + vmovups 288(%rsp), %ymm13 + cfi_restore(96) + vmovups 320(%rsp), %ymm14 + cfi_restore(97) + vmovups 352(%rsp), %ymm15 + cfi_restore(98) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + +.LBL_1_4: + vmovups %ymm0, 64(%rsp) + vmovups %ymm7, 128(%rsp) + vmovups %ymm9, 192(%rsp) + je .LBL_1_3 + xorl %eax, %eax + vzeroupper + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + movl %edx, %r13d + +.LBL_1_8: + btl %r12d, %r13d + jc .LBL_1_11 + +.LBL_1_9: + incl %r12d + cmpl $8, %r12d + jl .LBL_1_8 + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + 
cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + vmovups 192(%rsp), %ymm9 + jmp .LBL_1_3 + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 +
+.LBL_1_11: + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call __svml_satan2_cout_rare_internal + jmp .LBL_1_9 + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) +
+.LBL_1_12: +/* Check if at least one of X or Y is zero: iAXAYZERO */ + vpcmpeqd %ymm5, %ymm13, %ymm13 + vpcmpeqd %ymm5, %ymm12, %ymm12 +
+/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmpordps %ymm7, %ymm7, %ymm11 + vcmpordps %ymm0, %ymm0, %ymm14 +
+/* + * Path for zero arguments (at least one of the two) + * Check if both args are zeros (den. is zero) + */ + vcmpeqps %ymm5, %ymm3, %ymm3 + vpor %ymm12, %ymm13, %ymm15 +
+/* Set sPIO2 to zero if den. is zero */ + vblendvps %ymm3, %ymm5, %ymm4, %ymm4 + vandps %ymm14, %ymm11, %ymm8 +
+/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpand %ymm8, %ymm15, %ymm8 +
+/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vpcmpgtd %ymm7, %ymm5, %ymm5 + vorps %ymm1, %ymm4, %ymm1 + vandps %ymm6, %ymm5, %ymm6 + vaddps %ymm6, %ymm1, %ymm1 +
+/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandn %ymm10, %ymm8, %ymm10 + vorps %ymm2, %ymm1, %ymm2 +
+/* Go to callout */ + vmovmskps %ymm10, %edx +
+/* Merge results from main and spec path */ + vblendvps %ymm8, %ymm2, %ymm9, %ymm9 + jmp .LBL_1_2 + +END(_ZGVdN8vv_atan2f_avx2) + + .align 16,0x90 + +__svml_satan2_cout_rare_internal: + + cfi_startproc + + pxor %xmm0, %xmm0 + movss (%rdi), %xmm3 + pxor %xmm1, %xmm1 + movss (%rsi), %xmm2 + movq %rdx, %r8 + cvtss2sd %xmm3, %xmm0 + cvtss2sd %xmm2, %xmm1 + movss %xmm3, -32(%rsp) + movss %xmm2, -28(%rsp) + movsd %xmm0, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -30(%rsp), %edi + andl $32640, %edi + movb -25(%rsp), %dl + movzwl -42(%rsp), %eax + andb $-128, %dl + movzwl -34(%rsp), %r9d + andl $32752, %eax + andl $32752, %r9d + shrl $7, %edi + movb -29(%rsp), %cl + shrb $7, %cl + shrb $7, %dl + shrl $4, %eax + shrl $4, %r9d + cmpl $255, %edi + je .LBL_2_35 + movzwl -26(%rsp), %esi + andl $32640, %esi + cmpl $32640, %esi + je .LBL_2_35 + testl %eax, %eax + jne .LBL_2_5 + testl $8388607, -32(%rsp) + je .LBL_2_30 +
+.LBL_2_5: + testl %r9d, %r9d + jne .LBL_2_7 + testl $8388607, -28(%rsp) + je .LBL_2_27 +
+.LBL_2_7: + negl %r9d + movsd %xmm0, -48(%rsp) + addl %eax, %r9d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r9d + jle .LBL_2_22 + cmpl $54, %r9d + jge .LBL_2_19 + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %dl, %dl + jne .LBL_2_11 + movsd 1976+__satan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp .LBL_2_12 +
+.LBL_2_11: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__satan2_la_CoutTab(%rip), %xmm0 +
+.LBL_2_12: + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %eax, %eax + jle .LBL_2_34 + cmpl $2046, %eax + jge
.LBL_2_15 + andl $-32753, %esi + addl $-1023, %eax + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp .LBL_2_16 + +.LBL_2_15: + movsd 1992+__satan2_la_CoutTab(%rip), %xmm3 + movl $1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +.LBL_2_16: + negl %eax + movq 1888+__satan2_la_CoutTab(%rip), %rsi + addl $1023, %eax + movq %rsi, -40(%rsp) + andl $2047, %eax + shrq $48, %rsi + shll $4, %eax + andl $-32753, %esi + orl %eax, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm3 + mulsd %xmm3, %xmm2 + comisd 1880+__satan2_la_CoutTab(%rip), %xmm5 + jb .LBL_2_18 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, 
%xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), 
%xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_18: + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, 
-72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, 
-33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_19: + cmpl $74, %r9d + jge .LBL_2_21 + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_21: + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_22: + testb %dl, %dl + jne .LBL_2_32 + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je .LBL_2_25 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp .LBL_2_33 + +.LBL_2_25: + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp .LBL_2_33 + +.LBL_2_27: + testl %eax, %eax + jne .LBL_2_21 + testl $8388607, -32(%rsp) + jne .LBL_2_21 + +.LBL_2_30: + testb %dl, %dl + jne .LBL_2_32 + +.LBL_2_31: + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp .LBL_2_33 + +.LBL_2_32: + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, 
%xmm1 + movss %xmm1, (%r8) + +.LBL_2_33: + xorl %eax, %eax + ret + +.LBL_2_34: + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp .LBL_2_16 + +.LBL_2_35: + cmpl $2047, %eax + je .LBL_2_48 + +.LBL_2_36: + cmpl $2047, %r9d + je .LBL_2_46 + +.LBL_2_37: + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne .LBL_2_21 + cmpl $255, %edi + je .LBL_2_43 + testb %dl, %dl + je .LBL_2_31 + jmp .LBL_2_32 + +.LBL_2_43: + testb %dl, %dl + jne .LBL_2_45 + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_45: + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp .LBL_2_33 + +.LBL_2_46: + testl $8388607, -28(%rsp) + je .LBL_2_37 + +.LBL_2_47: + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp .LBL_2_33 + +.LBL_2_48: + testl $8388607, -32(%rsp) + jne .LBL_2_47 + jmp .LBL_2_36 + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 
+ .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 
0 + .byte 0 + .byte 0 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 
192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 
3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 
3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __satan2_la_CoutTab,@object + .size __satan2_la_CoutTab,2008
diff --git a/sysdeps/x86_64/fpu/svml_d_atan22_core.S b/sysdeps/x86_64/fpu/svml_d_atan22_core.S new file mode 100644 index 0000000000..f3089e70f9 --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_d_atan22_core.S @@ -0,0 +1,29 @@ +/* Function atan2 vectorized with SSE2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_d_wrapper_impl.h" + + .text +ENTRY (_ZGVbN2vv_atan2) +WRAPPER_IMPL_SSE2_ff atan2 +END (_ZGVbN2vv_atan2) + +#ifndef USE_MULTIARCH + libmvec_hidden_def (_ZGVbN2vv_atan2) +#endif
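[Reviewer note] The SSE2 entry point just added is not a hand-written kernel: WRAPPER_IMPL_SSE2_ff expands to code that applies the scalar libm function to each lane. A conceptual C model follows (the function name is invented here, and the real expansion is assembly in svml_d_wrapper_impl.h):

  #include <emmintrin.h>
  #include <math.h>

  /* Conceptual equivalent of WRAPPER_IMPL_SSE2_ff atan2: call scalar
     atan2 on each of the two double lanes.  */
  __m128d
  vbn2vv_atan2_model (__m128d vy, __m128d vx)
  {
    double y[2], x[2], r[2];
    _mm_storeu_pd (y, vy);
    _mm_storeu_pd (x, vx);
    for (int i = 0; i < 2; i++)
      r[i] = atan2 (y[i], x[i]);
    return _mm_loadu_pd (r);
  }

This keeps the vector ABI entry point available on every x86-64 CPU, while the multiarch kernels earlier in the patch supply the fast paths.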
diff --git a/sysdeps/x86_64/fpu/svml_d_atan24_core.S b/sysdeps/x86_64/fpu/svml_d_atan24_core.S new file mode 100644 index 0000000000..8a163d12d2 --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_d_atan24_core.S @@ -0,0 +1,29 @@ +/* Function atan2 vectorized with AVX2, wrapper version. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_d_wrapper_impl.h" + + .text +ENTRY (_ZGVdN4vv_atan2) +WRAPPER_IMPL_AVX_ff _ZGVbN2vv_atan2 +END (_ZGVdN4vv_atan2) + +#ifndef USE_MULTIARCH + libmvec_hidden_def (_ZGVdN4vv_atan2) +#endif
diff --git a/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S b/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S new file mode 100644 index 0000000000..0ee5ae8faf --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S @@ -0,0 +1,25 @@ +/* Function atan2 vectorized in AVX ISA as wrapper to SSE4 ISA version. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_d_wrapper_impl.h" + + .text +ENTRY (_ZGVcN4vv_atan2) +WRAPPER_IMPL_AVX_ff _ZGVbN2vv_atan2 +END (_ZGVcN4vv_atan2)
diff --git a/sysdeps/x86_64/fpu/svml_d_atan28_core.S b/sysdeps/x86_64/fpu/svml_d_atan28_core.S new file mode 100644 index 0000000000..b85f696686 --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_d_atan28_core.S @@ -0,0 +1,25 @@ +/* Function atan2 vectorized with AVX-512. Wrapper to AVX2 version. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_d_wrapper_impl.h" + + .text +ENTRY (_ZGVeN8vv_atan2) +WRAPPER_IMPL_AVX512_ff _ZGVdN4vv_atan2 +END (_ZGVeN8vv_atan2)
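[Reviewer note] The wider double-precision wrappers reuse narrower kernels instead of duplicating them: WRAPPER_IMPL_AVX_ff runs the 2-lane SSE variant on each 128-bit half, and WRAPPER_IMPL_AVX512_ff, as in _ZGVeN8vv_atan2 above, runs the 4-lane AVX2 variant on each 256-bit half. A sketch of the AVX-512 case in illustrative C (the wrapper itself is assembly, and the model function name is invented here):

  #include <immintrin.h>

  /* The narrower kernel being reused; exported by libmvec.  */
  extern __m256d _ZGVdN4vv_atan2 (__m256d, __m256d);

  /* Conceptual model of WRAPPER_IMPL_AVX512_ff _ZGVdN4vv_atan2: split
     the 512-bit inputs, run the AVX2 kernel twice, reassemble.  */
  __m512d
  ven8vv_atan2_model (__m512d vy, __m512d vx)
  {
    __m256d lo = _ZGVdN4vv_atan2 (_mm512_castpd512_pd256 (vy),
                                  _mm512_castpd512_pd256 (vx));
    __m256d hi = _ZGVdN4vv_atan2 (_mm512_extractf64x4_pd (vy, 1),
                                  _mm512_extractf64x4_pd (vx, 1));
    return _mm512_insertf64x4 (_mm512_castpd256_pd512 (lo), hi, 1);
  }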
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S new file mode 100644 index 0000000000..25acb31dfb --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S @@ -0,0 +1,25 @@ +/* Function atan2f vectorized with AVX-512. Wrapper to AVX2 version. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_s_wrapper_impl.h" + + .text +ENTRY (_ZGVeN16vv_atan2f) +WRAPPER_IMPL_AVX512_ff _ZGVdN8vv_atan2f +END (_ZGVeN16vv_atan2f)
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S new file mode 100644 index 0000000000..bc99f0ba10 --- /dev/null +++ b/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S @@ -0,0 +1,29 @@ +/* Function atan2f vectorized with SSE2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> +#include "svml_s_wrapper_impl.h" + + .text +ENTRY (_ZGVbN4vv_atan2f) +WRAPPER_IMPL_SSE2_ff atan2f +END (_ZGVbN4vv_atan2f) + +#ifndef USE_MULTIARCH + libmvec_hidden_def (_ZGVbN4vv_atan2f) +#endif
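[Reviewer note] The single-precision wrappers mirror the double-precision ones, and the symbol names encode the x86-64 vector ABI: in _ZGVbN4vv_atan2f, 'b' is the SSE ISA class ('c' AVX, 'd' AVX2, 'e' AVX-512), 'N' means unmasked, the digit is the lane count, and 'vv' marks two vector parameters. User code normally reaches these entry points through compiler auto-vectorization rather than by name. A minimal example follows; the flags and ISA choice are illustrative, and whether the call is vectorized depends on the compiler honoring the SIMD declarations in glibc's math-vector.h:

  #include <math.h>

  /* With e.g. "gcc -O3 -march=haswell -ffast-math phases.c -lmvec"
     this loop may compile to calls of _ZGVdN8vv_atan2f.  */
  void
  phases (float *restrict out, const float *y, const float *x, int n)
  {
    for (int i = 0; i < n; i++)
      out[i] = atan2f (y[i], x[i]);
  }

The test-vector-abi and vlen-wrapper additions later in the patch exercise exactly these entry points.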
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S
new file mode 100644
index 0000000000..bfcdb3c372
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S
@@ -0,0 +1,29 @@
+/* Function atan2f vectorized with AVX2, wrapper version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVdN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END (_ZGVdN8vv_atan2f)
+
+#ifndef USE_MULTIARCH
+ libmvec_hidden_def (_ZGVdN8vv_atan2f)
+#endif
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
new file mode 100644
index 0000000000..1aa8d05822
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
@@ -0,0 +1,25 @@
+/* Function atan2f vectorized in AVX ISA as wrapper to SSE4 ISA version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVcN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END (_ZGVcN8vv_atan2f)
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c
new file mode 100644
index 0000000000..d0aa626d95
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c
@@ -0,0 +1,3 @@
+#define LIBMVEC_TYPE double
+#define LIBMVEC_FUNC atan2
+#include "test-vector-abi-arg2.h"
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
index 7abe3211c8..cd802e0c6d 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVbN2v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVbN2v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVbN2v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVbN2v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVbN2vv_atan2)
 
 #define VEC_INT_TYPE __m128i
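The LIBMVEC_TYPE/LIBMVEC_FUNC stubs above feed the generic two-argument vector-ABI test. In user code, the usual route to these symbols is compiler auto-vectorization once the declarations this patch adds to bits/math-vector.h are visible; for example, GCC may turn the loop below into calls such as _ZGVdN4vv_atan2 when built with -O2 -ffast-math -fopenmp-simd -mavx2 (a sketch; exact code generation is compiler- and flag-dependent):

#include <math.h>

void
compute_angles (double *restrict r, const double *restrict y,
                const double *restrict x, int n)
{
  /* With SIMD declarations for atan2 in scope, this loop is a
     candidate for vectorized libm calls.  */
  #pragma omp simd
  for (int i = 0; i < n; i++)
    r[i] = atan2 (y[i], x[i]);
}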
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
index 1537ed25cc..a04980e87a 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
@@ -35,6 +35,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVdN4v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVdN4v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVdN4v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVdN4v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVdN4vv_atan2)
 
 #ifndef __ILP32__
 # define VEC_INT_TYPE __m256i
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
index 27bcc9c59a..9c602445e7 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVcN4v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVcN4v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVcN4v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVcN4v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVcN4vv_atan2)
 
 #define VEC_INT_TYPE __m128i
diff --git a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
index 2333349893..d1e4b8dd01 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVeN8v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVeN8v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVeN8v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVeN8v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVeN8vv_atan2)
 
 #ifndef __ILP32__
 # define VEC_INT_TYPE __m512i
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c
new file mode 100644
index 0000000000..beb5c745cb
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c
@@ -0,0 +1,3 @@
+#define LIBMVEC_TYPE float
+#define LIBMVEC_FUNC atan2f
+#include "test-vector-abi-arg2.h"
diff --git a/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
index 723651140e..65e0c2af7d 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVeN16v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVeN16v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVeN16v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVeN16v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVeN16vv_atan2f)
 
 #define VEC_INT_TYPE __m512i
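The VECTOR_WRAPPER_ff additions register the two-argument variants with the vector test harness. Roughly, the macro broadcasts both scalar inputs across a vector, calls the mangled function, and hands a scalar back so the ordinary accuracy tests can score the vector path; a simplified model (assuming it mirrors the existing one-argument form; the names here are illustrative, not the harness itself):

/* Simplified model of VECTOR_WRAPPER_ff for the vlen2 double case.  */
typedef double v2df __attribute__ ((vector_size (16)));

extern v2df _ZGVbN2vv_atan2 (v2df, v2df);

static double
wrapper_atan2 (double y, double x)
{
  v2df my = { y, y };
  v2df mx = { x, x };
  v2df mr = _ZGVbN2vv_atan2 (my, mx);
  return mr[0];  /* the real harness also checks the remaining lanes */
}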
diff --git a/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
index da77149021..b0cad1e107 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVbN4v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVbN4v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVbN4v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVbN4v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVbN4vv_atan2f)
 
 #define VEC_INT_TYPE __m128i
diff --git a/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
index a978f37e79..359aa445ba 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
@@ -35,6 +35,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVdN8v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVdN8v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVdN8v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVdN8v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVdN8vv_atan2f)
 
 /* Redefinition of wrapper to be compatible with _ZGVdN8vvv_sincosf.  */
 #undef VECTOR_WRAPPER_fFF
diff --git a/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
index 1ae9a8c3c0..80730777fc 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVcN8v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVcN8v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVcN8v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVcN8v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVcN8vv_atan2f)
 
 #define VEC_INT_TYPE __m128i
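For readers decoding the symbol names added throughout this patch, the mangling breaks down as follows (summarized from the x86_64 vector function ABI):

  _ZGV <isa> <mask> <len> <parms> _ <scalar-name>
    isa:    b = SSE (xmm), c = AVX (ymm), d = AVX2 (ymm), e = AVX-512 (zmm)
    mask:   N = unmasked variant
    len:    lanes per call (2/4/8 doubles, 4/8/16 floats)
    parms:  one 'v' per vector argument, hence "vv" for atan2 (y, x)

So, for example, _ZGVeN16vv_atan2f is the unmasked AVX-512 atan2f operating on sixteen floats per call.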