aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines

Message ID 20231127170255.52890-1-Joe.Ramsay@arm.com
State Committed
Commit 7b12776584c51dbecb1033e107f6b9f45de47a1b
Series aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines

Checks

Context                                           Check    Description
redhat-pt-bot/TryBot-apply_patch                  success  Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit                        success  Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64  success  Testing passed

Commit Message

Joe Ramsay Nov. 27, 2023, 5:02 p.m. UTC
Avoids emitting many saves/restores of vector registers and reduces the
amount of code generated around the scalar fallback.
---
Thanks,
Joe
 sysdeps/aarch64/fpu/v_math.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
  

Comments

Szabolcs Nagy Nov. 29, 2023, 3:04 p.m. UTC | #1
The 11/27/2023 17:02, Joe Ramsay wrote:
> Avoids emitting many saves/restores of vector registers and reduces the
> amount of code generated around the scalar fallback.

thanks, committed.

> ---
> Thanks,
> Joe
>  sysdeps/aarch64/fpu/v_math.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/sysdeps/aarch64/fpu/v_math.h b/sysdeps/aarch64/fpu/v_math.h
> index cfc87f8dd0..d286eb81b3 100644
> --- a/sysdeps/aarch64/fpu/v_math.h
> +++ b/sysdeps/aarch64/fpu/v_math.h
> @@ -137,7 +137,13 @@ v_lookup_u64 (const uint64_t *tab, uint64x2_t idx)
>  static inline float64x2_t
>  v_call_f64 (double (*f) (double), float64x2_t x, float64x2_t y, uint64x2_t p)
>  {
> -  return (float64x2_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
> +  double p1 = p[1];
> +  double x1 = x[1];
> +  if (__glibc_likely (p[0]))
> +    y[0] = f (x[0]);
> +  if (__glibc_likely (p1))
> +    y[1] = f (x1);
> +  return y;
>  }
>  static inline float64x2_t
>  v_call2_f64 (double (*f) (double, double), float64x2_t x1, float64x2_t x2,
> -- 
> 2.27.0
>
  

Patch

diff --git a/sysdeps/aarch64/fpu/v_math.h b/sysdeps/aarch64/fpu/v_math.h
index cfc87f8dd0..d286eb81b3 100644
--- a/sysdeps/aarch64/fpu/v_math.h
+++ b/sysdeps/aarch64/fpu/v_math.h
@@ -137,7 +137,13 @@  v_lookup_u64 (const uint64_t *tab, uint64x2_t idx)
 static inline float64x2_t
 v_call_f64 (double (*f) (double), float64x2_t x, float64x2_t y, uint64x2_t p)
 {
-  return (float64x2_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
+  double p1 = p[1];
+  double x1 = x[1];
+  if (__glibc_likely (p[0]))
+    y[0] = f (x[0]);
+  if (__glibc_likely (p1))
+    y[1] = f (x1);
+  return y;
 }
 static inline float64x2_t
 v_call2_f64 (double (*f) (double, double), float64x2_t x1, float64x2_t x2,
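
For readers who want to compare the two forms of the fallback outside glibc, here is a minimal sketch; it is not the glibc code. It assumes GCC or Clang vector extensions in place of arm_neon.h so that it builds on any host. The type names f64x2 and u64x2, the functions call_old, call_new and self_check, and the callback twice are hypothetical names introduced for illustration. __builtin_expect stands in for __glibc_likely, and the sketch keeps the predicate lane as a uint64_t. Under AAPCS64 only the low 64 bits of v8-v15 are preserved across a call, so a full vector that is still needed after f returns has to be spilled; reading lane 1 of x and p into scalars before the first call and updating y in place is presumably what lets the compiler keep fewer vectors live.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for float64x2_t and uint64x2_t (assumption: GCC/Clang
   vector extensions, which allow lane access with []).  */
typedef double f64x2 __attribute__ ((vector_size (16)));
typedef uint64_t u64x2 __attribute__ ((vector_size (16)));

/* Hypothetical scalar fallback used only for the demonstration.  */
static double
twice (double v)
{
  return 2.0 * v;
}

/* Form before the patch: build a new vector, choosing per lane between
   the scalar result and the vector result already in y.  */
static inline f64x2
call_old (double (*f) (double), f64x2 x, f64x2 y, u64x2 p)
{
  return (f64x2){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
}

/* Form after the patch: copy lane 1 of p and x into scalars before the
   first call, then overwrite only the special lanes of y.  */
static inline f64x2
call_new (double (*f) (double), f64x2 x, f64x2 y, u64x2 p)
{
  uint64_t p1 = p[1];
  double x1 = x[1];
  if (__builtin_expect (p[0] != 0, 1))
    y[0] = f (x[0]);
  if (__builtin_expect (p1 != 0, 1))
    y[1] = f (x1);
  return y;
}

/* Check that both forms agree for all four lane-mask combinations.  */
static void
self_check (void)
{
  f64x2 x = { 1.5, -3.0 };
  f64x2 y = { 10.0, 20.0 };
  for (unsigned m = 0; m < 4; m++)
    {
      u64x2 p = { (m & 1) ? UINT64_MAX : 0, (m & 2) ? UINT64_MAX : 0 };
      f64x2 a = call_old (twice, x, y, p);
      f64x2 b = call_new (twice, x, y, p);
      assert (a[0] == b[0] && a[1] == b[1]);
    }
}
```

Calling self_check () exercises every mask combination. The two forms return the same lanes, so any difference shows up only in the generated code, which can be inspected by compiling for aarch64 with optimisation enabled and comparing the assembly around the calls to f.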