Patchwork x86-64: Implement strcat family IFUNC selectors in C

Submitter H.J. Lu
Date June 12, 2017, 4:10 p.m.
Message ID <20170612161054.GD25262@gmail.com>
Permalink /patch/20952/
State New

Comments

H.J. Lu - June 12, 2017, 4:10 p.m.
Implement strcat family IFUNC selectors in C.

All internal calls within libc.so can use IFUNC on x86-64 since, unlike
32-bit x86, x86-64 supports PC-relative addressing to access the GOT
entry, so it can call via the PLT without using an extra register.  For
libc.a, we can't use IFUNC for functions which are called before IFUNC
has been initialized.  Using IFUNC internally reduces the icache
footprint since libc.so and other code in the process use the same
implementations.  This patch uses IFUNC for strcat family functions
within libc.
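
As an illustration only (not part of the patch), here is roughly what an
IFUNC symbol looks like outside glibc, using the GCC/Clang ifunc attribute
on an ELF target.  All demo_* names are hypothetical, and the two variants
are plain C stand-ins for what would be separate tuned routines; glibc
itself goes through its own libc_ifunc macros rather than this attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Two interchangeable implementations.  The names are hypothetical and
   the bodies behave identically; in glibc they would be separate
   hand-tuned assembly routines.  */
static size_t
demo_len_generic (const char *s)
{
  assert (s != NULL);
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

static size_t
demo_len_ssse3 (const char *s)
{
  assert (s != NULL);
  const char *p = s;
  while (*p != '\0')
    p++;
  return (size_t) (p - s);
}

/* The resolver runs once, when the symbol is bound, and returns the
   address that every later call to demo_len will use.  */
static size_t (*resolve_demo_len (void)) (const char *)
{
#if defined (__x86_64__) || defined (__i386__)
  /* Resolvers can run before constructors, so initialize the CPU
     feature data explicitly before querying it.  */
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("ssse3"))
    return demo_len_ssse3;
#endif
  return demo_len_generic;
}

/* demo_len has no body of its own; the symbol is bound to whatever the
   resolver returned.  */
size_t demo_len (const char *s)
  __attribute__ ((ifunc ("resolve_demo_len")));
```

A caller just writes demo_len ("strcat") and gets whichever variant was
picked, which is the property that lets libc.so and the rest of the process
share one implementation.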

Any comments?

H.J.
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strcat-sse2.
	* sysdeps/x86_64/multiarch/strcat-sse2.S: New file.
	* sysdeps/x86_64/multiarch/strcat.c: Likewise.
	* sysdeps/x86_64/multiarch/strncat.c: Likewise.
	* sysdeps/x86_64/multiarch/strcat.S: Removed.
	* sysdeps/x86_64/multiarch/strncat.S: Likewise.
---
 sysdeps/x86_64/multiarch/Makefile      |  1 +
 sysdeps/x86_64/multiarch/strcat-sse2.S | 28 +++++++++++
 sysdeps/x86_64/multiarch/strcat.S      | 85 ----------------------------------
 sysdeps/x86_64/multiarch/strcat.c      | 35 ++++++++++++++
 sysdeps/x86_64/multiarch/strncat.S     |  5 --
 sysdeps/x86_64/multiarch/strncat.c     | 31 +++++++++++++
 6 files changed, 95 insertions(+), 90 deletions(-)
 create mode 100644 sysdeps/x86_64/multiarch/strcat-sse2.S
 delete mode 100644 sysdeps/x86_64/multiarch/strcat.S
 create mode 100644 sysdeps/x86_64/multiarch/strcat.c
 delete mode 100644 sysdeps/x86_64/multiarch/strncat.S
 create mode 100644 sysdeps/x86_64/multiarch/strncat.c
Adhemerval Zanella Netto - June 15, 2017, 3:32 p.m.
On 12/06/2017 13:10, H.J. Lu wrote:
> Implement strcat family IFUNC selectors in C.
> 
> All internal calls within libc.so can use IFUNC on x86-64 since, unlike
> 32-bit x86, x86-64 supports PC-relative addressing to access the GOT
> entry, so it can call via the PLT without using an extra register.  For
> libc.a, we can't use IFUNC for functions which are called before IFUNC
> has been initialized.  Using IFUNC internally reduces the icache
> footprint since libc.so and other code in the process use the same
> implementations.  This patch uses IFUNC for strcat family functions
> within libc.
> 
> Any comments?
> 
> H.J.
> 	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
> 	strcat-sse2.
> 	* sysdeps/x86_64/multiarch/strcat-sse2.S: New file.
> 	* sysdeps/x86_64/multiarch/strcat.c: Likewise.
> 	* sysdeps/x86_64/multiarch/strncat.c: Likewise.
> 	* sysdeps/x86_64/multiarch/strcat.S: Removed.
> 	* sysdeps/x86_64/multiarch/strncat.S: Likewise.

LGTM, but since the idea is to use IFUNC for internal symbol calls, do we
still need the assembly implementation for x86_64?  The default algorithm
should be good enough with both vectorized strlen and strcpy (same for
strncat).
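
For context, the "default algorithm" idea can be sketched as below: strcat
reduces to strlen plus strcpy, so it inherits whatever vectorized versions
of those the IFUNC selectors pick.  This is a minimal sketch with
hypothetical generic_* names, not a copy of the generic glibc sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strcat as "find the end, then copy": both halves are library calls
   that can themselves be vectorized.  */
char *
generic_strcat (char *dest, const char *src)
{
  assert (dest != NULL && src != NULL);
  strcpy (dest + strlen (dest), src);
  return dest;
}

/* strncat appends at most n bytes of src and always writes a
   terminating null byte after them.  */
char *
generic_strncat (char *dest, const char *src, size_t n)
{
  assert (dest != NULL && src != NULL);
  char *end = dest + strlen (dest);
  /* Length of src, capped at n, without reading past the terminator.  */
  const char *nul = memchr (src, '\0', n);
  size_t len = nul != NULL ? (size_t) (nul - src) : n;
  memcpy (end, src, len);
  end[len] = '\0';
  return dest;
}
```

Whether this matches the hand-written assembly in practice would need
benchmarking; the sketch only shows why no strcat-specific assembly is
strictly required.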
H.J. Lu - June 15, 2017, 3:51 p.m.
On Thu, Jun 15, 2017 at 8:32 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> LGTM, but since the idea is to use IFUNC for internal symbol calls, do we
> still need the assembly implementation for x86_64?  The default algorithm
> should be good enough with both vectorized strlen and strcpy (same for
> strncat).

That will be a separate patch.  We will look into these for 2.27.

Thanks.
Adhemerval Zanella Netto - June 15, 2017, 5:20 p.m.
On 15/06/2017 12:51, H.J. Lu wrote:
> That will be a separate patch.  We will look into these for 2.27.
> 
> Thanks.
> 

Fair enough then.

Patch

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index ff6c7f4..43443b3 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -23,6 +23,7 @@  sysdep_routines += strncat-c stpncpy-c strncpy-c strcmp-ssse3 \
 		   strcpy-ssse3 strncpy-ssse3 stpcpy-ssse3 stpncpy-ssse3 \
 		   strcpy-sse2-unaligned strncpy-sse2-unaligned \
 		   stpcpy-sse2-unaligned stpncpy-sse2-unaligned \
+		   strcat-sse2 \
 		   strcat-sse2-unaligned strncat-sse2-unaligned \
 		   strchr-sse2-no-bsf memcmp-ssse3 strstr-sse2-unaligned \
 		   strcspn-c strpbrk-c strspn-c varshift \
diff --git a/sysdeps/x86_64/multiarch/strcat-sse2.S b/sysdeps/x86_64/multiarch/strcat-sse2.S
new file mode 100644
index 0000000..565ba30
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strcat-sse2.S
@@ -0,0 +1,28 @@ 
+/* strcat optimized with SSE2.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+
+# include <sysdep.h>
+# define strcat __strcat_sse2
+
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(strcat)
+#endif
+
+#include <sysdeps/x86_64/strcat.S>
diff --git a/sysdeps/x86_64/multiarch/strcat.S b/sysdeps/x86_64/multiarch/strcat.S
deleted file mode 100644
index 0e0e5dd..0000000
--- a/sysdeps/x86_64/multiarch/strcat.S
+++ /dev/null
@@ -1,85 +0,0 @@ 
-/* Multiple versions of strcat
-   All versions must be listed in ifunc-impl-list.c.
-   Copyright (C) 2009-2017 Free Software Foundation, Inc.
-   Contributed by Intel Corporation.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <init-arch.h>
-
-#ifndef USE_AS_STRNCAT
-# ifndef STRCAT
-#  define STRCAT strcat
-# endif
-#endif
-
-#ifdef USE_AS_STRNCAT
-# define STRCAT_SSSE3	         	__strncat_ssse3
-# define STRCAT_SSE2	            	__strncat_sse2
-# define STRCAT_SSE2_UNALIGNED    	__strncat_sse2_unaligned
-# define __GI_STRCAT	            	__GI_strncat
-# define __GI___STRCAT              __GI___strncat
-#else
-# define STRCAT_SSSE3	         	__strcat_ssse3
-# define STRCAT_SSE2	            	__strcat_sse2
-# define STRCAT_SSE2_UNALIGNED    	__strcat_sse2_unaligned
-# define __GI_STRCAT	            	__GI_strcat
-# define __GI___STRCAT              __GI___strcat
-#endif
-
-
-/* Define multiple versions only for the definition in libc.  */
-#if IS_IN (libc)
-	.text
-ENTRY(STRCAT)
-	.type	STRCAT, @gnu_indirect_function
-	LOAD_RTLD_GLOBAL_RO_RDX
-	leaq	STRCAT_SSE2_UNALIGNED(%rip), %rax
-	HAS_ARCH_FEATURE (Fast_Unaligned_Load)
-	jnz	2f
-	leaq	STRCAT_SSE2(%rip), %rax
-	HAS_CPU_FEATURE (SSSE3)
-	jz	2f
-	leaq	STRCAT_SSSE3(%rip), %rax
-2:	ret
-END(STRCAT)
-
-# undef ENTRY
-# define ENTRY(name) \
-	.type STRCAT_SSE2, @function; \
-	.align 16; \
-	.globl STRCAT_SSE2; \
-	.hidden STRCAT_SSE2; \
-	STRCAT_SSE2: cfi_startproc; \
-	CALL_MCOUNT
-# undef END
-# define END(name) \
-	cfi_endproc; .size STRCAT_SSE2, .-STRCAT_SSE2
-# undef libc_hidden_builtin_def
-/* It doesn't make sense to send libc-internal strcat calls through a PLT.
-   The speedup we get from using SSSE3 instruction is likely eaten away
-   by the indirect call in the PLT.  */
-# define libc_hidden_builtin_def(name) \
-	.globl __GI_STRCAT; __GI_STRCAT = STRCAT_SSE2
-# undef libc_hidden_def
-# define libc_hidden_def(name) \
-	.globl __GI___STRCAT; __GI___STRCAT = STRCAT_SSE2
-#endif
-
-#ifndef USE_AS_STRNCAT
-# include "../strcat.S"
-#endif
diff --git a/sysdeps/x86_64/multiarch/strcat.c b/sysdeps/x86_64/multiarch/strcat.c
new file mode 100644
index 0000000..984cdfb
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strcat.c
@@ -0,0 +1,35 @@ 
+/* Multiple versions of strcat.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Define multiple versions only for the definition in libc.  */
+#if IS_IN (libc)
+# define strcat __redirect_strcat
+# include <string.h>
+# undef strcat
+
+# define SYMBOL_NAME strcat
+# include "ifunc-unaligned-ssse3.h"
+
+libc_ifunc_redirected (__redirect_strcat, strcat, IFUNC_SELECTOR ());
+
+# ifdef SHARED
+__hidden_ver1 (strcat, __GI_strcat, __redirect_strcat)
+  __attribute__ ((visibility ("hidden")));
+# endif
+#endif
diff --git a/sysdeps/x86_64/multiarch/strncat.S b/sysdeps/x86_64/multiarch/strncat.S
deleted file mode 100644
index 5c1bf41..0000000
--- a/sysdeps/x86_64/multiarch/strncat.S
+++ /dev/null
@@ -1,5 +0,0 @@ 
-/* Multiple versions of strncat
-   All versions must be listed in ifunc-impl-list.c.  */
-#define STRCAT strncat
-#define USE_AS_STRNCAT
-#include "strcat.S"
diff --git a/sysdeps/x86_64/multiarch/strncat.c b/sysdeps/x86_64/multiarch/strncat.c
new file mode 100644
index 0000000..d359e16
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strncat.c
@@ -0,0 +1,31 @@ 
+/* Multiple versions of strncat.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Define multiple versions only for the definition in libc.  */
+#if IS_IN (libc)
+# define _HAVE_STRING_ARCH_strncat 1
+# define strncat __redirect_strncat
+# include <string.h>
+# undef strncat
+
+# define SYMBOL_NAME strncat
+# include "ifunc-unaligned-ssse3.h"
+
+libc_ifunc_redirected (__redirect_strncat, strncat, IFUNC_SELECTOR ());
+#endif
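
For readers comparing old and new, the selection order encoded by the
removed assembly selector in strcat.S can be written out in C as below.
This is a sketch under assumptions: the demo_* types and names are
hypothetical stand-ins, and the real selector the new files use comes from
ifunc-unaligned-ssse3.h, which is not part of this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the feature bits the removed selector
   tested via HAS_ARCH_FEATURE and HAS_CPU_FEATURE.  */
struct demo_cpu_features
{
  bool fast_unaligned_load;   /* Fast_Unaligned_Load */
  bool ssse3;                 /* SSSE3 */
};

enum demo_strcat_impl
{
  DEMO_STRCAT_SSE2,            /* __strcat_sse2 */
  DEMO_STRCAT_SSSE3,           /* __strcat_ssse3 */
  DEMO_STRCAT_SSE2_UNALIGNED   /* __strcat_sse2_unaligned */
};

/* Same order as the removed assembly: prefer the unaligned SSE2
   variant when unaligned loads are fast, otherwise SSSE3 if the CPU
   has it, otherwise plain SSE2.  */
enum demo_strcat_impl
demo_select_strcat (const struct demo_cpu_features *f)
{
  assert (f != NULL);
  if (f->fast_unaligned_load)
    return DEMO_STRCAT_SSE2_UNALIGNED;
  if (f->ssse3)
    return DEMO_STRCAT_SSSE3;
  return DEMO_STRCAT_SSE2;
}
```

Note that the old file also pointed __GI_strcat straight at the SSE2
version, so libc-internal callers bypassed this choice; with the patch they
go through the selected implementation instead.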