[v12,14/31] string: Improve generic strrchr with memrchr and strlen

Message ID 20230202181149.2181553-15-adhemerval.zanella@linaro.org (mailing list archive)
State Committed
Commit 167f6230af97690985ccbc9b3026a7c32ec2d6e9
Headers
Series Improve generic string routines |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Adhemerval Zanella Feb. 2, 2023, 6:11 p.m. UTC
  Now that both strlen and memrchr have word vectorized implementation,
it should be faster to implement strrchr based on memrchr over the
string length instead of calling strchr on a loop.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
---
 string/strrchr.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)
  

Comments

Richard Henderson Feb. 4, 2023, 3:06 a.m. UTC | #1
On 2/2/23 08:11, Adhemerval Zanella wrote:
> Now that both strlen and memrchr have word vectorized implementation,
> it should be faster to implement strrchr based on memrchr over the
> string length instead of calling strchr on a loop.
> 
> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
> and powerpc64-linux-gnu by removing the arch-specific assembly
> implementation and disabling multi-arch (it covers both LE and BE
> for 64 and 32 bits).
> ---
>   string/strrchr.c | 18 +-----------------
>   1 file changed, 1 insertion(+), 17 deletions(-)

memrchr needs libc_hidden_builtin_proto.  On riscv64:

    c:   00000097                auipc   ra,0x0
                         c: R_RISCV_CALL __GI_strlen
                         c: R_RISCV_RELAX        *ABS*
   10:   000080e7                jalr    ra # c <__GI_strrchr+0xc>
...
   24:   00000317                auipc   t1,0x0
                         24: R_RISCV_CALL_PLT    __memrchr
                         24: R_RISCV_RELAX       *ABS*
   28:   00030067                jr      t1 # 24 <.LVL2+0x8>


r~
  
Adhemerval Zanella Feb. 6, 2023, 2:01 p.m. UTC | #2
On 04/02/23 00:06, Richard Henderson wrote:
> On 2/2/23 08:11, Adhemerval Zanella wrote:
>> Now that both strlen and memrchr have word vectorized implementation,
>> it should be faster to implement strrchr based on memrchr over the
>> string length instead of calling strchr on a loop.
>>
>> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
>> and powerpc64-linux-gnu by removing the arch-specific assembly
>> implementation and disabling multi-arch (it covers both LE and BE
>> for 64 and 32 bits).
>> ---
>>   string/strrchr.c | 18 +-----------------
>>   1 file changed, 1 insertion(+), 17 deletions(-)
> 
> memrchr needs libc_hidden_builtin_proto.  On riscv64:
> 
>    c:   00000097                auipc   ra,0x0
>                         c: R_RISCV_CALL __GI_strlen
>                         c: R_RISCV_RELAX        *ABS*
>   10:   000080e7                jalr    ra # c <__GI_strrchr+0xc>
> ...
>   24:   00000317                auipc   t1,0x0
>                         24: R_RISCV_CALL_PLT    __memrchr
>                         24: R_RISCV_RELAX       *ABS*
>   28:   00030067                jr      t1 # 24 <.LVL2+0x8>
> 
> 
> r~

Similar to strchr [1], static linker ends up creating the expected local
call:

$ riscv64-glibc-linux-gnu-objdump -dr libc.so
[...]
0000000000088f5c <strrchr>:
   88f5c:       7179                    addi    sp,sp,-48
   88f5e:       e84a                    sd      s2,16(sp)
   88f60:       000e7917                auipc   s2,0xe7
   88f64:       73893903                ld      s2,1848(s2) # 170698 <__stack_chk_guard@GLIBC_2.27>
   88f68:       00093783                ld      a5,0(s2)
[...]
   88f9a:       6145                    addi    sp,sp,48
   88f9c:       c70fd06f                j       8640c <memrchr>
   88fa0:       193500ef                jal     ra,d9932 <__stack_chk_fail>
[...]

But I agree that we should change to have the libc_hidden_builin_proto
in this case.  I will send a patch to fix it.

[1] https://sourceware.org/pipermail/libc-alpha/2023-February/145322.html
  

Patch

diff --git a/string/strrchr.c b/string/strrchr.c
index 3c6e715d3b..7b76dea4e0 100644
--- a/string/strrchr.c
+++ b/string/strrchr.c
@@ -27,23 +27,7 @@ 
 char *
 STRRCHR (const char *s, int c)
 {
-  const char *found, *p;
-
-  c = (unsigned char) c;
-
-  /* Since strchr is fast, we use it rather than the obvious loop.  */
-
-  if (c == '\0')
-    return strchr (s, '\0');
-
-  found = NULL;
-  while ((p = strchr (s, c)) != NULL)
-    {
-      found = p;
-      s = p + 1;
-    }
-
-  return (char *) found;
+  return __memrchr (s, c, strlen (s) + 1);
 }
 
 #ifdef weak_alias