newlib: libc: Improved the readability of strcspn with minor optimization

Message ID 20231215083101.5643-1-zengxiao@eswincomputing.com
State New
Headers
Series newlib: libc: Improved the readability of strcspn with minor optimization |

Commit Message

Xiao Zeng Dec. 15, 2023, 8:31 a.m. UTC
  Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
---
 newlib/libc/string/strcspn.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
  

Comments

Torbjörn SVENSSON Dec. 15, 2023, 10:28 a.m. UTC | #1
Hello Xiao,

On 2023-12-15 09:31, Xiao Zeng wrote:
> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
> ---
>   newlib/libc/string/strcspn.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
> index abaa93ad6..8ac0bf10c 100644
> --- a/newlib/libc/string/strcspn.c
> +++ b/newlib/libc/string/strcspn.c
> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>         for (c = s2; *c; c++)
>   	{
>   	  if (*s1 == *c)
> -	    break;
> +	    goto end;
>   	}
> -      if (*c)
> -	break;
>         s1++;
>       }
> -
> +end:
>     return s1 - s;
>   }

Just looking at this small snippet of code, I would say that the 
previous code and your suggestion won't do the same thing.

Do you have unit tests that confirm that the behavior is identical with 
the current implementation and your suggested change?

When I run your suggestion, I get return value 0, but with the current 
implementation it's 3 for this call: strspn("129th", "1234567890").

Kind regards,
Torbjörn
  
Xiao Zeng Dec. 16, 2023, 4:45 a.m. UTC | #2
2023-12-15 18:28  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
>
Hi Bob, I worked under the risc-v architecture and obtained the newlib library through
cross compilation. This is my first attempt to contribute code to newlib. 

>Hello Xiao,
>
>On 2023-12-15 09:31, Xiao Zeng wrote:
>> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
>> ---
>>   newlib/libc/string/strcspn.c | 6 ++----
>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
>> index abaa93ad6..8ac0bf10c 100644
>> --- a/newlib/libc/string/strcspn.c
>> +++ b/newlib/libc/string/strcspn.c
>> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>>         for (c = s2; *c; c++)
>>   {
>>     if (*s1 == *c)
>> -	    break;
>> +	    goto end;
>>   }
>> -      if (*c)
>> -	break;
>>         s1++;
>>       }
>> -
>> +end:
>>     return s1 - s;
>>   }
>
>Just looking at this small snippet of code, I would say that the
>previous code and your suggestion won't do the same thing.
>
>Do you have unit tests that confirm that the behavior is identical with
>the current implementation and your suggested change? 
1 I must admit that I did not conduct a complete newlib regression test.
Anyway, I should apologize for this. After completing cross compilation, I tried:
-----------------
make check
-----------------

Received the following error message:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Target is riscv64-unknown-elf
Host   is x86_64-pc-linux-gnu
...
The error code is TCL LOOKUP COMMAND newlib_target_compile
The info on the error is:
invalid command name "newlib_target_compile"
    while executing
"::tcl_unknown newlib_target_compile /home/user/Downloads/tools/tools.git/newlib.git/newlib/testsuite/newlib.iconv/iconvnm.c /home/user/Downloads/tools..."
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I know the reason for the error: Target and Host do not match, which
is a common issue in cross compilation. I have looked up some
information, but I have not found a solution.

2 On the basis of the above, I searched for test cases of the strcspn
function in newlib:
-------------------------------------------------------------------------------------------------
newlib/libm/test/string.c:311:  check(strcspn("abcba", "qx") == 5); /* Whole string. */
newlib/libm/test/string.c:312:  check(strcspn("abcba", "cx") == 2); /* Partial. */
newlib/libm/test/string.c:313:  check(strcspn("abc", "abc") == 0);	/* None. */
newlib/libm/test/string.c:314:  check(strcspn("", "ab") == 0); /* Null string. */
newlib/libm/test/string.c:315:  check(strcspn("abc", "") == 3); /* Null search list. */
-------------------------------------------------------------------------------------------------

I simply believe that these 5 test cases are all the test cases related to
strcspn in newlib. After local testing, all these test cases can pass in my patch.
>
>When I run your suggestion, I get return value 0, but with the current
>implementation it's 3 for this call: strspn("129th", "1234567890"). 
3 It is worth noting that strcspn and strspn only differ by one letter 'c'. I also
often confuse them -:), maybe you are the same.

>
>Kind regards,
>Torbjörn
  
4 Perhaps someone could point out how to test newlib under the risc-v
architecture, and I would greatly appreciate it. Of course, there are also
great methods for testing newlib under other architectures.

5 By the way, it is not possible to compile the latest newlib on host (x86_64-pc-linux-gnu) 
(compilation quickly ended as if nothing had happened), even if ../configure
comes with the parameter -- with-newlib.


Thanks
Xiao Zeng
  
Xiao Zeng Dec. 16, 2023, 4:56 a.m. UTC | #3
2023-12-16 12:45  Xiao Zeng <zengxiao@eswincomputing.com> wrote:
>
 
>2023-12-15 18:28  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
>>
>Hi Bob, I worked under the risc-v architecture and obtained the newlib library through 
Sorry, Torbjorn, I accidentally wrote your name wrong. I apologize for my carelessness.

>cross compilation. This is my first attempt to contribute code to newlib. 
>
>>Hello Xiao,
>>
>>On 2023-12-15 09:31, Xiao Zeng wrote:
>>> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
>>> ---
>>>   newlib/libc/string/strcspn.c | 6 ++----
>>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
>>> index abaa93ad6..8ac0bf10c 100644
>>> --- a/newlib/libc/string/strcspn.c
>>> +++ b/newlib/libc/string/strcspn.c
>>> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>>>         for (c = s2; *c; c++)
>>>   {
>>>     if (*s1 == *c)
>>> -	    break;
>>> +	    goto end;
>>>   }
>>> -      if (*c)
>>> -	break;
>>>         s1++;
>>>       }
>>> -
>>> +end:
>>>     return s1 - s;
>>>   }
>>
>>Just looking at this small snippet of code, I would say that the
>>previous code and your suggestion won't do the same thing.
>>
>>Do you have unit tests that confirm that the behavior is identical with
>>the current implementation and your suggested change?
>1 I must admit that I did not conduct a complete newlib regression test.
>Anyway, I should apologize for this. After completing cross compilation, I tried:
>-----------------
>make check
>-----------------
>
>Received the following error message:
>--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>Target is riscv64-unknown-elf
>Host   is x86_64-pc-linux-gnu
>...
>The error code is TCL LOOKUP COMMAND newlib_target_compile
>The info on the error is:
>invalid command name "newlib_target_compile"
>    while executing
>"::tcl_unknown newlib_target_compile /home/user/Downloads/tools/tools.git/newlib.git/newlib/testsuite/newlib.iconv/iconvnm.c /home/user/Downloads/tools..."
>    ("uplevel" body line 1)
>    invoked from within
>"uplevel 1 ::tcl_unknown $args"
>--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>I know the reason for the error: Target and Host do not match, which
>is a common issue in cross compilation. I have looked up some
>information, but I have not found a solution.
>
>2 On the basis of the above, I searched for test cases of the strcspn
>function in newlib:
>-------------------------------------------------------------------------------------------------
>newlib/libm/test/string.c:311:  check(strcspn("abcba", "qx") == 5); /* Whole string. */
>newlib/libm/test/string.c:312:  check(strcspn("abcba", "cx") == 2); /* Partial. */
>newlib/libm/test/string.c:313:  check(strcspn("abc", "abc") == 0);	/* None. */
>newlib/libm/test/string.c:314:  check(strcspn("", "ab") == 0); /* Null string. */
>newlib/libm/test/string.c:315:  check(strcspn("abc", "") == 3); /* Null search list. */
>-------------------------------------------------------------------------------------------------
>
>I simply believe that these 5 test cases are all the test cases related to
>strcspn in newlib. After local testing, all these test cases can pass in my patch.
>>
>>When I run your suggestion, I get return value 0, but with the current
>>implementation it's 3 for this call: strspn("129th", "1234567890").
>3 It is worth noting that strcspn and strspn only differ by one letter 'c'. I also
>often confuse them -:), maybe you are the same.
>
>>
>>Kind regards,
>>Torbjörn

>4 Perhaps someone could point out how to test newlib under the risc-v
>architecture, and I would greatly appreciate it. Of course, there are also
>great methods for testing newlib under other architectures.
>
>5 By the way, it is not possible to compile the latest newlib on host (x86_64-pc-linux-gnu)
>(compilation quickly ended as if nothing had happened), even if ../configure
>comes with the parameter -- with-newlib.
>
>
>Thanks
>Xiao Zeng
>
 
Thanks
Xiao Zeng
  
Xiao Zeng Dec. 16, 2023, 9:30 a.m. UTC | #4
2023-12-15 18:28  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
>
 
>Hello Xiao,
>
>On 2023-12-15 09:31, Xiao Zeng wrote:
>> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
>> ---
>>   newlib/libc/string/strcspn.c | 6 ++----
>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
>> index abaa93ad6..8ac0bf10c 100644
>> --- a/newlib/libc/string/strcspn.c
>> +++ b/newlib/libc/string/strcspn.c
>> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>>         for (c = s2; *c; c++)
>>   {
>>     if (*s1 == *c)
>> -	    break;
>> +	    goto end;
>>   }
>> -      if (*c)
>> -	break;
>>         s1++;
>>       }
>> -
>> +end:
>>     return s1 - s;
>>   }
>
>Just looking at this small snippet of code, I would say that the
>previous code and your suggestion won't do the same thing.
>
>Do you have unit tests that confirm that the behavior is identical with
>the current implementation and your suggested change?
>
>When I run your suggestion, I get return value 0, but with the current
>implementation it's 3 for this call: strspn("129th", "1234567890").
>
>Kind regards,
>Torbjörn 

After applying this patch, provide a comparison of assembly code
under the risc-v architecture, with default compilation parameters
used in both of them:

no-patch:
---------------------------------------------------------
libc_a-strcspn.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <strcspn>:
   0:   00054683                lbu     a3,0(a0)
   4:   06068263                beqz    a3,68 <.L9>
   8:   0005c803                lbu     a6,0(a1)
   c:   00050613                mv      a2,a0

0000000000000010 <.LVL1>:
  10:   02080463                beqz    a6,38 <.L14>

0000000000000014 <.L6>:
  14:   00058793                mv      a5,a1
  18:   00080713                mv      a4,a6
  1c:   00c0006f                j       28 <.L5>

0000000000000020 <.L4>:
  20:   0007c703                lbu     a4,0(a5)
  24:   02070863                beqz    a4,54 <.L17>

0000000000000028 <.L5>:
  28:   00178793                addi    a5,a5,1

000000000000002c <.LVL5>:
  2c:   fee69ae3                bne     a3,a4,20 <.L4>

0000000000000030 <.L7>:
  30:   40a60533                sub     a0,a2,a0

0000000000000034 <.LVL7>:
  34:   00008067                ret

0000000000000038 <.L14>:
  38:   00164683                lbu     a3,1(a2)
  3c:   00160613                addi    a2,a2,1
  40:   fe0688e3                beqz    a3,30 <.L7>
  44:   00164683                lbu     a3,1(a2)
  48:   00160613                addi    a2,a2,1
  4c:   fe0696e3                bnez    a3,38 <.L14>
  50:   fe1ff06f                j       30 <.L7>

0000000000000054 <.L17>:
  54:   00164683                lbu     a3,1(a2)
  58:   00160613                addi    a2,a2,1
  5c:   fa069ce3                bnez    a3,14 <.L6>
  60:   40a60533                sub     a0,a2,a0

0000000000000064 <.LVL13>:
  64:   00008067                ret

0000000000000068 <.L9>:
  68:   00000513                li      a0,0

000000000000006c <.LVL15>:
  6c:   00008067                ret
---------------------------------------------------------

patch
---------------------------------------------------------
libc_a-strcspn.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <strcspn>:
   0:   00054683                lbu     a3,0(a0)
   4:   04068a63                beqz    a3,58 <.L2>
   8:   0005c803                lbu     a6,0(a1)
   c:   00050613                mv      a2,a0

0000000000000010 <.LVL1>:
  10:   02080c63                beqz    a6,48 <.L14>

0000000000000014 <.L6>:
  14:   00058793                mv      a5,a1
  18:   00080713                mv      a4,a6
  1c:   00c0006f                j       28 <.L5>

0000000000000020 <.L4>:
  20:   0007c703                lbu     a4,0(a5)
  24:   00070a63                beqz    a4,38 <.L16>

0000000000000028 <.L5>:
  28:   00178793                addi    a5,a5,1

000000000000002c <.LVL5>:
  2c:   fee69ae3                bne     a3,a4,20 <.L4>

0000000000000030 <.L7>:
  30:   40a60533                sub     a0,a2,a0

0000000000000034 <.LVL7>:
  34:   00008067                ret

0000000000000038 <.L16>:
  38:   00164683                lbu     a3,1(a2)
  3c:   00160613                addi    a2,a2,1
  40:   fc069ae3                bnez    a3,14 <.L6>
  44:   fedff06f                j       30 <.L7>

0000000000000048 <.L14>:
  48:   00164683                lbu     a3,1(a2)
  4c:   00160613                addi    a2,a2,1
  50:   fe069ce3                bnez    a3,48 <.L14>
  54:   fddff06f                j       30 <.L7>

0000000000000058 <.L2>:
  58:   00000513                li      a0,0

000000000000005c <.LVL12>:
  5c:   00008067                ret
---------------------------------------------------------
After careful comparison, it was found that there are fewer assembly
instructions after the patch.
 
Thanks
Xiao Zeng
  
Jeff Johnston Dec. 20, 2023, 4:24 a.m. UTC | #5
Patch merged to master.

-- Jeff J.

On Sat, Dec 16, 2023 at 4:31 AM Xiao Zeng <zengxiao@eswincomputing.com>
wrote:

> 2023-12-15 18:28  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
> >
>
> >Hello Xiao,
> >
> >On 2023-12-15 09:31, Xiao Zeng wrote:
> >> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
> >> ---
> >>   newlib/libc/string/strcspn.c | 6 ++----
> >>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
> >> index abaa93ad6..8ac0bf10c 100644
> >> --- a/newlib/libc/string/strcspn.c
> >> +++ b/newlib/libc/string/strcspn.c
> >> @@ -37,12 +37,10 @@ strcspn (const char *s1,
> >>         for (c = s2; *c; c++)
> >>   {
> >>     if (*s1 == *c)
> >> -        break;
> >> +        goto end;
> >>   }
> >> -      if (*c)
> >> -    break;
> >>         s1++;
> >>       }
> >> -
> >> +end:
> >>     return s1 - s;
> >>   }
> >
> >Just looking at this small snippet of code, I would say that the
> >previous code and your suggestion won't do the same thing.
> >
> >Do you have unit tests that confirm that the behavior is identical with
> >the current implementation and your suggested change?
> >
> >When I run your suggestion, I get return value 0, but with the current
> >implementation it's 3 for this call: strspn("129th", "1234567890").
> >
> >Kind regards,
> >Torbjörn
>
> After applying this patch, provide a comparison of assembly code
> under the risc-v architecture, with default compilation parameters
> used in both of them:
>
> no-patch:
> ---------------------------------------------------------
> libc_a-strcspn.o:     file format elf64-littleriscv
>
>
> Disassembly of section .text:
>
> 0000000000000000 <strcspn>:
>    0:   00054683                lbu     a3,0(a0)
>    4:   06068263                beqz    a3,68 <.L9>
>    8:   0005c803                lbu     a6,0(a1)
>    c:   00050613                mv      a2,a0
>
> 0000000000000010 <.LVL1>:
>   10:   02080463                beqz    a6,38 <.L14>
>
> 0000000000000014 <.L6>:
>   14:   00058793                mv      a5,a1
>   18:   00080713                mv      a4,a6
>   1c:   00c0006f                j       28 <.L5>
>
> 0000000000000020 <.L4>:
>   20:   0007c703                lbu     a4,0(a5)
>   24:   02070863                beqz    a4,54 <.L17>
>
> 0000000000000028 <.L5>:
>   28:   00178793                addi    a5,a5,1
>
> 000000000000002c <.LVL5>:
>   2c:   fee69ae3                bne     a3,a4,20 <.L4>
>
> 0000000000000030 <.L7>:
>   30:   40a60533                sub     a0,a2,a0
>
> 0000000000000034 <.LVL7>:
>   34:   00008067                ret
>
> 0000000000000038 <.L14>:
>   38:   00164683                lbu     a3,1(a2)
>   3c:   00160613                addi    a2,a2,1
>   40:   fe0688e3                beqz    a3,30 <.L7>
>   44:   00164683                lbu     a3,1(a2)
>   48:   00160613                addi    a2,a2,1
>   4c:   fe0696e3                bnez    a3,38 <.L14>
>   50:   fe1ff06f                j       30 <.L7>
>
> 0000000000000054 <.L17>:
>   54:   00164683                lbu     a3,1(a2)
>   58:   00160613                addi    a2,a2,1
>   5c:   fa069ce3                bnez    a3,14 <.L6>
>   60:   40a60533                sub     a0,a2,a0
>
> 0000000000000064 <.LVL13>:
>   64:   00008067                ret
>
> 0000000000000068 <.L9>:
>   68:   00000513                li      a0,0
>
> 000000000000006c <.LVL15>:
>   6c:   00008067                ret
> ---------------------------------------------------------
>
> patch
> ---------------------------------------------------------
> libc_a-strcspn.o:     file format elf64-littleriscv
>
>
> Disassembly of section .text:
>
> 0000000000000000 <strcspn>:
>    0:   00054683                lbu     a3,0(a0)
>    4:   04068a63                beqz    a3,58 <.L2>
>    8:   0005c803                lbu     a6,0(a1)
>    c:   00050613                mv      a2,a0
>
> 0000000000000010 <.LVL1>:
>   10:   02080c63                beqz    a6,48 <.L14>
>
> 0000000000000014 <.L6>:
>   14:   00058793                mv      a5,a1
>   18:   00080713                mv      a4,a6
>   1c:   00c0006f                j       28 <.L5>
>
> 0000000000000020 <.L4>:
>   20:   0007c703                lbu     a4,0(a5)
>   24:   00070a63                beqz    a4,38 <.L16>
>
> 0000000000000028 <.L5>:
>   28:   00178793                addi    a5,a5,1
>
> 000000000000002c <.LVL5>:
>   2c:   fee69ae3                bne     a3,a4,20 <.L4>
>
> 0000000000000030 <.L7>:
>   30:   40a60533                sub     a0,a2,a0
>
> 0000000000000034 <.LVL7>:
>   34:   00008067                ret
>
> 0000000000000038 <.L16>:
>   38:   00164683                lbu     a3,1(a2)
>   3c:   00160613                addi    a2,a2,1
>   40:   fc069ae3                bnez    a3,14 <.L6>
>   44:   fedff06f                j       30 <.L7>
>
> 0000000000000048 <.L14>:
>   48:   00164683                lbu     a3,1(a2)
>   4c:   00160613                addi    a2,a2,1
>   50:   fe069ce3                bnez    a3,48 <.L14>
>   54:   fddff06f                j       30 <.L7>
>
> 0000000000000058 <.L2>:
>   58:   00000513                li      a0,0
>
> 000000000000005c <.LVL12>:
>   5c:   00008067                ret
> ---------------------------------------------------------
> After careful comparison, it was found that there are fewer assembly
> instructions after the patch.
>
> Thanks
> Xiao Zeng
>
>
  
Xiao Zeng Dec. 20, 2023, 5:51 a.m. UTC | #6
2023-12-20 12:24  Jeff Johnston <jjohnstn@redhat.com> wrote:
>
 
>Patch merged to master. 
Thanks, Jeff.
There will be some similar patches in the future, and I will
submit them to the master immediately.

>
>-- Jeff J.
>
>On Sat, Dec 16, 2023 at 4:31 AM Xiao Zeng <zengxiao@eswincomputing.com>
>wrote:
>
>> 2023-12-15 18:28  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
>> >
>>
>> >Hello Xiao,
>> >
>> >On 2023-12-15 09:31, Xiao Zeng wrote:
>> >> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com>
>> >> ---
>> >>   newlib/libc/string/strcspn.c | 6 ++----
>> >>   1 file changed, 2 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
>> >> index abaa93ad6..8ac0bf10c 100644
>> >> --- a/newlib/libc/string/strcspn.c
>> >> +++ b/newlib/libc/string/strcspn.c
>> >> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>> >>         for (c = s2; *c; c++)
>> >>   {
>> >>     if (*s1 == *c)
>> >> -        break;
>> >> +        goto end;
>> >>   }
>> >> -      if (*c)
>> >> -    break;
>> >>         s1++;
>> >>       }
>> >> -
>> >> +end:
>> >>     return s1 - s;
>> >>   }
>> >
>> >Just looking at this small snippet of code, I would say that the
>> >previous code and your suggestion won't do the same thing.
>> >
>> >Do you have unit tests that confirm that the behavior is identical with
>> >the current implementation and your suggested change?
>> >
>> >When I run your suggestion, I get return value 0, but with the current
>> >implementation it's 3 for this call: strspn("129th", "1234567890").
>> >
>> >Kind regards,
>> >Torbjörn
>>
>> After applying this patch, provide a comparison of assembly code
>> under the risc-v architecture, with default compilation parameters
>> used in both of them:
>>
>> no-patch:
>> ---------------------------------------------------------
>> libc_a-strcspn.o:     file format elf64-littleriscv
>>
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <strcspn>:
>>    0:   00054683                lbu     a3,0(a0)
>>    4:   06068263                beqz    a3,68 <.L9>
>>    8:   0005c803                lbu     a6,0(a1)
>>    c:   00050613                mv      a2,a0
>>
>> 0000000000000010 <.LVL1>:
>>   10:   02080463                beqz    a6,38 <.L14>
>>
>> 0000000000000014 <.L6>:
>>   14:   00058793                mv      a5,a1
>>   18:   00080713                mv      a4,a6
>>   1c:   00c0006f                j       28 <.L5>
>>
>> 0000000000000020 <.L4>:
>>   20:   0007c703                lbu     a4,0(a5)
>>   24:   02070863                beqz    a4,54 <.L17>
>>
>> 0000000000000028 <.L5>:
>>   28:   00178793                addi    a5,a5,1
>>
>> 000000000000002c <.LVL5>:
>>   2c:   fee69ae3                bne     a3,a4,20 <.L4>
>>
>> 0000000000000030 <.L7>:
>>   30:   40a60533                sub     a0,a2,a0
>>
>> 0000000000000034 <.LVL7>:
>>   34:   00008067                ret
>>
>> 0000000000000038 <.L14>:
>>   38:   00164683                lbu     a3,1(a2)
>>   3c:   00160613                addi    a2,a2,1
>>   40:   fe0688e3                beqz    a3,30 <.L7>
>>   44:   00164683                lbu     a3,1(a2)
>>   48:   00160613                addi    a2,a2,1
>>   4c:   fe0696e3                bnez    a3,38 <.L14>
>>   50:   fe1ff06f                j       30 <.L7>
>>
>> 0000000000000054 <.L17>:
>>   54:   00164683                lbu     a3,1(a2)
>>   58:   00160613                addi    a2,a2,1
>>   5c:   fa069ce3                bnez    a3,14 <.L6>
>>   60:   40a60533                sub     a0,a2,a0
>>
>> 0000000000000064 <.LVL13>:
>>   64:   00008067                ret
>>
>> 0000000000000068 <.L9>:
>>   68:   00000513                li      a0,0
>>
>> 000000000000006c <.LVL15>:
>>   6c:   00008067                ret
>> ---------------------------------------------------------
>>
>> patch
>> ---------------------------------------------------------
>> libc_a-strcspn.o:     file format elf64-littleriscv
>>
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <strcspn>:
>>    0:   00054683                lbu     a3,0(a0)
>>    4:   04068a63                beqz    a3,58 <.L2>
>>    8:   0005c803                lbu     a6,0(a1)
>>    c:   00050613                mv      a2,a0
>>
>> 0000000000000010 <.LVL1>:
>>   10:   02080c63                beqz    a6,48 <.L14>
>>
>> 0000000000000014 <.L6>:
>>   14:   00058793                mv      a5,a1
>>   18:   00080713                mv      a4,a6
>>   1c:   00c0006f                j       28 <.L5>
>>
>> 0000000000000020 <.L4>:
>>   20:   0007c703                lbu     a4,0(a5)
>>   24:   00070a63                beqz    a4,38 <.L16>
>>
>> 0000000000000028 <.L5>:
>>   28:   00178793                addi    a5,a5,1
>>
>> 000000000000002c <.LVL5>:
>>   2c:   fee69ae3                bne     a3,a4,20 <.L4>
>>
>> 0000000000000030 <.L7>:
>>   30:   40a60533                sub     a0,a2,a0
>>
>> 0000000000000034 <.LVL7>:
>>   34:   00008067                ret
>>
>> 0000000000000038 <.L16>:
>>   38:   00164683                lbu     a3,1(a2)
>>   3c:   00160613                addi    a2,a2,1
>>   40:   fc069ae3                bnez    a3,14 <.L6>
>>   44:   fedff06f                j       30 <.L7>
>>
>> 0000000000000048 <.L14>:
>>   48:   00164683                lbu     a3,1(a2)
>>   4c:   00160613                addi    a2,a2,1
>>   50:   fe069ce3                bnez    a3,48 <.L14>
>>   54:   fddff06f                j       30 <.L7>
>>
>> 0000000000000058 <.L2>:
>>   58:   00000513                li      a0,0
>>
>> 000000000000005c <.LVL12>:
>>   5c:   00008067                ret
>> ---------------------------------------------------------
>> After careful comparison, it was found that there are fewer assembly
>> instructions after the patch.
>>
>> Thanks
>> Xiao Zeng
>>
>>
 
Thanks
Xiao Zeng
  
Torbjörn SVENSSON Dec. 20, 2023, 3:08 p.m. UTC | #7
Hi Xiao,

On 2023-12-16 05:45, Xiao Zeng wrote:
> 3 It is worth noting that strcspn and strspn only differ by one letter 'c'. I also
> often confuse them -:), maybe you are the same.

Yes, sorry, I did that mistake.
Again, sorry for the noise. :'(


Kind regards,
Torbjörn
  
Brian Inglis Dec. 20, 2023, 10:24 p.m. UTC | #8
On 2023-12-19 21:24, Jeff Johnston wrote:
> Patch merged to master.
> On Sat, Dec 16, 2023 at 4:31 AM Xiao Zeng wrote:
>     2023-12-15 18:28  Torbjorn SVENSSON wrote:
>> On 2023-12-15 09:31, Xiao Zeng wrote:
>>> Signed-off-by: Xiao Zeng <zengxiao@eswincomputing.com
>     <mailto:zengxiao@eswincomputing.com>>
>>> ---
>>> newlib/libc/string/strcspn.c | 6 ++----
>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
>>> index abaa93ad6..8ac0bf10c 100644
>>> --- a/newlib/libc/string/strcspn.c
>>> +++ b/newlib/libc/string/strcspn.c
>>> @@ -37,12 +37,10 @@ strcspn (const char *s1,
>      >>         for (c = s2; *c; c++)
>      >>   {
>      >>     if (*s1 == *c)
>      >> -        break;
>      >> +        goto end;
>      >>   }
>      >> -      if (*c)
>      >> -    break;
>      >>         s1++;
>      >>       }
>      >> -
>      >> +end:
>      >>     return s1 - s;
>      >>   }

>> Just looking at this small snippet of code, I would say that the previous
>> code and your suggestion won't do the same thing.
>> 
>> Do you have unit tests that confirm that the behavior is identical with the
>> current implementation and your suggested change?
>> 
>> When I run your suggestion, I get return value 0, but with the current 
>> implementation it's 3 for this call: strspn("129th", "1234567890").

> After applying this patch, provide a comparison of assembly code under the
> risc-v architecture, with default compilation parameters used in both of
> them:
These "micro-optimizations" improve code generation by a few instructions on a 
single (RISC-V) target at a single optimization level of a single compiler and 
version, but what is the cost in execution time and the cache imoact?

Using gotos throw away potential optimizations in modern compilers, where 
goto-less code may have control and/or data flow optimized, with branches 
altered or eliminated, depending on target instruction sets and compiler 
supported optimizations selected and implemented.

For example, in these small functions with few branches, conditional execution 
instructions could be generated, eliminating branches, cache, and lookaside 
buffer impacts, possibly allowing inlining.

Who knows what impacts this has on all of the other targets, compilers, 
versions, and optimization levels?

Should we even consider making these kinds of non-bug-fix minor changes to 
non-target specific sources, unless there are algorithm changes with 
demonstrated benefits across multiple targets, compilers, versions, and 
optimization levels?
  
Xiao Zeng Dec. 21, 2023, 1:22 a.m. UTC | #9
2023-12-20 23:08  Torbjorn SVENSSON <torbjorn.svensson@foss.st.com> wrote:
>
 
>Hi Xiao,
>
>On 2023-12-16 05:45, Xiao Zeng wrote:
>> 3 It is worth noting that strcspn and strspn only differ by one letter 'c'. I also
>> often confuse them -:), maybe you are the same.
>
>Yes, sorry, I did that mistake. 
It's ok.
I am very pleased to receive your code review comments in the upcoming
patchs, which will bring great encouragement to me as a newcomer.
At the same time, it can also create better code, which is why we come together.

>Again, sorry for the noise. :'(
>
>
>Kind regards,
>Torbjörn
 
Thanks
Xiao Zeng
  
Stefan Tauner Dec. 21, 2023, 8 a.m. UTC | #10
On Wed, 20 Dec 2023 15:24:10 -0700
Brian Inglis <Brian.Inglis@SystematicSW.ab.ca> wrote:

> Using gotos throw away potential optimizations in modern compilers

{{citation needed}}
  
Brian Inglis Dec. 21, 2023, 5:47 p.m. UTC | #11
On 2023-12-21 01:00, Stefan Tauner wrote:
> On Wed, 20 Dec 2023 15:24:10 -0700 Brian Inglis wrote:
>> Using gotos throw away potential optimizations in modern compilers
> {{citation needed}}

Please note that is a simplification to make the point.
No gotos means reducible flow graphs which allow easier analysis and optimization.
The main problem is the creation of irreducible flow graphs using gotos.
Compilers may be able to convert irreducible flow graphs to reducible flow 
graphs, eliminating the gotos by restructuring.
This may then allow the code to meet preconditions allowing further loop or 
block optimizations designed for structured reducible flow graphs.

See:
https://groups.seas.harvard.edu/courses/cs153/2019fa/lectures/Lec23-Loop-optimization.pdf

	https://escholarship.mcgill.ca/concern/theses/x633f2516
  

Patch

diff --git a/newlib/libc/string/strcspn.c b/newlib/libc/string/strcspn.c
index abaa93ad6..8ac0bf10c 100644
--- a/newlib/libc/string/strcspn.c
+++ b/newlib/libc/string/strcspn.c
@@ -37,12 +37,10 @@  strcspn (const char *s1,
       for (c = s2; *c; c++)
 	{
 	  if (*s1 == *c)
-	    break;
+	    goto end;
 	}
-      if (*c)
-	break;
       s1++;
     }
-
+end:
   return s1 - s;
 }