gdb/testsuite: change hardcoded assembly in gdb.arch/disp-step-insn-reloc.exp

Message ID 20230426132916.1988539-1-blarsen@redhat.com
State New
Headers
Series gdb/testsuite: change hardcoded assembly in gdb.arch/disp-step-insn-reloc.exp |

Commit Message

Guinevere Larsen April 26, 2023, 1:29 p.m. UTC
  When testing gdb.arch.disp-step-insn-reloc.exp with clang in an x86_64
machine, the compiled test case would segfault when returning from
the function can_relocate_call, with a suggestion of a broken stack.
The example assembly in the commment was the following:

   f:
     MOV $1, %[ok]
     JMP end
   set_point0:
     CALL f ; tracepoint here.
   end:

And the segmentation fault happening at the final "ret" instruction of
the original function.  This suggests that gcc's compilation process would
realize that no ret instruction ever happened after that call and doesn't
save the return address, while clang's process wouldn't.  Looking at the
generated instructions, we can indeed see a difference:

clang's version: e8 f1 ff ff ff          call   11a4 <can_relocate_call+0x14>
gcc's version:   e8 f4 ff ff ff          call   401125 <can_relocate_call+0x11>

Notice the difference on the second byte.

Changing the assembly to use "ret" instead of "JMP end" does not change
the behavior of the program and guarantees a compiler independent
behavior.  This commit does just that.
---
 gdb/testsuite/gdb.arch/insn-reloc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
  

Comments

Guinevere Larsen May 11, 2023, 9:04 a.m. UTC | #1
Ping!
  
Guinevere Larsen May 18, 2023, 9:01 a.m. UTC | #2
Ping!
  
Andrew Burgess May 19, 2023, 9:52 p.m. UTC | #3
Bruno Larsen via Gdb-patches <gdb-patches@sourceware.org> writes:

> When testing gdb.arch.disp-step-insn-reloc.exp with clang in an x86_64
> machine, the compiled test case would segfault when returning from
> the function can_relocate_call, with a suggestion of a broken stack.
> The example assembly in the commment was the following:
>
>    f:
>      MOV $1, %[ok]
>      JMP end
>    set_point0:
>      CALL f ; tracepoint here.
>    end:

Honestly, I found the following paragraph really hard to grok.  I had to
go look at the failure myself to see what was going on.  Don't spend too
long on it, but it might be possible to make it a little clearer...

>
> And the segmentation fault happening at the final "ret" instruction of
> the original function.

s/the original function/the function cal_relocate_call/.

>                         This suggests that gcc's compilation process would
> realize that no ret instruction ever happened after that call and doesn't

s/after that call and/after the inner call and so/

> save the return address, while clang's process wouldn't.  Looking at the

Maybe:

save the return address, while clang doesn't spot the missing 'ret' and
so does push the return address onto the stack.  Looking at the

> generated instructions, we can indeed see a difference:

>
> clang's version: e8 f1 ff ff ff          call   11a4 <can_relocate_call+0x14>
> gcc's version:   e8 f4 ff ff ff          call   401125 <can_relocate_call+0x11>
>
> Notice the difference on the second byte.

Yes, but please, don't expect me to go read the x86-64 ISA manual --
what does this difference mean?

>
> Changing the assembly to use "ret" instead of "JMP end" does not change
> the behavior of the program and guarantees a compiler independent
> behavior.  This commit does just that.

The change itself is great, and I agree that this shouldn't change the
test.

Feel free to tweak the commit message and push this.

Approved-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

> ---
>  gdb/testsuite/gdb.arch/insn-reloc.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/gdb/testsuite/gdb.arch/insn-reloc.c b/gdb/testsuite/gdb.arch/insn-reloc.c
> index f687c2c5631..365e6180057 100644
> --- a/gdb/testsuite/gdb.arch/insn-reloc.c
> +++ b/gdb/testsuite/gdb.arch/insn-reloc.c
> @@ -49,10 +49,9 @@ fail (void)
>       JMP set_point0
>     f:
>       MOV $1, %[ok]
> -     JMP end
> +     RET
>     set_point0:
>       CALL f ; tracepoint here.
> -   end:
>  
>     */
>  
> @@ -65,10 +64,9 @@ can_relocate_call (void)
>         "  jmp " SYMBOL (set_point0) "\n"
>         "0:\n"
>         "  mov $1, %[ok]\n"
> -       "  jmp 1f\n"
> +       "  ret\n"
>         SYMBOL (set_point0) ":\n"
>         "  call 0b\n"
> -       "1:\n"
>         : [ok] "=r" (ok));
>  
>    if (ok == 1)
> -- 
> 2.39.2
  
Andrew Burgess May 20, 2023, 6:31 a.m. UTC | #4
Andrew Burgess <aburgess@redhat.com> writes:

> Bruno Larsen via Gdb-patches <gdb-patches@sourceware.org> writes:
>
>> When testing gdb.arch.disp-step-insn-reloc.exp with clang in an x86_64
>> machine, the compiled test case would segfault when returning from
>> the function can_relocate_call, with a suggestion of a broken stack.
>> The example assembly in the commment was the following:
>>
>>    f:
>>      MOV $1, %[ok]
>>      JMP end
>>    set_point0:
>>      CALL f ; tracepoint here.
>>    end:
>
> Honestly, I found the following paragraph really hard to grok.  I had to
> go look at the failure myself to see what was going on.  Don't spend too
> long on it, but it might be possible to make it a little clearer...
>
>>
>> And the segmentation fault happening at the final "ret" instruction of
>> the original function.
>
> s/the original function/the function cal_relocate_call/.
>
>>                         This suggests that gcc's compilation process would
>> realize that no ret instruction ever happened after that call and doesn't
>
> s/after that call and/after the inner call and so/
>
>> save the return address, while clang's process wouldn't.  Looking at the
>
> Maybe:
>
> save the return address, while clang doesn't spot the missing 'ret' and
> so does push the return address onto the stack.  Looking at the
>
>> generated instructions, we can indeed see a difference:
>
>>
>> clang's version: e8 f1 ff ff ff          call   11a4 <can_relocate_call+0x14>
>> gcc's version:   e8 f4 ff ff ff          call   401125 <can_relocate_call+0x11>
>>
>> Notice the difference on the second byte.
>
> Yes, but please, don't expect me to go read the x86-64 ISA manual --
> what does this difference mean?
>
>>
>> Changing the assembly to use "ret" instead of "JMP end" does not change
>> the behavior of the program and guarantees a compiler independent
>> behavior.  This commit does just that.
>
> The change itself is great, and I agree that this shouldn't change the
> test.
>
> Feel free to tweak the commit message and push this.

Actually, I retract this.  Having thought about this a bit more I now
worry that the explanation doesn't really explain what's going on.  Is
there really a call variant that doesn't push the return address?

Even if that was the case, it is gas, not gcc that decides the encoding
of the instruction, and gas isn't analysing the destination to decide if
there's a ret or not.

I suspect you fix is actually the right one, but I think this
explanation is going in the wrong direction.  Maybe if you reword things
it might be clearer what's going on.

Thanks,
Andrew


>
> Approved-By: Andrew Burgess <aburgess@redhat.com>
>
> Thanks,
> Andrew
>
>> ---
>>  gdb/testsuite/gdb.arch/insn-reloc.c | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/gdb/testsuite/gdb.arch/insn-reloc.c b/gdb/testsuite/gdb.arch/insn-reloc.c
>> index f687c2c5631..365e6180057 100644
>> --- a/gdb/testsuite/gdb.arch/insn-reloc.c
>> +++ b/gdb/testsuite/gdb.arch/insn-reloc.c
>> @@ -49,10 +49,9 @@ fail (void)
>>       JMP set_point0
>>     f:
>>       MOV $1, %[ok]
>> -     JMP end
>> +     RET
>>     set_point0:
>>       CALL f ; tracepoint here.
>> -   end:
>>  
>>     */
>>  
>> @@ -65,10 +64,9 @@ can_relocate_call (void)
>>         "  jmp " SYMBOL (set_point0) "\n"
>>         "0:\n"
>>         "  mov $1, %[ok]\n"
>> -       "  jmp 1f\n"
>> +       "  ret\n"
>>         SYMBOL (set_point0) ":\n"
>>         "  call 0b\n"
>> -       "1:\n"
>>         : [ok] "=r" (ok));
>>  
>>    if (ok == 1)
>> -- 
>> 2.39.2
  
Andrew Burgess May 20, 2023, 9:19 a.m. UTC | #5
Bruno Larsen via Gdb-patches <gdb-patches@sourceware.org> writes:

> Ping!
>
> -- 
> Cheers,
> Bruno
>
> On 26/04/2023 15:29, Bruno Larsen wrote:
>> When testing gdb.arch.disp-step-insn-reloc.exp with clang in an x86_64
>> machine, the compiled test case would segfault when returning from
>> the function can_relocate_call, with a suggestion of a broken stack.
>> The example assembly in the commment was the following:
>>
>>     f:
>>       MOV $1, %[ok]
>>       JMP end
>>     set_point0:
>>       CALL f ; tracepoint here.
>>     end:
>>
>> And the segmentation fault happening at the final "ret" instruction of
>> the original function.  This suggests that gcc's compilation process would
>> realize that no ret instruction ever happened after that call and doesn't
>> save the return address, while clang's process wouldn't.  Looking at the
>> generated instructions, we can indeed see a difference:

This replaces my earlier feedback.

This description is not really correct.  As far as I'm aware call always
pushes the return address onto the stack, so it's not the call that
changes (in a significant way) between GCC and Clang.

Here's the Clang disassembly (without your patch):

  0000000000001190 <can_relocate_call>:
      1190:       55                      push   %rbp
      1191:       48 89 e5                mov    %rsp,%rbp
      1194:       48 83 ec 10             sub    $0x10,%rsp
      1198:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
      119f:       e9 0a 00 00 00          jmpq   11ae <set_point0>
      11a4:       b8 01 00 00 00          mov    $0x1,%eax
      11a9:       e9 05 00 00 00          jmpq   11b3 <set_point0+0x5>
  
  00000000000011ae <set_point0>:
      11ae:       e8 f1 ff ff ff          callq  11a4 <can_relocate_call+0x14>
      11b3:       89 45 fc                mov    %eax,-0x4(%rbp)
      11b6:       83 7d fc 01             cmpl   $0x1,-0x4(%rbp)
      11ba:       0f 85 0a 00 00 00       jne    11ca <set_point0+0x1c>
      11c0:       e8 5b 00 00 00          callq  1220 <pass>
      11c5:       e9 05 00 00 00          jmpq   11cf <set_point0+0x21>
      11ca:       e8 61 00 00 00          callq  1230 <fail>
      11cf:       48 83 c4 10             add    $0x10,%rsp
      11d3:       5d                      pop    %rbp
      11d4:       c3                      retq   
      11d5:       66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
      11dc:       00 00 00 00 

And here's GCC (also unpatched):

  0000000000401114 <can_relocate_call>:
    401114:       55                      push   %rbp
    401115:       48 89 e5                mov    %rsp,%rbp
    401118:       48 83 ec 10             sub    $0x10,%rsp
    40111c:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
    401123:       eb 07                   jmp    40112c <set_point0>
    401125:       b8 01 00 00 00          mov    $0x1,%eax
    40112a:       eb 05                   jmp    401131 <set_point0+0x5>
  
  000000000040112c <set_point0>:
    40112c:       e8 f4 ff ff ff          callq  401125 <can_relocate_call+0x11>
    401131:       89 45 fc                mov    %eax,-0x4(%rbp)
    401134:       83 7d fc 01             cmpl   $0x1,-0x4(%rbp)
    401138:       75 07                   jne    401141 <set_point0+0x15>
    40113a:       e8 c7 ff ff ff          callq  401106 <pass>
    40113f:       eb 05                   jmp    401146 <set_point0+0x1a>
    401141:       e8 c7 ff ff ff          callq  40110d <fail>
    401146:       90                      nop
    401147:       c9                      leaveq 
    401148:       c3                      retq   

The critical difference is actually the very end of the function,
starting at 0x11cf in Clang, and 0x401147 in GCC.

For both builds, when we do the "inner" call -- the fake one which you
correctly identify as missing a 'ret' -- we leave the return address
within can_relocate_call pushed onto the stack, the $rsp value is 0x8
bytes offset from where can_relocate_call expects it to be.

However, our fake "inner" function, doesn't adjust $rbp, so that
register is still correctly set as can_relocate_call expects it.

In the Clang build, when we return we manually adjust $rsp with an
offset (so the new value is still 0x8 bytes off from where it should
be), then pop $rbp, so we pop this from the wrong stack slot, and then
we ret, which pulls the previous $pc value from the wrong stack slot.

In the GCC build we just the leaveq instruction, this combines the stack
adjust and the pop by just copying $rbp into $rsp.  As Our fake "inner"
function never adjusted $rbp, the $rbp value was still correct, so we
restore the correct value of $rsp.  Then when we retq we pull the
previous $pc from the correct stack slot and all is good.

The fix is, or course, exactly what you originally proposed -- change
the fake "inner" function to end with a ret.  But the above is why that
works.

Could you rewrite the commit message and repost this patch please.

Thanks,
Andrew


>>
>> clang's version: e8 f1 ff ff ff          call   11a4 <can_relocate_call+0x14>
>> gcc's version:   e8 f4 ff ff ff          call   401125 <can_relocate_call+0x11>
>>
>> Notice the difference on the second byte.
>>
>> Changing the assembly to use "ret" instead of "JMP end" does not change
>> the behavior of the program and guarantees a compiler independent
>> behavior.  This commit does just that.
>> ---
>>   gdb/testsuite/gdb.arch/insn-reloc.c | 6 ++----
>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/gdb/testsuite/gdb.arch/insn-reloc.c b/gdb/testsuite/gdb.arch/insn-reloc.c
>> index f687c2c5631..365e6180057 100644
>> --- a/gdb/testsuite/gdb.arch/insn-reloc.c
>> +++ b/gdb/testsuite/gdb.arch/insn-reloc.c
>> @@ -49,10 +49,9 @@ fail (void)
>>        JMP set_point0
>>      f:
>>        MOV $1, %[ok]
>> -     JMP end
>> +     RET
>>      set_point0:
>>        CALL f ; tracepoint here.
>> -   end:
>>   
>>      */
>>   
>> @@ -65,10 +64,9 @@ can_relocate_call (void)
>>          "  jmp " SYMBOL (set_point0) "\n"
>>          "0:\n"
>>          "  mov $1, %[ok]\n"
>> -       "  jmp 1f\n"
>> +       "  ret\n"
>>          SYMBOL (set_point0) ":\n"
>>          "  call 0b\n"
>> -       "1:\n"
>>          : [ok] "=r" (ok));
>>   
>>     if (ok == 1)
  

Patch

diff --git a/gdb/testsuite/gdb.arch/insn-reloc.c b/gdb/testsuite/gdb.arch/insn-reloc.c
index f687c2c5631..365e6180057 100644
--- a/gdb/testsuite/gdb.arch/insn-reloc.c
+++ b/gdb/testsuite/gdb.arch/insn-reloc.c
@@ -49,10 +49,9 @@  fail (void)
      JMP set_point0
    f:
      MOV $1, %[ok]
-     JMP end
+     RET
    set_point0:
      CALL f ; tracepoint here.
-   end:
 
    */
 
@@ -65,10 +64,9 @@  can_relocate_call (void)
        "  jmp " SYMBOL (set_point0) "\n"
        "0:\n"
        "  mov $1, %[ok]\n"
-       "  jmp 1f\n"
+       "  ret\n"
        SYMBOL (set_point0) ":\n"
        "  call 0b\n"
-       "1:\n"
        : [ok] "=r" (ok));
 
   if (ok == 1)