[rs6000] adjust return_pc debug attrs

Message ID orjzzxiul9.fsf@lxoliva.fsfla.org
State New

Commit Message

Alexandre Oliva March 3, 2023, 6 p.m. UTC
  Some of the rs6000 call patterns, on some ABIs, issue multiple opcodes
out of a single call insn, but the call (bl) or jump (b) is not always
the last opcode in the sequence.

This does not seem to be a problem for exception handling tables, but
the return_pc attribute in the call graph output in dwarf2+ debug
information, that takes the address of a label output right after the
call, does not match the value of the link register even for non-tail
calls.  E.g., with ABI_AIX or ABI_ELFv2, such code as:

  foo ();

outputs:

  bl foo
  nop
 .LVL#:
[...]
  .8byte .LVL#  # DW_AT_call_return_pc

but debug info consumers may rely on the return_pc address, and draw
incorrect conclusions from its off-by-4 value.

This patch introduces infrastructure for targets to add an offset to
the label issued after the call_insn to set the call_return_pc
attribute, and uses that on rs6000 to account for nop and l opcodes
issued after actual call opcode as part of call insns output patterns.
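
As a rough illustration of the mechanism, and not the dwarf2out.cc code
itself, the sketch below shows how a label name and a target-supplied
offset could be joined into the assembler expression that is emitted for
DW_AT_call_return_pc.  The function name return_pc_expr and the label
.LVL3 used in the note below are made up for the example; only the
"%+i" formatting mirrors what the patch passes to xasprintf.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the patched add_AT_lbl_id: build the
// expression the assembler will resolve to the return PC.
std::string
return_pc_expr (const std::string &label, int offset)
{
  // A zero offset leaves the label reference unchanged.
  if (offset == 0)
    return label;

  // "%+i" always prints the sign, so the result is a valid
  // label-plus-constant expression such as "<label>-4".
  char buf[32];
  std::snprintf (buf, sizeof buf, "%+i", offset);
  return label + buf;
}
```

With an offset of -4, a label named .LVL3 would become the expression
.LVL3-4, which the assembler resolves to the address four bytes before
the label.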


for  gcc/ChangeLog

	* target.def (call_offset_return_label): New hook.
	* gcc/doc/tm.texi.in (TARGET_CALL_OFFSET_RETURN_LABEL): Add
	placeholder.
	* gcc/doc/tm.texi: Rebuild.
	* dwarf2out.cc (struct call_arg_loc_node): Record call_insn
	instead of call_arg_loc_note.
	(add_AT_lbl_id): Add optional offset argument.
	(gen_call_site_die): Compute and pass on a return pc offset.
	(gen_subprogram_die): Move call_arg_loc_note computation...
	(dwarf2out_var_location): ... from here.  Set call_insn.
	* config/rs6000/rs6000.cc (TARGET_CALL_OFFSET_RETURN_LABEL):
	Override.
	(rs6000_call_offset_return_label): New.
	* config/rs6000/rs6000.md (call_needs_return_offset): New
	attribute.  Set it on call patterns that may require
	offsetting.
---
 gcc/config/rs6000/rs6000.cc |   37 +++++++++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.md |   24 ++++++++++++++++++++++++
 gcc/doc/tm.texi             |    7 +++++++
 gcc/doc/tm.texi.in          |    2 ++
 gcc/dwarf2out.cc            |   26 +++++++++++++++++---------
 gcc/target.def              |    9 +++++++++
 6 files changed, 96 insertions(+), 9 deletions(-)
  

Comments

Segher Boessenkool March 13, 2023, 2:30 p.m. UTC | #1
Hi!

This is stage 1 stuff (or does it fix some regression or such?)

On Fri, Mar 03, 2023 at 03:00:02PM -0300, Alexandre Oliva wrote:
> Some of the rs6000 call patterns, on some ABIs, issue multiple opcodes
> out of a single call insn, but the call (bl) or jump (b) is not always
> the last opcode in the sequence.

Yes.  On most architectures you can get multiple machine instructions of
course (for long calls for example), but on rs6000 (with some ABIs, in
some circumstances) we generate a nop insn after calls, so that the
linker has a spot to insert fixup code after calls (typically to restore
the r2 contents, but it could be anything).

> This does not seem to be a problem for exception handling tables, but
> the return_pc attribute in the call graph output in dwarf2+ debug
> information, that takes the address of a label output right after the
> call, does not match the value of the link register even for non-tail
> calls.

And why would it?  I think you mean something in the existing code
expects that, but please be explicit?

Subtracting 4 from what we currently return is very fragile.  The actual
return address is *always* the address of the branch insn plus 4, can't
you use that?  That is true for all architectures with a link register,
not just rs6000, fwiw (for some value of "4" of course).
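
The arithmetic behind this point can be sketched as follows, assuming
fixed 4-byte instructions and using hypothetical helper names (none of
this is GCC code).  The value a branch-and-link leaves in the link
register depends only on the address of the branch, whereas a label
emitted after the whole call insn also moves with whatever trails the
branch.

```cpp
#include <cstdint>

// Assumption for this sketch: every instruction is 4 bytes wide.
constexpr std::uint64_t insn_size = 4;

// What a branch-and-link at BRANCH_ADDR stores in the link register.
constexpr std::uint64_t
return_address (std::uint64_t branch_addr)
{
  return branch_addr + insn_size;
}

// Address of a label placed after the complete call insn, where
// TRAILING_INSNS counts instructions output after the branch
// (for example the nop the linker may later rewrite).
constexpr std::uint64_t
label_after_call (std::uint64_t branch_addr, unsigned trailing_insns)
{
  return branch_addr + insn_size * (1 + trailing_insns);
}
```

With one trailing instruction the two values differ by 4, which is the
off-by-4 described in the commit message; with none they coincide.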

> E.g., with ABI_AIX or ABI_ELFv2, such code as:
> 
>   foo ();
> 
> outputs:
> 
>   bl foo
>   nop
>  .LVL#:
> [...]
>   .8byte .LVL#  # DW_AT_call_return_pc
> 
> but debug info consumers may rely on the return_pc address, and draw
> incorrect conclusions from its off-by-4 value.

(Does the GCC code handle delay slots here, btw?  That is unrelated of
course.)

> This patch introduces infrastructure for targets to add an offset to
> the label issued after the call_insn to set the call_return_pc
> attribute, and uses that on rs6000 to account for nop and l opcodes
> issued after actual call opcode as part of call insns output patterns.

What is an "l opcode"?  This is confusing on so many levels; worst it
suggests LK=1 opcodes (like bl: any branch that sets the link register),
which is not what this is about ;-)

> 	* target.def (call_offset_return_label): New hook.
> 	* gcc/doc/tm.texi.in (TARGET_CALL_OFFSET_RETURN_LABEL): Add
> 	placeholder.
> 	* gcc/doc/tm.texi: Rebuild.
> 	* dwarf2out.cc (struct call_arg_loc_node): Record call_insn
> 	instead of call_arg_loc_note.
> 	(add_AT_lbl_id): Add optional offset argument.
> 	(gen_call_site_die): Compute and pass on a return pc offset.
> 	(gen_subprogram_die): Move call_arg_loc_note computation...
> 	(dwarf2out_var_location): ... from here.  Set call_insn.
> 	* config/rs6000/rs6000.cc (TARGET_CALL_OFFSET_RETURN_LABEL):
> 	Override.
> 	(rs6000_call_offset_return_label): New.
> 	* config/rs6000/rs6000.md (call_needs_return_offset): New
> 	attribute.  Set it on call patterns that may require
> 	offsetting.

It is much nicer to review if you split the generic and target parts (it
makes it a better patch series in general, too, not accidentally).

> +/* Return the offset to be added to the label output after CALL_INSN
> +   to compute the address to be placed in DW_AT_call_return_pc.  Some
> +   call insns output nop or l after bl, so the return address would be
> +   wrong without this offset.  */
> +
> +static int
> +rs6000_call_offset_return_label (rtx_insn *call_insn)
> +{
> +  /* We don't expect SEQUENCEs in this port.  */
> +  gcc_checking_assert (GET_CODE (call_insn) == CALL_INSN);

That is not doing what the comment says.  Just delete the comment?  But,
is the assert useful at all anyway, won't the callers have checked for
this already?

> +  enum attr_call_needs_return_offset cnro
> +    = get_attr_call_needs_return_offset (call_insn);
> +
> +  if (cnro == CALL_NEEDS_RETURN_OFFSET_NONE)
> +    return 0;

  if (get_attr_call_needs_return_offset (call_insn)
      == CALL_NEEDS_RETURN_OFFSET_NONE)
    return 0;

Shorter, simpler, doesn't need a variable for which you have no good
name :-)

> +  if (rs6000_pcrel_p ())
> +    return 0;

Why this?  Please look at what actual code is generated, don't assume
you don't need anything here if we generate pcrel code?

> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)

(No else after returns please).

> +    /* rs6000_call_template_1 outputs a nop after non-sibcall insns;
> +       we mark sibcall insns with NONE rather than DIRECT, so we
> +       should have returned zero above.

Please comment this *there*!

> +       rs6000_indirect_call_template_1 outputs an l insn after
> +       indirect calls in these ABIs.  */
> +    return -4;

The general point is that there is *some* insn, so that the linker (or
dynamic linker) can fix it up when needed.  That we already output an
ld or lwz (not "l", which is an ancient POWER insn, btw) there is not so
important.

> +  else if (DEFAULT_ABI == ABI_V4)
> +    return 0;
> +  else if (DEFAULT_ABI == ABI_DARWIN)
> +    return 0;
> +  else
> +    return 0;
> +}

The first two of these are superfluous.
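
For illustration only, here is how the function body might look with the
style comments above applied (no else after a return, no redundant
branches).  The enums and the parameter list are simplified stand-ins
for the GCC internals, so this is a sketch of the control flow and not
something that would drop into rs6000.cc.

```cpp
// Simplified stand-ins for DEFAULT_ABI and the insn attribute.
enum abi_kind { ABI_AIX, ABI_ELFv2, ABI_V4, ABI_DARWIN };
enum call_needs_return_offset { CNRO_NONE, CNRO_DIRECT, CNRO_INDIRECT };

// Offset to add to the label after the call insn; the three
// parameters replace queries the real hook would make itself.
int
call_offset_return_label (call_needs_return_offset attr, bool pcrel,
                          abi_kind default_abi)
{
  // Patterns that output nothing after the branch need no offset.
  if (attr == CNRO_NONE)
    return 0;

  // PC-relative calls are output without a trailing instruction.
  if (pcrel)
    return 0;

  // These ABIs output one 4-byte instruction after the branch.
  if (default_abi == ABI_AIX || default_abi == ABI_ELFv2)
    return -4;

  return 0;
}
```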

> +;; Calls that output insns after bl need DW_AT_call_return_pc to be
> +;; adjusted.  rs6000_call_offset_return_label uses this attribute to
> +;; conservatively recognize the relevant patterns.
> +(define_attr "call_needs_return_offset" "none,direct,indirect"
> +  (const_string "none"))

Like I said above, this is all just because of a misdesign here: we
should calculate the return address from first principles, not try to
undo adding all the stuff we wrongly did.

This attribute should not exist.


Segher
  
Tom Tromey March 17, 2023, 4:33 p.m. UTC | #2
>>>>> "Segher" == Segher Boessenkool <segher@kernel.crashing.org> writes:

Segher> Yes.  On most architectures you can get multiple machine instructions of
Segher> course (for long calls for example), but on rs6000 (with some ABIs, in
Segher> some circumstances) we generate a nop insn after calls, so that the
Segher> linker has a spot to insert fixup code after calls (typically to restore
Segher> the r2 contents, but it could be anything).

FWIW I sent a gdb patch to work around this bug.  However, in my
examples, I only ever saw a nop following the call instruction -- so I
had gdb check for this.

Patch is here:

https://sourceware.org/pipermail/gdb-patches/2023-March/197951.html

... but I suppose I should change it to drop the nop check?

It would of course be better not to have to have gdb work around this
problem.

Tom
  
Segher Boessenkool March 17, 2023, 4:58 p.m. UTC | #3
On Fri, Mar 17, 2023 at 10:33:22AM -0600, Tom Tromey wrote:
> >>>>> "Segher" == Segher Boessenkool <segher@kernel.crashing.org> writes:
> 
> Segher> Yes.  On most architectures you can get multiple machine instructions of
> Segher> course (for long calls for example), but on rs6000 (with some ABIs, in
> Segher> some circumstances) we generate a nop insn after calls, so that the
> Segher> linker has a spot to insert fixup code after calls (typically to restore
> Segher> the r2 contents, but it could be anything).
> 
> FWIW I sent a gdb patch to work around this bug.  However, in my
> examples, I only ever saw a nop following the call instruction -- so I
> had gdb check for this.

GCC inserts just a nop in most cases, but the linker or dynamic linker
can replace it.

> Patch is here:
> 
> https://sourceware.org/pipermail/gdb-patches/2023-March/197951.html
> 
> ... but I suppose I should change it to drop the nop check?
> 
> It would of course be better not to have to have gdb work around this
> problem.

Yup.


Segher
  
Alexandre Oliva March 23, 2023, 3:06 a.m. UTC | #4
On Mar 13, 2023, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> Hi!
> This is stage 1 stuff (or does it fix some regression or such?)

> On Fri, Mar 03, 2023 at 03:00:02PM -0300, Alexandre Oliva wrote:
>> Some of the rs6000 call patterns, on some ABIs, issue multiple opcodes
>> out of a single call insn, but the call (bl) or jump (b) is not always
>> the last opcode in the sequence.

> Yes.  On most architectures you can get multiple machine instructions of
> course (for long calls for example), but on rs6000 (with some ABIs, in
> some circumstances) we generate a nop insn after calls, so that the
> linker has a spot to insert fixup code after calls (typically to restore
> the r2 contents, but it could be anything).

Thanks.  I wasn't entirely sure that the linker would never ever relax
the call sequence, moving the bl to the second instruction in the pair,
or that no 64-bit call variant existed or would come into existence.


> Subtracting 4 from what we currently return is very fragile.

Agreed.

> The actual return address is *always* the address of the branch insn
> plus 4, can't you use that?

Yup, given this piece of knowledge I didn't have, I agree that's a far
saner approach.  I'll post a new version of the patch, now broken up
into rs6000-specific and machine-independent, momentarily.

> (Does the GCC code handle delay slots here, btw?

It does, in that the label is output after the insn sequence.

>> This patch introduces infrastructure for targets to add an offset to
>> the label issued after the call_insn to set the call_return_pc
>> attribute, and uses that on rs6000 to account for nop and l opcodes
>> issued after actual call opcode as part of call insns output patterns.

> What is an "l opcode"?

I have a vague recollection of seeing call sequences ended by loads.
Ah, yes, rs6000_indirect_call_template_1 outputs ld or lwz, depending on
TARGET_64BIT, in both speculate and non-speculate cases after the branch
in ABI_AIX and ABI_ELFv2 calls.  I understand the l in ld and lwz stands
for 'load', and so I meant 'load' updates, but I guess in the context of
calls the 'l' can indeed be misleading.  Anyway, that complexity is gone
thanks to your suggestion.

>> +/* Return the offset to be added to the label output after CALL_INSN
>> +   to compute the address to be placed in DW_AT_call_return_pc.  Some
>> +   call insns output nop or l after bl, so the return address would be
>> +   wrong without this offset.  */
>> +
>> +static int
>> +rs6000_call_offset_return_label (rtx_insn *call_insn)
>> +{
>> +  /* We don't expect SEQUENCEs in this port.  */
>> +  gcc_checking_assert (GET_CODE (call_insn) == CALL_INSN);

> That is not doing what the comment says.

It is.  The documented interface, in the .def file, states that it must
be either a CALL_INSN or a SEQUENCE.

> But, is the assert useful at all anyway, won't the callers have
> checked for this already?

No, the callers would let a SEQUENCE through, if there was one.  But
rs6000 won't ever see one, because there are no delay slots.

My rationale to put it in was to (i) confirm that the case of SEQUENCEs
was considered and need not be handled in this port, and (ii) should
someone take inspiration from this implementation of the hook for a port
that supported delay slots, it would have to be handled.


>> +  enum attr_call_needs_return_offset cnro
>> +    = get_attr_call_needs_return_offset (call_insn);
>> +
>> +  if (cnro == CALL_NEEDS_RETURN_OFFSET_NONE)
>> +    return 0;

>   if (get_attr_call_needs_return_offset (call_insn)
>       == CALL_NEEDS_RETURN_OFFSET_NONE)
>     return 0;

> Shorter, simpler, doesn't need a variable for which you have no good
> name :-)

I happen to not share your subjective preference, but I don't mind
following that style in code you maintain.


>> +  if (rs6000_pcrel_p ())
>> +    return 0;

> Why this?  Please look at what actual code is generated, don't assume
> you don't need anything here if we generate pcrel code?

It was not an assumption, I took the conditions from the code, both from
output functions in rs6000.cc and from call patterns in rs6000.md.

Both rs6000_call_template_1 and rs6000_indirect_call_template_1 test for
rs6000_pcrel_p() and then output b or bl opcodes without any subsequent
instruction.  This test mirrors those.



>> +  else if (DEFAULT_ABI == ABI_V4)
>> +    return 0;
>> +  else if (DEFAULT_ABI == ABI_DARWIN)
>> +    return 0;
>> +  else
>> +    return 0;
>> +}

> The first two of these are superfluous.

Of course.  But mirroring the structure of the corresponding code makes
it easier to understand, and to check that the correspondence is there.
But now that style aspect is irrelevant, it's obviated by your suggested
alternate implementation.

Thanks,
  
Tom Tromey March 24, 2023, 4:21 p.m. UTC | #5
>>>>> "Segher" == Segher Boessenkool <segher@kernel.crashing.org> writes:

>> FWIW I sent a gdb patch to work around this bug.  However, in my
>> examples, I only ever saw a nop following the call instruction -- so I
>> had gdb check for this.

Segher> GCC inserts just a nop in most cases, but the linker or dynamic linker
Segher> can replace it.

Thanks.  I've updated my gdb patch to drop the nop check.

Tom
  
Segher Boessenkool March 24, 2023, 5:55 p.m. UTC | #6
Hi!

On Thu, Mar 23, 2023 at 12:06:39AM -0300, Alexandre Oliva wrote:
> On Mar 13, 2023, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > Yes.  On most architectures you can get multiple machine instructions of
> > course (for long calls for example), but on rs6000 (with some ABIs, in
> > some circumstances) we generate a nop insn after calls, so that the
> > linker has a spot to insert fixup code after calls (typically to restore
> > the r2 contents, but it could be anything).
> 
> Thanks.  I wasn't entirely sure that the linker would never ever relax
> the call sequence, moving the bl to the second instruction in the pair,

That would give headaches with maximal offsets, but more importantly
perhaps, many tools depend on knowing exactly where the branches are,
and some actual (assembler) code as well.  It's best not to toy with
branches :-)

> or that no 64-bit call variant existed or would come into existence.

64-bit offsets will not fit in the opcode of course, not even with
prefixes.  And a normal indirect branch *is* to a 64 bit address too.

Making prefixed branch instructions would get us in much the same hot
water (but everywhere instead of just in select cases), and our (I-form)
branches reach 32MB in either direction already (for relative branches;
and you can reach to the low or high 32MB of the address space too, but
not much code uses that at all anyway.  Terrible shame, not in the least
because the extended mnemonics for that are "ba" and "bla").
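
The 32MB figure follows from the I-form layout: a 24-bit LI field, with
two zero bits appended, sign-extended to give a byte displacement.  The
decoder below is a small sketch of that arithmetic; the function name is
made up for this example.

```cpp
#include <cstdint>

// Extract the signed byte displacement from an I-form branch word
// (primary opcode 18: b, bl, ba, bla).  The LI field occupies the
// 24 bits above the AA and LK bits.
std::int32_t
i_form_displacement (std::uint32_t insn)
{
  // Keep LI with the two implied low zero bits: a 26-bit quantity.
  std::int32_t d = static_cast<std::int32_t> (insn & 0x03FFFFFCu);

  // Sign-extend from bit 25.
  if (d & 0x02000000)
    d -= 0x04000000;

  return d;
}
```

The largest forward displacement is 0x01FFFFFC (32MB minus 4 bytes) and
the largest backward one is -0x02000000 (exactly 32MB).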

> > Subtracting 4 from what we currently return is very fragile.
> 
> Agreed.

Very glad we agree on that!

> > The actual return address is *always* the address of the branch insn
> > plus 4, can't you use that?
> 
> Yup, given this piece of knowledge I didn't have, I agree that's a far
> saner approach.

It is the same in any other arch I have seen that has an LR or RA reg.

> I'll post a new version of the patch, now broken up
> into rs6000-specific and machine-independent, momentarily.

Looking forward to it.

> > (Does the GCC code handle delay slots here, btw?
> 
> It does, in that the label is output after the insn sequence.
> 
> >> This patch introduces infrastructure for targets to add an offset to
> >> the label issued after the call_insn to set the call_return_pc
> >> attribute, and uses that on rs6000 to account for nop and l opcodes
> >> issued after actual call opcode as part of call insns output patterns.
> 
> > What is an "l opcode"?
> 
> I have a vague recollection of seeing call sequences ended by loads.
> Ah, yes, rs6000_indirect_call_template_1 outputs ld or lwz, depending on
> TARGET_64BIT, in both speculate and non-speculate cases after the branch
> in ABI_AIX and ABI_ELFv2 calls.  I understand the l in ld and lwz stands
> for 'load', and so I meant 'load' updates, but I guess in the context of
> calls the 'l' can indeed be misleading.  Anyway, that complexity is gone
> thanks to your suggestion.

Ah, "l%s", I see.  On the old POWER ("RIOS") architecture we *did* have
an actual "l" instruction.  There also is the "@l" relocation syntax.
I wasn't quite sure what you were talking about :-)

> >> +/* Return the offset to be added to the label output after CALL_INSN
> >> +   to compute the address to be placed in DW_AT_call_return_pc.  Some
> >> +   call insns output nop or l after bl, so the return address would be
> >> +   wrong without this offset.  */
> >> +
> >> +static int
> >> +rs6000_call_offset_return_label (rtx_insn *call_insn)
> >> +{
> >> +  /* We don't expect SEQUENCEs in this port.  */
> >> +  gcc_checking_assert (GET_CODE (call_insn) == CALL_INSN);
> 
> > That is not doing what the comment says.
> 
> It is.  The documented interface, in the .def file, states that it must
> be either a CALL_INSN or a SEQUENCE.

Ah, in that sense.  But that is more confusing than just not saying
anything imo.

> My rationale to put it in was to (i) confirm that the case of SEQUENCEs
> was considered and need not be handled in this port, and (ii) should
> someone take inspiration from this implementation of the hook for a port
> that supported delay slots, it would have to be handled.

Yeah.  Blindly copying code from another port is a recipe for disaster
always.  It is much nicer to not say anything / to not have to say
anything, just let the code speak for itself?


Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8e0b0d022db2f..edbc7a011886c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1760,6 +1760,9 @@  static const struct attribute_spec rs6000_attribute_table[] =
 
 #undef TARGET_UPDATE_IPA_FN_TARGET_INFO
 #define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
+
+#undef TARGET_CALL_OFFSET_RETURN_LABEL
+#define TARGET_CALL_OFFSET_RETURN_LABEL rs6000_call_offset_return_label
 
 
 /* Processor table.  */
@@ -14593,6 +14596,40 @@  rs6000_assemble_integer (rtx x, unsigned int size, int aligned_p)
   return default_assemble_integer (x, size, aligned_p);
 }
 
+/* Return the offset to be added to the label output after CALL_INSN
+   to compute the address to be placed in DW_AT_call_return_pc.  Some
+   call insns output nop or l after bl, so the return address would be
+   wrong without this offset.  */
+
+static int
+rs6000_call_offset_return_label (rtx_insn *call_insn)
+{
+  /* We don't expect SEQUENCEs in this port.  */
+  gcc_checking_assert (GET_CODE (call_insn) == CALL_INSN);
+
+  enum attr_call_needs_return_offset cnro
+    = get_attr_call_needs_return_offset (call_insn);
+
+  if (cnro == CALL_NEEDS_RETURN_OFFSET_NONE)
+    return 0;
+
+  if (rs6000_pcrel_p ())
+    return 0;
+  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+    /* rs6000_call_template_1 outputs a nop after non-sibcall insns;
+       we mark sibcall insns with NONE rather than DIRECT, so we
+       should have returned zero above.
+       rs6000_indirect_call_template_1 outputs an l insn after
+       indirect calls in these ABIs.  */
+    return -4;
+  else if (DEFAULT_ABI == ABI_V4)
+    return 0;
+  else if (DEFAULT_ABI == ABI_DARWIN)
+    return 0;
+  else
+    return 0;
+}
+
 /* Return a template string for assembly to emit when making an
    external call.  FUNOP is the call mem argument operand number.  */
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 81bffb04ceb0c..7dc73b21af731 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -342,6 +342,12 @@  (define_attr "max_prefixed_insns" "" (const_int 1))
 ;; num_insns and recurse).
 (define_attr "length" "" (const_int 4))
 
+;; Calls that output insns after bl need DW_AT_call_return_pc to be
+;; adjusted.  rs6000_call_offset_return_label uses this attribute to
+;; conservatively recognize the relevant patterns.
+(define_attr "call_needs_return_offset" "none,direct,indirect"
+  (const_string "none"))
+
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
 (define_attr "cpu"
@@ -11355,6 +11361,7 @@  (define_insn "*call_indirect_nonlocal_sysv<mode>"
   return rs6000_indirect_call_template (operands, 0);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(cond [(and (and (match_test "!rs6000_speculate_indirect_jumps")
 			 (match_test "which_alternative != 1"))
@@ -11384,6 +11391,7 @@  (define_insn "*call_nonlocal_sysv<mode>"
   return rs6000_call_template (operands, 0);
 }
   [(set_attr "type" "branch,branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set_attr "length" "4,8")])
 
 (define_insn "*call_nonlocal_sysv_secure<mode>"
@@ -11405,6 +11413,7 @@  (define_insn "*call_nonlocal_sysv_secure<mode>"
   return rs6000_call_template (operands, 0);
 }
   [(set_attr "type" "branch,branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set_attr "length" "4,8")])
 
 (define_insn "*call_value_indirect_nonlocal_sysv<mode>"
@@ -11425,6 +11434,7 @@  (define_insn "*call_value_indirect_nonlocal_sysv<mode>"
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(plus
 	  (if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
@@ -11454,6 +11464,7 @@  (define_insn "*call_value_nonlocal_sysv<mode>"
   return rs6000_call_template (operands, 1);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set (attr "length")
 	(if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
 	  (const_int 8)
@@ -11479,6 +11490,7 @@  (define_insn "*call_value_nonlocal_sysv_secure<mode>"
   return rs6000_call_template (operands, 1);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set (attr "length")
 	(if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
 	  (const_int 8)
@@ -11498,6 +11510,7 @@  (define_insn "*call_nonlocal_aix<mode>"
   return rs6000_call_template (operands, 0);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set (attr "length")
 	(if_then_else (match_test "rs6000_pcrel_p ()")
 	  (const_int 4)
@@ -11515,6 +11528,7 @@  (define_insn "*call_value_nonlocal_aix<mode>"
   return rs6000_call_template (operands, 1);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "direct")
    (set (attr "length")
 	(if_then_else (match_test "rs6000_pcrel_p ()")
 	    (const_int 4)
@@ -11537,6 +11551,7 @@  (define_insn "*call_indirect_aix<mode>"
   return rs6000_indirect_call_template (operands, 0);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11558,6 +11573,7 @@  (define_insn "*call_value_indirect_aix<mode>"
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11579,6 +11595,7 @@  (define_insn "*call_indirect_elfv2<mode>"
   return rs6000_indirect_call_template (operands, 0);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11595,6 +11612,7 @@  (define_insn "*call_indirect_pcrel<mode>"
   return rs6000_indirect_call_template (operands, 0);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11615,6 +11633,7 @@  (define_insn "*call_value_indirect_elfv2<mode>"
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11632,6 +11651,7 @@  (define_insn "*call_value_indirect_pcrel<mode>"
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
 			   (match_test "which_alternative != 1"))
@@ -11783,6 +11803,7 @@  (define_insn "*sibcall_indirect_nonlocal_sysv<mode>"
   return rs6000_indirect_sibcall_template (operands, 0);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(cond [(and (and (match_test "!rs6000_speculate_indirect_jumps")
 			 (match_test "which_alternative != 1"))
@@ -11812,6 +11833,7 @@  (define_insn "*sibcall_nonlocal_sysv<mode>"
   return rs6000_sibcall_template (operands, 0);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "none")
    (set_attr "length" "4,8")])
 
 (define_insn "*sibcall_value_indirect_nonlocal_sysv<mode>"
@@ -11832,6 +11854,7 @@  (define_insn "*sibcall_value_indirect_nonlocal_sysv<mode>"
   return rs6000_indirect_sibcall_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
+   (set_attr "call_needs_return_offset" "indirect")
    (set (attr "length")
 	(cond [(and (and (match_test "!rs6000_speculate_indirect_jumps")
 			 (match_test "which_alternative != 1"))
@@ -11862,6 +11885,7 @@  (define_insn "*sibcall_value_nonlocal_sysv<mode>"
   return rs6000_sibcall_template (operands, 1);
 }
   [(set_attr "type" "branch")
+   (set_attr "call_needs_return_offset" "none")
    (set_attr "length" "4,8")])
 
 ;; AIX ABI sibling call patterns.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index ec90c46ea2ffd..16bb089a9a657 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5426,6 +5426,13 @@  except the last are treated as named.
 You need not define this hook if it always returns @code{false}.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_CALL_OFFSET_RETURN_LABEL (rtx_insn *@var{call_insn})
+While generating call-site debug info for a CALL insn, or a SEQUENCE
+insn starting with a CALL, this target hook is invoked to compute the
+offset to be added to the debug label emitted after the call to obtain
+the return address that should be recorded as the return PC.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_CALL_ARGS (rtx, @var{tree})
 While generating RTL for a function call, this target hook is invoked once
 for each argument passed to the function, either a register returned by
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 930b109863f23..1706965e9a0cd 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3785,6 +3785,8 @@  These machine description macros help implement varargs:
 
 @hook TARGET_STRICT_ARGUMENT_NAMING
 
+@hook TARGET_CALL_OFFSET_RETURN_LABEL
+
 @hook TARGET_CALL_ARGS
 
 @hook TARGET_END_CALL_ARGS
diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 1f39df3b1e250..b706c36b87af6 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -3584,7 +3584,7 @@  typedef struct var_loc_list_def var_loc_list;
 
 /* Call argument location list.  */
 struct GTY ((chain_next ("%h.next"))) call_arg_loc_node {
-  rtx GTY (()) call_arg_loc_note;
+  rtx_insn * GTY (()) call_insn;
   const char * GTY (()) label;
   tree GTY (()) block;
   bool tail_call_p;
@@ -3768,7 +3768,8 @@  static void remove_addr_table_entry (addr_table_entry *);
 static void add_AT_addr (dw_die_ref, enum dwarf_attribute, rtx, bool);
 static inline rtx AT_addr (dw_attr_node *);
 static void add_AT_symview (dw_die_ref, enum dwarf_attribute, const char *);
-static void add_AT_lbl_id (dw_die_ref, enum dwarf_attribute, const char *);
+static void add_AT_lbl_id (dw_die_ref, enum dwarf_attribute, const char *,
+			   int = 0);
 static void add_AT_lineptr (dw_die_ref, enum dwarf_attribute, const char *);
 static void add_AT_macptr (dw_die_ref, enum dwarf_attribute, const char *);
 static void add_AT_range_list (dw_die_ref, enum dwarf_attribute,
@@ -5327,14 +5328,17 @@  add_AT_symview (dw_die_ref die, enum dwarf_attribute attr_kind,
 
 static inline void
 add_AT_lbl_id (dw_die_ref die, enum dwarf_attribute attr_kind,
-               const char *lbl_id)
+	       const char *lbl_id, int offset)
 {
   dw_attr_node attr;
 
   attr.dw_attr = attr_kind;
   attr.dw_attr_val.val_class = dw_val_class_lbl_id;
   attr.dw_attr_val.val_entry = NULL;
-  attr.dw_attr_val.v.val_lbl_id = xstrdup (lbl_id);
+  if (!offset)
+    attr.dw_attr_val.v.val_lbl_id = xstrdup (lbl_id);
+  else
+    attr.dw_attr_val.v.val_lbl_id = xasprintf ("%s%+i", lbl_id, offset);
   if (dwarf_split_debug_info)
     attr.dw_attr_val.val_entry
         = add_addr_table_entry (attr.dw_attr_val.v.val_lbl_id,
@@ -23405,7 +23409,9 @@  gen_call_site_die (tree decl, dw_die_ref subr_die,
   if (stmt_die == NULL)
     stmt_die = subr_die;
   die = new_die (dwarf_TAG (DW_TAG_call_site), stmt_die, NULL_TREE);
-  add_AT_lbl_id (die, dwarf_AT (DW_AT_call_return_pc), ca_loc->label);
+  add_AT_lbl_id (die, dwarf_AT (DW_AT_call_return_pc),
+		 ca_loc->label,
+		 targetm.calls.call_offset_return_label (ca_loc->call_insn));
   if (ca_loc->tail_call_p)
     add_AT_flag (die, dwarf_AT (DW_AT_call_tail_call), 1);
   if (ca_loc->symbol_ref)
@@ -24092,11 +24098,14 @@  gen_subprogram_die (tree decl, dw_die_ref context_die)
 	    {
 	      dw_die_ref die = NULL;
 	      rtx tloc = NULL_RTX, tlocc = NULL_RTX;
+	      rtx call_arg_loc_note
+		= find_reg_note (ca_loc->call_insn,
+				 REG_CALL_ARG_LOCATION, NULL_RTX);
 	      rtx arg, next_arg;
 	      tree arg_decl = NULL_TREE;
 
-	      for (arg = (ca_loc->call_arg_loc_note != NULL_RTX
-			  ? XEXP (ca_loc->call_arg_loc_note, 0)
+	      for (arg = (call_arg_loc_note != NULL_RTX
+			  ? XEXP (call_arg_loc_note, 0)
 			  : NULL_RTX);
 		   arg; arg = next_arg)
 		{
@@ -28122,8 +28131,7 @@  create_label:
 	= ggc_cleared_alloc<call_arg_loc_node> ();
       rtx_insn *prev = call_insn;
 
-      ca_loc->call_arg_loc_note
-	= find_reg_note (call_insn, REG_CALL_ARG_LOCATION, NULL_RTX);
+      ca_loc->call_insn = call_insn;
       ca_loc->next = NULL;
       ca_loc->label = last_label;
       gcc_assert (prev
diff --git a/gcc/target.def b/gcc/target.def
index 4b2c53aba14e6..a5c2046e98250 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4738,6 +4738,15 @@  not generate any instructions in this case.",
 	int *pretend_args_size, int second_time),
  default_setup_incoming_varargs)
 
+DEFHOOK
+(call_offset_return_label,
+ "While generating call-site debug info for a CALL insn, or a SEQUENCE\n\
+insn starting with a CALL, this target hook is invoked to compute the\n\
+offset to be added to the debug label emitted after the call to obtain\n\
+the return address that should be recorded as the return PC.",
+ int, (rtx_insn *call_insn),
+ hook_int_rtx_insn_0)
+
 DEFHOOK
 (call_args,
  "While generating RTL for a function call, this target hook is invoked once\n\