dwarf2cfi: Improve cfa_reg comparisons [PR103619]
Commit Message
On Tue, Dec 14, 2021 at 10:32:21AM -0700, Jeff Law wrote:
> I think the attached testcase should trigger on c6x with -mbig-endian -O2 -g
Thanks. Finally I see what's going on. c6x doesn't really need the CFA
with span > 1 (and I bet neither does armbe), the only reason why
dwf_cfa_reg is called is that the code in 13 cases just tries to compare
the CFA against dwf_cfa_reg (some_reg). And that dwf_cfa_reg on some reg
that usually isn't a CFA reg results in targetm.dwarf_register_span hook
call, which on targets like c6x or armeb and others for some registers
creates a PARALLEL with various REGs in it, then the loop with the assertion
and finally operator== which just notes that the reg is different and fails.
This seems compile time memory and time inefficient.
The following so far untested patch instead adds an extra operator== and !=
for comparison of cfa_reg with rtx, which has the most common case where it
is a different register number done early without actually invoking
dwf_cfa_reg. This means the assertion in dwf_cfa_reg can stay as is (at
least until some big endian target needs to have hard frame pointer or stack
pointer with span > 1 as well).
I've removed a different assertion there because it is redundant - dwf_regno
already has exactly that assertion in it too.
And I've included those 2 tweaks to avoid creating a REG in GC memory when
we can use {stack,hard_frame}_pointer_rtx which is already initialized to
the same REG we need by init_emit_regs.
Ok for trunk if it passes bootstrap/regtest?
2021-12-14 Jakub Jelinek <jakub@redhat.com>
PR debug/103619
* dwarf2cfi.c (dwf_cfa_reg): Remove gcc_assert.
(operator==, operator!=): New overloaded operators.
(dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_offset,
dwarf2out_frame_debug_expr): Compare vars with cfa_reg type directly
with REG rtxes rather than with dwf_cfa_reg results on those REGs.
(create_cie_data): Use stack_pointer_rtx instead of
gen_rtx_REG (Pmode, STACK_POINTER_REGNUM).
(execute_dwarf2_frame): Use hard_frame_pointer_rtx instead of
gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM).
Jakub
Comments
On 12/14/2021 1:18 PM, Jakub Jelinek wrote:
> On Tue, Dec 14, 2021 at 10:32:21AM -0700, Jeff Law wrote:
>> I think the attached testcase should trigger on c6x with -mbig-endian -O2 -g
> Thanks. Finally I see what's going on. c6x doesn't really need the CFA
> with span > 1 (and I bet neither does armbe), the only reason why
> dwf_cfa_reg is called is that the code in 13 cases just tries to compare
> the CFA against dwf_cfa_reg (some_reg). And that dwf_cfa_reg on some reg
> that usually isn't a CFA reg results in targetm.dwarf_register_span hook
> call, which on targets like c6x or armeb and others for some registers
> creates a PARALLEL with various REGs in it, then the loop with the assertion
> and finally operator== which just notes that the reg is different and fails.
>
> This seems compile time memory and time inefficient.
>
> The following so far untested patch instead adds an extra operator== and !=
> for comparison of cfa_reg with rtx, which has the most common case where it
> is a different register number done early without actually invoking
> dwf_cfa_reg. This means the assertion in dwf_cfa_reg can stay as is (at
> least until some big endian target needs to have hard frame pointer or stack
> pointer with span > 1 as well).
> I've removed a different assertion there because it is redundant - dwf_regno
> already has exactly that assertion in it too.
>
> And I've included those 2 tweaks to avoid creating a REG in GC memory when
> we can use {stack,hard_frame}_pointer_rtx which is already initialized to
> the same REG we need by init_emit_regs.
>
> Ok for trunk if it passes bootstrap/regtest?
>
> 2021-12-14 Jakub Jelinek <jakub@redhat.com>
>
> PR debug/103619
> * dwarf2cfi.c (dwf_cfa_reg): Remove gcc_assert.
> (operator==, operator!=): New overloaded operators.
> (dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_offset,
> dwarf2out_frame_debug_expr): Compare vars with cfa_reg type directly
> with REG rtxes rather than with dwf_cfa_reg results on those REGs.
> (create_cie_data): Use stack_pointer_rtx instead of
> gen_rtx_REG (Pmode, STACK_POINTER_REGNUM).
> (execute_dwarf2_frame): Use hard_frame_pointer_rtx instead of
> gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM).
So if someone is unfamiliar with the underlying issues here and needs to
twiddle dwarf2cfi, how are they supposed to know if they should compare
directly or use dwf_cfa_reg?
I'm not saying the patch is wrong, just wondering if we're setting
ourselves up for a maintenance problem going forward.
jeff
On Tue, Dec 14, 2021 at 03:05:37PM -0700, Jeff Law wrote:
> > 2021-12-14 Jakub Jelinek <jakub@redhat.com>
> >
> > PR debug/103619
> > * dwarf2cfi.c (dwf_cfa_reg): Remove gcc_assert.
> > (operator==, operator!=): New overloaded operators.
> > (dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_offset,
> > dwarf2out_frame_debug_expr): Compare vars with cfa_reg type directly
> > with REG rtxes rather than with dwf_cfa_reg results on those REGs.
> > (create_cie_data): Use stack_pointer_rtx instead of
> > gen_rtx_REG (Pmode, STACK_POINTER_REGNUM).
> > (execute_dwarf2_frame): Use hard_frame_pointer_rtx instead of
> > gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM).
> So if someone is unfamiliar with the underlying issues here and needs to
> twiddle dwarf2cfi, how are they supposed to know if they should compare
> directly or use dwf_cfa_reg?
Comparison without dwf_cfa_reg should be used whenever possible, because
for registers which are never CFA related that won't call
targetm.dwarf_register_span uselessly.
The only comparisons with dwf_cfa_reg I've kept are the:
regno = dwf_cfa_reg (XEXP (XEXP (dest, 0), 0));
if (cur_cfa->reg == regno)
offset -= cur_cfa->offset;
else if (cur_trace->cfa_store.reg == regno)
offset -= cur_trace->cfa_store.offset;
else
{
gcc_assert (cur_trace->cfa_temp.reg == regno);
offset -= cur_trace->cfa_temp.offset;
}
and
struct cfa_reg regno = dwf_cfa_reg (XEXP (dest, 0));
if (cur_cfa->reg == regno)
offset = -cur_cfa->offset;
else if (cur_trace->cfa_store.reg == regno)
offset = -cur_trace->cfa_store.offset;
else
{
gcc_assert (cur_trace->cfa_temp.reg == regno);
offset = -cur_trace->cfa_temp.offset;
}
and there are 2 reasons for it:
1) there is an assertion, which guarantees it must compare equal to one of
those 3 cfa related struct cfa_reg structs, so it must be some CFA related
register (so, right now, targetm.dwarf_register_span shouldn't return
non-NULL in those on anything but gcn)
2) it is compared 3 times in a row, so for the GCN case doing
if (cur_cfa->reg == XEXP (XEXP (dest, 0), 0))
offset -= cur_cfa->offset;
else if (cur_trace->cfa_store.reg == XEXP (XEXP (dest, 0), 0))
offset -= cur_trace->cfa_store.offset;
else
{
gcc_assert (cur_trace->cfa_temp.reg == XEXP (XEXP (dest, 0), 0));
offset -= cur_trace->cfa_temp.offset;
}
could actually create more GC allocated garbage than the way it is written
now. But doing it that way would work fine.
I think for most of the comparisons even comparing with dwf_cfa_reg would
work but be less compile time/memory efficient (e.g. those assertions that
it is equal to some CFA related cfa_reg or in any spots where only the CFA
related regs may appear in the frame related patterns).
I'm aware just of a single spot where comparison with dwf_cfa_reg doesn't
work (when the assert is in dwf_cfa_reg), that is the spot that was ICEing
on your testcase, where we save arbitrary call saved register:
if (REG_P (src)
&& REGNO (src) != STACK_POINTER_REGNUM
&& REGNO (src) != HARD_FRAME_POINTER_REGNUM
&& cur_cfa->reg == src)
Jakub
On 12/14/2021 3:27 PM, Jakub Jelinek wrote:
> On Tue, Dec 14, 2021 at 03:05:37PM -0700, Jeff Law wrote:
>>> 2021-12-14 Jakub Jelinek <jakub@redhat.com>
>>>
>>> PR debug/103619
>>> * dwarf2cfi.c (dwf_cfa_reg): Remove gcc_assert.
>>> (operator==, operator!=): New overloaded operators.
>>> (dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_offset,
>>> dwarf2out_frame_debug_expr): Compare vars with cfa_reg type directly
>>> with REG rtxes rather than with dwf_cfa_reg results on those REGs.
>>> (create_cie_data): Use stack_pointer_rtx instead of
>>> gen_rtx_REG (Pmode, STACK_POINTER_REGNUM).
>>> (execute_dwarf2_frame): Use hard_frame_pointer_rtx instead of
>>> gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM).
>> So if someone is unfamiliar with the underlying issues here and needs to
>> twiddle dwarf2cfi, how are they supposed to know if they should compare
>> directly or use dwf_cfa_reg?
> Comparison without dwf_cfa_reg should be used whenever possible, because
> for registers which are never CFA related that won't call
> targetm.dwarf_register_span uselessly.
So it's easy enough to articulate. Is there anywhere you could put a
comment to that effect where it's likely to be seen in that file?
OK with that change.
Jeff
@@ -1113,8 +1113,6 @@ dwf_cfa_reg (rtx reg)
{
struct cfa_reg result;
- gcc_assert (REGNO (reg) < FIRST_PSEUDO_REGISTER);
-
result.reg = dwf_regno (reg);
result.span = 1;
result.span_width = 0;
@@ -1144,6 +1142,25 @@ dwf_cfa_reg (rtx reg)
return result;
}
+/* More efficient comparisons that don't call targetm.dwarf_register_span
+ unnecessarily. */
+
+static bool
+operator== (cfa_reg &cfa, rtx reg)
+{
+ unsigned int regno = dwf_regno (reg);
+ if (cfa.reg != regno)
+ return false;
+ struct cfa_reg other = dwf_cfa_reg (reg);
+ return cfa == other;
+}
+
+static inline bool
+operator!= (cfa_reg &cfa, rtx reg)
+{
+ return !(cfa == reg);
+}
+
/* Compare X and Y for equivalence. The inputs may be REGs or PC_RTX. */
static bool
@@ -1313,7 +1330,7 @@ dwarf2out_frame_debug_adjust_cfa (rtx pa
switch (GET_CODE (src))
{
case PLUS:
- gcc_assert (dwf_cfa_reg (XEXP (src, 0)) == cur_cfa->reg);
+ gcc_assert (cur_cfa->reg == XEXP (src, 0));
cur_cfa->offset -= rtx_to_poly_int64 (XEXP (src, 1));
break;
@@ -1346,11 +1363,11 @@ dwarf2out_frame_debug_cfa_offset (rtx se
switch (GET_CODE (addr))
{
case REG:
- gcc_assert (dwf_cfa_reg (addr) == cur_cfa->reg);
+ gcc_assert (cur_cfa->reg == addr);
offset = -cur_cfa->offset;
break;
case PLUS:
- gcc_assert (dwf_cfa_reg (XEXP (addr, 0)) == cur_cfa->reg);
+ gcc_assert (cur_cfa->reg == XEXP (addr, 0));
offset = rtx_to_poly_int64 (XEXP (addr, 1)) - cur_cfa->offset;
break;
default:
@@ -1797,7 +1814,7 @@ dwarf2out_frame_debug_expr (rtx expr)
{
/* Setting FP from SP. */
case REG:
- if (cur_cfa->reg == dwf_cfa_reg (src))
+ if (cur_cfa->reg == src)
{
/* Rule 1 */
/* Update the CFA rule wrt SP or FP. Make sure src is
@@ -1828,7 +1845,7 @@ dwarf2out_frame_debug_expr (rtx expr)
{
gcc_assert (REGNO (dest) == HARD_FRAME_POINTER_REGNUM
&& fde->drap_reg != INVALID_REGNUM
- && cur_cfa->reg != dwf_cfa_reg (src)
+ && cur_cfa->reg != src
&& fde->rule18);
fde->rule18 = 0;
/* The save of hard frame pointer has been deferred
@@ -1852,8 +1869,7 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Adjusting SP. */
if (REG_P (XEXP (src, 1)))
{
- gcc_assert (dwf_cfa_reg (XEXP (src, 1))
- == cur_trace->cfa_temp.reg);
+ gcc_assert (cur_trace->cfa_temp.reg == XEXP (src, 1));
offset = cur_trace->cfa_temp.offset;
}
else if (!poly_int_rtx_p (XEXP (src, 1), &offset))
@@ -1886,7 +1902,7 @@ dwarf2out_frame_debug_expr (rtx expr)
gcc_assert (frame_pointer_needed);
gcc_assert (REG_P (XEXP (src, 0))
- && dwf_cfa_reg (XEXP (src, 0)) == cur_cfa->reg);
+ && cur_cfa->reg == XEXP (src, 0));
offset = rtx_to_poly_int64 (XEXP (src, 1));
if (GET_CODE (src) != MINUS)
offset = -offset;
@@ -1899,7 +1915,7 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 4 */
if (REG_P (XEXP (src, 0))
- && dwf_cfa_reg (XEXP (src, 0)) == cur_cfa->reg
+ && cur_cfa->reg == XEXP (src, 0)
&& poly_int_rtx_p (XEXP (src, 1), &offset))
{
/* Setting a temporary CFA register that will be copied
@@ -1914,7 +1930,7 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 5 */
else if (REG_P (XEXP (src, 0))
- && dwf_cfa_reg (XEXP (src, 0)) == cur_trace->cfa_temp.reg
+ && cur_trace->cfa_temp.reg == XEXP (src, 0)
&& XEXP (src, 1) == stack_pointer_rtx)
{
/* Setting a scratch register that we will use instead
@@ -1945,7 +1961,7 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 7 */
case IOR:
gcc_assert (REG_P (XEXP (src, 0))
- && dwf_cfa_reg (XEXP (src, 0)) == cur_trace->cfa_temp.reg
+ && cur_trace->cfa_temp.reg == XEXP (src, 0)
&& CONST_INT_P (XEXP (src, 1)));
cur_trace->cfa_temp.reg = dwf_cfa_reg (dest);
@@ -1981,7 +1997,7 @@ dwarf2out_frame_debug_expr (rtx expr)
dwarf2out_flush_queued_reg_saves ();
gcc_assert (cur_trace->cfa_store.reg
- == dwf_cfa_reg (XEXP (src, 0)));
+ == XEXP (src, 0));
fde->stack_realign = 1;
fde->stack_realignment = INTVAL (XEXP (src, 1));
cur_trace->cfa_store.offset = 0;
@@ -2109,8 +2125,7 @@ dwarf2out_frame_debug_expr (rtx expr)
/* Rule 14 */
case POST_INC:
- gcc_assert (cur_trace->cfa_temp.reg
- == dwf_cfa_reg (XEXP (XEXP (dest, 0), 0)));
+ gcc_assert (cur_trace->cfa_temp.reg == XEXP (XEXP (dest, 0), 0));
offset = -cur_trace->cfa_temp.offset;
cur_trace->cfa_temp.offset -= GET_MODE_SIZE (GET_MODE (dest));
break;
@@ -2128,7 +2143,7 @@ dwarf2out_frame_debug_expr (rtx expr)
if (REG_P (src)
&& REGNO (src) != STACK_POINTER_REGNUM
&& REGNO (src) != HARD_FRAME_POINTER_REGNUM
- && dwf_cfa_reg (src) == cur_cfa->reg)
+ && cur_cfa->reg == src)
{
/* We're storing the current CFA reg into the stack. */
@@ -3210,8 +3225,7 @@ create_cie_data (void)
dw_cfa_location loc;
dw_trace_info cie_trace;
- dw_stack_pointer_regnum = dwf_cfa_reg (gen_rtx_REG (Pmode,
- STACK_POINTER_REGNUM));
+ dw_stack_pointer_regnum = dwf_cfa_reg (stack_pointer_rtx);
memset (&cie_trace, 0, sizeof (cie_trace));
cur_trace = &cie_trace;
@@ -3270,8 +3284,7 @@ static unsigned int
execute_dwarf2_frame (void)
{
/* Different HARD_FRAME_POINTER_REGNUM might coexist in the same file. */
- dw_frame_pointer_regnum
- = dwf_cfa_reg (gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM));
+ dw_frame_pointer_regnum = dwf_cfa_reg (hard_frame_pointer_rtx);
/* The first time we're called, compute the incoming frame state. */
if (cie_cfi_vec == NULL)